A method, system, article of manufacture, and UAV are configured to identify a target object shown in an image, such as a perspective view that is a two-dimensional image or a frame of video. The method comprises identifying and tracking the position of a target object shown in a sequence of images or video, even when the target object may be traveling at high speed, and detecting the target object within an image based on one or more of the object's physical characteristics, such as its color, shape, size, chrominance, luminance, brightness, lightness, darkness, and/or other characteristics. In this context, a target object may therefore be anything having one or more detectable physical characteristics. The method also provides an improved and more intuitive user interface that enables a user to select a target object for tracking. As a result, the method and system improve the accuracy, usability, and robustness of target tracking.
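
As a rough illustration of detecting a target object by one physical characteristic (color) and following its position across a sequence of images, the following minimal Python sketch may help. It is not the claimed method: the function names `detect_by_color` and `track`, the per-channel `tolerance` threshold, and the representation of an image as nested lists of RGB tuples are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0-255 (assumed image format)
Image = List[List[Pixel]]      # rows of pixels
Box = Tuple[int, int, int, int]


def detect_by_color(image: Image, target: Pixel,
                    tolerance: int = 40) -> Optional[Box]:
    """Return the bounding box (left, top, right, bottom) of all pixels whose
    color is within `tolerance` of `target` on every channel, or None."""
    xs, ys = [], []
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if all(abs(c - t) <= tolerance for c, t in zip(pixel, target)):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # no pixel matched the target color in this image
    return min(xs), min(ys), max(xs), max(ys)


def track(frames: List[Image], target: Pixel,
          tolerance: int = 40) -> List[Optional[Tuple[float, float]]]:
    """Return the bounding-box center of the target in each frame,
    with None for frames in which the target is not detected."""
    centers: List[Optional[Tuple[float, float]]] = []
    for frame in frames:
        box = detect_by_color(frame, target, tolerance)
        if box is None:
            centers.append(None)
        else:
            left, top, right, bottom = box
            centers.append(((left + right) / 2, (top + bottom) / 2))
    return centers
```

Calling `track(frames, (255, 0, 0))` on a list of frames would yield one center per frame for a red object. A practical system would likely combine several characteristics (shape, size, luminance) and handle noise, which this sketch deliberately omits.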