- Title
- DEPTH FROM DEFOCUSED MOTION.
- Creator
-
Myles, Zarina, da Vitoria Lobo, Niels, University of Central Florida
- Abstract / Description
-
Motion in depth and/or zooming causes defocus blur. This work presents a solution to the problem of using defocus blur and optical flow information to compute depth at points that defocus when they move. We first formulate a novel algorithm which recovers defocus blur and affine parameters simultaneously. Next we formulate a novel relationship (the blur-depth relationship) between defocus blur, relative object depth and three parameters based on camera motion and intrinsic camera parameters. We can handle the situation where a single image has points which have defocused, got sharper or are focally unperturbed. Moreover, our formulation is valid regardless of whether the defocus is due to the image plane being in front of or behind the point of sharp focus. The blur-depth relationship requires a sequence of at least three images taken with the camera moving either towards or away from the object. It can be used to obtain an initial estimate of relative depth using one of several non-linear methods. We demonstrate a solution based on the Extended Kalman Filter in which the measurement equation is the blur-depth relationship. The estimate of relative depth is then used to compute an initial estimate of camera motion parameters. In order to refine depth values, the values of relative depth and camera motion are then input into a second Extended Kalman Filter in which the measurement equations are the discrete motion equations. This set of cascaded Kalman filters can be employed iteratively over a longer sequence of images in order to further refine depth. We conduct several experiments on real scenery in order to demonstrate the range of object shapes that the algorithm can handle. We show that fairly good estimates of depth can be obtained with just three images.
- Date Issued
- 2004
- Identifier
- CFE0000135, ucf:46179
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0000135
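The first filter stage mentioned in the abstract above (an Extended Kalman Filter whose measurement equation is the blur-depth relationship) might be sketched roughly as follows. The abstract does not state the blur-depth relationship, so `placeholder_h` (blur proportional to inverse depth) is purely a hypothetical stand-in, as are all function names, the static-state assumption, and the noise values. This is a minimal scalar illustration of an EKF measurement update, not the thesis implementation.

```python
from typing import Callable, Iterable, Tuple


def ekf_update(
    x: float,
    P: float,
    z: float,
    h: Callable[[float], float],
    dh_dx: Callable[[float], float],
    R: float,
    Q: float = 0.0,
) -> Tuple[float, float]:
    """One scalar EKF step for a state assumed constant between frames."""
    P_pred = P + Q                   # predict: state unchanged, variance grows by Q
    H = dh_dx(x)                     # linearize the measurement around the estimate
    innovation = z - h(x)            # measured minus predicted measurement
    S = H * P_pred * H + R           # innovation variance
    K = P_pred * H / S               # Kalman gain
    x_new = x + K * innovation
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new


# HYPOTHETICAL measurement model: NOT the blur-depth relationship from the thesis.
def placeholder_h(depth: float) -> float:
    return 1.0 / depth


def placeholder_dh(depth: float) -> float:
    return -1.0 / (depth * depth)


def estimate_depth(
    measurements: Iterable[float],
    x0: float = 1.5,
    P0: float = 1.0,
    R: float = 1e-4,
) -> Tuple[float, float]:
    """Run the update over a sequence of (assumed) blur measurements."""
    x, P = x0, P0
    for z in measurements:
        x, P = ekf_update(x, P, z, placeholder_h, placeholder_dh, R)
    return x, P
```

In this sketch, calling `estimate_depth` on repeated noise-free measurements of a point at depth 2.0 (that is, `z = 0.5` under the placeholder model) moves the estimate from the initial guess toward 2.0 while the variance shrinks. The second, cascaded filter described in the abstract would take this estimate as part of its input.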
- Title
- MARKERLESS TRACKING USING POLAR CORRELATION OF CAMERA OPTICAL FLOW.
- Creator
-
Gupta, Prince, da Vitoria Lobo, Niels, University of Central Florida
- Abstract / Description
-
We present a novel, real-time, markerless vision-based tracking system, employing a rigid orthogonal configuration of two pairs of opposing cameras. Our system uses optical flow over sparse features to overcome the limitation of vision-based systems that require markers or a pre-loaded model of the physical environment. We show how opposing cameras enable cancellation of common components of optical flow leading to an efficient tracking algorithm that captures five degrees of freedom including direction of translation and angular velocity. Experiments comparing our device with an electromagnetic tracker show that its average tracking accuracy is 80% over 185 frames, and it is able to track large-range motions even in outdoor settings. We also present how opposing cameras in vision-based inside-looking-out systems can be used for gesture recognition. To demonstrate our approach, we discuss three different algorithms for recovering motion parameters at different levels of complete recovery. We show how optical flow in opposing cameras can be used to recover motion parameters of the multi-camera rig. Experimental results show gesture recognition accuracy of 88.0%, 90.7% and 86.7% for our three techniques, respectively, across a set of 15 gestures.
- Date Issued
- 2010
- Identifier
- CFE0003163, ucf:48611
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0003163
- Title
- PATTERNS OF MOTION: DISCOVERY AND GENERALIZED REPRESENTATION.
- Creator
-
Saleemi, Imran, Shah, Mubarak, University of Central Florida
- Abstract / Description
-
In this dissertation, we address the problem of discovery and representation of motion patterns in a variety of scenarios commonly encountered in vision applications. The overarching goal is to devise a generic representation that captures any kind of object motion observable in video sequences. Such motion is a significant source of information typically employed for diverse applications such as tracking, anomaly detection, and action and event recognition. We present statistical frameworks for representation of motion characteristics of objects, learned from tracks or optical flow, for static as well as moving cameras, and propose algorithms for their application to a variety of problems. The proposed motion pattern models and learning methods are general enough to be employed in a variety of problems, as we demonstrate experimentally. We first propose a novel method to model and learn the scene activity observed by a static camera. The motion patterns of objects in the scene are modeled in the form of a multivariate non-parametric probability density function of spatiotemporal variables (object locations and transition times between them). Kernel Density Estimation (KDE) is used to learn this model in a completely unsupervised fashion. Learning is accomplished by observing the trajectories of objects with a static camera over extended periods of time. The model encodes the probabilistic nature of the behavior of moving objects in the scene and is useful for activity analysis applications, such as persistent tracking and anomalous motion detection. In addition, the model also captures salient scene features, such as areas of occlusion and the most likely paths.
Once the model is learned, we use a unified Markov Chain Monte-Carlo (MCMC)-based framework for generating the most likely paths in the scene, improving foreground detection, persistently labelling objects during tracking, and deciding whether a given trajectory represents an anomaly with respect to the observed motion patterns. Experiments with real world videos are reported which validate the proposed approach. The representation and estimation framework proposed above, however, has a few limitations. This algorithm uses a single global statistical distribution to represent all kinds of motion observed in a particular scene. It therefore does not find a separation between multiple semantically distinct motion patterns in the scene. Instead, the learned model is a joint distribution over all possible patterns followed by objects. To overcome this limitation, we then propose a superior method for the discovery and statistical representation of motion patterns in a scene. The advantages of this approach over the first one are two-fold: first, this model is applicable to scenes of dense crowded motion where tracking may not be feasible, and second, it distinguishes between motion patterns that are distinct at a semantic level of abstraction. We propose a mixture model representation of salient patterns of optical flow, and present an algorithm for learning these patterns from dense optical flow in a hierarchical, unsupervised fashion. Using low level cues of noisy optical flow, K-means is employed to initialize a Gaussian mixture model for temporally segmented clips of video. The components of this mixture are then filtered and instances of motion patterns are computed using a simple motion model, by linking components across space and time. Motion patterns are then initialized and membership of instances in different motion patterns is established by using KL divergence between mixture distributions of pattern instances.
Finally, a pixel level representation of motion patterns is proposed by deriving conditional expectation of optical flow. Results of extensive experiments are presented for multiple surveillance sequences containing numerous patterns involving both pedestrian and vehicular traffic. The proposed method exploits optical flow as the low level feature and performs a hierarchical clustering to obtain motion patterns. We observe that the use of optical flow is also an integral part of a variety of other vision applications, for example, as a feature-based representation of human actions. We, therefore, propose a new representation for articulated human actions using the motion patterns. The representation is based on hierarchical clustering of observed optical flow in a four-dimensional spatial and motion flow space. The automatically discovered motion patterns are the primitive actions, representative of flow at salient regions on the human body, much like trajectories of body joints, which are notoriously difficult to obtain automatically. The proposed method works in a completely unsupervised fashion, and in sharp contrast to state-of-the-art representations like bag of video words, provides a truly semantically meaningful representation. Each primitive action depicts the most atomic sub-action, like left arm moving upwards, or right leg moving downward and leftward, and is represented by a mixture of four-dimensional Gaussian distributions. A sequence of primitive actions is discovered in the test video, and labelled by computing the KL divergence between mixtures. The entire video sequence containing the human action is thus reduced to a simple string, which is matched against similar strings of training videos to classify the action. The string matching is performed by global alignment, using the well-known Needleman-Wunsch algorithm.
Experiments reported on multiple human action data sets confirm the validity, simplicity, and semantically meaningful nature of the proposed representation. Results obtained are encouraging and comparable to the state of the art.
- Date Issued
- 2011
- Identifier
- CFE0003646, ucf:48836
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0003646
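The final classification step described in the abstract above (reducing each video to a string of primitive-action labels and matching it against training strings by Needleman-Wunsch global alignment) could look roughly like the sketch below. The scoring values (match +1, mismatch -1, gap -1), the single-character label encoding, the function names, and the nearest-string decision rule are all illustrative assumptions; the abstract does not specify them.

```python
from typing import Dict


def needleman_wunsch_score(
    a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1
) -> int:
    """Return the optimal global alignment score between strings a and b."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap        # a[:i] aligned entirely against gaps
    for j in range(1, m + 1):
        score[0][j] = j * gap        # b[:j] aligned entirely against gaps
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # gap in b
                score[i][j - 1] + gap,               # gap in a
            )
    return score[n][m]


def classify_action(query: str, training: Dict[str, str]) -> str:
    """Pick the training label whose string aligns best with the query.

    `training` maps a hypothetical action label to its primitive-action string.
    """
    return max(
        training, key=lambda label: needleman_wunsch_score(query, training[label])
    )
```

For instance, with hypothetical training strings `{"wave": "UUDD", "kick": "LRLR"}`, a query string `"UUD"` aligns better with `"UUDD"` than with `"LRLR"`, so `classify_action` returns `"wave"`.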