OBJECT TRACKING AND ACTIVITY RECOGNITION IN VIDEO ACQUIRED USING MOBILE CAMERAS


Date Issued:
2004
Abstract/Description:
Due to the increasing demand for deployable surveillance systems in recent years, object tracking and activity recognition have received considerable attention in the research community. This thesis contributes to both the tracking and the activity recognition components of a surveillance system. In particular, for the tracking component, we propose two approaches for tracking objects in video acquired by mobile cameras, each of which uses a different object shape representation. The first approach tracks the centroids of objects in Forward Looking Infrared (FLIR) imagery and is suitable for tracking objects that appear small in airborne video. The second approach tracks the complete contours of objects and is suitable for higher-level vision problems such as activity recognition, identification, and classification. Using the contours produced by the contour tracker, we propose a novel representation, called the action sketch, for recognizing human activities.

Object Tracking in Airborne Imagery: Objects imaged from an airborne vehicle generally appear small and can be represented by simple geometric shapes such as a circle or a rectangle. After detecting the object position in the first frame, the proposed tracker models the intensity and the local standard deviation of the object region defined by the shape model. It then tracks the object by computing the mean-shift vector that minimizes the distance between the kernel distribution of the hypothesized object and its prior. When the ego-motion of the sensor moves the object beyond the operational limits of the tracking module, a multi-resolution global motion compensation based on the Gabor responses of consecutive frames is performed.
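The mean-shift idea underlying this tracker can be sketched as follows. This is a generic intensity-histogram mean-shift iteration, a simplified stand-in for the thesis's method: the kernel, the feature set (intensity only, no local standard deviation), and the absence of model update and motion compensation are all assumptions made for brevity.

```python
import numpy as np

def mean_shift_step(frame, center, radius, model_hist, nbins=16):
    """One mean-shift iteration on a grayscale frame: shift the window
    center toward the weighted centroid, where pixel weights favor
    intensities over-represented in the object model relative to the
    current candidate window (the usual Bhattacharyya-derived weights)."""
    y, x = center
    ys = np.arange(max(0, y - radius), min(frame.shape[0], y + radius + 1))
    xs = np.arange(max(0, x - radius), min(frame.shape[1], x + radius + 1))
    patch = frame[np.ix_(ys, xs)]
    # quantize intensities into histogram bins
    bins = (patch.astype(float) / 256.0 * nbins).astype(int).clip(0, nbins - 1)
    cand_hist = np.bincount(bins.ravel(), minlength=nbins).astype(float)
    cand_hist /= cand_hist.sum()
    # per-pixel weight: sqrt(model / candidate); zero for colors absent from the model
    w = np.sqrt(model_hist[bins] / np.maximum(cand_hist[bins], 1e-12))
    gy, gx = np.meshgrid(ys, xs, indexing="ij")
    return (int(round((w * gy).sum() / w.sum())),
            int(round((w * gx).sum() / w.sum())))
```

Iterating this step until the center stops moving gives the basic tracker; the thesis additionally updates the object model automatically and falls back to Gabor-based global motion compensation when the inter-frame displacement exceeds the window.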
Experiments on the AMCOM FLIR data set show the robustness of the proposed method, which combines automatic model update and global motion compensation in a single framework.

Contour Tracker: Contour tracking is performed by evolving an initial contour toward the correct object boundaries based on discriminant analysis, formulated as a variational calculus problem. Once the contour is initialized, the method generates an online shape model for the object, along with color and texture priors for both the object and the background regions. The a priori texture and color PDFs of the regions are then fused according to how well each feature discriminates between the object and background models. These models are used to compute the posterior contour likelihood, and the evolution is obtained by a maximum a posteriori estimation process that updates the contour in the gradient-ascent direction of the proposed energy functional. During occlusion, the online shape model is used to complete the missing object region. The proposed energy functional unifies commonly used boundary-based and region-based contour approaches in a single framework through a support region defined around the hypothesized object contour. We tested the robustness of the proposed contour tracker on several real sequences and verified qualitatively that the object contours are tracked accurately.

Behavior Analysis: We propose a novel approach to representing human actions by modeling the dynamics (motion) and the structure (shape) of objects in video. Both the motion and the shape are modeled using a compact representation called the "action sketch". An action sketch is a view-invariant representation obtained by analyzing important changes that occur during the motion of the objects. When an actor performs an action in 3D, the points on the actor generate space-time trajectories in four dimensions (x, y, z, t).
Projection of the world onto the imaging coordinates converts these space-time trajectories into spatio-temporal trajectories in three dimensions (x, y, t). A set of spatio-temporal trajectories constitutes a 3D volume, which we call an "action volume". This volume
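The region-based contour evolution described in the abstract can be illustrated with a much-simplified sketch. The code below is a Chan-Vese-style level-set update driven by interior/exterior intensity means only; the thesis's actual energy fuses color and texture PDFs and adds a shape prior, none of which appear here.

```python
import numpy as np

def evolve_contour(img, phi, iters=300, dt=0.5):
    """Evolve an implicit contour (the zero level set of phi) by gradient
    ascent on a region-based energy: pixels whose intensity better matches
    the interior mean push phi up, the rest push it down. A minimal
    Chan-Vese-style stand-in, not the thesis's fused color/texture energy."""
    for _ in range(iters):
        inside = phi > 0
        c1 = img[inside].mean() if inside.any() else 0.0      # interior mean
        c2 = img[~inside].mean() if (~inside).any() else 0.0  # exterior mean
        # region-competition force: positive where the pixel looks like the object
        force = (img - c2) ** 2 - (img - c1) ** 2
        phi = phi + dt * force / (np.abs(force).max() + 1e-12)
    return phi
```

Seeding phi as a small disk inside a bright object and iterating lets the zero level set expand until it coincides with the object boundary; the sign of phi then acts as the segmentation mask.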
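The action-volume construction described above, stacking tracked contours along the time axis into an (x, y, t) point set, can be sketched directly. Both helper names below are hypothetical, and the speed measurement is just one plausible example of the "important changes" an action sketch might retain.

```python
import numpy as np

def action_volume(contours):
    """Stack per-frame contour points into the (x, y, t) point set swept
    by the object silhouette over time (the 'action volume')."""
    return np.array([(x, y, t)
                     for t, contour in enumerate(contours)
                     for (x, y) in contour], dtype=float)

def trajectory_speed(traj):
    """Frame-to-frame speed along one spatio-temporal (x, y, t) trajectory;
    extrema of such measurements are one way to locate the 'important
    changes' that a compact action sketch would keep."""
    d = np.diff(traj[:, :2], axis=0)
    return np.hypot(d[:, 0], d[:, 1])
```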
Name(s): Yilmaz, Alper, Author
Shah, Mubarak, Committee Chair
University of Central Florida, Degree Grantor
Type of Resource: text
Publisher: University of Central Florida
Language(s): English
Identifier: CFE0000101 (IID), ucf:52858 (fedora)
Note(s): 2004-08-01
Ph.D.
College of Engineering and Computer Science, School of Computer Science
This record was generated from author submitted information.
Subject(s): tracking
level sets
occlusion
mean shift
contour
survey
behavior recognition
action recognition
action modelling
Persistent Link to This Record: http://purl.flvc.org/ucf/fd/CFE0000101
Restrictions on Access: public
Host Institution: UCF