Human Action Detection, Tracking and Segmentation in Videos

Date Issued:
2018
Abstract/Description:
This dissertation addresses the problems of human action detection, human tracking, and segmentation in videos. These are fundamental tasks in computer vision and are extremely challenging to solve in realistic videos. We first propose a novel approach for action detection by exploring the generalization of deformable part models from 2D images to 3D spatiotemporal volumes. By focusing on the most distinctive parts of each action, our models adapt to intra-class variation and show robustness to clutter. This approach deals with detecting actions performed by a single person. When there are multiple humans in the scene, they need to be segmented and tracked from frame to frame before action recognition can be performed. Next, we propose a novel approach for multiple object tracking (MOT) that formulates detection and data association in one framework. Our method allows us to overcome the limitations of data-association-based MOT approaches, whose performance depends on the object detection results provided as input. We show that automatically detecting and tracking targets in a single framework can help resolve the ambiguities caused by frequent occlusion and heavy articulation of targets. In this tracker, targets are represented by bounding boxes, which is a coarse representation. Pixel-wise object segmentation, however, provides fine-level information, which is desirable for later tasks. Finally, we propose a tracker that simultaneously solves three main problems: detection, data association, and segmentation. This is especially important because the outputs of these three problems are highly correlated, and the solution of one can greatly help improve the others. The proposed approach achieves more accurate segmentation results and also helps better resolve typical difficulties in multiple target tracking, such as occlusion, ID switches, and track drifting.
Title: Human Action Detection, Tracking and Segmentation in Videos.
Name(s): Tian, Yicong, Author
Shah, Mubarak, Committee Chair
Bagci, Ulas, Committee Member
Liu, Fei, Committee Member
Walker, John, Committee Member
University of Central Florida, Degree Grantor
Type of Resource: text
Publisher: University of Central Florida
Language(s): English
Identifier: CFE0007378 (IID), ucf:52069 (fedora)
Note(s): 2018-12-01
Ph.D.
Engineering and Computer Science, Computer Science
Doctoral
This record was generated from author submitted information.
Subject(s): action detection -- multiple target tracking -- video object segmentation
Persistent Link to This Record: http://purl.flvc.org/ucf/fd/CFE0007378
Restrictions on Access: public 2018-12-15
Host Institution: UCF
