Current Search: Kinect (x)
-
-
Title
-
A Study of Localization and Latency Reduction for Action Recognition.
-
Creator
-
Masood, Syed, Tappen, Marshall, Foroosh, Hassan, Stanley, Kenneth, Sukthankar, Rahul, University of Central Florida
-
Abstract / Description
-
The success of recognizing periodic actions in single-person-simple-background datasets, such as Weizmann and KTH, has created a need for more complex datasets to push the performance of action recognition systems. In this work, we create a new synthetic action dataset and use it to highlight weaknesses in current recognition systems. Experiments show that introducing background complexity to action video sequences causes a significant degradation in recognition performance. Moreover, this...
Show moreThe success of recognizing periodic actions in single-person-simple-background datasets, such as Weizmann and KTH, has created a need for more complex datasets to push the performance of action recognition systems. In this work, we create a new synthetic action dataset and use it to highlight weaknesses in current recognition systems. Experiments show that introducing background complexity to action video sequences causes a significant degradation in recognition performance. Moreover, this degradation cannot be fixed by fine-tuning system parameters or by selecting better feature points. Instead, we show that the problem lies in the spatio-temporal cuboid volume extracted from the interest point locations. Having identified the problem, we show how improved results can be achieved by simple modifications to the cuboids.For the above method however, one requires near-perfect localization of the action within a video sequence. To achieve this objective, we present a two stage weakly supervised probabilistic model for simultaneous localization and recognition of actions in videos. Different from previous approaches, our method is novel in that it (1) eliminates the need for manual annotations for the training procedure and (2) does not require any human detection or tracking in the classification stage. The first stage of our framework is a probabilistic action localization model which extracts the most promising sub-windows in a video sequence where an action can take place. We use a non-linear classifier in the second stage of our framework for the final classification task. We show the effectiveness of our proposed model on two well known real-world datasets: UCF Sports and UCF11 datasets.Another application of the weakly supervised probablistic model proposed above is in the gaming environment. An important aspect in designing interactive, action-based interfaces is reliably recognizing actions with minimal latency. High latency causes the system's feedback to lag behind and thus significantly degrade the interactivity of the user experience. With slight modification to the weakly supervised probablistic model we proposed for action localization, we show how it can be used for reducing latency when recognizing actions in Human Computer Interaction (HCI) environments. This latency-aware learning formulation trains a logistic regression-based classifier that automatically determines distinctive canonical poses from the data and uses these to robustly recognize actions in the presence of ambiguous poses. We introduce a novel (publicly released) dataset for the purpose of our experiments. Comparisons of our method against both a Bag of Words and a Conditional Random Field (CRF) classifier show improved recognition performance for both pre-segmented and online classification tasks.
Show less
-
Date Issued
-
2012
-
Identifier
-
CFE0004575, ucf:49210
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0004575
-
-
Title
-
LABELED SAMPLING CONSENSUS: A NOVEL ALGORITHM FOR ROBUSTLY FITTING MULTIPLE STRUCTURES USING COMPRESSED SAMPLING.
-
Creator
-
Messina, Carl, Foroosh, Hassan, University of Central Florida
-
Abstract / Description
-
The ability to robustly fit structures in datasets that contain outliers is a very important task in Image Processing, Pattern Recognition and Computer Vision. Random Sampling Consensus or RANSAC is a very popular method for this task, due to its ability to handle over 50% outliers. The problem with RANSAC is that it is only capable of finding a single structure. Therefore, if a dataset contains multiple structures, they must be found sequentially by finding the best fit, removing the points,...
Show moreThe ability to robustly fit structures in datasets that contain outliers is a very important task in Image Processing, Pattern Recognition and Computer Vision. Random Sampling Consensus or RANSAC is a very popular method for this task, due to its ability to handle over 50% outliers. The problem with RANSAC is that it is only capable of finding a single structure. Therefore, if a dataset contains multiple structures, they must be found sequentially by finding the best fit, removing the points, and repeating the process. However, removing incorrect points from the dataset could prove disastrous. This thesis offers a novel approach to sampling consensus that extends its ability to discover multiple structures in a single iteration through the dataset. The process introduced is an unsupervised method, requiring no previous knowledge to the distribution of the input data. It uniquely assigns labels to different instances of similar structures. The algorithm is thus called Labeled Sampling Consensus or L-SAC. These unique instances will tend to cluster around one another allowing the individual structures to be extracted using simple clustering techniques. Since divisions instead of modes are analyzed, only a single instance of a structure need be recovered. This ability of L-SAC allows a novel sampling procedure to be presented "compressing" the required samples needed compared to traditional sampling schemes while ensuring all structures have been found. L-SAC is a flexible framework that can be applied to many problem domains.
Show less
-
Date Issued
-
2011
-
Identifier
-
CFE0003893, ucf:48727
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0003893
-
-
Title
-
Automatic 3D human modeling: an initial stage towards 2-way inside interaction in mixed reality.
-
Creator
-
Xiong, Yiyan, Hughes, Charles, Pattanaik, Sumanta, Laviola II, Joseph, Moshell, Michael, University of Central Florida
-
Abstract / Description
-
3D human models play an important role in computer graphics applications from a wide range of domains, including education, entertainment, medical care simulation and military training. In many situations, we want the 3D model to have a visual appearance that matches that of a specific living person and to be able to be controlled by that person in a natural manner. Among other uses, this approach supports the notion of human surrogacy, where the virtual counterpart provides a remote presence...
Show more3D human models play an important role in computer graphics applications from a wide range of domains, including education, entertainment, medical care simulation and military training. In many situations, we want the 3D model to have a visual appearance that matches that of a specific living person and to be able to be controlled by that person in a natural manner. Among other uses, this approach supports the notion of human surrogacy, where the virtual counterpart provides a remote presence for the human who controls the virtual character's behavior. In this dissertation, a human modeling pipeline is proposed for the problem of creating a 3D digital model of a real person. Our solution involves reshaping a 3D human template with a 2D contour of the participant and then mapping the captured texture of that person to the generated mesh. Our method produces an initial contour of a participant by extracting the user image from a natural background. One particularly novel contribution in our approach is the manner in which we improve the initial vertex estimate. We do so through a variant of the ShortStraw corner-finding algorithm commonly used in sketch-based systems. Here, we develop improvements to ShortStraw, presenting an algorithm called IStraw, and then introduce adaptations of this improved version to create a corner-based contour segmentatiuon algorithm. This algorithm provides significant improvements on contour matching over previously developed systems, and does so with low computational complexity. The system presented here advances the state of the art in the following aspects. First, the human modeling process is triggered automatically by matching the participant's pose with an initial pose through a tracking device and software. In our case, the pose capture and skeletal model are provided by the Microsoft Kinect and its associated SDK. Second, color image, depth data, and human tracking information from the Kinect and its SDK are used to automatically extract the contour of the participant and then generate a 3D human model with skeleton. Third, using the pose and the skeletal model, we segment the contour into eight parts and then match the contour points on each segment to a corresponding anchor set associated with a 3D human template. Finally, we map the color image of the person to the 3D model as its corresponding texture map. The whole modeling process only take several seconds and the resulting human model looks like the real person. The geometry of the 3D model matches the contour of the real person, and the model has a photorealistic texture. Furthermore, the mesh of the human model is attached to the skeleton provided in the template, so the model can support programmed animations or be controlled by real people. This human control is commonly done through a literal mapping (motion capture) or a gesture-based puppetry system. Our ultimate goal is to create a mixed reality (MR) system, in which the participants can manipulate virtual objects, and in which these virtual objects can affect the participant, e.g., by restricting theirmobility. This MR system prototype design motivated the work of this dissertation, since a realistic 3D human model of the participant is an essential part of implementing this vision.
Show less
-
Date Issued
-
2014
-
Identifier
-
CFE0005277, ucf:50543
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0005277