- Title
- An Unsupervised Consensus Control Chart Pattern Recognition Framework.
- Creator
-
Haghtalab, Siavash, Xanthopoulos, Petros, Pazour, Jennifer, Rabelo, Luis, University of Central Florida
- Abstract / Description
-
Early identification and detection of abnormal time series patterns is vital for a number of manufacturing processes. Slight shifts and alterations of time series patterns might be indicative of some anomaly in the production process, such as machinery malfunction. Because of the continuous flow of data, monitoring of manufacturing processes usually requires automated Control Chart Pattern Recognition (CCPR) algorithms. The majority of CCPR literature consists of supervised classification algorithms. Fewer studies consider unsupervised versions of the problem. Despite the profound advantage of unsupervised methodology in requiring less manual data labeling, its use is limited because its performance is not robust enough for practical purposes. In this study we propose the use of a consensus clustering framework. Computational results show robust behavior compared to individual clustering algorithms.
- Date Issued
- 2014
- Identifier
- CFE0005178, ucf:50670
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0005178
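The abstract of the record above does not say how its consensus clustering framework combines the individual clusterings. As a rough illustration of the general idea only, the following sketch uses a co-association matrix, a common consensus technique that is assumed here and not taken from the thesis. The function names and the 0.5 threshold are hypothetical.

```python
def co_association(labelings):
    """Fraction of base clusterings that place each pair of items together."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(labelings)
    return [[c / m for c in row] for row in counts]


def consensus_clusters(labelings, threshold=0.5):
    """Link items whose co-association exceeds the threshold, then
    return the connected components as consensus cluster labels."""
    matrix = co_association(labelings)
    n = len(matrix)
    assigned = [-1] * n
    next_id = 0
    for start in range(n):
        if assigned[start] != -1:
            continue
        assigned[start] = next_id
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if assigned[j] == -1 and matrix[i][j] > threshold:
                    assigned[j] = next_id
                    stack.append(j)
        next_id += 1
    return assigned


# Three base clusterings of four items (label values are arbitrary per run).
base = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
print(consensus_clusters(base))
```

Because only pairwise agreement is counted, the base clusterings do not need matching label values, which is one reason this style of consensus tends to be steadier than any single run.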
- Title
- Improving Efficiency in Deep Learning for Large Scale Visual Recognition.
- Creator
-
Liu, Baoyuan, Foroosh, Hassan, Qi, GuoJun, Welch, Gregory, Sukthankar, Rahul, Pensky, Marianna, University of Central Florida
- Abstract / Description
-
The recently emerging large scale visual recognition methods, and in particular deep Convolutional Neural Networks (CNN), promise to revolutionize many computer vision based artificial intelligence applications, such as autonomous driving and online image retrieval systems. One of the main challenges in large scale visual recognition is the complexity of the corresponding algorithms. This is further exacerbated by the fact that in most real-world scenarios they need to run in real time and on platforms that have limited computational resources. This dissertation focuses on improving the efficiency of such large scale visual recognition algorithms from several perspectives. First, to reduce the complexity of large scale classification to sub-linear in the number of classes, a probabilistic label tree framework is proposed. A test sample is classified by traversing the label tree from the root node. Each node in the tree is associated with a probabilistic estimation of all the labels. The tree is learned recursively with iterative maximum likelihood optimization. Compared to the hard label partition proposed previously, the probabilistic framework performs classification more accurately with similar efficiency. Second, we explore the redundancy of parameters in CNNs and employ sparse decomposition to significantly reduce both the number of parameters and the computational complexity. Both inter-channel and inner-channel redundancy are exploited to achieve more than 90% sparsity with an approximately 1% drop in classification accuracy. We also propose an efficient CPU-based sparse matrix multiplication algorithm to reduce the actual running time of CNN models with sparse convolutional kernels. Third, we propose a multi-stage framework based on CNN to achieve better efficiency than a single traditional CNN model. With a combination of a cascade model and the label tree framework, the proposed method divides the input images in both the image space and the label space, and processes each image with the CNN models that are most suitable and efficient. The average complexity of the framework is significantly reduced, while the overall accuracy remains the same as in the single complex model.
- Date Issued
- 2016
- Identifier
- CFE0006472, ucf:51436
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006472
- Title
- Online, Supervised and Unsupervised Action Localization in Videos.
- Creator
-
Soomro, Khurram, Shah, Mubarak, Heinrich, Mark, Hu, Haiyan, Bagci, Ulas, Yun, Hae-Bum, University of Central Florida
- Abstract / Description
-
Action recognition classifies a given video among a set of action labels, whereas action localization determines the location of an action in addition to its class. The overall aim of this dissertation is action localization. Many of the existing action localization approaches exhaustively search (spatially and temporally) for an action in a video. However, as the search space increases with high resolution and longer duration videos, it becomes impractical to use such sliding window techniques. The first part of this dissertation presents an efficient approach for localizing actions by learning contextual relations between different video regions in training. In testing, we use the context information to estimate the probability of each supervoxel belonging to the foreground action and use a Conditional Random Field (CRF) to localize actions. In the above method and in typical approaches to this problem, localization is performed in an offline manner where all the video frames are processed together. This prevents timely localization and prediction of actions/interactions, an important consideration for many tasks including surveillance and human-machine interaction. Therefore, in the second part of this dissertation we propose an online approach to the challenging problem of localization and prediction of actions/interactions in videos. In this approach, we use human poses and superpixels in each frame to train discriminative appearance models and perform online prediction of actions/interactions with a Structural SVM. The above two approaches rely on human supervision in the form of assigning action class labels to videos and annotating actor bounding boxes in each frame of training videos. Therefore, in the third part of this dissertation we address the problem of unsupervised action localization. Given unlabeled videos without annotations, this approach aims at: 1) discovering action classes using a discriminative clustering approach, and 2) localizing actions using a variant of the Knapsack problem.
- Date Issued
- 2017
- Identifier
- CFE0006917, ucf:51685
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006917
- Title
- Visual Scanpath Training for Facial Affect Recognition in a Psychiatric Sample.
- Creator
-
Chan, Chi, Bedwell, Jeffrey, Cassisi, Jeffrey, Sims, Valerie, University of Central Florida
- Abstract / Description
-
Social cognition is essential for functional outcome and quality of life in psychiatric patients. Facial affect recognition (FAR), a domain of social cognition, is impaired in many patients with schizophrenia and bipolar disorder. There is evidence that abnormal visual scanpath patterns may underlie FAR deficits, and metacognitive factors may impact task performance. The present study aimed to develop a brief, individually-administered, computerized training program to normalize scanpath patterns in order to improve FAR in patients with a psychosis history or bipolar I disorder. The program was developed using scanpath data from 19 nonpsychiatric controls (NC) while they completed a FAR task that involved identification of mild or extreme intensity happy, sad, angry, and fearful faces, and a neutral expression. Patients were randomized to a waitlist group (WG; n = 16) or a training group (TG; n = 18). Both patient groups completed a baseline FAR task (T0), the training (or a repeated FAR task as a control for WG; T1), and a post-training FAR task (T2). Patients evaluated their own performance and eyetracking data were recorded. Results indicated that the patient groups did not differ from NC on FAR performance, metacognitive accuracy, or scanpath patterns at T0. TG was compliant with the training program and showed changes in scanpath patterns during T1, but returned to baseline scanpath patterns at T2. WG and TG did not differ at T2 on FAR performance, metacognitive accuracy, or scanpath patterns. Across both patient groups, FAR performance for mild intensity emotions was more sensitive to the effect of time than for extreme intensity emotions. Exploratory analysis showed that at baseline, greater severity of negative symptoms was associated with poorer metacognitive accuracy (i.e., accuracy in their evaluation of their performance). Limitations to the study and future directions are discussed.
- Date Issued
- 2016
- Identifier
- CFE0006280, ucf:51613
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006280
- Title
- Prototype Development in General Purpose Representation and Association Machine Using Communication Theory.
- Creator
-
Li, Huihui, Wei, Lei, Rahnavard, Nazanin, Vosoughi, Azadeh, Da Vitoria Lobo, Niels, Wang, Wei, University of Central Florida
- Abstract / Description
-
The study of biological systems has been an intense research area in neuroscience and cognitive science for decades. The biological human brain is an intelligent system that integrates various types of sensor information and processes them intelligently. Neurons, as activated brain cells, help the brain to make instant and rough decisions. Since the 1950s, researchers have attempted to understand the strategies the biological system employs and eventually to translate them into machine-based algorithms. Modern computers have been developed to meet our need to handle computational tasks which our brains are not capable of performing with precision and speed. However, most existing man-made intelligent systems are designed for specific purposes. Modern computers solve sophisticated problems based on fixed representation and association formats, instead of employing versatile approaches to explore unsolved problems.
Because of the above limitations of conventional machines, the General Purpose Representation and Association Machine (GPRAM) system is proposed to focus on using a versatile approach with hierarchical representation and association structures to make a quick and rough assessment on multiple tasks. Through lessons learned from neuroscience, error control coding and digital communications, a prototype of the GPRAM system employing (7,4) Hamming codes and short Low-Density Parity Check (LDPC) codes is implemented. Types of learning processes are presented, which demonstrate the capability of GPRAM for handling multiple tasks.
Furthermore, a study of recognition of low resolution simple patterns and face images using an Image Processing Unit (IPU) structure for the GPRAM system is presented. The IPU structure consists of a randomly constructed LDPC code, an iterative decoder, a switch and scaling, and decision devices. All the input images have been severely degraded to mimic the Visual Information Variability (VIV) experienced in the human visual system. The numerical results show that 1) the IPU can reliably recognize simple pattern images in different shapes and sizes; 2) the IPU demonstrates excellent multi-class recognition performance for face images with high degradation, with results comparable to those of popular machine learning recognition methods applied to images without any quality degradation; 3) several methods for improving IPU recognition performance are discussed, e.g. designing various detection and power scaling methods, constructing specific LDPC codes with large minimum girth, etc.
Finally, novel methods to optimize M-ary PSK, M-ary DPSK, and dual-ring QAM signaling with non-equal symbol probabilities over AWGN channels are presented. In digital communication systems, MPSK, MDPSK, and dual-ring QAM signaling with equiprobable symbols have been well analyzed and widely used in practice. Inspired by bio-systems, we suggest investigating signaling with non-equiprobable symbols, since bio-systems are highly unlikely to follow the ideal setting and uniform construction of a single type of system. The results show that the optimized systems have lower error probabilities than conventional systems and that the improvements are dramatic. Even though communication systems are used as the testing environment, our final goal is to extend current communication theory to accommodate or better understand bio-neural information processing systems.
- Date Issued
- 2017
- Identifier
- CFE0006758, ucf:51846
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006758
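The record above mentions (7,4) Hamming codes without describing how the GPRAM prototype uses them. For readers unfamiliar with the code itself, here is a minimal sketch of the textbook (7,4) Hamming encoder and single-error-correcting decoder. The bit ordering (parity bits at positions 1, 2 and 4) is the common convention and is an assumption here, not a detail from the dissertation.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(word):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(word)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # 1-based index of the error, 0 if none
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


codeword = hamming74_encode([1, 0, 1, 1])
corrupted = list(codeword)
corrupted[4] ^= 1  # flip one bit in transit
print(hamming74_decode(corrupted))
```

The syndrome bits read directly as the binary index of the flipped position, which is what makes this code a convenient small building block for decoding experiments.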
- Title
- Scene Understanding for Real Time Processing of Queries over Big Data Streaming Video.
- Creator
-
Aved, Alexander, Hua, Kien, Foroosh, Hassan, Zou, Changchun, Ni, Liqiang, University of Central Florida
- Abstract / Description
-
With heightened security concerns across the globe and the increasing need to monitor, preserve and protect infrastructure and public spaces to ensure proper operation, quality assurance and safety, numerous video cameras have been deployed. Accordingly, they also need to be monitored effectively and efficiently. However, relying on human operators to constantly monitor all the video streams is not scalable or cost effective. Humans can become subjective or fatigued and can even exhibit bias, and it is difficult to maintain high levels of vigilance when capturing, searching and recognizing events that occur infrequently or in isolation.
These limitations are addressed in the Live Video Database Management System (LVDBMS), a framework for managing and processing live motion imagery data. It enables rapid development of video surveillance software much like traditional database applications are developed today. Video stream processing applications and ad hoc queries developed in this way are able to "reuse" advanced image processing techniques that have already been developed. This results in lower software development and maintenance costs. Furthermore, the LVDBMS can be intensively tested to ensure consistent quality across all associated video database applications. Its intrinsic privacy framework facilitates a formalized approach to the specification and enforcement of verifiable privacy policies. This is an important step towards enabling a general privacy certification for video surveillance systems by leveraging a standardized privacy specification language.
With the potential to impact many important fields ranging from security and assembly line monitoring to wildlife studies and the environment, the broader impact of this work is clear. The privacy framework protects the general public from abusive use of surveillance technology; success in addressing the "trust" issue will enable many new surveillance-related applications. Although this research focuses on video surveillance, the proposed framework has the potential to support many video-based analytical applications.
- Date Issued
- 2013
- Identifier
- CFE0004648, ucf:49900
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0004648
- Title
- Pen-based Methods For Recognition and Animation of Handwritten Physics Solutions.
- Creator
-
Cheema, Salman, Laviola II, Joseph, Hughes, Charles, Sukthankar, Gita, Hammond, Tracy, University of Central Florida
- Abstract / Description
-
There has been considerable interest in constructing pen-based intelligent tutoring systems due to the natural interaction metaphor and low cognitive load afforded by pen-based interaction. We believe that pen-based intelligent tutoring systems can be further enhanced by integrating animation techniques. In this work, we explore methods for recognizing and animating sketched physics diagrams. Our methodologies enable an Intelligent Tutoring System (ITS) to understand the scenario and requirements posed by a given problem statement and to couple this knowledge with a computational model of the student's handwritten solution. These pieces of information are used to construct meaningful animations and feedback mechanisms that can highlight errors in student solutions. We have constructed a prototype ITS that can recognize mathematics and diagrams in a handwritten solution and infer implicit relationships among diagram elements, mathematics and annotations such as arrows and dotted lines. We use natural language processing to identify the domain of a given problem, and use this information to select one or more of four domain-specific physics simulators to animate the user's sketched diagram. We enable students to use their answers to guide animation behavior and also describe a novel algorithm for checking recognized student solutions. We provide examples of scenarios that can be modeled using our prototype system and discuss the strengths and weaknesses of our current prototype.
Additionally, we present the findings of a user study that aimed to identify animation requirements for physics tutoring systems. We describe a taxonomy for categorizing different types of animations for physics problems and highlight how the taxonomy can be used to define requirements for 50 physics problems chosen from a university textbook. We also present a discussion of 56 handwritten solutions acquired from physics students and describe how suitable animations could be constructed for each of them.
- Date Issued
- 2014
- Identifier
- CFE0005472, ucf:50380
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0005472
- Title
- Taming Wild Faces: Web-Scale, Open-Universe Face Identification in Still and Video Imagery.
- Creator
-
Ortiz, Enrique, Shah, Mubarak, Sukthankar, Rahul, Da Vitoria Lobo, Niels, Wang, Jun, Li, Xin, University of Central Florida
- Abstract / Description
-
With the increasing pervasiveness of digital cameras, the Internet, and social networking, there is a growing need to catalog and analyze large collections of photos and videos. In this dissertation, we explore unconstrained still-image and video-based face recognition in real-world scenarios, e.g. social photo sharing and movie trailers, where people of interest are recognized and all others are ignored. In such a scenario, we must obtain high precision in recognizing the known identities, while accurately rejecting those of no interest.
Recent advancements in face recognition research have seen Sparse Representation-based Classification (SRC) advance to the forefront of competing methods. However, its drawbacks, slow speed and sensitivity to variations in pose, illumination, and occlusion, have hindered its wide-spread applicability. The contributions of this dissertation are three-fold:
1. For still-image data, we propose a novel Linearly Approximated Sparse Representation-based Classification (LASRC) algorithm that uses linear regression to perform sample selection for l1-minimization, thus harnessing the speed of least-squares and the robustness of SRC. On our large dataset collected from Facebook, LASRC performs on par with standard SRC with a speedup of 100-250x.
2. For video, applying the popular l1-minimization for face recognition on a frame-by-frame basis is prohibitively expensive computationally, so we propose a new algorithm, Mean Sequence SRC (MSSRC), that performs video face recognition using a joint optimization leveraging all of the available video data and employing the knowledge that the face track frames belong to the same individual. Employing MSSRC results in a speedup of 5x on average over SRC on a frame-by-frame basis.
3. Finally, we make the observation that MSSRC sometimes assigns inconsistent identities to the same individual in a scene that could be corrected based on their visual similarity. Therefore, we construct a probabilistic affinity graph combining appearance and co-occurrence similarities to model the relationship between face tracks in a video. Using this relationship graph, we employ random walk analysis to propagate strong class predictions among similar face tracks, while dampening weak predictions. Our method results in a performance gain of 15.8% in average precision over using MSSRC alone.
- Date Issued
- 2014
- Identifier
- CFE0005536, ucf:50313
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0005536
- Title
- Time and Space Efficient Techniques for Facial Recognition.
- Creator
-
Alrasheed, Waleed, Mikhael, Wasfy, DeMara, Ronald, Haralambous, Michael, Wei, Lei, Myers, Brent, University of Central Florida
- Abstract / Description
-
In recent years, there has been an increasing interest in face recognition. As a result, many new facial recognition techniques have been introduced. Recent developments in the field of face recognition have led to an increase in the number of available face recognition commercial products. However, face recognition techniques are currently constrained by three main factors: recognition accuracy, computational complexity, and storage requirements. The problem is that most of the current face recognition techniques succeed in improving one or two of these factors at the expense of the others.
In this dissertation, four novel face recognition techniques that improve the storage and computational requirements of face recognition systems are presented and analyzed. Three of the four techniques, namely Quantized/truncated Transform Domain (QTD), Frequency Domain Thresholding and Quantization (FD-TQ), and Normalized Transform Domain (NTD), utilize the Two-dimensional Discrete Cosine Transform (DCT-II), which reduces the dimensionality of facial feature images, thereby reducing the computational complexity. The fourth technique, the Normalized Histogram Intensity (NHI), is based on the pixel intensity histogram of poses' subimages, which reduces the computational complexity and the storage requirements. Various simulation experiments using MATLAB were conducted to test the proposed methods. For the purpose of benchmarking the performance of the proposed methods, the simulation experiments were also performed using current state-of-the-art face recognition techniques, namely Two Dimensional Principal Component Analysis (2DPCA), Two-Directional Two-Dimensional Principal Component Analysis ((2D)²PCA), and Transform Domain Two Dimensional Principal Component Analysis (TD2DPCA). The experiments were applied to the ORL, Yale, and FERET databases.
The experimental results for the proposed techniques confirm that the use of any of the four novel techniques examined in this study results in a significant reduction in computational complexity and storage requirements compared to the state-of-the-art techniques without sacrificing recognition accuracy.
- Date Issued
- 2013
- Identifier
- CFE0005297, ucf:50566
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0005297
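The record above states that three of its techniques rely on the two-dimensional DCT-II to reduce the dimensionality of face images, but the abstract gives no implementation details. The sketch below only illustrates the general idea: an orthonormal 2D DCT-II followed by keeping the low-frequency corner of the coefficient matrix. The truncation rule, the function names, and the `keep` parameter are assumptions for illustration and do not reproduce QTD, FD-TQ, or NTD.

```python
import math


def dct2_1d(x):
    """Orthonormal one-dimensional DCT-II of a sequence of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def dct2_2d(image):
    """Separable 2D DCT-II: transform every row, then every column."""
    rows = [dct2_1d(row) for row in image]
    cols = [dct2_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def low_freq_features(image, keep):
    """Keep only the top-left keep x keep block of DCT coefficients."""
    coeffs = dct2_2d(image)
    return [row[:keep] for row in coeffs[:keep]]


# A tiny 4 x 4 "image"; real face images would be far larger.
image = [[2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2]]
features = low_freq_features(image, keep=2)
print(len(features), len(features[0]))
```

Storing a `keep` x `keep` block in place of the full image is where the storage and computation savings would come from in a scheme of this kind; how many coefficients can be dropped without hurting accuracy is an empirical question the abstract does not quantify.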
- Title
- Analysis of Behaviors in Crowd Videos.
- Creator
-
Mehran, Ramin, Shah, Mubarak, Sukthankar, Gita, Behal, Aman, Tappen, Marshall, Moore, Brian, University of Central Florida
- Abstract / Description
-
In this dissertation, we address the problem of discovery and representation of group activity of humans and objects in a variety of scenarios commonly encountered in vision applications. The overarching goal is to devise a discriminative representation of human motion in social settings, which captures a wide variety of human activities observable in video sequences. Such motion emerges from the collective behavior of individuals and their interactions and is a significant source of information typically employed for applications such as event detection, behavior recognition, and activity recognition. We present new representations of human group motion for static cameras, and propose algorithms for their application to a variety of problems.
We first propose a method to model and learn the scene activity of a crowd using the Social Force Model for the first time in the computer vision community. We present a method to densely estimate the interaction forces between people in a crowd observed by a static camera. Latent Dirichlet Allocation (LDA) is used to learn the model of normal activities over extended periods of time. Randomly selected spatio-temporal volumes of interaction forces are used to learn the model of normal behavior of the scene. The model encodes the latent topics of social interaction forces in the scene for normal behaviors. We classify a short video sequence of n frames as normal or abnormal by using the learnt model. Once a sequence of frames is classified as abnormal, the regions of anomalies in the abnormal frames are localized using the magnitude of interaction forces.
The representation and estimation framework proposed above, however, has a few limitations. The algorithm uses a global estimation of the interaction forces within the crowd. It is therefore incapable of identifying different groups of objects based on motion or behavior in the scene. Although the algorithm is capable of learning normal behavior and detecting abnormality, it is incapable of capturing the dynamics of different behaviors.
To overcome these limitations, we then propose a method based on the Lagrangian framework for fluid dynamics, by introducing a streakline representation of flow. Streaklines are traced in a fluid flow by injecting color material, such as smoke or dye, which is transported with the flow and used for visualization. In the context of computer vision, streaklines may be used in a similar way to transport information about a scene, and they are obtained by repeatedly initializing a fixed grid of particles at each frame, then moving both current and past particles using optical flow. Streaklines are the locus of points that connect particles which originated from the same initial position.
This approach is advantageous over the previous representations in two aspects: first, its rich representation captures the dynamics of the crowd and changes in space and time in the scene where the optical flow representation is not enough, and second, this model is capable of discovering groups of similar behavior within a crowd scene by performing motion segmentation. We propose a method to distinguish different group behaviors such as divergent/convergent motion and lanes using this framework. Finally, we introduce flow potentials as a discriminative feature to recognize crowd behaviors in a scene. Results of extensive experiments are presented for multiple real life crowd sequences involving pedestrian and vehicular traffic.
The proposed method exploits optical flow as the low level feature and performs integration and clustering to obtain coherent group motion patterns. However, we observe that in crowd video sequences, as well as a variety of other vision applications, the co-occurrence and inter-relation of motion patterns are the main characteristics of group behaviors. In other words, the group behavior of objects is a mixture of individual actions or behaviors in a specific geometrical layout and temporal order.
We therefore propose a new representation for group behaviors of humans using the inter-relation of motion patterns in a scene. The representation is based on a bag of visual phrases of spatio-temporal visual words. We present a method to match the high-order spatial layout of visual words that preserves the geometry of the visual words under similarity transformations. To perform the experiments we collected a dataset of group choreography performances from the YouTube website. The dataset currently contains four categories of group dances.
- Date Issued
- 2011
- Identifier
- CFE0004482, ucf:49317
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0004482
- Title
- Reproductive life history and signal evolution in a multi-species assemblage of electric fish.
- Creator
-
Waddell, Joseph, Crampton, William, Fedorka, Kenneth, Quintana-Ascencio, Pedro, Stoddard, Philip, University of Central Florida
- Abstract / Description
-
Animals that co-occur in sympatry with multiple closely-related species use reproductive mate attraction signals not only to assess the quality of a potential conspecific mate (sexual selection), but also to discriminate conspecifics from heterospecifics (species recognition). However, the extent to which sexual selection and species recognition may interact, or even conflict, is poorly known. Neotropical electric fish offer unrivaled opportunities for understanding this problem. They generate simple, stereotyped mate attraction signals that are easy to record and quantify, and that are well-understood from the neurobiological perspective. Additionally, they live in electrically-crowded environments, where multiple congeners live and reproduce in close proximity. This dissertation reports an investigation of electric signal diversity and reproductive life history in a nine-species assemblage of the electric fish genus Brachyhypopomus from the upper Amazon. A year-long quantitative sampling program yielded a library of electric signal recordings from more than 3,000 individuals and an accompanying collection of preserved specimens from which suites of informative life history traits were measured. These data were used to understand basic reproductive biology, and to describe sexually dimorphic and interspecific diversity in electric signals. By integrating approaches from ecology, physiology, and evolutionary biology, novel perspectives are provided on: 1. how sexual selection and species recognition interact to shape signal diversity and the occupation of signal space in multi-species animal communities; 2. how extreme seasonal variation in Amazonian ecosystems influences trade-offs in the allocation of reproductive resources, including mate attraction signals; and 3. how environmental variation shapes general life-history traits in a diverse tropical animal assemblage.
- Date Issued
- 2017
- Identifier
- CFE0006925, ucf:51689
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006925
- Title
- Context-Centric Affect Recognition From Paralinguistic Features of Speech.
- Creator
-
Marpaung, Andreas, Gonzalez, Avelino, DeMara, Ronald, Sukthankar, Gita, Wu, Annie, Lisetti, Christine, University of Central Florida
- Abstract / Description
-
As the field of affect recognition has progressed, many researchers have shifted from unimodal approaches to multimodal ones. In particular, the trend in the paralinguistic speech affect recognition domain has been to integrate other modalities such as facial expression, body posture, gait, and linguistic speech. Our work focuses on integrating contextual knowledge into paralinguistic speech affect recognition. We hypothesize that a framework to recognize affect through paralinguistic features of speech can improve its performance by integrating relevant contextual knowledge. This dissertation describes our research to integrate contextual knowledge into the paralinguistic affect recognition process from acoustic features of speech. We conceived, built, and tested a two-phased system called the Context-Based Paralinguistic Affect Recognition System (CxBPARS). The first phase of this system is context-free and uses an AdaBoost classifier that applies data on the acoustic pitch, jitter, shimmer, Harmonics-to-Noise Ratio (HNR), and Noise-to-Harmonics Ratio (NHR) to make an initial judgment about the emotion most likely exhibited by the human elicitor. The second phase then adds context modeling to improve upon the context-free classifications from phase I. CxBPARS was inspired by a human subject study performed as part of this work, in which test subjects were asked to classify an elicitor's emotion strictly from paralinguistic sounds and were then provided with contextual information to improve their selections. CxBPARS was rigorously tested and found, in the worst case, to improve the success rate from the state-of-the-art 42% to 53%.
- Date Issued
- 2019
- Identifier
- CFE0007836, ucf:52831
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007836
- Title
- EXPLOITING OPPONENT MODELING FOR LEARNING IN MULTI-AGENT ADVERSARIAL GAMES.
- Creator
-
Laviers, Kennard, Sukthankar, Gita, University of Central Florida
- Abstract / Description
-
An issue with learning effective policies in multi-agent adversarial games is that the size of the search space can be prohibitively large when the actions of both teammates and opponents are considered simultaneously. Opponent modeling, predicting an opponent's actions in advance of execution, is one approach for selecting actions in adversarial settings, but it is often performed in an ad hoc way. In this dissertation, we introduce several methods for using opponent modeling, in the form of predictions about the players' physical movements, to learn team policies. To explore the problem of decision-making in multi-agent adversarial scenarios, we use our approach for both offline play generation and real-time team response in the Rush 2008 American football simulator. Simultaneously predicting the movement trajectories, future reward, and play strategies of multiple players in real-time is a daunting task, but we illustrate how it is possible to divide and conquer this problem with an assortment of data-driven models. By leveraging spatio-temporal traces of player movements, we learn discriminative models of defensive play for opponent modeling. With the reward information from previous play matchups, we use a modified version of UCT (Upper Confidence bounds applied to Trees) to create new offensive plays and to learn play repairs to counter predicted opponent actions. In team games, players must coordinate effectively to accomplish tasks while foiling their opponents either in a preplanned or emergent manner. An effective team policy must generate the necessary coordination, yet considering all possibilities for creating coordinating subgroups is computationally infeasible. Automatically identifying and preserving the coordination between key subgroups of teammates can make search more productive by pruning policies that disrupt these relationships. 
We demonstrate that combining opponent modeling with automatic subgroup identification can be used to create team policies with a higher average yardage than either the baseline game or domain-specific heuristics.
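For readers unfamiliar with UCT, its child-selection step can be sketched as below. Note that this is the standard UCB1 rule, not the modified version the dissertation describes; the function names, the play labels, and the exploration constant `c` are illustrative assumptions.

```python
import math


def ucb1_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """Standard UCB1 value: mean reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    mean = total_reward / visits
    return mean + c * math.sqrt(math.log(parent_visits) / visits)


def select_child(children, c=math.sqrt(2)):
    """Pick the child with the highest UCB1 score.

    children: dict mapping a (hypothetical) play label to
              (total_reward, visit_count).
    """
    parent_visits = sum(visits for _, visits in children.values())
    return max(
        children,
        key=lambda name: ucb1_score(children[name][0], children[name][1],
                                    parent_visits, c),
    )


if __name__ == "__main__":
    stats = {"sweep": (6.0, 10), "slant": (3.0, 4), "screen": (0.0, 0)}
    print(select_child(stats))  # the unvisited play is explored first
```

Setting `c` to zero turns the rule into pure exploitation of the best average reward, which is a quick way to see what the exploration term contributes.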
- Date Issued
- 2011
- Identifier
- CFE0003914, ucf:48720
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0003914
- Title
- High Performance Techniques for Face Recognition.
- Creator
-
Aldhahab, Ahmed, Mikhael, Wasfy, Atia, George, Jones, W Linwood, Wei, Lei, Elshennawy, Ahmad, University of Central Florida
- Abstract / Description
-
The identification of individuals using face recognition techniques is a challenging task. This is due to the variations resulting from facial expressions, makeup, rotations, illuminations, gestures, etc. Also, facial images contain a great deal of redundant information, which negatively affects the performance of the recognition system. The dimensionality and the redundancy of the facial features have a direct effect on the face recognition accuracy. Not all the features in the feature vector space are useful. For example, non-discriminating features in the feature vector space not only degrade the recognition accuracy but also increase the computational complexity. In the fields of computer vision, pattern recognition, and image processing, face recognition has become a popular research topic. This is due to its widespread applications in security and control, which allow the identified individual to access secure areas, personal information, etc. The performance of any recognition system depends on three factors: 1) the storage requirements, 2) the computational complexity, and 3) the recognition rates. Two different recognition system families are presented and developed in this dissertation. Each family consists of several face recognition systems. Each system contains three main steps, namely, preprocessing, feature extraction, and classification. Several preprocessing steps, such as cropping, facial detection, dividing the facial image into sub-images, etc., are applied to the facial images. This reduces the effect of the irrelevant information (background) and improves the system performance. In this dissertation, either a Neural Network (NN) based classifier or Euclidean distance is used for classification purposes. Five widely used databases, namely, ORL, YALE, FERET, FEI, and LFW, each containing different facial variations, such as light condition, rotations, facial expressions, facial details, etc., are used to evaluate the proposed systems. 
The experimental results of the proposed systems are analyzed using K-fold Cross Validation (CV). In family-1, several systems are proposed for face recognition. Each system employs different integrated tools in the feature extraction step. These tools, Two Dimensional Discrete Multiwavelet Transform (2D DMWT), 2D Radon Transform (2D RT), 2D or 3D DWT, and Fast Independent Component Analysis (FastICA), are applied to the processed facial images to reduce the dimensionality and to obtain discriminating features. Each proposed system produces a unique representation, and achieves lower storage requirements and better performance than the existing methods. For further facial compression, there are three face recognition systems in the second family. Each system uses different integrated tools to obtain better facial representation. The integrated tools, Vector Quantization (VQ), Discrete Cosine Transform (DCT), and 2D DWT, are applied to the facial images for further facial compression and better facial representation. In the systems using the tools VQ/2D DCT and VQ/2D DWT, each pose in the databases is represented by one centroid with 4*4*16 dimensions. In the third system, VQ/Facial Part Detection (FPD), each person in the databases is represented by four centroids with 4*Centroids (4*4*16) dimensions. The systems in family-2 are proposed to further reduce the dimensions of the data compared to the systems in family-1 while attaining comparable results. For example, in family-1, the integrated tools, FastICA/2D DMWT, applied to different combinations of sub-images in the FERET database with K-fold=5 (9 different poses used in the training mode), reduce the dimensions of the database by 97.22% and achieve 99% accuracy. In contrast, the integrated tools, VQ/FPD, in family-2 reduce the dimensions of the data by 99.31% and achieve 97.98% accuracy. 
In this example, the integrated tools, VQ/FPD, achieved greater data compression but lower accuracy than the FastICA/2D DMWT tools. Various experiments and simulations are performed using MATLAB. The experimental results of both families confirm the improvements in the storage requirements, as well as the recognition rates, as compared to some recently reported methods.
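The Euclidean-distance classification step mentioned in this abstract can be illustrated with a minimal nearest-neighbor sketch. It assumes feature vectors have already been extracted; `nearest_identity`, the gallery layout, and the toy two-dimensional vectors are hypothetical and are not taken from the dissertation, which was implemented in MATLAB.

```python
import math


def nearest_identity(query, gallery):
    """Return the identity whose stored feature vector is closest to `query`.

    gallery: list of (identity, feature_vector) pairs; in a real system the
             vectors would come from the feature-extraction step.
    """
    if not gallery:
        raise ValueError("gallery must contain at least one enrolled vector")
    best_identity, _ = min(gallery, key=lambda item: math.dist(query, item[1]))
    return best_identity


if __name__ == "__main__":
    # Toy 2-D features purely for illustration.
    enrolled = [("person_a", [0.0, 0.0]), ("person_b", [3.0, 4.0])]
    print(nearest_identity([2.5, 3.5], enrolled))
```

Shorter feature vectors make both storage and each distance computation cheaper, which is the trade-off against accuracy that the abstract reports for the two families.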
- Date Issued
- 2017
- Identifier
- CFE0006709, ucf:51878
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006709
- Title
- Detecting, Tracking, and Recognizing Activities in Aerial Video.
- Creator
-
Reilly, Vladimir, Shah, Mubarak, Georgiopoulos, Michael, Stanley, Kenneth, Dogariu, Aristide, University of Central Florida
- Abstract / Description
-
In this dissertation we address the problem of detecting humans and vehicles, tracking their identities in crowded scenes, and finally determining human activities. First, we tackle the problem of detecting moving as well as stationary objects in scenes that contain parallax and shadows. We constrain the search of pedestrians and vehicles by representing them as shadow casting out of plane (SCOOP) objects. Next, we propose a novel method for tracking a large number of densely moving objects in aerial video. We divide the scene into grid cells to define a set of local scene constraints, which we use as part of the matching cost function to solve the tracking problem; this allows us to track fast-moving objects in low frame rate videos. Finally, we propose a method for recognizing human actions from few examples. We use the bag of words action representation, assume that most of the classes have many examples, and construct Support Vector Machine models for each class. We then use Support Vector Machines for classes with many examples to improve the decision function of the Support Vector Machine that was trained using few examples via late fusion of weighted decision values.
- Date Issued
- 2012
- Identifier
- CFE0004627, ucf:49935
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0004627
- Title
- Cost-Sensitive Learning-based Methods for Imbalanced Classification Problems with Applications.
- Creator
-
Razzaghi, Talayeh, Xanthopoulos, Petros, Karwowski, Waldemar, Pazour, Jennifer, Mikusinski, Piotr, University of Central Florida
- Abstract / Description
-
Analysis and predictive modeling of massive datasets is an extremely significant problem that arises in many practical applications. The task of predictive modeling becomes even more challenging when data are imperfect or uncertain. Real data are frequently affected by outliers, uncertain labels, and uneven distribution of classes (imbalanced data). Such uncertainties create bias and make predictive modeling an even more difficult task. In the present work, we introduce a cost-sensitive learning method (CSL) to deal with the classification of imperfect data. Typically, most traditional approaches for classification demonstrate poor performance in an environment with imperfect data. We propose the use of CSL with Support Vector Machine, which is a well-known data mining algorithm. The results reveal that the proposed algorithm produces more accurate classifiers and is more robust with respect to imperfect data. Furthermore, we explore the best performance measures to tackle imperfect data along with addressing real problems in quality control and business analytics.
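One common way to make a Support Vector Machine cost-sensitive is to weight each sample's hinge loss by a per-class cost. The sketch below illustrates that general idea only; the abstract does not state which cost scheme the dissertation uses, so the inverse-frequency costs and both function names here are assumptions.

```python
from collections import Counter


def balanced_class_costs(labels):
    """Assumed cost scheme: cost_c = n_samples / (n_classes * count_c)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def weighted_hinge_loss(scores, labels, costs):
    """Mean hinge loss with per-class misclassification costs.

    scores: signed decision values f(x); labels: -1 or +1.
    """
    total = sum(costs[y] * max(0.0, 1.0 - y * s) for s, y in zip(scores, labels))
    return total / len(labels)


if __name__ == "__main__":
    y = [1, -1, -1, -1]               # the positive class is the minority
    costs = balanced_class_costs(y)   # minority errors cost more
    print(costs, weighted_hinge_loss([0.0] * 4, y, costs))
```

With these costs, an error on the single minority sample weighs as much as errors on all three majority samples combined, which counteracts the bias toward the larger class.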
- Date Issued
- 2014
- Identifier
- CFE0005542, ucf:50298
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0005542
- Title
- ATHEISTS, DEVILS, AND COMMUNISTS: COGNITIVE MAPPING OF ATTITUDES AND STEREOTYPES OF ATHEISTS.
- Creator
-
Najle, Maxine, Sims, Valerie, University of Central Florida
- Abstract / Description
-
Negative attitudes towards atheists are hardly a new trend in our society. However, given the pervasiveness of the prejudices and the lack of foundation for them, it seems warranted to explore the underlying elements of these attitudes. Identifying these constitutive elements may help pick apart the different contributing factors and perhaps mitigate or at least understand them in the future. The present study was designed to identify which myths or stereotypes about atheists are most influential in these attitudes. A Lexical Decision Task was utilized to identify which words related to popular stereotypes are most related to the label atheists. The labels Atheists, Christians, and Students were compared to positive words, negative words, words of interest, neutral words, and non-word strings. Analyses revealed no significant differences among the participants' reaction times in these various comparisons, regardless of religion, level of belief in god, level of spirituality, or being acquainted with atheists. Possible explanations for these results are discussed in this thesis.
- Date Issued
- 2012
- Identifier
- CFH0004318, ucf:45041
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFH0004318
- Title
- MODELING SCENES AND HUMAN ACTIVITIES IN VIDEOS.
- Creator
-
Basharat, Arslan, Shah, Mubarak, University of Central Florida
- Abstract / Description
-
In this dissertation, we address the problem of understanding human activities in videos by developing a two-pronged approach: coarse level modeling of scene activities and fine level modeling of individual activities. At the coarse level, where the resolution of the video is low, we rely on person tracks. At the fine level, richer features are available to identify different parts of the human body; therefore, we rely on the body joint tracks. There are three main goals of this dissertation: (1) identify unusual activities at the coarse level, (2) recognize different activities at the fine level, and (3) predict the behavior for synthesizing and tracking activities at the fine level. The first goal is addressed by modeling activities at the coarse level through two novel and complementary approaches. The first approach learns the behavior of individuals by capturing the patterns of motion and size of objects in a compact model. The probability density function (pdf) at each pixel is modeled as a multivariate Gaussian Mixture Model (GMM), which is learnt using unsupervised expectation maximization (EM). In contrast, the second approach learns the interaction of object pairs concurrently present in the scene. This can be useful in detecting more complex activities than those modeled by the first approach. We use a 14-dimensional Kernel Density Estimation (KDE) that captures motion and size of concurrently tracked objects. The proposed models have been successfully used to automatically detect activities like unusual person drop-off and pickup, jaywalking, etc. The second and third goals of modeling human activities at the fine level are addressed by employing concepts from theory of chaos and non-linear dynamical systems. We show that the proposed model is useful for recognition and prediction of the underlying dynamics of human activities. We treat the trajectories of human body joints as the observed time series generated from an underlying dynamical system. 
The observed data is used to reconstruct a phase (or state) space of appropriate dimension by employing the delay-embedding technique. This transformation is performed without assuming an exact model of the underlying dynamics and provides a characteristic representation that will prove to be vital for recognition and prediction tasks. For recognition, properties of phase space are captured in terms of dynamical and metric invariants, which include the Lyapunov exponent, correlation integral, and correlation dimension. A composite feature vector containing these invariants represents the action and will be used for classification. For prediction, kernel regression is used in the phase space to compute predictions with a specified initial condition. This approach has the advantage of modeling dynamics without making any assumptions about the exact form (polynomial, radial basis, etc.) of the mapping function. We demonstrate the utility of these predictions for human activity synthesis and tracking.
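The delay-embedding step used here to reconstruct a phase space from a scalar time series can be sketched as follows. The embedding dimension `dim` and delay `tau` below are arbitrary illustrative values, and `delay_embed` is a hypothetical helper; the abstract does not say how the dissertation selects these parameters.

```python
def delay_embed(series, dim, tau):
    """Map a scalar series to delay vectors (x[t], x[t+tau], ..., x[t+(dim-1)*tau])."""
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series is too short for this dim and tau")
    return [
        tuple(series[t + j * tau] for j in range(dim))
        for t in range(n_vectors)
    ]


if __name__ == "__main__":
    # For example, one coordinate of a body-joint trajectory sampled over time.
    joint_x = [0, 1, 2, 3, 4, 5]
    print(delay_embed(joint_x, dim=3, tau=2))
```

Each tuple is one point in the reconstructed phase space; invariants such as the correlation dimension, or a kernel regressor for prediction, would then be computed over these points.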
- Date Issued
- 2009
- Identifier
- CFE0002897, ucf:48042
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0002897
- Title
- Data-Driven Simulation Modeling of Construction and Infrastructure Operations Using Process Knowledge Discovery.
- Creator
-
Akhavian, Reza, Behzadan, Amir, Oloufa, Amr, Yun, Hae-Bum, Sukthankar, Gita, Zheng, Qipeng, University of Central Florida
- Abstract / Description
-
Within the architecture, engineering, and construction (AEC) domain, simulation modeling is mainly used to facilitate decision-making by enabling the assessment of different operational plans and resource arrangements that are otherwise difficult (if not impossible), expensive, or time-consuming to evaluate in real world settings. The accuracy of such models directly affects their reliability to serve as a basis for important decisions such as project completion time estimation and resource allocation. Compared to other industries, this is particularly important in construction and infrastructure projects due to the high resource costs and the societal impacts of these projects. Discrete event simulation (DES) is a decision making tool that can benefit the process of design, control, and management of construction operations. Despite recent advancements, most DES models used in construction are created during the early planning and design stage when the lack of factual information from the project prohibits the use of realistic data in simulation modeling. The resulting models, therefore, are often built using rigid (subjective) assumptions and design parameters (e.g. precedence logic, activity durations). In all such cases and in the absence of an inclusive methodology to incorporate real field data as the project evolves, modelers rely on information from previous projects (a.k.a. secondary data), expert judgments, and subjective assumptions to generate simulations to predict future performance. These and similar shortcomings have to a large extent limited the use of traditional DES tools to preliminary studies and long-term planning of construction projects. In the realm of business process management, process mining as a relatively new research domain seeks to automatically discover a process model by observing activity records and extracting information about processes. The research presented in this Ph.D. 
Dissertation was in part inspired by the prospect of construction process mining using sensory data collected from field agents. This enabled the extraction of operational knowledge necessary to generate and maintain the fidelity of simulation models. A preliminary study was conducted to demonstrate the feasibility and applicability of data-driven knowledge-based simulation modeling with focus on data collection using wireless sensor network (WSN) and rule-based taxonomy of activities. The resulting knowledge-based simulation models performed very well in properly predicting key performance measures of real construction systems. Next, a pervasive mobile data collection and mining technique was adopted and an activity recognition framework for construction equipment and worker tasks was developed. Data was collected using smartphone accelerometers and gyroscopes from construction entities to generate significant statistical time- and frequency-domain features. The extracted features served as the input of different types of machine learning algorithms that were applied to various construction activities. The trained predictive algorithms were then used to extract activity durations and calculate probability distributions to be fused into corresponding DES models. Results indicated that the generated data-driven knowledge-based simulation models outperform static models created based upon engineering assumptions and estimations with regard to compatibility of performance measure outputs to reality.
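The feature-extraction stage described above (windowed accelerometer or gyroscope readings turned into time- and frequency-domain features) might look roughly like the sketch below. The particular features (mean, standard deviation, dominant DFT bin), the window size, and the step are assumptions chosen for illustration; the abstract does not list the features actually used.

```python
import cmath
import statistics


def sliding_windows(signal, size, step):
    """Split a 1-D signal into fixed-size windows advanced by `step` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]


def window_features(samples):
    """Compute simple time- and frequency-domain features for one window."""
    n = len(samples)
    # Naive DFT magnitudes for bins 1..n//2 (the DC bin is skipped).
    magnitudes = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(samples))
        magnitudes.append(abs(coeff))
    dominant_bin = 1 + magnitudes.index(max(magnitudes)) if magnitudes else 0
    return {
        "mean": statistics.fmean(samples),
        "std": statistics.pstdev(samples),
        "dominant_bin": dominant_bin,
    }


if __name__ == "__main__":
    # Synthetic periodic signal standing in for one accelerometer axis.
    axis = [0.0, 1.0, 0.0, -1.0] * 4
    for window in sliding_windows(axis, size=8, step=4):
        print(window_features(window))
```

The resulting feature dictionaries would serve as input rows for whichever classifier is trained to label equipment or worker activities.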
- Date Issued
- 2015
- Identifier
- CFE0006023, ucf:51014
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006023
- Title
- The Role of Occupational Branding in the Professionalization of Technical Communication.
- Creator
-
Thomas, Chelsea, Jones, Dan, Flammia, Madelyn, Bell, Kathleen, University of Central Florida
- Abstract / Description
-
This study investigates the relationship between professional identity and professional status by exploring the quest for professionalization within technical communication. An established professional identity is crucial to an occupation's professionalization process, as it enables members of a given field to create a common sense of being and facilitates a recognizable personal and collective identity. Such recognition is vital to an occupation's rise to professional status, as it creates a distilled image of the ideal practitioner for outsiders and forms the basis upon which claims of expertise may be made. By constructing the meaning surrounding their profession, members are able to portray an image which designates their knowledge as a scarce expertise and their profession as the appropriate source for the services they provide. A lack of professional identity constitutes the primary factor hindering technical communication from realizing the professionalization process, as it prevents the formation of practitioners' common sense of being, promotes the absence of identifiability, and precludes the possibility of recognition by larger society. Without an established professional identity, the field cannot formulate a culturally-relevant perception of its role, claim professional expertise or jurisdiction over its work, or achieve the social and cultural legitimacy necessary to increase its professional status. By implementing processes of occupational branding within the professional project, efforts involving the construction of collective professional identity will increase professional status by enabling a group's management of professional meaning, facilitating the creation of an occupational brand, and assisting in value production.
- Date Issued
- 2016
- Identifier
- CFE0006189, ucf:51137
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006189