Current Search: Zhang, Qi
-
-
Title
-
Learning Robust Sequence Features via Dynamic Temporal Pattern Discovery.
-
Creator
-
Hu, Hao, Wang, Liqiang, Zhang, Shaojie, Liu, Fei, Qi, GuoJun, Zhou, Qun, University of Central Florida
-
Abstract / Description
-
As a major type of data, time series possess invaluable latent knowledge for describing the real world and human society. In order to improve the ability of intelligent systems to understand the world and people, it is critical to design sophisticated machine learning algorithms for extracting robust time series features from such latent knowledge. Motivated by the successful applications of deep learning in computer vision, more and more machine learning researchers are turning their attention to applying deep learning techniques to time series data. However, directly employing current deep models in most time series domains could be problematic. A major reason is that the types of temporal patterns that current deep models target are very limited and cannot meet the requirement of modeling the different underlying patterns of data coming from various sources. In this study we address this problem by designing different network structures explicitly based on specific domain knowledge, so that we can extract features via the most salient temporal patterns. More specifically, we mainly focus on two types of temporal patterns: order patterns and frequency patterns. For order patterns, which are usually related to brain and human activities, we design a hashing-based neural network layer to globally encode the ordinal pattern information into the resultant features. It is further generalized into a specially designed Recurrent Neural Network (RNN) cell which can learn order patterns in an online fashion. On the other hand, we believe audio-related data such as music and speech can benefit from modeling frequency patterns. Thus, we do so by developing two types of RNN cells. The first type tries to directly learn long-term dependencies in the frequency domain rather than the time domain. The second one aims to dynamically filter out the "noise" frequencies based on temporal contexts.
By proposing various deep models based on different domain knowledge and evaluating them on extensive time series tasks, we hope this work can provide inspiration for others and increase the community's interest in the problem of applying deep learning techniques to more time series tasks.
-
Date Issued
-
2019
-
Identifier
-
CFE0007470, ucf:52679
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0007470
-
-
Title
-
Global Data Association for Multiple Pedestrian Tracking.
-
Creator
-
Dehghan, Afshin, Shah, Mubarak, Qi, GuoJun, Bagci, Ulas, Zhang, Shaojie, Zheng, Qipeng, University of Central Florida
-
Abstract / Description
-
Multi-object tracking is one of the fundamental problems in computer vision. Almost all multi-object tracking systems consist of two main components: detection and data association. In the detection step, object hypotheses are generated in each frame of a sequence. Later, detections that belong to the same target are linked together to form final trajectories. The latter step is called data association. There are several challenges that render this problem difficult, such as occlusion, background clutter and pose changes. This dissertation aims to address these challenges by tackling the data association component of tracking and contributes three novel methods for solving data association. Firstly, this dissertation will present a new framework for multi-target tracking that uses a novel data association technique using the Generalized Maximum Clique Problem (GMCP) formulation. The majority of current methods, such as bipartite matching, incorporate a limited temporal locality of the sequence into the data association problem. This makes these methods inherently prone to ID-switches and difficulties caused by long-term occlusions, a cluttered background and crowded scenes. On the other hand, our approach incorporates both motion and appearance in a global manner. Unlike limited temporal locality methods, which incorporate only a few frames into the data association problem, this method incorporates the whole temporal span and solves the data association problem for one object at a time. Generalized Minimum Clique Graph (GMCP) is used to solve the optimization problem of our data association method. The proposed method is supported by superior results on several benchmark sequences. GMCP leads us to a more accurate approach to multi-object tracking by considering all the pairwise relationships in a batch of frames; however, it has some limitations. Firstly, it finds target trajectories one by one, missing joint optimization.
Secondly, for optimization we use a greedy solver based on local neighborhood search, making our optimization prone to local minima. Finally, the GMCP tracker is slow, which is a burden when dealing with time-sensitive applications. In order to address these problems, we propose a new graph theoretic problem, called the Generalized Maximum Multi Clique Problem (GMMCP). The GMMCP tracker has all the advantages of the GMCP tracker while addressing its limitations. A solution to GMMCP is presented in which no simplification is assumed in problem formulation or problem optimization. GMMCP is NP-hard, but it can be formulated as a Binary-Integer Program for which the solution to small- and medium-sized tracking problems can be found efficiently. To improve speed, Aggregated Dummy Nodes are used for modeling occlusions and missed detections. This also reduces the size of the input graph without using any heuristics. We show that, using the speed-up method, our tracker lends itself to a real-time implementation, increasing its potential usefulness in many applications. In tests against several tracking datasets, we show that the proposed method outperforms competitive methods. Thus far we have assumed that the number of people does not exceed a few dozen. However, this is not always the case. In many scenarios, such as marathons, political rallies or religious rites, the number of people in a frame may reach a few hundred or even a few thousand. Tracking in high-density crowd sequences is a challenging problem for several reasons. Human detection methods often fail to localize objects correctly in extremely crowded scenes. This limits the use of data association based tracking methods. Additionally, it is hard to extend existing multi-target tracking to track targets in highly crowded scenes, because the large number of targets increases the computational complexity.
Furthermore, the small apparent target size makes it challenging to extract features to discriminate targets from their surroundings. Finally, we present a tracker that addresses the above-mentioned problems. We formulate online crowd tracking as a Binary Quadratic Program, where both detection and data association problems are solved together. Our formulation employs each target's individual information in the form of appearance and motion, as well as contextual cues in the form of neighborhood motion, spatial proximity and grouping constraints. Due to the large number of targets, state-of-the-art commercial quadratic programming solvers fail to efficiently find the solution to the proposed optimization. In order to overcome the computational complexity of available solvers, we propose to use the most recent version of Modified Frank-Wolfe algorithms with SWAP steps. The proposed tracker can track hundreds of targets efficiently and improves state-of-the-art results by a significant margin on high-density crowd sequences.
-
Date Issued
-
2016
-
Identifier
-
CFE0006095, ucf:51201
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0006095
-
-
Title
-
Spatiotemporal Graphs for Object Segmentation and Human Pose Estimation in Videos.
-
Creator
-
Zhang, Dong, Shah, Mubarak, Qi, GuoJun, Bagci, Ulas, Yun, Hae-Bum, University of Central Florida
-
Abstract / Description
-
Images and videos can be naturally represented by graphs, with spatial graphs for images and spatiotemporal graphs for videos. However, for different applications, there are usually different formulations of the graphs, and algorithms for each formulation have different complexities. Therefore, wisely formulating the problem to ensure an accurate and efficient solution is one of the core issues in Computer Vision research. We explore three problems in this domain to demonstrate how to formulate all of these problems in terms of spatiotemporal graphs and obtain good and efficient solutions. The first problem we explore is video object segmentation. The goal is to segment the primary moving objects in the videos. This problem is important for many applications, such as content-based video retrieval, video summarization, activity understanding and targeted content replacement. In our framework, we use object proposals, which are object-like regions obtained by low-level visual cues. Each object proposal has an object-ness score associated with it, which indicates how likely this object proposal is to correspond to an object. The problem is formulated as a directed acyclic graph, in which nodes represent the object proposals and edges represent the spatiotemporal relationship between nodes. A dynamic programming solution is employed to select one object proposal from each video frame, while ensuring their consistency throughout the video frames. Gaussian mixture models (GMMs) are used for modeling the background and foreground, and Markov Random Fields (MRFs) are employed to smooth the pixel-level segmentation. In the above spatiotemporal graph formulation, we consider object segmentation in only a single video. Next, we consider multiple videos and model the video co-segmentation problem as a spatiotemporal graph. The goal here is to simultaneously segment the moving objects from multiple videos and assign common objects the same labels.
The problem is formulated as a regulated maximum clique problem using object proposals. The object proposals are tracked in adjacent frames to generate a pool of candidate tracklets. Then an undirected graph is built with the nodes corresponding to the tracklets from all the videos and edges representing the similarities between the tracklets. A modified Bron-Kerbosch Algorithm is applied to the graph in order to select the prominent objects contained in these videos, and hence relate the segmentation of each object in different videos. In online and surveillance videos, the most important object class is the human. In contrast to generic video object segmentation and co-segmentation, specific knowledge about humans, which is defined by a pose (i.e. human skeleton), can be employed to help the segmentation and tracking of people in the videos. We formulate the problem of human pose estimation in videos using the spatiotemporal graph. In this formulation, the nodes represent different body parts in the video frames and edges represent the spatiotemporal relationship between body parts in adjacent frames. The graph is carefully designed to ensure an exact and efficient solution. The overall objective for the new formulation is to remove the simple cycles from the traditional graph-based formulations. Dynamic programming is employed in different stages of the method to select the best tracklets and human pose configurations.
-
Date Issued
-
2016
-
Identifier
-
CFE0006429, ucf:51488
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0006429
-
-
Title
-
Generation and characterization of sub-70 isolated attosecond pulses.
-
Creator
-
Zhang, Qi, Chang, Zenghu, Delfyett, Peter, Gaume, Romain, Saha, Haripada, University of Central Florida
-
Abstract / Description
-
Dynamics occurring on microscopic scales, such as electronic motion inside atoms and molecules, are governed by quantum mechanics. However, the Schrödinger equation is usually too complicated to solve analytically for systems other than the hydrogen atom. Even for some simple atoms such as helium, it still takes months to do a full numerical analysis. Therefore, practical problems are often solved only after simplification. The results are then compared with the experimental outcome in both the spectral and temporal domain. For accurate experimental comparison, temporal resolution on the attosecond scale is required. This had not been achieved until the first demonstration of the single attosecond pulse in 2001. After this breakthrough, "attophysics" immediately became a hot field in the physics and optics community. While the attosecond pulse has served as an irreplaceable tool in many fundamental research studies of ultrafast dynamics, the pulse generation process itself is an interesting topic in the ultrafast field. When an intense femtosecond laser is tightly focused on a gaseous target, electrons inside the neutral atoms are ripped away through tunneling ionization. Under certain circumstances, the electrons are able to reunite with the parent ions and release photon bursts lasting only tens to hundreds of attoseconds. This process repeats itself every half cycle of the driving pulse, generating a train of single attosecond pulses which lasts longer than one femtosecond. To achieve true temporal resolution on the attosecond time scale, single isolated attosecond pulses are required, meaning only one attosecond pulse can be produced per driving pulse. Up to now, only a few methods have been demonstrated experimentally to generate isolated attosecond pulses. Pioneering work generated single attosecond pulses using a carrier-envelope phase-stabilized 3.3 fs laser pulse, which is out of reach for most research groups.
An alternative method termed polarization gating generated single attosecond pulses with 5 fs driving pulses, which is still difficult to achieve experimentally. Most recently, a new technique termed Double Optical Gating (DOG) was developed in our group to allow the generation of single attosecond pulses with longer driving pulse durations. For example, isolated 150 as pulses were demonstrated with a 25 fs driving laser directly from a commercially available Ti:Sapphire amplifier. Isolated attosecond pulses as short as 107 as had been demonstrated with the DOG scheme before this work. Here, we employ this method to shorten the pulse duration even further, demonstrating world-record isolated 67 as pulses. Optical pulses with attosecond duration are the shortest controllable process up to now and are much faster than the electron response times in any electronic devices. Consequently, it is also a challenge to characterize attosecond pulses experimentally, especially when they feature a broadband spectrum. Similar challenges have previously been met in characterizing femtosecond laser pulses, with many schemes already proposed and well demonstrated experimentally. Similar schemes can be applied in characterizing attosecond pulses with narrow bandwidth. The limitations of these techniques are presented here, and a method recently developed to overcome those limitations is discussed. Finally, several experimental advances toward the characterization of isolated 25 as pulses, a duration of about one atomic unit of time, are discussed briefly.
-
Date Issued
-
2014
-
Identifier
-
CFE0005450, ucf:50375
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0005450