Current Search: Pensky, Marianna
- Title
- Visual Saliency Detection and Semantic Segmentation.
- Creator
-
Souly, Nasim; Shah, Mubarak; Bagci, Ulas; Qi, GuoJun; Pensky, Marianna; University of Central Florida
- Abstract / Description
-
Visual saliency is the ability to select the most relevant data in a scene and thereby reduce the amount of data that needs to be processed. We propose a novel unsupervised approach to detect visual saliency in videos. For this, we employ a hierarchical segmentation technique to obtain supervoxels of a video and, simultaneously, build a dictionary from cuboids of the video. We then create a feature matrix from the coefficients of the dictionary elements. Next, we decompose this matrix into sparse and redundant parts and obtain salient regions using group lasso. Our experiments provide promising results in terms of predicting eye movement. Moreover, we apply our method to the action recognition task and achieve better results. Whereas saliency detection only highlights important regions, semantic segmentation aims to assign a semantic label to each pixel in the image. Even though semantic segmentation can be achieved by simply applying classifiers to each pixel or region, the results may not be desirable because general context information is not considered. To address this issue, we propose two supervised methods. The first is an approach to discover interactions between labels and regions using a sparse estimate of the precision matrix obtained by graphical lasso. The second is a knowledge-based method that incorporates dependencies among regions in the image during inference. High-level knowledge rules, such as co-occurrence, are extracted from training data and transformed into constraints in an Integer Programming formulation. A difficulty in most supervised semantic segmentation approaches is the lack of sufficient training data. To address this, we present a semi-supervised learning approach that exploits the plentiful unlabeled images available, as well as synthetic images generated via Generative Adversarial Networks (GAN). Furthermore, an extension of the proposed model that uses additional weakly labeled data is proposed. We demonstrate our approaches on three challenging benchmark datasets.
- Date Issued
- 2017
- Identifier
- CFE0006918, ucf:51694
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006918
- Title
- Analysis of Employment and Earnings Using Varying Coefficient Models to Assess Success of Minorities and Women.
- Creator
-
Goedeker, Amanda; Pensky, Marianna; Song, Zixia; Swanson, Jason; Huang, Hsin-Hsiung; University of Central Florida
- Abstract / Description
-
The objective of this thesis is to examine the success of minorities (black and Hispanic/Latino employees) and women in the United States workforce, defining success by employment percentage and earnings. The goal is to study the impact that gender, race, the passage of time, and national economic status, as reflected in gross domestic product, have on the success of minorities and women. In particular, this thesis considers the impact of these factors in Science, Technology, Engineering and Math (STEM) industries. Varying coefficient models are used to analyze data sets for national employment percentages and earnings.
- Date Issued
- 2016
- Identifier
- CFE0006458, ucf:51425
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006458
- Title
- Improving Efficiency in Deep Learning for Large Scale Visual Recognition.
- Creator
-
Liu, Baoyuan; Foroosh, Hassan; Qi, GuoJun; Welch, Gregory; Sukthankar, Rahul; Pensky, Marianna; University of Central Florida
- Abstract / Description
-
Recent large-scale visual recognition methods, and in particular deep Convolutional Neural Networks (CNN), promise to revolutionize many computer-vision-based artificial intelligence applications, such as autonomous driving and online image retrieval systems. One of the main challenges in large-scale visual recognition is the complexity of the corresponding algorithms. This is further exacerbated by the fact that in most real-world scenarios they need to run in real time and on platforms that have limited computational resources. This dissertation focuses on improving the efficiency of such large-scale visual recognition algorithms from several perspectives. First, to reduce the complexity of large-scale classification to sub-linear in the number of classes, a probabilistic label tree framework is proposed. A test sample is classified by traversing the label tree from the root node. Each node in the tree is associated with a probabilistic estimate of all the labels. The tree is learned recursively with iterative maximum likelihood optimization. Compared to the hard label partition proposed previously, the probabilistic framework performs classification more accurately with similar efficiency. Second, we explore the redundancy of parameters in Convolutional Neural Networks (CNN) and employ sparse decomposition to significantly reduce both the number of parameters and the computational complexity. Both inter-channel and inner-channel redundancy are exploited to achieve more than 90% sparsity with an approximately 1% drop in classification accuracy. We also propose an efficient CPU-based sparse matrix multiplication algorithm to reduce the actual running time of CNN models with sparse convolutional kernels. Third, we propose a multi-stage framework based on CNN to achieve better efficiency than a single traditional CNN model. With a combination of a cascade model and the label tree framework, the proposed method divides the input images in both the image space and the label space, and processes each image with the CNN models that are most suitable and efficient. The average complexity of the framework is significantly reduced, while the overall accuracy remains the same as that of the single complex model.
- Date Issued
- 2016
- Identifier
- CFE0006472, ucf:51436
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006472
- Title
- Confluence of Vision and Natural Language Processing for Cross-media Semantic Relations Extraction.
- Creator
-
Tariq, Amara; Foroosh, Hassan; Qi, GuoJun; Gonzalez, Avelino; Pensky, Marianna; University of Central Florida
- Abstract / Description
-
In this dissertation, we focus on extracting and understanding semantically meaningful relationships between data items of various modalities, especially relations between images and natural language. We explore ideas and techniques to integrate such cross-media semantic relations for machine understanding of large heterogeneous datasets, made available through the expansion of the World Wide Web. The datasets collected from social media websites, news media outlets and blogging platforms usually contain multiple modalities of data. Intelligent systems are needed to automatically make sense of these datasets and present them in such a way that humans can find the relevant pieces of information or get a summary of the available material. Such systems have to process multiple modalities of data, such as images, text, linguistic features, and structured data, in reference to each other. For example, image and video search and retrieval engines are required to understand the relations between visual and textual data so that they can provide relevant answers, in the form of images and videos, to users' queries presented in the form of text. We emphasize the automatic extraction of semantic topics or concepts from data available in any form, such as images, free-flowing text or metadata. These semantic concepts/topics become the basis of semantic relations across heterogeneous data types, e.g., visual and textual data. A classic problem involving image-text relations is the automatic generation of textual descriptions of images. This problem is the main focus of our work. In many cases, a large amount of text is associated with images. Deep exploration of the linguistic features of such text is required to fully utilize the semantic information encoded in it. A news dataset involving images and news articles is an example of this scenario. We devise frameworks for automatic news image description generation based on the semantic relations of images, as well as semantic understanding of the linguistic features of the news articles.
- Date Issued
- 2016
- Identifier
- CFE0006507, ucf:51401
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006507