Current Search: semantics
-
-
Title
-
ORTHOGRAPHIC SIMILARITY AND FALSE RECOGNITION FOR UNFAMILIAR WORDS.
-
Creator
-
Perrotte, Jeffrey, Parra-Tatge, Marisol, University of Central Florida
-
Abstract / Description
-
There is evidence of false recognition (FR) driven by orthographic similarities within languages (Lambert, Chang, & Lin, 2001; Raser, 1972) and some evidence that FR crosses languages (Parra, 2013). No study has investigated whether FR based on orthographic similarities occurs for unknown words in an unknown language. This study aimed to answer that question. It further explored whether FR based on orthographic similarities is more likely in a known (English) than in an unknown (Spanish) language. Forty-six English monolinguals participated. They studied 50 English and 50 Spanish words during a study phase. A recognition test, given immediately after the study phase, consisted of 40 Spanish and 40 English words: list words (i.e., words presented at study); homographs (i.e., words not presented at study but orthographically similar to words presented at study); and unrelated words (i.e., words neither presented at study nor orthographically similar to studied words). The LSD post-hoc test showed significant results supporting the hypothesis that false recognition based on orthographic similarities occurs for words in a known language (English) and in an unknown language (Spanish). Further evidence from the LSD post-hoc test supported the hypothesis that false recognition based on orthographic similarities is more likely in a known language than in an unknown language. Results provided evidence that both meaning and orthographic form are used when information is encoded, thereby influencing recognition decisions. Furthermore, these results emphasize the significance of orthography when information is encoded and retrieved. Keywords: false recognition, orthography, semantic, orthographic distinctiveness, semantic distinctiveness, English monolingual
-
Date Issued
-
2015
-
Identifier
-
CFH0004906, ucf:45501
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFH0004906
-
-
Title
-
Confluence of Vision and Natural Language Processing for Cross-media Semantic Relations Extraction.
-
Creator
-
Tariq, Amara, Foroosh, Hassan, Qi, GuoJun, Gonzalez, Avelino, Pensky, Marianna, University of Central Florida
-
Abstract / Description
-
In this dissertation, we focus on extracting and understanding semantically meaningful relationships between data items of various modalities, especially relations between images and natural language. We explore ideas and techniques to integrate such cross-media semantic relations for machine understanding of the large heterogeneous datasets made available through the expansion of the World Wide Web. Datasets collected from social media websites, news media outlets, and blogging platforms usually contain multiple modalities of data. Intelligent systems are needed to automatically make sense of these datasets and present them in such a way that humans can find the relevant pieces of information or get a summary of the available material. Such systems have to process multiple modalities of data, such as images, text, linguistic features, and structured data, in reference to each other. For example, image and video search and retrieval engines must understand the relations between visual and textual data so that they can provide relevant answers, in the form of images and videos, to users' queries presented as text. We emphasize the automatic extraction of semantic topics or concepts from data available in any form, such as images, free-flowing text, or metadata. These semantic concepts/topics become the basis of semantic relations across heterogeneous data types, e.g., visual and textual data. A classic problem involving image-text relations is the automatic generation of textual descriptions of images; this problem is the main focus of our work. In many cases, a large amount of text is associated with images, and deep exploration of the linguistic features of such text is required to fully utilize the semantic information encoded in it. A news dataset involving images and news articles is an example of this scenario. We devise frameworks for automatic news image description generation based on the semantic relations of images, as well as semantic understanding of the linguistic features of the news articles.
-
Date Issued
-
2016
-
Identifier
-
CFE0006507, ucf:51401
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0006507
-
-
Title
-
Automatically Acquiring a Semantic Network of Related Concepts.
-
Creator
-
Szumlanski, Sean, Gomez, Fernando, Wu, Annie, Hughes, Charles, Sims, Valerie, University of Central Florida
-
Abstract / Description
-
We describe the automatic acquisition of a semantic network in which over 7,500 of the most frequently occurring nouns in the English language are linked to their semantically related concepts in the WordNet noun ontology. Relatedness between nouns is discovered automatically from lexical co-occurrence in Wikipedia texts using a novel adaptation of an information-theoretically inspired measure. Our algorithm then capitalizes on salient sense clustering among these semantic associates to automatically disambiguate them to their corresponding WordNet noun senses (i.e., concepts). The resultant concept-to-concept associations, stemming from 7,593 target nouns with 17,104 distinct senses among them, constitute a large-scale semantic network with 208,832 undirected edges between related concepts. Our work can thus be conceived of as augmenting the WordNet noun ontology with RelatedTo links. The network, which we refer to as the Szumlanski-Gomez Network (SGN), has been subjected to a variety of evaluative measures, including manual inspection by human judges and quantitative comparison to gold-standard data for semantic relatedness measurements. We have also evaluated the network's performance in an applied setting on a word sense disambiguation (WSD) task in which the network served as a knowledge source for established graph-based spreading activation algorithms, and have shown: a) the network is competitive with WordNet when used as a stand-alone knowledge source for WSD; b) combining our network with WordNet achieves disambiguation results that exceed the performance of either resource individually; and c) our network outperforms a similar resource, WordNet++ (Ponzetto & Navigli, 2010), that has been automatically derived from annotations in the Wikipedia corpus. Finally, we present a study on human perceptions of relatedness. In our study, we elicited quantitative evaluations of semantic relatedness from human subjects using a variation of the classical methodology that Rubenstein and Goodenough (1965) employed to investigate human perceptions of semantic similarity. Judgments from individual subjects in our study exhibit high average correlation to the elicited relatedness means using leave-one-out sampling (r = 0.77, σ = 0.09, N = 73), although not as high as the average human correlation in previous studies of similarity judgments, for which Resnik (1995) established an upper bound of r = 0.90 (σ = 0.07, N = 10). These results suggest that human perceptions of relatedness are less strictly constrained than evaluations of similarity, and they establish a clearer expectation for what constitutes human-like performance by a computational measure of semantic relatedness. We also contrast the performance of a variety of similarity and relatedness measures on our dataset to their performance on similarity norms, and we introduce our own dataset as a supplementary evaluative standard for relatedness measures.
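The leave-one-out evaluation described above can be sketched in a few lines: each subject's ratings are correlated (Pearson's r) against the mean ratings of the remaining subjects. The toy ratings below are invented for illustration and are not the study's data.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def leave_one_out_correlations(ratings):
    """ratings: one list of relatedness judgments per subject, aligned by
    word pair.  Returns each subject's r against the mean of the others."""
    results = []
    for i, subj in enumerate(ratings):
        others = [r for j, r in enumerate(ratings) if j != i]
        means = [sum(col) / len(col) for col in zip(*others)]
        results.append(pearson_r(subj, means))
    return results

ratings = [                 # toy judgments for four word pairs
    [1.0, 2.0, 3.5, 4.0],   # subject 1
    [1.5, 2.5, 3.0, 4.5],   # subject 2
    [0.5, 2.0, 3.0, 4.0],   # subject 3
]
rs = leave_one_out_correlations(ratings)
avg_r = sum(rs) / len(rs)
```

Averaging the per-subject correlations gives the kind of summary figure the abstract reports (its r = 0.77).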
-
Date Issued
-
2013
-
Identifier
-
CFE0004759, ucf:49767
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0004759
-
-
Title
-
SEMANTIC BIAS AS AN APPLICATION OF THE UNIVERSAL GRAMMAR MODEL IN THE RUSSIAN LANGUAGE.
-
Creator
-
Gural, Iryna, Modianos, Doan T., Villegas, Alvaro, University of Central Florida
-
Abstract / Description
-
The theory of Universal Grammar developed by Chomsky has been known for many years. The main idea behind the theory is that the processing of language does not depend on culture but is universal across languages. Later psycholinguistic studies developed ideas about schematic comprehension of language, giving rise to the notion of the "garden path effect". Research focused on the processing of ambiguous sentences and found a tendency for readers to prefer interpreting specific sentence regions as objects. The current study summarizes these ideas from psycholinguistic research and incorporates a novel language structure to study readers' syntactic preferences. In addition, conducting the study in the Russian language complements previous research in other languages and would argue in favor of the Universal Grammar model if the hypothesis were supported. It was hypothesized that readers would prefer the comparison of two direct objects over subjects, which would be reflected in faster reading times. A self-paced reading task was administered to the participants in order to measure their reading times. The analysis found no significant differences in the reading times for the critical area; thus, the hypothesis was not supported. Possible explanations, limitations, and further directions are discussed.
-
Date Issued
-
2019
-
Identifier
-
CFH2000513, ucf:45697
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFH2000513
-
-
Title
-
Detecting Semantic Method Clones in Java Code using Method IOE-Behavior.
-
Creator
-
Elva, Rochelle, Leavens, Gary, Johnson, Mark, Orooji, Ali, Hughes, Charles, University of Central Florida
-
Abstract / Description
-
The determination of semantic equivalence is an undecidable problem; however, this dissertation shows that a reasonable approximation can be obtained using a combination of static and dynamic analysis. This study investigates the detection of functional duplicates, referred to as semantic method clones (SMCs), in Java code. My algorithm extends the input-output notion of observable behavior, used in related work [1, 2], to include the effects of the method. The latter property refers to the persistent changes to the heap brought about by the execution of the method. To differentiate this from the typical input-output behavior used by other researchers, I have coined the term method IOE-Behavior, meaning input-output and effects behavior [3]. Two methods are defined as semantic method clones if they have identical IOE-Behavior; that is, for the same inputs (actual parameters and initial heap state), they produce the same output (that is, the result for non-void methods, and the final heap state). The detection process consists of two static pre-filters used to identify candidate clone sets, followed by dynamic tests that actually run the candidate methods to determine semantic equivalence. The first filter groups the methods by type. The second filter refines the output of the first, grouping methods by their effects. This algorithm is implemented in my tool JSCTracker, used to automate the SMC detection process. The algorithm and tool are validated using a case study comprising 12 open-source Java projects from different application domains, ranging in size from 2 KLOC (thousand lines of code) to 300 KLOC. The objectives of the case study are posed as four research questions: 1. Can method IOE-Behavior be used in SMC detection? 2. What is the impact of the pre-filters on the efficiency of the algorithm? 3. How does the performance of method IOE-Behavior compare to using only input-output for identifying SMCs? 4. How reliable are the results obtained when method IOE-Behavior is used in SMC detection? Responses to these questions are obtained by checking each software sample with JSCTracker and analyzing the results. The number of SMCs detected ranges from 0 to 45, with an average execution time of 8.5 seconds. The use of the two pre-filters reduces the number of methods that reach the dynamic test phase by an average of 34%. The IOE-Behavior approach takes an average of 0.010 seconds per method, while the input-output approach takes an average of 0.015 seconds. The former also yields an average of 32% false positives, while the SMCs identified using input-output alone include an average of 92% false positives. In terms of reliability, the IOE-Behavior method produces results with an average precision of 68% and an average recall of 76%. These reliability values represent an improvement of over 37% in precision over the values reported in related work [4]. Thus, it is my conclusion that IOE-Behavior can be used to detect SMCs in Java code with reasonable reliability.
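The dynamic-test phase described above can be illustrated with a small sketch (in Python rather than Java, for brevity): two candidate methods count as semantic clones only if, for every test input and initial heap state, they agree on both the returned result and the final heap state. The methods and the dictionary-based heap model below are hypothetical.

```python
import copy

def ioe_equivalent(m1, m2, test_inputs):
    """Treat m1 and m2 as candidate semantic clones only if, for every test
    input and initial heap state, they agree on both the returned result
    (input-output behavior) and the final heap state (effects)."""
    for args, heap in test_inputs:
        h1, h2 = copy.deepcopy(heap), copy.deepcopy(heap)
        if m1(args, h1) != m2(args, h2) or h1 != h2:
            return False
    return True

# Two hypothetical methods with identical IOE-Behavior:
def incr_a(n, heap):
    heap["count"] += n
    return heap["count"]

def incr_b(n, heap):
    heap["count"] = heap["count"] + n
    return heap["count"]

# Same input-output behavior, but a different effect on the heap:
def incr_leaky(n, heap):
    heap["count"] += n
    heap["log"] = True   # extra persistent change: not an IOE clone
    return heap["count"]

tests = [(1, {"count": 0}), (5, {"count": 10})]
```

Here `incr_a` and `incr_b` pass the IOE test, while `incr_leaky` matches on input-output alone but fails on effects, the kind of false positive the abstract says a pure input-output comparison produces.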
-
Date Issued
-
2013
-
Identifier
-
CFE0004835, ucf:49689
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0004835
-
-
Title
-
VIDEO CONTENT EXTRACTION: SCENE SEGMENTATION, LINKING AND ATTENTION DETECTION.
-
Creator
-
Zhai, Yun, Shah, Mubarak, University of Central Florida
-
Abstract / Description
-
In this fast-paced digital age, vast amounts of video are produced every day: movies, TV programs, personal home videos, surveillance video, etc. This places a high demand on effective video data analysis and management techniques. In this dissertation, we have developed new techniques for the segmentation, linking, and understanding of video scenes. First, we have developed a video scene segmentation framework that segments video content into story units. Second, a linking method is designed to find the semantic correlation between video scenes/stories. Finally, to better understand video content, we have developed a spatiotemporal attention detection model for videos. Our general framework for temporal scene segmentation, which is applicable to several video domains, is formulated in a statistical fashion and uses the Markov chain Monte Carlo (MCMC) technique to determine the boundaries between video scenes. In this approach, a set of arbitrary scene boundaries is initialized at random locations and then automatically updated using two types of updates: diffusions and jumps. The posterior probability of the target distribution over the number of scenes and their corresponding boundary locations is computed from the model priors and the data likelihood. Model parameter updates are controlled by the MCMC hypothesis ratio test, and samples are collected to generate the final scene boundaries. The major contribution of the proposed framework is two-fold: (1) it finds weak boundaries as well as strong boundaries, i.e., it does not rely on a fixed threshold; and (2) it can be applied to different video domains. We have tested the proposed method on two video domains, home videos and feature films, and on both have obtained very accurate results: on average, 86% precision and 92% recall for home video segmentation, and 83% precision and 83% recall for feature films.
The video scene segmentation process divides videos into meaningful units. These segments (or stories) can be further organized into clusters based on their content similarities. In the second part of this dissertation, we have developed a novel concept tracking method, which links news stories that focus on the same topic across multiple sources. The semantic linkage between news stories is reflected in the combination of both their visual content and their speech content. Visually, each news story is represented by a set of key frames, which may or may not contain human faces. Facial key frames are linked based on the analysis of extended facial regions, and non-facial key frames are correlated using global matching. The textual similarity of the stories is expressed as the normalized textual similarity between the keywords in the speech content of the stories. The developed framework has also been applied to the task of story ranking, which computes the interestingness of stories. The proposed semantic linking framework and the story ranking method have both been tested on 60 hours of open-benchmark video data (CNN and ABC news) from the TRECVID 2003 evaluation forum organized by NIST. Above 90% system precision has been achieved for the story linking task, and the combination of visual and speech cues has boosted the un-normalized recall by 15%. We have also developed PEGASUS, a content-based video retrieval system with fast speech and visual feature indexing and search, available on the web: http://pegasus.cs.ucf.edu:8080/index.jsp.
Given a video sequence, one important task is to understand what is present or what is happening in its content. To achieve this goal, target objects or activities need to be detected, localized, and recognized in the spatial and/or temporal domain. In the last portion of this dissertation, we present a visual attention detection method that automatically generates spatiotemporal saliency maps of input video sequences. The saliency map is later used in the detection of interesting objects and activities in videos by significantly narrowing the search range. Our spatiotemporal visual attention model generates saliency maps based on both the spatial and temporal signals in the video sequences. In the temporal attention model, motion contrast is computed based on the planar motions (homographies) between images, which are estimated by applying RANSAC to point correspondences in the scene. To compensate for the non-uniformity of the spatial distribution of interest points, the spanning areas of motion segments are incorporated in the motion contrast computation. In the spatial attention model, we have developed a fast method for computing pixel-level saliency maps using the color histograms of images. Finally, a dynamic fusion technique is applied to combine the temporal and spatial saliency maps: temporal attention dominates the spatial model when large motion contrast exists, and vice versa. The proposed spatiotemporal attention framework has been extensively applied to multiple video sequences to highlight the interesting objects and motions present in them. We have achieved an 82% user satisfaction rate for point-level attention detection and over a 92% user satisfaction rate for object-level attention detection.
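The MCMC boundary-update scheme described above can be illustrated with a toy one-dimensional sketch: boundary sets are proposed via diffusion (shifting a boundary) and jump (adding or removing one) moves and accepted by a Metropolis ratio on a toy posterior. The cost function and the per-boundary penalty below are invented stand-ins for the framework's actual priors and likelihood, not the dissertation's model.

```python
import math
import random

def neg_cost(signal, boundaries):
    """Toy log-likelihood: negative sum of within-segment squared
    deviations, so homogeneous segments score higher."""
    score = 0.0
    cuts = [0] + sorted(boundaries) + [len(signal)]
    for a, b in zip(cuts, cuts[1:]):
        seg = signal[a:b]
        if len(seg) > 1:
            m = sum(seg) / len(seg)
            score -= sum((x - m) ** 2 for x in seg)
    return score

def mcmc_segment(signal, n_iter=2000, seed=0):
    """Metropolis sampler over boundary sets with 'diffusion' moves (shift
    a boundary) and 'jump' moves (add or remove one).  A per-boundary
    penalty plays the role of a prior against over-segmentation.  (A full
    reversible-jump treatment would also correct for proposal asymmetry.)"""
    rng = random.Random(seed)
    bounds = {rng.randrange(1, len(signal))}
    for _ in range(n_iter):
        prop = set(bounds)
        if rng.random() < 0.5 and prop:             # diffusion
            b = rng.choice(sorted(prop))
            prop.discard(b)
            prop.add(min(max(b + rng.choice([-1, 1]), 1), len(signal) - 1))
        elif rng.random() < 0.5:                    # jump: add a boundary
            prop.add(rng.randrange(1, len(signal)))
        elif prop:                                  # jump: remove one
            prop.discard(rng.choice(sorted(prop)))
        ratio = (neg_cost(signal, prop) - 2.0 * len(prop)) - \
                (neg_cost(signal, bounds) - 2.0 * len(bounds))
        if ratio >= 0 or rng.random() < math.exp(ratio):
            bounds = prop
    return sorted(bounds)

signal = [0.0] * 10 + [5.0] * 10   # one obvious scene change at index 10
boundaries = mcmc_segment(signal)
```

Because acceptance depends on a posterior ratio rather than a fixed threshold, the same sketch captures why the framework can find weak boundaries as well as strong ones.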
-
Date Issued
-
2006
-
Identifier
-
CFE0001216, ucf:46944
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0001216
-
-
Title
-
SEMANTIC VIDEO RETRIEVAL USING HIGH LEVEL CONTEXT.
-
Creator
-
Aytar, Yusuf, Shah, Mubarak, University of Central Florida
-
Abstract / Description
-
Video retrieval (searching and retrieving videos relevant to a user-defined query) is one of the most popular topics in both real-life applications and multimedia research. This thesis employs concepts from natural language understanding in solving the video retrieval problem. Our main contribution is the utilization of semantic word similarity measures for video retrieval, through trained concept detectors and the visual co-occurrence relations between such concepts. We propose two methods for content-based retrieval of videos: (1) an unsupervised method for retrieving a new concept (a concept which is not known to the system and for which no annotation is available) using semantic word similarity and visual co-occurrence; and (2) a method for retrieval of videos based on their relevance to a user-defined text query, using semantic word similarity and the visual content of videos. For evaluation purposes, we mainly used the automatic search and high-level feature extraction test sets of the TRECVID'06 and TRECVID'07 benchmarks. These two data sets consist of 250 hours of multilingual news video captured from American, Arabic, German, and Chinese TV channels. Although our method for retrieving a new concept is unsupervised, it outperforms the trained concept detectors (which are supervised) on 7 out of 20 test concepts, and overall it performs very close to the trained detectors. Our visual-content-based semantic retrieval method, meanwhile, performs more than 100% better than the text-based retrieval method. This shows that using visual content alone we can achieve good retrieval results.
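The first method, scoring an untrained concept through trained detectors, can be sketched roughly as follows: the new concept's score for a shot is a similarity-weighted combination of the trained detectors' scores. All concepts, similarity values, and detector scores here are invented for illustration; the thesis derives word similarities from established semantic measures rather than a hand-written table.

```python
# Hypothetical semantic word similarities between the new concept "boat"
# and the system's trained concepts:
word_similarity = {
    ("boat", "ship"): 0.9,
    ("boat", "car"): 0.3,
    ("boat", "building"): 0.1,
}

def new_concept_score(new_concept, detector_scores):
    """detector_scores: {trained_concept: detector score for one shot}.
    The untrained concept's score is a similarity-weighted average of the
    trained detectors' outputs."""
    total = weight = 0.0
    for concept, score in detector_scores.items():
        sim = word_similarity.get((new_concept, concept), 0.0)
        total += sim * score
        weight += sim
    return total / weight if weight else 0.0

shot_a = {"ship": 0.8, "car": 0.2, "building": 0.1}   # harbor scene
shot_b = {"ship": 0.1, "car": 0.9, "building": 0.6}   # street scene
ranked = sorted([("a", new_concept_score("boat", shot_a)),
                 ("b", new_concept_score("boat", shot_b))],
                key=lambda p: p[1], reverse=True)
```

Ranking shots by this score retrieves the harbor scene first for "boat", even though no "boat" detector was ever trained.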
-
Date Issued
-
2008
-
Identifier
-
CFE0002158, ucf:47521
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0002158
-
-
Title
-
RECEPTIVE AND EXPRESSIVE SINGLE WORD VOCABULARY ERRORS OF PRESCHOOL CHILDREN WITH DEVELOPMENTAL DISABILITIES.
-
Creator
-
Hirn, Juliana L, Towson, Jacqueline, University of Central Florida
-
Abstract / Description
-
Vocabulary growth during the preschool years is critical for language development. Preschool children with developmental disabilities often have more difficulty learning and developing language and therefore make more errors in vocabulary. It is important to recognize what types of errors children are demonstrating, especially as they relate to receptive and expressive language abilities. This study explores the error patterns preschool children with developmental disabilities make during receptive and expressive single-word vocabulary tests. A secondary analysis of preexisting data was conducted on a sample of 68 preschool children with developmental disabilities ranging in severity. Based on a coding system developed by the author, errors were classified according to type. The majority of the errors children made were classified as No Response errors, with the second most common being Semantic Perceptual errors, on both receptive and expressive picture-naming tasks. Understanding the types of errors preschool children with disabilities make will help to enhance their language development and the therapy needed to thrive as learners, especially as they begin elementary school.
-
Date Issued
-
2017
-
Identifier
-
CFH2000261, ucf:46010
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFH2000261
-
-
Title
-
A STUDY OF SEMANTIC PROCESSING PERFORMANCE.
-
Creator
-
Dever, Daryn A, Szalma, James, Neigel, Alexis, University of Central Florida
-
Abstract / Description
-
Examining the role of individual differences, especially variations in human motivation, in vigilance tasks will result in a better understanding of sustained semantic attention and processing, which has, to date, received limited study in the literature (see Fraulini, Hancock, Neigel, Claypoole, & Szalma, 2017; Epling, Russell, & Helton, 2016; Thomson et al., 2016). The present study seeks to understand how individual differences in intrinsic motivation affect performance in a short semantic vigilance task. Performance across two conditions (lure vs. standard) was compared in a sample of 79 undergraduate students at the University of Central Florida. The results indicated significant main effects of intrinsic motivation on pre- and post-task stress factors, workload, and performance measures, which included correct detections, false alarms, and response time. Sensitivity and response bias, which are indices of signal detection theory, were also examined. Intrinsic motivation influenced sensitivity, but not response bias, which was instead affected by period on watch. The theoretical and practical implications of this research are also discussed.
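The two signal detection theory indices named above are standard quantities: sensitivity (d') is the difference of the z-transformed hit and false alarm rates, and response bias (c) is their negated average. A minimal sketch, with invented example rates:

```python
from statistics import NormalDist

def sdt_indices(hit_rate, fa_rate):
    """Sensitivity d' and response bias c from hit/false-alarm rates."""
    z = NormalDist().inv_cdf            # inverse standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)          # d' = z(H) - z(F)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # c = -(z(H) + z(F)) / 2
    return d_prime, criterion

d, c = sdt_indices(0.85, 0.20)   # invented example rates
```

For these rates, d' is close to 1.9 (good discrimination) and c is slightly negative (a mildly liberal response criterion); equal hit and false alarm rates give d' = 0.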
-
Date Issued
-
2017
-
Identifier
-
CFH2000245, ucf:45984
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFH2000245
-
-
Title
-
EMPIRICAL MODELING OF A MARIJUANA EXPECTANCY MEMORY NETWORK IN CHILDREN AS A FUNCTION OF AGE AND MARIJUANA USE.
-
Creator
-
Alfonso, Jacqueline, Dunn, Michael, University of Central Florida
-
Abstract / Description
-
The present investigation modeled the expectancy memory organization and likely memory activation patterns of marijuana expectancies in children across age and marijuana use. The first phase of the study surveyed 142 children to obtain their first associate to marijuana use. From their responses, the Marijuana Expectancy Inventory for Children and Adolescents (MEICA) was developed. The second phase administered the MEICA to a second sample of 392 children to model marijuana expectancy organization and probable memory activation paths of marijuana users versus never-users. Results indicated that, irrespective of age, adolescents who have used marijuana tend to emphasize positive-negative effects, whereas adolescents who have never used marijuana tend to emphasize psychological-physiological effects. Memory activation patterns also differed by marijuana use history: users are more likely to begin their paths with the short-term positive effects of marijuana, whereas non-users are more likely to access long-term cognitive and physiological effects. This study is the first to examine specific marijuana outcome expectancies of children and adolescents as they relate to marijuana-using behavior. Implications for marijuana prevention and intervention programs, future research, and limitations of the current investigation are discussed.
-
Date Issued
-
2005
-
Identifier
-
CFE0000897, ucf:46629
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0000897
-
-
Title
-
EXTRACTING QUANTITATIVE INFORMATIONFROM NONNUMERIC MARKETING DATA: AN AUGMENTEDLATENT SEMANTIC ANALYSIS APPROACH.
-
Creator
-
Arroniz, Inigo, Michaels, Ronald, University of Central Florida
-
Abstract / Description
-
Despite the widespread availability and importance of nonnumeric data, marketers do not have the tools to extract information from large amounts of nonnumeric data. This dissertation attempts to fill this void: I developed a scalable methodology that is capable of extracting information from extremely large volumes of nonnumeric data. The proposed methodology integrates concepts from information retrieval and content analysis to analyze textual information. This approach avoids a pervasive...
Show moreDespite the widespread availability and importance of nonnumeric data, marketers do not have the tools to extract information from large amounts of nonnumeric data. This dissertation attempts to fill this void: I developed a scalable methodology that is capable of extracting information from extremely large volumes of nonnumeric data. The proposed methodology integrates concepts from information retrieval and content analysis to analyze textual information. This approach avoids a pervasive difficulty of traditional content analysis, namely the classification of terms into predetermined categories, by creating a linear composite of all terms in the document and, then, weighting the terms according to their inferred meaning. In the proposed approach, meaning is inferred by the collocation of the term across all the texts in the corpus. It is assumed that there is a lower dimensional space of concepts that underlies word usage. The semantics of each word are inferred by identifying its various contexts in a document and across documents (i.e., in the corpus). After the semantic similarity space is inferred from the corpus, the words in each document are weighted to obtain their representation on the lower dimensional semantic similarity space, effectively mapping the terms to the concept space and ultimately creating a score that measures the concept of interest. I propose an empirical application of the outlined methodology. For this empirical illustration, I revisit an important marketing problem, the effect of movie critics on the performance of the movies. In the extant literature, researchers have used an overall numerical rating of the review to capture the content of the movie reviews. I contend that valuable information present in the textual materials remains uncovered. I use the proposed methodology to extract this information from the nonnumeric text contained in a movie review. 
The proposed setting is particularly attractive to validate the methodology because the setting allows for a simple test of the text-derived metrics by comparing them to the numeric ratings provided by the reviewers. I empirically show the application of this methodology and traditional computer-aided content analytic methods to study an important marketing topic, the effect of movie critics on movie performance. In the empirical application of the proposed methodology, I use two datasets that combined contain more than 9,000 movie reviews nested in more than 250 movies. I am restudying this marketing problem in the light of directly obtaining information from the reviews instead of following the usual practice of using an overall rating or a classification of the review as either positive or negative. I find that the addition of direct content and structure of the review adds a significant amount of explanatory power as a determinant of movie performance, even in the presence of actual reviewer overall ratings (stars) and other controls. This effect is robust across distinct operationalizations of both the review content and the movie performance metrics. In fact, my findings suggest that as we move from sales to profitability to financial return measures, the role of the content of the review, and therefore the critic's role, becomes increasingly important.
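As a rough illustration of the pipeline this abstract describes, the sketch below builds a term-document matrix, infers a low-dimensional "concept" space with a truncated SVD (a generic latent-semantic-analysis step, not necessarily the dissertation's exact procedure), and maps each document onto that space. The tiny corpus and all variable names are invented for illustration.

```python
# Hypothetical sketch: infer a lower-dimensional semantic space from a
# corpus, then represent each document (review) in that concept space.
import numpy as np

corpus = [
    "great acting great plot",
    "weak plot weak acting",
    "great film great plot",
]
vocab = sorted({w for doc in corpus for w in doc.split()})

# Term-document count matrix (terms x documents).
A = np.array([[doc.split().count(w) for doc in corpus] for w in vocab], float)

# Truncated SVD: keep k latent concepts underlying word usage.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
doc_scores = (np.diag(s[:k]) @ Vt[:k]).T  # each row: one review in concept space

print(doc_scores.shape)  # three reviews, two latent concepts
```

In a real application the resulting coordinates would be combined into a score for the concept of interest (e.g., review valence) and used as a regressor alongside star ratings.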
-
Date Issued
-
2007
-
Identifier
-
CFE0001617, ucf:47164
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0001617
-
-
Title
-
GREEN BUILDING: PUBLIC OPINION, SEMANTICS, AND HEURISTIC PROCESSING.
-
Creator
-
Webb, Christina, Schraufnagel, Scot, University of Central Florida
-
Abstract / Description
-
Research on public support for green building has, to date, been incomplete. Understanding the demographics of individuals that support green building has remained secondary to merely determining real opinions on the topic. The identity of supporters and the motivation behind their support is the focus of this research. Specifically, is support for green building dependent on the way in which the issue is framed? This research aims to focus on those that are spreading the message about green building, industry experts, and the mass public. By exposing how green building experts talk about the issue, we may begin to understand why public support for green building has yet to reach the kind of mainstream acceptance other planning and design techniques have, such as New Urbanism. I predict that green building experts perceive low levels of public awareness, with the exception of those within the Northwest region, who I believe will perceive higher levels of awareness. In addition, I assume that industry experts will be most focused on energy efficiency as a primary concept of green building. As for the public, I hypothesize that those aware of green building and individuals age 50 and older will be more likely to support green building. With the introduction of source cues, I expect that support for green building will decrease when respondents receive either an environmentalism cue or a government program cue. Using survey instruments, I was able to determine that all green building experts perceive public awareness as low and do, in fact, focus their efforts on energy efficiency. With regard to the public, support was highest among those that are aware, as well as those age 50 and older. In addition, insertion of source cues decreased support for green building, with the government program source cue providing the lowest levels of support.
-
Date Issued
-
2005
-
Identifier
-
CFE0000600, ucf:46525
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0000600
-
-
Title
-
THE ACQUISITION OF LEXICAL KNOWLEDGE FROM THE WEB FOR ASPECTS OF SEMANTIC INTERPRETATION.
-
Creator
-
Schwartz, Hansen, Gomez, Fernando, University of Central Florida
-
Abstract / Description
-
This work investigates the effective acquisition of lexical knowledge from the Web to perform semantic interpretation. The Web provides an unprecedented amount of natural language from which to gain knowledge useful for semantic interpretation. The knowledge acquired is described as common sense knowledge, information one uses in his or her daily life to understand language and perception. Novel approaches are presented for both the acquisition of this knowledge and the use of the knowledge in semantic interpretation algorithms. The goal is to increase accuracy over other automatic semantic interpretation systems, and in turn enable stronger real world applications such as machine translation, advanced Web search, sentiment analysis, and question answering. The major contributions of this dissertation consist of two methods of acquiring lexical knowledge from the Web, namely a database of common sense knowledge and Web selectors. The first method is a framework for acquiring a database of concept relationships. To acquire this knowledge, relationships between nouns are found on the Web and analyzed over WordNet using information theory, producing information about concepts rather than ambiguous words. For the second contribution, words called Web selectors are retrieved which can take the place of an instance of a target word in its local context. The selectors allow the system to learn the types of concepts to which the sense of a target word should be similar. Web selectors are acquired dynamically as part of a semantic interpretation algorithm, while the relationships in the database are useful to stand-alone programs. A final contribution of this dissertation concerns a novel semantic similarity measure and an evaluation of similarity and relatedness measures on tasks of concept similarity. Such tasks are useful when applying acquired knowledge to semantic interpretation.
Applications to word sense disambiguation, an aspect of semantic interpretation, are used to evaluate the contributions. Disambiguation systems which utilize semantically annotated training data are considered supervised. The algorithms of this dissertation are considered minimally-supervised; they do not require training data created by humans, though they may use human-created data sources. In the case of evaluating a database of common sense knowledge, integrating the knowledge into an existing minimally-supervised disambiguation system significantly improved results -- a 20.5% error reduction. Similarly, the Web selectors disambiguation system, which acquires knowledge directly as part of the algorithm, achieved results comparable with top minimally-supervised systems, an F-score of 80.2% on a standard noun disambiguation task. This work enables the study of many subsequent related tasks for improving semantic interpretation and its application to real-world technologies. Other aspects of semantic interpretation, such as semantic role labeling, could utilize the same methods presented here for word sense disambiguation. As the Web continues to grow, the capabilities of the systems in this dissertation are expected to increase. Although the Web selectors system achieves strong results, a study in this dissertation shows likely improvements from acquiring more data. Furthermore, the methods for acquiring a database of common sense knowledge could be applied in a more exhaustive fashion for other types of common sense knowledge. Finally, perhaps the greatest benefits from this work will come from the enabling of real world technologies that utilize semantic interpretation.
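The selector idea above can be caricatured in a few lines: words that could replace the target in its context (the selectors) vote for the sense whose signature they resemble most. The selector lists and sense signatures below are invented for illustration; the real system retrieves selectors from the Web and compares senses over WordNet rather than via simple set overlap.

```python
# Toy sketch of selector-based word sense disambiguation.
def disambiguate(selectors, sense_signatures):
    """Return the sense whose signature words overlap most with the selectors."""
    def overlap(sense):
        return len(set(selectors) & set(sense_signatures[sense]))
    return max(sense_signatures, key=overlap)

# "I deposited cash at the bank" -> plausible substitutes found in context.
selectors = ["office", "branch", "store", "institution"]
senses = {
    "bank.n.01 (financial)": ["institution", "office", "branch", "lender"],
    "bank.n.02 (river)":     ["shore", "slope", "riverside", "edge"],
}
print(disambiguate(selectors, senses))  # the financial sense wins, 3 overlaps vs 0
```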
-
Date Issued
-
2011
-
Identifier
-
CFE0003688, ucf:48805
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0003688
-
-
Title
-
Describing Images by Semantic Modeling using Attributes and Tags.
-
Creator
-
Mahmoudkalayeh, Mahdi, Shah, Mubarak, Sukthankar, Gita, Rahnavard, Nazanin, Zhang, Teng, University of Central Florida
-
Abstract / Description
-
This dissertation addresses the problem of describing images using visual attributes and textual tags, a fundamental task that narrows the semantic gap between the visual reasoning of humans and machines. Automatic image annotation assigns relevant textual tags to images. In this dissertation, we propose a query-specific formulation based on Weighted Multi-view Non-negative Matrix Factorization to perform automatic image annotation. Our proposed technique seamlessly adapts to changes in training data, naturally solves the problem of feature fusion, and handles the challenge of rare tags. Unlike tags, attributes are category-agnostic, hence their combination models an exponential number of semantic labels. Motivated by the fact that most attributes describe local properties, we propose exploiting localization cues, through semantic parsing of the human face and body, to improve person-related attribute prediction. We also demonstrate that image-level attribute labels can be effectively used as weak supervision for the task of semantic segmentation. Next, we analyze selfie images by utilizing tags and attributes. We collect the first large-scale selfie dataset and annotate it with different attributes covering characteristics such as gender, age, race, facial gestures, and hairstyle. We then study the popularity and sentiments of the selfies given an estimated appearance of various semantic concepts. In brief, we automatically infer what makes a good selfie. Despite its extensive usage, the deep learning literature falls short in understanding the characteristics and behavior of batch normalization. We conclude this dissertation by providing a fresh view, in light of information geometry and Fisher kernels, on why batch normalization works.
We propose Mixture Normalization, which disentangles modes of variation in the underlying distribution of the layer outputs, and confirm that it effectively accelerates training of different batch-normalized architectures, including Inception-V3, Densely Connected Networks, and Deep Convolutional Generative Adversarial Networks, while achieving lower generalization error.
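The contrast between the two normalizers can be sketched in numpy: plain batch normalization standardizes the whole batch with one mean and variance, while a mixture-style variant normalizes each sample by the statistics of the mode it belongs to. The hard two-mode split below is an illustrative assumption (the dissertation's Mixture Normalization fits a mixture model rather than using fixed assignments), and the learnable scale/shift parameters are omitted for brevity.

```python
# Sketch: batch normalization vs. a per-mode "mixture" normalization.
import numpy as np

def batch_norm(x, eps=1e-5):
    """Standardize a batch with a single mean/variance (no gamma/beta)."""
    mu, var = x.mean(axis=0), x.var(axis=0)
    return (x - mu) / np.sqrt(var + eps)

def mixture_norm(x, assignments, eps=1e-5):
    """Normalize each sample with the statistics of its assigned mode."""
    out = np.empty_like(x)
    for mode in np.unique(assignments):
        idx = assignments == mode
        out[idx] = batch_norm(x[idx], eps)  # per-mode statistics
    return out

rng = np.random.default_rng(0)
# A bimodal batch: two clusters with very different means.
x = np.concatenate([rng.normal(-5, 1, (64, 3)), rng.normal(5, 1, (64, 3))])
assignments = np.array([0] * 64 + [1] * 64)
y = mixture_norm(x, assignments)
print(abs(y[:64].mean(axis=0)).max())  # each mode ends up zero-mean
```

Global batch norm would leave this batch bimodal (two lumps at roughly ±1); normalizing per mode removes that structure, which is the intuition behind disentangling modes of variation.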
-
Date Issued
-
2019
-
Identifier
-
CFE0007493, ucf:52640
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0007493
-
-
Title
-
Visual Saliency Detection and Semantic Segmentation.
-
Creator
-
Souly, Nasim, Shah, Mubarak, Bagci, Ulas, Qi, GuoJun, Pensky, Marianna, University of Central Florida
-
Abstract / Description
-
Visual saliency is the ability to select the most relevant data in the scene and reduce the amount of data that needs to be processed. We propose a novel unsupervised approach to detect visual saliency in videos. For this, we employ a hierarchical segmentation technique to obtain supervoxels of a video, and simultaneously, we build a dictionary from cuboids of the video. Then we create a feature matrix from coefficients of dictionary elements. Next, we decompose this matrix into sparse and redundant parts and obtain salient regions using group lasso. Our experiments provide promising results in terms of predicting eye movement. Moreover, we apply our method to an action recognition task and achieve better results. While saliency detection only highlights important regions, in semantic segmentation the aim is to assign a semantic label to each pixel in the image. Even though semantic segmentation can be achieved by simply applying classifiers to each pixel or region, the results may not be desirable since general context information is not considered. To address this issue, we propose two supervised methods: first, an approach to discover interactions between labels and regions using a sparse estimation of the precision matrix obtained by graphical lasso; second, a knowledge-based method to incorporate dependencies among regions in the image during inference. High-level knowledge rules, such as co-occurrence, are extracted from training data and transformed into constraints in an Integer Programming formulation. A difficulty in most supervised semantic segmentation approaches is the lack of enough training data. To address this, a semi-supervised learning approach is presented that exploits the plentiful amount of available unlabeled images, as well as synthetic images generated via Generative Adversarial Networks (GAN). Furthermore, an extension of the proposed model to use additional weakly labeled data is proposed.
We demonstrate our approaches on three challenging benchmark datasets.
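The group lasso step mentioned above selects salient regions by shrinking whole groups of coefficients together. Its core operation is block soft-thresholding: a group whose joint L2 norm is small is zeroed out entirely, while a strong group is shrunk proportionally. The sketch below shows only this operator in isolation, with invented group values and threshold; the dissertation applies it inside a larger sparse decomposition.

```python
# Block soft-thresholding: the proximal operator of the group lasso penalty.
import math

def group_soft_threshold(groups, lam):
    """Shrink each group of coefficients jointly by threshold lam."""
    out = []
    for g in groups:
        norm = math.sqrt(sum(v * v for v in g))
        if norm <= lam:
            out.append([0.0] * len(g))           # weak group eliminated entirely
        else:
            scale = 1.0 - lam / norm
            out.append([scale * v for v in g])   # strong group shrunk jointly
    return out

coeffs = [[3.0, 4.0], [0.1, 0.2]]  # one strong group, one weak group
print(group_soft_threshold(coeffs, lam=1.0))
# weak group -> zeros; strong group (norm 5) scaled by 1 - 1/5 = 0.8
```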
-
Date Issued
-
2017
-
Identifier
-
CFE0006918, ucf:51694
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0006918
-
-
Title
-
Using Hashtags to Disambiguate Aboutness in Social Media Discourse: A Case Study of #OrlandoStrong.
-
Creator
-
DeArmas, Nicholas, Vie, Stephanie, Salter, Anastasia, Beever, Jonathan, Dodd, Melissa, Wheeler, Stephanie, University of Central Florida
-
Abstract / Description
-
While the field of writing studies has studied digital writing in response to multiple calls for more research on digital forms of writing, research on hashtags has yet to build bridges between different disciplines' approaches to studying the uses and effects of hashtags. This dissertation builds that bridge through its interdisciplinary approach to the study of hashtags, focusing on how hashtags can be fully appreciated at the intersection of the fields of information research, linguistics, rhetoric, ethics, writing studies, new media studies, and discourse studies. Hashtags are writing innovations that perform unique digital functions rhetorically while still hearkening back to functions of both print and oral rhetorical traditions. Hashtags function linguistically as indicators of semantic meaning; additionally, hashtags also perform the role of search queries on social media, retrieving texts that include the same hashtag. Information researchers refer to the relationship between a search query and its results using the term "aboutness" (Kehoe and Gee, 2011). By considering how hashtags have an aboutness, the humanities can call upon information research to better understand the digital aspects of the hashtag's search function. Especially when hashtags are used to organize discourse, aboutness has an effect on how a discourse community's agendas and goals are expressed, as well as on framing what is relevant and irrelevant to the discourse. As digital activists increasingly use hashtags to organize and circulate the goals of their discourse communities, knowledge of ethical strategies for hashtag use will help to better preserve a relevant aboutness for their discourse while enabling them to better leverage their hashtag for circulation.
In this dissertation, through a quantitative and qualitative analysis of the Twitter discourse that used #OrlandoStrong over the five-month period before the first anniversary of the Pulse shooting, I trace how the #OrlandoStrong discourse community used innovative rhetorical strategies to keep irrelevant content from ambiguating their discourse space. In Chapter One, I acknowledge the call from scholars to study digital tools and briefly describe the history of the Pulse shooting, reflecting on non-digital texts that employed #OrlandoStrong as memorials in the Orlando area. In Chapter Two, I focus on the literature surrounding hashtags, discourse, aboutness, intertextuality, hashtag activism, and informational compositions. In Chapter Three, I provide an overview of the stages of grounded theory methodology and the implications of critical discourse analysis before I detail how I approached the collection, coding, and analysis of the #OrlandoStrong Tweets I studied. The results of my study are reported in Chapter Four, offering examples of Tweets that were important to understanding how the discourse space became ambiguous through the use of hashtags. In Chapter Five, I reflect on ethical approaches to understanding the consequences of hashtag use, and then I offer an ethical recommendation for hashtag use by hashtag activists. I conclude Chapter Five with an example of a classroom activity that allows students to use hashtags to better understand the relationship between aboutness, (dis)ambiguation, discourse communities, and ethics. This classroom activity is provided with the hope that instructors from different disciplines will be able to provide ethical recommendations to future activists who may benefit from these rhetorical strategies.
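A small sketch of the kind of quantitative pass described above: counting which hashtags co-occur with #OrlandoStrong gives a crude picture of what the tag's discourse is "about", and off-topic co-occurring tags signal ambiguation of the discourse space. The tweets below are invented for illustration; the dissertation's corpus and coding scheme are far richer.

```python
# Count hashtags co-occurring with #OrlandoStrong as a proxy for aboutness.
from collections import Counter
import re

tweets = [
    "Remembering Pulse today #OrlandoStrong #Pulse",
    "One year later, we stand together #OrlandoStrong #LoveWins",
    "Best brunch in town! #OrlandoStrong #foodie #brunch",
]

cooccurring = Counter(
    tag.lower()
    for text in tweets
    for tag in re.findall(r"#\w+", text)
    if tag.lower() != "#orlandostrong"
)
print(cooccurring.most_common())
```

In a pass like this, tags such as #foodie stand out as candidates for the irrelevant content that ambiguates the memorial discourse.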
-
Date Issued
-
2018
-
Identifier
-
CFE0007322, ucf:52136
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0007322