Current Search: Computational Linguistics
- Title
- HEBREW AND COMPUTER-MEDIATED COMMUNICATION: THE EFFECTS OF A LANGUAGE MANIPULATION ON PERCEPTION, IDENTITY, AND PRESERVATION.
- Creator
-
Nir, Tamar, Sims, Valerie K., University of Central Florida
- Abstract / Description
-
This study aimed to explore the ways in which Hebrew is currently being manipulated online through a linguistic deviation called Fakatsa. Participants were asked to rate random statements on frivolous or serious topics, written in either standard grammatical Hebrew or Fakatsa Hebrew, on specific judgment values. It was hypothesized that participants would rate the Fakatsa writer negatively on certain characteristics, such as intelligence, education, religiosity, and nationalism, and positively on other characteristics, such as femininity and creativity. Twenty-four participants completed this experiment. Results showed that participants responded as expected for certain negative attributes typical of Fakatsa and of deviations in computer-mediated communication, but did not respond as expected for any of the positive attributes typical of Fakatsa. The results showed that fluent Hebrew speakers viewed users of the Fakatsa manipulation differently than users of standard Hebrew, and may suggest personal biases and perceptions at work when encountering computer-mediated communication.
- Date Issued
- 2016
- Identifier
- CFH2000043, ucf:45531
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFH2000043
- Title
- THE ACQUISITION OF LEXICAL KNOWLEDGE FROM THE WEB FOR ASPECTS OF SEMANTIC INTERPRETATION.
- Creator
-
Schwartz, Hansen, Gomez, Fernando, University of Central Florida
- Abstract / Description
-
This work investigates the effective acquisition of lexical knowledge from the Web to perform semantic interpretation. The Web provides an unprecedented amount of natural language from which to gain knowledge useful for semantic interpretation. The knowledge acquired is described as common sense knowledge: information one uses in daily life to understand language and perception. Novel approaches are presented for both the acquisition of this knowledge and its use in semantic interpretation algorithms. The goal is to increase accuracy over other automatic semantic interpretation systems, and in turn enable stronger real-world applications such as machine translation, advanced Web search, sentiment analysis, and question answering. The major contributions of this dissertation consist of two methods of acquiring lexical knowledge from the Web: a database of common sense knowledge, and Web selectors. The first method is a framework for acquiring a database of concept relationships. To acquire this knowledge, relationships between nouns are found on the Web and analyzed over WordNet using information theory, producing information about concepts rather than ambiguous words. For the second contribution, words called Web selectors are retrieved which can take the place of an instance of a target word in its local context. The selectors allow the system to learn the types of concepts to which the sense of a target word should be similar. Web selectors are acquired dynamically as part of a semantic interpretation algorithm, while the relationships in the database are useful to stand-alone programs. A final contribution of this dissertation concerns a novel semantic similarity measure and an evaluation of similarity and relatedness measures on tasks of concept similarity. Such tasks are useful when applying acquired knowledge to semantic interpretation.
Applications to word sense disambiguation, an aspect of semantic interpretation, are used to evaluate the contributions. Disambiguation systems which utilize semantically annotated training data are considered supervised. The algorithms of this dissertation are considered minimally supervised; they do not require training data created by humans, though they may use human-created data sources. In the case of evaluating a database of common sense knowledge, integrating the knowledge into an existing minimally supervised disambiguation system significantly improved results: a 20.5% error reduction. Similarly, the Web selectors disambiguation system, which acquires knowledge directly as part of the algorithm, achieved results comparable with top minimally supervised systems, an F-score of 80.2% on a standard noun disambiguation task. This work enables the study of many subsequent related tasks for improving semantic interpretation and its application to real-world technologies. Other aspects of semantic interpretation, such as semantic role labeling, could utilize the same methods presented here for word sense disambiguation. As the Web continues to grow, the capabilities of the systems in this dissertation are expected to increase. Although the Web selectors system achieves strong results, a study in this dissertation shows likely improvements from acquiring more data. Furthermore, the methods for acquiring a database of common sense knowledge could be applied in a more exhaustive fashion to other types of common sense knowledge. Finally, perhaps the greatest benefits from this work will come from enabling real-world technologies that utilize semantic interpretation.
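The Web-selector idea described in this abstract can be sketched roughly as follows: words that could substitute for the target in its local context ("selectors") vote for the sense whose typical associates they resemble. This is a minimal illustrative sketch, not the dissertation's actual WordNet-based algorithm; the sense inventory and scoring are invented stand-ins.

```python
# Hypothetical sketch of selector-based sense scoring. The toy sense
# inventory and associate sets below are made up for illustration; the
# real system compares selectors to WordNet senses with a similarity measure.

from collections import Counter

# Toy sense inventory for the ambiguous noun "bass": each sense lists
# words commonly associated with that concept.
SENSES = {
    "bass.fish":  {"trout", "salmon", "perch", "catfish", "fish"},
    "bass.music": {"guitar", "drums", "tenor", "soprano", "voice"},
}

def disambiguate(selectors):
    """Score each sense by how many selectors fall among its associates."""
    scores = Counter()
    for sense, associates in SENSES.items():
        scores[sense] = sum(1 for s in selectors if s in associates)
    return scores.most_common(1)[0][0]

# Selectors that might be retrieved for "He caught a huge ___ yesterday":
web_selectors = ["fish", "trout", "salmon", "shark"]
print(disambiguate(web_selectors))  # bass.fish
```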
- Date Issued
- 2011
- Identifier
- CFE0003688, ucf:48805
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0003688
- Title
- An intelligent editor for natural language processing of unrestricted text.
- Creator
-
Glinos, Demetrios George, Gomez, Fernando, Arts and Sciences
- Abstract / Description
-
University of Central Florida College of Arts and Sciences Thesis; The understanding of natural language by computational methods has been a continuing and elusive problem in artificial intelligence. In recent years there has been a resurgence in natural language processing research. Much of this work has used empirical or corpus-based methods, which take a data-driven approach to training systems on large amounts of real language data. Using corpus-based methods, the performance of part-of-speech (POS) taggers, which assign to the individual words of a sentence their appropriate part-of-speech category (e.g., noun, verb, preposition), now rivals human performance levels, achieving accuracies exceeding 95%. Such taggers have proved useful as preprocessors for tasks such as parsing, speech synthesis, and information retrieval. Parsing remains, however, a difficult problem, even with the benefit of POS tagging. Moreover, as sentence length increases, there is a corresponding combinatorial explosion of alternative possible parses. Consider the following sentence from a New York Times online article: "After Salinas was arrested for murder in 1995 and lawyers for the bank had begun monitoring his accounts, his personal banker in New York quietly advised Salinas' wife to move the money elsewhere, apparently without the consent of the legal department." To facilitate parsing and other tasks, we would like to decompose this sentence into the following three shorter sentences which, taken together, convey the same meaning as the original: 1. Salinas was arrested for murder in 1995. 2. Lawyers for the bank had begun monitoring his accounts. 3. His personal banker in New York quietly advised Salinas' wife to move the money elsewhere, apparently without the consent of the legal department. This study investigates the development of heuristics for decomposing such long sentences into sets of shorter sentences without affecting the meaning of the originals.
Without parsing or semantic analysis, heuristic rules were developed based on: (1) the output of a POS tagger (Brill's tagger); (2) the punctuation contained in the input sentences; and (3) the words themselves. The heuristic algorithms were implemented in an intelligent editor program which first augmented the POS tags and assigned tags to punctuation, and then tested the rules against a corpus of 25 New York Times online articles containing approximately 1,200 sentences and over 32,000 words, with good results. Recommendations are made for improving the algorithms and for continuing this line of research.
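One heuristic in the spirit of those described, splitting a compound sentence at a coordinating conjunction when both sides contain their own finite verb, can be sketched as follows. This is an illustrative sketch only: input is assumed to be pre-tagged (word, POS) pairs in the Penn Treebank tag set, and the Brill tagger itself is not reproduced here.

```python
# Minimal sketch of a POS-tag-driven decomposition rule: split at a
# coordinating conjunction ("CC") only if each side has a finite verb,
# so we split clauses rather than coordinated noun phrases.

FINITE = {"VBD", "VBZ", "VBP", "MD"}  # finite-verb tags (simplified)

def split_compound(tagged):
    """Return clauses as word lists, splitting at a clause-joining CC."""
    for i, (word, tag) in enumerate(tagged):
        if tag == "CC":
            left, right = tagged[:i], tagged[i + 1:]
            if any(t in FINITE for _, t in left) and any(t in FINITE for _, t in right):
                return [[w for w, _ in left], [w for w, _ in right]]
    return [[w for w, _ in tagged]]  # no safe split point found

sent = [("Salinas", "NNP"), ("was", "VBD"), ("arrested", "VBN"),
        ("and", "CC"),
        ("lawyers", "NNS"), ("had", "VBD"), ("begun", "VBN"),
        ("monitoring", "VBG"), ("accounts", "NNS")]
print(split_compound(sent))
# [['Salinas', 'was', 'arrested'], ['lawyers', 'had', 'begun', 'monitoring', 'accounts']]
```

A real decomposition system would also need rules for subordinate clauses and punctuation, as the abstract notes.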
- Date Issued
- 1999
- Identifier
- CFR0008181, ucf:53055
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFR0008181
- Title
- Automatically Acquiring a Semantic Network of Related Concepts.
- Creator
-
Szumlanski, Sean, Gomez, Fernando, Wu, Annie, Hughes, Charles, Sims, Valerie, University of Central Florida
- Abstract / Description
-
We describe the automatic acquisition of a semantic network in which over 7,500 of the most frequently occurring nouns in the English language are linked to their semantically related concepts in the WordNet noun ontology. Relatedness between nouns is discovered automatically from lexical co-occurrence in Wikipedia texts using a novel adaptation of an information-theoretically inspired measure. Our algorithm then capitalizes on salient sense clustering among these semantic associates to automatically disambiguate them to their corresponding WordNet noun senses (i.e., concepts). The resultant concept-to-concept associations, stemming from 7,593 target nouns with 17,104 distinct senses among them, constitute a large-scale semantic network with 208,832 undirected edges between related concepts. Our work can thus be conceived of as augmenting the WordNet noun ontology with RelatedTo links. The network, which we refer to as the Szumlanski-Gomez Network (SGN), has been subjected to a variety of evaluative measures, including manual inspection by human judges and quantitative comparison to gold-standard data for semantic relatedness measurements. We have also evaluated the network's performance in an applied setting on a word sense disambiguation (WSD) task in which the network served as a knowledge source for established graph-based spreading activation algorithms, and have shown: a) the network is competitive with WordNet when used as a stand-alone knowledge source for WSD; b) combining our network with WordNet achieves disambiguation results that exceed the performance of either resource individually; and c) our network outperforms a similar resource, WordNet++ (Ponzetto & Navigli, 2010), that has been automatically derived from annotations in the Wikipedia corpus. Finally, we present a study on human perceptions of relatedness.
In our study, we elicited quantitative evaluations of semantic relatedness from human subjects using a variation of the classical methodology that Rubenstein and Goodenough (1965) employed to investigate human perceptions of semantic similarity. Judgments from individual subjects in our study exhibit high average correlation to the elicited relatedness means using leave-one-out sampling (r = 0.77, σ = 0.09, N = 73), although not as high as the average human correlation in previous studies of similarity judgments, for which Resnik (1995) established an upper bound of r = 0.90 (σ = 0.07, N = 10). These results suggest that human perceptions of relatedness are less strictly constrained than evaluations of similarity, and they establish a clearer expectation for what constitutes human-like performance by a computational measure of semantic relatedness. We also contrast the performance of a variety of similarity and relatedness measures on our dataset with their performance on similarity norms, and introduce our dataset as a supplementary evaluative standard for relatedness measures.
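The leave-one-out sampling described above can be sketched as follows: correlate each subject's ratings with the mean ratings of all other subjects, then average the resulting r values. The ratings below are invented toy data; the study's actual judgments are not reproduced here.

```python
# Sketch of leave-one-out inter-rater correlation, as used to estimate
# human agreement on relatedness judgments. Toy data only.

import statistics

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def leave_one_out_r(ratings):
    """ratings: one row of item scores per subject; returns mean held-out r."""
    rs = []
    for i, subj in enumerate(ratings):
        others = [row for j, row in enumerate(ratings) if j != i]
        means = [statistics.mean(col) for col in zip(*others)]
        rs.append(pearson(subj, means))
    return statistics.mean(rs)

# Three hypothetical subjects rating four word pairs on relatedness:
ratings = [[1, 2, 4, 5], [2, 2, 5, 4], [1, 3, 4, 4]]
print(round(leave_one_out_r(ratings), 2))
```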
- Date Issued
- 2013
- Identifier
- CFE0004759, ucf:49767
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0004759
- Title
- Using Hashtags to Disambiguate Aboutness in Social Media Discourse: A Case Study of #OrlandoStrong.
- Creator
-
DeArmas, Nicholas, Vie, Stephanie, Salter, Anastasia, Beever, Jonathan, Dodd, Melissa, Wheeler, Stephanie, University of Central Florida
- Abstract / Description
-
While the field of writing studies has studied digital writing in response to multiple calls for more research on digital forms of writing, research on hashtags has yet to build bridges between different disciplines' approaches to studying the uses and effects of hashtags. This dissertation builds that bridge through an interdisciplinary approach to the study of hashtags, focusing on how hashtags can be fully appreciated at the intersection of the fields of information research, linguistics, rhetoric, ethics, writing studies, new media studies, and discourse studies. Hashtags are writing innovations that perform unique digital functions rhetorically while still hearkening back to functions of both print and oral rhetorical traditions. Hashtags function linguistically as indicators of semantic meaning; additionally, they perform the role of search queries on social media, retrieving texts that include the same hashtag. Information researchers refer to the relationship between a search query and its results using the term "aboutness" (Kehoe and Gee, 2011). By considering how hashtags have an aboutness, the humanities can call upon information research to better understand the digital aspects of the hashtag's search function. Especially when hashtags are used to organize discourse, aboutness has an effect on how a discourse community's agendas and goals are expressed, as well as on framing what is relevant and irrelevant to the discourse. As digital activists increasingly use hashtags to organize and circulate the goals of their discourse communities, knowledge of ethical strategies for hashtag use will help them preserve a relevant aboutness for their discourse while better leveraging their hashtags for circulation.
In this dissertation, through a quantitative and qualitative analysis of the Twitter discourse that used #OrlandoStrong over the five-month period before the first anniversary of the Pulse shooting, I trace how the #OrlandoStrong discourse community used innovative rhetorical strategies to keep irrelevant content from ambiguating their discourse space. In Chapter One, I acknowledge the call from scholars to study digital tools and briefly describe the history of the Pulse shooting, reflecting on non-digital texts that employed #OrlandoStrong as memorials in the Orlando area. In Chapter Two, I focus on the literature surrounding hashtags, discourse, aboutness, intertextuality, hashtag activism, and informational compositions. In Chapter Three, I provide an overview of the stages of grounded theory methodology and the implications of critical discourse analysis before detailing how I approached the collection, coding, and analysis of the #OrlandoStrong Tweets I studied. The results of my study are reported in Chapter Four, which offers examples of Tweets that were important to understanding how the discourse space became ambiguous through the use of hashtags. In Chapter Five, I reflect on ethical approaches to understanding the consequences of hashtag use, and then I offer an ethical recommendation for hashtag use by hashtag activists. I conclude Chapter Five with an example of a classroom activity that allows students to use hashtags to better understand the relationship between aboutness, (dis)ambiguation, discourse communities, and ethics. This classroom activity is provided in the hope that instructors from different disciplines will be able to provide ethical recommendations to future activists who may benefit from these rhetorical strategies.
- Date Issued
- 2018
- Identifier
- CFE0007322, ucf:52136
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007322