Current Search: Concept Acquisition (x)
View All Items
- Title
- SYNTAX-BASED CONCEPT EXTRACTION FOR QUESTION ANSWERING.
- Creator
-
Glinos, Demetrios, Gomez, Fernando, University of Central Florida
- Abstract / Description
-
Question answering (QA) stands squarely along the path from document retrieval to text understanding. As an area of research interest, it serves as a proving ground where strategies for document processing, knowledge representation, question analysis, and answer extraction may be evaluated in real world information extraction contexts. The task is to go beyond the representation of text documents as "bags of words" or data blobs that can be scanned for keyword combinations and word...
Show moreQuestion answering (QA) stands squarely along the path from document retrieval to text understanding. As an area of research interest, it serves as a proving ground where strategies for document processing, knowledge representation, question analysis, and answer extraction may be evaluated in real world information extraction contexts. The task is to go beyond the representation of text documents as "bags of words" or data blobs that can be scanned for keyword combinations and word collocations in the manner of internet search engines. Instead, the goal is to recognize and extract the semantic content of the text, and to organize it in a manner that supports reasoning about the concepts represented. The issue presented is how to obtain and query such a structure without either a predefined set of concepts or a predefined set of relationships among concepts. This research investigates a means for acquiring from text documents both the underlying concepts and their interrelationships. Specifically, a syntax-based formalism for representing atomic propositions that are extracted from text documents is presented, together with a method for constructing a network of concept nodes for indexing such logical forms based on the discourse entities they contain. It is shown that meaningful questions can be decomposed into Boolean combinations of question patterns using the same formalism, with free variables representing the desired answers. It is further shown that this formalism can be used for robust question answering using the concept network and WordNet synonym, hypernym, hyponym, and antonym relationships. This formalism was implemented in the Semantic Extractor (SEMEX) research tool and was tested against the factoid questions from the 2005 Text Retrieval Conference (TREC), which operated upon the AQUAINT corpus of newswire documents. After adjusting for the limitations of the tool and the document set, correct answers were found for approximately fifty percent of the questions analyzed, which compares favorably with other question answering systems.
Show less - Date Issued
- 2006
- Identifier
- CFE0000985, ucf:46711
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0000985
- Title
- SINBAD AUTOMATION OF SCIENTIFIC PROCESS: FROM HIDDEN FACTOR ANALYSIS TO THEORY SYNTHESIS.
- Creator
-
KURSUN, OLCAY, Favorov, Oleg V., University of Central Florida
- Abstract / Description
-
Modern science is turning to progressively more complex and data-rich subjects, which challenges the existing methods of data analysis and interpretation. Consequently, there is a pressing need for development of ever more powerful methods of extracting order from complex data and for automation of all steps of the scientific process. Virtual Scientist is a set of computational procedures that automate the method of inductive inference to derive a theory from observational data dominated by...
Show moreModern science is turning to progressively more complex and data-rich subjects, which challenges the existing methods of data analysis and interpretation. Consequently, there is a pressing need for development of ever more powerful methods of extracting order from complex data and for automation of all steps of the scientific process. Virtual Scientist is a set of computational procedures that automate the method of inductive inference to derive a theory from observational data dominated by nonlinear regularities. The procedures utilize SINBAD a novel computational method of nonlinear factor analysis that is based on the principle of maximization of mutual information among non-overlapping sources (Imax), yielding higher-order features of the data that reveal hidden causal factors controlling the observed phenomena. One major advantage of this approach is that it is not dependent on a particular choice of learning algorithm to use for the computations. The procedures build a theory of the studied subject by finding inferentially useful hidden factors, learning interdependencies among its variables, reconstructing its functional organization, and describing it by a concise graph of inferential relations among its variables. The graph is a quantitative model of the studied subject, capable of performing elaborate deductive inferences and explaining behaviors of the observed variables by behaviors of other such variables and discovered hidden factors. The set of Virtual Scientist procedures is a powerful analytical and theory-building tool designed to be used in research of complex scientific problems characterized by multivariate and nonlinear relations.
Show less - Date Issued
- 2004
- Identifier
- CFE0000043, ucf:46124
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0000043