Larger-first partial parsing
Title: | Larger-first partial parsing. | |
---|---|---|
Name(s): |
Van Delden, Sebastian Alexander, Author; Gomez, Fernando, Committee Chair; Engineering and Computer Science, Degree Grantor |
|
Type of Resource: | text | |
Date Issued: | 2003 | |
Publisher: | University of Central Florida | |
Language(s): | English | |
Abstract/Description: | University of Central Florida College of Engineering Thesis; Larger-first partial parsing is a primarily top-down approach to partial parsing that is opposite to current easy-first, or primarily bottom-up, strategies. A rich partial tree structure is captured by an algorithm that assigns a hierarchy of structural tags to each of the input tokens in a sentence. Part-of-speech tags are first assigned to the words in a sentence by a part-of-speech tagger. A cascade of Deterministic Finite State Automata then uses this part-of-speech information to identify syntactic relations primarily in descending order of their size. The cascade is divided into four specialized sections: (1) a Comma Network, which identifies syntactic relations associated with commas; (2) a Conjunction Network, which partially disambiguates phrasal conjunctions and fully disambiguates clausal conjunctions; (3) a Clause Network, which identifies non-comma-delimited clauses; and (4) a Phrase Network, which identifies the remaining base phrases in the sentence. Each automaton is capable of adding one or more levels of structural tags to the tokens in a sentence. The larger-first approach is compared against a well-known easy-first approach. The results indicate that this larger-first approach is capable of (1) producing a more detailed partial parse than an easy-first approach; (2) providing better containment of attachment ambiguity; (3) handling overlapping syntactic relations; and (4) achieving a higher accuracy than the easy-first approach. The automata of each network were developed by an empirical analysis of several sources and are presented here in detail. | |
Identifier: | CFR0000760 (IID), ucf:52932 (fedora) | |
Note(s): |
2003-12-01; Ph.D.; Electrical Engineering and Computer Science; Doctorate; This record was generated from author-submitted information. Electronically reproduced by the University of Central Florida from a book held in the John C. Hitt Library at the University of Central Florida, Orlando. |
|
Subject(s): |
Dissertations Academic -- Engineering; Engineering -- Dissertations Academic; Natural language processing (Computer science); Parsing (Computer grammar); Sequential machine theory |
|
Persistent Link to This Record: | http://purl.flvc.org/ucf/fd/CFR0000760 | |
Restrictions on Access: | public | |
Host Institution: | UCF |
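
The cascade described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Token` and `DFA` classes, the `parse` function, the `RELCL` and `NP` labels, the Penn Treebank-style part-of-speech tags, and both toy automata are assumptions made for illustration. Only two of the four networks are imitated (a clause-level automaton followed by a phrase-level one), and each automaton reads part-of-speech tags only, whereas the abstract does not say whether later automata also consult earlier structural tags.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    word: str
    pos: str                                   # part-of-speech tag from an upstream tagger
    tags: list = field(default_factory=list)   # structural tags, outermost first


class DFA:
    """A deterministic finite-state automaton over part-of-speech tags."""

    def __init__(self, label, start, accepting, transitions):
        self.label = label
        self.start = start
        self.accepting = accepting
        self.transitions = transitions         # {(state, pos): next_state}

    def longest_match(self, tokens, i):
        """Return the exclusive end index of the longest match starting at i, or None."""
        state, end = self.start, None
        for j in range(i, len(tokens)):
            state = self.transitions.get((state, tokens[j].pos))
            if state is None:
                break
            if state in self.accepting:
                end = j + 1
        return end

    def apply(self, tokens):
        """Append this automaton's label to every token inside a match."""
        i = 0
        while i < len(tokens):
            end = self.longest_match(tokens, i)
            if end is None:
                i += 1
                continue
            for tok in tokens[i:end]:
                tok.tags.append(self.label)
            i = end


# Hypothetical clause-level automaton: WDT VBD (DT? NN)?
relative_clause = DFA(
    "RELCL", 0, {2, 4},
    {(0, "WDT"): 1, (1, "VBD"): 2, (2, "DT"): 3, (2, "NN"): 4, (3, "NN"): 4},
)

# Hypothetical phrase-level automaton: DT? JJ* NN+
noun_phrase = DFA(
    "NP", 0, {2},
    {(0, "DT"): 1, (0, "NN"): 2, (1, "JJ"): 1, (1, "NN"): 2, (2, "NN"): 2},
)

# Larger units are tagged before smaller ones.
CASCADE = [relative_clause, noun_phrase]


def parse(tagged_words):
    """Run the cascade over (word, pos) pairs and return the tagged tokens."""
    tokens = [Token(word, pos) for word, pos in tagged_words]
    for automaton in CASCADE:
        automaton.apply(tokens)
    return tokens


sentence = [("The", "DT"), ("dog", "NN"), ("that", "WDT"), ("chased", "VBD"),
            ("the", "DT"), ("cat", "NN"), ("slept", "VBD")]
for tok in parse(sentence):
    print(tok.word, tok.tags)
```

Because the clause automaton runs first, "the cat" receives the clause label before the phrase label, so each token's tag list reads from the largest enclosing unit down to the smallest.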