You are here
PHONEME-BASED VIDEO INDEXING USING PHONETIC DISPARITY SEARCH
- Date Issued:
- 2010
- Abstract/Description:
- This dissertation presents and evaluates a method to the video indexing problem by investigating a categorization method that transcribes audio content through Automatic Speech Recognition (ASR) combined with Dynamic Contextualization (DC), Phonetic Disparity Search (PDS) and Metaphone indexation. The suggested approach applies genome pattern matching algorithms with computational summarization to build a database infrastructure that provides an indexed summary of the original audio content. PDS complements the contextual phoneme indexing approach by optimizing topic seek performance and accuracy in large video content structures. A prototype was established to translate news broadcast video into text and phonemes automatically by using ASR utterance conversions. Each phonetic utterance extraction was then categorized, converted to Metaphones, and stored in a repository with contextual topical information attached and indexed for posterior search analysis. Following the original design strategy, a custom parallel interface was built to measure the capabilities of dissimilar phonetic queries and provide an interface for result analysis. The postulated solution provides evidence of a superior topic matching when compared to traditional word and phoneme search methods. Experimental results demonstrate that PDS can be 3.7% better than the same phoneme query, Metaphone search proved to be 154.6% better than the same phoneme seek and 68.1 % better than the equivalent word search.
Title: | PHONEME-BASED VIDEO INDEXING USING PHONETIC DISPARITY SEARCH. |
47 views
25 downloads |
---|---|---|
Name(s): |
Leon-Barth, Carlos, Author DeMara, Ronald, Committee Chair University of Central Florida, Degree Grantor |
|
Type of Resource: | text | |
Date Issued: | 2010 | |
Publisher: | University of Central Florida | |
Language(s): | English | |
Abstract/Description: | This dissertation presents and evaluates a method to the video indexing problem by investigating a categorization method that transcribes audio content through Automatic Speech Recognition (ASR) combined with Dynamic Contextualization (DC), Phonetic Disparity Search (PDS) and Metaphone indexation. The suggested approach applies genome pattern matching algorithms with computational summarization to build a database infrastructure that provides an indexed summary of the original audio content. PDS complements the contextual phoneme indexing approach by optimizing topic seek performance and accuracy in large video content structures. A prototype was established to translate news broadcast video into text and phonemes automatically by using ASR utterance conversions. Each phonetic utterance extraction was then categorized, converted to Metaphones, and stored in a repository with contextual topical information attached and indexed for posterior search analysis. Following the original design strategy, a custom parallel interface was built to measure the capabilities of dissimilar phonetic queries and provide an interface for result analysis. The postulated solution provides evidence of a superior topic matching when compared to traditional word and phoneme search methods. Experimental results demonstrate that PDS can be 3.7% better than the same phoneme query, Metaphone search proved to be 154.6% better than the same phoneme seek and 68.1 % better than the equivalent word search. | |
Identifier: | CFE0003480 (IID), ucf:48979 (fedora) | |
Note(s): |
2010-12-01 Ph.D. Engineering and Computer Science, School of Electrical Engineering and Computer Science Masters This record was generated from author submitted information. |
|
Subject(s): |
Video Indexing Phonetic Search Word Spotting ASR Speech Recognition |
|
Persistent Link to This Record: | http://purl.flvc.org/ucf/fd/CFE0003480 | |
Restrictions on Access: | campus 2012-06-01 | |
Host Institution: | UCF |