Current Search: Speech
-
-
Title
-
The generation of synthetic speech sounds by digital coding.
-
Creator
-
Steinberger, Eddy Alan, Engineering
-
Abstract / Description
-
Florida Technological University College of Engineering Thesis; The feasibility of representing human speech by serial digital codes was investigated by exercising specially constructed digital logic coupled with standard audio output equipment. The theories being tested represent a radical departure from previous efforts in the field of speech research. Therefore, this initial investigation was limited in scope to a study of unconnected English language speech sounds at the phoneme level.
-
Date Issued
-
1975
-
Identifier
-
CFR0002823, ucf:52917
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFR0002823
-
-
Title
-
Speech Synthesis Utilizing Microcomputer Control.
-
Creator
-
Uzel, Joseph N., Patz, Benjamin W., Engineering
-
Abstract / Description
-
Florida Technological University College of Engineering Thesis; This report explores the subject of speech synthesis. Information given includes a brief explanation of speech production in man, an historical view of speech synthesis, and four types of electronic synthesizers in use today. Also included is a brief presentation on phonetics, the study of speech sounds. An understanding of this subject is necessary to see how a synthesizer must produce certain sounds, and how these sounds are put together to create words. Finally, a description of a limited text speech synthesizer is presented. This system allows the user to enter English text via a keyboard and have it output in spoken form. The future of speech synthesis appears to be very bright. This report also gives some possible applications of verbal computer communication.
-
Date Issued
-
1978
-
Identifier
-
CFR0004781, ucf:52972
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFR0004781
-
-
Title
-
THE SPEECH SITUATION CHECKLIST: A NORMATIVE AND COMPARATIVE INVESTIGATION OF CHILDREN WHO DO AND DO NOT STUTTER.
-
Creator
-
Verghese, Susha, Vanryckeghem, Martine, University of Central Florida
-
Abstract / Description
-
Studies conducted over the past decades have identified the presence of a greater amount of negative emotional reaction and speech disruption in particular speech situations among children who stutter, compared to those who do not (Brutten & Vanryckeghem, 2003b; Knudson, 1939; Meyers, 1986; Trotter, 1983). Laboratory investigations have been utilized to describe the particular situations that elicit the greatest or least amount of speech concern and fluency failures. More recently, in order to deal with the limitations of laboratory research, the use of self-report tests has gained popularity as a means of exploring the extent of negative emotional reaction and speech disruption in a wide array of speaking situations. However, the availability of such instruments for use with children has been limited. Toward this end, the Speech Situation Checklist (SSC) was designed for use with youngsters who do and do not stutter (Brutten, 1965b, 2003b). Past investigations utilizing the SSC for Children have reported on reliability and validity information and provided useful normative data (Brutten & Vanryckeghem, 2003b; Trotter, 1983). Additionally, the findings from those research studies have consistently revealed statistically significant differences in speech-related negative emotional response and speech disorganization between children who do and do not stutter. However, since its initial construction, the SSC has undergone modifications, and the paucity of normative data for the current American form of the SSC has restricted its clinical use. To fill this void, the revised SSC for children was utilized in the present study to obtain current normative and comparative data for American grade-school stuttering and nonstuttering children. Additionally, the effect of age and gender (and their interaction) on the emotional reaction and speech disruption scores of the SSC was examined.
The SSC self-report test was administered to 79 nonstuttering and 19 stuttering elementary and middle-school children between the ages of 6 and 13. Only those nonstutterers who showed no evidence of a speech, language, reading, writing or learning difficulty, or any additional motor or behavioral problems, were included in the subject pool. Similarly, only those stuttering participants who did not demonstrate any language or speech disorder other than stuttering were included in the study. Measures of central tendency and variance indicated an overall mean score of 78.26 (SD=19.34) and 85.69 (SD=22.25) for the sample of nonstuttering children on the Emotional Reaction section and Speech Disruption section of the SSC, respectively. For the group of stutterers, the overall mean for Emotional Reaction was 109.53 (SD=34.35) and 109.42 (SD=21.33) for the Speech Disruption section. This difference in group means proved to be statistically significant for both emotional response (t=3.816, p=.001) and fluency failures (t=4.169, p=.000), indicating that, as a group, children who stutter report significantly more in the way of emotional response to and fluency failures in the situations described in the SSC, compared to their fluent peers. Significant high correlations were also obtained between the report of emotional response and the extent of fluency failures in the various speaking situations for both the group of nonstuttering (.70) and stuttering (.71) children. As far as the effect of age and gender is concerned, the present study found no significant difference in the ER and SD scores between the male and female or the younger and older groups of nonstuttering children. Interestingly, a significant age-by-gender interaction was obtained for the nonstuttering children, only on the Speech Disruption section of the test.
-
Date Issued
-
2004
-
Identifier
-
CFE0000239, ucf:46270
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0000239
-
-
Title
-
VOICE AUTHENTICATION: A STUDY OF POLYNOMIAL REPRESENTATION OF SPEECH SIGNALS.
-
Creator
-
Strange, John, Mohapatra, Ram, University of Central Florida
-
Abstract / Description
-
A subset of speech recognition is the use of speech recognition techniques for voice authentication. Voice authentication is an alternative security application to other biometric security measures such as the use of fingerprints or iris scans. Voice authentication has advantages over the other biometric measures in that it can be utilized remotely, via a device like a telephone. However, voice authentication has disadvantages in that the authentication system typically requires larger memory and more processing time than do fingerprint or iris scanning systems. Also, voice authentication research has yet to provide an authentication system as reliable as the other biometric measures. Most voice recognition systems use Hidden Markov Models (HMMs) as their basic probabilistic framework. Also, most voice recognition systems use a frame-based approach to analyze the voice features. An example of research which has been shown to provide more accurate results is the use of a segment-based model. The HMMs impose a requirement that each frame be conditionally independent of the next. However, at a fixed frame rate, typically 10 ms, adjacent feature vectors might span the same phonetic segment; they often exhibit smooth dynamics and are highly correlated. The relationship between features of different phonetic segments is much weaker. Therefore, the segment-based approach makes fewer conditional independence assumptions, and those assumptions are violated to a lesser degree than for the frame-based approach. Thus, HMMs using segment-based approaches are more accurate. The speech polynomials (feature vectors) used in the segmental model have been shown to be Chebychev polynomials. Use of the properties of these polynomials has made it possible to reduce the computation time for speech recognition systems. Also, representing the spoken-word waveform as a Chebychev polynomial allows the recognition system to easily extract useful and repeatable features from the waveform, allowing for a more accurate identification of the speaker. This thesis describes the segmental approach to speech recognition and addresses in detail the use of Chebychev polynomials in the representation of spoken words, specifically in the area of speaker recognition.
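The abstract's segment-based representation treats the Chebychev (Chebyshev) polynomial coefficients of a speech segment as a fixed-length feature vector. A minimal sketch of that idea using NumPy; the degree, segment length, and sampling rate here are illustrative assumptions, not values taken from the thesis:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def cheb_features(segment, degree=8):
    """Fit a Chebyshev polynomial to one speech segment and
    return its coefficients as a fixed-length feature vector."""
    # Map sample positions onto [-1, 1], the natural domain
    # of Chebyshev polynomials.
    x = np.linspace(-1.0, 1.0, len(segment))
    return C.chebfit(x, segment, degree)

# Toy example: a segment resembling a short voiced sound.
t = np.linspace(0, 0.01, 160)            # 10 ms at 16 kHz
segment = np.sin(2 * np.pi * 200 * t)    # 200 Hz "voiced" tone
features = cheb_features(segment)
print(features.shape)                     # (9,) -> degree + 1 coefficients
```

Because every segment yields the same number of coefficients regardless of its sample count, segments of different durations become directly comparable, which is what makes such coefficients usable as repeatable speaker features.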
-
Date Issued
-
2005
-
Identifier
-
CFE0000366, ucf:46340
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0000366
-
-
Title
-
SUPPORTING REAL-TIME PDA INTERACTION WITH VIRTUAL ENVIRONMENT.
-
Creator
-
Shah, Radhey, Chatterjee, Mainak, University of Central Florida
-
Abstract / Description
-
Personal Digital Assistants (PDAs) are becoming more and more powerful with advances in technology and are expanding their applications in a variety of fields. This work explores the use of PDAs in Virtual Environments (VEs). The goal is to support highly interactive, bi-directional user interactions in Virtual Environments in more natural and less cumbersome ways. A proxy-based approach is adopted to support a wide range of handheld devices and to allow multi-PDA interaction with the virtual world. The complete system consists of three components: a PDA; a desktop that acts as a proxy; and the Virtual Environment Software Sandbox (VESS), software developed at the Institute for Simulation and Training (IST). The purpose of the architecture is to enable issuing text and voice commands from the PDA to virtual entities in VESS through the proxy. The commands are a pre-defined set of simple words such as 'move forward', 'turn right', 'go', and 'stop'. These commands are matched at the proxy and sent to VESS as text in XML format. The response from VESS is received at the proxy and forwarded back to the PDA. Performance measurements of the response-time characteristics of text messages between the PDA and the proxy over Wi-Fi networks were conducted. The results are discussed with respect to the delays acceptable for human perception in order to have real-time interaction between a PDA and an avatar in the virtual world.
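The proxy's role of matching a command and forwarding it as XML can be sketched as follows. The element name and the exact XML schema are hypothetical, since the abstract does not specify the format VESS expects:

```python
import xml.etree.ElementTree as ET

# The pre-defined command set named in the abstract.
COMMANDS = {"move forward", "turn right", "go", "stop"}

def to_xml(command: str) -> str:
    """Match a text/voice command at the proxy and wrap it as XML
    for forwarding to the virtual environment (schema assumed)."""
    if command not in COMMANDS:
        raise ValueError(f"unrecognized command: {command!r}")
    root = ET.Element("command")
    root.text = command
    return ET.tostring(root, encoding="unicode")

print(to_xml("move forward"))  # <command>move forward</command>
```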
-
Date Issued
-
2004
-
Identifier
-
CFE0000195, ucf:46158
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0000195
-
-
Title
-
A Comparison of the Verbal Transformation Effect in Normal and Learning Disabled Children.
-
Creator
-
Kissell, Ellen E., Mullin, Thomas A., Social Sciences
-
Abstract / Description
-
Florida Technological University College of Social Sciences Thesis
-
Date Issued
-
1976
-
Identifier
-
CFR0008176, ucf:53063
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFR0008176
-
-
Title
-
DISCUSSION ON EFFECTIVE RESTORATION OF ORAL SPEECH USING VOICE CONVERSION TECHNIQUES BASED ON GAUSSIAN MIXTURE MODELING.
-
Creator
-
Alverio, Gustavo, Mikhael, Wasfy, University of Central Florida
-
Abstract / Description
-
Today's world consists of many ways to communicate information. One of the most effective ways to communicate is through the use of speech. Unfortunately, many lose the ability to converse. This in turn leads to a large negative psychological impact. In addition, skills such as lecturing and singing must now be restored via other methods. The use of text-to-speech synthesis has been a popular means of restoring the capability to use oral speech. Text-to-speech synthesizers convert text into speech. Although text-to-speech systems are useful, they allow for only a few default voice selections that do not represent the voice of the user. In order to achieve total restoration, voice conversion must be introduced. Voice conversion is a method that adjusts a source voice to sound like a target voice. Voice conversion consists of a training process and a converting process. The training process is conducted by composing a speech corpus to be spoken by both the source and target voices. The speech corpus should encompass a variety of speech sounds. Once training is finished, the conversion function is employed to transform the source voice into the target voice. Effectively, voice conversion allows a speaker to sound like any other person. Therefore, voice conversion can be applied to alter the voice output of a text-to-speech system to produce the target voice. This thesis investigates how one approach, specifically voice conversion using Gaussian mixture modeling, can be applied to alter the voice output of a text-to-speech synthesis system. The researchers found that acceptable results can be obtained from these methods. Although voice conversion and text-to-speech synthesis are effective in restoring voice, a sample of the speaker's voice from before voice loss must be used during the training process. It is therefore vital that voice samples be recorded in advance to combat voice loss.
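Gaussian-mixture voice conversion of the kind the abstract describes is commonly implemented as a joint-density GMM over aligned source and target features, with conversion done by the conditional expectation E[y | x]. A toy sketch under stated assumptions: the 2-D features, component count, and synthetic "aligned" data below are illustrative, not the thesis's actual configuration (real systems use aligned spectral features such as MFCCs):

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from scipy.stats import multivariate_normal

d = 2                                   # feature dimension (toy)
rng = np.random.default_rng(0)
src = rng.normal(size=(500, d))         # source-speaker features
tgt = src @ np.array([[1.2, 0.1], [0.0, 0.8]]) + 0.5   # aligned target

# Train one GMM on the joint [source; target] vectors.
joint = np.hstack([src, tgt])
gmm = GaussianMixture(n_components=4, covariance_type="full",
                      random_state=0).fit(joint)

def convert(x):
    """Map one source frame to the expected target frame E[y | x]."""
    # Posterior responsibility of each mixture component given x.
    resp = np.zeros(gmm.n_components)
    for m in range(gmm.n_components):
        mu_x = gmm.means_[m, :d]
        cov_xx = gmm.covariances_[m][:d, :d]
        resp[m] = gmm.weights_[m] * multivariate_normal.pdf(x, mu_x, cov_xx)
    resp /= resp.sum()
    # Responsibility-weighted conditional means.
    out = np.zeros(d)
    for m in range(gmm.n_components):
        mu_x, mu_y = gmm.means_[m, :d], gmm.means_[m, d:]
        cov_xx = gmm.covariances_[m][:d, :d]
        cov_yx = gmm.covariances_[m][d:, :d]
        out += resp[m] * (mu_y + cov_yx @ np.linalg.solve(cov_xx, x - mu_x))
    return out

print(convert(src[0]).shape)  # (2,)
```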
-
Date Issued
-
2007
-
Identifier
-
CFE0001793, ucf:47286
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0001793
-
-
Title
-
COMPREHENDING SYNTHETIC SPEECH: PERSONAL AND PRODUCTION INFLUENCES.
-
Creator
-
Wang Costello, Jingjing, Gilson, Richard, University of Central Florida
-
Abstract / Description
-
With the increasing prevalence of voice-production technology across societies, clear comprehension while listening to synthetic speech is an obvious goal. Common human factors influences include the listener's language familiarity and age. Production factors include the speaking rate and clarity. This study investigated the speech comprehension performance of younger and older adults who learned English as their first or second language. Presentations varied by the rate of delivery in words per minute (wpm) and in two forms, synthetic or natural speech. The results showed that younger adults had significantly higher comprehension performance than older adults. English as First Language (EFL) participants performed better than English as Second Language (ESL) participants for both younger and older adults, although the performance gap for the older adults was significantly larger than for younger adults. Younger adults performed significantly better than older adults at the slow speech rate (127 wpm), but surprisingly, at the medium speech rate (188 wpm), both age groups performed similarly. Both young and older participants had better comprehension when listening to synthetic speech than natural speech. Both theoretical and design implications are provided from these findings. A cognitive diagnostic tool is proposed as a recommendation for future research.
-
Date Issued
-
2011
-
Identifier
-
CFE0003925, ucf:48703
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0003925
-
-
Title
-
THE ROLE OF STRESS IN THE DIFFERENTIAL DIAGNOSIS OF APRAXIA OF SPEECH AND APHASIA.
-
Creator
-
Ferranti, Jennifer G, Troche, Joshua, Bislick-Wilson, Lauren, University of Central Florida
-
Abstract / Description
-
The intent of this thesis is to explore and develop the quantification of features of apraxia of speech (AOS), particularly deficits in the prosodic elements of lexical stress and duration. This study investigated whether the Pairwise Variability Index (PVI) can be used as a sensitive tool for the differential diagnosis of AOS. Specifically, we sought to determine whether analysis of the vowel lengths of stressed and unstressed syllables is helpful in differentiating between individuals with AOS and aphasia versus aphasia alone. Significant differences support the hypothesis that PVI, computed from vowel length, is uniquely affected in AOS. This provides grounds for further research on prosodic deficits in differential diagnosis, as well as for application of this measure by speech-language pathologists.
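The Pairwise Variability Index over successive vowel durations is commonly computed in its normalized form (nPVI): the mean absolute difference between neighboring durations, divided by their mean, scaled by 100. A sketch of that standard formula, assuming vowel durations in milliseconds (the thesis's exact measurement protocol is not given in the abstract):

```python
def npvi(durations):
    """Normalized Pairwise Variability Index over successive vowel
    durations (ms). Higher values indicate greater contrast between
    neighboring stressed/unstressed vowels; near-equal durations,
    as reported for AOS syllables, yield low values."""
    pairs = zip(durations, durations[1:])
    terms = [abs(a - b) / ((a + b) / 2) for a, b in pairs]
    return 100 * sum(terms) / len(terms)

# Alternating long/short vowels (strong lexical stress) score high.
print(round(npvi([200, 100, 210, 95]), 1))   # 71.0
# Equalized durations (reduced stress contrast) score at the floor.
print(round(npvi([150, 150, 150, 150]), 1))  # 0.0
```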
-
Date Issued
-
2018
-
Identifier
-
CFH2000388, ucf:45746
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFH2000388
-
-
Title
-
Standing Up Comedy: Analyzing Rhetorical Approaches to Identity in Stand-up Comedy.
-
Creator
-
Grabert, Christopher, Holic, Nathan, Wheeler, Stephanie, Brenckle, Martha, University of Central Florida
-
Abstract / Description
-
My thesis addresses contemporary conversations about stand-up comedy and the art form's capacity for facilitating complex rhetorical decision-making. I examine how stand-up comedians have positioned themselves on stage through choices pertaining to revealing personal behaviors, personas, and beliefs in public settings. Ultimately, I argue that the art of stand-up does not require truth-telling on stage, and that there exists an implicit contract between performers and audiences which details comedians' license to share falsehoods, exaggerations, and embellishments on stage without the repercussions that accompany these actions in other discourse settings. Finally, I evaluate how comics have handled this rhetorical "license," with some performers delivering easily identifiable falsehoods on stage through characters and caricatures, and others choosing to deliver autobiographical material in spite of the license. My research offers a framework through which audiences may digest the speech utterances in stand-up comedy performances as the product of purely rhetorical, calculated choices. I propose that audiences treat each stand-up performance, no matter how seemingly intimate or personal, as artifice. I then offer case studies of three comedians who approach the notion of crafting an on-stage persona in different fashions and evaluate how each of these comedians utilizes the implicit license of stand-up comedy. My research contributes to conversations in rhetoric and composition related to the performance of public and private "selves."
-
Date Issued
-
2019
-
Identifier
-
CFE0007889, ucf:52773
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0007889
-
-
Title
-
The Impact of Degraded Speech and Stimulus Familiarity in a Dichotic Listening Task.
-
Creator
-
Sinatra, Anne, Sims, Valerie, Hancock, Peter, Szalma, James, Chin, Matthew, Renk, Kimberly, University of Central Florida
-
Abstract / Description
-
It has been previously established that when engaged in a difficult, attention-intensive task which involves repeating information while blocking out other information (the dichotic listening task), participants are often able to report hearing their own names in an unattended audio channel (Moray, 1959). This phenomenon, called the cocktail party effect, is a result of words that are important to oneself having a lower threshold, so that less attention is necessary to process them (Treisman, 1960). The current studies examined the ability of a person engaged in an attention-demanding task to hear and recall low-threshold words from a fictional story. These low-threshold words included a traditional alert word, "fire," and fictional character names from a popular franchise, Harry Potter. Further, the role of stimulus degradation was examined by including synthetic and accented speech in the task to determine how it would impact attention and performance. In Study 1, participants repeated passages from a novel that was largely unfamiliar to them, The Secret Garden, while blocking out a passage from a much more familiar source, Harry Potter and the Deathly Hallows. Each unattended Harry Potter passage was edited so that it would include four names from the series and the word "fire" twice. The type of speech present in the attended and unattended ears (natural or synthetic) was varied to examine the impact that processing degraded speech would have on performance. The speech that the participant shadowed did not impact unattended recall; however, it did impact shadowing accuracy. The speech type present in the unattended ear did impact the ability to recall low-threshold Harry Potter information. When the unattended speech type was synthetic, significantly less Harry Potter information was recalled.
Interestingly, while Harry Potter information was recalled by participants with both high and low Harry Potter experience, the traditional low-threshold word "fire" was not noticed by participants. Study 2 was designed to determine whether synthetic speech impeded the ability to report low-threshold Harry Potter names because it was degraded, or simply because it differed from natural speech. In Study 2, the attended (shadowed) speech was held constant as American natural speech, and the unattended ear was manipulated. An accent different from the native accent of the participants was included as a mild form of degradation. There were four experimental stimuli, each containing one of the following in the unattended ear: American natural, British natural, American synthetic, or British synthetic speech. Overall, more unattended information was reported when the unattended channel was natural than when it was synthetic. This implies that synthetic speech takes more working-memory processing power than even accented natural speech. Further, it was found that experience with the Harry Potter franchise played a role in the ability to report unattended Harry Potter information. Those who had high levels of Harry Potter experience, particularly with audiobooks, were able to process and report Harry Potter information from the unattended stimulus when it was British natural. In contrast, those with low Harry Potter experience were not able to report unattended Harry Potter information from this slightly degraded stimulus. Therefore, it is believed that the previous audiobook experience of those in the high Harry Potter experience group acted as training and resulted in less working memory being necessary to encode the unattended Harry Potter information. A pilot study was designed to examine the impact of story familiarity in the attended and unattended channels of a dichotic listening task.
In the pilot study, participants shadowed a Harry Potter passage (familiar) in one condition, with a passage from The Secret Garden (unfamiliar) playing in the unattended ear. A second condition had participants shadowing The Secret Garden (unfamiliar) with a passage from Harry Potter (familiar) present in the unattended ear. There was no significant difference in the number of unattended names recalled. Those with low Harry Potter experience reported significantly less attended information when they shadowed Harry Potter than when they shadowed The Secret Garden. Further, there appeared to be a trend such that those with high Harry Potter experience reported more attended information when they shadowed Harry Potter than The Secret Garden. This implies that experience with a franchise and its characters may make it easier to recall information about a passage, while lack of experience provides no assistance. Overall, the results of the studies indicate that we treat fictional characters in a way similar to ourselves. Names and information about fictional characters were able to break through into attention during a task that required a great deal of attention. Experience with the characters also served to assist working memory in processing the information under degraded circumstances. These results have important implications for training, the design of alerts, and the use of popular media in the classroom.
-
Date Issued
-
2012
-
Identifier
-
CFE0004256, ucf:49535
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0004256
-
-
Title
-
AUTOMATED REGRESSION TESTING APPROACH TO EXPANSION AND REFINEMENT OF SPEECH RECOGNITION GRAMMARS.
-
Creator
-
Dookhoo, Raul, DeMara, Ronald, University of Central Florida
-
Abstract / Description
-
This thesis describes an approach to automated regression testing for speech recognition grammars. A prototype Audio Regression Tester called ART has been developed using Microsoft's Speech API and C#. ART allows a user to perform any of three tasks: automatically generate a new XML-based grammar file from standardized SQL database entries, record and cross-reference audio files for use by an underlying speech recognition engine, and perform regression tests with the aid of an oracle grammar. ART takes as input a wave sound file containing speech and a newly created XML grammar file. It then simultaneously executes two tests: one with the wave file and the new grammar file, and the other with the wave file and the oracle grammar. The comparison of the two test results is used to determine whether the test was successful or not. This allows rapid, exhaustive evaluation of additions to grammar files to guarantee forward progress as the complexity of the voice domain grows. The data used in this research to derive results were taken from the LifeLike project; however, the capabilities of ART extend beyond LifeLike. The results gathered have shown that using a person's recorded voice to do regression testing is as effective as having the person do live testing. A cost-benefit analysis, using two published equations, one for Cost and the other for Benefit, was also performed to determine whether automated regression testing is really more effective than manual testing. Cost captures the salaries of the engineers who perform regression testing tasks, and Benefit captures revenue gains or losses related to changes in product release time. ART had a higher benefit of $21,461.08 when compared to manual regression testing, which had a benefit of $21,393.99. Coupled with its excellent error detection rates, ART has proven to be very efficient and cost-effective in speech grammar creation and refinement.
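The oracle-based comparison at the heart of ART can be sketched abstractly: run the same audio through both grammars and pass when the recognitions agree. The `recognize` callable and the lookup-table "engine" below are stand-ins for Microsoft's Speech API, which is not reproduced here:

```python
def regression_test(audio_id, new_grammar, oracle_grammar, recognize):
    """Run the same audio through the new grammar and the oracle
    grammar; the regression test passes when both results agree."""
    new_result = recognize(audio_id, new_grammar)
    oracle_result = recognize(audio_id, oracle_grammar)
    return new_result == oracle_result

# Toy engine: a lookup table keyed by (audio, grammar).
fake_engine = {
    ("utt1", "new.xml"): "open door",
    ("utt1", "oracle.xml"): "open door",
    ("utt2", "new.xml"): "close door",
    ("utt2", "oracle.xml"): "close the door",
}
recognize = lambda a, g: fake_engine[(a, g)]
print(regression_test("utt1", "new.xml", "oracle.xml", recognize))  # True
print(regression_test("utt2", "new.xml", "oracle.xml", recognize))  # False
```

Because the oracle grammar encodes known-good behavior, a disagreement flags exactly the grammar additions that changed recognition, which is what makes exhaustive re-testing of each grammar revision cheap.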
-
Date Issued
-
2008
-
Identifier
-
CFE0002437, ucf:47703
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0002437
-
-
Title
-
ACCURACY OF PARENTAL REPORT ON PHONOLOGICAL INVENTORIES OF TODDLERS.
-
Creator
-
Teske, Kristin, Carson, Cecyle, University of Central Florida
-
Abstract / Description
-
Considering the diminishing availability of professional resources and the increasing costs and time requirements involved in early childhood mass screenings, parents are an essential source of information. In this study, the Survey of Speech Development (SSD) (Perry-Carson & Steel, 2001; Steel, 2000) was used to determine the accuracy of parents in reporting the speech sound inventories of their toddlers. Parents of 30 children between 27 and 33 months of age completed the SSD prior to a speech and language assessment session. Based on the assessment results, the children were classified as normally developing or language delayed. A 20-minute play interaction between the parent and child was recorded during the assessment and was later transcribed for analysis. Speech sounds (consonants) were coded as present or absent, and comparisons were made between the parents' results on the SSD and data from the 20-minute speech sample. A point-by-point reliability analysis of the speech sounds on the SSD compared to those produced in the speech sample revealed an overall parental accuracy of 75%. Further, no differences were found between parent reports and transcribed accounts for the total number of different consonants. This was true for parents of both language-delayed and language-normal toddlers. The results suggest that, if given a systematic means of providing information, parents are a reliable source of information regarding the sounds their toddlers produce.
-
Date Issued
-
2005
-
Identifier
-
CFE0000676, ucf:46543
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0000676
-
-
Title
-
EFFECTS OF SIMULTANEOUS EXERCISE AND SPEECH TASKS ON THE PERCEPTION OF EFFORT AND VOCAL MEASURES IN AEROBIC INSTRUCTORS.
-
Creator
-
Koblick, Heather, Hoffman-Ruddy, Bari, University of Central Florida
-
Abstract / Description
-
The purpose of this study was to investigate the effects of simultaneous tasks of exercise and speech production on voice production and the perception of dyspnea in aerobic instructors. The study aimed to document changes that occur during four conditions: 1) voice production without exercise and no use of amplification; 2) voice production without exercise and the use of amplification; 3) voice production during exercise without the use of amplification; and 4) voice production during exercise with the use of amplification. Participants included ten aerobic instructors (two male and eight female). The dependent variables included vocal intensity, average fundamental frequency (F0), noise-to-harmonic ratio (NHR), jitter percent (jitt %), shimmer percent (shim %), and participants' self-perception of dyspnea. The results indicated that speech alone, whether with or without amplification, had no effect on the sensation of dyspnea. However, when speech was combined with exercise, the speech task became increasingly difficult, even more so without the use of amplification. Exercise was observed to inhibit vocal loudness, as vocal intensity measures were lowest in the exercise conditions that used amplification. Increases in F0 occurred in conditions involving exercise without the use of amplification. Moreover, four participants in various conditions exhibited frequencies that diverged from the normal range for their gender. Participants' NHR increased during periods of exercise; however, no participants were found to have NHR measures outside the normal range. Four participants were found to have moderate laryngeal pathology that was hemorrhagic in nature. Findings suggest that traditional treatment protocols may need to be modified beyond hygienic approaches in order to address both the respiratory and laryngeal workloads encountered in this population and in others involving similar occupational tasks.
-
Date Issued
-
2004
-
Identifier
-
CFE0000274, ucf:46234
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0000274
-
-
Title
-
WORKING MEMORY, SEARCH, AND SIGNAL DETECTION: IMPLICATIONS FOR INTERACTIVE VOICE RESPONSE SYSTEM MENU DESIGN.
-
Creator
-
Commarford, Patrick, Smither, Janan, University of Central Florida
-
Abstract / Description
-
Many researchers and speech user interface practitioners assert that interactive voice response (IVR) menus must be relatively short due to constraints of the human memory system, and they commonly cite Miller's (1956) paper to support this claim. The current paper argues that these authors commonly misuse the information provided in Miller's paper and that hypotheses drawn from modern theories of working memory (e.g., Baddeley and Hitch, 1974) lead to the opposite conclusion: reducing menu length by creating a greater number of menus and a deeper structure will actually be more demanding on users' working memories and will lead to poorer performance and poorer user satisfaction. The primary purpose of this series of experiments was to gain a greater understanding of the role of working memory in speech-enabled IVR use. The experiments also sought to determine whether theories of visual search and signal detection theory (SDT) could be used to predict auditory search behavior. Results indicate that creating a deeper structure with shorter menus is detrimental to performance and satisfaction and more demanding of working memory resources. Further, the experiment provides support for arguments developed from Macgregor, Lee, and Lam's dual criterion decision model and is a first step toward applying SDT to the IVR domain.
-
Date Issued
-
2006
-
Identifier
-
CFE0000987, ucf:46715
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0000987
-
-
Title
-
RECONSTRUCTING THE VOCAL CAPABILITIES OF HOMO HEIDELBERGENSIS, A CLOSE HUMAN ANCESTOR.
-
Creator
-
Stanley, Austin Blake, Starbuck, John, University of Central Florida
-
Abstract / Description
-
The discovery of 5,500 Homo heidelbergensis fossil specimens at the Sima de los Huesos archaeological site in Spain has opened up the opportunity for research on the vocal capabilities of this species. Previous research has revealed that the range of vowel sounds an individual can produce, known as the vowel space, is directly affected by the dimensions of the vocal tract. The vowel spaces of two hominins, Homo sapiens and Homo neanderthalensis, have been reconstructed through previous research; however, the vowel space of Homo heidelbergensis has not yet been reconstructed. In this research, I aim to explore how the dimensions of the Homo heidelbergensis vocal tract affect the vowel space of that species. This was pursued by measuring the craniospinal dimensions of five Homo heidelbergensis specimens using three-dimensional imaging software. When measurements were unattainable due to limitations in the fossil record, regression equations were used to predict the missing measurements. By doing so, the vowel space of this species was reconstructed, revealing crucial information about the vocal capabilities of this close human ancestor.
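The regression step the abstract describes (predicting measurements that are unattainable in the fossil record from dimensions that can be measured) can be illustrated with a minimal sketch. The measurement names and all numbers below are invented for illustration and are not drawn from the thesis:

```python
import numpy as np

# Hypothetical regression-based imputation: fit a line to specimens where
# both dimensions are measurable, then predict the missing dimension for
# an incomplete specimen. All values are invented placeholders (mm).
palate_len = np.array([95.0, 102.0, 110.0, 118.0])    # measurable dimension
pharynx_depth = np.array([80.0, 86.0, 93.0, 99.0])    # sometimes missing

# Ordinary least-squares fit: pharynx_depth ≈ slope * palate_len + intercept
slope, intercept = np.polyfit(palate_len, pharynx_depth, 1)

# Estimate the missing depth for a specimen with a 104 mm palate length
predicted = slope * 104.0 + intercept
print(round(predicted, 1))
```

The same idea scales to multiple predictors; a single-predictor fit is shown only to keep the sketch readable.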
-
Date Issued
-
2018
-
Identifier
-
CFH2000312, ucf:45726
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFH2000312
-
-
Title
-
A qualitative analysis of key concepts in Islam from the perspective of imams.
-
Creator
-
Dobiyanski, Chandler, Matusitz, Jonathan, Yu, Nan, Barfield, Rufus, University of Central Florida
-
Abstract / Description
-
The continuous occurrence of terrorist attacks in the name of Islam has shown that this ideology and its tenets are at least somewhat connected to jihadists committing attacks in its name. The researcher investigated this ideology in terms of 13 themes across 58 sermons outlined in the tables in the appendix: brotherhood, death, freedom, human rights, justice and equality, love, oppression, peace and treaty, self-defense, sin, submission, terrorism, and truth vs. lies. The researcher used a sample of 10 sermons from U.S.-born imams and 10 sermons from foreign-born imams as the basis for the analysis of the theories and themes. Conducting a thematic analysis of U.S.-born and foreign-born imams' sermons, the researcher uncovered their interpretations of these themes and then investigated the imams' speech codes. The researcher found that imams born in the United States focused more on religious speech codes, compared to the international imams, who focused more prominently on cultural speech codes. In terms of social codes, foreign-born imams seem to be more focused on relationships, while those born in the United States focus more on religious conduct. In terms of religious codes, foreign-born imams seem to have a checklist of requirements for how to act, including references to believers vs. disbelievers and to historical aspects of the codes, while those born in the United States focused more on codes referring to everyday activities, people, and the kind of conduct a Muslim should have. In terms of cultural codes, foreign-born imams seem to feel an immediate need to physically defend against outside forces, compared to United States-born imams, who discuss how to better oneself, how cultural aspects are a distraction, and how Muslim converts are more inspirational than the Muslim-born, since the converts actively rejected their cultural norms in favor of Islam.
-
Date Issued
-
2018
-
Identifier
-
CFE0007324, ucf:52145
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0007324
-
-
Title
-
PHONEME-BASED VIDEO INDEXING USING PHONETIC DISPARITY SEARCH.
-
Creator
-
Leon-Barth, Carlos, DeMara, Ronald, University of Central Florida
-
Abstract / Description
-
This dissertation presents and evaluates a method for the video indexing problem by investigating a categorization method that transcribes audio content through Automatic Speech Recognition (ASR) combined with Dynamic Contextualization (DC), Phonetic Disparity Search (PDS), and Metaphone indexation. The suggested approach applies genome pattern matching algorithms with computational summarization to build a database infrastructure that provides an indexed summary of the original audio content. PDS complements the contextual phoneme indexing approach by optimizing topic search performance and accuracy in large video content structures. A prototype was established to translate news broadcast video into text and phonemes automatically by using ASR utterance conversions. Each phonetic utterance extraction was then categorized, converted to Metaphones, and stored in a repository with contextual topical information attached and indexed for posterior search analysis. Following the original design strategy, a custom parallel interface was built to measure the capabilities of dissimilar phonetic queries and provide an interface for result analysis. The postulated solution provides evidence of superior topic matching when compared to traditional word and phoneme search methods. Experimental results demonstrate that PDS can be 3.7% better than the same phoneme query, while Metaphone search proved to be 154.6% better than the equivalent phoneme search and 68.1% better than the equivalent word search.
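The phonetic-key idea behind the Metaphone indexation described above can be sketched with a simplified encoder. The sketch below uses the simpler, related Soundex scheme, not the dissertation's actual Metaphone implementation, to show how sound-alike words collapse to a shared index key:

```python
def soundex(word: str) -> str:
    """Simplified Soundex: a 4-character phonetic key for an English word.

    A stand-in for the Metaphone encoding the dissertation uses; both map
    words to codes by pronunciation rather than spelling, so sound-alike
    words share a key. (Simplified: the H/W adjacency rule is omitted.)
    """
    codes = {c: d for d, letters in
             {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
              "4": "L", "5": "MN", "6": "R"}.items() for c in letters}
    word = word.upper()
    key, prev = word[0], codes.get(word[0], "")
    for ch in word[1:]:
        digit = codes.get(ch, "")        # vowels and H/W/Y encode to ""
        if digit and digit != prev:      # skip repeats of the same code
            key += digit
        prev = digit
    return (key + "000")[:4]             # pad or truncate to 4 characters

# Sound-alike names share a key, so a phonetic index can match them:
print(soundex("Robert"), soundex("Rupert"))   # both encode to R163
```

An index built over such keys lets a query like "Smyth" retrieve documents containing "Smith", which is the property the phoneme- and Metaphone-based search above exploits at the utterance level.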
-
Date Issued
-
2010
-
Identifier
-
CFE0003480, ucf:48979
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0003480
-
-
Title
-
Deconstructing Disability, Assistive Technology: Secondary Orality, The Path to Universal Access.
-
Creator
-
Tripathi, Tara Prakash, Grajeda, Anthony, Campbell, James, Mauer, Barry, Metcalf, David, University of Central Florida
-
Abstract / Description
-
When Thomas Edison applied for a patent for his phonograph, he listed talking books for the blind as one of the benefits of his invention. Edison was correct in his claim about talking books, or audio books. Audio books have immensely helped the blind to achieve their academic and professional goals, and blind and visually impaired people have also used audio books for pleasure reading. But several studies have demonstrated the benefits of audio books for people who are not defined as disabled: many nondisabled people listen to audio books and take advantage of speech-based technology, such as text-to-speech programs, in their daily activities. Speech-based technology, however, has remained on the margins of academic environments, where the hegemony of the sense of vision is palpable. Dominance of the sense of sight can be seen in school curricula, classrooms, libraries, academic conferences, books and journals, and virtually everywhere else. This dissertation analyzes the reasons behind such apathy towards technology based on speech. Jacques Derrida's concept of the 'metaphysics of presence' helps us understand the arbitrary privileging of one side of a binary at the expense of the other. I demonstrate in this dissertation that both the 'disabled' and the technology they use are on the less privileged side of the binary formations they are part of. I use Derrida's method of 'deconstruction' to deconstruct the binaries of 'assistive' and 'mainstream technology' on one hand, and of the 'disabled' and 'nondisabled' on the other. Donna Haraway and Katherine Hayles present an alternative reading of the body to conceive of a post-gendered posthuman identity; I borrow from their work on cyborgism and posthumanism to conceive of a technology-driven post-disabled world.
Cyberspace is a good and tested example of an identity without body and a space without disability. The opposition between mainstream and speech-based assistive technology can be deconstructed with the example of what Walter Ong calls 'secondary orality.' Both disabled and nondisabled people use speech-based technology in their daily activities. Sighted people are increasingly listening to audio books and podcasts, and secondary orality is also manifest on their GPS devices. Thus, secondary orality is a common element in assistive and mainstream technologies, hitherto segregated by designers. Just as Derrida uses the concept of 'incest' to deconstruct the binary opposition between Nature and Culture, I employ 'secondary orality' as a deconstructing tool in the context of mainstream and assistive technology. Mainstream electronic devices (smart phones, mp3 players, and computers, for instance) can now be controlled with speech, and they can also read the screen aloud. With the Siri assistant, the new application on the iPhone that allows the device to be controlled with speech, we seem to be very close to 'the age of talking computers' that William Crossman foretells. As a result of such progress in speech technology, I argue, we no longer need the concept of speech-based assistive technology.
-
Date Issued
-
2012
-
Identifier
-
CFE0004259, ucf:49521
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0004259
-
-
Title
-
The Effect of Speech Elicitation Method on Second Language Phonemic Accuracy.
-
Creator
-
Carrasquel, Nicole, Farina, Marcella, Folse, Keith, Purmensky, Kerry, Clark, M. H., University of Central Florida
-
Abstract / Description
-
The present study, a one-group posttest-only repeated-measures design, examined the effect of speech elicitation method on second language (L2) phonemic accuracy of high functional load initial phonemes found in frequently occurring nouns in American English. This effect was further analyzed by including the variable of first language (L1) to determine whether L1 moderated any effects found. The data consisted of audio recordings of 61 adult English learners (ELs) enrolled in English for Academic Purposes (EAP) courses at a large, public, post-secondary institution in the United States. Phonemic accuracy was judged by two independent raters on a dichotomous scale, as either approximating a standard American English (SAE) pronunciation of the intended phoneme or not, and scores were assigned to each participant for the three speech elicitation methods of word reading, word repetition, and picture naming. Results from a repeated-measures ANOVA test revealed a statistically significant difference in phonemic accuracy (F(1.47, 87.93) = 25.94, p < .001) based on speech elicitation method, while the two-factor mixed design ANOVA test indicated no statistically significant differences for the moderator variable of native language. However, post-hoc analyses revealed that mean scores on picture naming tasks differed significantly from the other two elicitation methods of word reading and word repetition. These results should heighten attention to the role that various speech elicitation methods, or input modalities, might play in L2 productive accuracy. Implications for practical application suggest that caution should be used when utilizing pictures to elicit specific vocabulary words, even high-frequency words, as they might result in erroneous productions or no utterance at all. These methods could inform pronunciation instructors about best teaching practices when pronunciation accuracy is the objective.
Finally, the impact of L1 on L2 pronunciation accuracy might not be as important as once thought.
-
Date Issued
-
2017
-
Identifier
-
CFE0006725, ucf:51880
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0006725