
Spoken Language Identification from Processing and Pattern Analysis of Spectrograms

Prior speech and linguistics research has focused on recognizing phonemes in speech, and on their combination into recognizable words, to identify the language being spoken. Some languages have additional phonemes that can help identify them; however, most phonemes are common to a wide variety of languages. Legacy approaches recognize strings of phonemes as syllables, which are then used in dictionary queries to find a word that uniquely identifies a language.
This dissertation research considers an alternative means of identifying the language of speech data, based solely on analysis of frequency-domain data. Three techniques for spoken language identification are analyzed and compared. First, a character-based pattern analysis is performed using the Rix and Forster algorithm, replicating their research on language identification. Second, phoneme recognition techniques, and the relative pattern of phoneme occurrence in speech samples, are measured for their language identification performance using the Rix and Forster approach. Finally, an experiment using statistical analysis of time-ensemble frequency spectrum data is assessed for its ability to establish spectral patterns for language identification, and for its performance. This novel approach applies pattern analysis techniques to spectrogram audio data: the Rix and Forster method is applied to the ensemble of spectral frequencies used over the duration of a speech waveform. The approach is compared with the applications of the Rix and Forster algorithm to character-based and phoneme symbols on the basis of statistical accuracy, processing time requirements, and processing space requirements. The audio spectrum analysis also demonstrates that the same techniques used for language identification can perform speaker identification.
The results of this research demonstrate the efficacy of audio frequency-domain pattern analysis applied to speech waveform data. The analysis provides an efficient technique for language identification that does not rely on linguistic approaches using phonemes or word derivations. This work also demonstrates a quick, automated means by which information gatherers, travelers, and diplomatic officials might obtain rapid language identification, supporting time-critical decisions about which translator resources are needed.
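
To make the idea of frequency-domain pattern analysis concrete, the following is a minimal sketch and not the dissertation's implementation. It computes a magnitude spectrogram with a short-time FFT, builds a ranked profile of the frequency bins that most often dominate a frame, and compares two profiles by rank displacement. The rank-displacement comparison is an assumption chosen for illustration and is not claimed to be the Rix and Forster algorithm; the function names, frame length, hop size, and `top_k` are all hypothetical.

```python
import numpy as np


def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram from a Hann-windowed short-time FFT.

    Returns an array of shape (n_frames, frame_len // 2 + 1).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


def dominant_bin_profile(spec, top_k=8):
    """Rank frequency bins by how often each is the strongest bin in a frame."""
    peak_bins = np.argmax(spec, axis=1)
    counts = np.bincount(peak_bins, minlength=spec.shape[1])
    order = np.argsort(-counts, kind="stable")
    # Keep only bins that were dominant at least once.
    return [int(b) for b in order[:top_k] if counts[b] > 0]


def rank_distance(profile_a, profile_b):
    """Sum of rank displacements between two profiles (illustrative assumption).

    A bin in profile_a that is absent from profile_b receives a fixed
    maximum penalty equal to the longer profile's length.
    """
    max_penalty = max(len(profile_a), len(profile_b))
    pos_b = {b: rank for rank, b in enumerate(profile_b)}
    return sum(
        abs(rank - pos_b[b]) if b in pos_b else max_penalty
        for rank, b in enumerate(profile_a)
    )


if __name__ == "__main__":
    # Synthetic tones stand in for speech samples; real use would load audio.
    sample_rate = 8000
    t = np.arange(sample_rate) / sample_rate
    tone_a = np.sin(2 * np.pi * 500 * t)
    tone_b = np.sin(2 * np.pi * 1000 * t)
    profile_a = dominant_bin_profile(spectrogram(tone_a))
    profile_b = dominant_bin_profile(spectrogram(tone_b))
    print(rank_distance(profile_a, profile_a), rank_distance(profile_a, profile_b))
```

In a classification setting, an unknown sample's profile would be compared against a stored profile for each candidate language (or speaker), and the smallest distance would be taken as the match. How the dissertation actually builds and scores its spectral profiles is not stated in this record.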

Identifier: oai:union.ndltd.org:nova.edu/oai:nsuworks.nova.edu:gscis_etd-1151
Date: 01 January 2014
Creators: Ford, George Harold
Publisher: NSUWorks
Source Sets: Nova Southeastern University
Detected Language: English
Type: text
Format: application/pdf
Source: CEC Theses and Dissertations
