Global ETD Search

71	閩南語神經性構音障礙病患子音時長之聲學研究 / Consonant Duration in the Speech of Taiwanese Neurogenic Dysarthrics: An Acoustic Study 郭令育, Guo, Ling-Yu Hugo Unknown Date (has links) 本文旨在比較台灣閩南語常人、弛緩型(flaccid dysarthrics)、與痙攣型(spastic dysarthrics)神經性構音障礙病患子音絕對及相對音長之差異，並探討此病態語音在了解說話運動系統之特徵以及了解語言產製過程上可能的暗示。結果顯示神經性構音障礙病患的子音絕對音長顯著長於常人受試者，但兩組病患延長的型態並不相同。首先，弛緩型病患子音音長顯著長於痙攣型病患；再者，以子音的帶音與否、發音位置、發音方法這三個向度來看，弛緩型病患之子音音長展現了全面性的延長，而痙攣型病患則產生了選擇性的延長，其子音之延長，大多集中在有聲子音上，這些差異乃因兩組病人病理狀況不同所致。另一個結果則顯示，雖然神經性構音障礙病患，在子音音長呈現了延長的狀況，但在子音間相對音長次序上，仍保持到某種程度的完好，可是，子音雖保持一定的次序，但彼此間音長的差值及比值卻也發生了改變。而子音在音節中的所佔比例卻又和正常人無異。因此，本文推斷，在常人子音的產製上，是要求精確的絕對音長，而且子音間音長亦要保持一定的相對關係；但若因病變致使說話者無法同時兼顧二者時，說話者會選擇犧牲絕對音長，而仍保持子音音長間的相互關係。這可能是語言產製與感知的一些經濟原則(principles of economy)互動所導致的結果。此外，構音障礙病患表就子音在音節中的相對時長這個向度上，和常人無異，這可能是語言文法與生理限制二項因素互動所致。 / This study aims (1) to investigate the absolute and relative timing of consonant duration in Taiwanese neurogenic dysarthric speech by means of acoustic measurements, (2) and to explore their implication on the characteristics of motor system and on the status of timing control in speech production. The results show that absolute consonant duration in flaccid and spastic dysarthric speech are significantly longer than that in normal controls. However, the "lengthening" phenomena display different patterns in flaccid and spastic groups. First, for the patients with the same degree of muscular dysfunction, the flaccid dysarthrics lengthen the consonant duration significantly greater than the spastic ones do. Second, whereas absolute consonant duration in flaccid group shows overall lengthening regardless of voicing state, stricture type, and place of articulation, that in spastic group displays selective lengthening. Besides, though the absolute consonant duration is lengthened in dysarthric speech, the durational opposition between consonants is more or less distinct. 
However, only the durational ranking between consonants (external timing relation) is maintained in dysarthric speech; the durational distance or ratio between consonants (internal timing relation) is modified. Relative timing of consonant duration, that is, the consonant-to-syllable ratio (CS-ratio) within a syllable, remains intact in dysarthric speech even though the segmental duration is lengthened. Lengthened absolute consonant duration in dysarthric speech is accounted for by the neurological deficits and compensatory effects. Longer absolute consonant duration in flaccid than in spastic speech is attributed to slowness and weakness without spasticity of speech musculature in flaccid dysarthria. Overall lengthening of consonant duration in the flaccid group results from the entirely impaired cranial nerves innervating speech musculature, whereas selective lengthening of consonants in the spastic group is ascribed to spasticity and to the selective pattern and directionality of neuromuscular impairments in spastic dysarthria. The preserved durational ranking between consonants in dysarthric speech may result from a compromise between principles of economy in speech production and perception. The normal-like CS-ratios in dysarthric speech stem from the interaction between biological constraints and the grammar of a language. Based on the data collected from dysarthric speech, this study discusses the temporal variance as well as invariance in the speech motor system and the status of timing in the grammar of a language. The importance of principles of economy in speech production and perception is also indicated. 子音時長閩南語聲學語音學神經語言學相對時長說話節奏掌握語言病理學 consonant duration Taiwanese acoustic phonetics neurolinguistics relative timing flaccid and spastic dysarthria speech timing speech pathology
72	Explicit Segmentation Of Speech For Indian Languages Ranjani, H G 03 1900 (has links) Speech segmentation is the process of identifying the boundaries between words, syllables or phones in the recorded waveforms of spoken natural languages. The lowest level of speech segmentation is the breakup and classification of the sound signal into a string of phones. The difficulty of this problem is compounded by the phenomenon of co-articulation of speech sounds. The classical solution to this problem is to manually label and segment spectrograms. In the first step of this two-step process, a trained person listens to a speech signal, recognizes the word and phone sequence, and roughly determines the position of each phonetic boundary. The second step involves examining several features of the speech signal to place a boundary mark at the point where these features best satisfy a certain set of conditions specific to that kind of phonetic boundary. Manual segmentation of speech into phones is a highly time-consuming and painstaking process. Segmented speech is required for a variety of applications, such as acoustic analysis or building speech synthesis databases for high-quality speech output systems, and the time needed to segment even relatively small speech databases can rapidly become prohibitive. This calls for automating the segmentation process. The state-of-the-art segmentation techniques use Hidden Markov Models (HMM) for phone states. They give an average accuracy of over 95% within 20 ms of manually obtained boundaries. However, HMM-based methods require large training data for good performance. Another major disadvantage of such speech-recognition-based segmentation techniques is that they cannot handle very long utterances, which are necessary for prosody modeling in speech synthesis applications. 
Development of Text to Speech (TTS) systems in Indian languages has been difficult till date owing to the non-availability of sizeable segmented speech databases of good quality. Further, no prosody models exist for most of the Indian languages. Therefore, long utterances (at the paragraph level and monologues) have been recorded, as part of this work, for creating the databases. This thesis aims at automating segmentation of very long speech sentences recorded for the application of corpus-based TTS synthesis for multiple Indian languages. In this explicit segmentation problem, we need to force align boundaries in any utterance from its known phonetic transcription. The major disadvantage of forcing boundary alignments on the entire speech waveform of a long utterance is the accumulation of boundary errors. To overcome this, we force boundaries between 2 known phones (here, 2 successive stop consonants are chosen) at a time. Here, the approach used is silence detection as a marker for stop consonants. This method gives around 89% (for Hindi database) accuracy and is language independent and training free. These stop consonants act as anchor points for the next stage. Two methods for explicit segmentation have been proposed. Both the methods rely on the accuracy of the above stop consonant detection stage. Another common stage is the recently proposed implicit method which uses Bach scale filter bank to obtain the feature vectors. The Euclidean Distance of the Mean of the Logarithm (EDML) of these feature vectors shows peaks at the point where the spectrum changes. The method performs with an accuracy of 87% within 20 ms of manually obtained boundaries and also achieves a low deletion and insertion rate of 3.2% and 21.4% respectively, for 100 sentences of Hindi database. The first method is a three stage approach. 
The first is the stop consonant detection stage, followed by a stage that uses Quatieri’s sinusoidal model to classify sounds as voiced/unvoiced between 2 successive stop consonants. The final stage uses the EDML function of Bach scale feature vectors to further obtain boundaries within the voiced and unvoiced regions. It gives a Frame Error Rate (FER) of 26.1% for the Hindi database. The second method proposed uses duration statistics of the phones of the language. It again uses the EDML function of the Bach scale filter bank to obtain the peaks at the phone transitions and uses the duration statistics to assign to each peak a probability of being a boundary. In this method, the FER improves to 22.8% for the Hindi database. Both methods are promising in that they give low frame error rates. Results show that the second method outperforms the first, because it incorporates the knowledge of durations. For the proposed approaches to be useful, manual intervention is required at the output of each stage. However, this intervention is less tedious and reduces the time taken to segment each sentence by around 60% compared to the time taken for manual segmentation. The approaches have been successfully tested on 3 different languages, 100 sentences each: Kannada, Tamil and English (we have used the TIMIT database for validating the algorithms). In conclusion, a practical solution to the segmentation problem is proposed. Also, the algorithm being training free, language independent (ES-SABSF method) and speaker independent makes it useful in developing TTS systems for multiple languages, reducing the segmentation overhead. This method is currently being used in the lab for segmenting long Kannada utterances, spoken by reading a set of 1115 phonetically rich sentences. 
Speech Processing Speech Segmentation Indian Languages - Speech Segmentation Stop-Consonant Detection Speech Segmentation - Algorithms Batch-Scale Filter Bank ES-SABSF Segmentation Method ES-DSBSF Segmentation Method Hidden Markov Models (HMM) Text to Speech (TTS) Computer Science
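The abstract above says that the Euclidean Distance of the Mean of the Logarithm (EDML) of filter-bank feature vectors peaks where the spectrum changes. The following is a minimal sketch of that idea, not the thesis's implementation. It assumes that the distance is taken between the mean log-feature vectors of two adjacent windows on either side of each frame; the window length, the peak threshold, and the names `edml` and `pick_peaks` are hypothetical. The Bach scale filter bank itself is not reproduced: any list of per-frame positive band energies can stand in for it.

```python
import math


def edml(features, half_window=3):
    """Return an EDML-style change curve, one value per frame.

    features: list of frames, each a list of positive band energies
    (assumed to come from some filter bank). half_window is a
    hypothetical parameter: the number of frames averaged on each side.
    """
    n_frames = len(features)
    n_bands = len(features[0])
    # Logarithm of every band energy, floored to avoid log(0).
    logs = [[math.log(max(e, 1e-12)) for e in frame] for frame in features]

    def mean_log(start, stop):
        # Mean of the log features over frames start..stop-1, per band.
        count = stop - start
        return [sum(logs[t][b] for t in range(start, stop)) / count
                for b in range(n_bands)]

    curve = [0.0] * n_frames
    for t in range(half_window, n_frames - half_window + 1):
        left = mean_log(t - half_window, t)
        right = mean_log(t, t + half_window)
        # Euclidean distance between the two mean log-feature vectors.
        curve[t] = math.dist(left, right)
    return curve


def pick_peaks(curve, threshold):
    """Return indices of local maxima of the curve above a threshold."""
    return [t for t in range(1, len(curve) - 1)
            if curve[t] > threshold
            and curve[t] >= curve[t - 1]
            and curve[t] > curve[t + 1]]
```

With a synthetic signal whose spectrum switches abruptly at one frame, the curve peaks at that frame, which is the behaviour the abstract describes for phone transitions. A real system would still need the stop-consonant anchors and duration statistics mentioned above to decide which peaks are boundaries.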
73	Les quadriconsonantiques dans le lexique de l'arabe / Quadri-consonant groups in the lexicon of Arabic Bachmar, Karim 25 November 2011 (has links) La thèse se répartit en deux tomes. Les quadriconsonantiques forment deux groupes de radicaux distincts, à savoir : les radicaux de forme ABAB et les radicaux de forme ABCD. L’analyse de ces radicaux, en appliquant la TME (Théorie, Matrice, Etymon) élaborée par G. Bohas, permet de définir leur fonctionnement aux plans sémantique, sémantico phonétique et structurel. La première partie Tome 1 analyse les quadriconsonantiques de forme ABAB. La deuxième partie Tome 2 est consacrée aux quadriconsonantiques de forme ABCD.Concernant les radicaux ABAB, dont la structure est issue d’un redoublement de l’unique étymon AB, le travail d’analyse va plus s’orienter sur la sémantique. Il est démontré que le redoublement ne s’accompagne pas d’une modification sémantique systématique, contrairement à ce que l’on observe dans les parlers d’orient et d’occident.La deuxième partie de la thèse, Tome 2, dans les mêmes conditions que précédemment, étudie les radicaux ABCD dans le cadre de la TME en prenant en compte la contrainte phonétique formulée par Angoujard (1997), notée : CPA. L’objectif est de déterminer leur mode de fonctionnement tant sur le plan structurel que sur le plan sémantico phonétique. L’étude de ces radicaux ABCD ne se limite pas uniquement à montrer le fonctionnement structurel des radicaux mais établit une relation entre la TME de Bohas et la CPA d’Angoujard. / The thesis is divided into two volumes. Quadri-consonant groups form two distinct sets of radicals: ABAB and ABCD pattern radicals. 
Applying the Theory of Matrices and Etymons (TME) elaborated by Georges Bohas to the analysis of these radicals enables their functioning at the semantic, semantic-phonetic and structural levels to be defined. The first part, which constitutes Volume 1, consists of the analysis of ABAB pattern quadri-consonant groups, while the second part, contained in Volume 2, is devoted to ABCD pattern quadri-consonant groups. In the study of the ABAB pattern radicals, the structure of which is the result of a reduplication of the single AB etymon, the analytical work focuses more on semantics. The analysis demonstrates that reduplication is not accompanied by a systematic semantic modification, contrary to what may be observed in eastern and western dialects. Under the same conditions, the second part of the thesis, Volume 2, consists of a study of ABCD pattern radicals which employs the framework of TME while also taking into consideration the phonetic constraint formulated by Angoujard (1997), noted CPA. The objective is to determine the modes of functioning of the ABCD pattern radicals on both the structural and semantic-phonetic levels. The study of these ABCD pattern radicals is not restricted to demonstrating the structural functioning of these radicals but also establishes a relationship between the TME elaborated by Bohas and Angoujard’s CPA. Matrice Etymon Radical Invariant notionnel Hypéronymie Incrémentation Principe du contour obligatoire Corrélation sémantico-phonétique Lexique de l’arabe Sémantique de l’arabe Polysémie Homonymie Croisement des étymons Préfixation Redoublement (linguistique) Théorie des matrices et des étymons Matrix Etymon Radical Notional invariant Hypernymy Incrementation Obligatory contour principle Semantic-phonetic correlation Lexicon of Arabic Semantics of Arabic Polysemy Homonymy Etymon blending Prefixation Reduplication Theory of matrices and etymons
74	La consonne /R/ comme indice de la variation lectale : cas du français en contact avec le créole guadeloupéen / /R/ consonant as indication of lectal variation : case of French language in contact with Guadeloupean Creole Akpossan, Johanne 20 January 2015 (has links) Cette thèse a pour objectif de définir l’apport de la phonétique expérimentale dans l’identification d’une variété lectale, en prenant pour exemple les langues parlées en Guadeloupe. En Guadeloupe, deux langues cohabitent : le français et le créole. Mais, dans les faits, il y a une diversité de variétés de français d’une part, et de créole d’autre part. Chacune de ces variétés va de l’acrolecte au basilecte en passant par le mésolecte : il y a donc un continuum français et un continuum créole. La situation sociolinguistique de la Guadeloupe peut être ainsi représentée par un double continuum.Ces différentes variétés de français peuvent-elles se distinguer par des caractéristiques (1) acoustiques, (2) phonétiques, (3) phonologiques et (4) perceptives de la consonne /R/? La durée du contact avec le créole, a t-elle une influence sur la variété de français parlée par un locuteur ?Nos résultats montrent que plus la variété de français est basilectale, (1) plus la diffusion de l’énergie spectrale du /R/ est faible avec un taux de bruit réduit et une hauteur moyenne des fréquences basse ; (2) plus la variante fricatisée du /R/ est rare et plus la variante approximante est fréquente ; (3) plus le taux d’élision du /R/ en coda de syllabe augmente ainsi que le taux de réalisation de /R/ en tant que [w] en contexte labial; (4) plus la variété est perçue comme ayant un faible degré d’accent français. 
Généralement, plus la durée du contact entre le français et le créole est longue, plus cette variété est basilectale. Si les caractéristiques de la consonne /R/ permettent de discriminer la variété acrolectale de la variété basilectale (variétés extrêmes), il apparait plus difficile d’établir une liste d’indices (ou « lectomètres ») qui permettraient d’identifier les variétés se trouvant dans la zone intermédiaire : le mésolecte est doté d’une certaine imprévisibilité. / The goal of this thesis is to determine the contribution of experimental phonetics to the identification of a lectal variety, taking as an example the languages spoken in Guadeloupe. In Guadeloupe, two languages coexist: French and Creole. In fact, however, there is a diversity of varieties of French on the one hand, and of Creole on the other. Each of these varieties ranges from acrolect to basilect through mesolect, so there is a French continuum and a Creole continuum. The sociolinguistic situation of Guadeloupe can thus be represented by a double continuum. Can these different varieties of French be distinguished by the (1) acoustic, (2) phonetic, (3) phonological and (4) perceptual characteristics of the /R/ consonant? Does the duration of contact with Creole influence the variety of French spoken by a speaker? Our results show that the more basilectal the variety of French is, (1) the weaker the diffusion of the spectral energy of /R/ is, with a reduced noise rate and a low mean frequency; (2) the rarer the constrictive variants of /R/ are and the more common the approximant variants are; (3) the higher the rate of /R/ elision in syllable coda and the rate of /R/ realization as [w] in labial contexts are; and (4) the more the variety is perceived as having a low degree of French accent. 
Usually, the longer the contact between French and Creole lasts, the more basilectal the variety of French is. While the characteristics of the /R/ consonant can distinguish acrolect from basilect (the extreme varieties), it is harder to establish a list of indicators (or « lectomètres ») that would identify the varieties in the intermediate zone: the mesolect has a certain unpredictability. Socio-Phonétique Acoustique Consonne R Contact des langues (Double) continuum Variétés du français Français régional Créole guadeloupéen Interlecte Acrolecte Basilecte Lectomètre Identification du locuteur Langues parlées en Guadeloupe Français des (Petites) Antilles Linguistique comparée Phonétique expérimentale Créole à base lexicale française Sociophonetics Acoustics R consonant Language contact (Double) continuum Varieties of French Regional French language Guadeloupean Creole Interlect Acrolect Basilect Lectomètre Speaker identification Languages spoken in Guadeloupe French language of (Lesser) Antilles 414.8
75	Neurophysiological Mechanisms of Speech Intelligibility under Masking and Distortion Vibha Viswanathan (11189856) 29 July 2021 (has links) Difficulty understanding speech in background noise is the most common hearing complaint. Elucidating the neurophysiological mechanisms underlying speech intelligibility in everyday environments with multiple sound sources and distortions is hence important for any technology that aims to improve real-world listening. Using a combination of behavioral, electroencephalography (EEG), and computational modeling experiments, this dissertation provides insight into how the brain analyzes such complex scenes, and what roles different acoustic cues play in facilitating this process and in conveying phonetic content. Experiment #1 showed that brain oscillations selectively track the temporal envelopes (i.e., modulations) of attended speech in a mixture of competing talkers, and that the strength and pattern of this attention effect differs between individuals. Experiment #2 showed that the fidelity of neural tracking of attended-speech envelopes is strongly shaped by the modulations in interfering sounds as well as the temporal fine structure (TFS) conveyed by the cochlea, and predicts speech intelligibility in diverse listening environments. Results from Experiments #1 and #2 support the theory that temporal coherence of sound elements across envelopes and/or TFS shapes scene analysis and speech intelligibility. Experiment #3 tested this theory further by measuring and computationally modeling consonant categorization behavior in a range of background noises and distortions. We found that a physiologically plausible model that incorporated temporal-coherence effects predicted consonant confusions better than conventional speech-intelligibility models, providing independent evidence that temporal coherence influences scene analysis. 
Finally, results from Experiment #3 also showed that TFS is used to extract speech content (voicing) for consonant categorization even when intact envelope cues are available. Together, the novel insights provided by our results can guide future models of speech intelligibility and scene analysis, clinical diagnostics, improved assistive listening devices, and other audio technologies. Neuroscience cocktail-party problem theta rhythms gamma rhythms EEG network analysis speech intelligibility envelope coding fine structure modulation masking scene analysis temporal coherence consonant confusions wideband inhibition computational modeling comodulation masking release temporal coding cochlear implants selective attention speech perception speech-in-noise
76	Moderní řečové příznaky používané při diagnóze chorob / State of the art speech features used during the Parkinson disease diagnosis Bílý, Ondřej January 2011 (has links) This work deals with the diagnosis of Parkinson's disease by analysis of the speech signal. It first describes how the speech signal is produced, then covers speech signal analysis, preprocessing and subsequent feature extraction. Next, it describes Parkinson's disease and how the disorder changes the speech signal, followed by the speech features used for diagnosing Parkinson's disease (FCR, VSA, VOT, etc.). Another part of the work deals with the selection and reduction of these features using learning algorithms (SVM, ANN, k-NN) and their subsequent evaluation. The last part of the thesis describes a program for computing the features, then the feature selection, and finally an evaluation of all the results.
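The abstract names FCR and VSA without expanding or defining them. As a rough illustration only, the sketch below assumes they stand for the formant centralization ratio and the triangular vowel space area as commonly defined in the dysarthria literature, both computed from the first two formants (F1, F2, in Hz) of the corner vowels /a/, /i/ and /u/. The function names and the dictionary layout are hypothetical, and the thesis may define or compute these features differently.

```python
def formant_centralization_ratio(f1, f2):
    """FCR under the assumed definition (F2u + F2a + F1i + F1u) / (F2i + F1a).

    f1, f2: dicts mapping the vowel labels 'a', 'i', 'u' to formant
    frequencies in Hz (hypothetical input layout).
    """
    return (f2["u"] + f2["a"] + f1["i"] + f1["u"]) / (f2["i"] + f1["a"])


def triangular_vowel_space_area(f1, f2):
    """Area (Hz^2) of the /i/-/a/-/u/ triangle in the F1-F2 plane.

    Uses the shoelace formula on the three (F1, F2) corner points.
    """
    return 0.5 * abs(
        f1["i"] * (f2["a"] - f2["u"])
        + f1["a"] * (f2["u"] - f2["i"])
        + f1["u"] * (f2["i"] - f2["a"])
    )
```

Under these assumed definitions, vowel centralization raises the ratio and shrinks the triangle area. The formant values themselves would have to come from a separate formant-tracking step that is not shown here.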
