Global ETD Search

1	Recognition of Human Emotion in Speech Using Modulation Spectral Features and Support Vector Machines Wu, Siqing 09 September 2009 (has links) Automatic recognition of human emotion in speech aims at recognizing the underlying emotional state of a speaker from the speech signal. The area has received rapidly increasing research interest over the past few years. However, designing powerful spectral features for high-performance speech emotion recognition (SER) remains an open challenge. Most spectral features employed in current SER techniques convey short-term spectral properties only while omitting useful long-term temporal modulation information. In this thesis, modulation spectral features (MSFs) are proposed for SER, with support vector machines used for machine learning. By employing an auditory filterbank and a modulation filterbank for speech analysis, an auditory-inspired long-term spectro-temporal (ST) representation is obtained, which captures both acoustic frequency and temporal modulation frequency components. The MSFs are then extracted from the ST representation, thereby conveying information important for human speech perception but missing from conventional short-term spectral features (STSFs). Experiments show that the proposed features outperform features based on mel-frequency cepstral coefficients and perceptual linear predictive coefficients, two commonly used STSFs. The MSFs further render a substantial improvement in recognition performance when used to augment the extensively used prosodic features, and recognition accuracy above 90% is accomplished for classifying seven emotion categories. Moreover, the proposed features in combination with prosodic features attain estimation performance comparable to human evaluation for recognizing continuous emotions. 
/ Thesis (Master, Electrical & Computer Engineering) -- Queen's University, 2009-09-08 13:01:54.941 Emotion recognition Speech modulation Spectro-temporal representation Affective computing
2	Développement de la perception de la parole et du traitement auditif des modulations spectro-temporelles : études comportementales chez le nourrisson / Development of speech perception and spectro-temporal modulation processing : behavioral studies in infants Cabrera, Laurianne 22 November 2013 (has links) Cette thèse vise à caractériser le traitement auditif des informations spectro-temporelles impliquées dans la perception de la parole au cours du développement précoce. Dans ce but, les capacités de discrimination de contrastes phonétiques sont évaluées à l’aide de deux méthodes comportementales chez des enfants âgés de 6 et 10 mois. Les sons de parole sont dégradés par des « vocodeurs » conçus pour réduire sélectivement les modulations spectrales et/ou temporelles des stimuli phonétiquement contrastés. Les trois premières études de cette thèse montrent que les informations spectro-temporelles fines de la parole (les indices de modulation de fréquence et détails spectraux) ne sont pas nécessaires aux enfants français de 6 mois pour percevoir le trait phonétique de voisement et de lieu d’articulation. Comme pour les adultes français, les informations de modulation d’amplitude les plus lentes semblent suffire pour percevoir ces traits phonétiques. Les deux dernières études montrent cependant que les informations spectro-temporelles fines sont requises pour la discrimination de tons lexicaux (variations de hauteur liées au sens de mots monosyllabiques) chez les enfants français et taiwanais de 6 mois. 
De plus, ces études montrent l’influence de l’expérience linguistique sur le poids perceptif de ces informations de modulations dans la discrimination de la parole chez les jeunes adultes et les enfants français et taiwanais de 10 mois. Ces études montrent que les mécanismes auditifs spectro-temporels sous-tendant la perception de la parole sont efficaces dès l’âge de 6 mois, mais que ceux-ci vont être influencés par l’exposition à l’environnement linguistique dans les mois suivants. Enfin, cette thèse discute les implications de ces résultats vis-à-vis de l’implantation précoce des enfants sourds profonds qui reçoivent des informations de modulations dégradées. / The goal of this doctoral research was to characterize the auditory processing of the spectro-temporal cues involved in speech perception during development. The ability to discriminate phonetic contrasts was evaluated in 6- and 10-month-old infants using two behavioral methods. The speech sounds were processed by “vocoders” designed to selectively reduce the spectro-temporal modulation content of the phonetically contrasting stimuli. The first three studies showed that fine spectro-temporal modulation cues (the frequency-modulation cues and spectral details) are not required for the discrimination of voicing and place of articulation in French-learning 6-month-old infants. Like French adults, 6-month-old infants can discriminate those phonetic features on the sole basis of the slowest amplitude-modulation cues. The last two studies revealed that the fine modulation cues are required for the discrimination of lexical tones (pitch variations related to the meaning of one-syllable words) in French- and Mandarin-learning 6-month-old infants. 
Furthermore, the results showed the influence of linguistic experience on the perceptual weight of these modulation cues in both young adults and 10-month-old infants learning either French or Mandarin. This doctoral research showed that the spectro-temporal auditory mechanisms involved in speech perception are efficient at 6 months of age, but will be influenced by the linguistic environment during the following months. Finally, the present research discusses the implications of these findings for cochlear implantation in profoundly deaf infants, who have access only to impoverished speech modulation cues. Perception de la parole Informations spectro-temporelles Discrimination phonétique Tons lexicaux Nourrissons Speech perception Spectro-temporal cues Phonetic discrimination Lexical tones Infants 152.15
3	Sum frequency generation study of CO adsorbed on palladium single crystal and nanoparticles : adsorption and catalytic oxidation as a function of size Wang, Jijin 05 December 2013 (has links) (PDF) The CO reaction on metals is of great interest experimentally and theoretically because it serves as a model system to understand molecular chemisorption and catalyzed reactions on metals. This thesis aims at progressing along the general trends of surface science: bridging the pressure and material gaps in the study of catalysts. Sum Frequency Generation (SFG) is at the heart of this work. It involves a nonlinear optical process in which a coherent first-order polarization induced by an IR pulse is up-converted by a visible pulse into a second-order polarization at the sum frequency. In this thesis it is used to record CO vibrational spectra on Pd nanoparticles (NPs)/MgO/Ag(100) in order to understand adsorption and oxidation, taking advantage of its specific strengths in surface science: sensitivity and surface selectivity. The questions addressed are the possible roles of the adsorption sites that exist only on the NPs; the effect of NP size and of the presence of oxygen on CO adsorption and catalytic reactivity; the effect of oxygen adsorption (from 'normal' dissociative chemisorption to 'sub-surface'); and the variation of CO reactivity at the different sites as pressure and temperature increase. (1) We have studied CO adsorption on Pd(100) as a reference. Below a CO coverage of 0.5 ML, SFG results confirm previous IRAS studies. Above 0.5 ML, we have observed, in much more detail than previously, two vibrational bands assigned to CO at compressed and uncompressed bridge sites, for which we have measured the frequency, the intensity, and the decoherence time T₂ as a function of coverage. (2) The effect of Pd NP size on CO adsorption is studied (from Pd(100) to particles of about 300 atoms). 
At pressures below 10⁻³ mbar the CO spectra on a coalesced layer and on large NPs are dominated by the same bridge band as on Pd(100). The CO singleton frequency decreases with NP size, revealing the evolution of chemisorption with size. DFT calculations done at ENS Lyon reveal that the main mechanism is the strain induced by the substrate, which increases the Pd-Pd bond length, favors electron back-donation to CO, weakens the internal CO bond, and probably reinforces the CO-metal bond. (3) Because our maximum temperature is limited, we had to study CO catalytic oxidation in an excess of oxygen to avoid self-poisoning by CO. The results strongly suggest that bridge sites are the key sites in catalysis under our experimental conditions. However, while a fraction of the bridge sites are more reactive on NPs, a large fraction of them seem less reactive than on Pd(100). The reactivity of CO on (100) facets decreases with decreasing NP size. The idea emerges that the reaction proceeds via the most reactive sites, and that the other sites serve only as reservoirs of reactants, provided that diffusion between sites is fast enough. Oxygen modifies the adsorption of co-reactants. In the case of CO + O / Pd NPs / MgO, below 10⁻⁴ mbar oxygen does not seem to significantly influence CO adsorption; between 10⁻³ and 10⁻¹ mbar the spectroscopic signature of CO compression disappears; and above 1 mbar a new class of atop sites appears, suggesting that some oxygen species (perhaps "subsurface") favor CO adsorption on linear sites. A pump-probe experiment has been done to compare the effect of the pump on different adsorption sites. All this confirms the interest of SFG vibrational spectroscopy for catalysis. An additional contribution of this thesis to SFG is the study of the spectro-temporal aspects of SFG emission. 
SFG spectra containing several bands are modeled in detail for an ODT/Au system and compared with experimental spectra, showing that SFG spectra are affected by the spectro-temporal shape of the visible laser. The standard deconvolution method used in the literature is only approximate. Accurate spectro-temporal modeling of the spectrum is required to evaluate the relative intensities precisely when several bands are present. [CHIM:OTHE] Chemical Sciences/Other [CHIM:OTHE] Chimie/Autre Sum frequency generation (SFG) CO Nanoparticle Adsorption Catalysis Oxidation Size effects Adsorption sites Spectro-temporal
4	Demodulation of Narrowband Speech Spectrograms Aragonda, Haricharan January 2014 (has links) (PDF) Speech is a non-stationary signal and contains modulations in both spectral and temporal domains. Based on the type of modulations studied, most speech processing algorithms can be classified into short-time analysis algorithms, narrowband analysis algorithms, or joint spectro-temporal analysis algorithms. Traditional methods of speech analysis study the modulation along either time (short-time analysis) or frequency (narrowband analysis), one dimension at a time. A new class of algorithms that work simultaneously along both the temporal and the spectral dimensions, called spectro-temporal analysis algorithms, has become prominent over the past decade. Joint spectro-temporal analysis (also referred to as 2-D speech analysis) has shown promise in applications such as formant estimation, pitch estimation, and speech recognition. Over the past decade, 2-D speech analysis has been independently motivated from several directions. Broadly, these motivations for 2-D speech models can be grouped into speech-production motivated, source-separation/machine-learning motivated, and neurophysiology motivated. In this thesis, we develop a 2-D speech model based on the speech-production motivation. The overall organization of the thesis is as follows. We first develop the context of 2-D speech processing in Chapter one. We then develop a 2-D multicomponent AM-FM model for a narrowband spectrogram patch of voiced speech and experiment with the perceptual significance of the number of components needed to represent a spectrogram patch in Chapter two. In Chapter three we develop a demodulation algorithm called in-phase and quadrature-phase (IQ) demodulation; compared to the state-of-the-art sinusoidal demodulation, the AM obtained using this method is more robust to carrier estimation errors. 
The demodulation algorithm was verified on all voiced sentences taken from the TIMIT database. In Chapter four we develop a demodulation algorithm based on the Riesz transform, a natural extension of the Hilbert transform to higher dimensions. Unlike the sinusoidal and IQ demodulation techniques, Riesz-transform-based demodulation does not require explicit carrier estimation and is also robust to pitch discontinuities in patches. The algorithm was validated on all voiced sentences from the TIMIT database. Both the IQ and the Riesz-transform-based methods were found to give more accurate estimates of the 2-D AM (related to the vocal tract) and the 2-D carrier (related to the source) than sinusoidal demodulation. In Chapter five we show applications of the demodulated AM and carrier to pitch estimation and to the creation of hybrid sounds. The hybrid sounds created were found to have better perceptual quality than their counterparts created using linear prediction analysis. In Chapter six we summarize the work and present possible directions for future research. Speech Spectrograms Speech Modulation Spectrogram Patch Models Spectrogram Demodulation Narrowband Speech Spectrograms Spectro-Temporal Demodulation Riesz Transform Speech Processing Systems 2-D Speech Model 2-D Speech Analysis Communication Engineering
5	Sum frequency generation study of CO adsorbed on palladium single crystal and nanoparticles : adsorption and catalytic oxidation as a function of size / Etude par génération de somme de fréquences de CO adsorbé sur monocristal et sur nanoparticules de palladium : adsorption et oxydation catalytique en fonction de la taille Wang, Jijin 05 December 2013 (has links) La réaction de CO sur métaux est d'un grand intérêt, car elle sert de système modèle pour comprendre la chimisorption et les réactions catalytiques sur les métaux. Cette thèse se place dans la démarche générale de la science des surfaces de franchir les « fossés » de pression et de matériaux pour l’étude de la catalyse. La Génération de Somme de Fréquences (SFG) est au cœur de ce travail. Elle implique un processus optique non linéaire créé par une impulsion IR qui induit une polarisation cohérente du premier ordre, convertie par une impulsion visible en une polarisation du second ordre à la fréquence somme. La SFG est utilisée pour mesurer les spectres vibrationnels de CO sur Pd nanoparticule (NP)/MgO/Ag(100) grâce à ses avantages spécifiques en science des surfaces : sensibilité et sélectivité de surface. Les questions posées sont les rôles possibles des sites d'adsorption qui n'existent que sur les NP, l'effet de taille des NP, l'adsorption de l'oxygène (de « normal » - chimisorption dissociative - à « sub-surface »), sur l'adsorption de CO et la réactivité catalytique, la variation de la réactivité de CO dans les différents sites lors de l'augmentation de la pression et de la température. (1) Nous avons étudié l’adsorption de CO sur Pd (100) comme une référence. En dessous d’une couverture de 0.5 ML de CO, les résultats de SFG confirment les études IRAS antérieures. 
Au-dessus de 0.5 ML, nous avons observé deux bandes vibrationnelles attribuées à CO dans des sites pontés « comprimés » et « non comprimés », dont nous avons mesuré la fréquence et l’intensité en fonction de la couverture, ainsi que le temps de décohérence T₂. (2) L’effet de taille des NP de Pd sur l'adsorption de CO a été observé (depuis Pd(100) à NP d’environ 300 atomes). Aux pressions ≤ 10⁻³ mbar, les spectres de CO sur une couche coalescée et sur des NP larges sont dominés par la même bande de sites pontés que sur Pd (100). La fréquence « singleton » de CO diminue avec la taille des NP, ce qui révèle l'évolution de la chimisorption avec la taille des NP. Des calculs DFT faits à l'ENS Lyon révèlent que le mécanisme principal est la contrainte induite par le substrat qui augmente la longueur de liaison Pd-Pd, favorise la rétrodonation d’électrons vers CO, affaiblit la liaison interne de CO et probablement renforce la liaison CO-métal. (3) Pour l'oxydation catalytique de CO, les résultats suggèrent fortement que les sites pontés sont les sites clés dans la catalyse dans nos conditions expérimentales. Cependant, tandis qu'une fraction des sites pontés sont plus réactifs sur les NP, une grande fraction sont moins réactifs par rapport à Pd(100). La réactivité de CO sur les facettes (100) diminue à plus petite taille des NP. Il se dégage l’idée que la réaction procède par les sites les plus réactifs, et que les autres sites servent seulement de réservoirs en réactifs, à condition que la diffusion entre sites soit suffisamment élevée. L’oxygène modifie l'adsorption de co-réactifs. Dans le cas de CO+O/NP de Pd/MgO, au-dessus de 1 mbar, une nouvelle classe de sites linéaires apparaît, qui est probablement due à l'oxygène « sub-surface ». Une expérience pompe-sonde a été faite pour comparer l’effet de pompe sur les différents sites d’adsorption. Tous ces résultats confirment l'intérêt de la spectroscopie vibrationnelle SFG pour l’étude de la catalyse. 
Une contribution supplémentaire de cette thèse est l'étude des aspects spectro-temporels de l’émission SFG. Des spectres SFG qui contiennent plusieurs bandes sont modélisés en détail dans le cas du système modèle ODT/Au, et comparés à des spectres expérimentaux. Les spectres SFG sont affectés par la forme spectro-temporelle du laser visible. La comparaison montre que l’interprétation quantitative des intensités relatives des spectres SFG obtenus avec des impulsions femtosecondes nécessite une analyse spectro-temporelle et pas seulement spectrale. La méthode de déconvolution standard utilisée dans la littérature est approximative. / The CO reaction on metals is of great interest experimentally and theoretically because it serves as a model system to understand molecular chemisorption and catalyzed reactions on metals. This thesis aims at progressing along the general trends of surface science: bridging the pressure and material gaps in the study of catalysts. Sum Frequency Generation (SFG) is at the heart of this work. It involves a nonlinear optical process in which a coherent first-order polarization induced by an IR pulse is up-converted by a visible pulse into a second-order polarization at the sum frequency. In this thesis it is used to record CO vibrational spectra on Pd nanoparticles (NPs)/MgO/Ag(100) in order to understand adsorption and oxidation, taking advantage of its specific strengths in surface science: sensitivity and surface selectivity. The questions addressed are the possible roles of the adsorption sites that exist only on the NPs; the effect of NP size and of the presence of oxygen on CO adsorption and catalytic reactivity; the effect of oxygen adsorption (from ‘normal’ dissociative chemisorption to ‘sub-surface’); and the variation of CO reactivity at the different sites as pressure and temperature increase. (1) We have studied CO adsorption on Pd(100) as a reference. Below a CO coverage of 0.5 ML, SFG results confirm previous IRAS studies. 
Above 0.5 ML, we have observed, in much more detail than previously, two vibrational bands assigned to CO at compressed and uncompressed bridge sites, for which we have measured the frequency, the intensity, and the decoherence time T₂ as a function of coverage. (2) The effect of Pd NP size on CO adsorption is studied (from Pd(100) to particles of about 300 atoms). At pressures below 10⁻³ mbar the CO spectra on a coalesced layer and on large NPs are dominated by the same bridge band as on Pd(100). The CO singleton frequency decreases with NP size, revealing the evolution of chemisorption with size. DFT calculations done at ENS Lyon reveal that the main mechanism is the strain induced by the substrate, which increases the Pd-Pd bond length, favors electron back-donation to CO, weakens the internal CO bond, and probably reinforces the CO-metal bond. (3) Because our maximum temperature is limited, we had to study CO catalytic oxidation in an excess of oxygen to avoid self-poisoning by CO. The results strongly suggest that bridge sites are the key sites in catalysis under our experimental conditions. However, while a fraction of the bridge sites are more reactive on NPs, a large fraction of them seem less reactive than on Pd(100). The reactivity of CO on (100) facets decreases with decreasing NP size. The idea emerges that the reaction proceeds via the most reactive sites, and that the other sites serve only as reservoirs of reactants, provided that diffusion between sites is fast enough. Oxygen modifies the adsorption of co-reactants. In the case of CO + O / Pd NPs / MgO, below 10⁻⁴ mbar oxygen does not seem to significantly influence CO adsorption; between 10⁻³ and 10⁻¹ mbar the spectroscopic signature of CO compression disappears; and above 1 mbar a new class of atop sites appears, suggesting that some oxygen species (perhaps “subsurface”) favor CO adsorption on linear sites. A pump-probe experiment has been done to compare the effect of the pump on different adsorption sites. 
All this confirms the interest of SFG vibrational spectroscopy for catalysis. An additional contribution of this thesis to SFG is the study of the spectro-temporal aspects of SFG emission. SFG spectra containing several bands are modeled in detail for an ODT/Au system and compared with experimental spectra, showing that SFG spectra are affected by the spectro-temporal shape of the visible laser. The standard deconvolution method used in the literature is only approximate. Accurate spectro-temporal modeling of the spectrum is required to evaluate the relative intensities precisely when several bands are present. CO Nanoparticule Adsorption Catalyse Oxydation Effets de taille Sites d’adsorption Spectro-temporel Sum frequency generation (SFG) CO Nanoparticle Adsorption Catalysis Oxidation Size effects Adsorption sites Spectro-temporal
6	Analyse par apprentissage automatique des réponses fMRI du cortex auditif à des modulations spectro-temporelles Bouchard, Lysiane 12 1900 (has links) L'application de classifieurs linéaires à l'analyse des données d'imagerie cérébrale (fMRI) a mené à plusieurs percées intéressantes au cours des dernières années. Ces classifieurs combinent linéairement les réponses des voxels pour détecter et catégoriser différents états du cerveau. Ils sont plus agnostiques que les méthodes d'analyses conventionnelles qui traitent systématiquement les patterns faibles et distribués comme du bruit. Dans le présent projet, nous utilisons ces classifieurs pour valider une hypothèse portant sur l'encodage des sons dans le cerveau humain. Plus précisément, nous cherchons à localiser des neurones, dans le cortex auditif primaire, qui détecteraient les modulations spectrales et temporelles présentes dans les sons. Nous utilisons les enregistrements fMRI de sujets soumis à 49 modulations spectro-temporelles différentes. L'analyse fMRI au moyen de classifieurs linéaires n'est pas standard, jusqu'à maintenant, dans ce domaine. De plus, à long terme, nous avons aussi pour objectif le développement de nouveaux algorithmes d'apprentissage automatique spécialisés pour les données fMRI. Pour ces raisons, une bonne partie des expériences vise surtout à étudier le comportement des classifieurs. Nous nous intéressons principalement à 3 classifieurs linéaires standards, soit l'algorithme machine à vecteurs de support (linéaire), l'algorithme régression logistique (régularisée) et le modèle bayésien gaussien naïf (variances partagées). / The application of linear machine learning classifiers to the analysis of brain imaging data (fMRI) has led to several interesting breakthroughs in recent years. These classifiers linearly combine the responses of the voxels to detect and categorize different brain states. 
They allow a more agnostic analysis than conventional fMRI analysis, which systematically treats weak and distributed patterns as unwanted noise. In this project, we use such classifiers to validate a hypothesis concerning the encoding of sounds in the human brain. More precisely, we attempt to locate neurons in the primary auditory cortex that are tuned to spectral and temporal modulations in sound. We use fMRI recordings of brain responses of subjects listening to 49 different spectro-temporal modulations. The analysis of fMRI data through linear classifiers is not yet a standard procedure in this field. In addition, a long-term objective of this project is the development of new machine learning algorithms specialized for neuroimaging data. For these reasons, an important part of the experiments is dedicated to studying the behaviour of the classifiers. We are mainly interested in 3 standard linear classifiers, namely the support vector machine algorithm (linear), the logistic regression algorithm (regularized) and the naïve Bayesian Gaussian model (shared variances). Classifieur linéaire Linear classifier Neuroimagerie Neuroimaging Modulation spectro-temporelle Spectro-temporal modulation Cortex auditif Auditory cortex fMRI fMRI Modèle bayésien gaussien naïf Naïve bayesian gaussian model Machine à vecteurs de support Support vectors machine Régression logistique Logistic regression
7	Analyse par apprentissage automatique des réponses fMRI du cortex auditif à des modulations spectro-temporelles Bouchard, Lysiane 12 1900 (has links) No description available. Classifieur linéaire Linear classifier Neuroimagerie Neuroimaging Modulation spectro-temporelle Spectro-temporal modulation Cortex auditif Auditory cortex fMRI fMRI Modèle bayésien gaussien naïf Naïve bayesian gaussian model Machine à vecteurs de support Support vectors machine Régression logistique Logistic regression
