• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 102
  • 9
  • 9
  • 8
  • 6
  • 4
  • 3
  • 2
  • 2
  • 1
  • 1
  • 1
  • Tagged with
  • 183
  • 183
  • 92
  • 75
  • 37
  • 36
  • 34
  • 32
  • 27
  • 26
  • 26
  • 25
  • 22
  • 21
  • 19
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
141

Dose savings in digital breast tomosynthesis through image processing / Redução da dose de radiação em tomossíntese mamária através de processamento de imagens

Borges, Lucas Rodrigues 14 June 2017 (has links)
In x-ray imaging, the x-ray radiation must be the minimum necessary to achieve the required diagnostic objective, to ensure the patients safety. However, low-dose acquisitions yield images with low quality, which affect the radiologists image interpretation. Therefore, there is a compromise between image quality and radiation dose. This work proposes an image restoration framework capable of restoring low-dose acquisitions to achieve the quality of full-dose acquisitions. The contribution of the new method includes the capability of restoring images with quantum and electronic noise, pixel offset and variable detector gain. To validate the image processing chain, a simulation algorithm was proposed. The simulation generates low-dose DBT projections, starting from fulldose images. To investigate the feasibility of reducing the radiation dose in breast cancer screening programs, a simulated pre-clinical trial was conducted using the simulation and the image processing pipeline proposed in this work. Digital breast tomosynthesis (DBT) images from 72 patients were selected, and 5 human observers were invited for the experiment. The results suggested that a reduction of up to 30% in radiation dose could not be perceived by the human reader after the proposed image processing pipeline was applied. Thus, the image processing algorithm has the potential to decrease radiation levels in DBT, also decreasing the cancer induction risks associated with the exam. / Em programas de rastreamento de câncer de mama, a dose de radiação deve ser mantida o mínimo necessário para se alcançar o diagnóstico, para garantir a segurança dos pacientes. Entretanto, imagens adquiridas com dose de radiação reduzida possuem qualidade inferior. Assim, existe um equilíbrio entre a dose de radiação e a qualidade da imagem. Este trabalho propõe um algoritmo de restauração de imagens capaz de recuperar a qualidade das imagens de tomossíntese digital mamária, adquiridas com doses reduzidas de radiação, para alcançar a qualidade de imagens adquiridas com a dose de referência. As contribuições do trabalho incluem a melhoria do modelo de ruído, e a inclusão das características do detector, como o ganho variável do ruído quântico. Para a validação a cadeia de processamento, um método de simulação de redução de dose de radiação foi proposto. Para investigar a possibilidade de redução de dose de radiação utilizada na tomossíntese, um estudo pré-clínico foi conduzido utilizando o método de simulação proposto e a cadeia de processamento. Imagens clínicas de tomossíntese mamária de 72 pacientes foram selecionadas e cinco observadores foram convidados para participar do estudo. Os resultados sugeriram que, após a utilização do processamento proposto, uma redução de 30% de dose de radiação pôde ser alcançada sem que os observadores percebessem diferença nos níveis de ruído e borramento. Assim, o algoritmo de processamento tem o potencial de reduzir os níveis de radiação na tomossíntese mamária, reduzindo também os riscos de indução do câncer de mama.
142

Model-based clustering and model selection for binned data. / Classification automatique à base de modèle et choix de modèles pour les données discrétisées

Wu, Jingwen 28 January 2014 (has links)
Cette thèse étudie les approches de classification automatique basées sur les modèles de mélange gaussiens et les critères de choix de modèles pour la classification automatique de données discrétisées. Quatorze algorithmes binned-EM et quatorze algorithmes bin-EM-CEM sont développés pour quatorze modèles de mélange gaussiens parcimonieux. Ces nouveaux algorithmes combinent les avantages des données discrétisées en termes de réduction du temps d’exécution et les avantages des modèles de mélange gaussiens parcimonieux en termes de simplification de l'estimation des paramètres. Les complexités des algorithmes binned-EM et bin-EM-CEM sont calculées et comparées aux complexités des algorithmes EM et CEM respectivement. Afin de choisir le bon modèle qui s'adapte bien aux données et qui satisfait les exigences de précision en classification avec un temps de calcul raisonnable, les critères AIC, BIC, ICL, NEC et AWE sont étendus à la classification automatique de données discrétisées lorsque l'on utilise les algorithmes binned-EM et bin-EM-CEM proposés. Les avantages des différentes méthodes proposées sont illustrés par des études expérimentales. / This thesis studies the Gaussian mixture model-based clustering approaches and the criteria of model selection for binned data clustering. Fourteen binned-EM algorithms and fourteen bin-EM-CEM algorithms are developed for fourteen parsimonious Gaussian mixture models. These new algorithms combine the advantages in computation time reduction of binning data and the advantages in parameters estimation simplification of parsimonious Gaussian mixture models. The complexities of the binned-EM and the bin-EM-CEM algorithms are calculated and compared to the complexities of the EM and the CEM algorithms respectively. In order to select the right model which fits well the data and satisfies the clustering precision requirements with a reasonable computation time, AIC, BIC, ICL, NEC, and AWE criteria, are extended to binned data clustering when the proposed binned-EM and bin-EM-CEM algorithms are used. The advantages of the different proposed methods are illustrated through experimental studies.
143

Hybrid 2D and 3D face verification

McCool, Christopher Steven January 2007 (has links)
Face verification is a challenging pattern recognition problem. The face is a biometric that, we as humans, know can be recognised. However, the face is highly deformable and its appearance alters significantly when the pose, illumination or expression changes. These changes in appearance are most notable for texture images, or two-dimensional (2D) data. But the underlying structure of the face, or three dimensional (3D) data, is not changed by pose or illumination variations. Over the past five years methods have been investigated to combine 2D and 3D face data to improve the accuracy and robustness of face verification. Much of this research has examined the fusion of a 2D verification system and a 3D verification system, known as multi-modal classifier score fusion. These verification systems usually compare two feature vectors (two image representations), a and b, using distance or angular-based similarity measures. However, this does not provide the most complete description of the features being compared as the distances describe at best the covariance of the data, or the second order statistics (for instance Mahalanobis based measures). A more complete description would be obtained by describing the distribution of the feature vectors. However, feature distribution modelling is rarely applied to face verification because a large number of observations is required to train the models. This amount of data is usually unavailable and so this research examines two methods for overcoming this data limitation: 1. the use of holistic difference vectors of the face, and 2. by dividing the 3D face into Free-Parts. The permutations of the holistic difference vectors is formed so that more observations are obtained from a set of holistic features. On the other hand, by dividing the face into parts and considering each part separately many observations are obtained from each face image; this approach is referred to as the Free-Parts approach. The extra observations from both these techniques are used to perform holistic feature distribution modelling and Free-Parts feature distribution modelling respectively. It is shown that the feature distribution modelling of these features leads to an improved 3D face verification system and an effective 2D face verification system. Using these two feature distribution techniques classifier score fusion is then examined. This thesis also examines methods for performing classifier fusion score fusion. Classifier score fusion attempts to combine complementary information from multiple classifiers. This complementary information can be obtained in two ways: by using different algorithms (multi-algorithm fusion) to represent the same face data for instance the 2D face data or by capturing the face data with different sensors (multimodal fusion) for instance capturing 2D and 3D face data. Multi-algorithm fusion is approached as combining verification systems that use holistic features and local features (Free-Parts) and multi-modal fusion examines the combination of 2D and 3D face data using all of the investigated techniques. The results of the fusion experiments show that multi-modal fusion leads to a consistent improvement in performance. This is attributed to the fact that the data being fused is collected by two different sensors, a camera and a laser scanner. In deriving the multi-algorithm and multi-modal algorithms a consistent framework for fusion was developed. The consistent fusion framework, developed from the multi-algorithm and multimodal experiments, is used to combine multiple algorithms across multiple modalities. This fusion method, referred to as hybrid fusion, is shown to provide improved performance over either fusion system on its own. The experiments show that the final hybrid face verification system reduces the False Rejection Rate from 8:59% for the best 2D verification system and 4:48% for the best 3D verification system to 0:59% for the hybrid verification system; at a False Acceptance Rate of 0:1%.
144

Impact des multitrajets sur les performances des systèmes de navigation par satellite : contribution à l'amélioration de la précision de localisation par modélisation bayésienne / Multipath impact on the performances of satellite navigation systems : contribution to the enhancement of location accuracy towards bayesian modeling

Nahimana, Donnay Fleury 19 February 2009 (has links)
De nombreuses solutions sont développées pour diminuer l'influence des multitrajets sur la précision et la disponibilité des systèmes GNSS. L'intégration de capteurs supplémentaires dans le système de localisation est l'une des solutions permettant de compenser notamment l'absence de données satellitaires. Un tel système est certes d'une bonne précision mais sa complexité et son coût limitent un usage très répandu.Cette thèse propose une approche algorithmique destinée à améliorer la précision des systèmes GNSS en milieu urbain. L'étude se base sur l'utilisation des signaux GNSS uniquement et une connaissance de l'environnement proche du récepteur à partir d'un modèle 3D du lieu de navigation.La méthode présentée intervient à l'étape de filtrage du signal reçu par le récepteur GNSS. Elle exploite les techniques de filtrage statistique de type Monte Carlo Séquentiels appelées filtre particulaire. L'erreur de position en milieu urbain est liée à l'état de réception des signaux satellitaires (bloqué, direct ou réfléchi). C'est pourquoi une information sur l'environnement du récepteur doit être prise en compte. La thèse propose également un nouveau modèle d'erreurs de pseudodistance qui permet de considérer les conditions de réception du signal dans le calcul de la position.Dans un premier temps, l'état de réception de chaque satellite reçu est supposé connu dans le filtre particulaire. Une chaîne de Markov, valable pour une trajectoire connue du mobile, est préalablement définie pour déduire les états successifs de réception des satellites. Par la suite, on utilise une distribution de Dirichlet pour estimer les états de réception des satellites / Most of the GNSS-based transport applications are employed in dense urban areas. One of the reasons of bad position accuracy in urban area is the obstacle's presence (building and trees). Many solutions are developed to decrease the multipath impact on accuracy and availability of GNSS systems. Integration of supplementary sensors into the localisation system is one of the solutions used to supply a lack of GNSS data. Such systems offer good accuracy but increase complexity and cost, which becomes inappropriate to equip a large fleet of vehicles.This thesis proposes an algorithmic approach to enhance the position accuracy in urban environment. The study is based on GNSS signals only and knowledge of the close reception environment with a 3D model of the navigation area.The method impacts the signal filtering step of the process. The filtering process is based on Sequential Monte Carlo methods called particle filter. As the position error in urban area is related to the satellite reception state (blocked, direct or reflected), information of the receiver environment is taken into account. A pseudorange error model is also proposed to fit satellite reception conditions. In a first work, the reception state of each satellite is assumed to be known. A Markov chain is defined for a known trajectory of the vehicle and is used to determine the successive reception states of each signal. Then, the states are estimated using a Dirichlet distribution
145

Modélisation et utilisation des erreurs de pseudodistances GNSS en environnement transport pour l’amélioration des performances de localisation / Modeling and use of GNSS pseudorange errors in transport environment to enhance the localization performances

Viandier, Nicolas 07 June 2011 (has links)
Les GNSS sont désormais largement présents dans le domaine des transports. Actuellement, la communauté scientifique désire développer des applications nécessitant une grande précision, disponibilité et intégrité.Ces systèmes offrent un service de position continu. Les performances sont définies par les paramètres du système mais également par l’environnement de propagation dans lequel se propagent les signaux. Les caractéristiques de propagation dans l’atmosphère sont connues. En revanche, il est plus difficile de prévoir l’impact de l’environnement proche de l’antenne, composé d’obstacles urbains. L’axe poursuivit par le LEOST et le LAGIS consiste à appréhender l’environnement et à utiliser cette information en complément de l’information GNSS. Cette approche vise à réduire le nombre de capteurs et ainsi la complexité du système et son coût. Les travaux de recherche menés dans le cadre de cette thèse permettent principalement de proposer des modélisations d'erreur de pseudodistances et des modélisations de l'état de réception encore plus réalistes. Après une étape de caractérisation de l’erreur, plusieurs modèles d’erreur de pseudodistance sont proposés. Ces modèles sont le mélange fini de gaussiennes et le mélange de processus de Dirichlet. Les paramètres du modèle sont estimés conjointement au vecteur d’état contenant la position grâce à une solution de filtrage adaptée comme le filtre particulaire Rao-Blackwellisé. L’évolution du modèle de bruit permet de s'adapter à l’environnement et donc de fournir une localisation plus précise. Les différentes étapes des travaux réalisés dans cette thèse ont été testées et validées sur données de simulation et réelles. / Today, the GNSS are largely present in the transport field. Currently, the scientific community aims to develop transport applications with a high accuracy, availability and integrity. These systems offer a continuous positioning service. Performances are defined by the system parameters but also by signal environment propagation. The atmosphere propagation characteristics are well known. However, it is more difficult to anticipate and analyze the impact of the propagation environment close to the antenna which can be composed, for instance, of urban obstacles or vegetation.Since several years, the LEOST and the LAGIS research axes are driven by the understanding of the propagation environment and its use as supplementary information to help the GNSS receiver to be more pertinent. This approach aims to reduce the number of sensors in the localisation system, and consequently reduces its complexity and cost. The work performed in this thesis is devoted to provide more realistic pseudorange error models and reception channel model. After, a step of observation error characterization, several pseudorange error models have been proposed. These models are the finite gaussian mixture model and the Dirichlet process mixture. The model parameters are then estimated jointly with the state vector containing position by using adapted filtering solution like the Rao-Blackwellized particle filter. The noise model evolution allows adapting to an urban environment and consequently providing a position more accurate.Each step of this work has been tested and evaluated on simulation data and real data.
146

A connectionist approach for incremental function approximation and on-line tasks / Uma abordagem conexionista para a aproximação incremental de funções e tarefas de tempo real

Heinen, Milton Roberto January 2011 (has links)
Este trabalho propõe uma nova abordagem conexionista, chamada de IGMN (do inglês Incremental Gaussian Mixture Network), para aproximação incremental de funções e tarefas de tempo real. Ela é inspirada em recentes teorias do cérebro, especialmente o MPF (do inglês Memory-Prediction Framework) e a Inteligência Artificial Construtivista, que fazem com que o modelo proposto possua características especiais que não estão presentes na maioria dos modelos de redes neurais existentes. Além disso, IGMN é baseado em sólidos princípios estatísticos (modelos de mistura gaussianos) e assintoticamente converge para a superfície de regressão ótima a medida que os dados de treinamento chegam. As principais vantagens do IGMN em relação a outros modelos de redes neurais são: (i) IGMN aprende instantaneamente analisando cada padrão de treinamento apenas uma vez (cada dado pode ser imediatamente utilizado e descartado); (ii) o modelo proposto produz estimativas razoáveis baseado em poucos dados de treinamento; (iii) IGMN aprende de forma contínua e perpétua a medida que novos dados de treinamento chegam (não existem fases separadas de treinamento e utilização); (iv) o modelo proposto resolve o dilema da estabilidade-plasticidade e não sofre de interferência catastrófica; (v) a topologia da rede neural é definida automaticamente e de forma incremental (novas unidades são adicionadas sempre que necessário); (vi) IGMN não é sensível às condições de inicialização (de fato IGMN não utiliza nenhuma decisão e/ou inicialização aleatória); (vii) a mesma rede neural IGMN pode ser utilizada em problemas diretos e inversos (o fluxo de informações é bidirecional) mesmo em regiões onde a função alvo tem múltiplas soluções; e (viii) IGMN fornece o nível de confiança de suas estimativas. Outra contribuição relevante desta tese é o uso do IGMN em importantes tarefas nas áreas de robótica e aprendizado de máquina, como por exemplo a identificação de modelos, a formação incremental de conceitos, o aprendizado por reforço, o mapeamento robótico e previsão de séries temporais. De fato, o poder de representação e a eficiência e do modelo proposto permitem expandir o conjunto de tarefas nas quais as redes neurais podem ser utilizadas, abrindo assim novas direções nos quais importantes contribuições do estado da arte podem ser feitas. Através de diversos experimentos, realizados utilizando o modelo proposto, é demonstrado que o IGMN é bastante robusto ao problema de overfitting, não requer um ajuste fino dos parâmetros de configuração e possui uma boa performance computacional que permite o seu uso em aplicações de controle em tempo real. Portanto pode-se afirmar que o IGMN é uma ferramenta de aprendizado de máquina bastante útil em tarefas de aprendizado incremental de funções e predição em tempo real. / This work proposes IGMN (standing for Incremental Gaussian Mixture Network), a new connectionist approach for incremental function approximation and real time tasks. It is inspired on recent theories about the brain, specially the Memory-Prediction Framework and the Constructivist Artificial Intelligence, which endows it with some unique features that are not present in most ANN models such as MLP, RBF and GRNN. Moreover, IGMN is based on strong statistical principles (Gaussian mixture models) and asymptotically converges to the optimal regression surface as more training data arrive. The main advantages of IGMN over other ANN models are: (i) IGMN learns incrementally using a single scan over the training data (each training pattern can be immediately used and discarded); (ii) it can produce reasonable estimates based on few training data; (iii) the learning process can proceed perpetually as new training data arrive (there is no separate phases for leaning and recalling); (iv) IGMN can handle the stability-plasticity dilemma and does not suffer from catastrophic interference; (v) the neural network topology is defined automatically and incrementally (new units added whenever is necessary); (vi) IGMN is not sensible to initialization conditions (in fact there is no random initialization/ decision in IGMN); (vii) the same neural network can be used to solve both forward and inverse problems (the information flow is bidirectional) even in regions where the target data are multi-valued; and (viii) IGMN can provide the confidence levels of its estimates. Another relevant contribution of this thesis is the use of IGMN in some important state-of-the-art machine learning and robotic tasks such as model identification, incremental concept formation, reinforcement learning, robotic mapping and time series prediction. In fact, the efficiency of IGMN and its representational power expand the set of potential tasks in which the neural networks can be applied, thus opening new research directions in which important contributions can be made. Through several experiments using the proposed model it is demonstrated that IGMN is also robust to overfitting, does not require fine-tunning of its configuration parameters and has a very good computational performance, thus allowing its use in real time control applications. Therefore, IGMN is a very useful machine learning tool for incremental function approximation and on-line prediction.
147

A connectionist approach for incremental function approximation and on-line tasks / Uma abordagem conexionista para a aproximação incremental de funções e tarefas de tempo real

Heinen, Milton Roberto January 2011 (has links)
Este trabalho propõe uma nova abordagem conexionista, chamada de IGMN (do inglês Incremental Gaussian Mixture Network), para aproximação incremental de funções e tarefas de tempo real. Ela é inspirada em recentes teorias do cérebro, especialmente o MPF (do inglês Memory-Prediction Framework) e a Inteligência Artificial Construtivista, que fazem com que o modelo proposto possua características especiais que não estão presentes na maioria dos modelos de redes neurais existentes. Além disso, IGMN é baseado em sólidos princípios estatísticos (modelos de mistura gaussianos) e assintoticamente converge para a superfície de regressão ótima a medida que os dados de treinamento chegam. As principais vantagens do IGMN em relação a outros modelos de redes neurais são: (i) IGMN aprende instantaneamente analisando cada padrão de treinamento apenas uma vez (cada dado pode ser imediatamente utilizado e descartado); (ii) o modelo proposto produz estimativas razoáveis baseado em poucos dados de treinamento; (iii) IGMN aprende de forma contínua e perpétua a medida que novos dados de treinamento chegam (não existem fases separadas de treinamento e utilização); (iv) o modelo proposto resolve o dilema da estabilidade-plasticidade e não sofre de interferência catastrófica; (v) a topologia da rede neural é definida automaticamente e de forma incremental (novas unidades são adicionadas sempre que necessário); (vi) IGMN não é sensível às condições de inicialização (de fato IGMN não utiliza nenhuma decisão e/ou inicialização aleatória); (vii) a mesma rede neural IGMN pode ser utilizada em problemas diretos e inversos (o fluxo de informações é bidirecional) mesmo em regiões onde a função alvo tem múltiplas soluções; e (viii) IGMN fornece o nível de confiança de suas estimativas. Outra contribuição relevante desta tese é o uso do IGMN em importantes tarefas nas áreas de robótica e aprendizado de máquina, como por exemplo a identificação de modelos, a formação incremental de conceitos, o aprendizado por reforço, o mapeamento robótico e previsão de séries temporais. De fato, o poder de representação e a eficiência e do modelo proposto permitem expandir o conjunto de tarefas nas quais as redes neurais podem ser utilizadas, abrindo assim novas direções nos quais importantes contribuições do estado da arte podem ser feitas. Através de diversos experimentos, realizados utilizando o modelo proposto, é demonstrado que o IGMN é bastante robusto ao problema de overfitting, não requer um ajuste fino dos parâmetros de configuração e possui uma boa performance computacional que permite o seu uso em aplicações de controle em tempo real. Portanto pode-se afirmar que o IGMN é uma ferramenta de aprendizado de máquina bastante útil em tarefas de aprendizado incremental de funções e predição em tempo real. / This work proposes IGMN (standing for Incremental Gaussian Mixture Network), a new connectionist approach for incremental function approximation and real time tasks. It is inspired on recent theories about the brain, specially the Memory-Prediction Framework and the Constructivist Artificial Intelligence, which endows it with some unique features that are not present in most ANN models such as MLP, RBF and GRNN. Moreover, IGMN is based on strong statistical principles (Gaussian mixture models) and asymptotically converges to the optimal regression surface as more training data arrive. The main advantages of IGMN over other ANN models are: (i) IGMN learns incrementally using a single scan over the training data (each training pattern can be immediately used and discarded); (ii) it can produce reasonable estimates based on few training data; (iii) the learning process can proceed perpetually as new training data arrive (there is no separate phases for leaning and recalling); (iv) IGMN can handle the stability-plasticity dilemma and does not suffer from catastrophic interference; (v) the neural network topology is defined automatically and incrementally (new units added whenever is necessary); (vi) IGMN is not sensible to initialization conditions (in fact there is no random initialization/ decision in IGMN); (vii) the same neural network can be used to solve both forward and inverse problems (the information flow is bidirectional) even in regions where the target data are multi-valued; and (viii) IGMN can provide the confidence levels of its estimates. Another relevant contribution of this thesis is the use of IGMN in some important state-of-the-art machine learning and robotic tasks such as model identification, incremental concept formation, reinforcement learning, robotic mapping and time series prediction. In fact, the efficiency of IGMN and its representational power expand the set of potential tasks in which the neural networks can be applied, thus opening new research directions in which important contributions can be made. Through several experiments using the proposed model it is demonstrated that IGMN is also robust to overfitting, does not require fine-tunning of its configuration parameters and has a very good computational performance, thus allowing its use in real time control applications. Therefore, IGMN is a very useful machine learning tool for incremental function approximation and on-line prediction.
148

Contrôle de têtes parlantes par inversion acoustico-articulatoire pour l’apprentissage et la réhabilitation du langage / Control of talking heads by acoustic-to-articulatory inversion for language learning and rehabilitation

Ben Youssef, Atef 26 October 2011 (has links)
Les sons de parole peuvent être complétés par l'affichage des articulateurs sur un écran d'ordinateur pour produire de la parole augmentée, un signal potentiellement utile dans tous les cas où le son lui-même peut être difficile à comprendre, pour des raisons physiques ou perceptuelles. Dans cette thèse, nous présentons un système appelé retour articulatoire visuel, dans lequel les articulateurs visibles et non visibles d'une tête parlante sont contrôlés à partir de la voix du locuteur. La motivation de cette thèse était de développer un tel système qui pourrait être appliqué à l'aide à l'apprentissage de la prononciation pour les langues étrangères, ou dans le domaine de l'orthophonie. Nous avons basé notre approche de ce problème d'inversion sur des modèles statistiques construits à partir de données acoustiques et articulatoires enregistrées sur un locuteur français à l'aide d'un articulographe électromagnétique (EMA). Notre approche avec les modèles de Markov cachés (HMMs) combine des techniques de reconnaissance automatique de la parole et de synthèse articulatoire pour estimer les trajectoires articulatoires à partir du signal acoustique. D'un autre côté, les modèles de mélanges gaussiens (GMMs) estiment directement les trajectoires articulatoires à partir du signal acoustique sans faire intervenir d'information phonétique. Nous avons basé notre évaluation des améliorations apportées à ces modèles sur différents critères : l'erreur quadratique moyenne (RMSE) entre les coordonnées EMA originales et reconstruites, le coefficient de corrélation de Pearson, l'affichage des espaces et des trajectoires articulatoires, aussi bien que les taux de reconnaissance acoustique et articulatoire. Les expériences montrent que l'utilisation d'états liés et de multi-gaussiennes pour les états des HMMs acoustiques améliore l'étage de reconnaissance acoustique des phones, et que la minimisation de l'erreur générée (MGE) dans la phase d'apprentissage des HMMs articulatoires donne des résultats plus précis par rapport à l'utilisation du critère plus conventionnel de maximisation de vraisemblance (MLE). En outre, l'utilisation du critère MLE au niveau de mapping direct de l'acoustique vers l'articulatoire par GMMs est plus efficace que le critère de minimisation de l'erreur quadratique moyenne (MMSE). Nous constatons également trouvé que le système d'inversion par HMMs est plus précis celui basé sur les GMMs. Par ailleurs, des expériences utilisant les mêmes méthodes statistiques et les mêmes données ont montré que le problème de reconstruction des mouvements de la langue à partir des mouvements du visage et des lèvres ne peut pas être résolu dans le cas général, et est impossible pour certaines classes phonétiques. Afin de généraliser notre système basé sur un locuteur unique à un système d'inversion de parole multi-locuteur, nous avons implémenté une méthode d'adaptation du locuteur basée sur la maximisation de la vraisemblance par régression linéaire (MLLR). Dans cette méthode MLLR, la transformation basée sur la régression linéaire qui adapte les HMMs acoustiques originaux à ceux du nouveau locuteur est calculée de manière à maximiser la vraisemblance des données d'adaptation. Finalement, cet étage d'adaptation du locuteur a été évalué en utilisant un système de reconnaissance automatique des classes phonétique de l'articulation, dans la mesure où les données articulatoires originales du nouveau locuteur n'existent pas. Finalement, en utilisant cette procédure d'adaptation, nous avons développé un démonstrateur complet de retour articulatoire visuel, qui peut être utilisé par un locuteur quelconque. Ce système devra être évalué de manière perceptive dans des conditions réalistes. / Speech sounds may be complemented by displaying speech articulators shapes on a computer screen, hence producing augmented speech, a signal that is potentially useful in all instances where the sound itself might be difficult to understand, for physical or perceptual reasons. In this thesis, we introduce a system called visual articulatory feedback, in which the visible and hidden articulators of a talking head are controlled from the speaker's speech sound. The motivation of this research was to develop such a system that could be applied to Computer Aided Pronunciation Training (CAPT) for learning of foreign languages, or in the domain of speech therapy. We have based our approach to this mapping problem on statistical models build from acoustic and articulatory data. In this thesis we have developed and evaluated two statistical learning methods trained on parallel synchronous acoustic and articulatory data recorded on a French speaker by means of an electromagnetic articulograph. Our Hidden Markov models (HMMs) approach combines HMM-based acoustic recognition and HMM-based articulatory synthesis techniques to estimate the articulatory trajectories from the acoustic signal. Gaussian mixture models (GMMs) estimate articulatory features directly from the acoustic ones. We have based our evaluation of the improvement results brought to these models on several criteria: the Root Mean Square Error between the original and recovered EMA coordinates, the Pearson Product-Moment Correlation Coefficient, displays of the articulatory spaces and articulatory trajectories, as well as some acoustic or articulatory recognition rates. Experiments indicate that the use of states tying and multi-Gaussian per state in the acoustic HMM improves the recognition stage, and that the minimum generation error (MGE) articulatory HMMs parameter updating results in a more accurate inversion than the conventional maximum likelihood estimation (MLE) training. In addition, the GMM mapping using MLE criteria is more efficient than using minimum mean square error (MMSE) criteria. In conclusion, we have found that the HMM inversion system has a greater accuracy compared with the GMM one. Beside, experiments using the same statistical methods and data have shown that the face-to-tongue inversion problem, i.e. predicting tongue shapes from face and lip shapes cannot be solved in a general way, and that it is impossible for some phonetic classes. In order to extend our system based on a single speaker to a multi-speaker speech inversion system, we have implemented a speaker adaptation method based on the maximum likelihood linear regression (MLLR). In MLLR, a linear regression-based transform that adapts the original acoustic HMMs to those of the new speaker was calculated to maximise the likelihood of adaptation data. Finally, this speaker adaptation stage has been evaluated using an articulatory phonetic recognition system, as there are not original articulatory data available for the new speakers. Finally, using this adaptation procedure, we have developed a complete articulatory feedback demonstrator, which can work for any speaker. This system should be assessed by perceptual tests in realistic conditions.
149

A connectionist approach for incremental function approximation and on-line tasks / Uma abordagem conexionista para a aproximação incremental de funções e tarefas de tempo real

Heinen, Milton Roberto January 2011 (has links)
Este trabalho propõe uma nova abordagem conexionista, chamada de IGMN (do inglês Incremental Gaussian Mixture Network), para aproximação incremental de funções e tarefas de tempo real. Ela é inspirada em recentes teorias do cérebro, especialmente o MPF (do inglês Memory-Prediction Framework) e a Inteligência Artificial Construtivista, que fazem com que o modelo proposto possua características especiais que não estão presentes na maioria dos modelos de redes neurais existentes. Além disso, IGMN é baseado em sólidos princípios estatísticos (modelos de mistura gaussianos) e assintoticamente converge para a superfície de regressão ótima a medida que os dados de treinamento chegam. As principais vantagens do IGMN em relação a outros modelos de redes neurais são: (i) IGMN aprende instantaneamente analisando cada padrão de treinamento apenas uma vez (cada dado pode ser imediatamente utilizado e descartado); (ii) o modelo proposto produz estimativas razoáveis baseado em poucos dados de treinamento; (iii) IGMN aprende de forma contínua e perpétua a medida que novos dados de treinamento chegam (não existem fases separadas de treinamento e utilização); (iv) o modelo proposto resolve o dilema da estabilidade-plasticidade e não sofre de interferência catastrófica; (v) a topologia da rede neural é definida automaticamente e de forma incremental (novas unidades são adicionadas sempre que necessário); (vi) IGMN não é sensível às condições de inicialização (de fato IGMN não utiliza nenhuma decisão e/ou inicialização aleatória); (vii) a mesma rede neural IGMN pode ser utilizada em problemas diretos e inversos (o fluxo de informações é bidirecional) mesmo em regiões onde a função alvo tem múltiplas soluções; e (viii) IGMN fornece o nível de confiança de suas estimativas. Outra contribuição relevante desta tese é o uso do IGMN em importantes tarefas nas áreas de robótica e aprendizado de máquina, como por exemplo a identificação de modelos, a formação incremental de conceitos, o aprendizado por reforço, o mapeamento robótico e previsão de séries temporais. De fato, o poder de representação e a eficiência e do modelo proposto permitem expandir o conjunto de tarefas nas quais as redes neurais podem ser utilizadas, abrindo assim novas direções nos quais importantes contribuições do estado da arte podem ser feitas. Através de diversos experimentos, realizados utilizando o modelo proposto, é demonstrado que o IGMN é bastante robusto ao problema de overfitting, não requer um ajuste fino dos parâmetros de configuração e possui uma boa performance computacional que permite o seu uso em aplicações de controle em tempo real. Portanto pode-se afirmar que o IGMN é uma ferramenta de aprendizado de máquina bastante útil em tarefas de aprendizado incremental de funções e predição em tempo real. / This work proposes IGMN (standing for Incremental Gaussian Mixture Network), a new connectionist approach for incremental function approximation and real time tasks. It is inspired on recent theories about the brain, specially the Memory-Prediction Framework and the Constructivist Artificial Intelligence, which endows it with some unique features that are not present in most ANN models such as MLP, RBF and GRNN. Moreover, IGMN is based on strong statistical principles (Gaussian mixture models) and asymptotically converges to the optimal regression surface as more training data arrive. The main advantages of IGMN over other ANN models are: (i) IGMN learns incrementally using a single scan over the training data (each training pattern can be immediately used and discarded); (ii) it can produce reasonable estimates based on few training data; (iii) the learning process can proceed perpetually as new training data arrive (there is no separate phases for leaning and recalling); (iv) IGMN can handle the stability-plasticity dilemma and does not suffer from catastrophic interference; (v) the neural network topology is defined automatically and incrementally (new units added whenever is necessary); (vi) IGMN is not sensible to initialization conditions (in fact there is no random initialization/ decision in IGMN); (vii) the same neural network can be used to solve both forward and inverse problems (the information flow is bidirectional) even in regions where the target data are multi-valued; and (viii) IGMN can provide the confidence levels of its estimates. Another relevant contribution of this thesis is the use of IGMN in some important state-of-the-art machine learning and robotic tasks such as model identification, incremental concept formation, reinforcement learning, robotic mapping and time series prediction. In fact, the efficiency of IGMN and its representational power expand the set of potential tasks in which the neural networks can be applied, thus opening new research directions in which important contributions can be made. Through several experiments using the proposed model it is demonstrated that IGMN is also robust to overfitting, does not require fine-tunning of its configuration parameters and has a very good computational performance, thus allowing its use in real time control applications. Therefore, IGMN is a very useful machine learning tool for incremental function approximation and on-line prediction.
150

Network configuration improvement and design aid using artificial intelligence

Van Graan, Sebastian Jan 29 August 2008 (has links)
This dissertation investigates the development of new Global system for mobile communications (GSM) improvement algorithms used to solve the nondeterministic polynomial-time hard (NP-hard) problem of assigning cells to switches. The departure of this project from previous projects is in the area of the GSM network being optimised. Most previous projects tried minimising the signalling load on the network. The main aim in this project is to reduce the operational expenditure as much as possible while still adhering to network element constraints. This is achieved by generating new network configurations with a reduced transmission cost. Since assigning cells to switches in cellular mobile networks is a NP-hard problem, exact methods cannot be used to solve it for real-size networks. In this context, heuristic approaches, evolutionary search algorithms and clustering techniques can, however, be used. This dissertation presents a comprehensive and comparative study of the above-mentioned categories of search techniques adopted specifically for GSM network improvement. The evolutionary search technique evaluated is a genetic algorithm (GA) while the unsupervised learning technique is a Gaussian mixture model (GMM). A number of custom-developed heuristic search techniques with differing goals were also experimented with. The implementation of these algorithms was tested in order to measure the quality of the solutions. Results obtained confirmed the ability of the search techniques to produce network configurations with a reduced operational expenditure while still adhering to network element constraints. The best results found were using the Gaussian mixture model where savings of up to 17% were achieved. The heuristic searches produced promising results in the form of the characteristics they portray, for example, load-balancing. Due to the massive problem space and a suboptimal chromosome representation, the genetic algorithm struggled to find high quality viable solutions. The objective of reducing network cost was achieved by performing cell-to-switch optimisation taking traffic distributions, transmission costs and network element constraints into account. These criteria cannot be divorced from each other since they are all interdependent, omitting any one of them will lead to inefficient and infeasible configurations. Results obtained further indicated that the search space consists out of two components namely, traffic and transmission cost. When optimising, it is very important to consider both components simultaneously, if not, infeasible or suboptimum solutions are generated. It was also found that pre-processing has a major impact on the cluster-forming ability of the GMM. Depending on how the pre-processing technique is set up, it is possible to bias the cluster-formation process in such a way that either transmission cost savings or a reduction in inter base station controller/switching centre traffic volume is given preference. Two of the difficult questions to answer when performing network capacity expansions are where to install the remote base station controllers (BSCs) and how to alter the existing BSC boundaries to accommodate the new BSCs being introduced. Using the techniques developed in this dissertation, these questions can now be answered with confidence. / Dissertation (MEng)--University of Pretoria, 2008. / Electrical, Electronic and Computer Engineering / unrestricted

Page generated in 0.1424 seconds