Global ETD Search

1	Optimization of demodulation performance of the GPS and GALILEO navigation messages / Optimisation de la performance de démodulation des messages de navigation GPS et GALILEO Garcia Peña, Axel Javier 08 October 2010
The demodulation performance of the existing GPS signals, L1 C/A, L2C and L5, is satisfactory in open environments, where the available C/N0 is fairly high. In indoor and urban environments, however, the C/N0 of the received signal is often very low and varies rapidly, which further degrades demodulation of the GNSS messages. Because mass-market applications are increasingly aimed at these environments, alternative demodulation and decoding methods that improve GNSS message demodulation performance there need to be studied. Recently developed GNSS signals, such as GPS L1C and GALILEO E1, must also be considered. These signals are intended to provide a satellite positioning service in any kind of environment, with special attention to indoor and urban environments. The demodulation performance of the new GNSS signals, as defined in the current public documents, is therefore also analysed. In addition, new GALILEO E1 message structures are proposed and analysed in order to optimize both the demodulation performance and the quantity of broadcast information. The main goal of this dissertation is thus to analyse and improve the demodulation performance of the current open GNSS signals, specifically in indoor and urban environments, and to propose new navigation message structures for GALILEO E1.
The dissertation is structured as follows. First, the subject of the thesis is introduced, its original contributions are highlighted, and the outline of the report is presented. Second, the current structure of the analysed GNSS signals is described, with special attention to the navigation message structure, the implemented channel codes and their decoding techniques. In the third section, two transmission channel models are presented for two types of environment: an AWGN channel models signal transmission in open environments, and the Perez-Fontan mobile channel model represents signal transmission in urban and indoor environments. In the fourth section, an attempt at a binary prediction of the broadcast satellite ephemeris in the GPS L1 C/A navigation message is presented. The prediction uses the GPS L1 C/A almanac data, a long-term orbital prediction program provided by TAS-F, and several signal processing methods: spectral estimation, the PRONY method, and a neural network. In the fifth section, the demodulation performance of the GPS L2C and GPS L5 navigation messages is improved by using their channel codes in a non-traditional way. Two methods are examined. The first shares information between the inner and outer channel codes of the message in order to correct more received words. The second uses the probabilities of the ephemeris data to improve traditional Viterbi decoding. In the sixth section, the demodulation performance of the GPS L1C and GALILEO E1 Open Service messages is analysed in different environments. First, the structure of both signals is briefly studied to determine the received useful-signal C/N0 in an AWGN channel. Second, their demodulation performance is analysed through simulations in different environments, with different receiver speeds and different signal carrier phase estimation techniques.
GPS, GNSS, GALILEO, Demodulation performance, Transmission channel, AWGN channel, Mobile channel, Code Shift Keying, CSK, BER, WER, EER, Navigation message, Channel code, LDPC
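The link between C/N0 and bit error rate on the AWGN channel mentioned in entry 1 can be illustrated with a minimal sketch. It is not the simulation used in the thesis: it assumes ideal coherent BPSK demodulation with no channel coding, and the function names are hypothetical. The 50 bit/s rate in the usage loop is the GPS L1 C/A navigation data rate.

```python
import math


def ebn0_db_from_cn0(cn0_dbhz: float, bit_rate_bps: float) -> float:
    """Convert C/N0 (dB-Hz) to Eb/N0 (dB) for a given data bit rate."""
    return cn0_dbhz - 10.0 * math.log10(bit_rate_bps)


def bpsk_ber_awgn(ebn0_db: float) -> float:
    """Theoretical BER of uncoded, coherently demodulated BPSK over AWGN.

    BER = Q(sqrt(2 * Eb/N0)) = 0.5 * erfc(sqrt(Eb/N0)).
    Assumption: perfect carrier phase tracking and no channel code.
    """
    ebn0_linear = 10.0 ** (ebn0_db / 10.0)
    return 0.5 * math.erfc(math.sqrt(ebn0_linear))


if __name__ == "__main__":
    # Sweep a few C/N0 values at the GPS L1 C/A data rate of 50 bit/s.
    for cn0 in (20.0, 25.0, 30.0):
        ebn0 = ebn0_db_from_cn0(cn0, 50.0)
        print(f"C/N0 = {cn0:.0f} dB-Hz, Eb/N0 = {ebn0:.2f} dB, "
              f"BER = {bpsk_ber_awgn(ebn0):.3e}")
```

The sketch only gives an uncoded baseline; the coded signals discussed in the abstract (for example L2C and L5 with Viterbi decoding) would reach a given error rate at a lower C/N0.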
2	Sélection de corpus en traduction automatique statistique / Efficient corpus selection for statistical machine translation Abdul Rauf, Sadaf 17 January 2012
In our world of international communications, machine translation has become an essential key technology. Several approaches exist, but in recent years Statistical Machine Translation (SMT) has been considered the most promising. In this approach, all knowledge is extracted automatically from examples of translations, called parallel texts, and from monolingual data in the target language. Statistical machine translation is a data-driven process. This is commonly put forward as a great advantage of statistical approaches, since no intervention by bilingual humans is required, but it turns into a problem when the data needed to develop the system are unavailable, too small, or from an unsuitable domain. The research presented in this thesis is an attempt to overcome one of the barriers to the massive deployment of statistical machine translation systems: the lack of parallel corpora. A parallel corpus is a collection of sentences in source and target languages that are aligned at the sentence level. Most existing parallel corpora were produced by professional translators, which is expensive in terms of money, human resources and time. This thesis provides methods to overcome this need by exploiting the large comparable and monolingual data collections that are easily available. We present two effective architectures to achieve this.
In the first part of this thesis, we worked on the use of comparable corpora to improve statistical machine translation systems. A comparable corpus is a collection of texts in multiple languages, collected independently, but often containing parts that are mutual translations. The size and quality of the parallel content may vary considerably from one comparable corpus to another, depending on various factors, including the method used to build the corpus. In any case, it is not easy to identify the parallel parts automatically. As part of this thesis, we developed an approach to do so that is based entirely on freely available tools. The main idea is to use a statistical machine translation system to translate all source-language sentences of the comparable corpus into the target language. Each of these translations is then used as a query to find potentially parallel sentences in the target-language side of the comparable corpus. This search is carried out with an information retrieval toolkit. In a second step, the retrieved sentences are compared with the automatic translations to determine whether they are in fact parallel to the corresponding source-language sentence. Several criteria were evaluated, such as the word error rate (WER), the translation edit rate (TER) and TERp. We conducted a very detailed experimental analysis to demonstrate the value of our approach. We worked on comparable corpora from the news domain, more specifically newswire from multilingual news agencies such as Agence France-Presse (AFP), Associated Press and Xinhua News. These agencies publish news daily in several languages. We were able to extract parallel texts from large collections of over three hundred million words for the French-English and Arabic-English language pairs. These parallel texts significantly improved our statistical translation systems. We also present a theoretical comparison of the model developed in this thesis with another approach presented in the literature. Various extensions are also discussed: automatic extraction of unknown words and creation of a dictionary, detection and suppression of extra information, etc.
In the second part of this thesis, we examined the possibility of using monolingual data to improve the translation model of a statistical system. The idea here is to replace parallel data with monolingual source-language or target-language data. This research is thus placed in the context of unsupervised learning, since the missing translations are produced by an automatic translation system and, after various filtering steps, reinjected into the system...
Statistical machine translation, Comparable corpus, Information retrieval, Unsupervised learning, WER, TER, TERp
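The second step described in entry 2, comparing retrieved sentences with the automatic translation using word error rate, can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the function names, the whitespace tokenization, the choice of the retrieved sentence as the reference, and the 0.3 threshold are all hypothetical.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        return 0.0 if not hyp else float("inf")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def select_parallel(machine_translation: str,
                    retrieved: List[str],
                    max_wer: float = 0.3) -> List[str]:
    """Keep retrieved target-language sentences close to the translation.

    Assumption: a sentence counts as parallel when its WER against the
    machine translation is at most max_wer (a hypothetical threshold).
    """
    return [sentence for sentence in retrieved
            if word_error_rate(sentence, machine_translation) <= max_wer]


if __name__ == "__main__":
    translation = "the cat sat on the mat"
    candidates = ["the cat sat on a mat", "stocks fell sharply today"]
    print(select_parallel(translation, candidates))
```

TER and TERp, also named in the abstract, additionally allow block shifts (and, for TERp, paraphrase and stem matches), so they are not reproduced by this plain edit-distance sketch.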
3	Literatur und Film im Fadenkreuz der Systemtheorie : ein paradigmatischer Vergleich / Föls, Maike-Maren. January 2003 Thesis (doctoral)--Universität, Mannheim, 2002. / Includes bibliographical references.
4	Automatic Speech Recognition System for Somali in the interest of reducing Maternal Morbidity and Mortality. Laryea, Joycelyn, Jayasundara, Nipunika January 2020
Developing an Automatic Speech Recognition (ASR) system for the Somali language, though not novel, has not been actively explored; as a result, no model has yet succeeded on conversational speech, and related work is not accessible as open source. The lack of digital data is what makes Somali a low-resource language, and it is the greatest impediment to developing an ASR system for Somali. The incentive to develop an ASR system for the Somali language is to contribute to reducing the Maternal Mortality Rate (MMR) in Somalia. Researchers collect interview audio in Somali on maternal health and behaviour; to engage the relevant stakeholders and bring about the needed change, these recordings must be transcribed into text, which is an important step towards translation into any other language. This work investigates the available ASR systems for Somali and attempts to develop a prototype that converts Somali audio into Somali text. To this end, we first identified the available open-source speech recognition systems and selected the DeepSpeech engine to implement the prototype. With three hours of audio data, the transcription accuracy falls short of what is required, and the prototype cannot be deployed. We attribute this to insufficient training data and estimate that acquiring about 1200 hours of audio to train the DeepSpeech engine would make the effort towards an ASR system for Somali considerably more effective.
Automatic Speech Recognition (ASR), DeepSpeech, Natural Language Processing (NLP), Word Error Rate (WER), Character Error Rate (CER), Social Sciences
5	A Comparative Analysis of Whisper and VoxRex on Swedish Speech Data Fredriksson, Max, Ramsay Veljanovska, Elise January 2024
With the constant development of more advanced speech recognition models, it is increasingly important to determine which models are better in specific areas and for specific purposes. This is even more true for low-resource languages such as Swedish, which depend on the progress of models built for the large international languages. Lagerlöf (2022) conducted a comparative analysis of Google's speech-to-text model and NLoS's VoxRex B, concluding that VoxRex was the best for Swedish audio. Since then, OpenAI has released its Automatic Speech Recognition model Whisper, prompting a reassessment of the preferred choice for transcribing Swedish. In this comparative analysis, which uses data from Swedish radio news segments, Whisper performs better than VoxRex in tests on the raw output, a result strongly influenced by its more proficient sentence construction. It is not possible to conclude which model is better at pure word prediction. However, the results favor VoxRex in that it displays lower variability, meaning that even though Whisper predicts full text better, the choice of model should be determined by the user's needs.
ASR, Automatic Speech Recognition, Swedish Speech Recognition, Speech Recognition Models, Speech-to-Text, Whisper, VoxRex, Wav2Vec, Model Comparison, Transformer Models, Neural Networks, Machine Learning, WER, Word Error Rate, Transcription, Probability Theory and Statistics
