  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Understanding Patterns in Infant-Directed Speech in Context: An Investigation of Statistical Cues to Word Boundaries

Hartman, Rose 01 May 2017 (has links)
People talk about coherent episodes of their experience, leading to strong dependencies between words and the contexts in which they appear. Consequently, language within a context is more repetitive and more coherent than language sampled from across contexts. In this dissertation, I investigated how patterns in infant-directed speech differ under context-sensitive compared to context-independent analysis. In particular, I tested the hypothesis that cues to word boundaries may be clearer within contexts. Analyzing a large corpus of transcribed infant-directed speech, I implemented three different approaches to defining context: a top-down approach using the occurrence of key words from pre-determined context lists, a bottom-up approach using topic modeling, and a subjective coding approach where contexts were determined by open-ended, subjective judgments of coders reading sections of the transcripts. I found substantial agreement among the context codes from the three different approaches, but also important differences in the proportion of the corpus that was identified by context, the distribution of the contexts identified, and some characteristics of the utterances selected by each approach. I discuss implications for the use and interpretation of contexts defined in each of these three ways, and the value of a multiple-method approach in the exploration of context. To test the strength of statistical cues to word boundaries in context-specific sub-corpora relative to a context-independent analysis of cues to word boundaries, I used a resampling procedure to compare the segmentability of context sub-corpora defined by each of the three approaches to a distribution of random sub-corpora, matched for size for each context sub-corpus. 
Although my analyses confirmed that context-specific sub-corpora are indeed more repetitive, the data did not support the hypothesis that speech within contexts provides richer information about the statistical dependencies among phonemes than is available when analyzing the same statistical dependencies without respect to context. Alternative hypotheses and future directions to further elucidate this phenomenon are discussed. / 2019-02-17
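The statistical cue at the heart of this dissertation, transitional probability (TP) between adjacent syllables, can be sketched in a few lines. The syllable stream, "words," and the 0.8 boundary threshold below are invented for illustration and are not drawn from the corpus analyzed in the thesis.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """P(next | current) for each adjacent syllable pair in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}

def segment(syllables, tps, threshold=0.8):
    """Posit a word boundary wherever the TP dips below the threshold."""
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tps[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words

# Toy stream: three two-syllable "words" (ba-bu, ti-ko, da-mi) concatenated
# with no pauses, as in an artificial-language exposure stream.
stream = ["ba", "bu", "ti", "ko", "da", "mi", "ba", "bu",
          "da", "mi", "ti", "ko", "ba", "bu", "ti", "ko"]
tps = transitional_probabilities(stream)
print(segment(stream, tps))
# → ['babu', 'tiko', 'dami', 'babu', 'dami', 'tiko', 'babu', 'tiko']
```

Within-word transitions have TP = 1.0 here while cross-word transitions fall below the threshold, so the stream segments perfectly; the dissertation's question is whether such dips are sharper when the corpus is first split by context.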
22

An fMRI study of implicit language learning in developmental language impairment

Plante, Elena, Patterson, Dianne, Sandoval, Michelle, Vance, Christopher J., Asbjørnsen, Arve E. January 2017 (has links)
Individuals with developmental language impairment can show deficits into adulthood. This suggests that neural networks related to their language do not normalize with time. We examined the ability of 16 adults with and without impaired language to learn individual words in an unfamiliar language. Adults with impaired language were able to segment individual words from running speech, but needed more time to do so than their normal-language peers. Independent component analysis (ICA) of fMRI data indicated that adults with language impairment activate a neural network that is comparable to that of adults with normal language. However, a regional analysis indicated relative hyperactivation of a collection of regions associated with language processing. These results are discussed with reference to the Statistical Learning Framework and the sub-skills thought to relate to word segmentation. (C) 2017 The University of Arizona. Published by Elsevier Inc.
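The ICA step mentioned above decomposes mixed signals into statistically independent components. The sketch below is not the study's fMRI pipeline; it is a minimal FastICA demonstration on two synthetic source time courses (stand-ins for task-related networks), assuming scikit-learn is available.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)

# Two hypothetical "source" time courses, mixed linearly as in the ICA model.
s1 = np.sin(2 * t)                      # smooth oscillatory source
s2 = np.sign(np.sin(3 * t))             # blocky on/off source
S = np.c_[s1, s2]
A = np.array([[1.0, 0.5],
              [0.4, 1.0]])              # unknown mixing matrix
X = S @ A.T                             # what we actually observe

# FastICA recovers the independent components up to sign and scale.
ica = FastICA(n_components=2, random_state=0)
S_est = ica.fit_transform(X)
print(S_est.shape)  # → (2000, 2)
```

In real fMRI analyses the "mixtures" are voxel time series and each recovered component pairs a spatial map with a time course; the linear-mixing logic is the same.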
23

Neural Correlates of Morphology Acquisition through a Statistical Learning Paradigm

Sandoval, Michelle, Patterson, Dianne, Dai, Huanping, Vance, Christopher J., Plante, Elena 27 July 2017 (has links)
The neural basis of statistical learning as it occurs over time was explored with stimuli drawn from a natural language (Russian nouns). The input reflected the "rules" for marking categories of gendered nouns, without making participants explicitly aware of the nature of what they were to learn. Participants were scanned while listening to a series of gender-marked nouns during four sequential scans, and were tested for their learning immediately after each scan. Although participants were not told the nature of the learning task, they exhibited learning after their initial exposure to the stimuli. Independent component analysis of the brain data revealed five task-related subnetworks. Unlike prior statistical learning studies of word segmentation, this morphological learning task robustly activated the inferior frontal gyrus during the learning period. This region was represented in multiple independent components, suggesting it functions as a network hub for this type of learning. Moreover, the results suggest that subnetworks activated by statistical learning are driven by the nature of the input, rather than reflecting a general statistical learning system.
24

Réseaux de neurones convolutionnels profonds pour la détection de petits véhicules en imagerie aérienne / Deep neural networks for the detection of small vehicles in aerial imagery

Ogier du Terrail, Jean 20 December 2018 (has links)
This thesis tackles the problem of detecting and discriminating small vehicles in vertical-view aerial imagery using deep learning techniques. The specific character of the problem allows the use of original techniques that exploit the invariances and self-similarities of automobiles and aircraft seen from the sky. We begin with a systematic study of so-called "single-shot" detectors, then analyze how adding multiple decision stages affects detection performance. Finally, we address the domain adaptation problem through the generation of increasingly realistic synthetic data and its use in training these detectors.
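Single-shot detectors of the kind studied here emit many overlapping candidate boxes per object and rely on a post-processing step, greedy non-maximum suppression (NMS), to keep one box per vehicle. The boxes, scores, and threshold below are invented to illustrate that step, not taken from the thesis.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < thresh for j in keep):
            keep.append(i)
    return keep

# Two near-duplicate detections of one vehicle, plus a distant second vehicle.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # → [0, 2]
```

For very small, densely packed vehicles the NMS threshold is a sensitive design choice, which is one reason small-object detection is treated as a distinct problem.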
25

Automated RRM optimization of LTE networks using statistical learning / Optimisation automatique des paramètres RRM des réseaux LTE en utilisant l'apprentissage statistique

Tiwana, Moazzam Islam 19 November 2010 (has links)
The mobile telecommunication industry has experienced very rapid growth in the recent past, resulting in significant technological and architectural evolution of wireless networks. The expansion and heterogeneity of these networks have made their operational costs more and more significant. Typical faults in these networks may be related to equipment breakdown or to inappropriate planning and configuration. In this context, automated troubleshooting of wireless networks is of growing importance, aiming to reduce operational costs and provide high-quality service to end users. Automated troubleshooting can reduce service breakdown time for clients, decreasing client switchover to competing network operators. The Radio Access Network (RAN) constitutes the biggest part of a wireless network, so automated troubleshooting of the RAN is particularly important. Troubleshooting comprises the isolation of faulty cells (fault detection), identification of the causes of the fault (fault diagnosis), and the proposal and deployment of a healing action (solution deployment).
First, this thesis surveys previous work on troubleshooting wireless networks. It turns out that fault detection and diagnosis of wireless networks have been well studied in the scientific literature; surprisingly, no significant research on the automated healing of wireless networks has been reported. The aim of this thesis is therefore to present my research on the automated healing of LTE (Long Term Evolution) wireless networks using statistical learning, focusing on faults related to Radio Resource Management (RRM) parameters. The effectiveness of statistical learning for automated healing is investigated by modeling the functional relationships between RRM parameters and Key Performance Indicators (KPIs). A generic automated RRM architecture is proposed and used to study the application of statistical learning to auto-tuning and performance monitoring of wireless networks. The use of statistical learning in automated healing introduces two important difficulties. First, the KPI measurements obtained from the network are noisy, and this noise can partially mask the actual behaviour of the KPIs. Second, the healing algorithms are iterative: after each iteration the network performance is typically evaluated over the duration of a day with the new parameter settings, so the algorithms should achieve their QoS objective in a minimum number of iterations. The automated healing methodology developed in this thesis, based on statistical modeling, addresses both issues. The algorithms are computationally light and converge in a small number of iterations, which enables their implementation in the Operation and Maintenance Center (OMC) in off-line mode.
The methodology has been applied to LTE use cases for healing mobility and interference-mitigation parameter settings, and the healing objective was achieved within a small number of iterations. An automated healing process using the sequential optimization of interference-mitigation and packet-scheduling parameters has also been investigated. Incorporating a priori knowledge into the automated healing process further reduces the number of iterations required. Furthermore, the automated healing process becomes more robust, and hence more feasible and practical to implement in wireless networks.
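The core loop described above, model the KPI as a function of an RRM parameter, then pick the next parameter setting from the model, can be sketched as follows. The quadratic KPI shape, noise level, parameter range, and the use of a simple polynomial fit are all invented for illustration; the thesis's actual statistical models are not reproduced here.

```python
import numpy as np

def measure_kpi(p, rng):
    # Hypothetical noisy KPI (e.g., a drop-call rate) as a function of one
    # RRM parameter; shape and noise are illustrative, not from the thesis.
    return (p - 2.0) ** 2 + 0.5 + rng.normal(0.0, 0.05)

rng = np.random.default_rng(1)
params = [0.0, 1.0, 4.0]                      # initial trial settings
kpis = [measure_kpi(p, rng) for p in params]

for _ in range(3):                            # iterative healing loop
    # Statistical model of the functional relationship KPI = f(parameter).
    coeffs = np.polyfit(params, kpis, deg=2)
    grid = np.linspace(0.0, 4.0, 401)
    # Recommend the setting the model predicts to minimize the KPI,
    # then "wait a day" and measure the network again.
    p_next = grid[np.argmin(np.polyval(coeffs, grid))]
    params.append(p_next)
    kpis.append(measure_kpi(p_next, rng))

print(round(params[-1], 2))  # settles near the simulated optimum at p = 2.0
```

Because each iteration costs a full day of network measurement, converging in three or four iterations, as the model-based loop does here, is exactly the property the thesis argues for.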
26

Méthodes d’apprentissage statistique pour l’optimisation globale / Statistical learning approaches for global optimization

Contal, Emile 29 September 2016 (has links)
This dissertation is devoted to a rigorous analysis of sequential global optimization algorithms. We consider the stochastic bandit model, where an agent aims at finding the input of a given system that optimizes an output criterion. The function linking input to output is not explicit; the agent sequentially queries an oracle to evaluate the output at inputs of its choosing. This function is not assumed to be convex and may exhibit many local optima. We tackle the challenging case where evaluations are expensive, which requires a careful selection of the inputs to evaluate. We study two goals: maximizing the sum of the rewards received at each iteration, and maximizing the best reward found so far. This thesis lies within the framework of Bayesian optimization, where the function is a realization of a known stochastic process, and also introduces a novel approach of optimization by ranking, in which only comparisons of function values are performed. We propose new algorithms and provide theoretical concepts leading to performance guarantees. We first give an optimization strategy that adapts to observations received in batches rather than individually. A generic study of local suprema of stochastic processes allows us to analyze Bayesian optimization on nonparametric search spaces, and we show that our approach extends to natural non-Gaussian processes. Finally, we establish connections between active learning and statistical learning of rankings, and deduce an optimization algorithm for potentially discontinuous functions.
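A standard instance of the Bayesian-optimization setting described above is GP-UCB: fit a Gaussian-process posterior to the evaluations seen so far and query where the upper confidence bound (mean plus a multiple of the standard deviation) is largest. The target function, RBF kernel length-scale, and exploration weight below are illustrative choices, not the algorithms or guarantees developed in the thesis.

```python
import numpy as np

def rbf(a, b, ell=0.3):
    """Squared-exponential kernel between two 1-D input arrays."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

def gp_posterior(X, y, Xs, noise=1e-4):
    """GP posterior mean and std at query points Xs given data (X, y)."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    alpha = np.linalg.solve(K, y)
    mu = Ks.T @ alpha
    var = np.diag(rbf(Xs, Xs) - Ks.T @ np.linalg.solve(K, Ks))
    return mu, np.sqrt(np.maximum(var, 0.0))

f = lambda x: -np.sin(3 * x) - x ** 2 + 0.7 * x   # "expensive" black box
grid = np.linspace(-1.0, 2.0, 300)
X = np.array([-0.5, 1.5])                          # two initial evaluations
y = f(X)

for _ in range(8):
    mu, sd = gp_posterior(X, y, grid)
    x_next = grid[np.argmax(mu + 2.0 * sd)]        # UCB acquisition
    X, y = np.append(X, x_next), np.append(y, f(x_next))

print(round(X[np.argmax(y)], 2))
```

Each query balances exploiting the posterior mean against exploring uncertain regions; with only ten evaluations the best observed input lands near the global maximum of this multimodal function.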
27

Statistical Bootstrapping of Speech Segmentation Cues

Planet, Nicolas O. 01 January 2010 (has links) (PDF)
Various infant studies suggest that statistical regularities in the speech stream (e.g. transitional probabilities) are one of the first speech segmentation cues available. Statistical learning may serve as a mechanism for learning various language-specific segmentation cues (e.g. stress segmentation by English speakers). To test this possibility we exposed adults to an artificial language in which all words had a novel acoustic cue on the final syllable. Subjects were presented with a continuous stream of synthesized speech in which the words were repeated in random order. Subjects were then given a new set of words to see if they had learned the acoustic cue and generalized it to new stimuli. Finally, subjects were exposed to a competition stream in which the transitional probability and novel acoustic cues conflicted, to see which cue they preferred to use for segmentation. Results on the word-learning test suggest that subjects were able to segment the first exposure stream; however, on the cue transfer test they did not display any evidence of learning the relationship between word boundaries and the novel acoustic cue. Subjects were able to learn statistical words from the competition stream despite extra intervening syllables.
28

Discriminant Analysis for Longitudinal Data

Matira, Kevin January 2017 (has links)
Various approaches for discriminant analysis of longitudinal data are investigated, with some focus on model-based approaches. The latter are typically based on the modified Cholesky decomposition of the covariance matrix in a Gaussian mixture; however, non-Gaussian mixtures are also considered. Where applicable, the Bayesian information criterion is used to select the number of components per class. The various approaches are demonstrated on real and simulated data. / Thesis / Master of Science (MSc)
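The model-based idea above, one Gaussian model per class with a Cholesky-factored covariance, classifying new trajectories by likelihood, can be sketched directly. This sketch uses a plain Cholesky factor rather than the modified (LDL-type) decomposition the thesis builds on, and the "longitudinal" data are synthetic rising vs. falling trajectories invented for illustration.

```python
import numpy as np

def fit_gaussian(X):
    """Per-class model: mean vector and Cholesky factor of the covariance."""
    mu = X.mean(axis=0)
    cov = np.cov(X.T) + 1e-6 * np.eye(X.shape[1])   # jitter for stability
    return mu, np.linalg.cholesky(cov)

def log_likelihood(x, mu, L):
    """Gaussian log-density evaluated via the Cholesky factor."""
    d = np.linalg.solve(L, x - mu)                   # whitened residual
    logdet = 2 * np.log(np.diag(L)).sum()
    return -0.5 * (d @ d + logdet + len(x) * np.log(2 * np.pi))

rng = np.random.default_rng(0)
# Toy longitudinal profiles: 4 repeated measurements per subject, two classes.
A = rng.normal(0, 1, (60, 4)) + np.linspace(0, 3, 4)   # rising trajectories
B = rng.normal(0, 1, (60, 4)) + np.linspace(3, 0, 4)   # falling trajectories
models = [fit_gaussian(A), fit_gaussian(B)]

x_new = np.array([0.1, 1.0, 2.1, 2.9])   # a clearly rising trajectory
pred = max(range(2), key=lambda k: log_likelihood(x_new, *models[k]))
print(pred)  # → 0
```

The modified Cholesky decomposition used in the thesis additionally gives the factor entries an autoregressive interpretation over time, which is what makes it attractive for longitudinal covariances.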
29

Extracting the Wisdom of Crowds From Crowdsourcing Platforms

Du, Qianzhou 02 August 2019 (has links)
Enabled by the wave of online crowdsourcing activities, extracting the Wisdom of Crowds (WoC) has become an emerging research area, one that is used to aggregate judgments, opinions, or predictions from a large group of individuals for improved decision making. However, the existing literature mostly focuses on eliciting the wisdom of crowds in an offline context, without tapping into the vast amount of data available on online crowdsourcing platforms. Extracting WoC from participants on online platforms faces at least three challenges: social influence, suboptimal aggregation strategies, and data sparsity. This dissertation aims to answer the research question of how to effectively extract WoC from crowdsourcing platforms for the purpose of making better decisions. In the first study, I designed a new opinion-aggregation method, Social Crowd IQ (SCIQ), using a time-based decay function to eliminate the impact of social influence on crowd performance. In the second study, I proposed a statistical learning method, CrowdBoosting, instead of a heuristic-based method, to improve the quality of crowd wisdom. In the third study, I designed a new method, Collective Persuasibility, to solve the challenge of data sparsity on a crowdfunding platform by inferring backers' preferences and persuasibility. My work shows that people can obtain business benefits from crowd wisdom, and it provides several effective methods to extract wisdom from online crowdsourcing platforms, such as StockTwits, Good Judgment Open, and Kickstarter. / Doctor of Philosophy / Since Web 2.0 and mobile technologies have inspired increasing numbers of people to contribute and interact online, crowdsourcing provides a great opportunity for businesses to tap into a large group of online users who possess varied capabilities, creativity, and knowledge levels.
Howe (2006) first defined crowdsourcing as a method for obtaining necessary ideas, information, or services by asking for contributions from a large group of individuals, especially participants in online communities. Many online platforms have been developed to support various crowdsourcing tasks, including crowdfunding (e.g., Kickstarter and Indiegogo), crowd prediction (e.g., StockTwits, Good Judgment Open, and Estimize), crowd creativity (e.g., Wikipedia), and crowdsolving (e.g., Dell IdeaStorm). The explosion of data generated by these platforms presents a good opportunity for business benefit. Specifically, guided by the Wisdom of Crowds (WoC) theory, we can aggregate multiple opinions from a crowd of individuals to improve decision making. In this dissertation, I apply WoC to three crowdsourcing tasks: stock return prediction, event outcome forecasting, and crowdfunding project success prediction. Our study shows the effectiveness of WoC and makes both theoretical and practical contributions to the WoC literature.
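The time-based decay idea behind SCIQ, down-weighting later opinions because they are more exposed to social influence, can be sketched with a simple exponential-decay weighting. The dissertation's actual decay function and parameters are not given here; the half-life form, the timestamps, and the numbers below are all illustrative assumptions.

```python
def aggregate(opinions, half_life=6.0):
    """
    Weighted average of crowd judgments where later opinions are
    exponentially down-weighted. `opinions` is a list of
    (estimate, time) pairs; `half_life` is in the same unit as the
    timestamps (e.g., hours since the question opened).
    """
    weights = [0.5 ** (t / half_life) for _, t in opinions]
    total = sum(weights)
    return sum(w * v for (v, _), w in zip(opinions, weights)) / total

# (estimate, hours since question opened): a late opinion that may have been
# swayed by earlier public answers gets a quarter of the weight.
opinions = [(100, 0.0), (102, 1.0), (140, 12.0)]
print(round(aggregate(opinions), 1))  # → 105.5
```

A plain mean of these three estimates would be 114.0; the decayed aggregate stays close to the early, independent judgments, which is the intended effect of removing social influence.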
30

A data-driven framework to support resilient and sustainable early design

Zaker Esteghamati, Mohsen 05 August 2021 (has links)
Early design is the most critical stage for improving the resiliency and sustainability of buildings. An unaided early design follows the designer's accustomed domain of knowledge and cognitive biases. Given the inherent limitations of human decision-making, such a design process explores only a small set of alternatives using limited criteria and, most likely, misses high-performing alternatives. Performance-based engineering (PBE) is a probabilistic approach to quantify building performance against natural hazards in terms of decision metrics such as repair cost and functionality loss. PBE can therefore remarkably improve early design by informing the designer of the possible consequences of different decisions. Incorporating PBE in early design is obstructed by several challenges, such as the time- and effort-intensiveness of rigorous PBE assessments, the need for a specific skillset that might not be available, and the accrual of aleatoric uncertainty (associated with the innate randomness of physical system properties and surrounding environmental conditions) and epistemic uncertainty (associated with an incomplete state of knowledge). In addition, a successful early design requires exploring a large number of alternatives, which, when compounded by PBE assessments, will significantly strain computational resources and the project timeline. This dissertation proposes a framework to integrate prior knowledge and PBE assessments in early design. The primary workflow in the proposed framework develops a performance inventory to train statistical surrogate models using supervised learning algorithms. This performance inventory comprises PBE assessments consistent with building taxonomy and site, and is supported by a knowledge-based module. The knowledge-based module organizes previously published PBE assessments as a relational database to supplement the performance inventory and aid early design exploration through knowledge-based surrogate models.
Lastly, the developed knowledge-based and data-driven surrogate models are implemented in a sequential design exploration scheme to estimate the performance range for a given topology and building system. The proposed framework is then applied for mid-rise concrete office buildings in Charleston, South Carolina, where seismic vulnerability and environmental performance are linked to topology and design parameters. / Doctor of Philosophy / Recent advances in structural engineering aspire to achieve higher societal objectives than focusing solely on safety. Two main current objectives are resiliency (i.e., the built environment's ability to rapidly and equitably recover after an external shock, among other definitions) and sustainability (i.e., the ability to meet current needs without preventing future generations from meeting theirs, among other definitions). Therefore, holistic design approaches are needed that can include and explicitly evaluate these objectives at different steps, particularly the earlier stages. The importance of earlier stages stems from the higher freedom to make critical decisions – such as material and building system selection – without incurring higher costs and effort on the designer. Performance-based engineering (PBE) is a quantitative approach to calculating the impact of natural hazards on the built environment. The calculated impacts from PBE can then be communicated through a more easily understood language such as monetary values. However, several challenges should be first addressed to apply PBE in early design. First, PBE assessments are time- and effort-intensive and require expertise that might not be available to the designer. Second, a typical early design exploration evaluates many alternatives, significantly increasing the already high computational and time cost. Third, PBE requires detailed design and building information which is not available at the preliminary stages. 
This lack of knowledge is coupled with additional uncertainties due to the random nature of natural hazards and building system characteristics (e.g., material strength or other mechanical properties). This dissertation proposes a framework to incorporate PBE in early design, and tests it for concrete mid-rise offices in Charleston, South Carolina. The centerpiece of this framework is data-driven modeling that learns directly from assessments. The data-driven modeling treats PBE as a pre-configured data inventory and develops statistical surrogate models (i.e., simplified mathematical models). These models can then relate early design parameters to building seismic and environmental performance. The inventory is also supported by prior knowledge, structured as a database of published literature on PBE assessments. Lastly, the knowledge-based and data-driven models are applied in a specific order to narrow the performance range for a given building layout and system.
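The surrogate-model idea above, training a cheap statistical model on an inventory of prior PBE assessments so that early design queries skip the full analysis, can be sketched with a least-squares fit. The design parameters (stories, period, strength ratio), the synthetic "repair cost" generator, and the linear model form are all invented for illustration and do not reflect the dissertation's actual inventory or models.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical performance inventory: each row is a prior PBE assessment.
n = 200
stories = rng.integers(4, 13, n)                  # mid-rise range
period = 0.1 * stories + rng.normal(0, 0.05, n)   # rough period estimate
strength = rng.uniform(0.5, 1.5, n)               # normalized strength ratio
# Synthetic "expected annual repair cost" (% of replacement value).
cost = 0.3 * stories + 2.0 / strength + rng.normal(0, 0.2, n)

# Surrogate: linear least-squares on simple features of the design parameters.
X = np.c_[np.ones(n), stories, period, 1.0 / strength]
beta, *_ = np.linalg.lstsq(X, cost, rcond=None)

# Query the surrogate for a candidate early design instead of a full PBE run:
# an 8-story building, period 0.8 s, strength ratio 1.2.
candidate = np.array([1.0, 8, 0.8, 1.0 / 1.2])
print(round(candidate @ beta, 2))
```

A design-exploration loop would call this surrogate thousands of times across candidate topologies and systems, reserving full PBE assessments for the handful of finalists, which is the computational bargain the framework is built on.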
