31

Let's Have a party! An Open-Source Toolbox for Recursive Partytioning

Hothorn, Torsten, Zeileis, Achim, Hornik, Kurt January 2007 (has links) (PDF)
Package party, implemented in the R system for statistical computing, provides basic classes and methods for recursive partitioning along with reference implementations for three recently-suggested tree-based learners: conditional inference trees and forests, and model-based recursive partitioning. / Series: Research Report Series / Department of Statistics and Mathematics
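The reference implementations described in this abstract are R code in the party package. As a language-neutral illustration of the split-selection idea behind conditional inference trees (test each covariate for association with the response, split on the most significant one, stop when nothing is significant), here is a minimal Python sketch; the function names, permutation test, and thresholds are illustrative assumptions, not the package's API.

```python
import numpy as np

def perm_pvalue(x, y, n_perm=499, rng=None):
    """Permutation p-value for association between x and y (statistic: |Pearson r|)."""
    rng = np.random.default_rng(rng)
    obs = abs(np.corrcoef(x, y)[0, 1])
    perm = [abs(np.corrcoef(x, rng.permutation(y))[0, 1]) for _ in range(n_perm)]
    return (1 + sum(p >= obs for p in perm)) / (n_perm + 1)

def ctree_sketch(X, y, alpha=0.05, min_node=20):
    """Recursive partitioning: split on the covariate with the smallest Bonferroni-adjusted
    permutation p-value; stop when no covariate is significantly associated with y."""
    n, p = X.shape
    if n >= min_node:
        pvals = [perm_pvalue(X[:, j], y) for j in range(p)]
        j = int(np.argmin(pvals))
        cuts = np.unique(X[:, j])[:-1]
        if pvals[j] * p < alpha and len(cuts) > 0:
            # pick the cut-point that minimises the pooled within-child variance
            def cost(c):
                left = X[:, j] <= c
                return np.var(y[left]) * left.sum() + np.var(y[~left]) * (~left).sum()
            cut = min(cuts, key=cost)
            left = X[:, j] <= cut
            return {"var": j, "cut": float(cut),
                    "left": ctree_sketch(X[left], y[left], alpha, min_node),
                    "right": ctree_sketch(X[~left], y[~left], alpha, min_node)}
    return {"prediction": float(np.mean(y))}

# Illustrative data: the response depends only on the first of three covariates.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(float) + 0.1 * rng.normal(size=200)
print(ctree_sketch(X, y))
```
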
32

Coping with the computational and statistical bipolar nature of machine learning

Machart, Pierre 21 December 2012 (has links)
Machine Learning has its roots in a broad spectrum of fields including Artificial Intelligence, Pattern Recognition, Statistics, and Optimisation. From the earliest stages of Machine Learning, both computational issues and generalisation properties have been identified as central to the field. While the former address the computability, complexity (from a fundamental perspective), or computational efficiency (from a more practical standpoint) of learning systems, the latter aim at understanding and characterising how well the solutions they provide perform on new, unseen data. In recent years, the emergence of large-scale datasets in Machine Learning has deeply reshaped the principles of Learning Theory. Once possible constraints on training time are taken into account, one has to deal with trade-offs more complex than those classically addressed by Statistics. As a direct consequence, designing efficient algorithms (both in theory and in practice) able to handle large-scale datasets requires dealing jointly with the statistical and computational aspects of Learning. The present thesis aims at unravelling, analysing and exploiting some of the connections that naturally exist between these two aspects. More precisely, in a first part, we extend stability analysis, which relates certain algorithmic properties to the generalisation abilities of learning algorithms, to a novel (and fine-grained) performance measure, namely the confusion matrix. In a second part, we present a novel approach to learning a kernel-based regression function that serves the learning task at hand and exploits the structure of
33

The Effect of Reputation Shocks to Rating Agencies on Corporate Disclosures

Sethuraman, Subramanian January 2016 (has links)
This paper explores the effect of credit rating agencies' (CRA) reputation on the discretionary disclosures of corporate bond issuers. Academics, practitioners, and regulators disagree on the informational role played by major CRAs and the usefulness of credit ratings in influencing investors' perception of the credit risk of bond issuers. Using management earnings forecasts as a measure of discretionary disclosure, I find that investors demand more (less) disclosure from bond issuers when ratings become less (more) credible. In addition, using content analytics, I find that bond issuers disclose more qualitative information during periods of low CRA reputation to help investors better assess credit risk. That corporate managers alter their voluntary disclosure in response to CRA reputation shocks is consistent with credit ratings providing incremental information to investors and reducing adverse selection in lending markets. Overall, my findings suggest that managers rely on voluntary disclosure as a credible mechanism to reduce information asymmetry in bond markets. / Dissertation
34

Attitude and Adoption: Understanding Climate Change Through Predictive Modeling

Jackson B Bennett (7042994) 12 August 2019 (has links)
Climate change has emerged as one of the most critical issues of the 21st century. It stands to impact communities across the globe, forcing individuals and governments alike to adapt to a new environment. While it is critical for governments and organizations to make strides to change business as usual, individuals also have the ability to make an impact. The goal of this thesis is to study the beliefs that shape climate-related attitudes and the factors that drive the adoption of sustainable practices and technologies using a foundation in statistical learning. Previous research has studied the factors that influence both climate-related attitude and adoption, but comparatively little has been done to leverage recent advances in statistical learning and computing ability to advance our understanding of these topics. As increasingly large amounts of relevant data become available, it will be pivotal not only to use these emerging sources to derive novel insights on climate change, but to develop and improve statistical frameworks designed with climate change in mind. This thesis presents two novel applications of statistical learning to climate change, one of which includes a more general framework that can easily be extended beyond the field of climate change. Specifically, the work consists of two studies: (1) a robust integration of social media activity with climate survey data to relate climate-talk to climate-thought and (2) the development and validation of a statistical learning model to predict renewable energy installations using social, environmental, and economic predictors. The analysis presented in this thesis supports decision makers by providing new insights on the factors that drive climate attitude and adoption.
35

Performance financeira da carteira na avaliação de modelos de análise e concessão de crédito: uma abordagem baseada em aprendizagem estatística / Portfolio financial performance in the evaluation of credit analysis and granting models: an approach based on statistical learning

Silva, Rodrigo Alves 05 September 2014 (has links)
Credit analysis and granting models seek to associate the borrower's profile with the probability of default on contracted obligations, thereby identifying the risk associated with the borrower and helping the firm decide whether to approve or deny the credit request. This field of research has recently gained importance both in Brazil, where credit activity has intensified with strong participation of public banks, and internationally, owing to growing concerns about the potential economic damage caused by default events. As a result, many models and methods have been built and adapted for credit risk analysis of both consumers and companies. These models are tested and compared on the basis of predictive accuracy or other statistical optimization metrics, a procedure that may not be efficient from a financial standpoint and that makes it harder for the firm to interpret the results and decide which model is best, creating a gap between the choice of model and the firm's financial objectives. Given that financial performance is one of the main indicators of any management procedure, this study aimed to fill this gap by analyzing the financial performance of credit portfolios formed by statistical learning techniques currently used for credit risk classification and analysis in national and international research. The selected techniques (discriminant analysis, logistic regression, Naïve Bayes, kdB-1 and kdB-2 Bayesian networks, SVC, and SVM) were applied to the German Credit Data Set, and the results were first analyzed and compared in terms of accuracy and misclassification costs. In addition, the study proposes four financial metrics (RFC, PLR, RAROC, and IS), which produced different results across the techniques. These results suggest changes in the efficiency ranking, and therefore in the order of preference, of the techniques, demonstrating the importance of considering these metrics when analyzing and selecting optimal classification models.
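As a rough illustration of the accuracy and misclassification-cost comparison described above, the sketch below fits one of the named techniques (logistic regression) to the German Credit Data Set and scores it with the conventional 5:1 cost matrix for that dataset. Fetching the data from OpenML under the name "credit-g" is an assumption, and the thesis's financial metrics (RFC, PLR, RAROC, IS) are not reproduced here.

```python
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

# German Credit Data ("credit-g" on OpenML): 1000 applicants, target "good"/"bad".
data = fetch_openml("credit-g", version=1, as_frame=True)
X, y = data.data, (data.target == "bad").astype(int)   # 1 = bad risk

cat = X.select_dtypes(include="category").columns
num = X.columns.difference(cat)
model = make_pipeline(
    make_column_transformer((OneHotEncoder(handle_unknown="ignore"), cat),
                            (StandardScaler(), num)),
    LogisticRegression(max_iter=1000),
)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)
model.fit(X_tr, y_tr)
tn, fp, fn, tp = confusion_matrix(y_te, model.predict(X_te)).ravel()

accuracy = (tn + tp) / len(y_te)
# Conventional cost matrix for this dataset: accepting a bad risk costs 5 units,
# rejecting a good applicant costs 1 unit.
misclass_cost = 5 * fn + 1 * fp
print(f"accuracy={accuracy:.3f}  misclassification cost={misclass_cost}")
```
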
36

Individual differences in the use of distributional information in linguistic contexts

Hall, Jessica Erin 01 May 2018 (has links)
Statistical learning experiments have demonstrated that children and infants are sensitive to the types of statistical regularities found in natural language. These experiments often rely on statistical information based on linear dependencies, e.g. that x predicts y either immediately or after some intervening items, whereas learning to use language creatively relies on the ability to form grammatical categories (e.g. verbs, nouns) that share distributions. Distributional learning has not been explored in children or in individuals with developmental language disorder (DLD). Proposed statistical learning deficits in individuals with DLD are thought to have downstream effects related to poorer comprehension, but this relationship has not been shown experimentally. In this project, children and adults with DLD and their same-age typically developing (TD) peers complete an artificial grammar learning task that employs a made-up language and an online comprehension task that employs real language. In the artificial grammar learning task, participants are tested to determine whether they have learned the statistical regularities of trained stimuli and formed categories based upon these regularities. We hypothesize that if individuals with DLD have difficulty utilizing distributional information from novel input, then they will show less evidence of forming new categories than TD peers. Our second hypothesis is that if regularities are learned based on experience, then adults and children will show similar learning because they will have the same exposure to the artificial language. In the online comprehension task, participants use a computer mouse to choose a preferred interpretation of an ambiguous sentence that most adults interpret a certain way due to linguistic experience. We hypothesize that if individuals with DLD have overall poorer linguistic experience compared to TD individuals, then they will show weaker effects of these biases than their peers. Finally, we use measurements from both tasks to test whether they are correlated, with the additional goal of showing that language comprehension and statistical learning are related. This study provides information about differences between individuals with DLD and their TD peers, and between adults and children, in the ability to use distributional information from both accumulated and novel input. In doing so, we reveal the role of input and experience in using distributional information in linguistic environments.
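As a toy illustration (unrelated to the study's actual artificial grammar or stimuli) of what using distributional information means here, the sketch below builds context profiles for made-up words and shows that words sharing a distribution end up with similar profiles, which is the raw material for forming categories.

```python
import numpy as np
from collections import defaultdict
from itertools import product

# Toy "artificial language": a-words precede X-words, b-words precede Y-words.
a_words, x_words = ["alt", "ush"], ["dup", "ker", "fen"]
b_words, y_words = ["ong", "erd"], ["jic", "tam", "lum"]
corpus = [f"{a} {x}" for a, x in product(a_words, x_words)] + \
         [f"{b} {y}" for b, y in product(b_words, y_words)]

# Distributional profile of each word: counts of the words that follow and precede it.
vocab = sorted({w for s in corpus for w in s.split()})
idx = {w: i for i, w in enumerate(vocab)}
profile = defaultdict(lambda: np.zeros(2 * len(vocab)))
for s in corpus:
    w1, w2 = s.split()
    profile[w1][idx[w2]] += 1               # what follows w1
    profile[w2][len(vocab) + idx[w1]] += 1  # what precedes w2

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(profile["dup"], profile["ker"]))  # high: same distributional category
print(cosine(profile["dup"], profile["jic"]))  # low: different category
```
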
37

Neural Networks

Jordan, Michael I., Bishop, Christopher M. 13 March 1996 (has links)
We present an overview of current research on artificial neural networks, emphasizing a statistical perspective. We view neural networks as parameterized graphs that make probabilistic assumptions about data, and view learning algorithms as methods for finding parameter values that look probable in the light of the data. We discuss basic issues in representation and learning, and treat some of the practical issues that arise in fitting networks to data. We also discuss links between neural networks and the general formalism of graphical models.
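A minimal sketch of the statistical view this abstract describes, assuming the simplest possible "network" (a single sigmoid unit) and a Bernoulli model of the data: learning is gradient ascent on the log-likelihood, i.e. a search for parameter values that look probable in the light of the data. The synthetic data and step sizes are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Probabilistic assumption about the data: y ~ Bernoulli(sigmoid(w_true . x)).
n, d = 500, 3
X = rng.normal(size=(n, d))
w_true = np.array([2.0, -1.0, 0.5])
y = (rng.random(n) < 1 / (1 + np.exp(-(X @ w_true)))).astype(float)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Learning = maximum likelihood: ascend the average Bernoulli log-likelihood
#   (1/n) sum_i [ y_i log p_i + (1 - y_i) log(1 - p_i) ].
w = np.zeros(d)
lr = 0.5
for _ in range(2000):
    p = sigmoid(X @ w)
    grad = X.T @ (y - p) / n      # gradient of the average log-likelihood
    w += lr * grad

print("estimated weights:", np.round(w, 2))  # should land near w_true, up to sampling noise
```
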
38

Learning from Incomplete Data

Ghahramani, Zoubin, Jordan, Michael I. 24 January 1995 (has links)
Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives---the likelihood-based and the Bayesian. The goal is two-fold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering, classification, and function approximation from incomplete data in a principled and efficient manner. These algorithms are based on mixture modeling and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster, Laird, and Rubin 1977)---both for the estimation of mixture components and for coping with the missing data.
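A compact sketch of the likelihood-based approach in the simplest setting the paper builds on, a single multivariate Gaussian rather than a mixture: EM alternates between filling in missing entries with their conditional expectations (plus a conditional-covariance correction) and re-estimating the parameters. The data generation below is an illustrative assumption.

```python
import numpy as np

def em_gaussian_missing(X, n_iter=50):
    """Maximum-likelihood mean/covariance of a multivariate Gaussian when X (n x d)
    contains NaNs, via EM: the E-step replaces missing entries by their conditional
    expectation and accumulates the conditional covariance; the M-step re-estimates
    the parameters from the completed statistics."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    miss = np.isnan(X)
    mu = np.nanmean(X, axis=0)                 # initialise ignoring missing entries
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        X_hat = X.copy()
        corr = np.zeros((d, d))                # sum of conditional covariances
        for i in range(n):
            m, o = miss[i], ~miss[i]
            if not m.any():
                continue
            if not o.any():                    # row entirely missing
                X_hat[i] = mu
                corr += sigma
                continue
            S_oo = sigma[np.ix_(o, o)]
            S_mo = sigma[np.ix_(m, o)]
            K = S_mo @ np.linalg.solve(S_oo, np.eye(o.sum()))
            X_hat[i, m] = mu[m] + K @ (X[i, o] - mu[o])
            corr[np.ix_(m, m)] += sigma[np.ix_(m, m)] - K @ S_mo.T
        mu = X_hat.mean(axis=0)
        centred = X_hat - mu
        sigma = (centred.T @ centred + corr) / n
    return mu, sigma

# Example: a correlated 2-D Gaussian with roughly 30% of entries deleted at random.
rng = np.random.default_rng(1)
true_cov = np.array([[1.0, 0.8], [0.8, 1.5]])
Z = rng.multivariate_normal([0.0, 2.0], true_cov, size=400)
Z[rng.random(Z.shape) < 0.3] = np.nan
mu_hat, cov_hat = em_gaussian_missing(Z)
print(np.round(mu_hat, 2), np.round(cov_hat, 2))
```
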
39

A Note on Support Vector Machines Degeneracy

Rifkin, Ryan, Pontil, Massimiliano, Verri, Alessandro 11 August 1999 (has links)
When training Support Vector Machines (SVMs) over non-separable data sets, one sets the threshold $b$ using any dual cost coefficient that is strictly between the bounds of $0$ and $C$. We show that there exist SVM training problems with dual optimal solutions with all coefficients at bounds, but that all such problems are degenerate in the sense that the "optimal separating hyperplane" is given by $\mathbf{w} = \mathbf{0}$, and the resulting (degenerate) SVM will classify all future points identically (to the class that supplies more training data). We also derive necessary and sufficient conditions on the input data for this to occur. Finally, we show that an SVM training problem can always be made degenerate by the addition of a single data point belonging to a certain unbounded polyhedron, which we characterize in terms of its extreme points and rays.
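An illustrative construction, not taken from the paper, of such a degenerate problem: both classes occupy the same two points, the positive class supplies more examples, and the fitted linear SVM collapses to a weight vector near zero and sends every future point to the majority class.

```python
import numpy as np
from sklearn.svm import SVC

# Both classes sit on the same two points, but the positive class supplies three
# times as many examples, so no hyperplane can beat "always predict +1".
X = np.array([[-1.0], [1.0]] * 4)                 # eight one-dimensional points
y = np.array([1, 1, 1, 1, 1, 1, -1, -1])          # 3 positives and 1 negative at each point

clf = SVC(kernel="linear", C=1.0).fit(X, y)

print("w:", clf.coef_.ravel())                    # essentially 0: the hyperplane is degenerate
print("y_i * alpha_i:", clf.dual_coef_.ravel())   # dual coefficients of the support vectors
print("predictions:", clf.predict([[-5.0], [0.0], [5.0]]))  # all go to the majority class
```
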
40

Novel Computational Analyses of Allergens for Improved Allergenicity Risk Assessment and Characterization of IgE Reactivity Relationships

Soeria-Atmadja, Daniel January 2008 (has links)
Immunoglobulin E (IgE) mediated allergy is a major and seemingly increasing health problem in Western countries. The combined usage of databases of molecular and clinical information on allergens (allergenic proteins), as well as new experimental platforms capable of generating huge amounts of allergy-related data from a single blood test, holds great potential to enhance our knowledge of this complex disease. To benefit maximally from this development, however, both novel and improved methods for computational analysis are urgently required. This thesis concerns two types of important and practical computational analyses of allergens: allergenicity/IgE-cross-reactivity risk assessment and characterization of IgE-reactivity patterns. Both directions rely on the development and implementation of bioinformatics and statistical learning algorithms, which are applied either to amino acid sequence information of allergenic proteins or to quantified human blood serum levels of specific IgE antibodies to allergen preparations (purified extracts of allergenic sources, such as peanut or birch). The main application of computational risk assessment of allergenicity is to prevent unintentional introduction of allergen-encoding transgenes into genetically modified (GM) food crops. Two separate classification procedures for potential protein allergenicity are introduced. Both protocols rely on multivariate classification algorithms that are trained to discriminate allergens from presumed non-allergens based on their amino acid sequence. Both classification procedures are thoroughly evaluated, and the second protocol shows state-of-the-art performance in comparison to current top-ranked methods. Moreover, several pitfalls in performance estimation of classifiers are demonstrated and procedures to circumvent them are suggested. Visualization and characterization of IgE-reactivity patterns among allergen preparations are enabled by applying bioinformatics and statistical learning methods to a multivariate dataset holding recorded blood serum IgE levels of over 1000 sensitized individuals, each measured against 89 allergen preparations. Moreover, a novel framework for divisive hierarchical clustering, including graphical representation of the resulting output, is introduced, which greatly simplifies analysis of this dataset. Important IgE-reactivity relationships within several groups of allergen preparations are identified, including well-known groups of clinically relevant cross-reactivities.
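A heavily simplified sketch of the sequence-based classification idea: proteins represented by their amino-acid composition, with a classifier cross-validated on them. The sequences below are placeholders, and the thesis's actual feature representations, classifiers, and evaluation safeguards are more elaborate.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """20-dimensional amino-acid composition vector (relative frequency of each residue)."""
    seq = seq.upper()
    return np.array([seq.count(a) for a in AMINO_ACIDS], dtype=float) / max(len(seq), 1)

# Placeholder sequences; real work would use curated allergen and non-allergen sets.
allergens     = ["MKLLVLSLCFATLA", "MKTLALSLLAAGVA", "MKVLALSLAFVGLA"]
non_allergens = ["MGDVEKGKKIFIMK", "MSDNEDNFDGDDFD", "MTEYKLVVVGAGGV"]
X = np.vstack([composition(s) for s in allergens + non_allergens])
y = np.array([1] * len(allergens) + [0] * len(non_allergens))

# Cross-validation keeps training and evaluation sequences separate, one of the
# performance-estimation safeguards discussed in the thesis (its protocols also
# control for sequence similarity between folds, which this sketch does not).
scores = cross_val_score(SVC(kernel="rbf", gamma="scale"), X, y, cv=3)
print("CV accuracy:", scores.mean())
```
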
