41

Individual differences in the use of distributional information in linguistic contexts

Hall, Jessica Erin 01 May 2018 (has links)
Statistical learning experiments have demonstrated that children and infants are sensitive to the types of statistical regularities found in natural language. These experiments often rely on statistical information based on linear dependencies, e.g., that x predicts y either immediately or after some intervening items, whereas learning to use language creatively relies on the ability to form grammatical categories (e.g., verbs, nouns) that share distributions. Distributional learning has not been explored in children or in individuals with developmental language disorder (DLD). Proposed statistical learning deficits in individuals with DLD are thought to have downstream effects in the form of poorer comprehension, but this relationship has not been experimentally shown. In this project, children and adults with DLD and their same-age typically developing (TD) peers complete an artificial grammar learning task that employs a made-up language and an online comprehension task that employs real language. In the artificial grammar learning task, participants are tested to determine whether they have learned the statistical regularities of trained stimuli and formed categories based upon these regularities. We hypothesize that if individuals with DLD have difficulty utilizing distributional information from novel input, then they will show less evidence of forming new categories than TD peers. Our second hypothesis is that if regularities are learned based on experience, then adults and children will show similar learning because they will have the same exposure to the artificial language. In the online comprehension task, participants use a computer mouse to choose a preferred interpretation of a sentence that is ambiguous, but that most adults interpret a certain way due to linguistic experience. We hypothesize that if individuals with DLD have overall poorer linguistic experience than TD individuals, then they will show weaker bias effects than their peers. Finally, we use measurements from both tasks to test whether they are correlated, with the additional goal of showing that language comprehension and statistical learning are related. This study provides information about differences between individuals with DLD and their TD peers, and between adults and children, in the ability to use distributional information from both accumulated and novel input. In doing so, we reveal the role of input and experience in using distributional information in linguistic environments.
42

Neural Networks

Jordan, Michael I., Bishop, Christopher M. 13 March 1996 (has links)
We present an overview of current research on artificial neural networks, emphasizing a statistical perspective. We view neural networks as parameterized graphs that make probabilistic assumptions about data, and view learning algorithms as methods for finding parameter values that look probable in the light of the data. We discuss basic issues in representation and learning, and treat some of the practical issues that arise in fitting networks to data. We also discuss links between neural networks and the general formalism of graphical models.
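To illustrate the statistical view sketched in this abstract (not code from the paper), a small network can be treated as a parameterized model of p(y | x) and trained by making the observed labels probable, i.e., by minimizing the negative log-likelihood; the data, architecture, and learning rate below are arbitrary choices for the sketch:

import numpy as np

rng = np.random.default_rng(0)

# Toy binary-classification data (assumed, for illustration only).
X = rng.normal(size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)   # XOR-like labels

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer: the network parameterizes p(y = 1 | x, theta).
H = 8
W1 = rng.normal(scale=0.5, size=(2, H)); b1 = np.zeros(H)
W2 = rng.normal(scale=0.5, size=H);      b2 = 0.0
lr = 0.5

for _ in range(2000):
    # Forward pass: hidden activations and Bernoulli probability.
    A = np.tanh(X @ W1 + b1)
    p = sigmoid(A @ W2 + b2)
    # Negative log-likelihood (cross-entropy); learning = choosing parameters
    # under which the observed labels look probable.
    nll = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    # Backward pass: gradients of the mean negative log-likelihood.
    d_out = (p - y) / len(y)
    gW2 = A.T @ d_out; gb2 = d_out.sum()
    d_hid = np.outer(d_out, W2) * (1 - A**2)
    gW1 = X.T @ d_hid; gb1 = d_hid.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

print(f"final NLL: {nll:.3f}, accuracy: {np.mean((p > 0.5) == y):.2f}")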
43

Learning from Incomplete Data

Ghahramani, Zoubin, Jordan, Michael I. 24 January 1995 (has links)
Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives---the likelihood-based and the Bayesian. The goal is two-fold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering, classification, and function approximation from incomplete data in a principled and efficient manner. These algorithms are based on mixture modeling and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster, Laird, and Rubin 1977)---both for the estimation of mixture components and for coping with the missing data.
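In the simplest likelihood-based setting the abstract describes, EM fills in the expected sufficient statistics of the missing features. The sketch below does this for a single multivariate Gaussian; the paper's algorithms work with mixtures, so treat this as a stripped-down illustration rather than their method, with invented example data:

import numpy as np

def em_gaussian_missing(X, n_iter=100):
    """EM estimate of the mean/covariance of one multivariate Gaussian
    when X contains missing entries encoded as np.nan."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    obs = ~np.isnan(X)
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0) + 1e-6)
    for _ in range(n_iter):
        Ex = np.empty_like(X)
        Exx = np.zeros((d, d))
        for i in range(n):
            o, m = obs[i], ~obs[i]
            x = X[i].copy()
            if m.any():
                Soo = sigma[np.ix_(o, o)]
                Smo = sigma[np.ix_(m, o)]
                W = np.linalg.solve(Soo, Smo.T).T          # Smo @ inv(Soo)
                x[m] = mu[m] + W @ (X[i, o] - mu[o])       # E[x_m | x_o]
                Exx[np.ix_(m, m)] += sigma[np.ix_(m, m)] - W @ Smo.T  # conditional covariance
            Ex[i] = x
            Exx += np.outer(x, x)
        mu = Ex.mean(axis=0)                               # M-step from expected statistics
        sigma = Exx / n - np.outer(mu, mu)
    return mu, sigma

# Example: remove ~30% of entries from correlated Gaussian data and re-estimate.
rng = np.random.default_rng(1)
Z = rng.multivariate_normal([0, 2, -1], [[2, .8, .3], [.8, 1, .2], [.3, .2, 1.5]], size=500)
mask = rng.random(Z.shape) < 0.3
mask[:, 0] = False            # keep column 0 observed so no row is entirely missing
Z[mask] = np.nan
print(em_gaussian_missing(Z))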
44

A Note on Support Vector Machines Degeneracy

Rifkin, Ryan, Pontil, Massimiliano, Verri, Alessandro 11 August 1999 (has links)
When training Support Vector Machines (SVMs) over non-separable data sets, one sets the threshold $b$ using any dual cost coefficient that lies strictly between the bounds $0$ and $C$. We show that there exist SVM training problems whose dual optimal solutions have all coefficients at the bounds, but that all such problems are degenerate in the sense that the "optimal separating hyperplane" is given by $\mathbf{w} = \mathbf{0}$, and the resulting (degenerate) SVM will classify all future points identically (to the class that supplies more training data). We also derive necessary and sufficient conditions on the input data for this to occur. Finally, we show that an SVM training problem can always be made degenerate by the addition of a single data point belonging to a certain unbounded polyhedron, which we characterize in terms of its extreme points and rays.
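For context, the degeneracy described above can be stated in terms of the standard soft-margin dual (a textbook formulation, not reproduced from the paper):

\[
\max_{\alpha}\ \sum_{i=1}^{n}\alpha_i - \tfrac{1}{2}\sum_{i,j}\alpha_i\alpha_j y_i y_j\, x_i\cdot x_j
\quad\text{s.t.}\quad 0 \le \alpha_i \le C,\ \ \sum_{i}\alpha_i y_i = 0,
\qquad \mathbf{w} = \sum_{i}\alpha_i y_i x_i .
\]

Normally $b$ is recovered from the Karush-Kuhn-Tucker conditions at any $\alpha_i$ strictly between $0$ and $C$; in the degenerate case every optimal $\alpha_i$ lies in $\{0, C\}$ and $\mathbf{w} = \mathbf{0}$, so the decision function collapses to $\mathrm{sign}(b)$ and assigns every future point to the same class.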
45

Novel Computational Analyses of Allergens for Improved Allergenicity Risk Assessment and Characterization of IgE Reactivity Relationships

Soeria-Atmadja, Daniel January 2008 (has links)
Immunoglobulin E (IgE) mediated allergy is a major and seemingly increasing health problem in Western countries. The combined usage of databases of molecular and clinical information on allergens (allergenic proteins), as well as new experimental platforms capable of generating huge amounts of allergy-related data from a single blood test, holds great potential to enhance our knowledge of this complex disease. To maximally benefit from this development, however, both novel and improved methods for computational analysis are urgently required. This thesis concerns two types of important and practical computational analyses of allergens: allergenicity/IgE-cross-reactivity risk assessment and characterization of IgE-reactivity patterns. Both directions rely on the development and implementation of bioinformatics and statistical learning algorithms, which are applied either to amino acid sequence information of allergenic proteins or to quantified human blood serum levels of specific IgE antibodies to allergen preparations (purified extracts of allergenic sources, such as peanut or birch). The main application for computational risk assessment of allergenicity is to prevent unintentional introduction of allergen-encoding transgenes into genetically modified (GM) food crops. Two separate classification procedures for potential protein allergenicity are introduced. Both protocols rely on multivariate classification algorithms that are trained to discriminate allergens from presumed non-allergens based on their amino acid sequence. Both classification procedures are thoroughly evaluated, and the second protocol shows state-of-the-art performance in comparison to current top-ranked methods. Moreover, several pitfalls in the performance estimation of classifiers are demonstrated, and procedures to circumvent them are suggested. Visualization and characterization of IgE-reactivity patterns among allergen preparations are enabled by applying bioinformatics and statistical learning methods to a multivariate dataset holding recorded blood serum IgE levels of over 1000 sensitized individuals, each measured against 89 allergen preparations. Moreover, a novel framework for divisive hierarchical clustering, including graphical representation of the resulting output, is introduced, which greatly simplifies analysis of the abovementioned dataset. Important IgE-reactivity relationships within several groups of allergen preparations are identified, including well-known groups of clinically relevant cross-reactivities.
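As a rough sketch of this kind of sequence-based classification (the thesis's actual feature representations, algorithms, and data are not given here, so everything below is an illustrative stand-in), a baseline could encode each protein by its amino-acid composition and evaluate a classifier with cross-validation:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

AA = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids

def composition(seq):
    """20-dimensional amino-acid frequency vector for one protein sequence."""
    counts = np.array([seq.count(a) for a in AA], dtype=float)
    return counts / max(len(seq), 1)

# Placeholder data: random sequences with slightly shifted compositions stand in
# for an allergen set (label 1) and a presumed non-allergen set (label 0).
rng = np.random.default_rng(0)
def random_seq(bias, length=300):
    p = np.ones(20); p[:5] += bias; p /= p.sum()
    return "".join(rng.choice(list(AA), size=length, p=p))

sequences = [random_seq(0.5) for _ in range(50)] + [random_seq(0.0) for _ in range(50)]
labels = np.array([1] * 50 + [0] * 50)

X = np.vstack([composition(s) for s in sequences])
clf = RandomForestClassifier(n_estimators=200, random_state=0)
# Cross-validation (rather than resubstitution) guards against the kind of
# performance-estimation pitfalls the abstract warns about.
print(cross_val_score(clf, X, labels, cv=5).mean())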
46

Measurability Aspects of the Compactness Theorem for Sample Compression Schemes

Kalajdzievski, Damjan 31 July 2012 (has links)
In 1998, Ben-David and Litman proved that a concept space has a sample compression scheme of size $d$ if and only if every finite subspace has a sample compression scheme of size $d$. In this compactness theorem, measurability of the hypotheses of the resulting sample compression scheme is not guaranteed; at the same time, measurability of the hypotheses is a necessary condition for learnability. In this thesis we discuss when a sample compression scheme, created from compression schemes on finite subspaces via the compactness theorem, has measurable hypotheses. We show that if $X$ is a standard Borel space with a $d$-maximum and universally separable concept class $\mathcal{C}$, then $(X,\mathcal{C})$ has a sample compression scheme of size $d$ with universally Borel measurable hypotheses. Additionally, we introduce a new variant of compression scheme called a copy sample compression scheme.
47

PAC-Bayesian aggregation and multi-armed bandits

Audibert, Jean-Yves 14 October 2010 (has links) (PDF)
This habilitation thesis presents several contributions to (1) the PAC-Bayesian analysis of statistical learning; (2) the three aggregation problems: given $d$ functions, how to predict as well as (i) the best of these $d$ functions (model-selection-type aggregation), (ii) the best convex combination of these $d$ functions, and (iii) the best linear combination of these $d$ functions; and (3) multi-armed bandit problems.
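In symbols (a standard formulation paraphrasing the abstract, with a risk functional $R$ assumed), the three aggregation targets for functions $f_1,\dots,f_d$ are

\[
\min_{1\le j\le d} R(f_j), \qquad
\min_{\theta\in\Lambda_d} R\Big(\sum_{j=1}^{d}\theta_j f_j\Big), \qquad
\min_{\theta\in\mathbb{R}^d} R\Big(\sum_{j=1}^{d}\theta_j f_j\Big),
\]

where $\Lambda_d=\{\theta\in\mathbb{R}^d:\theta_j\ge 0,\ \sum_j\theta_j=1\}$ is the simplex; an aggregation procedure aims at a risk close to the first quantity (model selection aggregation), the second (convex aggregation), or the third (linear aggregation).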
48

Stochastic Stepwise Ensembles for Variable Selection

Xin, Lu 30 April 2009 (has links)
Ensemble methods such as AdaBoost, Bagging, and Random Forest have attracted much attention in the statistical learning community in the last 15 years. Zhu and Chipman (2006) proposed the idea of using ensembles for variable selection. Their implementation used a parallel genetic algorithm (PGA). In this thesis, I propose a stochastic stepwise ensemble for variable selection, which improves upon PGA. Traditional stepwise regression (Efroymson 1960) combines forward and backward selection: one step of forward selection is followed by one step of backward selection. In the forward step, each variable not already included is added to the current model, one at a time, and the one that best improves the objective function is retained. In the backward step, each variable already included is deleted from the current model, one at a time, and the one whose removal best improves the objective function is discarded. The algorithm continues until no improvement can be made by either the forward or the backward step. Instead of adding or deleting one variable at a time, the Stochastic Stepwise Algorithm (STST) adds or deletes a group of variables at a time, where the group size is randomly decided. In traditional stepwise selection, the group size is one and every candidate variable is assessed. When the group size is larger than one, as is often the case for STST, the total number of variable groups can be quite large. Instead of evaluating all possible groups, only a few randomly selected groups are assessed and the best one is chosen. From a methodological point of view, the improvement of the STST ensemble over PGA is due to a more structured way of constructing the ensemble; this allows better control over the strength-diversity tradeoff established by Breiman (2001). In fact, there is no mechanism to control this fundamental tradeoff in PGA. Empirically, the improvement is most prominent when a true variable in the model has a relatively small coefficient (relative to other true variables), and I show empirically that PGA has a much higher probability of missing that variable.
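To make the group-wise search concrete, here is a minimal single-member sketch under my own simplifying assumptions (a least-squares objective with a crude complexity penalty, and group sizes capped at three); it is not the thesis's STST implementation:

import numpy as np

def rss(X, y, S):
    """Residual sum of squares of a least-squares fit on the columns in S."""
    if not S:
        return float(np.sum((y - y.mean()) ** 2))
    cols = sorted(S)
    beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
    return float(np.sum((y - X[:, cols] @ beta) ** 2))

def stochastic_stepwise(X, y, n_groups=20, rng=None):
    """One ensemble member: propose randomly sized groups of variables to add
    or delete, score only a few random groups per step, keep the best."""
    rng = rng or np.random.default_rng()
    n, p = X.shape
    S = set()
    best = rss(X, y, S)                     # penalized score of the empty model
    improved = True
    while improved:
        improved = False
        for direction in ("forward", "backward"):
            pool = list(set(range(p)) - S) if direction == "forward" else list(S)
            if not pool:
                continue
            proposals = []
            for _ in range(n_groups):       # assess a few random groups, not all
                size = int(rng.integers(1, min(3, len(pool)) + 1))
                group = set(rng.choice(pool, size=size, replace=False).tolist())
                cand = S | group if direction == "forward" else S - group
                # Penalize model size so added variables must earn their keep.
                score = rss(X, y, cand) * np.exp(2.0 * len(cand) / n)
                proposals.append((score, cand))
            score, cand = min(proposals, key=lambda t: t[0])
            if score < best - 1e-12:
                S, best, improved = cand, score, True
    return sorted(S)

# An ensemble runs many independent members and votes on the selected variables.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 30))
y = X[:, 3] + 0.5 * X[:, 17] + rng.normal(size=200)   # variable 17 has a small coefficient
runs = [stochastic_stepwise(X, y, rng=np.random.default_rng(s)) for s in range(25)]
votes = np.bincount(np.concatenate([np.array(r, dtype=int) for r in runs]), minlength=30)
print(np.argsort(votes)[::-1][:5])          # variables ranked by ensemble votes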
49

Fundamental Limitations of Semi-Supervised Learning

Lu, Tyler (Tian) 30 April 2009 (has links)
The emergence of a new paradigm in machine learning known as semi-supervised learning (SSL) has benefited many applications where labeled data is expensive to obtain. However, unlike supervised learning (SL), which enjoys a rich and deep theoretical foundation, semi-supervised learning, which uses additional unlabeled data for training, still lacks a sound fundamental understanding. The purpose of this thesis is to take a first step towards bridging this theory-practice gap. We focus on investigating the inherent limitations of the benefits SSL can provide over SL. We develop a framework under which one can analyze the potential benefits, as measured by the sample complexity of SSL. Our framework is utopian in the sense that an SSL algorithm trains on a labeled sample and an unlabeled distribution, as opposed to an unlabeled sample in the usual SSL model. Thus, any lower bound on the sample complexity of SSL in this model implies lower bounds in the usual model. Roughly, our conclusion is that unless the learner is absolutely certain there is some non-trivial relationship between labels and the unlabeled distribution (an "SSL-type assumption"), SSL cannot provide significant advantages over SL. Technically speaking, we show that the sample complexity of SSL is no more than a constant factor better than that of SL for any unlabeled distribution, under a no-prior-knowledge setting (i.e., without SSL-type assumptions). We prove that for the class of thresholds in the realizable setting, the sample complexity of SL is at most twice that of SSL. We also prove that in the agnostic setting, for the classes of thresholds and unions of intervals, the sample complexity of SL is at most a constant factor larger than that of SSL. We conjecture this to be a general phenomenon applying to any hypothesis class. We also discuss issues regarding SSL-type assumptions, and in particular the popular cluster assumption. We give examples showing that even in the most accommodating circumstances, learning under the cluster assumption can be hazardous and lead to prediction performance much worse than simply ignoring the unlabeled data and doing supervised learning. We conclude with a look at future research directions that build on our investigation.
50

RELIABILITY AND RISK ASSESSMENT OF NETWORKED URBAN INFRASTRUCTURE SYSTEMS UNDER NATURAL HAZARDS

Rokneddin, Keivan 16 September 2013 (has links)
Modern societies increasingly depend on the reliable functioning of urban infrastructure systems in the aftermath of natural disasters such as hurricanes and earthquakes. Apart from sizable capital for maintenance and expansion, the reliable performance of infrastructure systems under extreme hazards also requires strategic planning and effective resource assignment. Hence, efficient system reliability and risk assessment methods are needed to give system stakeholders insight into infrastructure performance under different hazard scenarios and to support informed decisions in response to them. Moreover, efficient assignment of limited financial and human resources for maintenance and retrofit actions requires new methods to identify critical system components under extreme events. Infrastructure systems such as highway bridge networks are spatially distributed systems with many linked components, so network models describing them as mathematical graphs with nodes and links naturally apply to the study of their performance. Owing to the complex topology of these systems, general system reliability methods are ineffective for evaluating the reliability of large infrastructure systems. This research develops computationally efficient methods, such as a modified Markov Chain Monte Carlo simulation algorithm for network reliability, and proposes a network reliability framework (BRAN: Bridge Reliability Assessment in Networks) that is applicable to large and complex highway bridge systems. Since the responses of system components to hazard scenario events are often correlated, the BRAN framework enables accounting for correlated component failure probabilities stemming from different correlation sources. Failure correlations from non-hazard sources are particularly emphasized, as they potentially have a significant impact on network reliability estimates, yet they have often been ignored or only partially considered in the literature on infrastructure system reliability. The developed network reliability framework is also used for probabilistic risk assessment, with network reliability taken as the network performance metric. Risk analysis studies may require a prohibitively large number of simulations for large and complex infrastructure systems, as they involve evaluating the network reliability for multiple hazard scenarios. This thesis addresses this challenge by developing network surrogate models with statistical learning tools such as random forests. The surrogate models can replace network reliability simulations in a risk analysis framework and significantly reduce computation times. The proposed approach therefore offers an alternative to established ways of enhancing the computational efficiency of risk assessments: rather than reducing the number of analyzed hazard scenarios through hazard-consistent scenario generation or importance sampling, it builds a surrogate model of the complex system at hand. Nevertheless, the application of surrogate models can be combined with scenario reduction methods to improve analysis efficiency even further. To address the problem of prioritizing system components for maintenance and retrofit actions, two advanced metrics are developed in this research to rank the criticality of system components.

Both metrics combine system component fragilities with the topological characteristics of the network, and provide rankings that are either conditioned on specific hazard scenarios or probabilistic, depending on the preference of infrastructure system stakeholders. Both offer enhanced efficiency and practical applicability compared to existing methods. The developed frameworks for network reliability evaluation, risk assessment, and component prioritization are intended to address important gaps in state-of-the-art management and planning for infrastructure systems under natural hazards. Their application can enhance public safety by informing the decision-making process for expansion, maintenance, and retrofit actions for infrastructure systems.
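To make the surrogate idea concrete, the schematic sketch below (not the BRAN implementation; the simulator, scenario descriptors, and their effect on reliability are invented stand-ins) fits a random forest to simulated reliabilities and then queries it in place of the expensive simulation:

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def simulated_network_reliability(scenario):
    """Stand-in for an expensive Monte Carlo network-reliability run.
    'scenario' holds hazard descriptors (here: magnitude, distance)."""
    magnitude, distance = scenario
    # Toy response: reliability falls with magnitude, recovers with distance.
    logit = 4.0 - 1.2 * magnitude + 0.08 * distance + rng.normal(scale=0.1)
    return 1.0 / (1.0 + np.exp(-logit))

# Run the expensive simulator on a modest design of scenarios...
scenarios = np.column_stack([rng.uniform(5, 8, 300), rng.uniform(5, 60, 300)])
reliability = np.array([simulated_network_reliability(s) for s in scenarios])

# ...and train the surrogate once.
surrogate = RandomForestRegressor(n_estimators=300, random_state=0)
surrogate.fit(scenarios, reliability)

# Risk assessment can now query thousands of scenarios cheaply.
new_scenarios = np.column_stack([rng.uniform(5, 8, 10000), rng.uniform(5, 60, 10000)])
predicted = surrogate.predict(new_scenarios)
print(predicted.mean(), predicted.min())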
