1171

Evaluation and Analysis of Supervised Learning Algorithms and Classifiers / Utvärdering och Analys av Övervakade Inlärningsalgoritmer och Klassificerare

Lavesson, Niklas January 2006
The fundamental question studied in this thesis is how to evaluate and analyse supervised learning algorithms and classifiers. As a first step, we analyse current evaluation methods. Each method is described and categorised according to a number of properties. One conclusion of the analysis is that performance is often only measured in terms of accuracy, e.g., through cross-validation tests. However, some researchers have questioned the validity of using accuracy as the only performance metric. Also, the number of instances available for evaluation is usually very limited. In order to deal with these issues, measure functions have been suggested as a promising approach. However, a limitation of current measure functions is that they can only handle two-dimensional instance spaces. We present the design and implementation of a generalised multi-dimensional measure function and demonstrate its use through a set of experiments. The results indicate that there are cases for which measure functions may be able to capture aspects of performance that cannot be captured by cross-validation tests. Finally, we investigate the impact of learning algorithm parameter tuning. To accomplish this, we first define two quality attributes (sensitivity and classification performance) as well as two metrics for measuring each of the attributes. Using these metrics, a systematic comparison is made between four learning algorithms on eight data sets. The results indicate that parameter tuning is often more important than the choice of algorithm. Moreover, quantitative support is provided to the assertion that some algorithms are more robust than others with respect to parameter configuration. To sum up, the contributions of this thesis include: the definition and application of a formal framework which enables comparison and deeper understanding of evaluation methods from different fields of research, a survey of current evaluation methods, the implementation and analysis of a multi-dimensional measure function, and the definition and analysis of quality attributes used to investigate the impact of learning algorithm parameter tuning.
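The two evaluation issues flagged in this abstract lend themselves to a quick illustration. The sketch below is not the thesis's measure-function framework; it simply shows, assuming scikit-learn and one of its bundled datasets as stand-ins, how a classifier can be scored on several metrics beyond accuracy under cross-validation, and how the score spread across a parameter grid can be compared with the gap between algorithms.

```python
# Minimal sketch: multi-metric cross-validation, and parameter-tuning spread
# versus algorithm choice. Dataset and models are illustrative stand-ins.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Accuracy alone can hide differences; score several metrics at once.
scores = cross_validate(DecisionTreeClassifier(random_state=0), X, y, cv=10,
                        scoring=("accuracy", "roc_auc", "f1"))
print({k: v.mean().round(3) for k, v in scores.items() if k.startswith("test")})

# Sensitivity to parameters: spread of scores across a parameter grid.
for est, grid in ((DecisionTreeClassifier(random_state=0),
                   {"max_depth": [1, 3, 5, None]}),
                  (KNeighborsClassifier(), {"n_neighbors": [1, 5, 15, 51]})):
    gs = GridSearchCV(est, grid, cv=10).fit(X, y)
    spread = gs.cv_results_["mean_test_score"]
    print(type(est).__name__, "best:", round(gs.best_score_, 3),
          "tuning spread:", round(spread.max() - spread.min(), 3))
```

If the tuning spread within one algorithm exceeds the gap between the two algorithms' best scores, that mirrors the thesis's finding that parameter tuning can matter more than algorithm choice.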
1172

Modeling the power consumption of computing systems and applications through machine learning techniques / Modélisation de la consommation énergétique des systèmes informatiques et ses applications grâce à des techniques d'apprentissage automatique

Fontoura Cupertino, Leandro 17 July 2015
The number of computing systems has been increasing continuously over recent years, and the popularity of data centers has turned them into some of the most power-demanding facilities. The use of data centers is divided between high performance computing (HPC) and Internet services, or Clouds. Computing speed is crucial in HPC environments, while on Cloud systems it may vary according to their service-level agreements. Some data centers even propose hybrid environments; all of them are energy hungry. The present work is a study of power models for computing systems. These models allow a better understanding of the energy consumption of computers, and can be used as a first step towards better monitoring and management policies for such systems, either to enhance their energy savings or to charge end-users for the energy they consume. Energy management and control policies are subject to many limitations: most energy-aware scheduling algorithms use restricted power models which have a number of open problems. Previous work on power modeling of computing systems proposed the use of system information to monitor the power consumption of applications, but these models are either too specific to a given kind of application or lack accuracy. This report presents techniques to enhance the accuracy of power models by tackling issues ranging from the acquisition of power measurements to the definition of a generic workload that enables the creation of a generic model, i.e., a model that can be used for heterogeneous workloads. To achieve such models, the use of machine learning techniques is proposed. Machine learning models are adaptable to the architecture and form the core of this research. More specifically, this work evaluates the use of artificial neural networks (ANN) and linear regression (LR) as machine learning techniques for non-linear statistical modeling. Such models are created through a data-driven approach, enabling adaptation of their parameters based on information collected while running synthetic workloads. The use of machine learning techniques aims to achieve highly accurate application- and system-level estimators. The proposed methodology is architecture independent and can easily be reproduced in new environments. The results show that artificial neural networks enable the creation of highly accurate estimators; however, they cannot be applied at the process level due to modeling constraints. For that case, predefined models can be calibrated to achieve fair results. The use of process-level models enables the estimation of virtual machines' power consumption, which can be used for Cloud provisioning.
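To make the modeling approach concrete, here is a minimal sketch of the kind of data-driven power model described: synthetic utilization counters stand in for instrumented measurements, and scikit-learn's LinearRegression and MLPRegressor stand in for the thesis's LR and ANN estimators (the feature set and the power formula are invented for illustration).

```python
# Minimal sketch: fit LR and a small ANN to predict system power (watts)
# from utilization counters collected while running synthetic workloads.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Stand-in for counters: cpu, mem, disk, net utilization in [0, 1].
X = rng.uniform(0, 1, size=(2000, 4))
# Invented nonlinear power law plus noise, as a stand-in for measurements.
y = 60 + 45 * X[:, 0] ** 1.5 + 8 * X[:, 1] + rng.normal(0, 1.5, 2000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for model in (LinearRegression(),
              MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000,
                           random_state=0)):
    model.fit(X_tr, y_tr)
    print(type(model).__name__, "MAE (W):",
          round(mean_absolute_error(y_te, model.predict(X_te)), 2))
```

On nonlinear workload/power relationships like this one, the ANN typically achieves the lower error, which is consistent with the thesis's system-level findings.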
1173

Techniques d'analyse de contenu appliquées à l'imagerie spatiale / Machine learning applied to remote sensing images

Le Goff, Matthieu 20 October 2017
Since the 1970s, remote sensing has been a powerful tool for studying the Earth, in particular through satellite images produced in digital form. Compared to airborne images, satellite images provide more information, with greater spatial coverage and a short revisit period. The rise of remote sensing was followed by the development of processing technologies enabling users to analyse satellite images with the help of increasingly automatic processing chains. Since the 1970s, the various Earth observation missions have accumulated a large amount of information over time, due in particular to more frequent revisits of the same region, finer spatial resolution, and a wider swath (the spatial coverage of an acquisition). Remote sensing, once confined to the study of a single image, has gradually turned to the analysis of long time series of multispectral images acquired at different dates. The annual flow of satellite images is expected to reach several petabytes in the near future. The availability of such a large amount of data is an asset for developing advanced processing chains. The machine learning techniques used in remote sensing have greatly improved: the robustness of traditional machine learning approaches was often limited by the amount of available data, and new techniques have been developed to use this large new data flow effectively. However, the amount of data and the complexity of the algorithms embedded in the new processing pipelines require substantial computing power. In parallel, the computing power available for image processing has also increased. Graphics Processing Units (GPUs) are increasingly used, and public or private clouds are becoming more widespread; all the power required for automatic processing chains is now available at reasonable cost, and the design of new processing chains must take this factor into account. In remote sensing, the volume of data to be exploited has become a problem because of the computing power its analysis requires. Traditional remote sensing algorithms were designed for data that can be held in main memory throughout processing, a condition that is increasingly violated by the quantity of images and their resolution. These algorithms need to be revisited and adapted for large-scale data processing. This need is not specific to remote sensing; it also arises in sectors such as the web, medicine, speech recognition ... which have already solved some of these problems, and some of the techniques and technologies they developed still need to be adapted before they can be applied to satellite images. This thesis focuses on remote sensing algorithms for processing massive data volumes. First, an existing machine learning algorithm is studied and adapted for a distributed implementation, the aim being scalability: the algorithm should be able to process a large quantity of data given appropriate computing power. Second, a methodology based on recent machine learning algorithms, convolutional neural networks, is proposed and applied to our use cases on satellite images.
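As an illustration of the second contribution's ingredients, the sketch below defines a small convolutional network for multispectral patches; the band count, patch size, and class count are hypothetical, and PyTorch is an assumed framework rather than the one used in the thesis.

```python
# Minimal sketch: a small CNN for classifying multispectral image patches,
# assuming 4-band 32x32 patches and a hypothetical set of land-cover classes.
import torch
import torch.nn as nn

class PatchCNN(nn.Module):
    def __init__(self, bands=4, n_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(bands, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # After two 2x2 poolings, a 32x32 patch becomes 8x8 with 32 channels.
        self.classifier = nn.Linear(32 * 8 * 8, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = PatchCNN()
logits = model(torch.randn(8, 4, 32, 32))  # a batch of 8 fake patches
print(logits.shape)  # torch.Size([8, 5])
```

For the large-scale setting the thesis targets, such a model would be trained on patches streamed from distributed storage rather than on data held in memory.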
1174

Contributions to the use of analogical proportions for machine learning : theoretical properties and application to recommendation / Contributions à l'usage des proportions analogiques pour l'apprentissage artificiel : propriétés théoriques et application à la recommandation

Hug, Nicolas 05 July 2017
Analogical reasoning is recognized as a core component of human intelligence. It has been extensively studied from philosophical and psychological viewpoints, but recent work also addresses the modeling of analogical reasoning for computational purposes, with a particular focus on analogical proportions. We are interested here in the use of analogical proportions for making predictions, in a machine learning context. In recent work, analogy-based classifiers have achieved noteworthy performance, in particular by performing well on some artificial problems where other traditional methods tend to fail. Starting from this empirical observation, the goal of this thesis is twofold. The first topic of research is to assess the relevance of analogical learners on real-world, practical application problems. The second is to exhibit meaningful theoretical properties of analogical classifiers, which had so far only been studied empirically. The field of application chosen for assessing the suitability of analogical classifiers in a real-world setting is that of recommender systems. A common reproach addressed to recommender systems is that they often lack novelty and diversity in their recommendations; as a way of establishing links between seemingly unrelated objects, analogy appeared to be a potential tool for overcoming this issue. Our experiments show that while analogical classifiers sometimes offer accuracy comparable to that of basic classical approaches, their cubic algorithmic complexity penalizes them too heavily for practical applications where computation time is a primary constraint. On the theoretical side, a key contribution of this thesis is a functional definition of analogical classifiers that unifies the various pre-existing approaches. Until now, only algorithmic definitions were known, making a thorough theoretical study difficult. From this functional definition, we clearly identify the links between the analogical approach and nearest-neighbor classifiers, both in terms of the high-level algorithm and in terms of theoretical properties (error rates in particular). We also identify a criterion that makes the application of our analogical inference principle perfectly safe (i.e., error-free), thereby exhibiting the linearity properties of analogical reasoning.
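For readers unfamiliar with the formal device, the following sketch implements the standard Boolean formalization of analogical proportions (a : b :: c : d, "a is to b as c is to d") and the associated equation solving that analogy-based classifiers apply component-wise to feature vectors. It is a toy illustration, not the thesis's unified functional definition.

```python
# Minimal sketch of Boolean analogical proportions and analogical inference.
def proportion_holds(a, b, c, d):
    # a:b::c:d holds iff a and b differ exactly as c and d do.
    return (a and not b) == (c and not d) and (not a and b) == (not c and d)

def solve(a, b, c):
    # The equation a:b::c:x has a Boolean solution iff a == b or a == c;
    # the unique solution is then c (if a == b) or b (if a == c).
    if a == b:
        return c
    if a == c:
        return b
    return None  # no Boolean solution exists

print(proportion_holds(1, 0, 1, 0))  # True: both pairs flip 1 -> 0

# Component-wise use on feature vectors, as analogical classifiers do:
x1, x2, x3 = (1, 0, 1), (1, 1, 1), (0, 0, 0)
prediction = [solve(a, b, c) for a, b, c in zip(x1, x2, x3)]
print(prediction)  # [0, 1, 0] -- the vector completing the analogy
```

The cubic complexity mentioned above comes from this scheme's need to consider triples (x1, x2, x3) of training examples for each item to classify.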
1175

Essays on exchange rate pass through

Han, Lu January 2018
This dissertation contributes to the theoretical and empirical understanding of the international transmission of exchange rate shocks. It consists of three chapters. The first chapter extends Corsetti and Dedola (2005) by allowing for competition in retail networks. In the model, four types of firms interact: retailing manufacturers, non-retailing manufacturers, specialised retailers, and nontradable-good producers. The equilibrium depends on the interaction among these four types of firms, which leads to dynamic and incomplete exchange rate pass-through (ERPT) depending on the firms' share of retail networks. With a standard calibration, the model can generate a high (4-5) long-run trade elasticity without conflicting with a low (0.5-1) short-run elasticity, suggesting that the dynamics of retail networks offer a potential explanation of the trade elasticity puzzle. Chapter 2 investigates the ERPT of Chinese exporters. We propose an estimator that utilises orthogonal dimensions to control for unobserved marginal costs and estimates destination-specific markup adjustments to bilateral and multilateral exchange rate shocks. Our estimates suggest that the cost channel accounts for roughly 50% of conventional ERPT estimates. We identify new channels of heterogeneity in firms' pricing behaviour and provide supporting evidence on the international pricing system. Chapter 3 aims to bridge the gap between theoretical and empirical work on ERPT. I propose a machine learning algorithm that systematically detects the determinants of ERPT. The algorithm is designed to work directly with highly disaggregated firm-level customs trade databases as well as publicly available commodity trade flow datasets. Tested on data simulated from a realistic micro-founded multi-country trade model, the algorithm achieves accuracies of around 95% and 80% in simple and complex scenarios respectively. Applying the algorithm to China's customs data from 2000 to 2006, I document new evidence on the nonlinear relationships among market structures, unit value volatility, and ERPT.
1176

Learning to predict cryptocurrency price using artificial neural network models of time series

Gullapalli, Sneha January 1900
Master of Science / Department of Computer Science / William H. Hsu / Cryptocurrencies are digital currencies that have garnered significant investor attention in the financial markets. The aim of this project is to predict the daily price, particularly the daily high and closing price, of the cryptocurrency Bitcoin. This plays a vital role in making trading decisions. There exist various factors which affect the price of Bitcoin, thereby making price prediction a complex and technically challenging task. To perform prediction, we trained temporal neural networks such as time-delay neural networks (TDNN) and recurrent neural networks (RNN) on historical time series – that is, past prices of Bitcoin over several years. Features such as the opening price, highest price, lowest price, closing price, and volume of a currency over several preceding quarters were taken into consideration so as to predict the highest and closing price of the next day. We designed and implemented TDNNs and RNNs using the NeuroSolutions artificial neural network (ANN) development environment to build predictive models and evaluated them by computing various measures such as the MSE (mean square error), NMSE (normalized mean square error), and r (Pearson’s correlation coefficient) on a continuation of the training data from each time series, held out for validation.
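A minimal sketch of the windowed (time-delay) formulation and the reported evaluation measures follows. The random-walk series and the naive last-value predictor are stand-ins for the Bitcoin data and the NeuroSolutions models, and NMSE is taken here as MSE normalized by the target variance, one common definition.

```python
# Minimal sketch: time-delay (sliding-window) setup plus MSE, NMSE, and
# Pearson's r, on a synthetic price series standing in for Bitcoin data.
import numpy as np

def make_windows(series, lag):
    # Each row holds `lag` past values; the target is the next value.
    X = np.stack([series[i:i + lag] for i in range(len(series) - lag)])
    y = series[lag:]
    return X, y

rng = np.random.default_rng(0)
prices = np.cumsum(rng.normal(0, 1, 500)) + 100  # random-walk stand-in
X, y = make_windows(prices, lag=7)

# Naive baseline in place of a trained TDNN/RNN: tomorrow = today.
y_pred = X[:, -1]
mse = np.mean((y - y_pred) ** 2)
nmse = mse / np.var(y)                 # one common NMSE definition
r = np.corrcoef(y, y_pred)[0, 1]       # Pearson's correlation coefficient
print(f"MSE={mse:.3f}  NMSE={nmse:.3f}  r={r:.3f}")
```

A trained temporal network would replace the naive predictor; the held-out continuation of each series, as described above, would supply `y` for validation.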
1177

Reconstruction-free Inference from Compressive Measurements

January 2015
abstract: As a promising solution to the problem of acquiring and storing large amounts of image and video data, spatial-multiplexing camera architectures have received a lot of attention in the recent past. Such architectures have the attractive feature of combining the two-step process of acquisition and compression of pixel measurements in a conventional camera into a single step. A popular variant is the single-pixel camera, which obtains measurements of the scene using a pseudo-random measurement matrix. Advances in compressive sensing (CS) theory in the past decade have supplied tools that, in theory, allow near-perfect reconstruction of an image from these measurements even at sub-Nyquist sampling rates. However, current state-of-the-art reconstruction algorithms suffer from two drawbacks: they are (1) computationally very expensive and (2) incapable of yielding high-fidelity reconstructions at high compression ratios. In computer vision, the final goal is usually to perform an inference task using the acquired images, not signal recovery. With this motivation, this thesis considers the possibility of inference directly from compressed measurements, thereby obviating the need for expensive reconstruction algorithms. Non-linear features are often used for inference tasks in computer vision, but it is currently unclear how to extract such features from compressed measurements. Instead, using the theoretical basis provided by the Johnson-Lindenstrauss lemma, discriminative features based on smashed correlation filters are derived, and it is shown that it is indeed possible to perform reconstruction-free inference at high compression ratios with only a marginal loss in accuracy. As a specific inference problem in computer vision, face recognition is considered, mainly beyond the visible spectrum, such as in the short-wave infrared (SWIR) region, where sensors are expensive. / Dissertation/Thesis / Masters Thesis Electrical Engineering 2015
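The core idea, matching in the measurement domain rather than the pixel domain, can be sketched in a few lines. The following is a toy illustration of compressed-domain correlation under a pseudo-random ±1 measurement matrix, not the smashed correlation filters derived in the thesis; the dimensions are arbitrary.

```python
# Minimal sketch: reconstruction-free matching in the compressed domain.
# By the Johnson-Lindenstrauss lemma, a random projection approximately
# preserves inner products, so correlation can be computed on the
# measurements y = Phi @ x without ever reconstructing x.
import numpy as np

rng = np.random.default_rng(0)
n, m = 4096, 256  # ambient dimension, measurements (16x compression)
Phi = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(m)

template = rng.normal(size=n)
scene_match = template + 0.1 * rng.normal(size=n)  # noisy copy
scene_other = rng.normal(size=n)                   # unrelated signal

def compressed_corr(a, b):
    ya, yb = Phi @ a, Phi @ b  # only compressed measurements are used
    return ya @ yb / (np.linalg.norm(ya) * np.linalg.norm(yb))

print(compressed_corr(template, scene_match))  # close to 1
print(compressed_corr(template, scene_other))  # close to 0
```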
1178

A New Machine Learning Based Approach to NASA's Propulsion Engine Diagnostic Benchmark Problem

January 2015
abstract: The gas turbine engine for aircraft propulsion is one of the most physically complex and safety-critical systems in the world. Its failure diagnosis is challenging due to the complexity of the system model, the difficulty of practical testing, and the infeasibility of creating homogeneous diagnostic performance evaluation criteria for the diverse engine makes. NASA has designed and published a standard benchmark problem for propulsion engine gas path diagnostics that enables comparisons among different engine diagnostic approaches. Some traditional model-based approaches and novel, purely data-driven approaches such as machine learning have been applied to this problem. This study focuses on a different machine learning approach to the diagnostic problem. Some of the most common machine learning techniques, such as support vector machines, multi-layer perceptrons, and self-organizing maps, are used to help gain insight into the different engine failure modes from a big-data perspective. They are integrated to achieve good performance based on a sound understanding of the complex dataset. The study presents a new hierarchical machine learning structure to enhance classification accuracy in NASA's engine diagnostic benchmark problem. The designed hierarchical structure produces an average diagnostic accuracy of 73.6%, which outperforms comparable studies published most recently. / Dissertation/Thesis / Masters Thesis Electrical Engineering 2015
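The abstract does not specify the hierarchy, but the general pattern, a coarse router followed by per-group specialists, can be sketched as follows; the group assignment, the choice of an SVM router and MLP experts, and the synthetic data are all assumptions for illustration.

```python
# Minimal sketch of a two-level hierarchical classifier: stage 1 routes a
# sample to a coarse failure group, stage 2 refines within that group.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_classes=4, n_informative=6,
                           random_state=0)
group = (y >= 2).astype(int)  # hypothetical coarse grouping of the classes

router = SVC().fit(X, group)  # stage 1: route to a failure group
experts = {g: MLPClassifier(max_iter=1000, random_state=0)
               .fit(X[group == g], y[group == g]) for g in (0, 1)}

def predict(x):
    g = router.predict(x.reshape(1, -1))[0]
    return experts[g].predict(x.reshape(1, -1))[0]

print(predict(X[0]), "true:", y[0])
```

Splitting the problem this way lets each expert specialize on the failure modes that are hardest to separate globally, which is the usual rationale for such hierarchies.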
1179

Graph-based Estimation of Information Divergence Functions

January 2017
abstract: Information divergence functions, such as the Kullback-Leibler divergence or the Hellinger distance, play a critical role in statistical signal processing and information theory; however, estimating them can be a challenge. Most often, parametric assumptions are made about the two distributions in order to estimate the divergence of interest. In cases where no parametric model fits the data, non-parametric density estimation is used. In statistical signal processing applications, Gaussianity is usually assumed, since closed-form expressions for common divergence measures have been derived for this family of distributions. Parametric assumptions are preferred when it is known that the data follow the model; however, this is rarely the case in real-world scenarios. Non-parametric density estimators are characterized by a very large number of parameters that have to be tuned with costly cross-validation. In this dissertation we focus on a specific family of non-parametric estimators, called direct estimators, that bypass density estimation completely and directly estimate the quantity of interest from the data. We introduce a new divergence measure, the $D_p$-divergence, that can be estimated directly from samples without parametric assumptions on the distribution. We show that the $D_p$-divergence bounds the binary, cross-domain, and multi-class Bayes error rates and, in certain cases, provides provably tighter bounds than the Hellinger divergence. In addition, we propose a new methodology that allows the experimenter to construct direct estimators for existing divergence measures, or to construct new divergence measures with custom properties tailored to the application. To examine the practical efficacy of these new methods, we evaluate them in a statistical learning framework on a series of real-world data science problems involving speech-based monitoring of neuro-motor disorders. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2017
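As a sketch of what "direct" graph-based estimation looks like, the following implements the Friedman-Rafsky minimal-spanning-tree statistic commonly used for this family of divergences; the normalization follows one published form of the $D_p$ estimator and should be checked against the dissertation rather than taken as its exact construction.

```python
# Minimal sketch: build a minimum spanning tree over the pooled samples and
# count edges that join the two samples; few cross edges means the
# distributions are well separated, so the divergence estimate is high.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def fr_divergence(X, Y):
    m, n = len(X), len(Y)
    Z = np.vstack([X, Y])
    mst = minimum_spanning_tree(squareform(pdist(Z))).tocoo()
    labels = np.array([0] * m + [1] * n)
    cross = np.sum(labels[mst.row] != labels[mst.col])  # cross-sample edges
    return 1 - cross * (m + n) / (2.0 * m * n)

rng = np.random.default_rng(0)
same = fr_divergence(rng.normal(0, 1, (200, 2)), rng.normal(0, 1, (200, 2)))
diff = fr_divergence(rng.normal(0, 1, (200, 2)), rng.normal(3, 1, (200, 2)))
print(f"same distribution: {same:.2f}, shifted: {diff:.2f}")  # ~0 vs near 1
```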
1180

Patient-Centered and Experience-Aware Mining for Effective Information Discovery in Health Forums

January 2016
abstract: Online health forums provide a convenient channel for patients, caregivers, and medical professionals to share their experience, support and encourage each other, and form health communities. The fast-growing content in health forums provides a large repository for people seeking valuable information. A forum user can issue a keyword query to search health forums for specific questions, e.g., what treatments are effective for a disease symptom? A medical researcher can discover medical knowledge in a timely and large-scale fashion by automatically aggregating the latest evidence emerging in health forums. This dissertation studies how to discover information in health forums effectively. Several challenges have been identified. First, existing work relies on syntactic information units, such as a sentence, a post, or a thread, to bind different pieces of information in a forum. However, most information discovery tasks should be based on the semantic information unit: a patient. For instance, given a keyword query that involves the relationship between a treatment and side effects, the matched keywords are expected to refer to the same patient. In this work, patient-centered mining is proposed to mine patient semantic information units. In a patient information unit, health information such as diseases, symptoms, treatments, and effects is connected by the corresponding patient. Second, the information published in health forums is of varying quality: some of it is patient-reported personal health experience, while some can be hearsay. In this work, a context-aware experience extraction framework is proposed to mine patient-reported personal health experience, which can be used for evidence-based knowledge discovery or for finding patients with similar experience. Finally, the proposed patient-centered and experience-aware mining framework is used to build a patient health information database for effectively discovering adverse drug reactions (ADRs) from health forums. ADRs have become a serious health problem and even a leading cause of death in the United States. Health forums provide valuable evidence at large scale and in a timely fashion through the active participation of patients, caregivers, and doctors. Empirical evaluation shows the effectiveness of the proposed approach. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2016
