171

Investigating the Impact of Air Pollution, Meteorology, and Human Mobility on Excess Deaths during COVID-19 in Quito : A Correlation, Regression, Machine Learning, and Granger Causality Analysis

Tariq, Waleed, Naqvi, Sehrish January 2023 (has links)
Air pollution and meteorological conditions impact COVID-19 mortality rates. This research studied Quito, Ecuador, using Granger causality tests and regression models to investigate the relationship between pollutants, meteorological variables, human mobility, and excess deaths. Results suggested that mobility, as measured by the Google Mobility Index and the Facebook Isolation Index, together with nitrogen dioxide and sulphur dioxide, significantly impacts excess deaths, while carbon monoxide and relative humidity show mixed results. Measures to reduce carbon monoxide emissions and increase humidity levels may mitigate the impact of air pollution on COVID-19 mortality rates. Further research is needed to investigate the impact of pollutants on COVID-19 transmission in other locations. Healthcare decision-makers must monitor and mitigate the impact of pollutants, promote healthy air quality policies, and encourage physical activity in safe environments. They must also consider meteorological conditions and implement measures such as increased ventilation and air conditioning to reduce exposure. Additionally, they must consider human mobility and reduce it to slow the spread of the disease. Decision-makers must monitor and track excess deaths during the pandemic to understand the impact of pollutants, meteorological conditions, and human mobility on human health. Public education is critical to raising awareness of air quality and its impact on health. Encouraging individuals to reduce their exposure to pollutants and adverse meteorological conditions can play a critical role in mitigating the impact of air pollution on respiratory health during the pandemic.
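
To make the method concrete, here is a minimal sketch of a Granger causality check between a pollutant series and excess deaths, assuming weekly data in a pandas DataFrame; the series and the column names ("excess_deaths", "no2") are synthetic placeholders, not the study's actual variables or pipeline.

```python
# Hedged sketch: Granger causality test between a pollutant and excess deaths.
# Synthetic data; column names are illustrative assumptions.
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
n = 120
no2 = rng.normal(20, 5, n)                                  # synthetic NO2 series
deaths = 5 + 0.3 * np.roll(no2, 2) + rng.normal(0, 1, n)    # lagged dependence on NO2
df = pd.DataFrame({"excess_deaths": deaths, "no2": no2})

# Test whether lags of NO2 help predict excess deaths (column order: [effect, cause]).
results = grangercausalitytests(df[["excess_deaths", "no2"]], maxlag=4)
```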
172

Exploring Alarm Data for Improved Return Prediction in Radios : A Study on Imbalanced Data Classification

Färenmark, Sofia January 2023 (has links)
The global tech company Ericsson has been tracking the return rate of their products for over 30 years, using it as a key performance indicator (KPI). These KPIs play a critical role in making sound business decisions, identifying areas for improvement, and planning. To enhance the customer experience, the company highly values the ability to predict the number of returns in advance each month. However, predicting returns is a complex problem affected by multiple factors that determine when radios are returned. Analysts at the company have observed indications of a potential correlation between alarm data and the number of returns. This paper aims to address the need for better prediction models to improve return rate forecasting for radios, utilizing alarm data. The alarm data, which is stored in an internal database, includes logs of activated alarms at various sites, along with technical and logistical information about the products, as well as the historical records of returns. The problem is approached as a classification task, where radios are classified as either "return" or "no return" for a specific month, using the alarm dataset as input. However, because the number of returned radios is far smaller than the number of distributed ones, the dataset suffers from a heavy class imbalance. The class imbalance problem has garnered considerable attention in the field of machine learning in recent years, as traditional classification models struggle to identify patterns in the minority class of imbalanced datasets. A method that addresses the class imbalance problem was therefore required to construct an effective prediction model for returns. This paper adopts a systematic approach inspired by similar problems. It applies the feature selection methods LASSO and Boruta, along with the resampling technique SMOTE, and evaluates various classifiers, including the support vector machine (SVM), random forest classifier (RFC), decision tree (DT), and a neural network (NN) with class weights, to identify the best-performing model. As accuracy is not suitable as an evaluation metric for imbalanced datasets, the AUC and AUPRC values were calculated for all models to assess the impact of feature selection, class weights, resampling techniques, and the choice of classifier. The best model was the weighted NN, achieving a median AUC value of 0.93 and a median AUPRC value of 0.043. Likewise, both the LASSO+SVM+SMOTE and LASSO+RFC+SMOTE models demonstrated similar performance, with median AUC values of 0.92 and 0.93, and median AUPRC values of 0.038 and 0.041, respectively. The baseline for the AUPRC value for this data set was 0.005. Furthermore, the results indicated that resampling techniques are necessary for successful classification of the minority class. Thorough pre-processing and a balanced split between the test and training sets are crucial before applying resampling, as this technique is sensitive to noisy data. While feature selection improved performance to some extent, it could also lead to unreliable results when the data are noisy. The choice of classifier had less impact on model performance than resampling and feature selection.
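
A hedged sketch of the kind of pipeline described above: SMOTE applied to the training fold only, a classifier fitted on the resampled data, and AUC/AUPRC as evaluation metrics. The data are synthetic and the random forest is only one of the classifiers the thesis compares; this is not Ericsson's alarm dataset or the exact models.

```python
# Hedged sketch: imbalanced classification with SMOTE and AUC/AUPRC evaluation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, average_precision_score
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE

# Synthetic stand-in for the alarm data: ~0.5% of samples in the "return" class.
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.995], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Oversample the minority class on the training split only.
X_res, y_res = SMOTE(random_state=0).fit_resample(X_tr, y_tr)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_res, y_res)

proba = clf.predict_proba(X_te)[:, 1]
print("AUC:  ", roc_auc_score(y_te, proba))
print("AUPRC:", average_precision_score(y_te, proba))  # baseline equals the positive rate
```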
173

Regularization: Stagewise Regression and Bagging

Ehrlinger, John M. 31 March 2011 (has links)
No description available.
174

Joint Gaussian Graphical Model for multi-class and multi-level data

Shan, Liang 01 July 2016 (has links)
Gaussian graphical models have been a popular tool to investigate conditional dependency between random variables by estimating sparse precision matrices. The estimated precision matrices can be mapped into networks for visualization. For related but different classes, jointly estimating networks by taking advantage of common structure across classes can help us better estimate conditional dependencies among variables. Furthermore, there may exist a multilevel structure among variables: some variables are considered higher level variables and others are nested in these higher level variables, which are called lower level variables. In this dissertation, we made several contributions to the area of joint estimation of Gaussian graphical models across heterogeneous classes: the first is to propose a joint estimation method for Gaussian graphical models across unbalanced multi-class data, whereas the second considers multilevel variable information during the joint estimation procedure and simultaneously estimates the higher level network and the lower level network. For the first project, we consider the problem of jointly estimating Gaussian graphical models across unbalanced classes. Most existing methods require equal or similar sample sizes among classes. However, many real applications do not have similar sample sizes. Hence, in this dissertation, we propose the joint adaptive graphical lasso, a weighted L1-penalized approach, for unbalanced multi-class problems. Our joint adaptive graphical lasso approach combines information across classes so that their common characteristics can be shared during the estimation process. We also introduce regularization into the adaptive term so that the unbalancedness of the data is taken into account. Simulation studies show that our approach performs better than existing methods in terms of false positive rate, accuracy, Matthews correlation coefficient, and false discovery rate. We demonstrate the advantage of our approach using a liver cancer data set. For the second project, we propose a method to jointly estimate multilevel Gaussian graphical models across multiple classes. Current methods are limited to investigating a single-level conditional dependency structure when a multilevel structure exists among the variables. Because higher level variables may work together to accomplish certain tasks, simultaneously exploring conditional dependency structures among higher level variables and among lower level variables is of main interest. Given multilevel data from heterogeneous classes, our method ensures that common structures in terms of the multilevel conditional dependency are shared during the estimation procedure, while unique structures for each class are retained as well. Our proposed approach is achieved by first introducing a higher level variable factor within a class, and then common factors across classes. The performance of our approach is evaluated on several simulated networks. We also demonstrate the advantage of our approach using breast cancer patient data. / Ph. D.
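
The joint adaptive graphical lasso proposed in this dissertation is not an off-the-shelf routine; as a baseline illustration of the building block it extends, the sketch below estimates a single-class sparse precision matrix with the ordinary graphical lasso and reads nonzero off-diagonal entries as conditional-dependency edges. The covariance matrix is an assumed toy example.

```python
# Hedged sketch: single-class graphical lasso as the baseline for joint estimation.
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.6, 0.0, 0.0, 0.0],
                [0.6, 1.0, 0.4, 0.0, 0.0],
                [0.0, 0.4, 1.0, 0.0, 0.0],
                [0.0, 0.0, 0.0, 1.0, 0.3],
                [0.0, 0.0, 0.0, 0.3, 1.0]])      # assumed true dependency structure
X = rng.multivariate_normal(mean=np.zeros(5), cov=cov, size=500)

model = GraphicalLassoCV().fit(X)
precision = model.precision_                     # estimated sparse inverse covariance
edges = np.argwhere(np.triu(np.abs(precision) > 1e-4, k=1))
print(edges)                                     # nonzero off-diagonals = network edges
```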
175

Sensitivity of the EQ-5D-5L for fatigue, memory and concentration problems, and dyspnea, and their added value in patients after COVID-19 with persistent long-term symptoms: An application of multiple linear regression and LASSO

Wadsten, Carl January 2023 (has links)
This thesis examined the sensitivity of the EQ-5D-5L instrument in measuring health-related quality of life (HRQoL) among patients with persistent symptoms following COVID-19, including fatigue, memory and concentration problems, and dyspnea. Additionally, it was analyzed whether adding these symptoms to the EQ-5D-5L improved the explained variance for HRQoL. Patients from Uppsala University Hospital, Sweden, answered a survey that included questions on the five dimensions of health represented by the EQ-5D-5L and an additional question on a general health score, the EQ-VAS. Multiple linear regression, Spearman's rank correlation coefficient, and the Least Absolute Shrinkage and Selection Operator (LASSO) were used to examine the sensitivity of the EQ-5D-5L. For the explanatory analysis, the adjusted R² was used to evaluate explanatory power with and without the presence of the symptoms. The results showed that the EQ-5D-5L dimensions explained a moderate proportion of the variance for fatigue and memory/concentration problems and a weak proportion for dyspnea. The explanatory analysis showed that adding fatigue significantly improved the explained variance of EQ-VAS by 5.5%, adding memory/concentration problems improved it only marginally, and adding dyspnea was non-significant. Additionally, strong to moderate correlations were found between fatigue and memory/concentration problems and multiple dimensions of the EQ-5D-5L. These findings suggest that the EQ-5D-5L instrument may be a valuable tool in assessing HRQoL in patients with persistent COVID-19 symptoms, and that adding fatigue to the EQ-5D-5L could improve its explanatory power for HRQoL in patients suffering from infectious disease. / COMBAT post-covid
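
A minimal sketch of the explanatory comparison described above: regress EQ-VAS on the five EQ-5D-5L dimensions, then add a symptom score and compare adjusted R². All variable names and the simulated responses are illustrative assumptions, not the study's data.

```python
# Hedged sketch: adjusted R-squared with and without an added symptom (fatigue).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({c: rng.integers(1, 6, n) for c in
                   ["mobility", "selfcare", "activities", "pain", "anxiety"]})
df["fatigue"] = rng.integers(0, 11, n)
df["eq_vas"] = (100 - 5 * df["pain"] - 4 * df["anxiety"]
                - 2 * df["fatigue"] + rng.normal(0, 8, n))

base = smf.ols("eq_vas ~ mobility + selfcare + activities + pain + anxiety", df).fit()
extended = smf.ols("eq_vas ~ mobility + selfcare + activities + pain + anxiety + fatigue", df).fit()
print(base.rsquared_adj, extended.rsquared_adj)   # gain in adjusted R² from adding fatigue
```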
176

Achieving shrinkage in a time-varying parameter model framework

Bitto, Angela, Frühwirth-Schnatter, Sylvia January 2019 (has links) (PDF)
Shrinkage for time-varying parameter (TVP) models is investigated within a Bayesian framework, with the aim of automatically reducing time-varying parameters to static ones if the model is overfitting. This is achieved by placing the double gamma shrinkage prior on the process variances. An efficient Markov chain Monte Carlo scheme is developed, exploiting boosting based on the ancillarity-sufficiency interweaving strategy. The method is applicable to TVP models for univariate as well as multivariate time series. Applications include a TVP generalized Phillips curve for EU area inflation modeling and a multivariate TVP Cholesky stochastic volatility model for joint modeling of the returns from the DAX-30 index.
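
The following toy simulation illustrates only the state equation behind a TVP model: each coefficient follows a random walk whose innovation variance determines whether it is genuinely time-varying or effectively static, which is what the double gamma prior shrinks toward zero. It is not the paper's MCMC sampler, and the variance values are assumed.

```python
# Hedged sketch: random-walk coefficient paths with a dynamic vs. a (near-)static variance.
import numpy as np

rng = np.random.default_rng(0)
T = 200
theta = np.array([0.05, 1e-6])            # process variances: one dynamic, one effectively static
beta = np.zeros((T, 2))
beta[0] = [1.0, -0.5]
for t in range(1, T):
    beta[t] = beta[t - 1] + rng.normal(0, np.sqrt(theta), size=2)

print(np.ptp(beta, axis=0))               # range of each path: the first wanders, the second stays flat
```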
177

Learning algorithms for sparse classification / Algorithmes d'estimation pour la classification parcimonieuse

Sanchez Merchante, Luis Francisco 07 June 2013 (has links)
Cette thèse traite du développement d'algorithmes d'estimation en haute dimension. Ces algorithmes visent à résoudre des problèmes de discrimination et de classification, notamment, en incorporant un mécanisme de sélection des variables pertinentes. Les contributions de cette thèse se concrétisent par deux algorithmes, GLOSS pour la discrimination et Mix-GLOSS pour la classification. Tous les deux sont basés sur la résolution d'une régression régularisée de type "optimal scoring" avec une formulation quadratique de la pénalité group-Lasso qui encourage l'élimination des descripteurs non-significatifs. Les fondements théoriques montrant que la régression de type "optimal scoring" pénalisée avec un terme "group-Lasso" permet de résoudre un problème d'analyse discriminante linéaire ont été développés ici pour la première fois. L'adaptation de cette théorie pour la classification avec l'algorithme EM n'est pas nouvelle, mais elle n'a jamais été détaillée précisément pour les pénalités qui induisent la parcimonie. Cette thèse démontre solidement que l'utilisation d'une régression de type "optimal scoring" pénalisée avec un terme "group-Lasso" à l'intérieur d'une boucle EM est possible. Nos algorithmes ont été testés avec des bases de données réelles et artificielles en haute dimension avec des résultats probants en termes de parcimonie, et ce, sans compromettre la performance du classifieur. / This thesis deals with the development of estimation algorithms with embedded feature selection in the context of high-dimensional data, in both the supervised and unsupervised frameworks. The contributions of this work are materialized by two algorithms, GLOSS for the supervised domain and Mix-GLOSS for its unsupervised counterpart. Both algorithms are based on the resolution of an optimal scoring regression regularized with a quadratic formulation of the group-Lasso penalty, which encourages the removal of uninformative features. The theoretical foundations proving that a group-Lasso penalized optimal scoring regression can be used to solve a linear discriminant analysis problem have first been developed in this work. The theory that adapts this technique to the unsupervised domain by means of the EM algorithm is not new, but it has never been precisely detailed for sparsity-inducing penalties. This thesis solidly demonstrates that the utilization of group-Lasso penalized optimal scoring regression inside an EM loop is possible. Our algorithms have been tested on real and artificial high-dimensional databases, with convincing results in terms of parsimony and without compromising prediction performance.
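
As a hedged illustration of the mechanism GLOSS relies on, the sketch below implements the group-Lasso proximal (block soft-thresholding) step, which shrinks groups of coefficients and removes weak groups entirely. The groups, coefficients, and threshold are assumed values; this is not the full penalized optimal scoring algorithm.

```python
# Hedged sketch: the group-Lasso proximal operator (block soft-thresholding).
import numpy as np

def group_soft_threshold(beta, groups, lam):
    """Shrink each coefficient group toward zero; drop a group entirely if its norm is below lam."""
    out = np.zeros_like(beta)
    for g in groups:
        norm = np.linalg.norm(beta[g])
        if norm > lam:
            out[g] = (1 - lam / norm) * beta[g]   # shrink but keep the group
    return out

beta = np.array([0.9, 1.1, 0.05, -0.02, 0.4])
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4])]
print(group_soft_threshold(beta, groups, lam=0.3))  # the weak second group is removed
```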
178

THREE ESSAYS ON THE APPLICATION OF MACHINE LEARNING METHODS IN ECONOMICS

Lawani, Abdelaziz 01 January 2018 (has links)
Over the last decades, economics as a field has experienced a profound transformation from theoretical work toward an emphasis on empirical research (Hamermesh, 2013). One common constraint of empirical studies is the access to data, the quality of the data, and the time span it covers. In general, applied studies rely on survey, administrative, or private sector data. These data are limited and rarely have universal or near-universal population coverage. The growth of the internet has made available a vast amount of digital information. These big digital data are generated through social networks, sensors, and online platforms. They account for an increasing part of economic activity, yet for economists the availability of these big data also raises many new challenges related to the techniques needed to collect, manage, and derive knowledge from them. The data are in general unstructured, complex, and voluminous, and the traditional software used for economic research is not always effective in dealing with these types of data. Machine learning is a branch of computer science that uses statistics to deal with big data. The objective of this dissertation is to reconcile machine learning and economics. It uses three case studies to demonstrate how data freely available online can be harvested and used in economics. The dissertation uses web scraping to collect large volumes of unstructured data online. It uses machine learning methods to derive information from the unstructured data and shows how this information can be used to answer economic questions or address econometric issues. The first essay shows how machine learning can be used to derive sentiments from reviews and, using the sentiments as a measure of quality, examines an old economic theory: price competition in oligopolistic markets. The essay confirms the economic theory that agents compete on price. It also confirms that the quality measure derived from sentiment analysis of the reviews is a valid proxy for quality and influences price. The second essay uses a random forest algorithm to show that reviews can be harnessed to predict consumers' preferences. The third essay shows how property descriptions can be used to address an old but still relevant problem in hedonic pricing models: omitted variable bias. Using the Least Absolute Shrinkage and Selection Operator (LASSO), it shows that pricing errors in hedonic models can be reduced by including the description of the properties in the models.
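
A sketch of the third essay's idea under illustrative assumptions: extract features from free-text property descriptions and let LASSO select the informative ones for a (log-)price regression. The descriptions, prices, and the TF-IDF/LassoCV choices are placeholders, not the dissertation's data or exact specification.

```python
# Hedged sketch: text features from property descriptions feeding a LASSO price model.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LassoCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

descriptions = ["renovated kitchen near park", "needs repairs, busy street",
                "renovated, quiet street, garage", "small flat, busy street",
                "large garden, renovated bathroom", "needs repairs, no parking",
                "quiet street, near park", "busy street, small kitchen"]
log_price = np.array([13.1, 12.2, 13.3, 12.0, 13.4, 11.9, 13.0, 12.1])  # synthetic

text_features = TfidfVectorizer().fit_transform(descriptions).toarray()
model = make_pipeline(StandardScaler(), LassoCV(cv=2)).fit(text_features, log_price)
print(model[-1].coef_)            # nonzero entries = description words LASSO keeps
```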
179

Sparse additive models / Modèles additifs parcimonieux

Avalos, Marta 21 December 2004 (has links) (PDF)
Many functional estimation algorithms exist for supervised statistical learning. However, most of them were developed with the aim of providing accurate estimators, without considering the interpretability of the solution. Additive models make it possible to explain predictions simply, by involving only one explanatory variable at a time, but they are difficult to implement. This thesis is devoted to the development of an estimation algorithm for additive models. On the one hand, their use is simplified, since the tuning of complexity is largely integrated into the parameter estimation phase. On the other hand, interpretability is promoted by a tendency to automatically eliminate the least relevant variables. Strategies for speeding up the computations are also proposed. An approximation of the effective number of parameters allows the use of analytical model selection criteria. Its validity is tested through simulations and on real data.
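
For readers unfamiliar with additive models, the sketch below runs plain backfitting on a two-variable additive model with a crude polynomial smoother. It only illustrates the additive structure the thesis builds on; the penalized estimation and automatic variable elimination described above are not implemented here.

```python
# Hedged sketch: backfitting for an additive model y = f1(x1) + f2(x2) + noise.
import numpy as np

rng = np.random.default_rng(0)
n = 300
X = rng.uniform(-2, 2, size=(n, 2))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(0, 0.2, n)

def smooth(x, r, degree=4):
    """Fit one component: a low-degree polynomial of x to the partial residual r."""
    return np.polyval(np.polyfit(x, r, degree), x)

f = np.zeros((n, 2))
for _ in range(20):                              # backfitting iterations
    for j in range(2):
        partial = y - y.mean() - f[:, 1 - j]     # remove the other component
        f[:, j] = smooth(X[:, j], partial)
        f[:, j] -= f[:, j].mean()                # center each f_j for identifiability

residual = y - y.mean() - f.sum(axis=1)
print(np.var(residual))                          # should approach the noise variance (~0.04)
```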
180

Hierarchical penalties for integrating knowledge into statistical models / Pénalités hiérarchiques pour l'intégration de connaissances dans les modèles statistiques

Szafranski, Marie 21 November 2008 (has links) (PDF)
Statistical learning aims to predict, but also to analyse or interpret a phenomenon. In this thesis, we propose to guide the learning process by incorporating knowledge about how the features of a problem are organised. This knowledge is represented by a two-level tree structure, which defines distinct groups of features. We also assume that only a few (groups of) features are involved in discriminating between observations. The goal is therefore to identify the relevant groups of features, as well as the significant features within those groups. To this end, we use a variational formulation based on adaptive penalization. We show that this formulation amounts to minimizing a problem regularized by a mixed norm. Relating these two approaches provides two viewpoints for studying the convexity and sparsity properties of the method. This work was carried out in the setting of both parametric and non-parametric function spaces. The interest of the method is illustrated on brain-computer interface problems.
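
A small sketch of the mixed norm such hierarchical penalties minimize: an inner norm aggregates the coefficients within each group and an outer l1-type sum over groups induces group-level sparsity. The group structure, exponent, and coefficient values are illustrative assumptions, not those of the thesis.

```python
# Hedged sketch: a two-level mixed-norm penalty over groups of coefficients.
import numpy as np

def mixed_norm(w, groups, inner=2):
    """Sum over groups of the inner norm of each group's coefficients (an l1/l_inner norm)."""
    return sum(np.linalg.norm(w[g], ord=inner) for g in groups)

w = np.array([0.8, -0.6, 0.0, 0.01, 0.0, 0.5])
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]
print(mixed_norm(w, groups))       # penalty value used as the regularizer
```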
