Global ETD Search

41	Assessing Viability of Open-Source Battery Cycling Data for Use in Data-Driven Battery Degradation Models Ritesh Gautam (17582694) 08 December 2023 (has links) Lithium-ion batteries are increasingly used to power systems ranging from common cell phones and laptops to advanced electric automobiles and aircraft. However, like all battery types, lithium-ion batteries are prone to naturally occurring degradation phenomena that limit their effective use in these systems to a finite amount of time. This degradation is caused by many variables and conditions, including environmental conditions, physical stress and strain on the body of the battery cell, and charge/discharge parameters and cycling. Predicting this degradation behavior accurately and reliably is crucial for any party looking to implement and use battery-powered systems. However, because the processes that affect battery degradation are complicated, non-linear, and multivariable, this can be difficult to achieve. Compared to traditional methods of battery degradation prediction and modeling, such as equivalent circuit models and physics-based electrochemical models, data-driven machine learning tools have been shown to handle predicting and classifying the complex nature of battery degradation without requiring any prior knowledge of the physical systems they describe.
One of the most critical steps in developing these data-driven neural network algorithms is data procurement and preprocessing. No matter how advanced the architecture, a neural network prediction tool trained without large amounts of high-quality data will not be as effective as one trained on vast quantities of high-quality data. This work aims to gather battery degradation data from a wide variety of sources and studies, examine how the data was produced, test the effectiveness of the data in the Interfacial Multiphysics Laboratory’s autoencoder-based neural network tool CD-Net, and analyze the results to determine the factors that make battery degradation datasets perform better in machine learning and deep learning tools. It also relates these findings to other data-driven models by comparing the CD-Net model’s performance with that of the publicly available ElasticNet model from BEEP (Battery Evaluation and Early Prediction). The reported accuracy and prediction models from the CD-Net and ElasticNet tools demonstrate that larger datasets with actively selected training/testing designations and fewer errors in the data produce much higher quality neural networks that are much more reliable in estimating the state of health of lithium-ion battery systems. The results also demonstrate that, when attempting to create a generalized prediction model applicable to multiple forms of battery cells and applications, data-driven models are much less effective when trained on data from multiple different cell chemistries, form factors, and cycling conditions than when trained on more congruent datasets. Aerospace materials Data engineering and data science Neural networks Lithium-ion Batteries Machine Learning Models Battery degradation data preprocessing efforts
42	Анализ средств для интерпретирования моделей машинного обучения при анализе табличных данных : магистерская диссертация / Analysis of tools for interpreting machine learning models when analyzing tabular data Бабий, И. Н., Babiy, I. N. January 2023 (has links) Цель работы – анализ средств для интерпретирования моделей машинного обучения и их практического применения для интерпретирования результатов моделей машинного обучения при анализе табличных данных. Объект исследования – средства для интерпретирования моделей машинного обучения. Методы исследования: теоретический анализ литературы по теме исследования, изучение документации библиотек машинного обучения, классификация исследуемых методов, экспериментальный включающий проведение исследовательского анализа данных, обучение моделей машинного обучения и применение интерпретирования, обобщение полученных данных и их сравнение. Результаты работы: подготовлен обзор и практическое руководство по интерпретации результатов машинного обучения для табличных данных. Выпускная квалификационная работа выполнена в текстовом редакторе Microsoft Word и представлена в твердой копии. / The purpose of the work is to analyze tools for interpreting machine learning models and their practical application to interpreting the results of machine learning models when analyzing tabular data. The object of study is tools for interpreting machine learning models. Research methods: theoretical analysis of the literature on the research topic, study of the documentation of machine learning libraries, classification of the methods studied, experimental work (exploratory data analysis, training machine learning models, and applying interpretation), and summarizing and comparing the data obtained. Results of the work: a review and practical guide to interpreting machine learning results for tabular data has been prepared. The final qualifying work was written in the Microsoft Word text editor and is presented in hard copy. 
МАШИННОЕ ОБУЧЕНИЕ MASTER'S THESIS MACHINE LEARNING MACHINE LEARNING MODELS
43	INVESTIGATING DATA ACQUISITION TO IMPROVE FAIRNESS OF MACHINE LEARNING MODELS Ekta (18406989) 23 April 2024 (has links) Machine learning (ML) algorithms are increasingly being used in a variety of applications and are heavily relied upon to make decisions that impact people’s lives. ML models are often praised for their precision, yet they can discriminate against certain groups due to biased data. These biases, rooted in historical inequities, pose significant challenges in developing fair and unbiased models. Central to addressing this issue is the mitigation of biases inherent in the training data, as their presence can yield unfair and unjust outcomes when models are deployed in real-world scenarios. This study investigates the efficacy of data acquisition, i.e., one of the stages of data preparation, akin to the pre-processing bias mitigation technique. Through experimental evaluation, we showcase the effectiveness of data acquisition, where the data is acquired using data valuation techniques to enhance the fairness of machine learning models. Algorithmic Fairness Bias influence functions Data Acquisition Fairness Machine Learning Models Bias mitigation German credit data Adult census dataset COMPAS dataset
44	Rare Events Predictions with Time Series Data / Prediktion av sällsynta händelser med tidsseriedata Eriksson, Jonas, Kuusela, Tuomas January 2024 (has links) This study aims to develop models for predicting rare events, specifically elevated intracranial pressure (ICP) in patients with traumatic brain injury (TBI). Using time-series data of ICP, we created and evaluated several machine learning models, including K-Nearest Neighbors, Random Forest, and logistic regression, in order to predict ICP levels exceeding 20 mmHg, a critical threshold for medical intervention. The time-series data was segmented and transformed into a tabular format, with feature engineering applied to extract meaningful statistical characteristics. We framed the problem as a binary classification task, focusing on whether ICP levels exceeded the 20 mmHg threshold. We sought the optimal model by comparing the predictive performance of the algorithms. All models demonstrated good performance for predictions up to 30 minutes in advance, after which a significant decline in performance was observed. Within this timeframe, the models achieved Matthews Correlation Coefficient (MCC) scores ranging between 0.876 and 0.980, with the Random Forest models showing the highest performance. In contrast, logistic regression displayed a notable deviation at the 40-minute mark, recording an MCC score of 0.752. The results highlight the potential of these models to provide reliable, real-time predictions of dangerous ICP levels up to 30 minutes in advance, which is crucial for timely and effective medical interventions. Rare event prediction Time series analysis Elevated Intracranial Pressure (ICP) Traumatic Brain Injury (TBI) Machine learning models Binary classification Probability Theory and Statistics Sannolikhetsteori och statistik
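The evaluation described in the abstract above (binary labels from the 20 mmHg threshold, scored with the Matthews Correlation Coefficient) can be illustrated with a minimal sketch. This is not the authors' code: the function names `label_exceeds` and `mcc` are hypothetical, and the sketch assumes labels are taken per ICP reading rather than per engineered window.

```python
import math

ICP_THRESHOLD_MMHG = 20.0  # threshold named in the abstract


def label_exceeds(icp_mmhg, threshold=ICP_THRESHOLD_MMHG):
    """Return 1 where ICP strictly exceeds the threshold, else 0."""
    return [1 if value > threshold else 0 for value in icp_mmhg]


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: report 0.0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

MCC uses all four cells of the confusion matrix, which makes it a common choice when the positive class is rare, as elevated ICP is here.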
45	FEDERATED LEARNING AMIDST DYNAMIC ENVIRONMENTS Bhargav Ganguly (19119859) 08 November 2024 (has links) Federated Learning (FL) is a prime example of a large-scale distributed machine learning framework that has emerged as a result of the exponential growth in data generation and processing capabilities on smart devices. This framework enables the efficient processing and analysis of vast amounts of data, leveraging the collective power of numerous devices to achieve unprecedented scalability and performance. In the FL framework, each end-user device trains a local model using its own data. Through the periodic synchronization of local models, FL achieves a global model that incorporates the insights from all participating devices. This global model can then be used for various applications, such as predictive analytics and recommendation systems.
Despite its potential, traditional FL frameworks face significant hurdles in real-world applications. These challenges stem from two primary issues: the dynamic nature of data distributions and the efficient utilization of network resources in diverse settings. Traditional FL frameworks often rely on the assumption that data distributions remain stationary over time. However, real-world environments are inherently dynamic, with data distributions constantly evolving, which in turn becomes a potential source of temporal heterogeneity in FL. Another significant challenge in traditional FL frameworks is the efficient use of network resources in heterogeneous settings. Real-world networks consist of devices with varying computational capabilities, communication protocols, and network conditions. Traditional FL frameworks often struggle to adapt to these diverse, spatially heterogeneous settings, leading to inefficient use of network resources and increased latency.
The primary focus of this thesis is to investigate algorithmic frameworks that can mitigate the challenges posed by temporal and spatial system heterogeneities in FL. A significant source of temporal heterogeneity in FL is the dynamic drift of client datasets over time, whereas spatial heterogeneities broadly subsume the diverse computational capabilities and network conditions of devices in a network. We introduce two novel FL frameworks: MASTER-FL, which addresses model staleness in the presence of temporally drifting datasets, and Cooperative Edge-Assisted Dynamic Federated Learning (CE-FL), which manages both spatial and temporal heterogeneities in extensive hierarchical FL networks. MASTER-FL is specifically designed to ensure that the global model remains accurate and up to date even in environments characterized by datasets that change rapidly over time. CE-FL, on the other hand, leverages server-side computing capabilities, intelligent data offloading, floating aggregation, and cooperative learning strategies to manage the diverse computational capabilities and network conditions often associated with modern FL systems. Distributed systems and algorithms federated learning models distributed machine learning (ML) Optimization Theory
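The synchronization step that the abstract above describes in general terms (local models combined into a global model) can be sketched as a weighted average of client parameters. This is a generic illustration in the style of federated averaging, not MASTER-FL or CE-FL; the function name `federated_average` and the choice to weight clients by local dataset size are assumptions of the sketch.

```python
def federated_average(client_weights, client_sizes):
    """Combine per-client parameter vectors into one global vector.

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes:   number of local training samples held by each client,
                    used here as the averaging weight (an assumption).
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two clients: the second holds three times as much data, so it dominates.
global_model = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

A scheme like this implicitly treats every client update as equally fresh, which is the stationarity assumption the thesis identifies as a weakness.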
46	Estimation and misspecification Risks in VaR estimation / Estimation and misspecification risks in VaR evaluation Telmoudi, Fedya 19 December 2014 (has links) Dans cette thèse, nous étudions l'estimation de la valeur à risque conditionnelle (VaR) en tenant compte du risque d'estimation et du risque de modèle. Tout d'abord, nous considérons une méthode en deux étapes pour estimer la VaR. La première étape évalue le paramètre de volatilité en utilisant un estimateur quasi maximum de vraisemblance généralisé (gQMLE) fondé sur une densité instrumentale h. La seconde étape estime un quantile des innovations à partir du quantile empirique des résidus obtenus dans la première étape. Nous donnons des conditions sous lesquelles l'estimateur en deux étapes de la VaR est convergent et asymptotiquement normal. Nous comparons également les efficacités des estimateurs obtenus pour divers choix de la densité instrumentale h. Lorsque l'innovation n'est pas de densité h, la première étape donne généralement un estimateur biaisé de paramètre de volatilité et la seconde étape donne aussi un estimateur biaisé du quantile des innovations. Cependant, nous montrons que les deux erreurs se contrebalancent pour donner une estimation consistante de la VaR. Nous nous concentrons ensuite sur l'estimation de la VaR dans le cadre de modèles GARCH en utilisant le gQMLE fondé sur la classe des densités instrumentales double gamma généralisées qui contient la distribution gaussienne. Notre objectif est de comparer la performance du QMLE gaussien par rapport à celle du gQMLE. Le choix de l'estimateur optimal dépend essentiellement du paramètre d qui minimise la variance asymptotique. Nous testons si le paramètre d qui minimise la variance asymptotique est égal à 2. Lorsque le test est appliqué sur des séries réelles de rendements financiers, l'hypothèse stipulant l'optimalité du QMLE gaussien est généralement rejetée. 
Finalement, nous considérons les méthodes non-paramétriques d'apprentissage automatique pour estimer la VaR. Ces méthodes visent à s'affranchir du risque de modèle car elles ne reposent pas sur une forme spécifique de la volatilité. Nous utilisons la technique des machines à vecteurs de support pour la régression (SVR) basée sur la fonction de perte moindres carrés (en anglais LS). Pour améliorer la solution du modèle LS-SVR nous utilisons les modèles LS-SVR pondérés et LS-SVR de taille fixe. Des illustrations numériques mettent en évidence l'apport des modèles proposés pour estimer la VaR en tenant compte des risques de spécification et d'estimation. / In this thesis, we study the problem of conditional Value at Risk (VaR) estimation, taking into account estimation risk and model risk. First, we consider a two-step method for VaR estimation. The first step estimates the volatility parameter using a generalized quasi-maximum likelihood estimator (gQMLE) based on an instrumental density h. The second step estimates a quantile of the innovations from the empirical quantile of the residuals obtained in the first step. We give conditions under which the two-step estimator of the VaR is consistent and asymptotically normal. We also compare the efficiencies of the estimators for various instrumental densities h. When the innovations do not have density h, the first step usually gives a biased estimator of the volatility parameter and the second step gives a biased estimator of the quantile of the innovations. However, we show that the two errors counterbalance each other to give a consistent estimate of the VaR. We then focus on VaR estimation within the framework of GARCH models using the gQMLE based on a class of instrumental densities, called double generalized gamma, which contains the Gaussian distribution. Our goal is to compare the performance of the Gaussian QMLE against that of the gQMLE. 
The choice of the optimal estimator depends on the value of the parameter d that minimizes the asymptotic variance. We test whether this parameter is equal to 2. When the test is applied to real series of financial returns, the hypothesis stating the optimality of the Gaussian QMLE is generally rejected. Finally, we consider non-parametric machine learning models for VaR estimation. These methods are designed to eliminate model risk because they are not based on a specific form of volatility. We use the support vector machine model for regression (SVR) based on the least squares (LS) loss function. To improve the solution of the LS-SVR model, we use the weighted LS-SVR and the fixed-size LS-SVR models. Numerical illustrations highlight the contribution of the proposed models to VaR estimation, taking into account the risks of specification and estimation. Estimateur efficace Modèles d'apprentissage automatique Modèles GARCH Risque d'estimation Risque de mauvaise spécification Risque de modèle VaR conditionnelle Conditional VaR Efficient estimator Estimation risk GARCH models Generalized Quasi maximum likelihood Machine learning models Misspecification risk Model risk
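The two-step structure described in the abstract above (volatility first, then an empirical quantile of the residuals) can be sketched as follows. This is a simplified illustration, not the thesis's estimator. It assumes a GARCH(1,1) volatility recursion whose parameters `omega`, `alpha`, and `beta` are already given, so the gQMLE estimation in step one is skipped. All function names are hypothetical.

```python
import math


def garch11_volatility(returns, omega, alpha, beta):
    """Filter conditional volatilities under GARCH(1,1) with given parameters.

    Assumes alpha + beta < 1 so the unconditional variance exists; it is
    used as the starting value. Returns the in-sample volatilities and the
    one-step-ahead volatility.
    """
    variance = omega / (1.0 - alpha - beta)
    sigmas = []
    for r in returns:
        sigmas.append(math.sqrt(variance))
        variance = omega + alpha * r * r + beta * variance
    return sigmas, math.sqrt(variance)


def empirical_quantile(values, level):
    """Order statistic of rank ceil(level * n): a simple empirical quantile."""
    ordered = sorted(values)
    rank = max(1, math.ceil(level * len(ordered)))
    return ordered[rank - 1]


def two_step_var(returns, omega, alpha, beta, level=0.05):
    """Step 1: volatilities. Step 2: residual quantile. VaR = -sigma * quantile."""
    sigmas, next_sigma = garch11_volatility(returns, omega, alpha, beta)
    residuals = [r / s for r, s in zip(returns, sigmas)]
    return -next_sigma * empirical_quantile(residuals, level)
```

Because the VaR is the product of the volatility and the residual quantile, a scale error in the first step is offset by the opposite error in the second, which is the compensation effect the abstract reports.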
47	Extraction of mobility information through heterogeneous data fusion : a multi-source, multi-scale, and multi-modal problem / Fusion de données hétérogènes pour l'extraction d'informations de mobilité : un problème multi-source, multi-échelle, et multi-modal Thuillier, Etienne 11 December 2017 (has links) Aujourd'hui c'est un fait, nous vivons dans un monde où les enjeux écologiques, économiques et sociétaux sont de plus en plus pressants. Au croisement des différentes lignes directrices envisagées pour répondre à ces problèmes, une vision plus précise de la mobilité humaine est un axe central et majeur, qui a des répercussions sur tous les domaines associés tels que le transport, les sciences sociales, l'urbanisme, les politiques d'aménagement, l'écologie, etc. C'est par ailleurs dans un contexte de contraintes budgétaires fortes que les principaux acteurs de la mobilité sur les territoires cherchent à rationaliser les services de transport, et les déplacements des individus. La mobilité humaine est donc un enjeu stratégique aussi bien pour les collectivités locales que pour les usagers, qu'il faut savoir observer, comprendre, et anticiper. Cette étude de la mobilité passe avant tout par une observation précise des déplacements des usagers sur les territoires. Aujourd'hui les acteurs de la mobilité se tournent principalement vers l'utilisation massive des données utilisateurs. L'utilisation simultanée de données multi-sources, multi-modales, et multi-échelles permet d'entrevoir de nombreuses possibilités, mais cette dernière présente des défis technologiques et scientifiques majeurs. Les modèles de mobilité présentés dans la littérature sont ainsi trop souvent axés sur des zones d'expérimentation limitées, en utilisant des données calibrées, etc. et leur application dans des contextes réels, et à plus large échelle est donc discutable. 
Nous identifions ainsi deux problématiques majeures qui permettent de répondre à ce besoin d'une meilleure connaissance de la mobilité humaine, mais également à une meilleure application de cette connaissance. La première problématique concerne l'extraction d'informations de mobilité à partir de la fusion de données hétérogènes. La seconde problématique concerne la pertinence de cette fusion dans un contexte réel, et à plus large échelle. Nous apportons différents éléments de réponses à ces problématiques dans cette thèse. Tout d'abord en présentant deux modèles de fusion de données, qui permettent une extraction d'informations pertinentes. Puis, en analysant l'application de ces deux modèles au sein du projet ANR Norm-Atis. Dans cette thèse, nous suivons finalement le développement de toute une chaine de processus. En commençant par une étude de la mobilité humaine, puis des modèles de mobilité, nous présentons deux modèles de fusion de données, et nous analysons leur pertinence dans un cas concret. Le premier modèle que nous proposons permet d'extraire 12 comportements types de mobilité. Il est basé sur un apprentissage non-supervisé de données issues de la téléphonie mobile. Nous validons nos résultats en utilisant des données officielles de l'INSEE, et nous déduisons de nos résultats, des comportements dynamiques qui ne peuvent pas être observés par les données de mobilité traditionnelles. Ce qui est une forte valeur-ajoutée de notre modèle. Le second modèle que nous proposons permet une désagrégation des flux de mobilité en six motifs de mobilité. Il se base sur un apprentissage supervisé des données issues d'enquêtes de déplacements ainsi que des données statiques de description du sursol. Ce modèle est appliqué par la suite aux données agrégées au sein du projet Norm-Atis. Les temps de calculs sont suffisamment performants pour permettre une application de ce modèle dans un contexte temps-réel. 
/ We live in a world where ecological, economic, and societal issues are increasingly pressing. At the crossroads of the various guidelines envisaged to address these problems, a more accurate view of human mobility is a central concern, with repercussions for all related fields such as transport, social sciences, urban planning, land-use policy, and ecology. It is also in a context of strong budgetary constraints that the main mobility stakeholders in a territory seek to rationalize transport services and the movements of individuals. Human mobility is therefore a strategic challenge both for local authorities and for users, and it must be observed, understood, and anticipated. The study of mobility relies above all on precise observation of how users move within a territory. Mobility stakeholders now turn mainly to the massive use of user data. The simultaneous use of multi-source, multi-modal, and multi-scale data opens many possibilities, but it also presents major technological and scientific challenges. The mobility models presented in the literature are too often focused on limited experimental areas and rely on calibrated data, so their application in real contexts and on a larger scale is questionable. We thus identify two major issues that must be addressed to meet the need for better knowledge of human mobility and for better application of that knowledge. The first issue concerns the extraction of mobility information from heterogeneous data fusion. The second concerns the relevance of this fusion in a real context and on a larger scale. This dissertation addresses both: the first through two data fusion models that allow the extraction of mobility information, and the second through the application of these fusion models within the ANR Norm-Atis project. In this thesis, we follow the development of a whole chain of processes. 
Starting with a study of human mobility, and then of mobility models, we present two data fusion models and analyze their relevance in a concrete case. The first model we propose extracts 12 typical mobility behaviors. It is based on unsupervised learning from mobile phone data. We validate our results using official data from INSEE, and we infer from them dynamic behaviors that cannot be observed through traditional mobility data, which is a strong added value of our model. The second model decomposes mobility flows into six mobility purposes. It is based on supervised learning from mobility survey data and static land-use data. This model is then applied to the aggregated data within the Norm-Atis project. Computing times are short enough to allow this model to be applied in a real-time context. Fusion de données hétérogènes Mobilité humaine Modèles d'apprentissage Intelligent Transportation System Application pratique Call Detail Records Heterogeneous data fusion Human mobility Learning models Intelligent Transportation System Call Detail Records Practical application 304.8 620
48	Modelli di distribuzione della dimensione di impresa per i settori manifatturieri italiani: il problema della regolarità statistica e relative implicazioni economiche / Modelling Firm Size Distribution of Italian Manufacturing Industries: the Puzzle of Statistical Regularity and Related Economic Implications CROSATO, LISA 13 July 2007 (has links) Questo lavoro studia la distribuzione della dimensione d'impresa sulla base di due datasets. Il primo è l'indagine micro1 di istat, che include tutte le imprese manifatturiere con più di 20 addetti sopravvissute dal 1989 al 1997. Il secondo è il file Cerved riguardante l'universo delle imprese del settore meccanico (atecodk29), dal 1997 al 2002. Lo scopo generale della tesi è quello di esplorare la possibilità di trovare nuove regolarità empiriche riguardanti la distribuzione della dimensione d'impresa, sulla base della passata evidenza empirica che attesta la (in)capacità di Lognormale e Pareto di modellare in modo soddisfacente la dimensione d'impresa nell'intero arco dimensionale. Vengono per questo proposti due modelli mai utilizzati prima. Gli stessi vengono poi convalidati su differenti variabili dimensionali e a diversi livelli di aggregazione. La tesi cerca anche di esplicitare al meglio le implicazioni economiche dei modelli parametrici di distribuzione adottati secondo diversi punti di vista. / The present work studies the firm size distribution of Italian manufacturing industries on the basis of two datasets. The first is the Micro1 survey carried out by ISTAT, which recorded all manufacturing firms with 20 or more employees that survived from 1989 to 1997. The second is the Cerved file covering all firms in the mechanical sector (DK29) from 1997 to 2002. 
The general aim of this research is to explore the possibility of finding new empirical regularities in the size distribution of firms, building on past evidence about the (in)capacity of the Lognormal and Pareto distributions to model the whole size range satisfactorily. Two statistical models not previously used for this purpose are proposed and validated on different size proxies and at different levels of data aggregation. The thesis also addresses the economic implications of the parametric models of firm size distribution from several points of view. SECS-S/03: STATISTICA ECONOMICA
49	Effects of Fear Conditioning on Pain : Moderation by Mindfulness and the HPA-axis Taylor, Véronique 04 1900 (has links) No description available. fear conditioning reinforcement learning models nociceptive flexion reflex mindfulness meditation cortisol pain Conditionnement à la peur Réflexe nociceptif de flexion Méditation pleine conscience Douleur
50	Prožitek na táborech / Experience at camps ZICHA, Miroslav January 2017 (has links) This thesis deals with the role of experiences at camps for children and youth and the possibilities of subsequent work with these experiences. The theoretical part describes and differentiates between the Czech terms prožitek, zážitek and zkušenost (experience). Special attention is given to the process of experience-based education and six basic experiential learning models are specified. Feedback and targeted feedback are also mentioned with special focus being given to facilitation, its rules and techniques. The thesis also informs about the connection between work with feedback and dramaturgy. Qualitative research finds out whether camp leaders work with the participants' experiences, clarifies the phenomena (not) leading to this fact, describes their experience with the form of processing the experiences and determines if they have some theoretical knowledge in this field.
