41 |
Applications of Formal Explanations in ML. Smyrnioudis, Nikolaos. January 2023.
The most performant Machine Learning (ML) classifiers have been labeled black boxes due to the complexity of their decision process. eXplainable Artificial Intelligence (XAI) methods aim to alleviate this issue by crafting an interpretable explanation for a model's prediction. A drawback of most XAI methods is that they are heuristic, with issues such as non-determinism and locality. Formal Explanations (FE) have been proposed as a way to explain the decisions of classifiers by extracting a set of features that guarantees the prediction. In this thesis we explore these guarantees for different use cases: speeding up the inference of tree-based ML classifiers, curriculum learning using said classifiers, and reducing training data. We find that under the right circumstances we can achieve up to a 6x speedup by partially compiling the model to a set of rules extracted using formal explainability methods.
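The rule-extraction idea can be illustrated with a small sketch (this is not the thesis's actual formal-explanation algorithm, which computes minimal sufficient feature sets; here the root-to-leaf path of a single decision tree stands in for the extracted rule, since that path already guarantees the prediction for any input satisfying it):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

def path_rule(tree, x):
    """Collect the (feature, op, threshold) tests on x's root-to-leaf path.

    The conjunction of these tests guarantees the prediction; formal
    explanation methods go further and minimize such rules, which this
    sketch does not attempt.
    """
    t = tree.tree_
    node, rule = 0, []
    while t.children_left[node] != -1:  # -1 marks a leaf in sklearn trees
        f, thr = t.feature[node], t.threshold[node]
        if x[f] <= thr:
            rule.append((f, "<=", thr))
            node = t.children_left[node]
        else:
            rule.append((f, ">", thr))
            node = t.children_right[node]
    return rule, int(t.value[node].argmax())

rule, pred = path_rule(clf, X[0])
```

A cache of such rules can serve predictions without traversing the full model, which is the intuition behind the reported speedup from partial compilation.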
|
42 |
Assessing Viability of Open-Source Battery Cycling Data for Use in Data-Driven Battery Degradation Models. Ritesh Gautam (17582694). 08 December 2023.
<p dir="ltr">Lithium-ion batteries are increasingly used to provide power for systems ranging from common cell phones and laptops to advanced electric automotive and aircraft vehicles. However, as is the case for all battery types, lithium-ion batteries are prone to naturally occurring degradation phenomena that limit their effective use in these systems to a finite amount of time. This degradation is driven by many variables and conditions, including environmental conditions, physical stress/strain on the body of the battery cell, and charge/discharge parameters and cycling. Accurately and reliably predicting this degradation behavior is crucial for any party looking to implement and use battery-powered systems. However, due to the complicated non-linear multivariable processes that affect battery degradation, this can be difficult to achieve. Compared to traditional methods of battery degradation prediction and modeling, such as equivalent circuit models and physics-based electrochemical models, data-driven machine learning tools have been shown to handle predicting and classifying the complex nature of battery degradation without requiring prior knowledge of the physical systems they describe.</p><p dir="ltr">One of the most critical steps in developing these data-driven neural network algorithms is data procurement and preprocessing. Without large amounts of high-quality data, no matter how advanced and accurate the architecture, the resulting prediction tool will not be as effective as one trained on vast quantities of high-quality data. This work aims to gather battery degradation data from a wide variety of sources and studies, examine how the data was produced, test the effectiveness of the data in the Interfacial Multiphysics Laboratory's autoencoder-based neural network tool CD-Net, and analyze the results to determine the factors that make battery degradation datasets perform better in machine learning/deep learning tools. This work also relates its findings to other data-driven models by comparing the CD-Net model's performance with the publicly available BEEP (Battery Evaluation and Early Prediction) ElasticNet model. The reported accuracies and prediction models from the CD-Net and ElasticNet tools demonstrate that larger datasets with actively selected training/testing designations and fewer errors in the data produce much higher-quality neural networks that are more reliable in estimating the state of health of lithium-ion battery systems. The results also demonstrate that, when attempting to create a generalized prediction model applicable to multiple forms of battery cells and applications, data-driven models are much less effective when trained on data spanning multiple cell chemistries, form factors, and cycling conditions than on more congruent datasets.</p>
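The flavor of the ElasticNet comparison can be sketched on synthetic data (the feature names and coefficients below are invented for illustration; the actual BEEP and CD-Net pipelines fit real cycling data):

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic per-cell features loosely mimicking early-cycle degradation statistics
n = 200
X = rng.normal(size=(n, 4))  # e.g. fade slope, capacity variance, IR rise, mean temp
cycle_life = 500 + X @ np.array([80.0, -40.0, -25.0, 10.0]) + rng.normal(scale=20.0, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, cycle_life, random_state=0)
model = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X_tr, y_tr)
r2 = model.score(X_te, y_te)
```

The held-out R² is the kind of accuracy figure compared between the two tools; on real mixed-chemistry data such scores degrade, as the abstract notes.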
|
43 |
Analysis of tools for interpreting machine learning models when analyzing tabular data (master's thesis). Babiy, I. N. January 2023.
The purpose of the work is to analyze tools for interpreting machine learning models and their practical application for interpreting the results of machine learning models when analyzing tabular data. The object of study is tools for interpreting machine learning models. Research methods: theoretical analysis of the literature on the research topic, study of the documentation of machine learning libraries, classification of the methods under study, and an experimental component including exploratory data analysis, training machine learning models, applying interpretation, and summarizing and comparing the results obtained. Results of the work: a review and practical guide to interpreting machine learning results on tabular data has been prepared. The final qualifying work was completed in the text editor Microsoft Word and is presented in hard copy.
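One widely used interpretation tool for tabular data can be shown in a few lines; permutation importance from scikit-learn is used here purely as an illustration of the category of methods such a guide covers, not as the thesis's specific tool set:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = load_breast_cancer(return_X_y=True)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Permutation importance: the drop in score when a single feature is shuffled
result = permutation_importance(clf, X, y, n_repeats=5, random_state=0)
top3 = np.argsort(result.importances_mean)[::-1][:3]
```

Features whose shuffling hurts the score most are the ones the model relies on, which is the basic question all such interpretation libraries answer.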
|
44 |
<b>ENERGY CONSERVATION THROUGH INTERNET-OF-THING FRAMEWORK</b>. Da Chun Wu (20391771). 06 December 2024.
<p dir="ltr">Improving the energy efficiency of buildings and manufacturing plants involves a continuous cycle of real-time monitoring, analysis, decision-making, action, and assessment, which are essential components of a smart manufacturing approach. Achieving this requires a comprehensive platform that integrates data storage and sharing; incorporates models to interpret sensor data and algorithms to analyze it; provides actionable options with projected benefits and trade-offs; executes selected actions and evaluates their outcomes; and retains knowledge for ongoing enhancement. However, many commercially available solutions are designed for large-scale institutions, making them expensive and requiring significant customization by specialized professionals, which limits accessibility for smaller companies and building owners. This research aims to address these limitations by developing an IoT-based platform that integrates all essential functions while remaining affordable and user-friendly for small and medium-sized businesses and individual building owners. The platform supports the seamless integration of sensors, software, hardware models, decision-making algorithms, actuators, and a structured knowledge repository, with data communication and sharing managed via the internet through cloud services to ensure accessibility and flexibility. The platform was applied in real-world settings to verify its performance and usability, focusing on three core implementations to establish an advanced energy management framework. The first implementation involved ventilation optimization using IoT sensors to monitor parameters such as temperature, differential pressure, and airflow, combined with neural networks to predict system behavior under varying conditions. A genetic algorithm was used to identify optimal operational settings for make-up air units, ensuring energy-efficient ventilation while maintaining indoor air quality and temperature standards. 
This approach resulted in a 20% reduction in annual ventilation energy consumption and a 60% decrease in power demand on weekends. The second implementation focused on optimizing compressed air system pressure settings, addressing the high energy intensity and inefficiencies caused by leaks, heat, and pressure drops. IoT-enabled sensors captured real-time data on pressure, flow, and power consumption, which were analyzed using machine learning models, achieving a 7% energy saving for every 1 bar reduction in pressure. The final implementation addressed the detection of unwanted air demand using unsupervised k-means classification to distinguish between normal operating hours and non-operating periods. Unexpected air usage patterns during non-operating hours were identified and analyzed through histogram and heatmap techniques, enabling corrective measures that saved approximately 393 kWh weekly, equivalent to 10% of the compressor’s weekly energy consumption. The innovation of this study lies in the integration of model-based intelligence within an IoT system, enhancing real-time energy management capabilities and enabling continuous learning and improvements in operational efficiency. This work demonstrates how advanced IoT frameworks can bridge the gap between energy efficiency and practicality for smaller enterprises, fostering sustainable and cost-effective operations.</p>
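The unwanted-demand detection step can be sketched with k-means on a synthetic weekly flow profile (the flow magnitudes and shift pattern below are invented; the study analyzed real compressor data with histogram and heatmap techniques):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Hourly compressed-air flow for one week: high during weekday shifts, low otherwise
hours = np.arange(168)
operating = (hours % 24 >= 8) & (hours % 24 < 18) & (hours < 120)
flow = np.where(operating, rng.normal(100, 5, 168), rng.normal(10, 3, 168))
flow[150] = 95.0  # inject an abnormal night-time draw (e.g. a stuck-open valve)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(flow.reshape(-1, 1))
high = km.cluster_centers_.argmax()            # cluster with the larger mean flow
flagged = ~operating & (km.labels_ == high)    # high demand outside operating hours
```

Hours assigned to the high-demand cluster outside scheduled operation are exactly the "unexpected air usage patterns" that corrective measures then target.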
|
45 |
INVESTIGATING DATA ACQUISITION TO IMPROVE FAIRNESS OF MACHINE LEARNING MODELS. Ekta (18406989). 23 April 2024.
<p dir="ltr">Machine learning (ML) algorithms are increasingly used in a variety of applications and are heavily relied upon to make decisions that impact people's lives. ML models are often praised for their precision, yet they can discriminate against certain groups due to biased data. These biases, rooted in historical inequities, pose significant challenges in developing fair and unbiased models. Central to addressing this issue is the mitigation of biases inherent in the training data, as their presence can yield unfair and unjust outcomes when models are deployed in real-world scenarios. This study investigates the efficacy of data acquisition, one of the stages of data preparation, as a pre-processing bias mitigation technique. Through experimental evaluation, we showcase the effectiveness of data acquisition, where data is acquired using data valuation techniques to enhance the fairness of machine learning models.</p>
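A toy version of the idea, with everything invented for illustration (the protected attribute, the fairness metric, and a crude marginal-contribution stand-in for proper data-valuation methods), might look like this:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 400
g = rng.integers(0, 2, n)  # binary protected attribute
X = np.column_stack([rng.normal(size=n) + 0.7 * g, rng.normal(size=n)])
y = (X[:, 0] + rng.normal(scale=0.3, size=n) > 0.5).astype(int)

def parity_gap(model, X, g):
    """Absolute difference in positive-prediction rates between the two groups."""
    p = model.predict(X)
    return abs(p[g == 0].mean() - p[g == 1].mean())

base = LogisticRegression(max_iter=1000).fit(X, y)
gap = parity_gap(base, X, g)

# Toy "value" of a candidate batch: how much acquiring it changes the gap
X_new = np.column_stack([rng.normal(size=50) - 0.7, rng.normal(size=50)])
y_new = np.ones(50, dtype=int)  # extra positives drawn from group 0's region
retrained = LogisticRegression(max_iter=1000).fit(
    np.vstack([X, X_new]), np.concatenate([y, y_new]))
value = gap - parity_gap(retrained, X, g)
```

Real data-valuation techniques assign such marginal values far more carefully (e.g. over many subsets), but the acquire-retrain-measure loop is the same.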
|
46 |
FEDERATED LEARNING AMIDST DYNAMIC ENVIRONMENTS. Bhargav Ganguly (19119859). 08 November 2024.
<p dir="ltr">Federated Learning (FL) is a prime example of a large-scale distributed machine learning framework that has emerged as a result of the exponential growth in data generation and processing capabilities on smart devices. This framework enables the efficient processing and analysis of vast amounts of data, leveraging the collective power of numerous devices to achieve unprecedented scalability and performance. In the FL framework, each end-user device trains a local model using its own data. Through the periodic synchronization of local models, FL achieves a global model that incorporates the insights from all participating devices. This global model can then be used for various applications, such as predictive analytics, recommendation systems, and more.</p><p dir="ltr">Despite its potential, traditional FL frameworks face significant hurdles in real-world applications. These challenges stem from two primary issues: the dynamic nature of data distributions and the efficient utilization of network resources in diverse settings. Traditional FL frameworks often rely on the assumption that data distributions remain stationary over time. However, real-world environments are inherently dynamic, with data distributions constantly evolving, which in turn becomes a potential source of <i>temporal</i> heterogeneity in FL. Another significant challenge is the efficient use of network resources in heterogeneous settings. Real-world networks consist of devices with varying computational capabilities, communication protocols, and network conditions. Traditional FL frameworks often struggle to adapt to these diverse <i>spatially</i> heterogeneous settings, leading to inefficient use of network resources and increased latency.</p><p dir="ltr">The primary focus of this thesis is to investigate algorithmic frameworks that can mitigate the challenges posed by <i>temporal</i> and <i>spatial</i> system heterogeneities in FL. One of the significant sources of <i>temporal</i> heterogeneity in FL is the dynamic drifting of client datasets over time, whereas <i>spatial</i> heterogeneities broadly subsume the diverse computational capabilities and network conditions of devices in a network. We introduce two novel FL frameworks: MASTER-FL, which addresses model staleness in the presence of <i>temporally</i> drifting datasets, and Cooperative Edge-Assisted Dynamic Federated Learning (CE-FL), which manages both <i>spatial</i> and <i>temporal</i> heterogeneities in extensive hierarchical FL networks. MASTER-FL is specifically designed to ensure that the global model remains accurate and up-to-date even in environments characterized by rapidly changing datasets. CE-FL, on the other hand, leverages server-side computing capabilities, intelligent data offloading, floating aggregation, and cooperative learning strategies to manage the diverse computational capabilities and network conditions often associated with modern FL systems.</p>
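The baseline that frameworks like these build on is federated averaging; a minimal sketch (plain FedAvg on synthetic heterogeneous clients, not MASTER-FL or CE-FL themselves) shows the local-train-then-average loop:

```python
import numpy as np

rng = np.random.default_rng(0)

def local_step(w, X, y, lr=0.1, epochs=5):
    """A few epochs of least-squares gradient descent on one client's data."""
    for _ in range(epochs):
        w = w - lr * 2 * X.T @ (X @ w - y) / len(y)
    return w

# Three clients whose data is shifted away from a shared linear model
true_w = np.array([2.0, -1.0])
clients = []
for shift in (0.0, 0.3, -0.3):
    X = rng.normal(size=(50, 2))
    y = X @ (true_w + shift) + rng.normal(scale=0.05, size=50)
    clients.append((X, y))

w = np.zeros(2)
for _ in range(30):                              # synchronization rounds
    local_models = [local_step(w, X, y) for X, y in clients]
    w = np.mean(local_models, axis=0)            # FedAvg aggregation
```

Temporal heterogeneity corresponds to the per-client shifts changing between rounds, and spatial heterogeneity to clients running unequal numbers of local epochs; both break the clean averaging step above, which is the gap the proposed frameworks address.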
|
47 |
Rare Events Predictions with Time Series Data. Eriksson, Jonas; Kuusela, Tuomas. January 2024.
This study aims to develop models for predicting rare events, specifically elevated intracranial pressure (ICP) in patients with traumatic brain injury (TBI). Using time-series data of ICP, we created and evaluated several machine learning models, including K-Nearest Neighbors, Random Forest, and logistic regression, in order to predict ICP levels exceeding 20 mmHg, a critical threshold for medical intervention. The time-series data was segmented and transformed into a tabular format, with feature engineering applied to extract meaningful statistical characteristics. We framed the problem as a binary classification task, focusing on whether ICP levels exceeded the 20 mmHg threshold, and identified the optimal model by comparing the predictive performance of the algorithms. All models demonstrated good performance for predictions up to 30 minutes in advance, after which a significant decline in performance was observed. Within this timeframe, the models achieved Matthews Correlation Coefficient (MCC) scores between 0.876 and 0.980, with the Random Forest models performing best. In contrast, logistic regression showed a notable deviation at the 40-minute mark, recording an MCC score of 0.752. These results highlight the potential to provide reliable, real-time predictions of dangerous ICP levels up to 30 minutes in advance, which is crucial for timely and effective medical intervention.
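The segment-into-tabular-features pipeline can be sketched on a synthetic series (the signal shape, window sizes, and features below are invented; the study used real ICP recordings and richer feature engineering):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import matthews_corrcoef

rng = np.random.default_rng(0)
# Synthetic ICP-like series: slow oscillation around 15 mmHg plus noise
t = np.arange(5000)
icp = 15 + 8 * np.sin(t / 300) + rng.normal(scale=0.5, size=t.size)

win, horizon = 60, 30  # summarize a 60-sample window, predict 30 samples ahead
feats, labels = [], []
for i in range(t.size - win - horizon):
    seg = icp[i:i + win]
    feats.append([seg.mean(), seg.std(), seg[-1], seg[-1] - seg[0]])
    labels.append(int(icp[i + win + horizon] > 20))  # binary: exceeds 20 mmHg?
X, y = np.array(feats), np.array(labels)

split = int(0.7 * len(y))  # chronological split to avoid leakage
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X[:split], y[:split])
mcc = matthews_corrcoef(y[split:], clf.predict(X[split:]))
```

MCC is the natural score here because threshold exceedances are imbalanced, which plain accuracy would mask.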
|
48 |
Utilizing deep learning models to detect log anomalies during software development (master's thesis). Divenko, A. S. January 2024.
This work focuses on applying deep learning models to the problem of log anomaly detection during software development. A test bench was developed to simulate the software development process, on which synthetic log data were generated from various services. Combining heterogeneous logs made it possible to create a realistic dataset with hidden dependencies for a more challenging anomaly detection task. The DeepLog, LogAnomaly, and LogBERT deep learning models were applied to the resulting dataset. For each model, training was performed and anomaly detection performance was evaluated on a test set. The developed bench can be extended and used for further research into applying deep learning to log anomaly detection during software development.
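The log-key idea behind models like DeepLog (predict the likely next log key and flag transitions that fall outside it) can be sketched with a count-based stand-in; here a simple bigram model replaces the LSTM, and the log keys are invented:

```python
from collections import Counter, defaultdict

# Toy log-key sequences; counts chosen so the "normal" transitions dominate
normal = (["open read write close"] * 12
          + ["open read read close"] * 6
          + ["open write close"] * 4)
bigrams = defaultdict(Counter)
for line in normal:
    keys = line.split()
    for a, b in zip(keys, keys[1:]):
        bigrams[a][b] += 1

def is_anomalous(seq, k=2):
    """Flag a sequence if any transition falls outside the top-k next keys."""
    keys = seq.split()
    for a, b in zip(keys, keys[1:]):
        topk = [key for key, _ in bigrams[a].most_common(k)]
        if b not in topk:
            return True
    return False
```

The deep models do the same membership test against a learned next-key distribution over sliding windows, rather than raw bigram counts.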
|
49 |
Estimation and misspecification risks in VaR estimation. Telmoudi, Fedya. 19 December 2014.
In this thesis, we study the problem of conditional Value at Risk (VaR) estimation taking into account estimation risk and model risk. First, we consider a two-step method for VaR estimation. The first step estimates the volatility parameter using a generalized quasi-maximum likelihood estimator (gQMLE) based on an instrumental density h. The second step estimates a quantile of the innovations from the empirical quantile of the residuals obtained in the first step. We give conditions under which the two-step estimator of the VaR is consistent and asymptotically normal. We also compare the efficiencies of the estimators for various instrumental densities h. When the density of the innovations is not h, the first step usually gives a biased estimator of the volatility parameter and the second step a biased estimator of the quantile of the innovations. However, we show that the two errors counterbalance each other to give a consistent estimate of the VaR. We then focus on VaR estimation within the framework of GARCH models, using the gQMLE based on a class of instrumental densities called double generalized gamma, which contains the Gaussian distribution. Our goal is to compare the performance of the Gaussian QMLE against the gQMLE. The choice of the optimal estimator depends essentially on the parameter d that minimizes the asymptotic variance; we test whether this parameter is equal to 2. When the test is applied to real series of financial returns, the hypothesis stating the optimality of the Gaussian QMLE is generally rejected. Finally, we consider non-parametric machine learning methods for VaR estimation. These methods aim to eliminate model risk because they do not rely on a specific form of the volatility. We use support vector regression (SVR) based on the least-squares loss function (LS). To improve the solution of the LS-SVR model, we use weighted LS-SVR and fixed-size LS-SVR models. Numerical illustrations highlight the contribution of the proposed models for VaR estimation taking into account specification and estimation risks.
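The two-step construction can be sketched numerically; here a rolling-window standard deviation stands in for the first step's gQMLE volatility fit (the real method fits a GARCH model), while the second step is the empirical quantile of standardized residuals, as described:

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated returns r_t = sigma_t * eta_t with slowly varying volatility
n = 4000
sigma = 0.01 * (1 + 0.5 * np.sin(np.arange(n) / 200))
eta = rng.standard_t(df=5, size=n) / np.sqrt(5 / 3)  # unit-variance innovations
r = sigma * eta

# Step 1: rolling-window volatility estimate (stand-in for the gQMLE fit)
win = 100
sig_hat = np.array([r[t - win:t].std() for t in range(win, n)])

# Step 2: empirical alpha-quantile of the standardized residuals
alpha = 0.05
resid = r[win:] / sig_hat
q = np.quantile(resid, alpha)

# One-step-ahead 5% VaR (reported as a positive loss level) and its backtest
var_5 = -sig_hat * q
viol = (r[win:] < -var_5).mean()  # violations should occur about alpha of the time
```

The compensation property the thesis proves is visible here: even a crude volatility estimate combined with the matching empirical residual quantile yields a violation rate close to the nominal level.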
|
50 |
Extraction of mobility information through heterogeneous data fusion: a multi-source, multi-scale, and multi-modal problem. Thuillier, Etienne. 11 December 2017.
Today it is a fact that we live in a world where ecological, economic, and societal issues are increasingly pressing. At the crossroads of the various guidelines envisaged to address these problems, a more accurate vision of human mobility is a central and major axis, with repercussions on all related fields such as transport, social sciences, urban planning, management policies, ecology, etc. It is also in a context of strong budgetary constraints that the main mobility actors in the territories seek to rationalize transport services and the movements of individuals. Human mobility is therefore a strategic challenge both for local communities and for users, and it must be observed, understood, and anticipated. The study of mobility is based above all on precise observation of users' movements across the territories. Nowadays mobility operators focus mainly on the massive use of user data. The simultaneous use of multi-source, multi-modal, and multi-scale data opens many possibilities, but it presents major technological and scientific challenges. The mobility models presented in the literature are too often restricted to limited experimental areas, using calibrated data, etc., and their application in real contexts and on a larger scale is therefore questionable. We thus identify two major issues that address this need for better knowledge of human mobility, as well as for better application of that knowledge. The first concerns the extraction of mobility information from heterogeneous data fusion. The second concerns the relevance of this fusion in a real context, and on a larger scale. These issues are addressed in this dissertation: the first through two data fusion models that allow the extraction of mobility information, the second through the application of these fusion models within the ANR Norm-Atis project. In this thesis we follow the development of a whole chain of processes. Starting with a study of human mobility and then of mobility models, we present two data fusion models and analyze their relevance in a concrete case. The first model we propose extracts 12 typical mobility behaviors. It is based on unsupervised learning of mobile phone data. We validate our results using official data from INSEE, and we infer from our results dynamic behaviors that cannot be observed through traditional mobility data, which is a strong added value of our model. The second model decomposes mobility flows into six mobility purposes. It is based on supervised learning of mobility survey data and static land-use data. This model is then applied to the aggregated data within the Norm-Atis project. The computing times are short enough to allow an application of this model in a real-time context.
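The supervised flow-disaggregation step can be sketched on synthetic data (the land-use shares, hour feature, labeling rule, and purpose set are all invented for illustration; the thesis fits its model to French mobility-survey and land-use data):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
purposes = ["home", "work", "school", "shopping", "leisure", "other"]

# Synthetic "survey": land-use shares at the destination plus hour of day
n = 1200
land_use = rng.dirichlet(np.ones(4), size=n)   # residential/office/retail/other shares
hour = rng.integers(0, 24, size=n)
dominant = land_use.argmax(axis=1)
# Toy labeling rule: declared purpose (index into `purposes`) follows dominant land use
y = np.select([dominant == 0, dominant == 1, dominant == 2],
              [0, 1, 3], default=5)            # home, work, shopping, else other
X = np.column_stack([land_use, hour])

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
acc = clf.score(X, y)
```

The trained classifier can then be applied to aggregated flows to split each origin-destination count by purpose, which is what makes real-time application a question of per-flow inference cost.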
|