21

Two New Applications of Tensors to Machine Learning for Wireless Communications

Bhogi, Keerthana 09 September 2021
With the increasing number of wireless devices and the phenomenal amount of data that they generate, there is a growing interest in the wireless communications community to complement the traditional model-driven design approaches with data-driven machine learning (ML)-based solutions. However, managing large-scale multi-dimensional data while maintaining the efficiency and scalability of ML algorithms has been a challenge. Tensors provide a useful framework to represent multi-dimensional data in an integrated manner by preserving relationships in data across different dimensions. This thesis studies two new applications of tensors to ML for wireless communications, where the tensor structure of the data concerned is exploited in novel ways. The first contribution of this thesis is a tensor learning-based low-complexity precoder codebook design technique for a full-dimension multiple-input multiple-output (FD-MIMO) system with a uniform planar antenna (UPA) array at the transmitter (Tx) whose channel distribution is available through a dataset. Represented as a tensor, the FD-MIMO channel is decomposed using a tensor decomposition technique to obtain an optimal precoder that is a function of the Kronecker product (KP) of two low-dimensional precoders, corresponding to the horizontal and vertical dimensions of the FD-MIMO channel. From the design perspective, we derive a criterion for optimal product precoder codebooks using the obtained low-dimensional precoders. We show that this product codebook design problem is an unsupervised clustering problem on a Cartesian Product Grassmann Manifold (CPM), where the optimal cluster centroids form the desired codebook. We further simplify this clustering problem to a $K$-means algorithm on the low-dimensional factor Grassmann manifolds (GMs) of the CPM, which correspond to the horizontal and vertical dimensions of the UPA, thus significantly reducing the complexity of precoder codebook construction compared to existing codebook learning techniques. The second contribution of this thesis is a tensor-based bandwidth-efficient gradient communication technique for federated learning (FL) with convolutional neural networks (CNNs). Concisely, FL is a decentralized ML approach in which distributed users, coordinated by a server, jointly train an ML model at the server by sharing only their local gradients with the server and not the raw data. Here, we focus on efficient compression and reconstruction of convolutional gradients at the users and the server, respectively. To reduce the gradient communication overhead, we compress the sparse gradients at the users into low-dimensional estimates using a compressive sensing (CS)-based technique and transmit these to the server for joint training of the CNN. We exploit a natural tensor structure offered by the convolutional gradients to demonstrate the correlation of a gradient element with its neighbors. We propose a novel prior for the convolutional gradients that captures the described spatial consistency along with their sparse nature in an appropriate way. We further propose a novel Bayesian reconstruction algorithm based on the Generalized Approximate Message Passing (GAMP) framework that exploits this prior information about the gradients. Through numerical simulations, we demonstrate that the developed gradient reconstruction method improves the convergence of the CNN model.
/ Master of Science / The increase in the number of wireless and mobile devices has led to the generation of massive amounts of multi-modal data at the users in various real-world applications, including wireless communications. This has led to an increasing interest in machine learning (ML)-based data-driven techniques for communication system design. The native setting of ML is centralized, where all the data is available on a single device. However, the distributed nature of the users and their data has also motivated the development of distributed ML techniques. Since the success of ML techniques is grounded in their data-based nature, there is a need to maintain the efficiency and scalability of the algorithms to manage the large-scale data. Tensors are multi-dimensional arrays that provide an integrated way of representing multi-modal data. Tensor algebra and tensor decompositions have enabled the extension of several classical ML techniques to tensor-based techniques in various application domains such as computer vision, data mining, image processing, and wireless communications. Tensor-based ML techniques have been shown to improve the performance of ML models because of their ability to leverage the underlying structural information in the data. In this thesis, we present two new applications of tensors to ML for wireless applications and show how the tensor structure of the data concerned can be exploited and incorporated in different ways. The first contribution is a tensor learning-based precoder codebook design technique for full-dimension multiple-input multiple-output (FD-MIMO) systems, where we develop a scheme for designing low-complexity product precoder codebooks by identifying and leveraging a tensor representation of the FD-MIMO channel. The second contribution is a tensor-based gradient communication scheme for a decentralized ML technique known as federated learning (FL) with convolutional neural networks (CNNs), where we design a novel bandwidth-efficient gradient compression-reconstruction algorithm that leverages a tensor structure of the convolutional gradients. The numerical simulations in both applications demonstrate that exploiting the underlying tensor structure in the data provides significant gains in their respective performance criteria.
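To make the Kronecker-product codebook idea in the abstract above concrete, the following Python sketch is a minimal, hypothetical illustration (not the thesis implementation): it extracts per-dimension dominant channel directions from a synthetic dataset, clusters each factor with plain K-means as a crude stand-in for Grassmannian (chordal-distance) clustering, and forms the product codebook as Kronecker products of the factor codewords. The array sizes, codebook sizes, and use of scikit-learn are assumptions made for illustration.

```python
# Illustrative sketch (not the thesis code): build a Kronecker-product precoder
# codebook by clustering per-dimension channel directions, then combining them.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

def dominant_direction(H):
    """Unit-norm dominant right singular vector of a channel matrix (rank-1 precoder)."""
    _, _, vh = np.linalg.svd(H, full_matrices=False)
    return vh[0].conj()

# Hypothetical dataset: per-sample horizontal (4-antenna) and vertical (2-antenna)
# channel factors of a UPA; in practice these would come from a channel dataset.
H_h = rng.standard_normal((1000, 8, 4)) + 1j * rng.standard_normal((1000, 8, 4))
H_v = rng.standard_normal((1000, 8, 2)) + 1j * rng.standard_normal((1000, 8, 2))

def factor_codebook(channels, n_codewords):
    # Stack real/imag parts so plain K-means can act as a crude stand-in for
    # Grassmannian clustering of the dominant directions.
    dirs = np.array([dominant_direction(H) for H in channels])
    feats = np.hstack([dirs.real, dirs.imag])
    km = KMeans(n_clusters=n_codewords, n_init=10, random_state=0).fit(feats)
    d = km.cluster_centers_.shape[1] // 2
    centers = km.cluster_centers_[:, :d] + 1j * km.cluster_centers_[:, d:]
    return centers / np.linalg.norm(centers, axis=1, keepdims=True)

W_h = factor_codebook(H_h, n_codewords=8)   # horizontal factor codebook
W_v = factor_codebook(H_v, n_codewords=4)   # vertical factor codebook

# Product codebook: every Kronecker product of a vertical and a horizontal codeword.
codebook = np.array([np.kron(wv, wh) for wv in W_v for wh in W_h])
print(codebook.shape)  # (32, 8): 32 codewords for the 4x2 UPA
```

The point mirrored here is that the product codebook grows multiplicatively from two small factor codebooks, so only the low-dimensional factors need to be learned.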
22

Prediction Models for TV Case Resolution Times with Machine Learning / Förutsägelsemodeller för TV-fall Upplösningstid med maskininlärning

Javierre I Moyano, Borja January 2023
TV distribution and streaming of video content over the Internet rely on complex networks, including Content Delivery Networks (CDNs), cables and end-point user devices, and are therefore prone to issues appearing at different levels of the network that end up affecting the final customer’s TV services. When a problem affects the customer and prevents proper TV delivery on the devices used for streaming, the issue is reported through a call, a TV case is opened and the company’s customer handling agents start supervising it to solve the problem as soon as possible. The goal of this research work is to present an ML-based solution that predicts the Resolution Times (RTs) of the TV cases in each TV delivery service type, that is, how long the cases will take to be solved. The approach taken to provide meaningful results consisted of using four Machine Learning (ML) algorithms to create 480 models for each of the two scenarios. The results revealed that Random Forest (RF) and, especially, Gradient Boosting Machine (GBM) performed exceptionally well. Surprisingly, hyperparameter tuning did not significantly improve the RT predictions as expected. Some challenges included the initial data preprocessing and some uncertainty in the hyperparameter tuning approaches. Thanks to these predicted times, the company is now able to better inform their customers of how long the problem is expected to last until it is resolved. This real case scenario also considers how the company processes the available data and manages the problem. The research work consists of, first, a literature review on the prediction of RTs of Trouble Tickets (TTs) and customer churn in telecommunication companies, as well as a study of the company’s available data for the problem. The research then focuses on analysing the dataset provided for the experimentation, preprocessing this data according to industry standards and, finally, making the predictions and analysing the obtained performance metrics. The proposed solution is designed to offer an improved resolution for the company’s specified task. Future work could involve increasing the number of TV cases per service to improve the results and exploring the link between resolution times and customer churn decisions. / TV-distribution och leverans av strömningsinnehåll via internet består av komplexa nätverk, inklusive CDNs, kablar och slutanvändarutrustning. Detta gör det känsligt för problem på olika nätverksnivåer som kan påverka slutkundens TV-tjänster. När ett problem påverkar kunden och hindrar en korrekt TV-leveranstjänst rapporteras det genom ett samtal. Ett ärende öppnas, och företagets kundhanteringsagenter övervakar det för att lösa problemet så snabbt som möjligt. Målet med detta forskningsarbete är att presentera en maskininlärningsbaserad lösning som förutsäger löstiderna (RTs) för TV-ärenden inom varje TV-leveranstjänsttyp, det vill säga hur lång tid ärendena kommer att ta att lösa. För att få meningsfulla resultat användes fyra maskininlärningsalgoritmer för att skapa 480 modeller för var och en av de två scenarierna. Resultaten visade att Random Forest (RF) och framför allt Gradient Boosting Machine (GBM) presterade exceptionellt bra. Överraskande nog förbättrade inte finjusteringen av hyperparametrar RT som förväntat. Vissa utmaningar inkluderade den initiala dataförbehandlingen och osäkerhet i metoder för hyperparametertuning.
Tack vare dessa förutsagda tider kan företaget nu bättre informera sina kunder om hur länge problemet förväntas vara olöst. Denna verkliga fallstudie tar också hänsyn till hur företaget hanterar tillgängliga data och problemet. Forskningsarbetet börjar med en litteraturgenomgång om förutsägelse av RT för Trouble Ticket (TT) och kundavhopp inom telekommunikationsföretag samt studier av företagets tillgängliga data för problemet. Därefter fokuserar forskningen på att analysera den tillhandahållna datamängden för experiment, förbehandling av datan enligt branschstandarder och till sist förutsägelser och analys av de erhållna prestandamätvärdena. Den föreslagna lösningen är utformad för att erbjuda en förbättrad lösning för företagets angivna uppgift. Framtida arbete kan innebära att öka antalet TV-ärenden per tjänst för att förbättra resultaten och utforska sambandet mellan löstider och kundavhoppbeslut.
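As a rough illustration of the kind of model the entry above reports on, the sketch below fits a gradient-boosting regressor to hypothetical TV-case records with scikit-learn. The column names, feature set, and data are invented for illustration and are not the company’s dataset or the thesis pipeline.

```python
# Illustrative sketch (not the thesis code): a gradient-boosting regressor for
# TV-case resolution times on hypothetical case records.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical TV-case data: service type, priority, opening hour, and the
# target resolution time in hours.
df = pd.DataFrame({
    "service_type": ["iptv", "cable", "stream", "iptv", "stream", "cable"] * 50,
    "priority":     ["low", "high", "medium", "high", "low", "medium"] * 50,
    "open_hour":    [8, 14, 20, 9, 22, 11] * 50,
    "resolution_hours": [5.0, 1.5, 12.0, 2.0, 30.0, 8.0] * 50,
})

X = df.drop(columns="resolution_hours")
y = df["resolution_hours"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = Pipeline([
    ("encode", ColumnTransformer(
        [("cat", OneHotEncoder(handle_unknown="ignore"), ["service_type", "priority"])],
        remainder="passthrough")),
    ("gbm", GradientBoostingRegressor(n_estimators=300, learning_rate=0.05, random_state=0)),
])
model.fit(X_train, y_train)
pred = model.predict(X_test)
print(f"MAE: {mean_absolute_error(y_test, pred):.2f} hours")
```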
23

AI inom radiologi, nuläge och framtid / AI in radiology, now and the future

Täreby, Linus, Bertilsson, William January 2023
Denna uppsats presenterar resultaten av en kvalitativ undersökning som syftar till att ge en djupare förståelse för användningen av AI inom radiologi, dess framtida påverkan på yrket och hur det används idag. Genom att genomföra tre intervjuer med personer som arbetar inom radiologi, har datainsamlingen fokuserat på att identifiera de positiva och negativa aspekterna av AI i radiologi, samt dess potentiella konsekvenser på yrket. Resultaten visar på en allmän acceptans för AI inom radiologi och dess förmåga att förbättra diagnostiska processer och effektivisera arbetet. Samtidigt finns det en viss oro för att AI kan ersätta människor och minska behovet av mänskliga bedömningar. Denna uppsats ger en grundläggande förståelse för hur AI används inom radiologi och dess möjliga framtida konsekvenser. / This essay presents the results of a qualitative study aimed at gaining a deeper understanding of the use of artificial intelligence (AI) in radiology, its potential impact on the profession and how it’s used today. By conducting three interviews with individuals working in radiology, data collection focused on identifying the positive and negative aspects of AI in radiology, as well as its potential consequences on the profession. The results show a general acceptance of AI in radiology and its ability to improve diagnostic processes and streamline work. At the same time, there is a certain concern that AI may replace humans and reduce the need for human judgments. This report provides a basic understanding of how AI is used in radiology and its possible future consequences.
24

Digital transformation: How does physician’s work become affected by the use of digital health technologies?

Schultze, Jakob January 2021
Digital transformation is evolving and is at the helm of the digital evolution. The amount of information accessible to us has revolutionized the way we gather information. Mobile technology and the immediate and ubiquitous access to information have changed how we engage with services, including healthcare. Digital technology and digital transformation have afforded people the ability to self-manage in ways that differ from face-to-face and paper-based methods. This study focuses on exploring the use of the most commonly used digital health technologies in the healthcare sector and how they affect physicians’ daily routine practice. The study presents findings from a qualitative methodology involving semi-structured, personal interviews with physicians from Sweden and a physician from Spain. The interviews capture how physicians feel towards digital transformation and digital health technologies and how these affect their work. In a field where there is a lack of information regarding how physicians’ work is affected by digital health technologies, this study reveals a general picture of how reality looks for physicians. A new way of conducting medicine and the changed role of the physician are presented, along with the societal implications for physicians and the healthcare sector. The findings demonstrate that physicians’ role and work, and the digital transformation in healthcare at a societal level, are important in shaping the future of the healthcare industry and the role of the physician in this future. / Den digitala transformationen växer och den drivs vid rodret för den digitala utvecklingen. Mängden information som är tillgänglig för oss har revolutionerat hur vi samlar in information. Mobila tekniker och den omedelbara och allmänt förekommande tillgången till information har förändrat hur vi tillhandahåller oss tjänster inklusive inom vården. Digital teknik och digital transformation har gett människor möjlighet att kontrollera sig själv och sin egen hälsa på olika sätt än ansikte mot ansikte och pappersbaserade metoder genom olika tekniker. Denna studie fokuserar på att utforska användningen av de vanligaste digitala hälsoteknologierna inom hälso- och sjukvårdssektorn och hur det påverkar läkarnas dagliga rutin. Studien presenterar resultat från en kvalitativ metod som involverar semistrukturerade, personliga intervjuer med läkare från Sverige och en läkare från Spanien. Intervjuerna fångar vad läkare tycker om digital transformation, digital hälsoteknik och hur det påverkar deras arbete. I ett fält där brist på information om hur läkare arbetar påverkas av digital hälsoteknik avslöjar denna studie en allmän aspekt av hur verkligheten ser ut för läkare. Ett nytt sätt att bedriva medicin och läkarens förändrade roll presenteras tillsammans med de samhälleliga konsekvenserna för läkare och vårdsektorn. Resultaten visar att läkarnas roll, arbete och den digitala transformationen inom hälso- och sjukvården på samhällsnivå är viktiga för att utforma framtiden för vårdindustrin och läkarens roll i framtiden.
25

Machine Learning Potentials - State of the research and potential applications for carbon nanostructures

Rothe, Tom 13 November 2019
Machine Learning interatomic potentials (ML-IAPs) are currently the most promising non-empirical IAPs for molecular dynamics (MD) simulations. They use Machine Learning (ML) methods to fit the potential energy surface (PES) with large reference datasets of atomic configurations and their corresponding properties. Promising near quantum mechanical accuracy while being orders of magnitude faster than first-principles methods, ML-IAPs are the new “hot topic” in materials science research. Unfortunately, most of the available publications require advanced knowledge about ML methods and IAPs, making them hard to understand for beginners and outsiders. This work serves as a plain introduction, providing all the required knowledge about IAPs, ML, and ML-IAPs from the beginning and giving an overview of the most relevant approaches and concepts for building such potentials. As an example, a Gaussian approximation potential (GAP) for amorphous carbon is used to simulate the defect-induced deformation of carbon nanotubes. Comparing the results with published density-functional tight-binding (DFTB) results and our own empirical IAP MD simulations shows that publicly available ML-IAPs can already be used for simulations, being indeed faster than and nearly as accurate as first-principles methods. For the future, two main challenges appear: first, the availability of ML-IAPs needs to be improved so that they can be used in the established MD codes as easily as empirical IAPs; second, an accurate characterization of the bonds represented in the reference dataset is needed to assure that a potential is suitable for a specific application, which otherwise remains a 'black-box' method.

Table of contents:
1 Introduction
2 Molecular Dynamics
2.1 Introduction to Molecular Dynamics
2.2 Interatomic Potentials
2.2.1 Development of PES
3 Machine Learning Methods
3.1 Types of Machine Learning
3.2 Building Machine Learning Models
3.2.1 Preprocessing
3.2.2 Learning
3.2.3 Evaluation
3.2.4 Prediction
4 Machine Learning for Molecular Dynamics Simulation
4.1 Definition
4.2 Machine Learning Potentials
4.2.1 Neural Network Potentials
4.2.2 Gaussian Approximation Potential
4.2.3 Spectral Neighbor Analysis Potential
4.2.4 Moment Tensor Potentials
4.3 Comparison of Machine Learning Potentials
4.4 Machine Learning Concepts
4.4.1 On the fly
4.4.2 De novo Exploration
4.4.3 PES-Learn
5 Simulation of defect induced deformation of CNTs
5.1 Methodology
5.2 Results and Discussion
6 Conclusion and Outlook
6.1 Conclusion
6.2 Outlook
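The toy sketch below illustrates, under heavy simplification, the core idea of an ML-IAP as described above: a regression model is fitted to map a descriptor of atomic configurations to reference energies. It uses kernel ridge regression on a crude sorted-inverse-distance descriptor and a synthetic Lennard-Jones-like energy as a stand-in for quantum-mechanical reference data; it is not the GAP, descriptor, or dataset used in the thesis.

```python
# Conceptual toy sketch of an ML interatomic potential: regression from a
# permutation-invariant descriptor of atomic positions to a total energy.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(1)

def descriptor(positions):
    """Sorted inverse pairwise distances: a crude, permutation-invariant descriptor."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    iu = np.triu_indices(len(positions), k=1)
    return np.sort(1.0 / dist[iu])[::-1]

def toy_energy(positions):
    """Synthetic reference energy: a Lennard-Jones-like pair sum (stand-in for DFT)."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    r = dist[np.triu_indices(len(positions), k=1)]
    return np.sum(4.0 * ((1.0 / r) ** 12 - (1.0 / r) ** 6))

# Reference dataset: 8-atom configurations (perturbed cubic lattice) and their energies.
base = 1.5 * np.array([[i, j, k] for i in range(2) for j in range(2) for k in range(2)], float)
configs = [base + 0.2 * rng.standard_normal(base.shape) for _ in range(300)]
X = np.array([descriptor(c) for c in configs])
y = np.array([toy_energy(c) for c in configs])

# Fit the "potential" on 250 configurations and test on the remaining 50.
model = KernelRidge(kernel="rbf", alpha=1e-6, gamma=0.5).fit(X[:250], y[:250])
pred = model.predict(X[250:])
print("mean abs. error:", np.mean(np.abs(pred - y[250:])))
```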
26

Advanced Data Analytics Modelling for Air Quality Assessment

Abdulkadir, Nafisah Abidemi January 2023
Air quality assessment plays a crucial role in understanding the impact of air pollution on human health and the environment. With the increasing demand for accurate assessment and prediction of air quality, advanced data analytics modelling techniques offer promising solutions. This thesis focuses on leveraging advanced data analytics to assess and analyse air pollution concentration levels in Italy at a 4 km resolution using the FORAIR_IT dataset simulated in ENEA on the CRESCO6 infrastructure, aiming to uncover valuable insights and identify the most appropriate AI models for predicting air pollution levels. The data collection, understanding, and pre-processing procedures are discussed, followed by the application of big data training and forecasting using Apache Spark MLlib. The research also encompasses different phases, including descriptive and inferential analysis to understand the air pollution concentration dataset, hypothesis testing to examine the relationship between various pollutants, machine learning prediction using several regression models and an ensemble machine learning approach, and time series analysis on the entire dataset as well as on three major regions in Italy (Northern Italy – Lombardy, Central Italy – Lazio and Southern Italy – Campania). The computation time for these regression models is also evaluated and a comparative analysis is done on the results obtained. The evaluation process and the experimental setup involve the usage of the ENEAGRID/CRESCO6 HPC Infrastructure and Apache Spark. This research has provided valuable insights into understanding air pollution patterns and improving prediction accuracy. The findings of this study have the potential to drive positive change in environmental management and decision-making processes, ultimately leading to healthier and more sustainable communities. As we continue to explore the vast possibilities offered by advanced data analytics, this research serves as a foundation for future advancements in air quality assessment in Italy, and the models are transferable to other regions and provinces in Italy, paving the way for a cleaner and greener future.
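A minimal sketch of the Apache Spark MLlib workflow mentioned above might look as follows. The pollutant and meteorological column names, the synthetic records, and the choice of a random-forest regressor are hypothetical stand-ins for the FORAIR_IT data and the models evaluated in the thesis.

```python
# Illustrative sketch (not the thesis code): regression with Spark MLlib on
# synthetic air-quality records.
import random
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import RandomForestRegressor
from pyspark.ml.evaluation import RegressionEvaluator

spark = SparkSession.builder.appName("air-quality-regression").getOrCreate()

# Synthetic hourly records (hypothetical columns): pm10, temperature, wind_speed -> no2.
random.seed(0)
rows = []
for _ in range(400):
    pm10, temp, wind = random.uniform(5, 80), random.uniform(-5, 35), random.uniform(0, 10)
    no2 = 0.5 * pm10 - 1.5 * wind + 0.2 * temp + random.gauss(0, 3)
    rows.append((pm10, temp, wind, no2))
df = spark.createDataFrame(rows, ["pm10", "temperature", "wind_speed", "no2"])

# Assemble predictor columns into a single feature vector.
assembler = VectorAssembler(inputCols=["pm10", "temperature", "wind_speed"],
                            outputCol="features")
data = assembler.transform(df).select("features", "no2")
train, test = data.randomSplit([0.8, 0.2], seed=42)

rf = RandomForestRegressor(featuresCol="features", labelCol="no2", numTrees=100)
model = rf.fit(train)

pred = model.transform(test)
rmse = RegressionEvaluator(labelCol="no2", predictionCol="prediction",
                           metricName="rmse").evaluate(pred)
print(f"RMSE: {rmse:.3f}")
spark.stop()
```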
27

Graph Matching Based on a Few Seeds: Theoretical Algorithms and Graph Neural Network Approaches

Liren Yu (17329693) 03 November 2023
<p dir="ltr">Since graphs are natural representations for encoding relational data, the problem of graph matching is an emerging task and has attracted increasing attention, which could potentially impact various domains such as social network de-anonymization and computer vision. Our main interest is designing polynomial-time algorithms for seeded graph matching problems where a subset of pre-matched vertex-pairs (seeds) is revealed. </p><p dir="ltr">However, the existing work does not fully investigate the pivotal role of seeds and falls short of making the most use of the seeds. Notably, the majority of existing hand-crafted algorithms only focus on using ``witnesses'' in the 1-hop neighborhood. Although some advanced algorithms are proposed to use multi-hop witnesses, their theoretical analysis applies only to \ER random graphs and requires seeds to be all correct, which often do not hold in real applications. Furthermore, a parallel line of research, Graph Neural Network (GNN) approaches, typically employs a semi-supervised approach, which requires a large number of seeds and lacks the capacity to distill knowledge transferable to unseen graphs.</p><p dir="ltr">In my dissertation, I have taken two approaches to address these limitations. In the first approach, we study to design hand-crafted algorithms that can properly use multi-hop witnesses to match graphs. We first study graph matching using multi-hop neighborhoods when partially-correct seeds are provided. Specifically, consider two correlated graphs whose edges are sampled independently from a parent \ER graph $\mathcal{G}(n,p)$. A mapping between the vertices of the two graphs is provided as seeds, of which an unknown fraction is correct. We first analyze a simple algorithm that matches vertices based on the number of common seeds in the $1$-hop neighborhoods, and then further propose a new algorithm that uses seeds in the $D$-hop neighborhoods. We establish non-asymptotic performance guarantees of perfect matching for both $1$-hop and $2$-hop algorithms, showing that our new $2$-hop algorithm requires substantially fewer correct seeds than the $1$-hop algorithm when graphs are sparse. Moreover, by combining our new performance guarantees for the $1$-hop and $2$-hop algorithms, we attain the best-known results (in terms of the required fraction of correct seeds) across the entire range of graph sparsity and significantly improve the previous results. We then study the role of multi-hop neighborhoods in matching power-law graphs. Assume that two edge-correlated graphs are independently edge-sampled from a common parent graph with a power-law degree distribution. A set of correctly matched vertex-pairs is chosen at random and revealed as initial seeds. Our goal is to use the seeds to recover the remaining latent vertex correspondence between the two graphs. Departing from the existing approaches that focus on the use of high-degree seeds in $1$-hop neighborhoods, we develop an efficient algorithm that exploits the low-degree seeds in suitably-defined $D$-hop neighborhoods. Our result achieves an exponential reduction in the seed size requirement compared to the best previously known results.</p><p dir="ltr">In the second approach, we study GNNs for seeded graph matching. We propose a new supervised approach that can learn from a training set how to match unseen graphs with only a few seeds. 
Our SeedGNN architecture incorporates several novel designs, inspired by our theoretical studies of seeded graph matching: 1) it can learn to compute and use witness-like information from different hops, in a way that can be generalized to graphs of different sizes; 2) it can use easily-matched node-pairs as new seeds to improve the matching in subsequent layers. We evaluate SeedGNN on synthetic and real-world graphs and demonstrate significant performance improvements over both non-learning and learning algorithms in the existing literature. Furthermore, our experiments confirm that the knowledge learned by SeedGNN from training graphs can be generalized to test graphs of different sizes and categories.
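As a concrete illustration of the 1-hop witness idea underlying the hand-crafted algorithms described above, the toy sketch below matches two synthetic correlated Erdős–Rényi graphs by counting common seeds in 1-hop neighborhoods and matching greedily. It is a simplification, not the dissertation's algorithms or their $D$-hop extensions; graph sizes and parameters are assumptions.

```python
# Illustrative sketch (not the dissertation code): 1-hop seeded graph matching.
# Each candidate pair (u, v) is scored by the number of seed pairs appearing in
# both 1-hop neighbourhoods ("witnesses"); pairs are then matched greedily.
import networkx as nx
import random

random.seed(0)
parent = nx.erdos_renyi_graph(n=200, p=0.1, seed=0)

def subsample(g, keep=0.8):
    """Independently keep each parent edge: yields a correlated child graph."""
    h = nx.Graph()
    h.add_nodes_from(g.nodes())
    h.add_edges_from(e for e in g.edges() if random.random() < keep)
    return h

g1, g2 = subsample(parent), subsample(parent)                      # correlated pair
seeds = {v: v for v in random.sample(list(parent.nodes()), 20)}    # correct seeds

def one_hop_witness_matching(g1, g2, seeds):
    scores = []
    for u in g1.nodes():
        n1 = set(g1.neighbors(u))
        for v in g2.nodes():
            n2 = set(g2.neighbors(v))
            w = sum(1 for s, t in seeds.items() if s in n1 and t in n2)
            if w > 0:
                scores.append((w, u, v))
    matching, used1, used2 = dict(seeds), set(seeds), set(seeds.values())
    for w, u, v in sorted(scores, reverse=True):      # greedy: highest witness count first
        if u not in used1 and v not in used2:
            matching[u] = v
            used1.add(u); used2.add(v)
    return matching

m = one_hop_witness_matching(g1, g2, seeds)
correct = sum(1 for u, v in m.items() if u == v)      # ground truth is the identity map
print(f"matched {len(m)} vertices, {correct} correctly")
```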
28

BRAIN-COMPUTER INTERFACE FOR SUPERVISORY CONTROLS OF UNMANNED AERIAL VEHICLES

Abdelrahman Osama Gad (17965229) 15 February 2024
<p dir="ltr">This research explored a solution to a high accident rate in remotely operating Unmanned Aerial Vehicles (UAVs) in a complex environment; it presented a new Brain-Computer Interface (BCI) enabled supervisory control system to fuse human and machine intelligence seamlessly. This study was highly motivated by the critical need to enhance the safety and reliability of UAV operations, where accidents often stemmed from human errors during manual controls. Existing BCIs confronted the challenge of trading off a fully remote control by humans and an automated control by computers. This study met such a challenge with the proposed supervisory control system to optimize human-machine collaboration, prioritizing safety, adaptability, and precision in operation.</p><p dir="ltr">The research work included designing, training, and testing BCI and the BCI-enabled control system. It was customized to control a UAV where the user’s motion intents and cognitive states were monitored to implement hybrid human and machine controls. The DJI Tello drone was used as an intelligent machine to illustrate the application of the proposed control system and evaluate its effectiveness through two case studies. The first case study was designed to train a subject and assess the confidence level for BCI in capturing and classifying the subject’s motion intents. The second case study illustrated the application of BCI in controlling the drone to fulfill its missions.</p><p dir="ltr">The proposed supervisory control system was at the forefront of cognitive state monitoring to leverage the power of an ML model. This model was innovative compared to conventional methods in that it could capture complicated patterns within raw EEG data and make decisions to adopt an ensemble learning strategy with the XGBoost. One of the key innovations was capturing the user’s intents and interpreting these into control commands using the EmotivBCI app. Despite the headset's predefined set of detectable features, the system could train the user’s mind to generate control commands for all six degrees of freedom of adapting to the quadcopter by creatively combining and extending mental commands, particularly in the context of the Yaw rotation. This strategic manipulation of commands showcased the system's flexibility in accommodating the intricate control requirements of an automated machine.</p><p dir="ltr">Another innovation of the proposed system was its real-time adaptability. The supervisory control system continuously monitors the user's cognitive state, allowing instantaneous adjustments in response to changing conditions. This innovation ensured that the control system was responsive to the user’s intent and adept at prioritizing safety through the arbitrating mechanism when necessary.</p>
29

State-of-health estimation by virtual experiments using recurrent decoder-encoder based lithium-ion digital battery twins trained on unstructured battery data

Schmitt, Jakob, Horstkötter, Ivo, Bäker, Bernard 15 March 2024
Due to the large share of production costs, the lifespan of an electric vehicle’s (EV) lithium-ion traction battery should be as long as possible. The optimisation of the EV’s operating strategy with regard to battery life requires a regular evaluation of the battery’s state-of-health (SOH). Yet the SOH, the remaining battery capacity, cannot be measured directly through sensors but requires elaborate, special characterisation tests to be conducted. Considering the limited number of test facilities as well as the rapidly growing number of EVs, time-efficient and scalable SOH estimation methods are urgently needed and are the object of investigation in this work. The developed virtual SOH experiment originates from the incremental capacity measurement and solely relies on the commonly logged battery management system (BMS) signals to train the digital battery twins. The first examined dataset with identical load profiles for the new and aged battery states serves as proof of concept. The successful SOH estimation based on the second dataset, which consists of varying load profiles with increased complexity, constitutes a step towards the application on real driving cycles. Assuming that the load cycles contain pauses and start from the fully charged battery state, the SOH estimation succeeds either through a steady shift of the load sequences (variant one) with an average deviation of 0.36% or by random alignment of the dataset’s subsequences (variant two) with 1.04%. In contrast to continuous capacity tests, the presented framework does not impose a restriction to small currents. It is entirely independent of the prevailing and unknown ageing condition due to the application of battery models based on the novel encoder–decoder architecture and thus provides the cornerstone for a scalable and robust estimation of battery capacity on a pure data basis.
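As a conceptual sketch of the recurrent encoder-decoder battery twin described above, the PyTorch snippet below encodes a logged BMS sequence and decodes a voltage response together with a scalar SOH estimate. The layer sizes, input channels, and synthetic training data are assumptions for illustration, not the published architecture or datasets.

```python
# Minimal conceptual sketch (not the published model): a GRU encoder-decoder
# mapping a BMS load sequence (current, voltage, temperature) to a reconstructed
# voltage curve and a scalar state-of-health estimate.
import torch
import torch.nn as nn

class BatteryTwin(nn.Module):
    def __init__(self, n_features=3, hidden=64):
        super().__init__()
        self.encoder = nn.GRU(n_features, hidden, batch_first=True)
        self.decoder = nn.GRU(1, hidden, batch_first=True)
        self.volt_head = nn.Linear(hidden, 1)   # per-step voltage reconstruction
        self.soh_head = nn.Linear(hidden, 1)    # scalar SOH from the encoder state

    def forward(self, load_seq, current_seq):
        _, h = self.encoder(load_seq)                 # summarise the logged cycle
        dec_out, _ = self.decoder(current_seq, h)     # decode conditioned on the current profile
        voltage = self.volt_head(dec_out).squeeze(-1)
        soh = torch.sigmoid(self.soh_head(h[-1])).squeeze(-1)
        return voltage, soh

# Synthetic batch: 8 cycles, 200 time steps, 3 BMS channels.
load = torch.randn(8, 200, 3)
current = load[:, :, :1]
target_v = torch.randn(8, 200)
target_soh = torch.rand(8)

model = BatteryTwin()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(5):  # a few illustrative training steps
    pred_v, pred_soh = model(load, current)
    loss = nn.functional.mse_loss(pred_v, target_v) + nn.functional.mse_loss(pred_soh, target_soh)
    opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```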
30

Preventing Health Data from Leaking in a Machine Learning System : Implementing code analysis with LLM and model privacy evaluation testing / Förhindra att Hälsodata Läcker ut i ett Maskininlärnings System : Implementering av kod analys med stor språk-modell och modell integritets testning

Janryd, Balder, Johansson, Tim January 2024
Sensitive data leaking from a system can have tremendous negative consequences, such as discrimination, social stigma, and fraudulent economic consequences for those whose data has been leaked. Therefore, it is of utmost importance that sensitive data is not leaked from a system. This thesis investigated different methods to prevent sensitive patient data from leaking in a machine learning system. Various methods have been investigated and evaluated based on previous research; the methods used in this thesis are a large language model (LLM) for code analysis and a membership inference attack on models to test their privacy level. The LLM code analysis results show that the Llama 3 model (an LLM) had an accuracy of 90% in identifying malicious code that attempts to steal sensitive patient data. The model analysis can evaluate and determine membership inference of sensitive patient data used for training machine learning models, which is essential for determining the data leakage a machine learning model can pose in machine learning systems. Further work on increasing the determinism and improving the formatting of the LLM’s responses is needed to ensure the robustness of a security system that utilizes LLMs before it can be deployed in a production environment. Further studies of the model analysis could apply a wider variety of evaluations, such as a larger set of machine learning model types and a broader range of attack tests on machine learning models, which can be implemented into machine learning systems. / Känsliga data som läcker från ett system kan ha enorma negativa konsekvenser, såsom diskriminering, social stigmatisering och negativa ekonomiska konsekvenser för dem vars data har läckt ut. Därför är det av yttersta vikt att känsliga data inte läcker från ett system. Denna avhandling undersökte olika metoder för att förhindra att känsliga patientdata läcker ut ur ett maskininlärningssystem. Olika metoder har undersökts och utvärderats baserat på tidigare forskning; metoderna som användes i denna avhandling är en stor språkmodell (LLM) för kodanalys och en medlemskapsinfiltrationsattack på maskininlärnings (ML) modeller för att testa modellernas integritetsnivå. Kodanalysresultaten från LLM visar att modellen Llama 3 hade en noggrannhet på 90% i att identifiera skadlig kod som försöker stjäla känsliga patientdata. Modellanalysen kan utvärdera och bestämma medlemskap av känsliga patientdata som används för träning i maskininlärningsmodeller, vilket är avgörande för att bestämma den dataläckage som en maskininlärningsmodell kan exponera. Ytterligare studier för att öka determinismen och formateringen av LLM:s svar måste undersökas för att säkerställa robustheten i säkerhetssystemet som använder LLM:er innan det kan driftsättas i en produktionsmiljö. Vidare studier av modellanalysen kan tillämpa ytterligare bredd av utvärderingar, såsom ökad storlek på maskininlärningsmodelltyper och ökat utbud av attacktesttyper av maskininlärningsmodeller som kan implementeras i maskininlärningssystem.
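A minimal sketch of the membership inference idea used for the model privacy evaluation above might look as follows: a target model is trained on half of a synthetic dataset, and the confidence it assigns to each record's true label is used as a membership score. The synthetic data, the random-forest target model, and the AUC-based evaluation are illustrative assumptions, not the thesis setup.

```python
# Illustrative sketch (not the thesis code) of a simple confidence-based
# membership inference attack: records the target model fits unusually well
# are guessed to be training members.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic "patient" records: 2000 samples, half used to train the target model.
X = rng.standard_normal((2000, 20))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.standard_normal(2000) > 0).astype(int)
member = np.zeros(2000, dtype=bool)
member[:1000] = True

target = RandomForestClassifier(n_estimators=200, random_state=0)
target.fit(X[member], y[member])          # target model sees only the member records

# Attack score: probability the target model assigns to each record's true label.
# Members tend to receive higher confidence because the model overfits them.
proba_true_label = target.predict_proba(X)[np.arange(2000), y]

# AUC well above 0.5 indicates that membership of training records is leaking.
print("membership inference AUC:", roc_auc_score(member, proba_true_label))
```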
