Understanding the Impact of Cloud-Based Shadow IT on Employee and IT-Manager Perceptions in the Swedish Tech Industry

Fager, Adam January 2023
This study focuses on the impact of Cloud-Based Shadow IT on data privacy in the Swedish tech sector. It explores the use of unapproved applications by employees without the knowledge or control of the IT department. The objective is to understand how Cloud-Based Shadow IT affects employees' compliance when using cloud services and to examine IT managers' understanding of the phenomenon. The research problem addresses the challenges of ensuring compliance with regulations and effective use of cloud technology. By identifying the strengths, weaknesses, possibilities, and risks associated with Cloud-Based Shadow IT, the study aims to provide insights that help companies and IT managers make informed decisions. It explores the relationship between Shadow IT and cloud services and investigates employees' and IT managers' adherence to and understanding of these issues. The findings indicate that employees have varying levels of understanding, with limited knowledge of approved cloud services. Managers prioritize security concerns, including data compliance and ownership, but lack strategies to address knowledge gaps. The use of Cloud-Based Shadow IT has both positive and negative consequences: it can increase productivity and collaboration, but it also carries risks of data loss and non-compliance. Factors such as education and awareness of security risks are important for employees to understand and comply with policies. Overall, the study highlights the need for continuous education and awareness programs to improve understanding and decision-making regarding cloud services and Shadow IT.

Preventing Health Data from Leaking in a Machine Learning System: Implementing code analysis with LLM and model privacy evaluation testing

Janryd, Balder, Johansson, Tim January 2024
Sensitive data leaking from a system can have severe negative consequences, such as discrimination, social stigma, and economic harm through fraud for those whose data has been leaked. It is therefore of utmost importance that sensitive data does not leak from a system. This thesis investigated methods to prevent sensitive patient data from leaking from a machine learning system. Various methods were investigated and evaluated based on previous research; the methods used in this thesis are a large language model (LLM) for code analysis and a membership inference attack on models to test their privacy level. The code analysis results show that Llama 3, the LLM used, identified malicious code attempting to steal sensitive patient data with 90% accuracy. The model privacy analysis can determine whether sensitive patient records were part of a machine learning model's training data, which is essential for assessing the data leakage a model can introduce into a machine learning system. Further work on making the LLM's responses more deterministic and consistently formatted is needed to ensure the robustness of a security system built on LLMs before it can be deployed in a production environment. Future work on the model analysis could broaden the evaluation, for example by covering more machine learning model types and a wider range of attack tests against models integrated into machine learning systems.
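The following is a minimal, illustrative sketch of a confidence-threshold membership inference test of the kind described in the abstract, not the authors' implementation: the Random Forest target model, the synthetic stand-in data, and the member/non-member split are assumptions made for the example.

```python
# Minimal confidence-threshold membership inference sketch (illustrative only).
# Assumes a scikit-learn style classifier and a held-out "non-member" split;
# the thesis's actual attack setup and models may differ.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

# Synthetic stand-in for sensitive patient data.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_member, X_nonmember, y_member, y_nonmember = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Target model is trained only on the "member" split.
target = RandomForestClassifier(n_estimators=100, random_state=0)
target.fit(X_member, y_member)

# Attack signal: the model's confidence in its predicted class.
conf_member = target.predict_proba(X_member).max(axis=1)
conf_nonmember = target.predict_proba(X_nonmember).max(axis=1)

# Higher confidence suggests the record was more likely in the training set.
scores = np.concatenate([conf_member, conf_nonmember])
labels = np.concatenate([np.ones(len(conf_member)), np.zeros(len(conf_nonmember))])
print("Membership inference AUC:", roc_auc_score(labels, scores))
```

An attack AUC close to 0.5 indicates the model leaks little membership information; values well above 0.5 indicate a measurable privacy risk.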

Lite-Agro: Integrating Federated Learning and TinyML on IoAT-Edge for Plant Disease Classification

Dockendorf, Catherine April 05 1900
Lite-Agro studies applications of TinyML to pear (Pyrus communis) tree disease identification and explores hardware implementations on an ESP32 microcontroller. The study works with the DiaMOS Pear dataset to determine, through image analysis, whether a leaf is healthy, classifying it into the curl, healthy, spot, or slug categories. The system is designed as a low-cost, light-duty edge detection solution, and the work compares models such as InceptionV3, XceptionV3, EfficientNetB0, and MobileNetV2. This work also investigates integration with federated learning frameworks and provides an introduction to federated averaging algorithms.
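As an illustration of the federated averaging idea the abstract mentions, the following is a minimal sketch of the server-side weight aggregation step; the client weights, layer shapes, and sample counts are hypothetical, and the sketch is not Lite-Agro's implementation.

```python
# Minimal federated averaging (FedAvg) sketch (illustrative only).
# Each client holds a list of weight arrays (one per layer); the server
# combines them weighted by local dataset size.
import numpy as np

def federated_average(client_weights, client_sizes):
    """Weighted average of per-client model weights.

    client_weights: list of clients, each a list of numpy arrays (layer weights).
    client_sizes:   number of local training samples per client.
    """
    total = float(sum(client_sizes))
    n_layers = len(client_weights[0])
    averaged = []
    for layer in range(n_layers):
        layer_sum = sum(
            (size / total) * weights[layer]
            for weights, size in zip(client_weights, client_sizes))
        averaged.append(layer_sum)
    return averaged

# Example: three hypothetical clients, each with two "layers" of toy weights.
clients = [[np.ones((2, 2)) * c, np.ones(2) * c] for c in (1.0, 2.0, 3.0)]
sizes = [100, 50, 50]
global_weights = federated_average(clients, sizes)
print(global_weights[0])  # weighted mean of the first layer
```

On a microcontroller-class deployment, only the aggregated weights would be pushed back to the edge devices after each round.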

Measuring the Utility of Synthetic Data: An Empirical Evaluation of Population Fidelity Measures as Indicators of Synthetic Data Utility in Classification Tasks

Florean, Alexander January 2024
In the era of data-driven decision-making and innovation, synthetic data serves as a promising tool that bridges the need for vast datasets in machine learning (ML) and the necessity of data privacy. By simulating real-world data while preserving privacy, synthetic data generators have become increasingly common instruments in AI and ML development. A key challenge with synthetic data lies in accurately estimating its utility. Population Fidelity (PF) measures, a category of metrics that evaluate how well synthetic data mimics the general distribution of the original data, have shown themselves to be good candidates for this purpose. In this setting, we aim to answer: "How well are different population fidelity measures able to indicate the utility of synthetic data for machine learning based classification models?" We designed a reusable six-step experiment framework to examine the correlation between nine PF measures and the performance of four ML classification models across five datasets. The six-step approach includes data preparation, training, testing on original and synthetic datasets, and computation of the PF measures. The study reveals non-linear relationships between the PF measures and synthetic data utility. The general analysis, i.e. the monotonic relationship between each PF measure and performance across all models, yielded at most moderate correlations, with the Cluster measure showing the strongest correlation. In the more granular, model-specific analysis, Random Forest showed strong correlations with three PF measures. The findings show that no PF measure correlates consistently highly across all models, so none can be considered a universal estimator of model performance. This highlights the importance of context-aware application of PF measures and sets the stage for future research to expand the scope, including support for a wider range of data types and integrating privacy evaluations into synthetic data assessment. Ultimately, this study contributes to the effective and reliable use of synthetic data, particularly in sensitive fields where data quality is vital.
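As an illustration of the kind of analysis described, the following is a minimal sketch of the correlation step: relating a population fidelity score computed for each synthetic dataset to the accuracy a classifier reaches when trained on that dataset, using Spearman's rank correlation (a monotonic measure, matching the study's general analysis). The PF scores and accuracies below are made-up placeholders, not results from the thesis.

```python
# Minimal sketch of correlating a PF measure with downstream utility (illustrative only).
from scipy.stats import spearmanr

# One entry per synthetic dataset generated from the same original data.
pf_scores  = [0.62, 0.71, 0.55, 0.80, 0.68]   # e.g. a Cluster-measure score (hypothetical)
accuracies = [0.74, 0.79, 0.70, 0.83, 0.77]   # test accuracy of a model trained on that data

# Spearman's rank correlation captures monotonic (not necessarily linear)
# association, which suits the non-linear relationships reported in the study.
rho, p_value = spearmanr(pf_scores, accuracies)
print(f"Spearman rho={rho:.2f}, p={p_value:.3f}")
```

Repeating this step per PF measure and per model type is what distinguishes the general analysis from the model-specific one described above.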

Improving Deep Learning-based Object Detection Algorithms for Omnidirectional Images by Simulated Data

Scheck, Tobias 08 August 2024
Perception, primarily through vision, is a vital human ability that informs decision-making and interaction with the world. Computer Vision, the field dedicated to emulating this human capability in computers, has witnessed transformative progress with the advent of artificial intelligence, particularly neural networks and deep learning. These technologies enable automatic feature learning, eliminating the need for laborious hand-crafted features. The increasing global demand for artificial intelligence applications across various industries, however, raises concerns about data privacy and access. This dissertation addresses these challenges by proposing solutions that leverage synthetic data to preserve privacy and enhance the robustness of computer vision algorithms. The primary objective of this dissertation is to reduce the dependence of modern image processing algorithms on real data by utilizing synthetic data generated through computer simulations. Synthetic data serves as a privacy-preserving alternative, enabling the generation of data for scenarios that are difficult or unsafe to replicate in the real world. While purely simulated data falls short of capturing the full complexity of reality, the dissertation explores methods to bridge the gap between synthetic and real data. The dissertation encompasses a comprehensive evaluation of the synthetic THEODORE dataset, focusing on object detection using Convolutional Neural Networks. Fine-tuning CNN architectures with synthetic data demonstrates remarkable performance improvements over relying solely on real-world data. Extending beyond person recognition, these architectures exhibit the ability to recognize various objects in real-world settings. This work also investigates real-time performance and the impact of barrel distortion in omnidirectional images, underlining the potential of synthetic data. Furthermore, the dissertation introduces two unsupervised domain adaptation methods tailored for anchorless object detection within the CenterNet architecture. The methods effectively reduce the domain gap when synthetic omnidirectional images serve as the source domain and real images act as the target domain. Qualitative assessments highlight the advantages of these methods in reducing noise and enhancing detection accuracy. The dissertation concludes with an application in the Ambient Assisted Living context that realizes these concepts, encompassing indoor localization heatmaps, human pose estimation, and activity recognition. The methodology leverages synthetically generated data, unique object identifiers, and rotated bounding boxes to enhance tracking in omnidirectional images. Importantly, the system is designed to operate without compromising privacy or using sensitive images, aligning with growing concerns about data privacy and access in artificial intelligence applications.
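As a rough illustration of the synthetic-to-real fine-tuning idea, the following sketch fine-tunes an ImageNet-pretrained CNN on a synthetic image folder and evaluates it on real images; the folder layout, MobileNetV2 backbone, and hyperparameters are assumptions made for the example and do not reflect the dissertation's THEODORE/CenterNet setup.

```python
# Minimal synthetic-to-real fine-tuning sketch (illustrative only).
# Folder names are hypothetical: one class per subdirectory.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

synthetic = datasets.ImageFolder("data/synthetic_train", transform=tfm)
real_test = datasets.ImageFolder("data/real_test", transform=tfm)
train_loader = DataLoader(synthetic, batch_size=32, shuffle=True)
test_loader = DataLoader(real_test, batch_size=32)

# ImageNet-pretrained backbone with a new classification head.
model = models.mobilenet_v2(weights="IMAGENET1K_V1")
model.classifier[1] = nn.Linear(model.last_channel, len(synthetic.classes))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in train_loader:          # one epoch on synthetic data
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()

# Evaluate the synthetically fine-tuned model on real images.
model.eval()
correct = total = 0
with torch.no_grad():
    for images, labels in test_loader:
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
print("Real-image accuracy:", correct / total)
```

The gap between this real-image accuracy and the accuracy of a model trained only on real data is one simple way to quantify how well the synthetic domain transfers.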
