Global ETD Search

31	Anonymization of directory-structured sensitive data / Anonymisering av katalogstrukturerad känslig data Folkesson, Carl January 2019 Data anonymization is an important field within data privacy that seeks a good balance between utility and privacy in data. The field has become especially relevant since the GDPR came into force, because the GDPR does not regulate anonymous data. This thesis focuses on anonymization of directory-structured data, that is, data organized into a tree of directories. Four of the most common models for anonymization of tabular data (k-anonymity, ℓ-diversity, t-closeness and differential privacy) are adapted to directory-structured data. The adaptation is done through three approaches to anonymizing directory-structured data: SingleTable, DirectoryWise and RecursiveDirectoryWise. The models and approaches are compared and evaluated using five metrics and three attack scenarios. The results show that there is always a trade-off between utility and privacy when anonymizing data. In particular, the differential privacy model combined with the RecursiveDirectoryWise approach gives the highest privacy but also the highest information loss. In contrast, the k-anonymity model combined with the SingleTable approach and the t-closeness model combined with the DirectoryWise approach give the lowest information loss but also the lowest privacy. The differential privacy model and the RecursiveDirectoryWise approach were also shown to give the best protection against the chosen attacks. Finally, it was concluded that the differential privacy model combined with the RecursiveDirectoryWise approach was the most suitable combination for complying with the GDPR when anonymizing directory-structured data. data anonymization; data privacy; directory-structured data; k-anonymity; l-diversity; t-closeness; differential privacy; GDPR; Computer Engineering
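As a rough illustration of the tabular starting point mentioned in the abstract above, the following sketch computes the k for which a small table is k-anonymous, that is, the size of the smallest group of rows sharing the same quasi-identifier values. It does not reproduce the thesis's SingleTable, DirectoryWise or RecursiveDirectoryWise approaches; the function name `k_of`, the column names and the sample rows are all invented for this example.

```python
from collections import Counter


def k_of(rows, quasi_identifiers):
    """Return the largest k for which `rows` is k-anonymous.

    A table is k-anonymous when every combination of quasi-identifier
    values is shared by at least k rows, so k is the size of the
    smallest such group. An empty table yields 0.
    """
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values()) if groups else 0


# Made-up, already generalized records (age ranges, masked postal codes).
rows = [
    {"age": "20-29", "zip": "581**", "diagnosis": "flu"},
    {"age": "20-29", "zip": "581**", "diagnosis": "asthma"},
    {"age": "30-39", "zip": "582**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "582**", "diagnosis": "diabetes"},
    {"age": "30-39", "zip": "582**", "diagnosis": "flu"},
]

print(k_of(rows, ["age", "zip"]))  # smallest group has 2 rows
```

Treating more columns as quasi-identifiers shrinks the groups and lowers k, which is one simple way to see the trade-off between utility and privacy that the abstract describes.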
32	Test Data Extraction and Comparison with Test Data Generation Raza, Ali 01 August 2011 Testing an integrated information system that relies on data from multiple sources can be a challenge, particularly when the data is confidential. This thesis describes a novel test data extraction approach, called semantic-based test data extraction for integrated systems (iSTDE), that solves many of the problems associated with creating realistic test data for integrated information systems containing confidential data. iSTDE reads a consistent cross-section of data from the production databases, manipulates that data to obscure individual identities while still preserving the overall semantic data characteristics that are critical to thorough system testing, and then moves that test data to an external test environment. This thesis also presents a theoretical study that compares test-data extraction with a competing technique, test-data generation. Specifically, this thesis a) describes a comparison method that includes a comprehensive list of characteristics essential for testing database applications, organized into seven different areas, b) presents an analysis of the relative strengths and weaknesses of the different test-data creation techniques, and c) reports a number of specific conclusions that will help testers make appropriate choices. Data Integration; Data Sensitization/Anonymization; Health Informatics; Software Engineering; Test Data Extraction; Testing Data-Centric Applications; Computer Sciences
33	An anonymizable entity finder in judicial decisions Kazemi, Farzaneh January 2008 Thesis digitized by the Division de la gestion de documents et des archives de l'Université de Montréal. Anonymization; Named entity recognition; De-identification; Judicial decisions; Maximum entropy
34	Návrh algoritmu pro anonymizaci ultrazvukových dat na úrovni snímku / Design of algorithm for anonymization of ultrasound data Bugnerová, Pavla January 2017 This master's thesis focuses on the anonymization of ultrasound data in DICOM format. The Haar wavelet, a member of the Daubechies wavelet family, is used to detect text areas in the image. Text is extracted from the image using a free tool, the Tesseract OCR engine. Finally, the detected text is compared with sensitive data from the DICOM metadata using the Levenshtein edit distance algorithm.
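The final comparison step described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes the OCR text and the DICOM metadata values are already available as plain strings, and the helper name `is_sensitive`, the 25 % tolerance and the sample strings are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character insertions,
    deletions and substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def is_sensitive(ocr_text, metadata_values, max_ratio=0.25):
    """Flag OCR text that is close to any metadata value.

    `max_ratio` is an assumed tolerance: the allowed edit distance as a
    fraction of the metadata value's length, to absorb OCR misreads.
    """
    text = ocr_text.strip().lower()
    for value in metadata_values:
        target = value.strip().lower()
        if target and levenshtein(text, target) <= max_ratio * len(target):
            return True
    return False


# Made-up values standing in for DICOM metadata fields.
metadata = ["Novak Jana", "1980-01-01"]
print(is_sensitive("N0VAK JANA", metadata))   # OCR misread of the name
print(is_sensitive("2D 5.0 MHz", metadata))   # ordinary on-screen label
```

A fuzzy comparison is useful here because OCR output rarely matches the metadata exactly; the tolerance would need tuning on real images.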
35	Ähnlichkeitsmessung von ausgewählten Datentypen in Datenbanksystemen zur Berechnung des Grades der Anonymisierung / Similarity measurement of selected data types in database systems for computing the degree of anonymization Heinrich, Jan-Philipp, Neise, Carsten, Müller, Andreas January 2018 A mathematical model for computing deviations between different data types in relational database systems is introduced and tested. The model is based on similarity measurements for different data types. We first examine the data types relevant to this work. We then define an algebra for these data types, which forms the basis for computing the degree of anonymization θ. The model is intended for measuring the degree of anonymization, above all of personal data, between test and production data. This measurement is useful in light of the introduction of the EU GDPR (EU-DSGVO) in May 2018 and is intended to help identify personal data with a high degree of similarity. info:eu-repo/classification/ddc/000 ddc:000
36	Méthode et outil d’anonymisation des données sensibles / Method and tool for anonymization sensitive data Ben Fredj, Feten 03 July 2017 Anonymizing personal data requires complex algorithms that minimize the risk of re-identification while preserving data utility. In this thesis, we describe a model-driven approach that guides the data owner through the anonymization process. The guidance may be informative or suggestive. It helps the data owner choose the most relevant algorithm given the characteristics of the data and the intended use of the anonymized data. The guidance also helps in defining suitable parameter values for the chosen algorithm. In this thesis, we focus on generalization algorithms for micro-data. Knowledge about anonymization, both theoretical and experimental, is stored in an ontology. Privacy; Anonymization; Guidelines; Model-driven approach; Ontology; 005.8 302.231
37	Anonymizing Faces without Destroying Information Rosberg, Felix January 2024 Anonymization is a broad term, meaning that personal data, or rather data that identifies a person, is redacted or obscured. In the context of video and image data, the most palpable information is the face. Faces barely change compared to other aspects of a person, such as clothes, and people already have a strong ability to recognize faces. Computers are also adept at recognizing faces, with facial recognition models being exceptionally powerful at identifying and comparing faces. It is therefore generally considered important to obscure the faces in video and images when aiming to keep the data anonymized. Traditionally this is done simply through blurring or masking, but this destroys useful information such as eye gaze, pose, expression and the fact that it is a face. This is a particular issue because today's society is data-driven in many respects. One obvious example is autonomous driving and driver monitoring, where necessary algorithms such as object detectors rely on deep learning to function. Due to the data hunger of deep learning, in conjunction with society's call for privacy and integrity through regulations such as the General Data Protection Regulation (GDPR), anonymization that preserves useful information becomes important. This thesis investigates the potential and possible limitations of anonymizing faces without destroying the aforementioned useful information. The base approach is face swapping and face manipulation, where current research focuses on changing the face (or identity) while keeping the original attribute information, all while remaining incorporated and consistent in an image and/or video. Specifically, this thesis demonstrates how target-oriented and subject-agnostic face swapping methodologies can be used for realistic anonymization that preserves attributes. 
Through this, the thesis points out several approaches that are: 1) controllable, meaning that the proposed models do not naively change the identity; the kind and magnitude of the identity change are adjustable, and thus tunable to guarantee anonymization; 2) subject-agnostic, meaning that the models can handle any identity; 3) fast, meaning that the models run efficiently and thus have the potential to run in real time. The end product is an anonymizer that achieves state-of-the-art performance on identity transfer, pose retention and expression retention while providing realism. Apart from identity manipulation, the thesis demonstrates potential security issues, specifically reconstruction attacks, where a bad-actor model learns convolutional traces/patterns in the anonymized images in such a way that it is able to completely reconstruct the original identity. The bad-actor network is able to do this with simple black-box access to the anonymization model by constructing a pair-wise dataset of unanonymized and anonymized faces. To alleviate this issue, different defense measures that disrupt the traces in the anonymized image were investigated. The main takeaway is that an anonymization that qualitatively looks convincing does not necessarily hide the identity, which makes robust quantitative evaluations important. Anonymization; Data Privacy; Generative AI; Reconstruction Attacks; Deep Fakes; Facial Recognition; Identity Tracking; Biometrics; Signal Processing
38	Utveckling av en anonymiseringsprototyp för säker interaktion med chatbotar / Development of an anonymization prototype for secure interaction with chatbots Hanna, John Nabil, Berjlund, William January 2024 This study presents a prototype for anonymizing sensitive information in text documents, with the aim of enabling secure interaction with large language models (LLMs) such as ChatGPT. The prototype offers a platform where users can upload documents and anonymize specific sensitive words. After anonymization, users can pose questions to ChatGPT based on the anonymized content. 
The prototype restores the anonymized parts in the responses from ChatGPT before they are displayed to the user, ensuring that sensitive information remains protected throughout the entire interaction. The study uses the Design Science Research in Information Systems (DSRIS) method. The prototype is developed in Java and tested with fabricated documents, while survey responses are collected to evaluate the user experience. The results show that the prototype's functions work well and protect sensitive information during interaction with ChatGPT. The prototype has been evaluated using the survey responses, which also highlight opportunities for improvement. In conclusion, the study demonstrates that it is possible to anonymize text documents effectively while obtaining accurate and useful feedback from ChatGPT. Despite some limitations in the user interface due to the time frame, the study shows potential for secure data handling with ChatGPT. ChatGPT; LLM; anonymization; prompt engineering; DSRIS; Java; Computer Sciences
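The anonymize-then-restore round trip described in this abstract can be sketched roughly as below. The thesis prototype is written in Java and its internals are not given here, so this Python version is only an assumed illustration: the function names, the `[ANON_n]` placeholder format and the sample text are invented, and the call to the language model is left out entirely.

```python
import re


def anonymize(text, sensitive_words):
    """Replace each sensitive word with a numbered placeholder.

    Returns the anonymized text and a placeholder-to-word mapping.
    Longer words are replaced first so that a short word cannot break
    up a longer phrase that contains it.
    """
    mapping = {}
    ordered = sorted(set(sensitive_words), key=lambda w: (-len(w), w))
    for index, word in enumerate(ordered, start=1):
        placeholder = f"[ANON_{index}]"
        pattern = re.compile(re.escape(word), flags=re.IGNORECASE)
        if pattern.search(text):
            text = pattern.sub(placeholder, text)
            mapping[placeholder] = word
    return text, mapping


def restore(text, mapping):
    """Put the original words back into a text containing placeholders.

    Matching above is case-insensitive, so the restored word uses the
    spelling from the sensitive-word list, not the original casing.
    """
    for placeholder, word in mapping.items():
        text = text.replace(placeholder, word)
    return text


document = "Anna Svensson lives in Uppsala. Contact Anna Svensson by mail."
anonymized, mapping = anonymize(document, ["Anna Svensson", "Uppsala"])
print(anonymized)

# Stand-in for a model reply that echoes the placeholders.
reply = "[ANON_1] is based in [ANON_2]."
print(restore(reply, mapping))
```

Only the anonymized text would be sent to the external service; the mapping stays on the user's side, which is what keeps the sensitive words out of the interaction.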
39	La protection des données personnelles contenues dans les documents publics accessibles sur Internet : le cas des données judiciaires / The protection of personal data contained in public documents accessible on the Internet: the case of judicial data Duaso Calés, Rosario 12 1900 "Thesis presented to the Faculty of Graduate Studies with a view to obtaining the degree of Master of Laws (LL.M.)" / The upheavals generated by the new means of disseminating public data, together with the multiple possibilities offered by the Internet, such as information storage, flawless memory and the use of search engines, give rise to major issues related to privacy protection. The dissemination of public data in digital format causes a shift in our scales of time and space, and changes the traditional concept of public nature previously associated with the "paper" universe. 
We will study the means of protecting privacy, and the conditions for accessing and using the personal information, sometimes of a "sensitive" nature, that is contained in public documents posted on the Internet. The particular case of data accessible through judicial data banks requires special solutions: the necessary balance must be found between the principle of judicial transparency and the right to privacy. Dissemination; Personal information; Internet; Public documents; Jurisprudential data banks; Right to be forgotten (social forgetfulness); Anonymization; Privacy; Public nature of justice
40	La protection des données personnelles contenues dans les documents publics accessibles sur Internet : le cas des données judiciaires / The protection of personal data contained in public documents accessible on the Internet: the case of judicial data Duaso Calés, Rosario 12 1900 The upheavals generated by the new means of disseminating public data, together with the multiple possibilities offered by the Internet, such as information storage, flawless memory and the use of search engines, give rise to major issues related to privacy protection. The dissemination of public data in digital format causes a shift in our scales of time and space, and changes the traditional concept of public nature previously associated with the "paper" universe. We will study the means of protecting privacy, and the conditions for accessing and using the personal information, sometimes of a "sensitive" nature, that is contained in public documents posted on the Internet. 
The particular case of data accessible through judicial data banks requires special solutions: the necessary balance must be found between the principle of judicial transparency and the right to privacy. / "Thesis presented to the Faculty of Graduate Studies with a view to obtaining the degree of Master of Laws (LL.M.)" Dissemination; Personal information; Internet; Public documents; Jurisprudential data banks; Right to be forgotten (social forgetfulness); Anonymization; Privacy; Public nature of justice
