  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
71

Research in methods for achieving secure voice anonymization : Evaluation and improvement of voice anonymization techniques for whistleblowing / Forskning i metoder för säker röstanonymisering : Utvärdering och förbättring av röstanonymiseringstekniker för visselblåsning

Hellman, Erik, Nordstrand, Mattias January 2022
Safe whistleblowing within companies can contribute to a more transparent and open society, and keeping the whistleblower safe is key. This has led to a new EU Whistleblowing Directive requiring every organization with more than 249 employees to provide an internal channel for whistleblowing before 17 July 2022. A whistleblowing service within an entity should provide secure communication for the organization and its employees. One way to make whistleblowing more accessible is to provide a service for verbal reporting, for example by recording and sending voice messages. However, ensuring that the speaker is secure and can feel anonymous is difficult, since speech varies between individuals: different accents, pitch, or speaking rate are examples of factors by which a speaker can be identified. Common voice anonymization methods, such as those heard on the news, can often be reversed or otherwise deanonymized so that the speaker's identity is revealed, especially by people who know the speaker. Today, developing technologies such as machine learning could be used to greatly improve either anonymity or deanonymization. However, greater anonymity often comes at the cost of intelligibility, and sometimes the naturalness, of the voice content. We therefore studied and evaluated a number of anonymization methods with respect to anonymity, intelligibility, and overall user-friendliness. The aim was to map which anonymization methods are suitable for whistleblowing and to implement proofs of concept of such anonymizers. The results show differences between anonymization methods: some perform better than others, but in different ways, and the method should be selected according to the perceived threat. We designed working proofs of concept that can be used in a whistleblowing service and describe when each solution could be used.
Our work shows ways toward more secure whistleblowing and will serve as a basis for future work and implementation at the host company Nebulr.
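The abstract names pitch as one of the features by which a speaker can be identified. As an illustration only (this is not claimed to be the thesis's method), a naive pitch shift can be sketched by plain resampling; its weakness, that it also changes duration and is easy to invert, is exactly why stronger anonymizers are worth evaluating:

```python
import numpy as np

def pitch_shift_resample(samples: np.ndarray, semitones: float) -> np.ndarray:
    """Naively shift pitch by resampling: positive semitones raise pitch.

    Note: this also changes duration/tempo, which is one reason such a
    simple transform is a weak anonymizer on its own.
    """
    factor = 2 ** (semitones / 12.0)          # frequency ratio per semitone
    old_idx = np.arange(len(samples))
    new_len = int(len(samples) / factor)
    new_idx = np.linspace(0, len(samples) - 1, new_len)
    return np.interp(new_idx, old_idx, samples)

# Example: one second of a 440 Hz tone, shifted up an octave (12 semitones).
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
shifted = pitch_shift_resample(tone, 12)
```

Played back at the same sample rate, the shifted signal sounds roughly an octave higher but lasts half as long, which is why production systems use time-scale-preserving methods instead.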
72

Data Privacy Laws & Social Media Governance : A comparative analysis of Tik Tok & Meta/Facebook using EU, US, and China’s Data Privacy Laws

V. L. Aitken, Robin January 2023
This academic investigation examines the intersection between International Relations (IR), data privacy laws, and social media governance. The case studies are Tik Tok and Facebook/Meta; the research begins with a comparative analysis of the US congressional hearings of Mark Zuckerberg (CEO of Facebook/Meta) in 2018 and Shou Zi Chew (CEO of Tik Tok) in 2023. In addition, a comparative analysis is made of the referenced and related data privacy laws of the EU, the US, and China through realist, liberalist, and new constructivist lenses. Lastly, a practice theory approach follows the empirical data on penalties imposed on these two social media companies, the similar corporate solutions offered by Tik Tok to both the EU and the US, and the quantifiable lobbying contributions from both ByteDance (parent company of Tik Tok) and Facebook/Meta. The three research questions are: (i) how do the IR theoretical lenses of realism and liberalism acknowledge the significance of social media and its need for regulation; (ii) how can data be conceptualized for social media governance, and what are the implications of these dynamics within IR; and (iii) what is the efficacy of data privacy laws in protecting user rights while social media companies influence US policymakers? My conclusion is that data privacy laws are legal jargon that either power-maximizes a state or acts as a taxation mechanism. User rights are not secure, and there is a battle for accountability and a risk that algorithms may keep social media companies absolved of responsibility. / New data was released: Meta was fined $1.2 billion by the EU. This does not take anything away from the paper but is relevant to the systemic act of data protection. Also, no exact definition of governance is given; papers building on this one should define it according to their own interpretation, though this paper did not need a deep definition.
73

Decentralized Large-Scale Natural Language Processing Using Gossip Learning / Decentraliserad Storskalig Naturlig Språkbehandling med Hjälp av Skvallerinlärning

Alkathiri, Abdul Aziz January 2020
The field of Natural Language Processing in machine learning has seen rising popularity and use in recent years. The nature of Natural Language Processing, which deals with natural human language and computers, has led to the research and development of many algorithms that produce word embeddings. One of the most widely used of these algorithms is Word2Vec. With the abundance of data generated by users and organizations and the complexity of machine learning and deep learning models, training on a single machine becomes unfeasible. Advances in distributed machine learning offer a solution to this problem. Unfortunately, for reasons of data privacy and regulation, in some real-life scenarios the data must not leave its local machine. This limitation has led to the development of techniques and protocols that are massively parallel and data-private. The most popular of these protocols is federated learning. However, due to its centralized nature, it still poses some security and robustness risks. Consequently, this led to the development of massively parallel, data-private, decentralized approaches, such as gossip learning. In the gossip learning protocol, each node in the network periodically chooses a random peer for information exchange, which eliminates the need for a central node. This research intends to test the viability of gossip learning for large-scale, real-world applications. In particular, it focuses on the implementation and evaluation of a Natural Language Processing application using gossip learning. The results show that applying Word2Vec in a gossip learning framework is viable and yields results comparable to its non-distributed, centralized counterpart for various scenarios, with an average loss in quality of 6.904%.
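The gossip exchange the abstract describes (each node periodically picks a random peer, with no central server) can be illustrated with scalar "models" that two peers merge by averaging. This is a toy sketch of the protocol's shape under that simplification, not the thesis's Word2Vec training:

```python
import random

def gossip_round(models: list, rng: random.Random) -> None:
    """One gossip round: every node picks a random peer other than
    itself and both replace their models with the pairwise average."""
    n = len(models)
    for i in range(n):
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1                      # ensure the peer is not node i itself
        merged = (models[i] + models[j]) / 2
        models[i] = models[j] = merged

# Ten nodes start from different local models; repeated gossip drives
# them toward consensus without any central coordinator.
rng = random.Random(0)
models = [float(i) for i in range(10)]
target = sum(models) / len(models)
for _ in range(50):
    gossip_round(models, rng)
```

Because each pairwise merge preserves the sum of the two models, the network converges to the global average, which is the property that lets decentralized training approximate its centralized counterpart.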
74

Antecedents and Outcomes of Perceived Creepiness in Online Personalized Communications

Stevens, Arlonda M. 01 June 2016
No description available.
75

Exploring User Trust in Natural Language Processing Systems : A Survey Study on ChatGPT Users

Aronsson Bünger, Morgan January 2024
ChatGPT has become a popular technology and has gained a considerable user base because of its power to effectively generate responses to users' requests. However, as ChatGPT's popularity has grown, and as other natural language processing systems (NLPs) are being developed and adopted, several concerns have been raised about the technology that could have implications for user trust. Because trust plays a central role in users' willingness to adopt artificial intelligence (AI) systems, and because there is no consensus in research on what facilitates trust, it is important to conduct more research to identify the factors that affect user trust in AI systems, especially modern technologies such as NLPs. The aim of the study was therefore to identify the factors that affect user trust in NLPs. The literature on trust and artificial intelligence indicated that there may be a relationship between trust and transparency, explainability, accuracy, reliability, automation, augmentation, anthropomorphism, and data privacy. These factors were studied quantitatively together in order to uncover what affects user trust in NLPs. The results indicated that transparency, accuracy, reliability, automation, augmentation, anthropomorphism, and data privacy all have a positive impact on user trust in NLPs, which partly supports and partly contradicts previous findings in the literature.
76

Privacy preserving software engineering for data driven development

Tongay, Karan Naresh 14 December 2020
The exponential rise in the generation of data has introduced many new areas of research, including data science, data engineering, machine learning, and artificial intelligence, to name a few. It has become important for any industry or organization to precisely understand and analyze its data in order to extract value from it. The value of data can only be realized when it is put into practice in the real world, and the most common approach to doing this in the technology industry is through software engineering. This brings into the picture the area of privacy-oriented software engineering, spurred by the rise of data protection regulations such as the GDPR (General Data Protection Regulation) and the PDPA (Personal Data Protection Act). Many organizations, governments, and companies that have accumulated huge amounts of data over time may conveniently use the data to increase business value, but at the same time the privacy aspects associated with the sensitivity of the data, especially personal information, can easily be circumvented while designing a software engineering model for these types of applications. Even before the software engineering phase of any data processing application, there can often be one or more data sharing agreements or privacy policies in place. Every organization may have its own way of maintaining data privacy practices for data-driven development. There is a need to generalize or categorize these approaches into tactics that can be referred to by other practitioners trying to integrate data privacy practices into their development. This qualitative study provides an understanding of various approaches and tactics that are practised within the industry for privacy-preserving data science in software engineering, and discusses a tool for data usage monitoring to identify unethical data access. Finally, we studied strategies for secure data publishing and conducted experiments using sample data to demonstrate how these techniques can help secure private data before publishing.
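One widely known baseline among the secure data publishing strategies the abstract mentions is k-anonymity, achieved by generalizing quasi-identifiers and suppressing records in groups smaller than k. The sketch below uses invented sample records and illustrates the general technique, not the thesis's specific experiments:

```python
from collections import Counter

def generalize_age(age: int, width: int = 10) -> str:
    """Generalize an exact age to a decade-wide bucket, e.g. 23 -> '20-29'."""
    lo = (age // width) * width
    return f"{lo}-{lo + width - 1}"

def k_anonymize(records, k=2):
    """Generalize quasi-identifiers (age bucket, zip prefix) and suppress
    any record whose generalized group is still smaller than k."""
    generalized = [
        {"age": generalize_age(r["age"]),
         "zip": r["zip"][:3] + "**",
         "diagnosis": r["diagnosis"]}
        for r in records
    ]
    groups = Counter((g["age"], g["zip"]) for g in generalized)
    return [g for g in generalized if groups[(g["age"], g["zip"])] >= k]

# Invented sample data: the third record is unique even after
# generalization, so it is suppressed before publishing.
records = [
    {"age": 23, "zip": "12345", "diagnosis": "flu"},
    {"age": 27, "zip": "12399", "diagnosis": "cold"},
    {"age": 61, "zip": "98801", "diagnosis": "asthma"},
]
published = k_anonymize(records, k=2)
```

The trade-off shown here, that stronger generalization protects more individuals but destroys more analytical value, is the same utility/privacy tension the thesis discusses for data-driven development.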
77

Our Humanity Exposed : Predictive Modelling in a Legal Context

Greenstein, Stanley January 2017
This thesis examines predictive modelling from the legal perspective. Predictive modelling is a technology based on applied statistics, mathematics, machine learning and artificial intelligence that uses algorithms to analyse big data collections and identify patterns that are invisible to human beings. The accumulated knowledge is incorporated into computer models, which are then used to identify and predict human activity in new circumstances, allowing for the manipulation of human behaviour. Predictive models use big data to represent people. Big data is a term used to describe the large amounts of data produced in the digital environment. It is growing rapidly, due mainly to the fact that individuals are spending an increasing portion of their lives within the on-line environment, spurred by the internet and social media. As individuals make use of the on-line environment, they part with information about themselves. This information may concern their actions but may also reveal their personality traits. Predictive modelling is a powerful tool, which private companies are increasingly using to identify business risks and opportunities. Predictive models are incorporated into on-line commercial decision-making systems, determining, among other things, the music people listen to, the news feeds they receive, the content they see and whether they will be granted credit. This results in a number of potential harms to the individual, especially in relation to personal autonomy. This thesis examines the harms resulting from predictive modelling, some of which are recognized by traditional law. Using the European legal context as a point of departure, this study ascertains to what extent legal regimes address the use of predictive models and the threats to personal autonomy. In particular, it analyses Article 8 of the European Convention on Human Rights (ECHR) and the forthcoming General Data Protection Regulation (GDPR) adopted by the European Union (EU).
Considering the shortcomings of traditional legal instruments, a strategy entitled ‘empowerment’ is suggested. It comprises components of a legal and technical nature, aimed at levelling the playing field between companies and individuals in the commercial setting. Is there a way to strengthen humanity as predictive modelling continues to develop?
78

Data Protection in Transit and at Rest with Leakage Detection

Ulybyshev, Denis A. 15 May 2019
In service-oriented architecture, services can communicate and share data among themselves. This thesis presents a solution that allows detecting several types of data leakages made by authorized insiders to unauthorized services. My solution provides role-based and attribute-based access control for data, so that each service can access only those data subsets for which it is authorized, considering a context and the service's attributes, such as the security level of the web browser and the trust level of the service. My approach provides data protection in transit and at rest for both centralized and peer-to-peer service architectures. The methodology ensures confidentiality and integrity of data, including data stored in an untrusted cloud. In addition to protecting data against malicious or curious cloud or database administrators, the capability of searching through encrypted data using SQL queries and building analytics over encrypted data is supported. My solution is implemented in the "WAXEDPRUNE" (Web-based Access to Encrypted Data Processing in Untrusted Environments) project, funded by the Northrop Grumman Cybersecurity Research Consortium. The WAXEDPRUNE methodology is illustrated in this thesis for two use cases: a Hospital Information System with secure storage and exchange of Electronic Health Records, and a Vehicle-to-Everything communication system with secure exchange of vehicles' and drivers' data, as well as data on road events and road hazards.
To help investigate data leakage incidents in service-oriented architecture, the integrity of provenance data needs to be guaranteed. For that purpose, I integrate WAXEDPRUNE with the IBM Hyperledger Fabric blockchain network, so that every data access, transfer or update is recorded in a public blockchain ledger, is non-repudiable and can be verified at any time in the future. The work on this project, called "Blockhub," is in progress.
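The role- and attribute-based filtering the thesis describes (each service sees only the data subset it is authorized for, conditioned on attributes such as trust level) can be sketched roughly as follows. The policy contents, field names and trust scale are invented for illustration and are not WAXEDPRUNE's actual scheme:

```python
# Illustrative policy: which record fields each role may read, plus an
# attribute check (minimum trust level) evaluated per request.
POLICY = {
    "doctor":  {"fields": {"name", "diagnosis", "medication"}, "min_trust": 2},
    "billing": {"fields": {"name", "insurance_id"},            "min_trust": 1},
}

def read_record(record: dict, role: str, trust_level: int) -> dict:
    """Return only the subset of fields the role is authorized to see,
    and nothing at all if the caller's trust attribute is too low."""
    rule = POLICY.get(role)
    if rule is None or trust_level < rule["min_trust"]:
        return {}
    return {k: v for k, v in record.items() if k in rule["fields"]}

# A toy electronic health record; different callers see different views.
ehr = {"name": "Alice", "diagnosis": "flu",
       "medication": "x", "insurance_id": "A-1"}
```

For example, a doctor with sufficient trust sees clinical fields, a billing service sees only identity and insurance fields, and an unknown role or an untrusted caller receives nothing.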
79

Bezpečnost jako významný faktor rozvoje cestovního ruchu v České republice / Safety as an important factor for tourism development in Czech Republic

Šteflová, Lucie January 2011
The main objective of this diploma thesis is to analyse the tourism safety situation and CzechTourism's claim that the Czech Republic is "a safe destination" to visit. It focuses on questions of tourism safety and security and their main forms, and on various international safety analyses and reports, with emphasis on the situation in the Czech Republic. The CzechTourism claim is examined by means of a direct survey among foreigners, whose results point to the potential development of the tourism safety situation in the Czech Republic. Finally, the evolution of the situation in the Czech Republic is traced using different tourism and peace indicators, and the direct dependence "safe country / number of arrivals" is investigated.
80

Real-time forecasting of dietary habits and user health using Federated Learning with privacy guarantees

Horchidan, Sonia-Florina January 2020
Modern health self-monitoring devices and applications, such as Fitbit and MyFitnessPal, empower users to take concrete actions and set fitness and lifestyle goals based on their recorded trends and statistics. Predicting such trends is beneficial on the road to achieving long-term targets, as individuals can adjust their diets and habits at any point to guarantee success. The design and implementation of such a system, which also respects user privacy, is the main objective of our work. This application is modelled as a time-series forecasting problem. Given the historical data of users, we aim to predict their eating and lifestyle habits in real time. We apply the federated learning paradigm to our use case because of the highly distributed nature of our data and the privacy concerns around such sensitive recorded information. However, federated learning from heterogeneous sequences of data can be challenging, as even state-of-the-art machine learning techniques for time-series forecasting can encounter difficulties when learning from very irregular data sequences. Specifically, in the proposed healthcare scenario, the machine learning algorithms might fail to cater to users with unique dietary patterns. In this work, we implement a two-step streaming clustering mechanism and group clients that exhibit similar eating and fitness behaviours. The conducted experiments prove that learning federatively in this context can achieve very high prediction accuracy, as our predictions are no more than 0.025% away from the ground-truth value with respect to the range of each feature. Training separate models for each group of users is shown to be beneficial, especially in terms of training time, but it is highly dependent on the parameters used for the models and the training process. Our experiments conclude that the configuration used for the general federated model cannot be applied to the clusters of data. However, a decrease in prediction error of more than 45% can be achieved, given that the parameters are optimized for each case. Lastly, this work tackles the problem of data privacy by applying state-of-the-art differential privacy techniques. Our empirical study shows that noising the gradients sent to the server is unsuitable for small datasets and cancels out the benefits obtained by prior clustering of users. On the other hand, noising the training data achieves remarkable results, obtaining a differential privacy level corresponding to an epsilon value of 0.1 with an increase in the observed mean absolute error by a factor of only 0.21.
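The final comparison in the abstract (noising gradients versus noising the training data) rests on standard differential privacy machinery. Below is a minimal sketch of the input-perturbation side: adding Laplace noise with scale sensitivity/epsilon to each local value before it is used for training. The feature values and the simplified per-value sensitivity are illustrative assumptions, not the thesis's exact mechanism:

```python
import math
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) by inverse-transform sampling."""
    u = rng.random() - 0.5                      # uniform on [-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def privatize(values, epsilon: float, sensitivity: float, rng: random.Random):
    """Input perturbation: add Laplace(sensitivity/epsilon) noise to every
    value, so the raw record never leaves the client unnoised."""
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]

# Toy daily-calorie features for one client; epsilon = 0.1 matches the
# privacy level discussed in the abstract (the sensitivity is assumed).
rng = random.Random(42)
calories = [2100.0, 1850.0, 2400.0, 1950.0]
noisy = privatize(calories, epsilon=0.1, sensitivity=1.0, rng=rng)
```

Smaller epsilon means larger noise scale and stronger privacy, which is the utility cost the abstract quantifies as an increase in mean absolute error.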
