  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
61

A Framework To Implement OpenID Connect Protocol For Federated Identity Management In Enterprises

Rasiwasia, Akshay January 2017
Federated Identity Management (FIM) and Single-Sign-On (SSO) concepts improve both productivity and security for organizations by assigning the responsibility for user data management and authentication to a single central entity called the identity provider; consequently, users have to maintain only one set of credentials to access resources at multiple service providers. The implementation of any FIM and SSO protocol is complex due to the involvement of multiple organizations, sensitive user data, and myriad security issues. There are many instances of faulty implementations that compromised on security for ease of implementation due to a lack of proper guidance. OpenID Connect (OIDC) is the latest protocol for implementing Federated Identity Management; it is an open standard, lightweight, and platform independent, offers several advantages over the legacy protocols, and is expected to see widespread use. An implementation framework that addresses all the important aspects of the FIM lifecycle is required to ensure the proper application of the OIDC protocol at the enterprise level. In this research work, an implementation framework was designed for the OIDC protocol by incorporating all the important requirements from the managerial, technical, and security perspectives of enterprise-level federated identity management. The research work closely follows the design science research process, and the framework was evaluated for its completeness, efficiency, and usability.
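As a rough illustration of the first step a service provider takes in an OIDC authorization code flow, the sketch below builds the authentication request that redirects a user to the identity provider. The parameter names come from the OpenID Connect Core specification; the endpoint URL, client identifier, and redirect URI are hypothetical placeholders, and nothing here is taken from the thesis framework itself.

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs


def build_auth_request(authorization_endpoint, client_id, redirect_uri):
    """Return (url, state, nonce) for an OIDC authorization code flow request."""
    # state ties the later callback to this browser session (CSRF protection).
    state = secrets.token_urlsafe(16)
    # nonce is echoed back inside the ID token so replayed tokens can be detected.
    nonce = secrets.token_urlsafe(16)
    params = {
        "response_type": "code",      # ask for an authorization code
        "scope": "openid profile",    # "openid" marks this as an OIDC request
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
        "nonce": nonce,
    }
    return f"{authorization_endpoint}?{urlencode(params)}", state, nonce


# Hypothetical identity provider and service provider values.
url, state, nonce = build_auth_request(
    "https://idp.example.com/authorize",
    "enterprise-portal",
    "https://app.example.com/callback",
)
query = parse_qs(urlparse(url).query)
print(query["scope"], query["response_type"])
```

The service provider would store `state` and `nonce` in the user's session and compare them against the values returned in the callback and the ID token before trusting the login.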
62

Identifying, Relating, Consisting and Querying Large Heterogeneous RDF Sources

VALDESTILHAS, ANDRE 12 January 2021
The Linked Data concept relies on a collection of best practices to publish and link structured web-based data. The number of available datasets has been growing significantly over the last decades. These datasets are interconnected and now represent the well-known Web of Data, an extensive collection of concise and detailed interlinked datasets from multiple domains. Thus, linking entries across heterogeneous data sources such as databases or knowledge bases becomes an increasing challenge. However, connections between datasets play a leading role in significant activities such as cross-ontology question answering, large-scale inference, and data integration. In Linked Data, Linksets are well known for executing the task of generating links between datasets. The heterogeneity of the datasets is reflected in their structure, making it hard to find relations among those datasets, i.e., to identify how similar they are. In this way, we can say that Linked Data involves Datasets and Linksets, and those Linksets need to be maintained. This lack of information led us to the issue addressed in this thesis: how to identify and query datasets from a huge heterogeneous collection of RDF (Resource Description Framework) datasets. To address this issue, we need to assure consistency and to know how the datasets are related and how similar they are. As a result, to deal with the need for identifying LOD (Linked Open Data) datasets, we created an approach called WIMU, a regularly updated database index of more than 660K datasets from LODStats and LOD Laundromat. It is an efficient, low-cost, and scalable web service that shows which dataset most likely defines a URI, along with various statistics of the indexed datasets.
To integrate and to query LOD datasets, we provide a hybrid SPARQL query processing engine that can retrieve results from 559 active SPARQL endpoints (with a total of 163.23 billion triples) and 668,166 datasets (with a total of 58.49 billion triples) from LODStats and LOD Laundromat. To assure consistency of the semantic web Linked repositories where these LOD datasets are located, we create an approach for the mitigation of the identifier heterogeneity problem and implement a prototype where the user can evaluate existing links as well as suggest new links to be rated, together with a time-efficient algorithm for the detection of erroneous links in large-scale link repositories without computing all closures required by the property axiom. To know how the datasets are related and how similar they are, we provide a string similarity algorithm called Most Frequent K Characters, which is based on two nested filters, (1) the First Frequency Filter and (2) the Hash Intersection Filter, that allow discarding candidates before calculating the actual similarity value, thus giving a considerable performance gain. This allows building a LOD Dataset Relation Index, which provides information about how similar all the datasets from the LOD cloud are, including statistics about the current state of those datasets. The work in this thesis showed that to identify and query LOD datasets, we need to know how those datasets are related, assuring consistency. Our analysis demonstrated that most of the datasets are disconnected from others and need to pass through a consistency and linking process to integrate them, providing a way to query a large number of datasets simultaneously. The information contained in this thesis is an essential step towards identifying, relating, and querying datasets on the Web of Data, and a considerable step towards fully queryable LOD datasets.
Table of contents:
1 Introduction and Motivation
  1.1 The need for identifying and querying LOD datasets
  1.2 The need for consistency of semantic web Linked repositories
  1.3 The need for Relation and integration of LOD datasets
  1.4 Research Questions and Contributions
  1.5 Methodology and Contributions
  1.6 General Use Cases
    1.6.1 The Heloise project
  1.7 Chapter overview
2 Preliminaries
  2.1 Semantic Web
    2.1.1 URIs and URLs
    2.1.2 Linked Data
    2.1.3 Resource Description Framework
    2.1.4 Ontologies
  2.2 RDF graph
  2.3 Transitive property
  2.4 Equivalence
  2.5 Linkset
  2.6 RDF graph partitioning
  2.7 Basic Graph Pattern
  2.8 RDF Dataset
  2.9 SPARQL
  2.10 Federated Queries
3 State of the Art
  3.1 Identifying Datasets in Large Heterogeneous RDF Sources
  3.2 Relating Large amount of RDF datasets
    3.2.1 Obtaining Similar Resources using String Similarity
  3.3 Consistency on Large amount of RDF sources
    3.3.1 Heterogeneity in DBpedia Identifiers
    3.3.2 Detection of Erroneous Links in Large-Scale RDF Datasets
  3.4 Querying Large Heterogeneous RDF Datasets
4 Relation Among Large Amount of RDF Sources
  4.1 Identifying Datasets in Large Heterogeneous RDF sources
    4.1.1 The WIMU approach
    4.1.2 The approach
    4.1.3 Use cases
    4.1.4 Evaluation: Statistics about the Datasets
  4.2 Relating RDF sources
    4.2.1 The ReLOD approach
    4.2.2 The approach
    4.2.3 Evaluation
  4.3 Relating Similar Resources using String Similarity
    4.3.1 The MFKC approach
    4.3.2 Approach
    4.3.3 Correctness and Completeness
    4.3.4 Evaluation
5 Consistency in Large Amount of RDF Sources
  5.1 Consistency in Heterogeneous DBpedia Identifiers
    5.1.1 The DBpediaSameAs approach
    5.1.2 Representation of the idea
    5.1.3 The work-flow
    5.1.4 Methodology
    5.1.5 Evaluation
    5.1.6 Normalization on DBpedia URIs
    5.1.7 Rate the links
    5.1.8 Results
    5.1.9 Discussion
  5.2 Consistency in Large-Scale RDF sources: Detection of Erroneous Links
    5.2.1 The CEDAL approach
    5.2.2 Method
    5.2.3 Error Types and Quality Measure for Linkset Repositories
    5.2.4 Evaluation
    5.2.5 Experimental setup
  5.3 Detecting Erroneous Link candidates in Educational Link Repositories
    5.3.1 The CEDAL education approach
    5.3.2 Research questions
    5.3.3 Our contributions
    5.3.4 Evaluation
6 Querying Large Amount of Heterogeneous RDF Datasets
  6.1 Introduction
  6.2 Definitions
  6.3 The WimuQ
  7.1 Identifying Datasets in Large Heterogeneous RDF Sources
  7.2 Relating Large Amount of RDF Datasets
  7.3 Obtaining Similar Resources Using String Similarity
  7.4 Heterogeneity in DBpedia Identifiers
  7.5 Detection of Erroneous Links in Large-Scale RDF Datasets
  7.7 Querying Large Heterogeneous RDF Datasets
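The abstract above describes Most Frequent K Characters as a string similarity measure with filters that discard candidates before the full similarity is computed. The sketch below is a simplified variant for intuition only: the scoring rule (summing the counts of shared top-k characters) and the `may_reach` upper-bound check are assumptions made here, and they are not the thesis's First Frequency Filter or Hash Intersection Filter.

```python
from collections import Counter


def top_k_chars(text, k):
    """Map the k most frequent characters of text to their counts."""
    return dict(Counter(text).most_common(k))


def mfkc_similarity(a, b, k=2):
    """Sum the counts of characters that appear in both top-k profiles."""
    profile_a = top_k_chars(a, k)
    profile_b = top_k_chars(b, k)
    shared = profile_a.keys() & profile_b.keys()
    return sum(profile_a[c] + profile_b[c] for c in shared)


def may_reach(a, b, k, threshold):
    """Cheap pre-check: the score can never exceed the total of both profiles.

    If even that upper bound is below the threshold, the pair can be
    discarded without intersecting the profiles.
    """
    bound = sum(top_k_chars(a, k).values()) + sum(top_k_chars(b, k).values())
    return bound >= threshold


print(mfkc_similarity("aaabbc", "aabbbd", k=2))    # shares 'a' and 'b'
print(may_reach("aaab", "cccd", k=1, threshold=7))  # bound is 6, so discard
```

The point of the pre-check is the one made in the abstract: a bound that is cheaper than the similarity itself lets most candidate pairs be rejected early when comparing large collections of strings.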
63

A Study on Federated Learning Systems in Healthcare

Smith, Arthur, M.D. 18 August 2021
No description available.
64

Federated Emotion Recognition with Physiological Signals- GSR

Hassani, Tara January 2021
Background: Human-computer interaction (HCI) is one of the everyday emotion-triggering events in today’s world, and researchers in this area have been exploring different techniques to enhance emotional ability in computers. Due to privacy concerns and the laboratory's limited capability for gathering data from a large number of users, common machine learning techniques that are extensively used in emotion recognition tasks lack adequate data collection. To address these issues, we propose a decentralized framework based on the Federated Learning architecture where raw data is collected and analyzed locally. The results of these analyses are transferred, as a large number of updates, to a server that aggregates them into a global model for the emotion recognition task, using only Galvanic Skin Response (GSR) signals and their extracted features. Objectives: This thesis aims to explore how a CNN-based federated learning approach can be used in emotion recognition while protecting data privacy, and to investigate whether it reaches the same performance as a basic centralized CNN. Methods: To investigate the effect of the proposed method in emotion recognition, two architectures, centralized and federated, are designed with the CNN model. The results of these two architectures are then compared to each other. The dataset used in our work is the CASE dataset. In the federated architecture, we employ neurons and weights to train the models instead of the raw data used in the centralized architecture. Results: The performance results indicate that the proposed model not only works well but also performs better than some other related methods regarding valence accuracy. In addition, it can collect more data from various sources and protects sensitive user data better by supporting tighter privacy regulations.
The physiological data is inherently anonymous, but when it is used with other modalities such as video or voice, maintaining the same anonymity is challenging. Conclusions: This thesis concludes that the federated CNN-based model can be used in emotion recognition systems and obtains the same accuracy as the centralized architecture. For valence classification, it outperforms some other state-of-the-art methods. Meanwhile, its federated nature can provide better privacy protection and data diversity for the emotion recognition system.
65

Metadata Management in Multi-Grids and Multi-Clouds

Espling, Daniel January 2011
Grid computing and cloud computing are two related paradigms used to access and use vast amounts of computational resources. The resources are often owned and managed by a third party, relieving the users from the costs and burdens of acquiring and managing a considerably large infrastructure themselves. Commonly, the resources are either contributed by different stakeholders participating in shared projects (grids), or owned and managed by a single entity and made available to its users with charging based on actual resource consumption (clouds). Individual grid or cloud sites can form collaborations with other sites, giving each site access to more resources that can be used to execute tasks submitted by users. There are several different models of collaboration between sites, each suitable for different scenarios and each posing additional requirements on the underlying technologies. Metadata concerning the status and resource consumption of tasks is created during the execution of the task on the infrastructure. This metadata is used as the primary input in many core management processes, e.g., as a base for accounting and billing, as input when prioritizing and placing incoming tasks, and as a base for managing the amount of resources allocated to different tasks. Focusing on management and utilization of metadata, this thesis contributes to a better understanding of the requirements and challenges imposed by different collaboration models in both grids and clouds. The underlying design criteria and resulting architectures of several software systems are presented in detail. Each system addresses different challenges imposed by cross-site grid and cloud architectures: The LUTSfed approach provides a lean and optional mechanism for filtering and management of usage data between grid or cloud sites.
An accounting and billing system natively designed to support cross-site clouds demonstrates usage data management despite unknown placement and dynamic task resource allocation. The FSGrid system enables fairshare job prioritization across different grid sites, mitigating the problems of heterogeneous scheduling software and local management policies. The results and experiences from these systems are both theoretical and practical, as full-scale implementations of each system have been developed and analyzed as a part of this work. Early theoretical work on structure-based service management forms a foundation for future work on structure-aware service placement in cross-site clouds.
66

Implementation of Federated Learning on Raspberry Pi Boards : Implementation of Federated Learning on Raspberry Pi Boards with Paillier Encryption

Wang, Wenhao January 2021
The development of innovative applications of Artificial Intelligence (AI) is inseparable from the sharing of public data. However, as people become more aware of the need to protect personal data privacy, it is increasingly difficult to collect data from multiple data sources, and unified data management also carries a risk of leakage. Yet neural networks need a lot of data for model learning and analysis. Federated learning (FL) can address these difficulties: it allows the server to learn from the local data of multiple clients without collecting it. This thesis deploys FL on the Raspberry Pi (RPi) and uses federated averaging (FedAvg) as the aggregation method. First, in simulation, we compare FL with centralized learning (CL). Then we build a reliable socket-based communication system on a testbed and implement FL on those devices. In addition, the Paillier encryption algorithm is configured for the communication in FL so that model parameters are not exposed directly to the public network. In other words, the project builds a complete and secure FL system based on hardware.
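To make the combination of FedAvg and Paillier encryption more concrete, the sketch below shows why an additively homomorphic scheme suits aggregation: a server can sum encrypted weights without decrypting any single client's value. It is a toy Paillier implementation with the common simplification g = n + 1, two small Mersenne primes as key material, and a fixed-point encoding chosen here for illustration. None of it is taken from the thesis implementation, and the key size is far too small for real use.

```python
import math
import secrets

# Toy key material: two well-known Mersenne primes (insecure, demo only).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N_SQ = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)   # decryption constant; this form relies on g = n + 1
SCALE = 10**6          # fixed-point precision for float weights (assumption)


def encrypt(value):
    """Encrypt a float weight as a Paillier ciphertext."""
    m = round(value * SCALE) % N          # negatives wrap around modulo N
    while True:
        r = secrets.randbelow(N - 1) + 1  # random blinding factor in [1, N)
        if math.gcd(r, N) == 1:
            break
    # With g = n + 1, g^m mod n^2 simplifies to 1 + m * n.
    return (1 + m * N) * pow(r, N, N_SQ) % N_SQ


def decrypt(cipher):
    """Recover the float weight (or sum of weights) from a ciphertext."""
    m = (pow(cipher, LAM, N_SQ) - 1) // N * MU % N
    if m > N // 2:                        # map the upper half back to negatives
        m -= N
    return m / SCALE


def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % N_SQ


# Three clients each send one encrypted weight; the server only multiplies.
client_weights = [0.25, -0.5, 1.0]
encrypted_sum = encrypt(client_weights[0])
for weight in client_weights[1:]:
    encrypted_sum = add_encrypted(encrypted_sum, encrypt(weight))

# Only the key holder can decrypt the sum and form the unweighted average.
average = decrypt(encrypted_sum) / len(client_weights)
print(average)
```

A real deployment would apply this per model parameter, weight each client by its sample count as FedAvg does, and use a vetted library with keys of at least 2048 bits.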
67

UNIFYING DISTILLATION WITH PERSONALIZATION IN FEDERATED LEARNING

Siddharth Divi (10725357) 29 April 2021
Federated learning (FL) is a decentralized privacy-preserving learning technique in which clients learn a joint collaborative model through a central aggregator without sharing their data. In this setting, all clients learn a single common predictor (FedAvg), which does not generalize well on each client's local data due to the statistical data heterogeneity among clients. In this paper, we address this problem with PersFL, a discrete two-stage personalized learning algorithm. In the first stage, PersFL finds the optimal teacher model of each client during the FL training phase. In the second stage, PersFL distills the useful knowledge from optimal teachers into each user's local model. The teacher model provides each client with some rich, high-level representation that a client can easily adapt to its local model, which overcomes the statistical heterogeneity present at different clients. We evaluate PersFL on CIFAR-10 and MNIST datasets using three data-splitting strategies to control the diversity between clients' data distributions.
We empirically show that PersFL outperforms FedAvg and three state-of-the-art personalization methods, pFedMe, Per-FedAvg, and FedPer, on the majority of data splits with minimal communication cost. Further, we study the performance of PersFL on different distillation objectives, how this performance is affected by the equitable notion of fairness among clients, and the number of required communication rounds. We also build an evaluation framework with the following modules: Data Generator, Federated Model Generation, and Evaluation Metrics. We introduce new metrics for the domain of personalized FL, and split these metrics into two perspectives: Performance and Fairness. We analyze the performance of all the personalized algorithms by applying these metrics to answer the following questions: Which personalization algorithm performs best in terms of accuracy across all users? Which personalization algorithm is the fairest of them all? Finally, we make the code for this work available at https://tinyurl.com/1hp9ywfa for public use and validation.
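The second stage described above distills knowledge from a teacher model into a client's local model. As a minimal sketch of what a distillation objective can look like, the code below implements the widely used temperature-softened formulation (cross-entropy on the true label plus a KL term between teacher and student distributions). The abstract says PersFL is studied under several distillation objectives, so this specific loss, the `temperature` and `alpha` defaults, and the function names are illustrative assumptions and not the paper's exact objective.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Blend the hard-label loss with a soft teacher-matching loss."""
    # Hard term: cross-entropy of the student against the true label.
    hard = -math.log(softmax(student_logits)[label])
    # Soft term: KL(teacher || student) on temperature-softened outputs.
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s) for t, s in zip(teacher_soft, student_soft))
    # The temperature**2 factor keeps the soft term's gradient scale comparable.
    return alpha * hard + (1 - alpha) * temperature**2 * soft


student = [1.0, 2.0, 0.5]
teacher = [0.5, 3.0, 0.2]
print(distillation_loss(student, teacher, label=1))
```

Setting `alpha` closer to 1 trusts the local labels more, while a value closer to 0 pulls the local model towards the teacher's output distribution.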
68

Towards Peer-to-Peer Federated Learning: Algorithms and Comparisons to Centralized Federated Learning

Mäenpää, Dylan January 2021
Due to privacy and regulatory reasons, sharing data between institutions can be difficult. Because of this, real-world data are not fully exploited by machine learning (ML). An emerging method is to train ML models with federated learning (FL), which enables clients to collaboratively train ML models without sharing raw training data. We explored peer-to-peer FL by extending a prominent centralized FL algorithm called Fedavg to function in a peer-to-peer setting. We named this extended algorithm FedavgP2P. Deep neural networks at 100 simulated clients were trained to recognize digits using FedavgP2P and the MNIST data set. Scenarios with IID and non-IID client data were studied. We compared FedavgP2P to Fedavg with respect to the models' convergence behaviors and communication costs. Additionally, we analyzed the connection between local client computation and the number of neighbors each client communicates with, and how these affect performance. We also attempted to improve the FedavgP2P algorithm with heuristics based on client identities and per-class F1-scores. The findings showed that with FedavgP2P, the mean model convergence behavior was comparable to that of a model trained with Fedavg. However, this came with varying degrees of variation in the 100 models' convergence behaviors and much greater communication costs (at least 14.9x more communication with FedavgP2P). By increasing the amount of local computation up to a certain level, communication costs could be reduced. Increasing the number of neighbors a client communicated with led to lower variation in the models' convergence behaviors. The FedavgP2P heuristics did not show improved performance. In conclusion, the overall findings indicate that peer-to-peer FL is a promising approach.
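The peer-to-peer setting described above replaces the central server with averaging among neighbouring clients. The sketch below shows only that averaging step, under assumptions made here for illustration: a ring topology, equal weighting of neighbours, and no local training between rounds. It is not the FedavgP2P algorithm from the thesis, whose details the abstract does not give.

```python
def neighbours_on_ring(client, n_clients, n_neighbours):
    """Pick the next n_neighbours clients on a ring (a hypothetical topology)."""
    return [(client + step) % n_clients for step in range(1, n_neighbours + 1)]


def p2p_round(models, n_neighbours):
    """Each client replaces its model with the mean of itself and its neighbours."""
    n_clients = len(models)
    updated = []
    for client in range(n_clients):
        group = [client] + neighbours_on_ring(client, n_clients, n_neighbours)
        size = len(models[client])
        updated.append(
            [sum(models[g][i] for g in group) / len(group) for i in range(size)]
        )
    return updated


def spread(models):
    """Gap between the largest and smallest first parameter across clients."""
    firsts = [m[0] for m in models]
    return max(firsts) - min(firsts)


# Six clients, each holding a one-parameter "model".
models = [[float(i)] for i in range(6)]
few = p2p_round(models, n_neighbours=2)
many = p2p_round(models, n_neighbours=5)
print(spread(models), spread(few), spread(many))
```

In this toy setup, averaging with more neighbours brings the client models closer together in a single round, which is in the same spirit as the reported link between neighbour count and variation, at the price of more messages per round.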
69

Applied Machine Learning for Online Education

Serena Alexis Nicoll (12476796) 28 April 2022
We consider the problem of developing innovative machine learning tools for online education and evaluate their ability to provide instructional resources. Prediction tasks for student behavior are a complex problem spanning a wide range of topics: we complement current research in student grade prediction and clickstream analysis by considering data from three areas of online learning: Social Learning Networks (SLN), Instructor Feedback, and Learning Management Systems (LMS). In each of these categories, we propose a novel method for modelling data and an associated tool that may be used to assist students and instructors. First, we develop a methodology for analyzing instructor-provided feedback and determining how it correlates with changes in student grades using NLP and NER-based feature extraction. We demonstrate that student grade improvement can be well approximated by a multivariate linear model with average fits across course sections approaching 83%, and determine several contributors to student success. Additionally, we develop a series of link prediction methodologies that utilize spatial and time-evolving network architectures to pass network state between space and time periods. Through evaluation on six real-world datasets, we find that our method obtains substantial improvements over Bayesian models, linear classifiers, and an unsupervised baseline, with AUCs typically above 0.75 and reaching 0.99. Motivated by Federated Learning, we extend our model of student discussion forums to model an entire classroom as an SLN. We develop a methodology to represent student actions across different course materials in a shared, low-dimensional space that allows characteristics from actions of different types to be passed jointly to a downstream task.
Performance comparisons against several baselines in centralized, federated, and personalized learning demonstrate that our model offers more distinctive representations of students in a low-dimensional space, which in turn results in improved accuracy on a common downstream prediction task. Results from these three research thrusts indicate the ability of machine learning methods to accurately model student behavior across multiple data types and suggest their ability to benefit students and instructors alike through future development of assistive tools.
70

Privacy-Preserved Federated Learning : A survey of applicable machine learning algorithms in a federated environment

Carlsson, Robert January 2020
There is potential for collaborative machine learning in the fields of medicine and finance. These areas gather data that can be used to develop machine learning models that could predict everything from sickness in patients to economic crimes such as fraud. The problem is that the collected data is mostly confidential and should be handled with caution. This makes the standard way of doing machine learning, gathering data at one centralized server, undesirable. The safety of the data has to be taken into account. In this project we explore the Federated Learning approach of ”bringing the code to the data, instead of data to the code”. It is a decentralized way of doing machine learning where models are trained on connected devices and data is never shared, keeping the data privacy-preserved.
