31

Generating Synthetic Schematics with Generative Adversarial Networks

Daley Jr, John January 2020 (has links)
This study investigates synthetic schematic generation using conditional generative adversarial networks; specifically, the Pix2Pix algorithm was implemented for the experimental phase of the study. With the increase in deep neural networks' capabilities and availability, there is a demand for large datasets. This, in combination with increased privacy concerns, has led to the utilization of synthetic data generation. Analysis of the synthetic images was completed using a survey: the generated blueprint images passed as genuine at a rate of 40%. This study confirms the ability of generative neural networks to produce synthetic blueprint images.
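The conditional-GAN setup behind Pix2Pix can be illustrated with a minimal PyTorch sketch: the discriminator judges (condition, image) pairs while the generator is trained with an adversarial loss plus an L1 term. The tiny networks, tensor shapes, and the loss weight `lambda_l1 = 100` below are illustrative assumptions, not details from the thesis.

```python
# Conditional GAN in the spirit of Pix2Pix: the discriminator sees
# (condition, image) pairs; the generator gets adversarial + L1 loss.
import torch
import torch.nn as nn

G = nn.Sequential(  # toy generator: condition image -> output image
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1), nn.Tanh(),
)
D = nn.Sequential(  # toy PatchGAN-style discriminator on (condition, image)
    nn.Conv2d(2, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(32, 1, 4, stride=2, padding=1),  # per-patch logits
)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce, l1, lambda_l1 = nn.BCEWithLogitsLoss(), nn.L1Loss(), 100.0

for _ in range(100):
    cond = torch.rand(8, 1, 64, 64)          # stand-in for input sketches
    real = torch.rand(8, 1, 64, 64) * 2 - 1  # stand-in for real blueprints
    fake = G(cond)

    # Discriminator update: real pairs -> 1, fake pairs -> 0.
    d_real = D(torch.cat([cond, real], dim=1))
    d_fake = D(torch.cat([cond, fake.detach()], dim=1))
    loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator update: fool D while staying close to the target in L1.
    d_fake = D(torch.cat([cond, fake], dim=1))
    loss_g = bce(d_fake, torch.ones_like(d_fake)) + lambda_l1 * l1(fake, real)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```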
32

Synthetic Data Generation Using Transformer Networks / Text Generation with Transformer Networks: Creating Text from a Synthetic Tabular Dataset

Campos, Pedro January 2021 (has links)
One of the areas propelled by the advancements in Deep Learning is Natural Language Processing. These continuous advancements allowed the emergence of new language models such as the Transformer [1], a deep learning model based on attention mechanisms that takes a sequence of symbols as input and outputs another sequence, attending to the input during its generation. This model is often used in translation, text summarization and text generation, outperforming previously used methods such as Recurrent Neural Networks and Generative Adversarial Networks. The problem statement provided by the company Syndata for this thesis is related to this new architecture: given a tabular dataset, create a model based on the Transformer that can generate text fields considering the underlying context from the rest of the accompanying fields. In an attempt to accomplish this, Syndata previously implemented a recurrent model; nevertheless, they are confident that a Transformer could perform better at this task. Their goal is to improve the provided solution with the implementation of a model based on the Transformer architecture. The implemented model should then be compared to the previous recurrent model and is expected to outperform it. Since there are not many published research articles where Transformers are used for synthetic tabular data generation, this problem is fairly original. Four different models were implemented: a model based on the GPT architecture [2], an LSTM [3], a Bidirectional LSTM with an Encoder-Decoder structure and the Transformer. The first two models are autoregressive models and the latter two are sequence-to-sequence models with an Encoder-Decoder architecture. We evaluated each of them on 3 different aspects: the distribution similarity between the real and generated datasets, how well each model was able to condition name generation on the information contained in the accompanying fields, and how much real data the model compromised after generation, which addresses a privacy-related issue. We found that the Encoder-Decoder models such as the Transformer and the Bidirectional LSTM seem to perform better for this type of synthetic data generation, where the output (or the field to be predicted) has to be conditioned by the rest of the accompanying fields. They outperformed the GPT and RNN models in the aspects that matter most to Syndata: keeping customer data private and being able to correctly condition the output on the information contained in the accompanying fields. / Deep learning has led to major advances in Natural Language Processing, where a type of machine learning architecture called the Transformer [1] has made a particularly strong impression. These models use a so-called attention mechanism and are trained as Language Models, taking in one sequence of symbols and emitting another; each step of the output sequence depends to varying degrees on steps of the input sequence, according to what the attention mechanism has learned to consider relevant. These models are used for translation, summarisation and text generation and have outperformed other architectures such as Recurrent Neural Networks (RNNs) and Generative Adversarial Networks. The problem statement for this thesis came from the company Syndata and relates to this architecture: given tabular data, implement a Transformer that generates text fields conditioned on the information in the accompanying table fields.
Syndata has previously implemented an RNN for this purpose but is convinced that a Transformer can perform better. The goal of this thesis is to implement a Transformer and compare it with the earlier implementation, with the hypothesis that it will perform better. The underlying goal is to be able to generate, from tabular data, new synthetic data useful for industry, in which problems around integrity and private information can be minimised. Four models were implemented: a Transformer model based on the GPT architecture [2], an LSTM [3] model, an encoder-decoder Transformer and a BiLSTM model. The first two models are autoregressive and the latter two are sequence-to-sequence models with an encoder-decoder architecture. The models were evaluated and compared on three criteria: how similar the probability distributions of the real and the generated datasets are, how much each model based its generation on the accompanying fields, and how much real data is compromised by the synthesis. The conclusion was that the encoder-decoder variants, the Transformer and the BiLSTM, were better at synthesising tabular data in which the output (or the fields to be generated) must exhibit a strong dependence on the rest of the accompanying fields. They outperformed the GPT and RNN models in the aspects that matter most to Syndata: keeping customer data private and making the synthesised data depend on the information in the accompanying fields.
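As a rough sketch of the sequence-to-sequence variant the thesis found strongest, the accompanying tabular fields can be serialised into a source sequence for PyTorch's built-in `nn.Transformer`, with the text field decoded from it token by token. The vocabulary, the random stand-in tokens, and all dimensions are invented for illustration; Syndata's data and models are not public.

```python
import torch
import torch.nn as nn

vocab_size, d_model = 200, 64  # toy vocabulary over field values and characters
embed = nn.Embedding(vocab_size, d_model)
model = nn.Transformer(d_model=d_model, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2,
                       dim_feedforward=128, batch_first=True)
to_logits = nn.Linear(d_model, vocab_size)

# Source: serialised accompanying fields, e.g. ["age=42", "country=SE", ...];
# target: the text field (e.g. a name) as a token sequence, teacher-forced.
src = torch.randint(0, vocab_size, (8, 12))   # batch of 8 rows, 12 field tokens
tgt = torch.randint(0, vocab_size, (8, 10))   # 10 text tokens per row

tgt_in, tgt_out = tgt[:, :-1], tgt[:, 1:]     # shift for next-token prediction
causal = model.generate_square_subsequent_mask(tgt_in.size(1))

hidden = model(embed(src), embed(tgt_in), tgt_mask=causal)
loss = nn.functional.cross_entropy(
    to_logits(hidden).reshape(-1, vocab_size), tgt_out.reshape(-1))
loss.backward()  # one teacher-forced training step
```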
33

Learning from 3D generated synthetic data for unsupervised anomaly detection

Fröjdholm, Hampus January 2021 (has links)
Modern machine learning methods, utilising neural networks, require a lot of training data. Data gathering and preparation have thus become a major bottleneck in the machine learning pipeline, and researchers often use large public datasets to conduct their research (such as the ImageNet [1] or MNIST [2] datasets). As these methods begin to be used in industry, the challenges become apparent. In factories, the objects being produced are often unique and may even involve trade secrets and patents that need to be protected. Additionally, manufacturing may not have started yet, making real data collection impossible. In both cases a public dataset is unlikely to be applicable. One possible solution, investigated in this thesis, is synthetic data generation. Synthetic data generation using physically based rendering was tested for unsupervised anomaly detection on a 3D printed block. A small image dataset of the block was gathered as a control, and a data generation model was created using its CAD model, a resource most often available in industrial settings. The data generation model used randomisation to reduce the domain shift between the real and synthetic data. To test the data, autoencoder models were trained on the real and synthetic data both separately and in combination. The material of the block, a white painted surface, proved challenging to reconstruct, and no significant difference between the synthetic and real data could be observed. The model trained on real data outperformed the models trained on synthetic and combined data. However, the synthetic data combined with the real data showed promise in reducing some of the bias intentionally introduced into the real dataset. Future research could focus on creating synthetic data for a problem where a good anomaly detection model already exists, with the goal of transferring some of the synthetic data generation model (such as the materials) to a new problem. This would be of interest in industries that produce many different but similar objects, and could reduce the time needed when starting a new machine learning project.
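The anomaly-detection setup described above can be sketched minimally: a convolutional autoencoder is trained to reconstruct anomaly-free images, and at test time an image whose reconstruction error exceeds a threshold estimated from the training distribution is flagged as anomalous. The architecture, the random stand-in images, and the mean-plus-three-standard-deviations threshold are illustrative assumptions.

```python
import torch
import torch.nn as nn

autoencoder = nn.Sequential(                       # encoder ...
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),  # ... decoder
    nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
)
opt = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)

normal = torch.rand(16, 1, 64, 64)   # stand-in for anomaly-free training images
for _ in range(50):                  # train to reconstruct normal data only
    recon = autoencoder(normal)
    loss = nn.functional.mse_loss(recon, normal)
    opt.zero_grad(); loss.backward(); opt.step()

# Score new images by per-image reconstruction error; a threshold from the
# training distribution (here mean + 3 std) is a common heuristic.
with torch.no_grad():
    errs = ((autoencoder(normal) - normal) ** 2).mean(dim=(1, 2, 3))
    threshold = errs.mean() + 3 * errs.std()
    test = torch.rand(4, 1, 64, 64)
    anomalous = ((autoencoder(test) - test) ** 2).mean(dim=(1, 2, 3)) > threshold
    print(anomalous)
```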
34

Deep Learning for 3D Perception: Computer Vision and Tactile Sensing

Garcia-Garcia, Alberto 23 October 2019 (has links)
The care of dependent people (for reasons of aging, accidents, disabilities or illnesses) is one of the top-priority lines of research for the European countries, as stated in the Horizon 2020 goals. In order to minimize the cost and the intrusiveness of therapies for care and rehabilitation, it is desirable that such care is administered at the patient's home. The natural solution for this environment is an indoor mobile robotic platform. Such a robotic platform for home care needs to solve, to a certain extent, a set of problems that lie at the intersection of multiple disciplines, e.g., computer vision, machine learning, and robotics. At that crossroads, one of the most notable challenges (and the one we will focus on) is scene understanding: the robot needs to understand the unstructured and dynamic environment in which it navigates and the objects with which it can interact. To achieve full scene understanding, various tasks must be accomplished. In this thesis we focus on three of them: object class recognition, semantic segmentation, and grasp stability prediction. The first refers to the process of categorizing an object into a set of classes (e.g., chair, bed, or pillow); the second goes one level beyond object categorization and aims to provide a per-pixel dense labeling of each object in an image; the third consists of determining whether an object grasped by a robotic hand is in a stable configuration or will fall. This thesis presents contributions towards solving those three tasks using deep learning as the main tool. All the solutions share one core observation: they rely on three-dimensional data inputs to leverage that additional dimension and its spatial arrangement. The four main contributions of this thesis are: first, we show a set of architectures and data representations for 3D object classification using point clouds; second, we carry out an extensive review of the state of the art in semantic segmentation datasets and methods; third, we introduce a novel synthetic and large-scale photorealistic dataset for jointly solving various robotic and vision problems; lastly, we propose a novel method and representation to deal with tactile sensors and learn to predict grasp stability.
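The point-cloud classification contribution builds on architectures in the PointNet family, whose core idea fits in a short sketch: a shared per-point MLP followed by symmetric max-pooling, which makes the prediction invariant to point ordering. The sizes and class count below are invented; this is not the thesis's actual architecture.

```python
import torch
import torch.nn as nn

class TinyPointNet(nn.Module):
    """Per-point shared MLP + order-invariant max pool + classifier head."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.point_mlp = nn.Sequential(      # applied to every point independently
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
        )
        self.head = nn.Linear(128, num_classes)

    def forward(self, pts: torch.Tensor) -> torch.Tensor:
        # pts: (batch, num_points, 3) xyz coordinates
        feats = self.point_mlp(pts)            # (batch, num_points, 128)
        global_feat = feats.max(dim=1).values  # symmetric aggregation over points
        return self.head(global_feat)          # class logits

cloud = torch.rand(4, 1024, 3)                 # 4 clouds of 1024 points each
logits = TinyPointNet()(cloud)
print(logits.shape)                            # torch.Size([4, 10])
```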
35

Using Synthetic Data to Model Mobile User Interface Interactions

Jalal, Laoa January 2023 (has links)
Usability testing within User Interface (UI) design is a central part of assuring high-quality UI design that provides good user experiences across multiple user groups. The process of usability testing often requires extensive collection of user feedback, preferably across multiple user groups, to ensure an unbiased observation of the potential design flaws within the UI design. Attaining feedback from certain user groups has proven challenging, due to factors such as medical conditions that limit users' ability to participate in the usability test. An absence of these hard-to-access groups can lead to designs that fail to consider their unique needs and preferences, potentially resulting in a worse user experience for these individuals. In this thesis, we try to address the current gaps within data collection for usability tests by investigating whether the Generative Adversarial Network (GAN) framework can be used to generate high-quality synthetic user interactions for a particular UI gesture across multiple user groups. UI interactions were collected from two user groups, the elderly and the young population, with the drag-and-drop operation as the gesture in focus. The datasets for the two user groups were used to train separate GANs, both using the doppelGANger architecture, and the generated synthetic data were evaluated on their diversity, how well temporal correlations are preserved, and their performance relative to the real data when used in a classification task. The experimental results show that both GANs produce high-quality synthetic resemblances of the drag-and-drop operation, where the synthetic samples show both diversity and uniqueness when compared to the actual dataset. The synthetic datasets for both user groups also reproduce statistical properties of the original dataset, such as the per-sample length distribution and the temporal correlations within the sequences. Furthermore, the synthetic dataset shows, on average, similar precision, recall and F1 scores to the actual dataset when used to train a classifier to distinguish between elderly and younger users' drag-and-drop sequences. Further research regarding the use of multiple UI gestures, using a single GAN to generate UI interactions across multiple user groups, and a comparative study of different GAN architectures would provide valuable insights into unexplored potential and possible limitations within this particular problem domain.
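The classification-based evaluation mentioned above can be sketched as a train-on-synthetic, test-on-real comparison: the same classifier is fitted once on real and once on synthetic sequences (flattened here for simplicity) and scored on a held-out real test set. The random stand-in traces and shapes are assumptions; the doppelGANger training itself is not reproduced.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_recall_fscore_support

rng = np.random.default_rng(0)
n, t = 200, 50                          # sequences per group, timesteps per sequence

# Stand-ins for drag-and-drop traces (x, y per timestep), flattened to vectors.
real_old, real_young = rng.normal(0, 1, (n, t * 2)), rng.normal(0.5, 1, (n, t * 2))
syn_old, syn_young = rng.normal(0, 1, (n, t * 2)), rng.normal(0.5, 1, (n, t * 2))

X_real = np.vstack([real_old, real_young])
y = np.array([0] * n + [1] * n)         # 0 = elderly, 1 = young
X_syn = np.vstack([syn_old, syn_young])

# Hold out part of the real data as the common test set.
idx = rng.permutation(len(y))
test, train = idx[:100], idx[100:]

for name, X_fit, y_fit in [("train-on-real", X_real[train], y[train]),
                           ("train-on-synthetic", X_syn, y)]:
    clf = RandomForestClassifier(random_state=0).fit(X_fit, y_fit)
    p, r, f1, _ = precision_recall_fscore_support(
        y[test], clf.predict(X_real[test]), average="binary")
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```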
36

Synthetic Data Generation for 6D Object Pose and Grasping Estimation

Martínez González, Pablo 16 March 2023 (has links)
Teaching a robot how to behave so it becomes completely autonomous is not a simple task. When robotic systems become truly intelligent, interactions with them will feel natural and easy, but nothing could be further from the truth. Making a robot understand its surroundings is a huge task that the computer vision field tries to address, and deep learning techniques are bringing us closer. But this comes at the cost of data. Synthetic data generation is the process of generating artificial data that is used to train machine learning models. This data is generated using computer algorithms and simulations, and is designed to resemble real-world data as closely as possible. The use of synthetic data has become increasingly popular in recent years, particularly in the field of deep learning, due to the shortage of high-quality annotated real-world data and the high cost of collecting it. For that reason, in this thesis we address the task of facilitating the generation of synthetic data by creating a framework which leverages advances in modern rendering engines. In this context, the generated synthetic data can be used to train models for tasks such as 6D object pose estimation or grasp estimation. 6D object pose estimation refers to the problem of determining the position and orientation of an object in 3D space, while grasp estimation involves predicting the position and orientation of a robotic hand or gripper that can be used to pick up and manipulate the object. These are important tasks in robotics and computer vision, as they enable robots to perform complex manipulation and grasping tasks. In this work we propose a way of extracting grasping information from hand-object interactions in virtual reality, so that synthetic data can also boost research in that area. Finally, we use this synthetically generated data to test the proposal of applying 6D object pose estimation architectures to grasping region estimation. This idea is based on both problems sharing several underlying concepts such as object detection and orientation. / Teaching a robot to be completely autonomous is no easy task. When robotic systems become truly intelligent, interacting with them will feel natural and easy, but nothing could be further from reality. Making a robot understand and assimilate its environment is a difficult crusade that the field of computer vision tries to address, and deep learning techniques are bringing us closer to the goal. But the price is data. Synthetic data generation is the process of generating artificial data used to train machine learning models. These data are generated by computer algorithms and simulations, and are designed to resemble real-world data as closely as possible. The use of synthetic data has become increasingly popular in recent years, especially in the field of deep learning, owing to the scarcity of high-quality annotated real data and the high cost of collecting it. For this reason, in this thesis we address the task of facilitating the generation of synthetic data by creating a tool that takes advantage of the progress in modern rendering engines. In this context, the generated synthetic data can be used to train models for tasks such as 6D object pose estimation or grasp estimation.
6D object pose estimation refers to the problem of determining the position and orientation of an object in 3D space, while grasp estimation involves predicting the position and orientation of a robotic hand or gripper that can be used to pick up and manipulate the object. These are important tasks in robotics and computer vision, since they allow robots to perform complex manipulation and grasping tasks. In this work we propose a way of extracting grasp information from hand-object interactions in virtual reality, so that synthetic data can also drive research in that area. Finally, we use these synthetically generated data to test the proposal of applying 6D object pose estimation architectures to the estimation of grasping regions. This proposal rests on the fact that both problems share several underlying concepts, such as object detection and orientation. / This thesis has been funded by the Spanish Ministry of Education [FPU17/00166]
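The geometric core of such a data generation framework can be sketched with numpy: sample a random 6D pose (a rotation and a translation), project known 3D model points through a pinhole camera, and keep the pose as the ground-truth label. The camera intrinsics and the toy model points are invented; the actual framework renders photorealistic imagery on top of this geometry.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation():
    """Random rotation matrix via QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q *= np.sign(np.diag(r))               # make the distribution uniform on O(3)
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]                 # reflect one axis to land in SO(3)
    return q

K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])  # assumed intrinsics
model_pts = rng.uniform(-0.05, 0.05, (100, 3))  # toy 3D model, ~10 cm object

def synth_sample():
    R = random_rotation()                  # random orientation
    t = np.array([rng.uniform(-0.2, 0.2),  # random position in front of camera
                  rng.uniform(-0.2, 0.2),
                  rng.uniform(0.5, 1.5)])
    cam_pts = model_pts @ R.T + t          # object -> camera coordinates
    uv = cam_pts @ K.T                     # pinhole projection ...
    uv = uv[:, :2] / uv[:, 2:3]            # ... then perspective divide
    return uv, (R, t)                      # 2D keypoints + 6D pose label

pixels, (R, t) = synth_sample()
print(pixels.shape, R.shape, t.shape)      # (100, 2) (3, 3) (3,)
```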
37

Early diagnosis and personalised treatment focusing on synthetic data modelling: Novel visual learning approach in healthcare

Mahmoud, Ahsanullah Y., Neagu, Daniel, Scrimieri, Daniele, Abdullatif, Amr R.A. 09 August 2023 (has links)
The early diagnosis and personalised treatment of diseases are facilitated by machine learning. The quality of data has an impact on diagnosis because medical data are usually sparse, imbalanced, and contain irrelevant attributes, resulting in suboptimal diagnosis. To address the impacts of data challenges, improve resource allocation, and achieve better health outcomes, a novel visual learning approach is proposed. This study contributes to the visual learning approach by determining whether less or more synthetic data are required to improve the quality of a dataset, such as the number of observations and features, according to the intended personalised treatment and early diagnosis. In addition, numerous visualisation experiments are conducted, using statistical characteristics, cumulative sums, histograms, correlation matrices, root mean square error, and principal component analysis to visualise both original and synthetic data and address the data challenges. Real medical datasets for cancer, heart disease, diabetes, cryotherapy and immunotherapy are selected as case studies. As benchmarks for classification comparison in terms of metrics such as accuracy, sensitivity, and specificity, several models are implemented, such as k-Nearest Neighbours and Random Forest. A Generative Adversarial Network is used to create and manipulate synthetic data, while a Random Forest is implemented to classify the data. An amendable and adaptable system is constructed by combining the Generative Adversarial Network and Random Forest models; the system is presented with its working steps, an overview, and a flowchart. Experiments reveal that the majority of data-enhancement scenarios allow for the application of visual learning in the first stage of data analysis as a novel approach. To achieve meaningful, adaptable synergy between appropriate-quality data and optimal classification performance while maintaining statistical characteristics, visual learning provides researchers and practitioners with practical human-in-the-loop machine learning visualisation tools. Prior to implementing algorithms, the visual learning approach can be used to actualise early and personalised diagnosis. For the immunotherapy data, the Random Forest performed best, with precision, recall, f-measure, accuracy, sensitivity, and specificity of 81%, 82%, 81%, 88%, 95%, and 60% on real data, as opposed to 91%, 96%, 93%, 93%, 96%, and 73% for synthetic data, respectively. Future studies might examine the optimal strategies to balance the quantity and quality of medical data.
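The classification side of such a pipeline can be sketched briefly: a Random Forest is trained once on real data and once on a synthetically augmented set, with sensitivity and specificity derived from the confusion matrix. The random stand-in data are assumptions, and a simple jitter stands in for a trained GAN generator.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8))                  # stand-in for a small medical dataset
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # stand-in "diagnosis" label
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

# Stand-in for GAN output: jittered copies of the training data.
X_syn = X_tr + rng.normal(scale=0.1, size=X_tr.shape)
y_syn = y_tr

def report(name, X_fit, y_fit):
    clf = RandomForestClassifier(random_state=1).fit(X_fit, y_fit)
    tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
    sens, spec = tp / (tp + fn), tn / (tn + fp)     # sensitivity, specificity
    acc = (tp + tn) / (tp + tn + fp + fn)
    print(f"{name}: accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")

report("real only", X_tr, y_tr)
report("real + synthetic", np.vstack([X_tr, X_syn]), np.hstack([y_tr, y_syn]))
```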
38

Towards Building Privacy-Preserving Language Models: Challenges and Insights in Adapting PrivGAN for Generation of Synthetic Clinical Text

Nazem, Atena January 2023 (has links)
The growing development of artificial intelligence (AI), particularly neural networks, is transforming applications of AI in healthcare, yet it raises significant privacy concerns due to potential data leakage. As neural networks memorise training data, they may inadvertently expose sensitive clinical data to privacy breaches, which can have serious repercussions such as identity theft, fraud, and harmful medical errors. While regulations such as the GDPR offer safeguards through guidelines, fundamental technical protections are required to address the problem of data leakage. Reviews of various approaches show that one avenue of exploration is the adaptation of Generative Adversarial Networks (GANs) to generate synthetic data for use in place of real data. Since GANs were originally designed and mainly researched for generating visual data, there is a notable gap in adapting GANs with privacy-preserving measures for generating synthetic text data. To address this gap, this study aims to answer the research questions of how a privacy-preserving GAN can be adapted to safeguard the privacy of clinical text data, and what challenges and potential solutions are associated with these adaptations. To this end, the existing privGAN framework, originally developed and tested for image data, was tailored to suit clinical text data. Following the design science research framework, modifications were made within the privGAN architecture to incorporate reinforcement learning (RL) to address the discrete nature of text data. For synthetic data generation, this study utilised the 'Discharge summary' class from the Noteevents table of the MIMIC-III dataset, which is clinical text data in American English. The utility of the generated data was assessed using the BLEU-4 metric, and a white-box attack was conducted to test the model's resistance to privacy breaches. The experiment yielded a very low BLEU-4 score, indicating that the generator could not produce synthetic data that captures the linguistic characteristics and patterns of real data. The relatively low white-box attack accuracy against one discriminator (0.2055) suggests that the trained discriminator was less effective at inferring sensitive information with high accuracy. While this may indicate a potential for preserving privacy, increasing the number of discriminators produced less favourable results (0.361). In light of these results, it is noted that the adapted approach of defining the rewards as a measure of the discriminators' uncertainty can signal a contradictory learning strategy and lead to low data utility. This study underscores the challenges in adapting privacy-preserving GANs for text data due to the inherent complexity of GAN training and the required computational power. To obtain better results in terms of utility and to confirm the effectiveness of the privacy measures, further experiments are required that consider a more direct and granular reward system for the generator and search for an optimal learning rate. As such, the findings reiterate the necessity of continued experimentation and refinement in adapting privacy-preserving GANs for clinical text.
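The BLEU-4 utility check mentioned above can be computed with NLTK's `sentence_bleu` using uniform 4-gram weights; smoothing avoids degenerate zero scores on short texts. The example sentences below are invented, and a real evaluation would compare generated discharge summaries against held-out MIMIC-III text.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Reference: tokenised real text; hypothesis: tokenised synthetic text.
reference = "patient was discharged home in stable condition".split()
hypothesis = "patient discharged home in stable state".split()

smooth = SmoothingFunction().method1      # avoid zero scores on short texts
score = sentence_bleu([reference], hypothesis,
                      weights=(0.25, 0.25, 0.25, 0.25),  # uniform 4-gram = BLEU-4
                      smoothing_function=smooth)
print(f"BLEU-4: {score:.3f}")
```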
39

Urban Seismic Event Detection: A Non-Invasive Deep Learning Approach

Parth Sagar Hasabnis (18424092) 23 April 2024 (has links)
<p dir="ltr">As cameras increasingly populate urban environments for surveillance, the threat of data breaches and losses escalates as well. The rapid advancements in generative Artificial Intelligence have greatly simplified the replication of individuals’ appearances from video footage. This capability poses a grave risk as malicious entities can exploit it for various nefarious purposes, including identity theft and tracking individuals’ daily activities to facilitate theft or burglary.</p><p dir="ltr">To reduce reliance on video surveillance systems, this study introduces Urban Seismic Event Detection (USED), a deep learning-based technique aimed at extracting information about urban seismic events. Our approach involves synthesizing training data through a small batch of manually labelled field data. Additionally, we explore the utilization of unlabeled field data in training through semi-supervised learning, with the implementation of a mean-teacher approach. We also introduce pre-processing and post-processing techniques tailored to seismic data. Subsequently, we evaluate the trained models using synthetic, real, and unlabeled data and compare the results with recent statistical methods. Finally, we discuss the insights gained and the limitations encountered in our approach, while also proposing potential avenues for future research.</p>
40

Optical Satellite/Component Tracking and Classification via Synthetic CNN Image Processing for Hardware-in-the-Loop testing and validation of Space Applications using free flying drone platforms

Peterson, Marco Anthony 21 April 2022 (has links)
The proliferation of reusable space vehicles has fundamentally changed how we inject assets into orbit and beyond, increasing the reliability and frequency of launches and leading to the rapid development and adoption of new technologies in the Aerospace sector, such as computer vision (CV), machine learning (ML), and distributed networking. All these technologies are necessary to enable genuinely autonomous decision-making for space-borne platforms as our spacecraft travel further into the solar system and our mission sets become more ambitious, requiring true "human out of the loop" solutions for a wide range of engineering and operational problem sets. Deployment of systems proficient at classifying, tracking, capturing, and ultimately manipulating orbital assets and components for maintenance and assembly in the persistently dynamic environment of space and on the surface of other celestial bodies, tasks commonly referred to as On-Orbit Servicing and In-Space Assembly, has unique automation potential. Given the inherent dangers of the manned spaceflight/extravehicular activity (EVA) methods currently employed to perform spacecraft construction and maintenance tasking, coupled with the current limitations on long-duration human flight outside of low Earth orbit, space robotics armed with generalized sensing and control machine learning architectures is a tremendous enabling technology. However, the large amounts of sensor data required to adequately train neural networks for these space-domain tasks are either limited or non-existent, requiring alternate means of data collection or generation. Additionally, the wide-scale tools and methodologies required for hardware-in-the-loop simulation, testing, and validation of these new technologies outside of multimillion-dollar facilities are largely in their developmental stages. This dissertation proposes a novel approach for simulating space-based computer vision sensing and robotic control using both physical and virtual reality testing environments. The methodology is designed to be both affordable and expandable, enabling hardware-in-the-loop simulation and validation of space systems at large scale across multiple institutions. While the specific computer vision models in this work focus narrowly on solving imagery problems found on orbit, the work can be expanded to any problem set that requires robust onboard computer vision, robotic manipulation, and free-flight capabilities. / Doctor of Philosophy / Real-world imagery of space assets and planetary surfaces, required to train neural networks to autonomously identify, classify, and make decisions in these environments, is either limited, non-existent, or prohibitively expensive to obtain. This work leverages the power of the Unreal Engine, motion capture, and theatre projection technologies, combined with robotics, computer vision, and machine learning, to recreate these worlds for the purpose of optical machine learning testing and validation for space and other celestial applications. This dissertation also incorporates domain randomization methods to increase neural network performance for the above-mentioned applications.
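The domain randomization idea mentioned in the closing sentence can be sketched with torchvision transforms: photometric and geometric properties of each synthetic render are randomised so that a trained network cannot latch onto any single appearance, making the real world look like just another variation. The transform choices and parameter ranges are illustrative assumptions.

```python
import torch
from torchvision import transforms

# Randomise lighting-like (colour) and viewpoint-like (affine) properties of
# each synthetic render before it is fed to the training pipeline.
randomize = transforms.Compose([
    transforms.ColorJitter(brightness=0.5, contrast=0.5, saturation=0.5, hue=0.1),
    transforms.RandomAffine(degrees=15, translate=(0.1, 0.1), scale=(0.8, 1.2)),
    transforms.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0)),
])

render = torch.rand(3, 224, 224)          # stand-in for one synthetic render
augmented = torch.stack([randomize(render) for _ in range(8)])
print(augmented.shape)                    # torch.Size([8, 3, 224, 224])
```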
