171

Deep Learning for 3D Perception: Computer Vision and Tactile Sensing

Garcia-Garcia, Alberto 23 October 2019 (has links)
The care of dependent people (whether due to aging, accidents, disabilities, or illnesses) is one of the top-priority lines of research for European countries, as stated in the Horizon 2020 goals. To minimize the cost and intrusiveness of care and rehabilitation therapies, it is desirable that such care be administered in the patient's home. The natural solution for this environment is an indoor mobile robotic platform. Such a robotic platform for home care needs to solve, to a certain extent, a set of problems that lie at the intersection of multiple disciplines, e.g., computer vision, machine learning, and robotics. At that crossroads, one of the most notable challenges (and the one we focus on) is scene understanding: the robot needs to understand the unstructured and dynamic environment in which it navigates and the objects with which it can interact. To achieve full scene understanding, various tasks must be accomplished. In this thesis we focus on three of them: object class recognition, semantic segmentation, and grasp stability prediction. The first refers to the process of categorizing an object into a set of classes (e.g., chair, bed, or pillow); the second goes one level beyond object categorization and aims to provide a per-pixel dense labeling of each object in an image; the third consists of determining whether an object that has been grasped by a robotic hand is in a stable configuration or whether it will fall. This thesis presents contributions towards solving these three tasks, using deep learning as the main tool for the recognition, segmentation, and prediction problems. All the solutions share one core observation: they rely on three-dimensional data inputs to leverage that additional dimension and its spatial arrangement. The four main contributions of this thesis are: first, we present a set of architectures and data representations for 3D object classification using point clouds; second, we carry out an extensive review of the state of the art in semantic segmentation datasets and methods; third, we introduce a novel synthetic, large-scale, photorealistic dataset for jointly solving various robotic and vision problems; and last, we propose a novel method and representation to deal with tactile sensors and learn to predict grasp stability.
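To make the first contribution concrete, here is a minimal PointNet-style point-cloud classifier sketch, assuming PyTorch; the layer widths and class count are illustrative assumptions, not the architectures the thesis actually proposes:

```python
import torch
import torch.nn as nn

class PointCloudClassifier(nn.Module):
    """Minimal PointNet-style classifier: a shared per-point MLP, a symmetric
    max-pool over points, then a fully connected head. Sizes are illustrative."""
    def __init__(self, num_classes=10):
        super().__init__()
        # Shared feature extractor, applied independently to every xyz point.
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, 256), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, points):          # points: (batch, n_points, 3)
        feats = self.point_mlp(points)  # (batch, n_points, 256)
        # Max-pool over the point axis: a permutation-invariant global descriptor.
        global_feat = feats.max(dim=1).values
        return self.head(global_feat)   # (batch, num_classes) logits

model = PointCloudClassifier(num_classes=10)
logits = model(torch.randn(4, 1024, 3))  # 4 clouds of 1024 points each
print(logits.shape)                      # torch.Size([4, 10])
```

The max-pool is the key design choice: because it is symmetric, the prediction does not depend on the (arbitrary) ordering of points in the cloud.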
172

Tabular Information Extraction from Datasheets with Deep Learning for Semantic Modeling

Akkaya, Yakup 22 March 2022 (has links)
The growing popularity of artificial intelligence and machine learning has led many institutions and organizations to adopt the industry's vision of automation. Many corporations have made it a primary objective to deliver goods and services, and to manufacture them, more efficiently and with minimal human intervention. Automated document processing and analysis is a critical component of this cycle for many organizations that contribute to the supply chain, and the massive volume and diversity of data created in this rapidly evolving environment make it a highly desired step. Across this diversity, much of the important information in documents is provided in tables; as a result, extracting tabular data is a crucial aspect of document processing. This thesis applies deep learning methodologies to detect table structure elements for data extraction and preparation for semantic modelling. To find the optimal structure definition, we analyzed the performance of deep learning models on different formats, such as row/column and cell. The combined row-and-column detection models perform poorly compared to the other models due to the highly overlapping nature of rows and columns. Separate row and column detection models achieve the best average F1-scores, at 78.5% and 79.1%, respectively. However, determining cell elements from the row and column detections for semantic modelling is complicated by spanning rows and columns. Considering these facts, we propose a new method of setting the ground-truth information, called content-focused annotation, to define table elements better. Our content-focused method handles the ambiguities caused by large white spaces and the lack of boundary lines in table structures, and hence provides higher accuracy. Prior works have addressed the table analysis problem as separate table detection and table structure detection tasks; however, the impact of dataset structure on table structure detection has not been investigated. We provide a comparison of table structure detection performance on cropped and uncropped datasets. The cropped set consists only of table images cropped from documents, assuming tables are detected perfectly; the uncropped set consists of regular document images. Experiments show that deep learning models can improve detection performance by up to 9% in average precision and average recall on the cropped versions. Furthermore, the impact of cropping is negligible at Intersection over Union (IoU) thresholds of 50%-70% when compared to the uncropped versions; beyond 70% IoU, however, cropped datasets provide significantly higher detection performance.
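The IoU-threshold comparison at the end rests on the standard Intersection over Union measure for detection boxes. A small self-contained sketch of that criterion follows; the boxes and thresholds are illustrative, not the thesis's data:

```python
def box_iou(a, b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# A detected table element counts as correct only if its IoU with the
# ground-truth element exceeds the chosen threshold.
pred, truth = (10, 10, 110, 60), (12, 8, 115, 58)
for t in (0.5, 0.7, 0.9):
    print(t, box_iou(pred, truth) >= t)
```

At low thresholds a loose box still counts as a hit, which is why cropping matters little there; at strict thresholds (above 70%) only tight boxes count, and that is where the cropped sets pull ahead.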
173

Semantic Segmentation of Urban Scene Images Using Recurrent Neural Networks

Daliparthi, Venkata Satya Sai Ajay January 2020 (has links)
Background: In autonomous driving vehicles, the vehicle receives pixel-wise data from RGB cameras, point-wise depth information, and other sensor data as input. The computer inside the autonomous driving vehicle processes this input and provides the desired output, such as steering angle, torque, and brake. For the vehicle to make accurate decisions, the computer inside it should be completely aware of its surroundings and understand each pixel in the driving scene. Semantic Segmentation is the task of assigning a class label (such as Car, Road, Pedestrian, or Sky) to each pixel in a given image, so a better-performing Semantic Segmentation algorithm will contribute to the advancement of the autonomous driving field. Research Gap: Traditional methods, such as handcrafted features and feature extraction methods, were mainly used to solve Semantic Segmentation. Since the rise of deep learning, most works use deep learning to deal with Semantic Segmentation, and the most commonly used neural network architecture has been the Convolutional Neural Network (CNN). Even though some works have made use of Recurrent Neural Networks (RNNs), the effect of RNNs on Semantic Segmentation has not yet been thoroughly studied. Our study addresses this research gap. Idea: After going through the existing literature, we came up with the idea of using RNNs as an add-on module to augment the skip-connections in Semantic Segmentation networks through residual connections. Objectives and Method: The main objective of our work is to improve the performance of Semantic Segmentation networks by using RNNs. An experiment was chosen as the methodology for our study. We proposed three novel architectures, called UR-Net, UAR-Net, and DLR-Net, by applying our idea to the existing networks U-Net, Attention U-Net, and DeepLabV3+, respectively. Results and Findings: We empirically showed that our proposed architectures improve the segmentation of edges and boundaries. Through our study, we found that there is a trade-off between using RNNs and the inference time of the model: if we use RNNs to improve the performance of Semantic Segmentation networks, we must trade off some extra seconds during inference. Conclusion: Our findings will not contribute to the autonomous driving field, where better real-time performance is needed, but they will contribute to the advancement of biomedical image segmentation, where doctors can trade off those extra seconds during inference for better performance.
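A minimal sketch of the stated idea, assuming PyTorch: an RNN (here a GRU scanning each row of a skip feature map) augments the skip connection through a residual addition. The module name and row-wise scan order are illustrative guesses, not the exact UR-Net/UAR-Net/DLR-Net designs:

```python
import torch
import torch.nn as nn

class RNNSkip(nn.Module):
    """Sketch of an RNN add-on for a U-Net-style skip connection: run the
    skip feature map through a GRU and add the result back residually."""
    def __init__(self, channels):
        super().__init__()
        # The GRU scans each row of the feature map left-to-right,
        # propagating spatial context along the row.
        self.gru = nn.GRU(channels, channels, batch_first=True)

    def forward(self, skip):                       # skip: (B, C, H, W)
        b, c, h, w = skip.shape
        seq = skip.permute(0, 2, 3, 1).reshape(b * h, w, c)
        out, _ = self.gru(seq)                     # (B*H, W, C)
        out = out.reshape(b, h, w, c).permute(0, 3, 1, 2)
        return skip + out                          # residual: augment, don't replace

x = torch.randn(2, 64, 32, 32)
print(RNNSkip(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```

The residual addition keeps the original skip features intact, so the RNN only has to learn a correction; the sequential scan is also where the extra inference time the abstract mentions would come from.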
174

Analyzing white blood cells using deep learning techniques

Neelakantan, Suraj, Kalidindi, Sai Sushanth Varma January 2020 (has links)
The field of hematology involves the analysis of blood and its components, such as platelets, red blood cells, and white blood cells. The outcome of this analysis can be vital in determining the condition of the human body, so it is important to obtain accurate results. A deep learning algorithm scans the given input data for unique features and learns them; it then identifies these features and correlates them to give the result. This can save a significant amount of time and manual work. In contrast, a traditional machine learning algorithm requires the developer to carry out the feature engineering. This thesis involves the analysis of white blood cells (WBCs) using deep learning techniques. In collaboration with the hematology company HemoCue AB, based in Ängelholm, we develop deep learning algorithms for the analysis of white blood cells in the HemoCue® WBC DIFF system. There are two main stages in this thesis. The first stage is white blood cell identification, used to calculate the number of white blood cells in a given blood sample. The second stage identifies the different types of white blood cells, from which the concentration of each WBC type in the sample is calculated. We explored different classification approaches, such as 'one vs all' and a '4-class classifier', and developed two CNN architectural designs, 'multi-input' and 'multi-channel'. Comparing the performance of all these design approaches, a final integrated model is put forth for the analysis of WBCs in the company's device. The proposed 'one vs all' classification approach combined with a 3-class CNN classifier yielded very promising results, with a combined accuracy of 95.45% in WBC identification and 90.49% in WBC differential classification.
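A hedged sketch of the two-stage structure the abstract describes: stage one counts WBCs, stage two types only the confirmed WBCs to compute per-type concentrations. The function names, stub models, and patch format below are hypothetical placeholders, not HemoCue's code:

```python
import numpy as np

def classify_sample(patches, is_wbc, wbc_type):
    """Two-stage pipeline sketch: identification first, then differential
    classification of the cells confirmed as WBCs."""
    wbc_patches = [p for p in patches if is_wbc(p)]   # stage 1: identification
    counts = {}
    for p in wbc_patches:                             # stage 2: differential
        t = wbc_type(p)
        counts[t] = counts.get(t, 0) + 1
    total = len(wbc_patches)
    concentration = {t: n / total for t, n in counts.items()} if total else {}
    return total, concentration

# Stub models standing in for the trained CNNs.
rng = np.random.default_rng(0)
patches = [rng.random((48, 48)) for _ in range(20)]
total, conc = classify_sample(
    patches,
    is_wbc=lambda p: p.mean() > 0.45,
    wbc_type=lambda p: ["neutrophil", "lymphocyte", "monocyte"][int(p.sum()) % 3],
)
print(total, conc)
```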
175

Machine Learning Methods for Brain Lesion Delineation

Raina, Kevin 02 October 2020 (has links)
Brain lesions are regions of abnormal or damaged tissue in the brain, commonly due to stroke, cancer, or other disease. They are diagnosed primarily using neuroimaging, the most common modalities being Magnetic Resonance Imaging (MRI) and Computed Tomography (CT). Brain lesions have a high degree of variability in location, size, intensity, and form, which makes diagnosis challenging. Traditionally, radiologists diagnose lesions by inspecting neuroimages directly by eye; however, this is time-consuming and subjective. For these reasons, many automated methods have been developed for lesion delineation (segmentation), lesion identification, and diagnosis. The goal of this thesis is to improve and develop automated methods for delineating brain lesions from multimodal MRI scans. First, we propose an improvement to existing segmentation methods that exploits the bilateral quasi-symmetry of healthy brains, which breaks down when lesions are present. We augment our data using nonlinear registration of a neuroimage to a reflected version of itself, leading to a 13 percent improvement in Dice coefficient. Second, we model lesion volume in brain image patches with a modified Poisson regression method; the model correctly identified the image with the larger lesion volume for 86 percent of paired sample patches. Both of these projects were published in the proceedings of the BIOSTEC 2020 conference. In the last two chapters, we propose a confidence-based approach to measuring segmentation uncertainty and apply an unsupervised segmentation method based on mutual information.
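The Dice coefficient behind the reported 13 percent improvement is a standard overlap measure between a predicted and a ground-truth mask. A small self-contained sketch, with synthetic masks rather than the thesis's data:

```python
import numpy as np

def dice(pred, truth):
    """Dice coefficient of two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    return 2.0 * np.logical_and(pred, truth).sum() / denom if denom else 1.0

a = np.zeros((64, 64), dtype=bool); a[10:30, 10:30] = True   # predicted lesion
b = np.zeros((64, 64), dtype=bool); b[12:32, 12:32] = True   # ground truth
print(round(dice(a, b), 3))  # overlap score in [0, 1]; here 0.81
```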
176

Interpretable Superhuman Machine Learning Systems: An explorative study focusing on interpretability and detecting Unknown Knowns using GAN

Hermansson, Adam, Generalao, Stefan January 2020 (has links)
In a future where the predictions and decisions made by machine learning systems surpass human ability, we need those systems to be interpretable in order for us to trust and understand them. Our study explores the realm of interpretable machine learning through designing and examining artifacts. We conduct experiments exploring explainability and interpretability, as well as the technical challenges of creating machine learning models that identify objects that appear similar but are unique. Lastly, we conduct a user test to evaluate current state-of-the-art visual explanatory tools in a direct human setting. From these insights, we discuss the potential future of this field.
177

Stronger Together? An Ensemble of CNNs for Deepfakes Detection / Starkare Tillsammans? En Ensemble av CNNs för att Identifiera Deepfakes

Gardner, Angelica January 2020 (has links)
Deepfake technology is a face-swapping technique that enables anyone to replace faces in a video with highly realistic results. Despite its usefulness, if used maliciously this technique can have a significant impact on society, for instance through the spreading of fake news or cyberbullying. This makes deepfakes detection a problem of utmost importance. In this thesis, I tackle the problem of deepfakes detection by identifying deepfake forgeries in video sequences. Inspired by the state of the art, I study the ensembling of different machine learning solutions built on convolutional neural networks (CNNs) and use these models as objects for comparison between ensemble and single-model performance. Existing work in the field suggests that the escalating challenges posed by modern deepfake videos make detection increasingly difficult. I evaluate that claim by testing the detection performance of four single CNN models as well as six stacked ensembles on three modern deepfakes datasets. I compare various approaches to combining single models and to incorporating their predictions into the ensemble output. I found that the best approach to deepfakes detection is to create an ensemble, although the choice of ensemble approach plays a crucial role in detection performance. The final proposed solution is an ensemble of all available single models, which uses soft (weighted) voting to combine its base learners' predictions. Results show that this proposed solution significantly improved deepfakes detection performance and substantially outperformed all single models.
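Soft (weighted) voting, as described here, is a weighted average of per-model class probabilities followed by an argmax. A minimal numpy sketch, with made-up probabilities and weights for illustration:

```python
import numpy as np

def soft_vote(probs, weights=None):
    """Soft (weighted) voting: average per-model class probabilities with
    the given weights, then pick the argmax class per sample.
    `probs` has shape (n_models, n_samples, n_classes)."""
    probs = np.asarray(probs, dtype=float)
    if weights is None:
        weights = np.ones(probs.shape[0])
    weights = np.asarray(weights, dtype=float)
    weights /= weights.sum()                      # normalize to sum to 1
    avg = np.tensordot(weights, probs, axes=1)    # (n_samples, n_classes)
    return avg.argmax(axis=1), avg

# Three base CNNs voting real/fake on two frames; weights are illustrative.
p = [[[0.6, 0.4], [0.2, 0.8]],
     [[0.7, 0.3], [0.4, 0.6]],
     [[0.3, 0.7], [0.1, 0.9]]]
labels, avg = soft_vote(p, weights=[0.5, 0.3, 0.2])
print(labels, avg.round(2))
```

Unlike hard (majority) voting, soft voting lets a confident base learner outvote two uncertain ones, which is typically why the weighted variant helps when base models differ in reliability.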
178

Convolutional Neural Network Optimization for Homography Estimation

DiMascio, Michelle Augustine January 2018 (has links)
No description available.
179

Semantic Segmentation of RGB images for feature extraction in Real Time

Elavarthi, Pradyumna January 2019 (has links)
No description available.
180

Deep Learning-Based Speed Sign Detection and Recognition

Robertson, Curtis E. 04 November 2020 (has links)
No description available.
