501

Incorporating Metadata Into the Active Learning Cycle for 2D Object Detection

Stadler, Karsten, January 2021
In the past years, Deep Convolutional Neural Networks have proven to be very useful for 2D Object Detection in many applications. These networks require large amounts of labeled data, which can be increasingly costly for companies deploying detectors in practice if data quality is lacking. Pool-based Active Learning is an iterative process of collecting subsets of data to be labeled by a human annotator and used for training, in order to optimize performance per labeled image. The detectors used in Active Learning cycles are conventionally pre-trained with a small subset, approximately 2% of the available data, labeled uniformly at random. This is something I challenged in this thesis by using image metadata. Since many Machine Learning models are a "jack of all trades, master of none", and it is hard to train a model that generalizes to the entire data domain, it can be interesting to develop a detector for a specific target metadata domain. A simple Monte Carlo method, Rejection Sampling, can be implemented to sample according to a target metadata domain. This requires a target and a proposal metadata distribution. The proposal metadata distribution is a parametric model, a Gaussian Mixture Model, learned from the training metadata; the parametric model for the target distribution is learned in the same manner, but from a target dataset. In this way, only the training images with metadata most similar to the target metadata distribution are sampled. This sampling approach was employed and tested with a 2D Object Detector: Faster R-CNN with a ResNet-50 backbone. The Rejection Sampling approach was tested against conventional uniform random sampling and a classical Active Learning baseline: Min Entropy Sampling. Performance was measured and compared on two different target metadata distributions inferred from a specific target dataset. With a labeling budget of 2% per cycle, the maximum Mean Average Precision at 0.5 Intersection over Union on the target set was calculated for each cycle. My proposed approach has a 40% relative performance advantage over uniform random sampling in the first cycle, and 10% after 9 cycles. Overall, my approach required only 37% of the labeled data to beat the next best tested sampler: conventional uniform random sampling.
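
A minimal sketch of the metadata-driven sampling step described above, using scikit-learn's GaussianMixture for both distributions; the feature dimensionality, component counts, and data here are illustrative assumptions rather than the thesis's actual configuration:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical metadata vectors (e.g., time of day, weather, location features).
pool_metadata = rng.normal(0.0, 1.0, size=(5000, 4))    # unlabeled training pool
target_metadata = rng.normal(0.5, 0.8, size=(500, 4))   # small target-domain set

# Proposal q(x): GMM fit on the pool metadata; target p(x): GMM fit on the target set.
q = GaussianMixture(n_components=3, random_state=0).fit(pool_metadata)
p = GaussianMixture(n_components=3, random_state=0).fit(target_metadata)

log_q = q.score_samples(pool_metadata)  # log q(x_i) for every pool image
log_p = p.score_samples(pool_metadata)  # log p(x_i) for every pool image

# Envelope constant M >= max_x p(x)/q(x), estimated on the pool itself.
log_m = np.max(log_p - log_q)

# Rejection step: keep image i with probability p(x_i) / (M * q(x_i)).
accept_prob = np.exp(log_p - log_q - log_m)
accepted = np.flatnonzero(rng.random(len(pool_metadata)) < accept_prob)

budget = int(0.02 * len(pool_metadata))  # ~2% labeling budget per cycle
to_label = rng.permutation(accepted)[:budget]
print(f"{len(accepted)} accepted, {len(to_label)} sent for annotation")
```

Working in log space avoids numerical underflow when the densities are small, which is why the acceptance ratio is formed before exponentiating.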
502

Performance Assessment of a 77 GHz Automotive Radar for Various Obstacle Avoidance Applications

Komarabathuni, Ravi V., 26 July 2011
No description available.
503

Artificial data for Image classification in industrial applications

Yonan, Yonan; Baaz, August, January 2022
Machine learning and AI are growing rapidly and are being deployed more often than before due to their high accuracy and performance. One of the biggest challenges in machine learning is data collection. The training data is the most important part of any machine learning project, since it determines how the trained model will behave. For object classification and detection, capturing a large number of images per object is not always possible and can be a very time-consuming and tedious process. This thesis explores options, specific to image classification, that reduce the need to capture many images per object while maintaining the same accuracy. Experiments were performed with the goal of achieving high classification accuracy with a limited dataset. One method explored is creating artificial training images using a game engine. Ways to expand a small dataset, such as data augmentation and regularization methods, are also employed.
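
As a sketch of the data-augmentation side of this approach (the specific transforms and magnitudes below are assumptions, not the thesis's exact pipeline), a torchvision composition like the following lets each epoch see a different random variant of every image, so a small set of originals, real or rendered in a game engine, yields far more training variety:

```python
from PIL import Image
from torchvision import transforms

# Illustrative augmentation pipeline for a small image-classification dataset.
train_tfms = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.7, 1.0)),   # random scale and crop
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.3),
    transforms.RandomRotation(15),                          # +/- 15 degrees
    transforms.ToTensor(),
    transforms.RandomErasing(p=0.25),                       # occlusion-style regularizer
])

img = Image.new("RGB", (256, 256), "gray")  # stand-in for a captured/rendered image
augmented = train_tfms(img)                 # a new random variant on every call
print(augmented.shape)                      # torch.Size([3, 224, 224])
```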
504

3D Object Detection Using Sidescan Sonar Images

Georgiev, Ivaylo, January 2024
Sidescan sonars are tools used for seabed inspection and imaging. Being smaller and cheaper than the alternatives, they have attracted attention, and many studies have been developed to extract information about seabed altitude from the produced images. The main issue is that sidescan sonars do not provide elevation-angle information, so a 3D map of the seabed cannot be inferred directly. One of the most recent techniques to tackle this problem is neural rendering [1], in which the seabed bathymetry is implicitly represented by a neural network. The purpose of this thesis is (1) to find the minimum altitude change that can be detected using this technique, (2) to check whether the position of the sonar ensonification affects these results, and (3) to check from how many sides it is sufficient to ensonify a region with an altitude change in order to detect it confidently. To conduct this research, missions by an autonomous underwater vehicle with sidescan sonar heads on both sides are simulated on a map on which objects of various sizes and shapes are placed. Neural rendering is then used to reconstruct the bathymetry of the map before and after the object insertion from the sidescan data. The reconstructed seabed elevations are compared, and the objects with the smallest size or altitude that can still be detected (meaning that the predicted height from the model trained on the map with the objects is significantly larger than that of the model trained on the initial map) answer the first question. Those smallest objects are then placed on the same map again, and smaller autonomous underwater vehicle missions are used to check how many sides are needed for the objects to remain detectable. The experiments suggest that objects with a bathymetry elevation in the range of centimeters can be detected, and that in some cases ensonification from two sides is sufficient to detect an object with confidence.
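
A toy sketch of the before/after comparison step, assuming the two reconstructions are available as gridded height maps; the noise level and significance rule are invented for illustration and are not the thesis's actual detection criterion:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-ins for the two reconstructed bathymetry grids (meters): one from the
# model trained on the baseline map, one after the object was inserted.
noise_sigma = 0.01                                  # assumed reconstruction noise
before = rng.normal(0.0, noise_sigma, (200, 200))
after = before + rng.normal(0.0, noise_sigma, (200, 200))
after[90:110, 90:110] += 0.05                       # a 5 cm mound on the seabed

# Flag cells whose elevation change clears a noise-based significance threshold.
diff = after - before
changed = diff > 3 * np.sqrt(2) * noise_sigma       # both grids contribute noise
print("cells flagged as elevated:", int(changed.sum()))
```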
505

Enhanced 3D Object Detection And Tracking In Autonomous Vehicles: An Efficient Multi-modal Deep Fusion Approach

Priyank Kalgaonkar, 03 September 2024
<p dir="ltr">This dissertation delves into a significant challenge for Autonomous Vehicles (AVs): achieving efficient and robust perception under adverse weather and lighting conditions. Systems that rely solely on cameras face difficulties with visibility over long distances, while radar-only systems struggle to recognize features like stop signs, which are crucial for safe navigation in such scenarios.</p><p dir="ltr">To overcome this limitation, this research introduces a novel deep camera-radar fusion approach using neural networks. This method ensures reliable AV perception regardless of weather or lighting conditions. Cameras, similar to human vision, are adept at capturing rich semantic information, whereas radars can penetrate obstacles like fog and darkness, similar to X-ray vision.</p><p dir="ltr">The thesis presents NeXtFusion, an innovative and efficient camera-radar fusion network designed specifically for robust AV perception. Building on the efficient single-sensor NeXtDet neural network, NeXtFusion significantly enhances object detection accuracy and tracking. A notable feature of NeXtFusion is its attention module, which refines critical feature representation for object detection, minimizing information loss when processing data from both cameras and radars.</p><p dir="ltr">Extensive experiments conducted on large-scale datasets such as Argoverse, Microsoft COCO, and nuScenes thoroughly evaluate the capabilities of NeXtDet and NeXtFusion. The results show that NeXtFusion excels in detecting small and distant objects compared to existing methods. Notably, NeXtFusion achieves a state-of-the-art mAP score of 0.473 on the nuScenes validation set, outperforming competitors like OFT by 35.1% and MonoDIS by 9.5%.</p><p dir="ltr">NeXtFusion’s excellence extends beyond mAP scores. It also performs well in other crucial metrics, including mATE (0.449) and mAOE (0.534), highlighting its overall effectiveness in 3D object detection. Visualizations of real-world scenarios from the nuScenes dataset processed by NeXtFusion provide compelling evidence of its capability to handle diverse and challenging environments.</p>
506

LiDAR-Based 3D Object Detection Using YOLOv8

Swetha Suresh Menon, 03 September 2024
<p dir="ltr">Autonomous vehicles have gained substantial traction as the future of transportation, necessitating continuous research and innovation. While 2D object detection and instance segmentation methods have made significant strides, 3D object detection offers unparalleled precision. Deep neural network-based 3D object detection, coupled with sensor fusion, has become indispensable for self-driving vehicles, enabling a comprehensive grasp of the spatial geometry of physical objects. In our study of a Lidar-based 3D object detection network using point clouds, we propose a novel architectural model based on You Only Look Once (YOLO) framework. This innovative model combines the efficiency and accuracy of the YOLOv8 network, a swift 2D standard object detector, and a state-of-the-art model, with the real-time 3D object detection capability of the Complex YOLO model. By integrating the YOLOv8 model as the backbone network and employing the Euler Region Proposal (ERP) method, our approach achieves rapid inference speeds, surpassing other object detection models while upholding high accuracy standards. Our experiments, conducted on the KITTI dataset, demonstrate the superior efficiency of our new architectural model. It outperforms its predecessors, showcasing its prowess in advancing the field of 3D object detection in autonomous vehicles.</p>
507

A Multi-Head Attention Approach with Complementary Multimodal Fusion for Vehicle Detection

Nujhat Tabassum, 03 June 2024
<p dir="ltr">In the realm of autonomous vehicle technology, the Multimodal Vehicle Detection Network (MVDNet) represents a significant leap forward, particularly in the challenging context of weather conditions. This paper focuses on the enhancement of MVDNet through the integration of a multi-head attention layer, aimed at refining its performance. The integrated multi-head attention layer in the MVDNet model is a pivotal modification, advancing the network's ability to process and fuse multimodal sensor information more efficiently. The paper validates the improved performance of MVDNet with multi-head attention through comprehensive testing, which includes a training dataset derived from the Oxford Radar Robotcar. The results clearly demonstrate that the Multi-Head MVDNet outperforms the other related conventional models, particularly in the Average Precision (AP) estimation, under challenging environmental conditions. The proposed Multi-Head MVDNet not only contributes significantly to the field of autonomous vehicle detection but also underscores the potential of sophisticated sensor fusion techniques in overcoming environmental limitations.</p>
508

Population Distribution Mapping through the Detection of Building Areas in Google Earth Images of Heterogeneous Regions Using Deep Learning

Cassio Freitas Pereira de Almeida, 08 February 2018
Precise information about the distribution of the population is widely acknowledged as important. The census is considered the most reliable and complete source of this information, and its data are delivered in aggregated form by census sectors. These sectors are operational units of irregular shape and size, which hinders spatial analysis of the data. Transforming sector data onto a regular grid with adequate estimates would therefore facilitate such analysis. A methodology to achieve this transformation can be based on remote sensing image classification to identify the buildings where the population lives. Building detection is a complex task, given the great variability in building characteristics and in image quality, and the usual methods are complex and heavily dependent on specialists. Automatic methods require large annotated datasets for training and are sensitive to image quality and to the characteristics of buildings and their environment. In this thesis we propose an automated method for building detection in Google Earth images, based on a deep learning architecture, that showed good results using a relatively small and highly variable image set, overcoming the limitations of existing processes. An annotated dataset of built-up areas was constructed covering 12 regions of Brazil; these images differ in quality and show large variability in building characteristics and geographic environment. As a proof of concept, the building-area classification was used in dasymetric methods to estimate population on a regular grid. It showed promising results compared with the usual method, enabling an improvement in the quality of the estimates.
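
A toy sketch of the dasymetric reallocation step: each census sector's population is redistributed to grid cells in proportion to the built-up fraction the detector assigns to each cell. All numbers are invented:

```python
import numpy as np

# Invented example: two census sectors, five grid cells.
sector_pop = {0: 1200.0, 1: 300.0}                # population per census sector
cell_sector = np.array([0, 0, 0, 1, 1])           # which sector each cell falls in
cell_built = np.array([0.8, 0.2, 0.0, 0.5, 0.5])  # detected built-up fraction

cell_pop = np.zeros_like(cell_built)
for s, pop in sector_pop.items():
    m = cell_sector == s
    w = cell_built[m]
    if w.sum() > 0:
        cell_pop[m] = pop * w / w.sum()           # allocate by built-up share
    else:
        cell_pop[m] = pop / m.sum()               # fallback: uniform split
print(cell_pop)  # [960. 240.   0. 150. 150.]
```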
509

Deep Convolutional Neural Networks for Real-Time Single Frame Monocular Depth Estimation

Schennings, Jacob, January 2017
Vision-based active safety systems have become increasingly common in modern vehicles, estimating the depth of objects ahead for autonomous driving (AD) and advanced driver-assistance systems (ADAS). In this thesis a lightweight deep convolutional neural network performing real-time depth estimation on single monocular images is implemented and evaluated. Many of the vision-based automatic braking systems in modern vehicles only detect pre-trained object types such as pedestrians and vehicles, and fail to detect general objects such as road debris and roadside obstacles. In stereo vision systems the problem is resolved by calculating a disparity image from the stereo image pair to extract depth information; the distance to an object can also be determined using radar and LiDAR. With this depth information the system performs the actions necessary to avoid collisions with objects that are determined to be too close. However, such systems are more expensive than a regular mono camera system and are therefore not very common in the average consumer car. By implementing robust depth estimation in mono vision systems, the benefits of active safety systems could reach a larger segment of the vehicle fleet, which could drastically reduce traffic accidents related to human error and possibly save many lives. The network architecture evaluated in this thesis is more lightweight than other CNN architectures previously used for monocular depth estimation, and is therefore preferable on computationally constrained systems. The network solves a supervised regression problem during training in order to produce a pixel-wise depth estimation map. It was trained on sparse ground-truth images with spatially incoherent and discontinuous data, and outputs a dense, spatially coherent and continuous depth map prediction. The spatially incoherent ground truth posed a discontinuity problem that was addressed by a masked loss function with regularization. The network was able to predict dense depth estimates on the KITTI dataset with close to state-of-the-art performance.
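
A minimal sketch of a masked regression loss of this kind, with a total-variation smoothness term standing in for the regularization; the exact loss, weighting, and validity convention used in the thesis are not specified here, so these are assumptions:

```python
import torch
import torch.nn.functional as F

def masked_depth_loss(pred, target, smooth_weight=0.1):
    """L1 regression on valid ground-truth pixels only, plus a total-variation
    term that regularizes the dense prediction where no ground truth exists."""
    valid = target > 0                      # convention: 0 marks missing depth
    data_term = F.l1_loss(pred[valid], target[valid])
    dx = (pred[..., :, 1:] - pred[..., :, :-1]).abs().mean()
    dy = (pred[..., 1:, :] - pred[..., :-1, :]).abs().mean()
    return data_term + smooth_weight * (dx + dy)

pred = torch.rand(2, 1, 64, 64, requires_grad=True)
mask = (torch.rand(2, 1, 64, 64) > 0.9).float()  # ~10% valid pixels, as with LiDAR GT
gt = torch.rand(2, 1, 64, 64) * mask
masked_depth_loss(pred, gt).backward()           # gradients flow to dense prediction
```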
510

Development and Evaluation of a Machine Vision System for Digital Thread Data Traceability in a Manufacturing Assembly Environment

Alexander W Meredith, 29 April 2023
A thesis study investigating the development and evaluation of a computer vision (CV) system for a manufacturing assembly task is reported. The CV inference results are compared to a Manufacturing Process Plan, and an automation method completes a buyoff in the software Solumina. Research questions were created and three hypotheses were tested. A literature review found little consensus on Industry 4.0 technology adoption in manufacturing industries and uncovered the need for additional research on CV, specifically regarding its cognitive capabilities in manufacturing. A CV system was developed and evaluated to test for 90% or greater confidence in part detection. A custom CV dataset containing six classes of parts was developed, and the system was trained and validated on it. Dataset contextualization was leveraged and evaluated, as suggested by the literature; the pre-contextualization and post-contextualization datasets were compared with a two-sample t-test, and statistical significance was noted for three classes. A Python script was developed to compare as-assembled locations of components with their as-defined positions per the Manufacturing Process Plan. A comparison-of-yields test between CV-based true positives (TPs) and human-based TPs was conducted with the system operating at a 2-sigma level. An automation method using Microsoft Power Automate was developed to complete the cognitive functionality of the CV system by completing a buyoff in Solumina if CV-based TPs were equal to or greater than human-based TPs.
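
The as-assembled versus as-defined comparison reduces to a per-component tolerance check. A hypothetical sketch (part names, coordinates, and the tolerance are invented; the thesis's actual script and plan format are not reproduced):

```python
import math

# Hypothetical as-defined positions from the Manufacturing Process Plan (mm).
plan = {"bracket": (120.0, 45.0), "fastener": (130.5, 47.2)}
# Hypothetical as-assembled centroids returned by the CV detector (mm).
detected = {"bracket": (121.1, 44.6), "fastener": (133.9, 47.0)}

TOLERANCE_MM = 2.0  # assumed acceptance threshold

for part, (px, py) in plan.items():
    dx = detected[part][0] - px
    dy = detected[part][1] - py
    deviation = math.hypot(dx, dy)
    verdict = "PASS" if deviation <= TOLERANCE_MM else "FAIL"
    print(f"{part}: deviation {deviation:.2f} mm -> {verdict}")
# A wrapper could then trigger the buyoff automation only if every part passes.
```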
