  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
81

Features identification and tracking for an autonomous ground vehicle

Nguyen, Chuong Hoang 14 June 2013 (has links)
This thesis attempts to develop a feature identification and tracking system for an autonomous ground vehicle by focusing on four fundamental tasks: motion detection, object tracking, scene recognition, and object detection and recognition. For motion detection, we combined background subtraction based on a mixture of Gaussians model with optical flow to highlight moving objects as well as newly entered objects that have since stopped moving. To make object tracking more robust, we used a Kalman filter to combine a tracking method based on color histograms with a method based on invariant features. For scene recognition, we applied the Census Transform Histogram (CENTRIST) algorithm, which is based on Census Transform images of the training data and a Support Vector Machine classifier, to recognize a total of 8 scene categories. Because detecting the horizon is also important for many navigation applications, we also performed horizon detection in this thesis. Finally, the deformable parts-based models algorithm was implemented to detect some common objects, such as humans and vehicles. Objects were detected only in the area under the horizon, to reduce detection time and the false matching rate. / Master of Science
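As a rough illustration of the Census Transform step that CENTRIST builds on, the sketch below encodes each interior pixel as an 8-bit code from comparisons with its eight neighbours and then histograms the codes. It is a minimal pure-Python sketch, not the thesis implementation: the function names, the nested-list image format and the `>=` comparison convention are assumptions made for this example.

```python
def census_transform(image):
    """Return 8-bit census codes for the interior pixels of a 2D grayscale image.

    `image` is a list of equal-length rows of numbers (an assumed format).
    A bit is set to 1 when the centre pixel is >= the neighbour (assumed convention).
    """
    rows, cols = len(image), len(image[0])
    # Neighbours visited left to right, top to bottom.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            centre = image[r][c]
            code = 0
            for dr, dc in offsets:
                bit = 1 if centre >= image[r + dr][c + dc] else 0
                code = (code << 1) | bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def census_histogram(image):
    """Return the 256-bin histogram of census codes, usable as a feature vector."""
    hist = [0] * 256
    for row_codes in census_transform(image):
        for code in row_codes:
            hist[code] += 1
    return hist
```

In a pipeline like the one described, such a histogram would be the feature vector handed to a classifier such as a Support Vector Machine.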
82

Supervoxel Based Object Detection and Seafloor Segmentation Using Novel 3d Side-Scan Sonar

Patel, Kushal Girishkumar 12 November 2021 (has links)
Object detection and seafloor segmentation for conventional 2D side-scan sonar imagery is a well-investigated problem. However, due to recent advances in sensing technology, the side-scan sonar now produces a true 3D point cloud representation of the seafloor embedded with echo intensity. This creates a need to develop algorithms to process the incoming 3D data for applications such as object detection and segmentation, and an opportunity to leverage advances in 3D point cloud processing developed for terrestrial applications using optical sensors (e.g. LiDAR). A bottleneck in deploying 3D side-scan sonar sensors for online applications is the complexity of handling large amounts of data, which requires more memory for storing and processing data on embedded computers. The present research aims to improve data processing capabilities on board autonomous underwater vehicles (AUVs). A supervoxel-based framework for over-segmentation and object detection is proposed which reduces a dense point cloud into clusters of similar points in a neighborhood. Supervoxels extracted from the point cloud are then described using feature vectors which are computed using geometry, echo intensity and depth attributes of the constituent points. Unsupervised density-based clustering is applied on the feature space to detect objects, which appear as outliers. / Master of Science / Acoustic imaging using side-scan sonar sensors has proven to be useful for tasks like seafloor mapping, mine countermeasures and habitat mapping. Due to advancements in sensing technology, a novel type of side-scan sonar sensor has been developed which provides a true 3D representation of the seafloor along with the echo intensity image. To improve the usability of the novel sensors on board the carrying vehicles, efficient algorithms need to be developed.
In underwater robotics, only limited computational and data storage capabilities are available, which poses additional challenges in online perception applications like object detection and segmentation. In this project, I investigate a clustering-based approach followed by an unsupervised machine learning method to detect objects on the seafloor using the novel side-scan sonar. I also show the usability of the approach for performing segmentation of the seafloor.
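The idea of detecting objects as outliers of a density-based clustering can be sketched as follows. This is a minimal illustration, not the thesis code: it uses the DBSCAN notion of noise (a point that is neither a core point nor within `eps` of one) as a stand-in for whichever density-based method the thesis applies, and the function name, the parameters `eps` and `min_pts`, and the toy feature vectors are all assumptions.

```python
import math


def density_outliers(features, eps, min_pts):
    """Return indices of feature vectors that DBSCAN-style clustering treats as noise.

    A point is a core point if at least `min_pts` points (itself included) lie
    within distance `eps`. Noise points are non-core points with no core
    point within `eps`.
    """
    n = len(features)
    neighbours = [
        [j for j in range(n) if math.dist(features[i], features[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_pts for nb in neighbours]
    return [
        i
        for i in range(n)
        if not is_core[i] and not any(is_core[j] for j in neighbours[i])
    ]


# Hypothetical supervoxel descriptors: five similar "seafloor" vectors and one outlier.
supervoxel_features = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05), (5.0, 5.0),
]
candidate_objects = density_outliers(supervoxel_features, eps=1.0, min_pts=3)
```

The pairwise distance computation is quadratic in the number of supervoxels, which is acceptable only because over-segmentation has already reduced the dense point cloud to a much smaller set of clusters.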
83

Sémantický popis obrazovky embedded zařízení / Semantic description of the embedded device screen

Horák, Martin January 2020 (has links)
This master's thesis deals with the detection of user interface elements in an image of a printer display using convolutional neural networks. The theoretical part surveys the architectures currently used for object detection. The practical part covers the creation of the image gallery and the training and evaluation of selected models using the TensorFlow Object Detection API. The thesis concludes by discussing the suitability of the trained models for the given task.
84

Machine vision for automation of earth-moving machines : Transfer learning experiments with YOLOv3

Borngrund, Carl January 2019 (has links)
This master's thesis investigates the possibility of creating a machine vision solution for the automation of earth-moving machines. Without some type of vision system, it will not be possible to create a fully autonomous earth-moving machine that can safely be used around humans or other machines. Cameras were used as the primary sensors because they are cheap, provide high resolution and are the type of sensor that most closely mimics the human vision system. The purpose of this thesis was to use existing real-time object detectors together with transfer learning and examine whether they can successfully be used to extract information in environments such as construction, forestry and mining. The amount of data needed to successfully train a real-time object detector was also investigated. Furthermore, the thesis examines whether there are situations that are especially difficult for the chosen object detector, how reliable the object detector is, and finally how service-oriented architecture principles can be used to create deep learning systems. To investigate these questions, three data sets were created in which different properties were varied: light conditions, ground material and dump truck orientation. The data sets were created using a toy dump truck together with a similarly sized wheel loader with a camera mounted on the roof of its cab. The first data set contained only indoor images in which the dump truck was placed in different orientations but neither the light nor the ground material changed. The second data set contained images where the light source was kept constant but the dump truck orientation and ground material changed. The last data set contained images where all properties were varied. The real-time object detector YOLOv3 was used to examine how such a detector would perform depending on which of the three data sets it was trained on.
No matter the data set, it was possible to train a model to perform real-time object detection. Using an Nvidia 980 Ti, the inference time of the model was around 22 ms, which is fast enough to process video running at 30 fps (about 33 ms per frame). All three data sets converged to a training loss of around 0.10. The data sets that contained more varied data performed considerably better: the data set in which all properties were changed reached a validation loss of 0.164, whereas the indoor data set, containing the least varied data, only reached a validation loss of 0.257. The size of the data set was also a factor in performance, but it was not as important as having varied data. The results also showed that models trained on all three data sets could reach a mAP score of around 0.98 using transfer learning.
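The real-time claim comes down to simple arithmetic on the reported figures, which the short sketch below works through. Only the 22 ms inference time and the 30 fps frame rate come from the abstract. The helper names are assumptions, and the calculation ignores image capture, pre-processing and post-processing time.

```python
def max_fps(inference_ms):
    """Highest frame rate sustainable if inference is the only per-frame cost."""
    return 1000.0 / inference_ms


def frame_budget_ms(fps):
    """Time available per frame at a given frame rate."""
    return 1000.0 / fps


inference_ms = 22.0  # inference time reported in the abstract
video_fps = 30.0     # frame rate reported in the abstract

sustainable_fps = max_fps(inference_ms)
headroom_ms = frame_budget_ms(video_fps) - inference_ms

print(f"sustainable frame rate: {sustainable_fps:.1f} fps")
print(f"per-frame headroom at {video_fps:.0f} fps: {headroom_ms:.1f} ms")
```

Under these assumptions the detector has roughly a third of each frame interval left over for the rest of the pipeline.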
85

Experiential Sampling For Object Detection In Video

Paresh, A 05 1900 (has links)
The problem of object detection deals with determining whether an instance of a given class of object is present or not. There are robust, supervised learning based algorithms available for object detection in an image. These image object detectors (image-based object detectors) use characteristics learnt from the training samples to find object and non-object regions. The characteristics used are such that the detectors work under a variety of conditions and hence are very robust. Object detection in video can be performed by using such a detector on each frame of the video sequence. This approach checks for presence of an object around each pixel, at different scales. Such a frame-based approach completely ignores the temporal continuity inherent in the video. The detector declares presence of the object independent of what has happened in the past frames. Also, various visual cues such as motion and color, which give hints about the location of the object, are not used. The current work is aimed at building a generic framework for using a supervised learning based image object detector for video that exploits temporal continuity and the presence of various visual cues. We use temporal continuity and visual cues to speed up the detection and improve detection accuracy by considering past detection results. We propose a generic framework, based on Experiential Sampling [1], which considers temporal continuity and visual cues to focus on a relevant subset of each frame. We determine some key positions in each frame, called attention samples, and object detection is performed only at scales with these positions as centers. These key positions are statistical samples from a density function that is estimated based on various visual cues, past experience and temporal continuity. 
This density estimation is modeled as a Bayesian filtering problem and is carried out using Sequential Monte Carlo methods (also known as particle filtering), where a density is represented by a weighted sample set. The experiential sampling framework is inspired by Neisser’s perceptual cycle [2] and Itti-Koch’s static visual attention model [3]. In this work, we first use Basic Experiential Sampling as presented in [1] for object detection in video and show its limitations. To overcome these limitations, we extend the framework to effectively combine top-down and bottom-up visual attention phenomena. We use the learning-based detector’s response, which is a top-down cue, along with visual cues to improve the attention estimate. To effectively handle multiple objects, we maintain a minimum number of attention samples per object. We propose to use motion as an alert cue to reduce the delay in detecting new objects entering the field of view. We use an inhibition map to avoid revisiting already attended regions. Finally, we improve detection accuracy by using a particle filter based detection scheme [4], also known as Track Before Detect (TBD). In this scheme, we compute the likelihood of presence of the object based on current and past frame data. This likelihood is shown to be approximately equal to the product of average sample weights over past frames. Our framework results in a significant reduction in the overall computation required by the object detector, with an improvement in accuracy, while retaining its robustness. This enables the use of learning-based image object detectors in real-time video applications, for which they would otherwise be too computationally expensive. We demonstrate the usefulness of this framework for frontal face detection in video. We use Viola-Jones’ frontal face detector [5] and color and motion visual cues.
We show results for various cases such as sequences with a single object, multiple objects, distracting background, moving camera, changing illumination, objects entering/exiting the frame, crossing objects, objects with pose variation and sequences with scene change. The main contributions of the thesis are: i) We give an experiential sampling formulation for object detection in video. Many concepts like attention point and attention density, which are vague in [1], are precisely defined. ii) We combine the detector’s response with visual cues to estimate attention. This is inspired by a combination of top-down and bottom-up attention maps in visual attention models. To the best of our knowledge, this is used for the first time for object detection in video. iii) In the case of multiple objects, we highlight the problem with sample-based density representation and solve it by maintaining a minimum number of attention samples per object. iv) For objects first detected by the learning-based detector, we propose to use a TBD scheme for their subsequent detections along with the learning-based detector. This improves accuracy compared to using the learning-based detector alone. This thesis is organized as follows:
Chapter 1: We present a brief survey of related work and define our problem.
Chapter 2: We present an overview of biological models that have motivated our work.
Chapter 3: We give the experiential sampling formulation as in previous work [1], show results and discuss its limitations.
Chapter 4: In this chapter, which is on Enhanced Experiential Sampling, we suggest enhancements to overcome the limitations of basic experiential sampling. We propose a track-before-detect scheme to improve detection accuracy.
Chapter 5: We conclude the thesis and give possible directions for future work in this area.
Appendix A: A description of the video database used in this thesis.
Appendix B: A list of commonly used abbreviations and notations.
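Two ingredients mentioned in the abstract above, a density represented by a weighted sample set and a presence likelihood approximated by the product of average sample weights over past frames, can be sketched in a few lines. This is only an illustration under assumptions: the function names are hypothetical, multinomial resampling is one common choice and not necessarily the one used in the thesis, and the weights are made-up numbers.

```python
import random


def resample(samples, weights, rng):
    """Draw a new sample set with probability proportional to the weights.

    This is plain multinomial resampling, chosen here for brevity.
    """
    return rng.choices(samples, weights=weights, k=len(samples))


def presence_likelihood(weights_per_frame):
    """Product over frames of the average sample weight in each frame."""
    likelihood = 1.0
    for weights in weights_per_frame:
        likelihood *= sum(weights) / len(weights)
    return likelihood


rng = random.Random(0)
positions = [(10, 12), (40, 8), (22, 30)]   # hypothetical attention samples (x, y)
weights = [0.1, 0.1, 0.8]                   # hypothetical per-sample weights
attention_samples = resample(positions, weights, rng)

# Hypothetical sample weights from two consecutive frames.
likelihood = presence_likelihood([[0.5, 0.5], [0.2, 0.4]])
```

Resampling concentrates the attention samples on high-weight positions, which is what lets the detector run at only a few locations per frame.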
86

Automotive 3D Object Detection Without Target Domain Annotations

Gustafsson, Fredrik, Linder-Norén, Erik January 2018 (has links)
In this thesis we study a perception problem in the context of autonomous driving. Specifically, we study the computer vision problem of 3D object detection, in which objects should be detected from various sensor data and their position in the 3D world should be estimated. We also study the application of Generative Adversarial Networks in domain adaptation techniques, aiming to improve the 3D object detection model's ability to transfer between different domains. The state-of-the-art Frustum-PointNet architecture for LiDAR-based 3D object detection was implemented and found to closely match its reported performance when trained and evaluated on the KITTI dataset. The architecture was also found to transfer reasonably well from the synthetic SYN dataset to KITTI, and is thus believed to be usable in a semi-automatic 3D bounding box annotation process. The Frustum-PointNet architecture was also extended to explicitly utilize image features, which surprisingly degraded its detection performance. Furthermore, an image-only 3D object detection model was designed and implemented, which was found to compare quite favourably with current state-of-the-art in terms of detection performance. Additionally, the PixelDA approach was adopted and successfully applied to the MNIST to MNIST-M domain adaptation problem, which validated the idea that unsupervised domain adaptation using Generative Adversarial Networks can improve the performance of a task network for a dataset lacking ground truth annotations. Surprisingly, the approach did however not significantly improve upon the performance of the image-based 3D object detection models when trained on the SYN dataset and evaluated on KITTI.
87

Detekce objektů na GPU / Object Detection on GPU

Macenauer, Pavel January 2015 (has links)
This thesis addresses the topic of object detection on graphics processing units. As part of it, a system for object detection using NVIDIA CUDA was designed and implemented, allowing for real-time video object detection and bulk processing. Its main contribution is a study of what NVIDIA CUDA technology and current graphics processing units offer for accelerating object detection. Parallel algorithms for object detection are also discussed and proposed.
88

Detekce objektů pomocí Houghovy transformace / Object Detection Using Hough Transform

Chroboczek, Martin January 2014 (has links)
This diploma thesis deals with object detection using a mathematical technique called the Hough transform. The Hough transform is presented in general terms, from its simplest use, the detection of elementary, analytically describable shapes such as lines, circles and ellipses, to its sophisticated use for detecting complex objects that are virtually impossible to describe analytically. These include cars and pedestrians, which are detected on the basis of photographic records of such objects. The document thus maps the definition and use of the respective Hough transform subtechniques, along with their basic classification into probabilistic and non-probabilistic methods. The work culminates in a description of the state-of-the-art technique called Class-Specific Hough Forests for Object Detection, covering its definition, its training procedure on a provided dataset and the detection of test patterns. Finally, a generally trainable object detector based on this technique is designed and implemented, and its quality is evaluated experimentally.
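The simplest use mentioned above, detecting straight lines, can be sketched as a voting procedure: every point votes for all lines (rho, theta) passing through it, with rho = x·cos(theta) + y·sin(theta), and the accumulator cell with the most votes is the detected line. The sketch below is an illustrative pure-Python version, not the thesis implementation and not the Hough Forests method. The function names, the accumulator resolution and the test points are assumptions.

```python
import math
from collections import Counter


def hough_votes(points, n_theta=180, rho_step=1.0):
    """Accumulate votes in (rho bin, theta index) space for a set of (x, y) points."""
    votes = Counter()
    for x, y in points:
        for k in range(n_theta):
            theta = math.pi * k / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(round(rho / rho_step), k)] += 1
    return votes


def strongest_line(points, n_theta=180, rho_step=1.0):
    """Return (rho, theta, vote count) of the accumulator cell with the most votes."""
    votes = hough_votes(points, n_theta, rho_step)
    (rho_bin, k), count = votes.most_common(1)[0]
    return rho_bin * rho_step, math.pi * k / n_theta, count


# Ten widely spaced points on the horizontal line y = 5.
line_points = [(10 * i, 5) for i in range(10)]
rho, theta, count = strongest_line(line_points)
```

For points that lie close together, several neighbouring accumulator cells can collect the same number of votes, so practical implementations usually add smoothing or non-maximum suppression on the accumulator.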
89

OBJECT DETECTION USING DEEP LEARNING ON METAL CHIPS IN MANUFACTURING

Andersson Dickfors, Robin, Grannas, Nick January 2021 (has links)
When designing cutting tools for the turning industry, providing optimal cutting parameters is important both for the client and for the company's own research. By examining the metal chips that form in the turning process, operators can recommend optimal cutting parameters. Instead of manually classifying the metal chips that come from the turning process, an automated approach to chip detection and classification is preferred. This thesis aims to evaluate whether such an approach is possible using either a Convolutional Neural Network (CNN) or CNN feature extraction coupled with machine learning (ML). The thesis started with a research phase where we reviewed existing state-of-the-art CNNs, image processing and ML algorithms. Based on this research, we implemented our own object detection algorithm, and we chose to implement two CNNs, AlexNet and VGG16. A third CNN was designed and implemented with our specific task in mind. The three models were tested against each other, both as standalone image classifiers and as feature extractors coupled with an ML algorithm. Because the chips were inside a machine, different camera angles and lighting setups had to be tested to evaluate which setup provided the optimal image for classification. A top view of the cutting area was found to be the optimal angle, with light focused both below the cutting area and in the chip disposal tray. The smaller proposed CNN, with three convolutional layers, three pooling layers and two dense layers, was found to rival both AlexNet and VGG16 both as a standalone classifier and as a feature extractor. The proposed model was designed with a resource-limited system in mind and is therefore better suited to such systems while still having high accuracy. The classification accuracy of the proposed model as a standalone classifier was 92.03%, compared to 92.20% for the state-of-the-art classifier AlexNet and 91.88% for VGG16.
When used as a feature extractor, all three models paired best with the Random Forest algorithm, but the differences in accuracy between the feature extractors were not large. The proposed feature extractor combined with Random Forest had an accuracy of 82.56%, compared to 81.93% for AlexNet and 79.14% for VGG16. / DIGICOGS
90

Semi-Supervised Plant Leaf Detection and Stress Recognition / Semi-övervakad detektering av växtblad och möjlig stressigenkänning

Antal Csizmadia, Márk January 2022 (has links)
One of the main limitations of training deep learning-based object detection models is the availability of large amounts of data annotations. When annotations are scarce, semi-supervised learning provides frameworks to improve object detection performance by utilising unlabelled data. This is particularly useful in plant leaf detection and possible leaf stress recognition, where data annotations are expensive to obtain due to the need for specialised domain knowledge. This project aims to investigate the feasibility of the Unbiased Teacher, a semi-supervised object detection algorithm, for detecting plant leaves and recognising possible leaf stress in experimental settings where few annotations are available during training. We build an annotated data set for this task and implement the Unbiased Teacher algorithm. We optimise the Unbiased Teacher algorithm and compare its performance to that of a baseline model. Finally, we investigate which hyperparameters of the Unbiased Teacher algorithm most significantly affect its performance and its ability to utilise unlabelled images. We find that the Unbiased Teacher algorithm outperforms the baseline model in the experimental settings when limited annotated data are available during training. Amongst the hyperparameters we consider, we identify the confidence threshold as having the most effect on the algorithm’s performance and ability to leverage unlabelled data. Ultimately, we demonstrate the feasibility of improving object detection performance with the Unbiased Teacher algorithm in plant leaf detection and possible stress recognition when few annotations are available. The improved performance reduces the amount of annotated data required for this task, reducing annotation costs and thereby increasing usage for real-world tasks. / One of the main limitations in training deep learning-based object detection models is the availability of large amounts of annotated data.
When little data is available, semi-supervised learning can offer a framework for improving object detection performance by using unannotated data. This is particularly useful for detecting plant leaves and possibly recognising stress symptoms in the leaves, where the cost of annotating data is high because of the need for specialised domain knowledge. This project aims to investigate the feasibility of the Unbiased Teacher, a semi-supervised object detection algorithm, for detecting plant leaves and recognising possible stress symptoms in leaves in experimental settings when only a small amount of annotated data is available during training. To achieve this, we build an annotated data set and implement the Unbiased Teacher. We optimise the Unbiased Teacher and compare its performance with a baseline model. Finally, we examine the hyperparameters that most affect the Unbiased Teacher's performance and its ability to use unannotated images. We find that the Unbiased Teacher outperforms the baseline model in the experimental settings when a limited amount of annotated data is available during training. Among the hyperparameters we consider, we identify the confidence threshold as having the greatest effect on the algorithm's performance and its ability to exploit unannotated data. We demonstrate that object detection performance can be improved with the Unbiased Teacher in plant leaf detection and possible stress recognition when few annotations are available. The improved performance reduces the amount of annotated data required, which lowers annotation costs and thereby increases usability in more practical applications.
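The role of the confidence threshold can be illustrated with a small sketch of pseudo-label selection, the step in teacher-student semi-supervised detection where only sufficiently confident teacher predictions on unlabelled images are kept as training targets for the student. The sketch is an assumption-laden illustration, not the thesis code: the dictionary layout, the function name, the class labels and the threshold values are all hypothetical.

```python
def select_pseudo_labels(predictions, confidence_threshold):
    """Keep only the predictions whose score reaches the confidence threshold."""
    return [p for p in predictions if p["score"] >= confidence_threshold]


# Hypothetical teacher predictions for one unlabelled image.
teacher_predictions = [
    {"box": (12, 30, 80, 95), "label": "leaf", "score": 0.93},
    {"box": (40, 10, 66, 52), "label": "stressed_leaf", "score": 0.71},
    {"box": (5, 5, 20, 18), "label": "leaf", "score": 0.35},
]

strict = select_pseudo_labels(teacher_predictions, confidence_threshold=0.9)
lenient = select_pseudo_labels(teacher_predictions, confidence_threshold=0.5)
```

A higher threshold yields fewer but cleaner pseudo-labels, while a lower one uses more of the unlabelled data at the risk of training on wrong boxes, which is one plausible reason this hyperparameter has a large effect.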
