91 |
Detekce objektů pomocí Houghovy transformace / Object Detection Using Hough Transform. Chroboczek, Martin. January 2014.
This diploma thesis deals with object detection using the mathematical technique called the Hough transform. The Hough transform is treated in general terms, from its simplest use for detecting elementary, analytically describable shapes such as lines, circles and ellipses, to its sophisticated use for detecting complex objects that are virtually impossible to describe analytically. The latter include cars or pedestrians, which are detected on the basis of photographic records of these objects. The document thus maps the definition and use of the respective Hough transform subtechniques, together with their basic classification into probabilistic and non-probabilistic methods. The work subsequently culminates in a description of the state-of-the-art technique called Class-Specific Hough Forests for Object Detection, introducing its definition, its training procedure on a provided dataset, and the detection of test patterns. In conclusion, a generally trainable object detector based on this technique is designed and implemented, and its quality is evaluated experimentally.
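The elementary, non-probabilistic case this abstract opens with (detecting straight lines) can be sketched in a few lines of NumPy. This is an illustrative voting accumulator, not code from the thesis; the image size and angular resolution are arbitrary choices.

```python
import numpy as np

def hough_lines(points, shape, n_theta=180):
    """Vote for lines in (rho, theta) space from 2-D edge points.

    Classic (non-probabilistic) Hough transform for straight lines:
    each edge point votes for every line rho = x*cos(theta) + y*sin(theta)
    that could pass through it.
    """
    h, w = shape
    max_rho = int(np.ceil(np.hypot(h, w)))
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    # Accumulator rows cover rho in [-max_rho, max_rho].
    acc = np.zeros((2 * max_rho + 1, n_theta), dtype=np.int32)
    for x, y in points:
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + max_rho, np.arange(n_theta)] += 1
    return acc, thetas, max_rho

# Points on the horizontal line y = 5 inside a 20x20 image.
pts = [(x, 5) for x in range(20)]
acc, thetas, max_rho = hough_lines(pts, (20, 20))
rho_idx, theta_idx = np.unravel_index(acc.argmax(), acc.shape)
# The strongest bin lies at rho = 5, theta near 90 degrees.
print(rho_idx - max_rho, np.degrees(thetas[theta_idx]))
```

In practice, one would run an edge detector first and read line candidates off the accumulator's local maxima.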
|
92 |
OBJECT DETECTION USING DEEP LEARNING ON METAL CHIPS IN MANUFACTURING. Andersson Dickfors, Robin; Grannas, Nick. January 2021.
When designing cutting tools for the turning industry, providing optimal cutting parameters is important both for the client and for the company's own research. By examining the metal chips that form in the turning process, operators can recommend optimal cutting parameters. Instead of classifying the metal chips from the turning process manually, an automated approach to chip detection and classification is preferred. This thesis evaluates whether such an approach is possible using either a Convolutional Neural Network (CNN) or CNN feature extraction coupled with machine learning (ML). The thesis started with a research phase in which we reviewed existing state-of-the-art CNNs, image processing and ML algorithms. Based on this research, we implemented our own object detection algorithm and chose to implement two CNNs, AlexNet and VGG16. A third CNN was designed and implemented with our specific task in mind. The three models were tested against each other, both as standalone image classifiers and as feature extractors coupled with an ML algorithm. Because the chips were inside a machine, different angles and lighting setups had to be tested to find the setup that provided the optimal image for classification. A top view of the cutting area was found to be the optimal angle, with light focused both below the cutting area and in the chip disposal tray. The smaller proposed CNN, with three convolutional layers, three pooling layers and two dense layers, was found to rival both AlexNet and VGG16, both as a standalone classifier and as a feature extractor. The proposed model was designed with limited systems in mind and is therefore better suited for such systems while still achieving high accuracy. As a standalone classifier, the proposed model reached a classification accuracy of 92.03%, compared to 92.20% for the state-of-the-art classifier AlexNet and 91.88% for VGG16.
When used as feature extractors, all three models paired best with the Random Forest algorithm, and the differences in accuracy between the feature extractors were small. The proposed feature extractor combined with Random Forest reached an accuracy of 82.56%, compared to 81.93% for AlexNet and 79.14% for VGG16. / DIGICOGS
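The feature-extractor-plus-Random-Forest pipeline evaluated in this thesis can be sketched as follows. The features here are synthetic stand-ins for the CNN activations used in the thesis, and the class count, feature dimension and hyperparameters are all illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for CNN features: in the thesis pipeline these
# would come from the last layers of the trained network.
n_per_class, n_features = 200, 64
features = np.vstack([
    rng.normal(loc=c, scale=1.0, size=(n_per_class, n_features))
    for c in range(3)          # three hypothetical chip classes
])
labels = np.repeat(np.arange(3), n_per_class)

X_tr, X_te, y_tr, y_te = train_test_split(
    features, labels, test_size=0.25, random_state=0, stratify=labels)

# Random Forest classifies the extracted feature vectors.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"test accuracy: {acc:.2f}")
```

The appeal of this split is that the expensive CNN runs once per image while the lightweight forest can be retrained cheaply as new chip classes appear.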
|
93 |
Semi-Supervised Plant Leaf Detection and Stress Recognition / Semi-övervakad detektering av växtblad och möjlig stressigenkänning. Antal Csizmadia, Márk. January 2022.
One of the main limitations of training deep learning-based object detection models is the availability of large amounts of data annotations. When annotations are scarce, semi-supervised learning provides frameworks to improve object detection performance by utilising unlabelled data. This is particularly useful in plant leaf detection and possible leaf stress recognition, where data annotations are expensive to obtain due to the need for specialised domain knowledge. This project aims to investigate the feasibility of the Unbiased Teacher, a semi-supervised object detection algorithm, for detecting plant leaves and recognising possible leaf stress in experimental settings where few annotations are available during training. We build an annotated data set for this task and implement the Unbiased Teacher algorithm. We optimise the Unbiased Teacher algorithm and compare its performance to that of a baseline model. Finally, we investigate which hyperparameters of the Unbiased Teacher algorithm most significantly affect its performance and its ability to utilise unlabelled images. We find that the Unbiased Teacher algorithm outperforms the baseline model in the experimental settings when limited annotated data are available during training. Amongst the hyperparameters we consider, we identify the confidence threshold as having the most effect on the algorithm's performance and ability to leverage unlabelled data. Ultimately, we demonstrate the feasibility of improving object detection performance with the Unbiased Teacher algorithm in plant leaf detection and possible stress recognition when few annotations are available. The improved performance reduces the amount of annotated data required for this task, reducing annotation costs and thereby broadening applicability to real-world tasks. 
/ One of the main limitations of training deep learning-based object detection models is the availability of large amounts of annotated data. When little data is available, semi-supervised learning can offer a framework for improving object detection performance by using unannotated data. This is particularly useful for detecting plant leaves and recognising possible stress symptoms in the leaves, where the cost of annotating data is high due to the need for specialised knowledge in the field. This project aims to investigate the feasibility of the Unbiased Teacher, a semi-supervised object detection algorithm, for detecting plant leaves and recognising possible stress symptoms in leaves in experimental settings where only a small amount of annotated data is available during training. To achieve this, we build an annotated data set and implement the Unbiased Teacher. We optimise the Unbiased Teacher and compare its performance with a baseline model. Finally, we examine the hyperparameters that most affect the Unbiased Teacher's performance and its ability to use unannotated images. We find that the Unbiased Teacher outperforms the baseline model in the experimental settings when a limited amount of annotated data is available during training. Among the hyperparameters we consider, we identify the confidence threshold as having the greatest effect on the algorithm's performance and its ability to exploit unannotated data. We demonstrate the possibility of improving object detection performance with the Unbiased Teacher in plant leaf detection and possible stress recognition when few annotations are available. The improved performance reduces the amount of annotated data required, which lowers annotation costs and thereby increases usability for more practical applications.
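The confidence threshold identified above as the most influential hyperparameter governs which teacher detections become pseudo-labels for the student. A minimal sketch of that filtering step, with made-up boxes and scores:

```python
import numpy as np

def filter_pseudo_labels(boxes, scores, labels, tau=0.7):
    """Keep only teacher detections whose confidence exceeds tau.

    In Unbiased Teacher-style training, the teacher's detections on
    unlabelled images become pseudo-labels for the student only if
    their confidence clears this threshold; tau trades pseudo-label
    precision against recall.
    """
    keep = scores >= tau
    return boxes[keep], scores[keep], labels[keep]

# Hypothetical teacher outputs on one unlabelled image.
boxes = np.array([[0, 0, 10, 10], [5, 5, 20, 20], [2, 2, 8, 8]], dtype=float)
scores = np.array([0.95, 0.40, 0.80])
labels = np.array([0, 1, 0])
b, s, l = filter_pseudo_labels(boxes, scores, labels, tau=0.7)
print(len(b))  # 2 boxes survive the 0.7 threshold
```

Raising tau yields fewer but cleaner pseudo-labels; lowering it lets more noise into the student's training signal, which matches the sensitivity the abstract reports.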
|
94 |
Rotation Invariant Histogram Features for Object Detection and Tracking in Aerial Imagery. Mathew, Alex. 05 June 2014.
No description available.
|
95 |
Transfer learning for object category detection. Aytar, Yusuf. January 2014.
Object category detection, the task of determining whether one or more instances of a category are present in an image together with their locations, is one of the fundamental problems of computer vision. The task is very challenging because of the large variations in imaged object appearance, particularly due to changes in viewpoint and illumination, and to intra-class variance. Although successful solutions exist for learning object category detectors, they require massive amounts of training data. Transfer learning builds upon previously acquired knowledge and thus reduces training requirements. The objective of this work is to develop and apply novel transfer learning techniques specific to the object category detection problem. This thesis proposes methods which not only address the challenges of performing transfer learning for object category detection, such as finding relevant sources for transfer, handling aspect ratio mismatches and considering the geometric relations between the features, but also enable large-scale object category detection by quickly learning from considerably fewer training samples and by immediate evaluation of models on web-scale data with the help of part-based indexing. Several novel transfer models are introduced: (a) rigid transfer, for transferring knowledge between similar classes; (b) deformable transfer, which tolerates small structural changes by deforming the source detector while performing the transfer; and (c) part-level transfer, particularly for cases where full template transfer is not possible due to aspect ratio mismatches or the lack of adequately similar sources. Building upon the idea of part-level transfer, part-based indexing is proposed instead of an exhaustive sliding window search, enabling efficient evaluation of templates and immediate detection results in large-scale image collections. 
Furthermore, easier and more robust optimization methods are developed with the help of feature maps defined between the proposed transfer learning formulations and the "classical" SVM formulation.
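The idea behind rigid transfer, staying close to a source detector while fitting scarce target data, can be illustrated with a simplified ridge-style least-squares analogue of the thesis's SVM formulation. Everything here (the weights, data sizes, and the regulariser in place of the hinge loss) is a hypothetical simplification, not the thesis's actual model.

```python
import numpy as np

def rigid_transfer_fit(X, y, w_src, lam):
    """Least-squares fit regularized toward a source detector's weights.

    Minimizes ||X w - y||^2 + lam * ||w - w_src||^2, a ridge-style
    analogue of rigid transfer: with little target data (or large lam)
    the solution stays close to the source template w_src.
    """
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    b = X.T @ y + lam * w_src
    return np.linalg.solve(A, b)

rng = np.random.default_rng(1)
w_true = np.array([1.0, -2.0, 0.5])   # target-class detector we seek
w_src = np.array([0.8, -1.5, 0.3])    # detector for a similar class
X = rng.normal(size=(30, 3))
y = X @ w_true + 0.1 * rng.normal(size=30)

w_weak = rigid_transfer_fit(X, y, w_src, lam=1e-3)   # data dominates
w_strong = rigid_transfer_fit(X, y, w_src, lam=1e6)  # source dominates
print(w_weak, w_strong)
```

The regularisation strength plays the role of the transfer weight: with few target samples, a larger pull toward the source class keeps the detector sensible.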
|
96 |
Computer vision-based detection of fire and violent actions performed by individuals in videos acquired with handheld devices. Moria, Kawther. 28 July 2016.
Advances in social networks and multimedia technologies greatly facilitate the recording and sharing of video data on violent social and/or political events via the Internet. These video data are a rich source of information for identifying the individuals responsible for damaging public and private property through violent behavior. Any abnormal, violent individual behavior could trigger a cascade of undesirable events, such as vandalism and damage to stores and public facilities. When such incidents occur, investigators usually need to analyze thousands of hours of video recorded with handheld devices in order to identify suspects. The exhaustive manual investigation of these video data is highly time- and resource-consuming. Automated detection techniques for abnormal events and actions based on computer vision would offer a more efficient solution to this problem.
The first contribution described in this thesis consists of a novel method for fire detection in riot videos acquired with handheld cameras and smart-phones. This is a typical example of computer vision in the wild, where we have no control over the data acquisition process, and where the quality of the video data varies considerably. The proposed spatial model is based on the Mixtures of Gaussians model and exploits color adjacency in the visible spectrum of incandescence. The experimental results demonstrate that using this spatial model in concert with motion cues leads to highly accurate results for fire detection in noisy, complex scenes of rioting crowds.
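A toy version of a colour-mixture fire model might look as follows. The component means, variances and weights are invented for illustration and are not the parameters fitted in the thesis, which also combines the colour model with motion cues.

```python
import numpy as np

# Hypothetical fire-colour components (RGB means and isotropic stds);
# a real system would fit these to labelled flame pixels.
MEANS = np.array([[255.0, 140.0, 30.0],    # bright orange core
                  [255.0, 220.0, 120.0]])  # yellow-white incandescence
STDS = np.array([30.0, 25.0])
WEIGHTS = np.array([0.6, 0.4])

def fire_likelihood(pixels):
    """Per-pixel likelihood under the 2-component isotropic colour mixture."""
    diff = pixels[:, None, :] - MEANS[None, :, :]        # (n, 2, 3)
    d2 = (diff ** 2).sum(-1) / (2 * STDS ** 2)           # (n, 2)
    comp = WEIGHTS * np.exp(-d2) / (2 * np.pi * STDS ** 2) ** 1.5
    return comp.sum(axis=1)

pixels = np.array([[250.0, 150.0, 40.0],   # flame-like colour
                   [40.0, 60.0, 200.0]])   # sky-like colour
lik = fire_likelihood(pixels)
print(lik[0] > lik[1])  # flame-like pixel scores higher
```

Thresholding such a likelihood map, then requiring consistent flickering motion in the candidate regions, captures the spirit of combining the spatial colour model with motion cues.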
The second contribution consists of a method for detecting abnormal, violent actions that are performed by individual subjects and witnessed by passive crowds. The problem of abnormal individual behavior, such as a fight, witnessed by passive bystanders gathered into a crowd has not been studied before. We show that the presence of a passive, standing crowd is an important indicator that an abnormal action might occur; thus, detecting the standing crowd improves the performance of detecting the abnormal action. The proposed method performs crowd detection first, followed by the detection of abnormal motion events. Our main theoretical contribution consists in linking crowd detection to abnormal, violent actions, as well as in defining novel sets of features that characterize static crowds and abnormal individual actions in both spatial and spatio-temporal domains. Experimental results are computed on a custom dataset, the Vancouver Riot Dataset, which we generated from amateur video footage acquired with handheld devices and uploaded to public social network sites. Our approach achieves good precision and recall values, which validates our system's reliability in localizing the crowds and the abnormal actions.
To summarize, this thesis focuses on the detection of two types of abnormal events occurring in violent street movements, with data gathered by passive participants in these movements using handheld devices. Although our data sets are drawn from a single social movement (the Vancouver 2011 Stanley Cup riot), we are confident that our approaches would generalize well and would be helpful to forensic activities performed in the context of other, similar violent occasions. / Graduate
|
97 |
Ground Object Recognition using Laser Radar Data: Geometric Fitting, Performance Analysis, and Applications. Grönwall, Christina. January 2006.
This thesis concerns the detection and recognition of ground objects using data from laser radar systems. Typical ground objects are vehicles and land mines. For these objects, the orientation and articulation are unknown. The objects are placed in natural or urban areas where the background is unstructured and complex. The performance of laser radar systems is analyzed to obtain models of the uncertainties in laser radar data. A ground object recognition method is presented that handles general, noisy 3D point cloud data. The approach is based on the fact that man-made objects can, on a large scale, be considered to be of rectangular shape or decomposable into a set of rectangles. Several approaches to rectangle fitting are presented and evaluated in Monte Carlo simulations. Since errors-in-variables are present, geometric fitting is used. The objects can have parts that are subject to articulation. A modular least-squares method with outlier rejection that can handle articulated objects is proposed; this method falls within the iterative closest point framework. Recognition when several similar models are available is discussed. The recognition method is applied in a query-based multi-sensor system. The system covers the process from sensor data to the user interface, i.e., from low-level image processing to high-level situation analysis. In object detection and recognition based on laser radar data, the accuracy of the range values is important. A general direct-detection laser radar system applicable to hard-target measurements is modeled. Three time-of-flight estimation algorithms are analyzed: peak detection, constant fraction detection, and matched filtering. The statistical distribution of uncertainties in time-of-flight range estimation is determined. The detection performance for various pulse-shape conditions and signal-to-noise ratios is analyzed, and those results are used to model the properties of the range estimation error. 
The detectors' performance is compared with the Cramér-Rao lower bound. The performance of a tool for synthetic generation of scanning laser radar data is evaluated. In the measurement system model, several design parameters can be varied, which makes it possible to test an estimation scheme under different types of system design. A parametric method, based on measurement error regression, that estimates an object's size and orientation is described. Validations of both the measurement system model and the measurement error model, with respect to the Cramér-Rao lower bound, are presented.
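The simplest of the three time-of-flight estimators analysed here, peak detection, can be sketched as below. The waveform, pulse width, sampling step and target range are all made up for illustration.

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s

def peak_detection_range(waveform, t0, dt):
    """Estimate target range from a return waveform by peak detection.

    Take the sample with the maximum amplitude as the round-trip
    time-of-flight and convert it to a one-way range.
    """
    tof = t0 + np.argmax(waveform) * dt
    return C * tof / 2.0

# Simulated return: a Gaussian echo centred at the time-of-flight of a
# 150 m target, plus receiver noise.
dt = 1e-9                          # 1 ns sampling
t = np.arange(2000) * dt
tof_true = 2 * 150.0 / C           # round-trip time for 150 m
pulse = np.exp(-((t - tof_true) ** 2) / (2 * (5e-9) ** 2))
noisy = pulse + 0.01 * np.random.default_rng(2).normal(size=t.size)
est = peak_detection_range(noisy, 0.0, dt)
print(f"{est:.1f} m")
```

Constant fraction detection and matched filtering refine this by timing a fixed fraction of the rising edge and by correlating with the known pulse shape, respectively, which is why their error distributions differ from simple peak picking.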
|
98 |
Konvoluční neuronové sítě a jejich využití při detekci objektů / Convolutional neural networks and their application in object detection. Hrinčár, Matej. January 2013.
Title: Convolutional neural networks and their application in object detection. Author: Matej Hrinčár. Department: Department of Theoretical Computer Science and Mathematical Logic. Supervisor: doc. RNDr. Iveta Mrázová, CSc. Supervisor's e-mail address: Iveta.Mrazova@mff.cuni.cz. Abstract: Nowadays, it has become popular to enhance live sport streams with augmented reality, such as overlaying various statistics on the hockey players. To do so, the players must first be detected automatically. This thesis deals with that challenging task. Our aim is to deliver not only sufficient accuracy but also speed, because the detection should run in real time. We use one of the newer neural network models, the convolutional network, which is well suited to processing image data and can use the input image without any preprocessing whatsoever. After a detailed analysis, we chose this model as the detector for hockey players. We tested several different network architectures, compared them, and chose the one that is not only accurate but also fast enough. We also tested the robustness of the network with noisy patterns. Finally, we assigned the detected players to their corresponding teams using the K-means algorithm with information about their jersey colors. Keywords:...
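The final team-assignment step, K-means on jersey colour, might be sketched like this. The colour values are invented, and the deterministic farthest-point initialisation is a simplification for the two-team case.

```python
import numpy as np

def kmeans_2(colors, n_iter=20):
    """Tiny 2-cluster K-means for assigning detected players to teams
    by their mean jersey colour (RGB)."""
    # Initialise with the first point and the point farthest from it.
    c0 = colors[0]
    c1 = colors[np.linalg.norm(colors - c0, axis=1).argmax()]
    centers = np.stack([c0, c1])
    for _ in range(n_iter):
        d = np.linalg.norm(colors[:, None] - centers[None], axis=-1)
        assign = d.argmin(axis=1)
        for k in range(2):
            if np.any(assign == k):
                centers[k] = colors[assign == k].mean(axis=0)
    return assign

# Mean jersey colours of six detected players: three reddish, three bluish.
jerseys = np.array([[200, 30, 40], [190, 25, 50], [210, 35, 30],
                    [30, 40, 200], [25, 50, 190], [35, 30, 210]], float)
teams = kmeans_2(jerseys)
print(teams)  # first three share one team, last three the other
```

In a real pipeline, the colour of each player would be averaged over the detected bounding box (excluding ice and skin regions) before clustering.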
|
99 |
Pedestrian Detection on Dewarped Fisheye Images using Deep Neural Networks. Jeereddy, Uttejh Reddy. January 2019.
In the field of autonomous vehicles, Advanced Driver Assistance Systems (ADAS) play a key role. Their applications range from critical safety systems to trivial parking scenarios. To optimize the use of resources, trivial ADAS applications are often limited to low-cost sensors. As a result, sensors such as cameras and ultrasonics are preferred over LiDAR (Light Detection and Ranging) and RADAR (RAdio Detection And Ranging) for assisting the driver with parking. In a parking scenario, to ensure the safety of people in and around the car, the sensors need to detect objects around the car in real time. With the advancements in deep learning, Deep Neural Networks (DNNs) are becoming increasingly effective at detecting objects with real-time performance. This thesis therefore investigates the viability of Deep Neural Networks with fisheye cameras for detecting pedestrians around the car. To achieve this objective, an experiment was conducted on a test vehicle equipped with multiple fisheye cameras. Three Deep Neural Networks, namely YOLOv3 (You Only Look Once), its faster variant Tiny-YOLOv3, and ResNet-50, were chosen to detect pedestrians. The networks were trained on a fisheye image dataset with the help of transfer learning. After training, the models were also compared to pre-trained models that had been trained to detect pedestrians on normal images. Our experiments have shown that the YOLOv3 variants performed well, but had difficulty localizing the pedestrians. The ResNet model failed to generate acceptable detections and thus performed poorly. The three models produced detections with real-time performance for a single camera, but when scaled to multiple cameras, the detection speed was not on par. The YOLOv3 variants could detect pedestrians successfully on dewarped fisheye images, but the pipeline still needs a better dewarping algorithm to lessen the distortion effects. 
Further, the models need to be optimized in order to achieve real-time detections on multiple cameras and to fit on an embedded system.
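A minimal sketch of the dewarping step, assuming the simple equidistant fisheye model (r = f·theta) rather than the calibrated lens model a real pipeline would need; the focal lengths and image size are illustrative.

```python
import numpy as np

def dewarp_map(out_shape, f_pin, f_fish, cx, cy):
    """Pixel lookup map from a rectilinear output image back into an
    equidistant fisheye image (r_fish = f_fish * theta).

    Returns (map_x, map_y): for each output pixel, the fisheye-image
    coordinates to sample from.
    """
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    dx, dy = xs - w / 2.0, ys - h / 2.0
    r_pin = np.hypot(dx, dy)                  # radius in the pinhole image
    theta = np.arctan2(r_pin, f_pin)          # angle from the optical axis
    r_fish = f_fish * theta                   # equidistant fisheye radius
    scale = np.divide(r_fish, r_pin, out=np.zeros_like(r_fish),
                      where=r_pin > 0)
    return cx + dx * scale, cy + dy * scale

map_x, map_y = dewarp_map((480, 640), f_pin=400.0, f_fish=300.0,
                          cx=320.0, cy=240.0)
# The optical axis maps to the fisheye centre.
print(map_x[240, 320], map_y[240, 320])  # 320.0 240.0
```

With OpenCV, such maps could be passed to `cv2.remap` to produce the dewarped image that the detector then consumes.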
|
100 |
Visual Object Detection using Convolutional Neural Networks in a Virtual Environment. Norrstig, Andreas. January 2019.
Visual object detection is a popular computer vision task that has been intensively investigated using deep learning on real data. However, data from virtual environments have not received the same attention. A virtual environment enables generating data for locations that are not easily reachable for data collection, e.g. aerial environments. In this thesis, we study the problem of object detection in virtual environments, more specifically in an aerial virtual environment. We use a simulator to generate a synthetic data set of 16 different types of vehicles captured from an airplane. To study the performance of existing methods in virtual environments, we train and evaluate two state-of-the-art detectors on the generated data set. Experiments show that both detectors, You Only Look Once version 3 (YOLOv3) and the Single Shot MultiBox Detector (SSD), reach performance quality similar to that previously reported in the literature on real data sets. In addition, we investigate fusion techniques between detectors trained on two different subsets of the data set, in this case a subset in which cars have fixed colors and a subset in which cars have varying colors. Experiments show that it is possible to train multiple instances of the detector on different subsets of the data set and to combine these detectors in order to boost performance.
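One simple way to fuse two detectors' outputs, in the spirit of the fusion experiments above, is to pool their detections and suppress duplicates with greedy non-maximum suppression. The boxes, scores and the IoU threshold below are invented for illustration.

```python
def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def fuse_detections(dets_a, dets_b, iou_thr=0.5):
    """Merge two detectors' outputs: pool all (box, score) pairs and
    run greedy non-maximum suppression across the union."""
    dets = sorted(dets_a + dets_b, key=lambda d: -d[1])
    kept = []
    for box, score in dets:
        if all(iou(box, kb) < iou_thr for kb, _ in kept):
            kept.append((box, score))
    return kept

# Both detectors find the same car (overlapping boxes); detector B also
# finds a second one.
a = [([10, 10, 50, 50], 0.9)]
b = [([12, 11, 52, 49], 0.8), ([100, 100, 140, 140], 0.7)]
fused = fuse_detections(a, b)
print(len(fused))  # overlapping pair collapses to one -> 2 detections
```

More elaborate fusion schemes average the scores or coordinates of the suppressed duplicates instead of discarding them, which can further boost precision.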
|