91

Semi-Supervised Plant Leaf Detection and Stress Recognition / Semi-övervakad detektering av växtblad och möjlig stressigenkänning

Antal Csizmadia, Márk January 2022 (has links)
One of the main limitations of training deep learning-based object detection models is the availability of large amounts of data annotations. When annotations are scarce, semi-supervised learning provides frameworks to improve object detection performance by utilising unlabelled data. This is particularly useful in plant leaf detection and possible leaf stress recognition, where data annotations are expensive to obtain due to the need for specialised domain knowledge. This project investigates the feasibility of the Unbiased Teacher, a semi-supervised object detection algorithm, for detecting plant leaves and recognising possible leaf stress in experimental settings where few annotations are available during training. We build an annotated data set for this task and implement the Unbiased Teacher algorithm. We optimise the algorithm and compare its performance to that of a baseline model. Finally, we investigate which hyperparameters of the Unbiased Teacher algorithm most significantly affect its performance and its ability to utilise unlabelled images. We find that the Unbiased Teacher algorithm outperforms the baseline model when limited annotated data are available during training. Amongst the hyperparameters we consider, the confidence threshold has the largest effect on the algorithm's performance and its ability to leverage unlabelled data. Ultimately, we demonstrate the feasibility of improving object detection performance with the Unbiased Teacher algorithm in plant leaf detection and possible stress recognition when few annotations are available. The improved performance reduces the amount of annotated data required for this task, lowering annotation costs and thereby making the approach more practical for real-world use.
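The core mechanism the abstract describes — pseudo-labelling gated by a confidence threshold, with a teacher model kept as a moving average of the student — can be sketched compactly. The following is a minimal, framework-free illustration, not the thesis's implementation: the detector itself is abstracted away, and the threshold and decay values are assumed rather than taken from the thesis.

```python
import numpy as np

# Minimal sketch of the two mechanisms that drive Unbiased Teacher:
# (1) confidence-threshold pseudo-labelling and (2) the exponential-
# moving-average (EMA) teacher update. `boxes`/`scores`/`labels` stand
# in for a teacher detector's raw outputs on a weakly augmented image.

CONF_THRESH = 0.7   # hypothetical value; the thesis tunes this threshold
EMA_DECAY = 0.999   # typical EMA decay, assumed here

def filter_pseudo_labels(boxes, scores, labels, thresh=CONF_THRESH):
    """Keep only detections the teacher is confident about; these become
    the pseudo-ground-truth the student is trained against."""
    keep = scores >= thresh
    return boxes[keep], labels[keep]

def ema_update(teacher_weights, student_weights, decay=EMA_DECAY):
    """Teacher parameters slowly track the student, which stabilises
    the pseudo-labels over training."""
    return {k: decay * teacher_weights[k] + (1.0 - decay) * student_weights[k]
            for k in teacher_weights}

# Toy usage: three teacher detections, only two survive the threshold.
boxes = np.array([[10, 10, 50, 60], [5, 5, 20, 20], [30, 40, 90, 120]], float)
scores = np.array([0.95, 0.40, 0.81])
labels = np.array([0, 1, 0])
kept_boxes, kept_labels = filter_pseudo_labels(boxes, scores, labels)
print(len(kept_boxes), "pseudo-labels kept")  # -> 2
```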
92

Rotation Invariant Histogram Features for Object Detection and Tracking in Aerial Imagery

Mathew, Alex 05 June 2014 (has links)
No description available.
93

Transfer learning for object category detection

Aytar, Yusuf January 2014 (has links)
Object category detection, the task of determining whether one or more instances of a category are present in an image and finding their locations, is one of the fundamental problems of computer vision. The task is very challenging because of the large variations in imaged object appearance, particularly due to changes in viewpoint and illumination and to intra-class variance. Although successful solutions exist for learning object category detectors, they require massive amounts of training data. Transfer learning builds upon previously acquired knowledge and thus reduces training requirements. The objective of this work is to develop and apply novel transfer learning techniques specific to the object category detection problem. This thesis proposes methods that address the challenges of transfer learning for object category detection, such as finding relevant sources for transfer, handling aspect ratio mismatches, and considering the geometric relations between features. The methods also enable large-scale object category detection by learning quickly from considerably fewer training samples and by evaluating models immediately on web-scale data with the help of part-based indexing. Several novel transfer models are introduced: (a) rigid transfer, for transferring knowledge between similar classes; (b) deformable transfer, which tolerates small structural changes by deforming the source detector while performing the transfer; and (c) part-level transfer, for cases where full template transfer is not possible due to aspect ratio mismatches or the lack of sufficiently similar sources. Building upon the idea of part-level transfer, part-based indexing is proposed in place of an exhaustive sliding-window search, enabling efficient evaluation of templates and immediate detection results in large-scale image collections. Furthermore, easier and more robust optimisation methods are developed with the help of feature maps defined between the proposed transfer learning formulations and the “classical” SVM formulation.
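The rigid transfer model (a) can be illustrated as a small regularised SVM in which the weight vector is shrunk towards a pre-trained source detector rather than towards zero, so that few target examples suffice. The sketch below is a simplified stand-in for the thesis's formulation — plain subgradient descent on synthetic data, with assumed hyperparameters — rather than the feature-map reduction to a standard SVM that the thesis actually uses.

```python
import numpy as np

# Rigid transfer as a regularised linear SVM:
#   min_w  lam/2 * ||w - w_src||^2 + C * sum_i hinge(y_i * <w, x_i>)
# The source detector w_src acts as the prior instead of the origin.

def rigid_transfer_svm(X, y, w_src, lam=1.0, C=1.0, lr=0.01, epochs=200):
    w = w_src.copy()
    for _ in range(epochs):
        margins = y * (X @ w)
        active = margins < 1.0                      # examples inside the margin
        grad = lam * (w - w_src) - C * (y[active, None] * X[active]).sum(axis=0)
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
w_src = np.array([1.0, 0.5])                        # hypothetical source detector
X = rng.normal(size=(20, 2))
y = np.sign(X @ np.array([0.9, 0.7]))               # target task, similar to source
w = rigid_transfer_svm(X, y, w_src)
print("transferred detector:", w)
```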
94

Computer vision-based detection of fire and violent actions performed by individuals in videos acquired with handheld devices

Moria, Kawther 28 July 2016 (has links)
Advances in social networks and multimedia technologies greatly facilitate the recording and sharing of video data on violent social and/or political events via the Internet. These video data are a rich source of information in terms of identifying the individuals responsible for damaging public and private property through violent behavior. Any abnormal, violent individual behavior could trigger a cascade of undesirable events, such as vandalism and damage to stores and public facilities. When such incidents occur, investigators usually need to analyze thousands of hours of videos recorded using handheld devices in order to identify suspects. The exhaustive manual investigation of these video data is highly time- and resource-consuming. Automated detection techniques for abnormal events and actions based on computer vision would offer a more efficient solution to this problem. The first contribution described in this thesis consists of a novel method for fire detection in riot videos acquired with handheld cameras and smart-phones. This is a typical example of computer vision in the wild, where we have no control over the data acquisition process, and where the quality of the video data varies considerably. The proposed spatial model is based on the Mixtures of Gaussians model and exploits color adjacency in the visible spectrum of incandescence. The experimental results demonstrate that using this spatial model in concert with motion cues leads to highly accurate results for fire detection in noisy, complex scenes of rioting crowds. The second contribution consists of a method for detecting abnormal, violent actions that are performed by individual subjects and witnessed by passive crowds. The problem of abnormal individual behavior, such as a fight, witnessed by passive bystanders gathered into a crowd has not been studied before. We show that the presence of a passive, standing crowd is an important indicator that an abnormal action might occur. Thus, detecting the standing crowd improves the performance of detecting the abnormal action. The proposed method performs crowd detection first, followed by the detection of abnormal motion events. Our main theoretical contribution consists in linking crowd detection to abnormal, violent actions, as well as in defining novel sets of features that characterize static crowds and abnormal individual actions in both spatial and spatio-temporal domains. Experimental results are computed on a custom dataset, the Vancouver Riot Dataset, that we generated using amateur video footage acquired with handheld devices and uploaded to public social network sites. Our approach achieves good precision and recall values, which validates our system's reliability in localizing the crowds and the abnormal actions. To summarize, this thesis focuses on the detection of two types of abnormal events occurring in violent street movements, with data gathered by passive participants in these movements using handheld devices. Although our data sets are drawn from a single social movement (the Vancouver 2011 Stanley Cup riot), we are confident that our approaches would generalize well and would be helpful to forensic activities performed in the context of other similar violent occasions.
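The colour part of the fire-detection model can be sketched as follows. This is a simplified stand-in, not the thesis's method: the Mixture of Gaussians is fitted here on synthetic "fire" pixels with assumed thresholds, the motion cue is plain frame differencing, and the colour-adjacency component of the spatial model is omitted.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Model the colour distribution of fire pixels with a Mixture of
# Gaussians, then gate candidate pixels with a simple motion cue.
rng = np.random.default_rng(1)
# Synthetic "fire" training pixels: reddish-orange RGB values.
fire_pixels = rng.normal([230, 120, 30], [20, 30, 15], size=(500, 3))
gmm = GaussianMixture(n_components=3, random_state=0).fit(fire_pixels)

def fire_mask(frame, prev_frame, loglik_thresh=-20.0, motion_thresh=15.0):
    """Flag a pixel as fire if its colour is likely under the GMM *and*
    it changed between consecutive frames (flicker)."""
    h, w, _ = frame.shape
    loglik = gmm.score_samples(frame.reshape(-1, 3)).reshape(h, w)
    motion = np.abs(frame.astype(float) - prev_frame.astype(float)).mean(axis=2)
    return (loglik > loglik_thresh) & (motion > motion_thresh)

# Toy frames: a flickering bright-orange patch on a dark background.
prev = np.zeros((32, 32, 3))
curr = prev.copy()
curr[8:16, 8:16] = [235, 125, 25]
print(fire_mask(curr, prev).sum(), "fire pixels detected")  # -> 64
```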
95

Ground Object Recognition using Laser Radar Data : Geometric Fitting, Performance Analysis, and Applications

Grönwall, Christina January 2006 (has links)
This thesis concerns the detection and recognition of ground objects using data from laser radar systems. Typical ground objects are vehicles and land mines. For these objects, the orientation and articulation are unknown. The objects are placed in natural or urban areas where the background is unstructured and complex. The performance of laser radar systems is analyzed to obtain models of the uncertainties in laser radar data. A ground object recognition method is presented. It handles general, noisy 3D point cloud data. The approach is based on the fact that man-made objects on a large scale can be considered to be of rectangular shape, or can be decomposed into a set of rectangles. Several approaches to rectangle fitting are presented and evaluated in Monte Carlo simulations. Errors-in-variables are present, and thus geometric fitting is used. The objects can have parts that are subject to articulation. A modular least-squares method with outlier rejection, which can handle articulated objects, is proposed. This method falls within the iterative closest point framework. Recognition when several similar models are available is discussed. The recognition method is applied in a query-based multi-sensor system. The system covers the process from sensor data to the user interface, i.e., from low-level image processing to high-level situation analysis. In object detection and recognition based on laser radar data, the accuracy of the range values is important. A general direct-detection laser radar system applicable to hard-target measurements is modeled. Three time-of-flight estimation algorithms are analyzed: peak detection, constant fraction detection, and matched filtering. The statistical distribution of uncertainties in time-of-flight range estimation is determined. The detection performance for various shape conditions and signal-to-noise ratios is analyzed, and the results are used to model the properties of the range estimation error. The detectors' performance is compared with the Cramér-Rao lower bound. The performance of a tool for synthetic generation of scanning laser radar data is evaluated. In the measurement system model, it is possible to vary several design parameters, which makes it possible to test an estimation scheme under different types of system design. A parametric method, based on measurement error regression, that estimates an object's size and orientation is described. Validations of both the measurement system model and the measurement error model, with respect to the Cramér-Rao lower bound, are presented.
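As a rough illustration of rectangle fitting to noisy point data, the sketch below estimates a rectangle's orientation from the principal axis of the points and reads off its extents in that frame. This is only the simplest possible fit, run on synthetic 2D data; the thesis evaluates proper geometric (errors-in-variables) fitting methods, which this sketch does not implement.

```python
import numpy as np

def fit_rectangle(points):
    """Return (centre, angle, width, height) of a rectangle fitted to points."""
    centre = points.mean(axis=0)
    centred = points - centre
    # The principal axis of the point cloud gives the rectangle's orientation.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    angle = np.arctan2(vt[0, 1], vt[0, 0]) % np.pi   # fold the sign ambiguity
    rotated = centred @ vt.T                          # points in rectangle frame
    extents = rotated.max(axis=0) - rotated.min(axis=0)
    return centre, angle, extents[0], extents[1]

# Toy data: noisy points on the boundary of a 4 x 2 rectangle rotated 30 deg.
rng = np.random.default_rng(2)
s = rng.uniform(-1.0, 1.0, 400)
sign = rng.choice([-1.0, 1.0], 400)
top_bottom = np.column_stack([2.0 * s[:200], sign[:200]])       # y = +/-1 edges
left_right = np.column_stack([2.0 * sign[200:], s[200:]])       # x = +/-2 edges
pts = np.vstack([top_bottom, left_right])
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
pts = pts @ R.T + rng.normal(0.0, 0.05, size=pts.shape)
centre, angle, w, h = fit_rectangle(pts)
print(f"angle = {np.degrees(angle):.1f} deg, size = {w:.2f} x {h:.2f}")
```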
96

Konvoluční neuronové sítě a jejich využití při detekci objektů / Convolutional neural networks and their application in object detection

Hrinčár, Matej January 2013 (has links)
Title: Convolutional neural networks and their application in object detection Author: Matej Hrinčár Department: Department of Theoretical Computer Science and Mathematical Logic Supervisor: doc. RNDr. Iveta Mrázová, CSc. Supervisor's e-mail address: Iveta.Mrazova@mff.cuni.cz Abstract: Nowadays, it has become popular to enhance live sport streams with augmented reality, such as adding various statistics over the hockey players. To do so, the players must first be detected automatically. This thesis deals with this challenging task. Our aim is to deliver not only sufficient accuracy but also speed, because the detection should run in real time. We use one of the newer neural network models, the convolutional network. This model is suitable for processing image data and can use the input image without any preprocessing whatsoever. After a detailed analysis, we choose this model as the detector for hockey players. We have tested several different network architectures, which we then compared, choosing the one that is not only accurate but also fast enough. We have also tested the robustness of the network with noisy patterns. Finally, we assigned the detected players to their corresponding teams with the K-means algorithm, using the information about their jersey color. Keywords:...
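The final team-assignment step lends itself to a short sketch: cluster each detected player's mean jersey colour with K-means, k = 2. Everything below — the frame, the boxes, the colours — is synthetic; in the thesis the boxes come from the convolutional network.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
frame = rng.integers(0, 255, size=(480, 640, 3)).astype(float)

# Hypothetical player boxes (x, y, w, h); paint half "red", half "blue".
boxes = [(50, 50, 30, 60), (200, 80, 30, 60), (400, 90, 30, 60), (550, 60, 30, 60)]
for i, (x, y, w, h) in enumerate(boxes):
    frame[y:y + h, x:x + w] = [200, 30, 30] if i % 2 == 0 else [30, 30, 200]

def mean_jersey_colour(frame, box):
    """Average colour of the upper half of the box, where the jersey is."""
    x, y, w, h = box
    return frame[y:y + h // 2, x:x + w].reshape(-1, 3).mean(axis=0)

colours = np.array([mean_jersey_colour(frame, b) for b in boxes])
teams = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(colours)
print("team assignment:", teams)   # e.g. [0 1 0 1]
```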
97

Pedestrian Detection on Dewarped Fisheye Images using Deep Neural Networks

Jeereddy, Uttejh Reddy January 2019 (has links)
In the field of autonomous vehicles, Advanced Driver Assistance Systems (ADAS) play a key role. Their applications vary from aiding with critical safety systems to assisting with trivial parking scenarios. To optimize the use of resources, trivial ADAS applications are often limited to low-cost sensors. As a result, sensors such as cameras and ultrasonics are preferred over LiDAR (Light Detection and Ranging) and RADAR (RAdio Detection And Ranging) for assisting the driver with parking. In a parking scenario, to ensure the safety of people in and around the car, the sensors need to detect objects around the car in real time. With the advancements in deep learning, deep neural networks (DNNs) are becoming increasingly effective at detecting objects with real-time performance. This thesis therefore investigates the viability of deep neural networks using fisheye cameras to detect pedestrians around the car. To achieve this objective, an experiment was conducted on a test vehicle equipped with multiple fisheye cameras. Three deep neural networks, namely YOLOv3 (You Only Look Once), its faster variant Tiny-YOLOv3, and ResNet-50, were chosen to detect pedestrians. The networks were trained on a fisheye image dataset with the help of transfer learning. After training, the models were also compared to pre-trained models that were trained to detect pedestrians in normal images. Our experiments have shown that the YOLOv3 variants performed well but had difficulty localizing the pedestrians. The ResNet model failed to generate acceptable detections and thus performed poorly. The three models produced detections with real-time performance for a single camera, but when scaled to multiple cameras, the detection speed was not on par. The YOLOv3 variants could detect pedestrians successfully on dewarped fisheye images, but the pipeline still needs a better dewarping algorithm to lessen the distortion effects. Further, the models need to be optimized in order to generate detections with real-time performance on multiple cameras, and also to fit the model on an embedded system.
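The dewarping step that precedes detection can be sketched with OpenCV's fisheye camera model. The intrinsics and distortion coefficients below are made-up placeholders (real values come from calibrating the vehicle's cameras), and this is only one plausible way to set up the remap, not necessarily the pipeline used in the thesis.

```python
import numpy as np
import cv2

h, w = 480, 640
K = np.array([[300.0, 0.0, 320.0],       # hypothetical fisheye intrinsics
              [0.0, 300.0, 240.0],
              [0.0, 0.0, 1.0]])
D = np.array([0.1, -0.05, 0.01, 0.0]).reshape(4, 1)  # hypothetical k1..k4

# Precompute the fisheye-to-rectilinear remap once, then apply per frame.
map1, map2 = cv2.fisheye.initUndistortRectifyMap(
    K, D, np.eye(3), K, (w, h), cv2.CV_16SC2)

frame = np.zeros((h, w, 3), np.uint8)     # stand-in for a camera frame
dewarped = cv2.remap(frame, map1, map2, interpolation=cv2.INTER_LINEAR)
# `dewarped` is what would be passed to YOLOv3 / Tiny-YOLOv3 for detection.
print(dewarped.shape)
```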
98

Visual Object Detection using Convolutional Neural Networks in a Virtual Environment

Norrstig, Andreas January 2019 (has links)
Visual object detection is a popular computer vision task that has been intensively investigated using deep learning on real data. However, data from virtual environments have not received the same attention. A virtual environment enables generating data for locations that are not easily reachable for data collection, e.g. aerial environments. In this thesis, we study the problem of object detection in virtual environments, more specifically an aerial virtual environment. We use a simulator to generate a synthetic data set of 16 different types of vehicles captured from an airplane. To study the performance of existing methods in virtual environments, we train and evaluate two state-of-the-art detectors on the generated data set. Experiments show that both detectors, You Only Look Once version 3 (YOLOv3) and the Single Shot MultiBox Detector (SSD), reach similar performance quality as previously presented in the literature on real data sets. In addition, we investigate different fusion techniques between detectors trained on two different subsets of the data set, in this case a subset in which cars have fixed colors and a subset in which cars have varying colors. Experiments show that it is possible to train multiple instances of the detector on different subsets of the data set and combine these detectors to boost performance.
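One plausible fusion scheme for the two detectors — not necessarily one of those evaluated in the thesis — is to pool both detectors' boxes and apply non-maximum suppression, keeping the highest-scoring box wherever they overlap. A minimal sketch with synthetic detections:

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def fuse(dets_a, dets_b, iou_thresh=0.5):
    """Pool detections from both models; greedy NMS keeps the strongest."""
    pooled = sorted(dets_a + dets_b, key=lambda d: -d[1])  # (box, score)
    kept = []
    for box, score in pooled:
        if all(iou(box, k[0]) < iou_thresh for k in kept):
            kept.append((box, score))
    return kept

# Both detectors fire on the same car; fusion keeps the stronger one.
detector_fixed = [((100, 100, 180, 160), 0.72)]
detector_varying = [((104, 98, 182, 158), 0.88), ((300, 200, 360, 250), 0.65)]
print(fuse(detector_fixed, detector_varying))
```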
99

Automatic Number Plate Recognition for Android

Larsson, Stefan, Mellqvist, Filip January 2019 (has links)
This thesis describes how we utilize machine learning and image preprocessing to create a system that can extract a license plate number from a picture of a car taken with an Android smartphone. This project was provided by ÅF on behalf of one of their customers, who wanted to make their employees' workflow more efficient. The two main techniques of this project are object detection, to detect license plates, and optical character recognition, to then read them. In between are several image preprocessing steps that make the images as readable as possible; these mainly include deskewing and color-distorting the image. The object detection consists of a convolutional neural network using the You Only Look Once technique, trained by us using Darkflow. When using our final product to read license plates of expected quality in our evaluation phase, we found that 94.8% of them were read correctly. Without our image preprocessing, this figure dropped to only 7.95%.
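The preprocessing between detection and reading can be sketched as a perspective deskew followed by binarisation. The corner coordinates below are hypothetical stand-ins for what the YOLO detector would supply, and the OCR call itself is omitted; this illustrates the deskewing idea rather than the exact pipeline of the thesis.

```python
import numpy as np
import cv2

def deskew_plate(image, corners, out_w=240, out_h=60):
    """Map the skewed plate quadrilateral onto an upright rectangle."""
    src = np.array(corners, np.float32)               # tl, tr, br, bl
    dst = np.array([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]], np.float32)
    M = cv2.getPerspectiveTransform(src, dst)
    plate = cv2.warpPerspective(image, M, (out_w, out_h))
    gray = cv2.cvtColor(plate, cv2.COLOR_BGR2GRAY)
    # Otsu thresholding gives a clean binary image for the OCR stage.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary

img = np.full((480, 640, 3), 128, np.uint8)           # stand-in camera frame
corners = [(200, 220), (420, 205), (425, 265), (205, 280)]  # hypothetical
binary_plate = deskew_plate(img, corners)
print(binary_plate.shape)                             # (60, 240), ready for OCR
```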
100

Region Proposal Based Object Detectors Integrated With an Extended Kalman Filter for a Robust Detect-Tracking Algorithm

Khajo, Gabriel January 2019 (has links)
In this thesis we present a detect-tracking algorithm (see figure 3.1) that combines the detection robustness of static region proposal based object detectors, like the faster region-based convolutional neural network (Faster R-CNN) and the region-based fully convolutional network (R-FCN) model, with the tracking prediction strength of extended Kalman filters, by using what we have called a translating and non-rigid user-input region of interest (RoI) mapping. This RoI-mapping maps a region, which includes the object that one is interested in tracking, to a featureless three-channeled image. The detection part of our proposed algorithm is then performed on the image that includes only the RoI features (see figure 3.2). After the detection step, our model re-maps the RoI features to the original frame and translates the RoI to the center of the prediction. If no detection occurs, our proposed model integrates a temporal dependence through a Kalman filter acting as a predictor; this filter is continuously corrected when detections do occur. To train the region proposal based object detectors that we integrate into our detect-tracking model, we used TensorFlow's object detection API with random-search hyperparameter tuning, fine-tuning all models from TensorFlow Slim base network classification checkpoints. The trained region proposal based object detectors used the Inception V2 base network for the Faster R-CNN model and the R-FCN model, while the Inception V3 base network was applied only to the Faster R-CNN model. This was done to compare the two base networks and their corresponding effects on the detection models. For the implementation part of our detect-tracking model, such as the extended Kalman filter, we used Python and OpenCV. The results show that, with a stationary camera reference frame, our proposed detect-tracking algorithm, combined with region proposal based object detectors on images of size 414 × 740 × 3, can detect and track a small object in real time, like a tennis ball moving along a horizontal trajectory with an average velocity v ≈ 50 km/h at a distance d = 25 m, with a combined detect-tracking frequency of about 13 to 14 Hz. The largest measured state error between the actual state and the predicted state from the Kalman filter, at the aforementioned horizontal velocity, was a maximum of 10-15 pixels (see table 5.1), but in frames where many detections occur this error has been shown to be much smaller (3-5 pixels). Additionally, our combined detect-tracking model has been shown to handle obstacles and two overlapping learnable features, thanks to the integrated extended Kalman filter. Lastly, our detect-tracking model was also applied to a set of infrared images, where the goal was to detect and track a truck moving along a semi-horizontal path. Our results show that a Faster R-CNN Inception V2 model was able to extract features from a sequence of infrared frames, and that our proposed RoI-mapping method worked relatively well at detecting only one truck in a short test sequence (see figure 5.22).
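The filter's role in the loop can be sketched with a constant-velocity model: predict the object's position every frame, and correct whenever the detector fires. With a linear motion model the extended Kalman filter reduces to the ordinary linear filter below; all noise covariances and the synthetic detections are assumed values, not taken from the thesis.

```python
import numpy as np

dt = 1.0 / 14.0                       # ~14 Hz combined detect-tracking rate
F = np.array([[1, 0, dt, 0],          # state: [x, y, vx, vy]
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], float)
H = np.array([[1, 0, 0, 0],           # the detector measures position only
              [0, 1, 0, 0]], float)
Q = np.eye(4) * 0.5                   # process noise (assumed values)
R = np.eye(2) * 4.0                   # measurement noise, in pixels^2

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def correct(x, P, z):
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)    # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P

x, P = np.array([0.0, 0.0, 0.0, 0.0]), np.eye(4) * 100.0
for frame in range(5):
    x, P = predict(x, P)              # runs every frame
    detection = np.array([10.0 * (frame + 1), 5.0 * (frame + 1)])  # synthetic
    x, P = correct(x, P, detection)   # skipped on frames with no detection
print("estimated state [x y vx vy]:", np.round(x, 2))
```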
