Global ETD Search

51	Contextual models for object detection using boosted random fields Torralba, Antonio, Murphy, Kevin P., Freeman, William T. 25 June 2004 We seek to both detect and segment objects in images. To exploit both local image data and contextual information, we introduce Boosted Random Fields (BRFs), which use boosting to learn the graph structure and local evidence of a conditional random field (CRF). The graph structure is learned by assembling graph fragments in an additive model. The connections between individual pixels are not very informative, but by using dense graphs, we can pool information from large regions of the image; dense models also support efficient inference. We show how contextual information from other objects can improve detection performance, in terms of both accuracy and speed, by using a computational cascade. We apply our system to detect stuff and things in office and street scenes. AI Object detection context boosting BP random fields
52	Towards the Design of Neural Network Framework for Object Recognition and Target Region Refining for Smart Transportation Systems Zhao, Yiheng 13 August 2018 Object recognition systems have a significant influence on modern life. Face, iris and fingerprint recognition applications are commonly used for security purposes; ASR (Automatic Speech Recognition) is commonly used to generate subtitles for video and audio, for example on YouTube; HWR (Handwriting Recognition) systems are essential in post offices for cheque and postcode detection; ADAS (Advanced Driver Assistance Systems) are widely used to improve the safety of drivers, passengers and pedestrians. Object recognition techniques are crucial and valuable for academia, commerce and industry. Accuracy and efficiency are two important criteria for evaluating the performance of recognition techniques. Accuracy covers how many objects can be detected in a real scene and how many of them can be correctly classified. Efficiency means the speed of system training and sample testing. Traditional object detection methods, such as the HOG (Histogram of Oriented Gradients) feature detector combined with an SVM (Support Vector Machine) classifier, cannot compete with neural network frameworks in either efficiency or accuracy. Since neural networks offer better performance and potential for improvement, it is worth gaining insight into this field in order to design more advanced recognition systems. In this thesis, we list and analyze sophisticated techniques and frameworks for object recognition. To understand the mathematical theory behind network design, state-of-the-art networks from ILSVRC (ImageNet Large Scale Visual Recognition Challenge) are studied. 
Based on this analysis and the concept of edge detectors, a simple CNN (Convolutional Neural Network) structure is designed as a trial to explore the possibility of using a network of high width and low depth for region proposal selection, object recognition and target region refining. We adopt LeNet as the template and take advantage of the multiple kernels of GoogLeNet. We ran experiments to test the performance of this simple structure on vehicle and face images from the ImageNet dataset. The accuracy is 81% on average for single-object detection and 73.5% for multiple-object detection. We refined the networks in several respects to reach a final accuracy of 95% for single-object detection and 89% for multiple-object detection. Convolutional Neural Network Object Detection Object Recognition Machine Learning
53	Shoulder Keypoint-Detection from Object Detection Kapoor, Prince 22 August 2018 This thesis presents a detailed examination of the Convolutional Neural Network (CNN) architectures that have helped computer vision researchers achieve state-of-the-art performance on classification, detection, segmentation and many other image analysis challenges. Since the advent of deep learning, CNNs have been used in almost all computer vision applications, which is why there is a pressing need to understand the fine details of these feature extractors and to identify the pros and cons of each one carefully. For our experiments, we explored an object detection task using a model architecture that strikes a balance between computational cost and accuracy. The model architecture we used is the LSTM-Decoder. We experimented with the model using different CNN feature extractors and identified their pros and cons in various scenarios. The results we obtained on different datasets show that the CNN plays a major role in obtaining higher accuracy, and we also achieved accuracy comparable to the state of the art on the Pedestrian Detection Dataset. As an extension of object detection, we also implemented two different model architectures that find shoulder keypoints. The first idea is as follows: using the detected annotation from object detection, a small cropped image is generated and fed into a small cascade network trained to detect shoulder keypoints. The second strategy is to use the same object detection model and fine-tune its weights to predict shoulder keypoints. So far, we have generated results for shoulder keypoint detection. 
However, this idea could be extended to full-body pose estimation by modifying the cascaded network for that purpose, which is an important topic for future work. Shoulder Keypoint Detection Object Detection CNN Feature Extractors LSTM-decoder
54	Visual Attention-based Object Detection and Recognition Mahmood, Hamid January 2013 This thesis is about visual attention, from understanding the human visual system to applying this mechanism in a real-world computer vision application. This has been achieved by taking advantage of the latest findings about human visual attention and the increased performance of computers. These two factors played a vital role in simulating many different aspects of this visual behavior. In addition, bio-inspired visual attention systems have become practical thanks to the emergence of interdisciplinary approaches to vision, which lead to beneficial interaction between scientists from different fields. The high complexity of computer vision problems has led to visual attention being considered as part of real-time computer vision solutions, for which demand is increasing. In this thesis, different aspects of the visual attention paradigm are addressed, ranging from biological modeling to the implementation of real-world computer vision tasks based on this visual behavior. The central part of this thesis is the implementation of a traffic sign detection and recognition system that benefits from this mechanism. Visual Attention Object Detection and Recognition Computer and Information Sciences Data- och informationsvetenskap
55	Cooperative Perception for Connected Autonomous Vehicle Edge Computing System Chen, Qi 08 1900 This dissertation first conducts a study on raw-data-level cooperative perception for enhancing the detection ability of self-driving systems for connected autonomous vehicles (CAVs). A LiDAR (Light Detection and Ranging) point cloud-based 3D object detection method is deployed to enhance detection performance by expanding the effective sensing area, capturing critical information in multiple scenarios and improving detection accuracy. In addition, a point cloud feature-based cooperative perception framework is proposed on an edge computing system for CAVs. This dissertation also uses the features' intrinsically small size to achieve real-time edge computing without running the risk of congesting the network. In order to distinguish small objects such as pedestrians and cyclists in 3D data, an end-to-end multi-sensor fusion model is developed to implement 3D object detection from multi-sensor data. Experiments show that by solving multiple perception tasks on camera and LiDAR data jointly, the detection model can leverage the advantages of high-resolution images and physical-world LiDAR mapping data, leading the KITTI benchmark on 3D object detection. Finally, an application of cooperative perception is deployed on the edge to heal the live map for autonomous vehicles. Through 3D reconstruction and multi-sensor fusion detection, experiments on a real-world dataset demonstrate that a high-definition (HD) map on the edge can provide CAVs with well-sensed local data for navigation. Object Detection Multi-sensor Fusion Connected Autonomous Vehicles Edge Computing
56	Detekce objektů na GPU / Object Detection on GPU Jurák, Martin January 2015 This thesis focuses on accelerating Random Forest object detection in images. A Random Forest detector is an ensemble of independently evaluated random decision trees. This property can be exploited for acceleration on a graphics processing unit (GPU). The development and increasing performance of GPUs allow their use for general-purpose computing (GPGPU). The goal of this thesis is to describe how to implement the Random Forest method on a GPU using the OpenCL standard.
57	AI-based Age Estimation using X-ray Hand Images : A comparison of Object Detection and Deep Learning models Westerberg, Erik January 2020 Bone age assessment can be useful in a variety of ways. It can help pediatricians predict growth and the onset of puberty, identify diseases, and assess whether a person lacking proper identification is a minor. It is a time-consuming process that is also prone to intra-observer variation, which can cause problems in many ways. This thesis attempts to improve and speed up bone age assessment by using different object detection methods to detect and segment the bones that are anatomically important for the assessment, and by using these segmented bones to train deep learning models to predict bone age. A dataset consisting of 12811 X-ray hand images of persons ranging from infancy to 19 years of age was used. For the first research question, we compared the performance of three state-of-the-art object detection models: Mask R-CNN, YOLO, and RetinaNet. We chose the best-performing model, YOLO, to segment all the growth plates in the phalanges in the dataset. We then trained four different pre-trained models, Xception, InceptionV3, VGG19, and ResNet152, on both the segmented and the unsegmented dataset and compared their performance. We achieved good results with both datasets, although performance was slightly better with the unsegmented dataset. The analysis suggests that we might be able to achieve higher accuracy with the segmented dataset by adding detection of the growth plates of the carpal bones, the epiphysis, and the diaphysis. The best-performing model was Xception, which achieved a mean absolute error of 1.007 years on the unsegmented dataset and 1.193 years on the segmented dataset. / The presentation was given online via Zoom. deep learning object detection bone age assessment Computer Systems Datorsystem
58	Réseaux de neurones convolutionnels profonds pour la détection de petits véhicules en imagerie aérienne / Deep neural networks for the detection of small vehicles in aerial imagery Ogier du Terrail, Jean 20 December 2018 This manuscript is an attempt to tackle the problem of detecting and discriminating small vehicles in vertical aerial imagery using deep learning algorithms. The specific nature of the problem allows the use of original techniques that exploit the invariances and self-similarities of automobiles and planes seen from the sky. We start with a systematic study of single-shot detectors. Building on that, we examine the effect of adding multiple decision stages on detection performance. Finally, we try to address the domain adaptation problem through the generation of ever more realistic synthetic data and its use in training these detectors. Détection d'objets Statistical Learning Object-detection Deep-learning Computer-vision
59	Digitalisering av handskrivna siffror på fysiska formulär : Utvärdering av tillförlitlighet och träningstid Manousian, Jonathan January 2020 A workplace can be made more efficient through digitalization. One example is the handling of physical forms, whose data is usually transferred to a computer manually. The aim of this project is to simplify the general handling of paper forms by automating the process. This can be done by scanning photos of forms and using a cropping function to remove irrelevant data and highlight the important fields. Object detection can then be used to recognize the text or numbers in the highlighted field. An Android application was built that uses the phone's built-in camera to photograph a form and crop out its important fields. In parallel, a machine learning model was trained with TensorFlow to recognize handwritten digits for use with the application. 
The trained model was evaluated and compared to different OCR tools, and the results showed that a model trained to detect a specific handwriting works better than general OCR tools on handwritten digits. TensorFlow Object detection OCR Cropping. Software Engineering Programvaruteknik
60	Investigation of real-time lightweight object detection models based on environmental parameters Persson, Dennis January 2022 As the world becomes more digital, with the majority of people owning tablets, smartphones and smart objects, solving real-world computational problems on handheld devices is becoming more common. Detection and tracking of objects using a camera is starting to be used in all kinds of fields, from self-driving cars and item sorting to X-rays. Object detection is computationally heavy, which is why a powerful computer is needed for it to run relatively fast. Object detection with a lightweight model is not as accurate as with a heavyweight model, because the lightweight model trades accuracy for inference speed on such devices. As handheld devices get more powerful and people gain better access to object detection models that can run on devices with limited computing power, the ability to build small object detection machines at home or at work increases substantially. Knowing which factors have a large impact on object detection can help the user design or choose the right model. This study explores the impact of distance, angle and light on Inceptionv2 SSD, MobileNetv3 Large SSD and MobileNetv3 Small SSD on the COCO dataset. The results indicate that distance is the most dominant factor for the Inceptionv2 SSD model on the COCO dataset. The data for the MobileNetv3 SSD models indicate that angle might have the biggest impact on these models, but the data is too inconclusive to say so with certainty. Knowing which factors affect a given model's performance the most, the user can make a more informed choice for their field of use. object detection convolutional neural network environmental parameters Computer Engineering Datorteknik
