Global ETD Search

1	Advances in detecting object classes and their semantic parts Modolo, Davide January 2017 (has links) Object classes are central to computer vision and have been the focus of substantial research in the last fifteen years. This thesis addresses the tasks of localizing entire objects in images (object class detection) and localizing their semantic parts (part detection). We present four contributions, two for each task. The first two improve existing object class detection techniques by using context and calibration. The other two contributions explore semantic part detection in weakly-supervised settings. First, the thesis presents a technique for predicting properties of objects in an image based on its global appearance only. We demonstrate the method by predicting three properties: aspect of appearance, location in the image and class membership. Overall, the technique makes multi-component object detectors faster and improves their performance. The second contribution is a method for calibrating the popular Ensemble of Exemplar-SVM object detector. Unlike the standard approach, which calibrates each Exemplar-SVM independently, our technique optimizes their joint performance as an ensemble. We devise an efficient optimization algorithm to find the globally optimal solution of the calibration problem. This leads to better object detection performance compared to using independent calibration. The third innovation is a technique to train part-based models of object classes using data sourced from the web. We learn rich models incrementally. Our models encompass the appearance of parts and their spatial arrangement on the object, specific to each viewpoint. Importantly, the technique does not require any part location annotation, a requirement that is one of the main obstacles to training many part detectors. Finally, the last contribution is a study on whether semantic object parts emerge in Convolutional Neural Networks trained for higher-level tasks, such as image classification. 
While previous efforts studied this matter by visual inspection only, we perform an extensive quantitative analysis based on ground-truth part location annotations. This provides a more conclusive answer to the question. 006.3 object detection ; part detection
2	Visual Saliency Application in Object Detection for Search Space Reduction January 2017 (has links) abstract: Vision is the ability to see and interpret any visual stimulus. It is one of the most fundamental and complex tasks the brain performs. Its complexity can be understood from the fact that close to 50% of the human brain is dedicated to vision. The brain receives an overwhelming amount of sensory information from the retina, estimated at up to 100 Mbps per optic nerve. Parallel processing of the entire visual field in real time is likely impossible for even the most sophisticated brains due to the high computational complexity of the task [1]. Yet, organisms can efficiently process this information to parse complex scenes in real time. This amazing feat of nature relies on selective attention, which allows the brain to filter sensory information and select only a small subset of it for further processing. Today, Computer Vision has become ubiquitous in our society, with several applications in image understanding, medicine, drones, self-driving cars and many more. With the advent of GPUs and the availability of huge datasets like ImageNet, Convolutional Neural Networks (CNNs) have come to play a very important role in solving computer vision tasks, e.g., object detection. However, the size of the networks becomes prohibitive when higher accuracies are needed, which in turn demands more hardware. This hinders the application of CNNs to mobile platforms and keeps them from reaching real-time performance. The computational efficiency of a computer vision task, like object detection, can be enhanced by adopting a selective attention mechanism into the algorithm. In this work, this idea is explored by using the Visual Proto Object Saliency algorithm [1] to crop out the areas of an image without relevant objects before a computationally intensive network like the Faster R-CNN [2] processes it. 
/ Dissertation/Thesis / Masters Thesis Electrical Engineering 2017 Electrical engineering Object Detection Saliency
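The cropping step described in this abstract can be illustrated with a minimal sketch. It assumes a saliency map is already available as a 2D grid of scores in [0, 1]; the function names, the threshold value, and the single bounding-box strategy are illustrative assumptions and are not taken from the thesis, which uses the Visual Proto Object Saliency algorithm to produce the map.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def salient_bounding_box(saliency: List[List[float]],
                         threshold: float = 0.5) -> Optional[Box]:
    """Return the tightest box containing every cell with score >= threshold.

    Hypothetical helper: the threshold and the one-box strategy are assumptions.
    """
    rows = [r for r, row in enumerate(saliency)
            if any(v >= threshold for v in row)]
    if not rows:
        return None  # nothing salient: the caller may skip the detector entirely
    width = len(saliency[0])
    cols = [c for c in range(width)
            if any(row[c] >= threshold for row in saliency)]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Keep only the pixels inside the box, shrinking the detector's input."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]
```

The cropped region would then be passed to the expensive detector in place of the full frame, so the saving depends on how small the salient region is relative to the image.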
3	Object Detection with Swin Vision Transformers from Raw ADC Radar Signals Giroux, James 15 August 2023 (has links) Object detection utilizing frequency modulated continuous wave radar is becoming increasingly popular in the field of autonomous vehicles. Radar does not possess the same drawbacks seen by other emission-based sensors such as LiDAR, primarily the degradation or loss of return signals due to weather conditions such as rain or snow. Thus, there is a necessity for fully autonomous systems to utilize radar sensing applications in downstream decision-making tasks, generally handled by deep learning algorithms. Commonly, three transformations have been used to form range-azimuth-Doppler cubes in which deep learning algorithms could perform object detection. This method has drawbacks, specifically the pre-processing costs associated with performing multiple Fourier Transforms and normalization. We develop a network utilizing raw radar analog-to-digital converter output capable of operating in near real-time given the removal of all pre-processing. We obtain inference time estimates one-fifth of the traditional range-Doppler pipeline, decreasing from 156 ms to 30 ms, and similar decreases in comparison to the full range-azimuth-Doppler cube. Moreover, we introduce hierarchical Swin Vision transformers to the field of radar object detection and show their capability to operate on inputs varying in pre-processing, along with different radar configurations, i.e., relatively low and high numbers of transmitters and receivers. Our network increases both average recall and mean intersection over union performance by approximately 6-7%, obtaining state-of-the-art F1 scores as a result on high-definition radar. 
On low-definition radar, we note an increase in mean average precision of approximately 2.5% over state-of-the-art range-Doppler networks when raw analog-to-digital converter data is used, and an approximately 5% increase over networks using the full range-azimuth-Doppler cube. vision transformer radar object detection
4	Scalable Multi-Task Learning R-CNN for Classification and Localization in Autonomous Vehicle Technology Rinchen, Sonam 28 April 2023 (has links) Multi-task learning (MTL) is a rapidly growing field in the world of autonomous vehicles, particularly in the area of computer vision. Autonomous vehicles are heavily reliant on computer vision technology for tasks such as object detection, object segmentation, and object tracking. The complexity of sensor data and the multiple tasks involved in autonomous driving can make it challenging to design effective systems. MTL addresses these challenges by training a single model to perform multiple tasks simultaneously, utilizing shared representations to learn common concepts between a group of related tasks, and improving data efficiency. In this thesis, we propose a scalable MTL system for object detection that can be used to construct any MTL network with different scales and shapes. The proposed system is an extension of the state-of-the-art algorithm Mask R-CNN. It is designed to overcome the limitations of learning multiple objects in multi-label learning. To demonstrate the effectiveness of the proposed system, we built three different networks using it and evaluated their performance on the state-of-the-art BDD100k dataset. Our experimental results demonstrate that the proposed MTL networks outperform a base single-task network, Mask R-CNN, in terms of mean average precision at 50 (mAP50). Specifically, the proposed MTL networks achieved an mAP50 of 66%, while the base network achieved only 53%. Furthermore, we conducted comparisons between the proposed MTL networks to determine the most efficient way to group tasks together in order to create an optimal MTL network for object detection on the BDD100k dataset. Multi Task Learning Object detection
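The "50" in mAP50 refers to an intersection-over-union (IoU) threshold of 0.5: a predicted box counts as a correct detection only if its IoU with a ground-truth box reaches that value. The sketch below shows the IoU computation and the threshold check only. The box format `(x1, y1, x2, y2)` and the function names are assumptions for illustration; the full mAP calculation (ranking detections by score and averaging precision over classes) is not shown and is not taken from the thesis.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) with x1 < x2 and y1 < y2


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at_50(prediction: Box, ground_truth: Box) -> bool:
    """True if the prediction would count as a hit at the IoU 0.5 threshold."""
    return iou(prediction, ground_truth) >= 0.5
```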
5	Incident Response Enhancements using Streamlined UAV Mission Planning, Imaging, and Object Detection Link, Eric Matthew 29 June 2023 (has links) Systems composed of simple, reliable tools are needed to facilitate adoption of Uncrewed Aerial Vehicles (UAVs) into incident response teams. Existing systems require operators to have a high level of knowledge of UAV operations, including mission planning, low-level system operation, and data analysis. In this paper, a system is introduced to reduce the required operator knowledge level via streamlined mission planning, in-flight object detection, and data presentation. For mission planning, two software programs are introduced that utilize geographic data to: (1) update existing missions to a constant above ground level altitude; and (2) auto-generate missions along waterways. To test system performance, a UAV platform based on the Tarot 960 was equipped with an Nvidia Jetson TX2 computing device and a FLIR GigE camera. For demonstration of on-board object detection, the You Only Look Once v8 model was trained on mock propane tanks. A Robot Operating System package was developed to manage communication between the flight controller, camera, and object detection model. Finally, software was developed to present collected data in easy-to-understand interactive maps containing both detected object locations and surveyed area imagery. Several flight demonstrations were conducted to validate both the performance and usability of the system. The mission planning programs accurately adjust altitude and generate missions along waterways. While in flight, the system demonstrated the capability to take images, perform object detection, and return estimated object locations with an average accuracy of 3.5 meters. The calculated object location data was successfully formatted into interactive maps, providing incident responders with a simple visualization of target locations and surrounding environment. 
Overall, the system presented meets the specified objectives by reducing the required operator skill level for successful deployment of UAVs into incident response scenarios. / Master of Science / Systems composed of simple, reliable tools are needed to facilitate adoption of Uncrewed Aerial Vehicles (UAVs) into incident response teams. Existing systems require operators to have a high level of knowledge of UAV operations. In this paper, a new system is introduced that reduces required operator knowledge via streamlined mission planning, in-flight object detection, and data presentation. Two mission planning computer programs are introduced that allow users to: (1) update existing missions to maintain constant above ground level altitude; and (2) to autonomously generate missions along waterways. For demonstration of in-flight object detection, a computer vision model was trained on mock propane tanks. Software for capturing images and running the computer vision model was written and deployed onto a UAV equipped with a computer and camera. For post-flight data analysis, software was written to create image mosaics of the surveyed area as well as to plot detected objects on maps. The mission planning software was shown to appropriately adjust altitude in existing missions and to generate new missions along waterways. Through several flight demonstrations, the system appropriately captured images and identified detected target locations with an average accuracy of 3.5 meters. Post-flight, the collected images were successfully combined into single-image mosaics with detected objects marked as points of interest. Overall, the system presented meets the specified objectives by reducing the required operator skill level for successful deployment of UAVs into incident response scenarios. UAV Incident Response Object Detection
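The first mission planning task, updating a mission to a constant above ground level (AGL) altitude, can be sketched as follows. This is an illustrative outline only: the waypoint format, the function name, and the `terrain_elevation` lookup are hypothetical, and the thesis does not state which geographic data source or mission file format its programs use.

```python
from typing import Callable, List, Tuple

# (latitude, longitude, altitude above mean sea level in metres); assumed format
Waypoint = Tuple[float, float, float]


def adjust_to_constant_agl(waypoints: List[Waypoint],
                           terrain_elevation: Callable[[float, float], float],
                           agl_m: float) -> List[Waypoint]:
    """Rewrite each waypoint altitude so it sits agl_m metres above the terrain.

    terrain_elevation is a hypothetical callable returning ground elevation
    (metres above mean sea level) at a latitude/longitude.
    """
    adjusted = []
    for lat, lon, _old_alt in waypoints:
        ground = terrain_elevation(lat, lon)
        adjusted.append((lat, lon, ground + agl_m))
    return adjusted
```

A real planner would likely also insert intermediate waypoints where terrain changes sharply between two existing ones; that refinement is left out of this sketch.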
6	COMPACT AND COST-EFFECTIVE MOBILE 2.4 GHZ RADAR SYSTEM FOR OBJECT DETECTION AND TRACKING Seongha Park (5930117) 17 January 2019 (has links) Small mobile objects such as recreational unmanned vehicles have become easily accessible to the public because of technological advances, which make it possible to manufacture unmanned vehicles that are small, light, and easy to control. As recreational unmanned vehicles have become more accessible, accidents and attacks that use them to threaten people have been growing rapidly. A specific person could be targeted with an unmanned vehicle in an open public place because of the vehicle's small size and mobility. Even when an unmanned vehicle approaches a person, its compact size and maneuverability can make it difficult to detect before the encounter.
This research develops a radar system that can operate in open public areas to detect and track unmanned vehicles. Existing radar systems, such as those used for navigation, aviation, national defense, air traffic control, or weather forecasting, cannot be used to monitor and scan public places because of their large size, high operating cost, and danger to human health. For example, if electromagnetic fields emitted from high-power radar penetrate exposed skin or eyes, their energy can cause skin burns, eye cataracts, or more (Zamanian & Hardiman, 2005). Therefore, a radar system that can operate in public places is necessary for monitoring and surveillance of such areas.
The hardware of the proposed radar system is composed of three parts: 1) a radio frequency transmission and reception part, which we call the RF part; 2) a part that controls the transmitted radio frequency and amplifies the reflected signal, which we call the electric part; and 3) a data collection, data processing, and visualization part, which we call the post-processing part. The transmit-frequency control and reflected-signal amplification are based on a lecture and labs designed by researchers at Massachusetts Institute of Technology (MIT) Lincoln Lab, Charvat et al. (2012), and on another lecture and labs designed by a professor at University of California at Davis, Liu (2013). The radar system designed at University of California at Davis is based on the system designed at MIT Lincoln Lab, which proposed a small, low-cost, and low-power radar. The low-power radar proposed by MIT Lincoln Lab is suitable for operation in public places without restrictions related to human health because of its low transmit power; however, its surveillance area is relatively small and limited. To expand the monitoring area, the transmit power of the radar system proposed in this study is increased compared to the radar proposed by MIT Lincoln Lab. Additionally, the radar system is designed and fabricated on printed circuit boards (PCBs) to make it compact and easy to use for various research purposes. For instance, the radar system can be utilized for mapping, localization, or imaging.
The first part of post-processing is data collection. The raw data received and amplified by the electric part of the hardware is collected by a compact computer, a Raspberry Pi 3, that is directly connected to the radar. The data is collected every second and transferred to the post-processing device, which is a laptop computer in this research. The post-processing device processes the data to estimate the range of the object, applies filters for tracking, and visualizes the results. In the study, RabbitMQ, a message broker that implements the Advanced Message Queuing Protocol (AMQP) and is also referred to as RMQ (Richardson, 2012; Videla & Williams, 2012), is utilized for real-time data transfer between the Raspberry Pi 3 and the post-processing device. Because data collection, post-processing, and visualization each have to run continuously and in sequence, RMQ is used for data exchange between the processes so that data collection and processing can proceed in parallel. The processed data show the estimated distance of the object from the radar system in real time, so the system can support monitoring of an area at a remote location if the two distant places are connected through a network.
The proposed radar system successfully detected and tracked an object that was within sight of the radar. Although further study to improve the system is required, the system will be highly suitable and applicable for research areas requiring sensors for exploration, monitoring, or surveillance because of its accessibility and flexibility. Users who adopt this radar system for research purposes can develop their own applications that match their research environment, for example to support robots in obstacle avoidance or localization and mapping. Computer Engineering radar object detection and tracking
7	Visual servoing for mobile robots navigation with collision avoidance and field-of-view constraints / Asservissement visuel pour la navigation de robots mobiles avec évitement d'obstacle et contraintes liées au champ de vision Fu, Wenhao 18 April 2014 (has links) Cette thèse porte sur le problème de la navigation basée sur la vision pour les robots mobiles dans les environnements intérieurs. Plus récemment, de nombreux travaux ont été réalisés pour résoudre la navigation à l'aide d'un chemin visuel, à savoir la navigation basée sur l'apparence. Cependant, en utilisant ce schéma, le mouvement du robot est limité au chemin visuel d'entrainement. Le risque de collision pendant le processus de navigation peut faire écarter le robot de la trajectoire visuelle courante, pour laquelle les repères visuels peuvent être perdus. Dans l'état de nos connaissances, les travaux envisagent rarement l'évitement des collisions et la perte de repère dans le cadre de la navigation basée sur l'apparence. Nous présentons un cadre mobile de navigation pour le robot afin de renforcer la capacité de la méthode basée sur l'apparence, notamment en cas d'évitement de collision et de contraintes de champ de vision. Notre cadre introduit plusieurs contributions techniques. Tout d'abord, les contraintes de mouvement sont considérés dans la détection de repère visuel pour améliorer la performance de détection. Ensuite, nous modélisons l'obstacle en utilisant B-Spline. La représentation de B-Spline n'a pas de régions accidentées et peut générer un mouvement fluide pour la tâche d'évitement de collision. En outre, nous proposons une stratégie de contrôle basée sur la vision, qui peut gérer la perte complète de la cible. Enfin, nous utilisons l'image sphérique pour traiter le cas des projections d'ambiguité et d'infini dus à la projection en perspective. Les véritables expériences démontrent la faisabilité et l'efficacité de notre cadre et de nos méthodes. 
/ This thesis is concerned with the problem of vision-based navigation for mobile robots in indoor environments. Much work has been carried out on navigation using a visual path, namely appearance-based navigation. However, using this scheme, the robot motion is limited to the trained visual path. A potential collision during the navigation process can make the robot deviate from the current visual path, in which case the visual landmarks can be lost from the current field of view. To the best of our knowledge, few works consider collision avoidance and landmark loss in the framework of appearance-based navigation. We outline a mobile robot navigation framework in order to enhance the capability of the appearance-based method, especially in case of collision avoidance and field-of-view constraints. Our framework introduces several technical contributions. First of all, motion constraints are taken into account in visual landmark detection to improve the detection performance. Next, we model the obstacle boundary using a B-Spline. The B-Spline representation has no rough regions and can generate a smooth motion for the collision avoidance task. Additionally, we propose a vision-based control strategy that can deal with complete target loss. Finally, we use a spherical image to handle the ambiguity and infinity cases that arise from perspective projection. Real-world experiments demonstrate the feasibility and the effectiveness of our framework and methods. Détection de points d'intérêt visuels Visual object detection
8	A High-performance Architecture for Training Viola-Jones Object Detectors Lo, Charles 20 November 2012 (has links) The object detection framework developed by Viola and Jones has become very popular due to its high quality and detection speed. However, the complexity of the computation required to train a detector makes it difficult to develop and test potential improvements to this algorithm or train detectors in the field. In this thesis, a configurable, high-performance FPGA architecture is presented to accelerate this training process. The architecture, structured as a systolic array of pipelined compute engines, is constructed to provide high throughput and make efficient use of the available external memory bandwidth. Extensions to the Viola-Jones detection framework are implemented to demonstrate the flexibility of the architecture. The design is implemented on a Xilinx ML605 development platform running at 200 MHz and obtains a 15-fold speed-up over a multi-threaded OpenCV implementation running on a high-end processor. FPGA Object Detection Reconfigurable Architecture 0544 0800
10	Moving Object Detection Based on Ordered Dithering Codebook Model Guo, Jing-Ming, Thinh, Nguyen Van, Lee, Hua 10 1900 (has links) ITC/USA 2014 Conference Proceedings / The Fiftieth Annual International Telemetering Conference and Technical Exhibition / October 20-23, 2014 / Town and Country Resort & Convention Center, San Diego, CA / This paper presents an effective multi-layer background modeling method to detect moving objects by exploiting the advantages of novel distinctive features and the hierarchical structure of the Codebook (CB) model. In the block-based structure, the mean-color feature within a block often does not contain sufficient texture information, causing incorrect classification, especially in large block size layers. Thus, the Binary Ordered Dithering (BOD) feature becomes an important supplement to the mean RGB feature. In summary, the uniqueness of this approach is the incorporation of the halftoning scheme with the codebook model for superior performance over the existing methods. Multilayer codebook Ordered dithering Moving object detection
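As background for the Binary Ordered Dithering feature mentioned in this abstract, the sketch below shows textbook ordered dithering of a grayscale block using the standard 4x4 Bayer threshold matrix. It is an assumption that this resembles the halftoning step the paper builds on; the abstract does not specify the dither matrix, the block sizes, or how the resulting bit pattern is combined with the codebook model, and the function name is illustrative.

```python
from typing import List

# Standard 4x4 Bayer index matrix (values 0..15).
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def ordered_dither(block: List[List[int]]) -> List[List[int]]:
    """Binarize an 8-bit grayscale block against a tiled Bayer threshold map.

    Each pixel becomes 1 if it exceeds its position-dependent threshold,
    else 0, so the bit pattern retains coarse texture within the block.
    """
    out = []
    for y, row in enumerate(block):
        out_row = []
        for x, pixel in enumerate(row):
            # Map the Bayer index to a threshold in the 0..255 range.
            threshold = (BAYER_4X4[y % 4][x % 4] + 0.5) * 255.0 / 16.0
            out_row.append(1 if pixel > threshold else 0)
        out.append(out_row)
    return out
```

Because the thresholds vary by position, two blocks with the same mean intensity but different texture generally yield different bit patterns, which is the kind of information a mean-color feature alone cannot capture.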

Search results