Global ETD Search

341	Recognition of irregular-shaped 3D objects. January 1988 by Chu Kin-cheong. / Thesis (M.Ph.)--Chinese University of Hong Kong, 1988. / Bibliography: leaves 106-109. Computer vision Three-dimensional display systems
342	Recurrent neural network for optimization with application to computer vision. January 1993 by Cheung Kwok-wai. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1993. / Includes bibliographical references (leaves [146-154]). / Chapter 1 --- Introduction / Chapter 1.1 --- Programmed computing vs. neurocomputing --- p.1-1 / Chapter 1.2 --- Development of neural networks - feedforward and feedback models --- p.1-2 / Chapter 1.3 --- State of art of applying recurrent neural network towards computer vision problem --- p.1-3 / Chapter 1.4 --- Objective of the Research --- p.1-6 / Chapter 1.5 --- Plan of the thesis --- p.1-7 / Chapter 2 --- Background / Chapter 2.1 --- Short history on development of Hopfield-like neural network --- p.2-1 / Chapter 2.2 --- Hopfield network model --- p.2-3 / Chapter 2.2.1 --- Neuron's transfer function --- p.2-3 / Chapter 2.2.2 --- Updating sequence --- p.2-6 / Chapter 2.3 --- Hopfield energy function and network convergence properties --- p.2-1 / Chapter 2.4 --- Generalized Hopfield network --- p.2-13 / Chapter 2.4.1 --- Network order and generalized Hopfield network --- p.2-13 / Chapter 2.4.2 --- Associated energy function and network convergence property --- p.2-13 / Chapter 2.4.3 --- Hardware implementation consideration --- p.2-15 / Chapter 3 --- Recurrent neural network for optimization / Chapter 3.1 --- Mapping to Neural Network formulation --- p.3-1 / Chapter 3.2 --- Network stability versus Self-reinforcement --- p.3-5 / Chapter 3.2.1 --- Quadratic problem and Hopfield network --- p.3-6 / Chapter 3.2.2 --- Higher-order case and reshaping strategy --- p.3-8 / Chapter 3.2.3 --- Numerical Example --- p.3-10 / Chapter 3.3 --- Local minimum limitation and existing solutions in the literature --- p.3-12 / Chapter 3.3.1 --- Simulated Annealing --- p.3-13 / Chapter 3.3.2 --- Mean Field Annealing --- p.3-15 / Chapter 3.3.3 --- Adaptively changing neural network --- p.3-16 / Chapter 3.3.4 --- Correcting Current Method --- p.3-16 / Chapter 3.4 --- Conclusions --- p.3-17 / Chapter 4 --- A Novel Neural Network for Global Optimization - Tunneling Network / Chapter 4.1 --- Tunneling Algorithm --- p.4-1 / Chapter 4.1.1 --- Description of Tunneling Algorithm --- p.4-1 / Chapter 4.1.2 --- Tunneling Phase --- p.4-2 / Chapter 4.2 --- A Neural Network with tunneling capability - Tunneling network --- p.4-8 / Chapter 4.2.1 --- Network Specifications --- p.4-8 / Chapter 4.2.2 --- Tunneling function for Hopfield network and the corresponding updating rule --- p.4-9 / Chapter 4.3 --- Tunneling network stability and global convergence property --- p.4-12 / Chapter 4.3.1 --- Tunneling network stability --- p.4-12 / Chapter 4.3.2 --- Global convergence property --- p.4-15 / Chapter 4.3.2.1 --- Markov chain model for Hopfield network --- p.4-15 / Chapter 4.3.2.2 --- Classification of the Hopfield Markov chain --- p.4-16 / Chapter 4.3.2.3 --- Markov chain model for tunneling network and its convergence towards global minimum --- p.4-18 / Chapter 4.3.3 --- Variation of pole strength and its effect --- p.4-20 / Chapter 4.3.3.1 --- Energy Profile analysis --- p.4-21 / Chapter 4.3.3.2 --- Size of attractive basin and pole strength required --- p.4-24 / Chapter 4.3.3.3 --- A new type of pole eases the implementation problem --- p.4-30 / Chapter 4.4 --- Simulation Results and Performance comparison --- p.4-31 / Chapter 4.4.1 --- Simulation Experiments --- p.4-32 / Chapter 4.4.2 --- Simulation Results and Discussions --- p.4-37 / Chapter 4.4.2.1 --- Comparisons on optimal path obtained and the convergence rate --- p.4-37 / Chapter 4.4.2.2 --- On decomposition of Tunneling network --- p.4-38 / Chapter 4.5 --- Suggested hardware implementation of Tunneling network --- p.4-48 / Chapter 4.5.1 --- Tunneling network hardware implementation --- p.4-48 / Chapter 4.5.2 --- Alternative implementation theory --- p.4-52 / Chapter 4.6 --- Conclusions --- p.4-54 / Chapter 5 --- Recurrent Neural Network for Gaussian Filtering / Chapter 5.1 --- Introduction --- p.5-1 / Chapter 5.1.1 --- Silicon Retina --- p.5-3 / Chapter 5.1.2 --- An Active Resistor Network for Gaussian Filtering of Image --- p.5-5 / Chapter 5.1.3 --- Motivations of using recurrent neural network --- p.5-7 / Chapter 5.1.4 --- Difference between the active resistor network model and recurrent neural network model for Gaussian filtering --- p.5-8 / Chapter 5.2 --- From Problem formulation to Neural Network formulation --- p.5-9 / Chapter 5.2.1 --- One Dimensional Case --- p.5-9 / Chapter 5.2.2 --- Two Dimensional Case --- p.5-13 / Chapter 5.3 --- Simulation Results and Discussions --- p.5-14 / Chapter 5.3.1 --- Spatial impulse response of the 1-D network --- p.5-14 / Chapter 5.3.2 --- Filtering property of the 1-D network --- p.5-14 / Chapter 5.3.3 --- Spatial impulse response of the 2-D network and some filtering results --- p.5-15 / Chapter 5.4 --- Conclusions --- p.5-16 / Chapter 6 --- Recurrent Neural Network for Boundary Detection / Chapter 6.1 --- Introduction --- p.6-1 / Chapter 6.2 --- From Problem formulation to Neural Network formulation --- p.6-3 / Chapter 6.2.1 --- Problem Formulation --- p.6-3 / Chapter 6.2.2 --- Recurrent Neural Network Model used --- p.6-4 / Chapter 6.2.3 --- Neural Network formulation --- p.6-5 / Chapter 6.3 --- Simulation Results and Discussions --- p.6-7 / Chapter 6.3.1 --- Feasibility study and Performance comparison --- p.6-7 / Chapter 6.3.2 --- Smoothing and Boundary Detection --- p.6-9 / Chapter 6.3.3 --- Convergence improvement by network decomposition --- p.6-10 / Chapter 6.3.4 --- Hardware implementation consideration --- p.6-10 / Chapter 6.4 --- Conclusions --- p.6-11 / Chapter 7 --- Conclusions and Future Researches / Chapter 7.1 --- Contributions and Conclusions --- p.7-1 / Chapter 7.2 --- Limitations and Suggested Future Researches --- p.7-3 / References --- p.R-1 / Appendix I The assignment of the boundary connection of 2-D recurrent neural network for Gaussian filtering --- p.A1-1 / Appendix II Formula for connection weight assignment of 2-D recurrent neural network for Gaussian filtering and the proof on symmetric property --- p.A2-1 / Appendix III Details on reshaping strategy --- p.A3-1 Neural networks (Computer science) Computer vision
343	Acquisition and modeling of 3D irregular objects. January 1994 by Sai-bun Wong. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1994. / Includes bibliographical references (leaves 127-131). / Abstract --- p.v / Acknowledgment --- p.vii / Chapter 1 --- Introduction --- p.1-8 / Chapter 1.1 --- Overview --- p.2 / Chapter 1.2 --- Survey --- p.4 / Chapter 1.3 --- Objectives --- p.6 / Chapter 1.4 --- Thesis Organization --- p.7 / Chapter 2 --- Range Sensing --- p.9-30 / Chapter 2.1 --- Alternative Approaches to Range Sensing --- p.9 / Chapter 2.1.1 --- Size Constancy --- p.9 / Chapter 2.1.2 --- Defocusing --- p.11 / Chapter 2.1.3 --- Deconvolution --- p.14 / Chapter 2.1.4 --- Binocular Vision --- p.18 / Chapter 2.1.5 --- Active Triangulation --- p.20 / Chapter 2.1.6 --- Time-of-Flight --- p.22 / Chapter 2.2 --- Transmitter and Detector in Active Sensing --- p.26 / Chapter 2.2.1 --- Acoustics --- p.26 / Chapter 2.2.2 --- Optics --- p.28 / Chapter 2.2.3 --- Microwave --- p.29 / Chapter 2.3 --- Conclusion --- p.29 / Chapter 3 --- Scanning Mirror --- p.31-47 / Chapter 3.1 --- Scanning Mechanisms --- p.31 / Chapter 3.2 --- Advantages of Scanning Mirror --- p.32 / Chapter 3.3 --- Feedback of Scanning Mirror --- p.33 / Chapter 3.4 --- Scanning Mirror Controller --- p.35 / Chapter 3.5 --- Point-to-Point Scanning --- p.39 / Chapter 3.6 --- Line Scanning --- p.39 / Chapter 3.7 --- Specifications and Measurements --- p.41 / Chapter 4 --- The Rangefinder with Reflectance Sensing --- p.48-58 / Chapter 4.1 --- Ambient Noises --- p.49 / Chapter 4.2 --- Occlusion/Shadow --- p.49 / Chapter 4.3 --- Accuracy and Precision --- p.50 / Chapter 4.4 --- Optics --- p.53 / Chapter 4.5 --- Range/Reflectance Crosstalk --- p.56 / Chapter 4.6 --- Summary --- p.58 / Chapter 5 --- Computer Generation of Range Map --- p.59-75 / Chapter 5.1 --- Homogeneous Transformation --- p.61 / Chapter 5.2 --- From Global to Viewer Coordinate --- p.63 / Chapter 5.3 --- Z-buffering --- p.55 / Chapter 5.4 --- Generation of Range Map --- p.66 / Chapter 5.5 --- Experimental Results --- p.68 / Chapter 6 --- Characterization of Range Map --- p.76-90 / Chapter 6.1 --- Mean and Gaussian Curvature --- p.76 / Chapter 6.2 --- Methods of Curvature Generation --- p.78 / Chapter 6.2.1 --- Convolution --- p.78 / Chapter 6.2.2 --- Local Surface Patching --- p.81 / Chapter 6.3 --- Feature Extraction --- p.84 / Chapter 6.4 --- Conclusion --- p.85 / Chapter 7 --- Merging Multiple Characteristic Views --- p.91-119 / Chapter 7.1 --- Rigid Body Model --- p.91 / Chapter 7.2 --- Sub-rigid Body Model --- p.94 / Chapter 7.3 --- Probabilistic Relaxation Matching --- p.95 / Chapter 7.4 --- Merging the Sub-rigid Body Model --- p.99 / Chapter 7.5 --- Illustration --- p.101 / Chapter 7.6 --- Merging Multiple Characteristic Views --- p.104 / Chapter 7.7 --- Mislocation of Feature Extraction --- p.105 / Chapter 7.7.1 --- The Transform Matrix for Perfect Matching --- p.106 / Chapter 7.7.2 --- Introducing The Errors in Feature Set --- p.108 / Chapter 7.8 --- Summary --- p.113 / Chapter 8 --- Conclusion --- p.120-126 / References --- p.127-131 / Appendix A - Projection of Object --- p.A1-A2 / Appendix B - Performance Analysis on Rangefinder System --- p.B1-B16 / Appendix C - Matching of Two Characteristic views --- p.C1-C3 Image processing Computer vision Remote sensing
344	Surface registration using quasi-conformal Teichmüller theory and its application to texture mapping. / CUHK electronic theses & dissertations collection January 2013 Lam, Ka Chun. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2013. / Includes bibliographical references (leaves 64-68). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts also in Chinese. Computer vision--Mathematics Quasiconformal mappings Teichmüller spaces
345	Image cosegmentation and denoise. / 图像共同分割和降噪 / CUHK electronic theses & dissertations collection / Tu xiang gong tong fen ge he xiang zao January 2012 We present two novel methods to tackle low-level computer vision tasks, i.e., image cosegmentation and denoising. / In our cosegmentation model, we discover that object correspondence can provide useful information for foreground statistical estimation. Our method can handle extremely challenging scenarios such as deformation, perspective changes and dramatically different viewpoints/scales. In addition, we develop a novel energy minimization model that can handle multiple images. Experiments on real and benchmark data qualitatively and quantitatively demonstrate the effectiveness of the approach. / On the other hand, noise is always tightly coupled with high-frequency image structure, making noise reduction generally very difficult. In our denoising model, we propose slightly optically defocusing the image in order to loosen this noise-image structure coupling. This allows us to more effectively reduce noise and subsequently restore the small defocus. We analytically show how this is possible, and demonstrate our technique on a number of examples that include low-light images. / Detailed summary in vernacular field only. / Qin, Zenglu. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2012. / Includes bibliographical references (leaves 64-71). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts also in Chinese. / Abstract --- p.i / Acknowledgement --- p.ii / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Motivation and Objectives --- p.1 / Chapter 1.1.1 --- Cosegmentation --- p.1 / Chapter 1.1.2 --- Image Denoise --- p.4 / Chapter 1.2 --- Thesis Outline --- p.7 / Chapter 2 --- Background --- p.8 / Chapter 2.1 --- Cosegmentation --- p.8 / Chapter 2.2 --- Image Denoise --- p.10 / Chapter 3 --- Cosegmentation of Multiple Deformable Objects --- p.12 / Chapter 3.1 --- Related Work --- p.12 / Chapter 3.2 --- Object Corresponding Cosegmentation --- p.13 / Chapter 3.3 --- Importance Map with Object Correspondence --- p.15 / Chapter 3.3.1 --- Feature Importance Map --- p.16 / Chapter 3.3.2 --- Importance Energy E_i(x_p) --- p.20 / Chapter 3.4 --- Experimental Result --- p.20 / Chapter 3.4.1 --- Two-Image Cosegmentation --- p.21 / Chapter 3.4.2 --- ETHZ Toys Dataset --- p.22 / Chapter 3.4.3 --- More Results --- p.24 / Chapter 3.5 --- Summary --- p.27 / Chapter 4 --- Using Optical Defocus to Denoise --- p.28 / Chapter 4.1 --- Related Work --- p.29 / Chapter 4.2 --- Noise Analysis --- p.30 / Chapter 4.3 --- Noise Estimation with Focal Blur --- p.33 / Chapter 4.3.1 --- Noise Estimation with a Convolution Model --- p.33 / Chapter 4.3.2 --- Determining λ --- p.41 / Chapter 4.4 --- Final Deconvolution and Error Analysis --- p.43 / Chapter 4.5 --- Implementation --- p.45 / Chapter 4.6 --- Quantitative Evaluation --- p.47 / Chapter 4.7 --- More Experimental Results --- p.53 / Chapter 4.8 --- Summary --- p.56 / Chapter 5 --- Conclusion --- p.62 / Bibliography --- p.64 Computer vision Image processing--Digital techniques
346	Autonomous visual learning for robotic systems Beale, Dan January 2012 This thesis investigates the problem of visual learning using a robotic platform. Given a set of objects, the robot's task is to autonomously manipulate, observe, and learn. This allows the robot to recognise objects in a novel scene and pose, or separate them into distinct visual categories. The main focus of the work is in autonomously acquiring object models using robotic manipulation. Autonomous learning is important for robotic systems. In the context of vision, it allows a robot to adapt to new and uncertain environments, updating its internal model of the world. It also reduces the amount of human supervision needed for building visual models. This leads to machines which can operate in environments with rich and complicated visual information, such as the home or industrial workspace, and also in environments which are potentially hazardous for humans. The hypothesis claims that inducing robot motion on objects aids the learning process. It is shown that extra information from the robot sensors provides enough information to localise an object and distinguish it from the background. It is also shown that decisive planning allows the object to be separated and observed from a variety of different poses, giving a good foundation to build a robust classification model. Contributions include a new segmentation algorithm, a new classification model for object learning, and a method for allowing a robot to supervise its own learning in cluttered and dynamic environments. 629.892637
347	Example-based water animation Pickup, David Lemor January 2013 We present the argument that video footage of real scenes can be used as input examples from which novel three-dimensional scenes can be created. We argue that the parameters used by traditional animation techniques, which are based on the underlying physical properties of the water, do not intuitively relate to the resulting visual appearance. We will present a novel approach which allows a range of video examples to be used as a set of visual parameters to design the visible behaviour of a water animation directly. Our work begins with a method for reconstructing the perceived water surface geometry from video footage of natural scenes, captured with only a single static camera. We show that this has not been accomplished before, because previous approaches use sophisticated capturing systems which are limited to a laboratory environment. We will also present an approach for reconstructing the water surface velocities which are consistent with the reconstructed geometry. We then present a method of using these water surface reconstructions as building blocks which can be seamlessly combined to create novel water surface animations. We are also able to extract foam textures from the videos, which can be applied to the water surfaces to enhance their visual appearance. The surfaces we produce can be shaped and curved to fit within a user's three-dimensional scene, and the movement of external objects can be driven by the velocity fields. We present a range of results which show that our method can plausibly emulate a wide range of real-world scenes, different from those from which the water characteristics were captured. As the animations we create are fully three-dimensional, they can be rendered from any viewpoint, in any rendering style. 006.696
348	Computer Vision System-On-Chip Designs for Intelligent Vehicles Zhou, Yuteng 24 April 2018 Intelligent vehicle technologies, which can enhance road safety, improve transport efficiency, and aid driver operations through sensors and intelligence, are growing rapidly. The advanced driver assistance system (ADAS) is a common platform for intelligent vehicle technologies. Many sensors, such as LiDAR, radar, and cameras, have been deployed on intelligent vehicles. Among these sensors, optical cameras are the most widely used due to their low cost and easy installation. However, most computer vision algorithms are complicated and computationally slow, making them difficult to deploy on power-constrained systems. This dissertation investigates several mainstream ADAS applications and proposes corresponding efficient digital circuit implementations for these applications. This dissertation presents three ways of software/hardware algorithm division for three ADAS applications: lane detection, traffic sign classification, and traffic light detection. Using an FPGA to offload critical parts of the algorithm, the entire computer vision system is able to run in real time while maintaining low power consumption and a high detection rate. Catching up with the advent of deep learning in the field of computer vision, we also present two deep learning based hardware implementations on application specific integrated circuits (ASIC) to achieve even lower power consumption and higher accuracy. The real time lane detection system is implemented on the Xilinx Zynq platform, which has a dual core ARM processor and FPGA fabric. The Xilinx Zynq platform integrates the software programmability of an ARM processor with the hardware programmability of an FPGA. For the lane detection task, the FPGA handles the majority of the task: region-of-interest extraction, edge detection, image binarization, and Hough transform. After that, the ARM processor takes in the Hough transform results and highlights lanes using the Hough peaks algorithm. The entire system is able to process a 1080P video stream at a constant speed of 69.4 frames per second, realizing real time capability. An efficient system-on-chip (SOC) design which classifies up to 48 traffic signs in real time is presented in this dissertation. The traditional histogram of oriented gradients (HoG) and support vector machine (SVM) are proven to be very effective on traffic sign classification, with an average accuracy rate of 93.77%. For traffic sign classification, the biggest challenge comes from the low execution efficiency of the HoG on embedded processors. By dividing the HoG algorithm into three fully pipelined stages, as well as leveraging extra on-chip memory to store intermediate results, we successfully achieved a throughput of 115.7 frames per second at 1080P resolution. The proposed generic HoG hardware implementation could also be used as an individual IP core by other computer vision systems. A real time traffic signal detection system is implemented to present an efficient hardware implementation of the traditional grass-fire blob detection. The traditional grass-fire blob detection method iterates over the input image multiple times to calculate connected blobs. In digital circuits, five extra on-chip block memories are utilized to save intermediate results. By using additional memories, all connected blob information can be obtained through a one-pass image traversal. The proposed hardware friendly blob detection can run at 72.4 frames per second with 1080P video input. Applying HoG + SVM as feature extractor and classifier, a 92.11% recall rate and a 99.29% precision rate are obtained on red lights, and a 94.44% recall rate and a 98.27% precision rate on green lights. Nowadays, the convolutional neural network (CNN) is revolutionizing computer vision due to learnable layer by layer feature extraction. However, CNNs are usually slow to train and slow to execute at inference. In this dissertation, we studied the implementation of the principal component analysis based network (PCANet), which strikes a balance between algorithm robustness and computational complexity. Compared to a regular CNN, the PCANet needs only one training iteration, and typically has at most a few tens of convolutions on a single layer. Compared to hand-crafted feature extraction methods, the PCANet algorithm reflects the variance in the training dataset well and can better adapt to difficult conditions. The PCANet algorithm achieves accuracy rates of 96.8% and 93.1% on road marking detection and traffic light detection, respectively. Implemented in Synopsys 32nm process technology, the proposed chip can classify 724,743 32-by-32 image candidates in one second, with only 0.5 watt power consumption. In this dissertation, the binary neural network (BNN) is adopted as a potential detector for intelligent vehicles. The BNN constrains all activations and weights to be +1 or -1. Compared to a CNN with the same network configuration, the BNN achieves 50 times better resource usage with only 1% - 2% accuracy loss. Taking car detection and pedestrian detection as examples, the BNN achieves an average accuracy rate of over 95%. Furthermore, a BNN accelerator implemented in Synopsys 32nm process technology is presented in our work. The elastic architecture of the BNN accelerator makes it able to process any number of convolutional layers with high throughput. The BNN accelerator consumes only 0.6 watt and doesn't rely on external memory for storage. FPGA computer vision deep learning ASIC
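Illustrative note on the BNN in record 348: when activations and weights are restricted to +1 or -1, a dot product can be computed with bitwise operations, because each mismatching position contributes -1 and each matching position contributes +1. The sketch below is a minimal software illustration in Python and is not taken from the dissertation's accelerator design; the helper names `pack` and `binary_dot` and the sample vectors are hypothetical.

```python
def pack(values):
    """Pack a list of +1/-1 values into an int; bit k is 1 when values[k] == +1."""
    word = 0
    for k, v in enumerate(values):
        if v == 1:
            word |= 1 << k
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two +1/-1 vectors of length n, given their packed bits.

    XOR marks the positions where the signs differ; the dot product is
    (matches - mismatches) = n - 2 * mismatches.
    """
    mismatches = bin(a_bits ^ b_bits).count("1")
    return n - 2 * mismatches


# Hypothetical sample data, chosen only for illustration.
acts = [1, -1, -1, 1, 1, -1, 1, 1]
wts = [1, 1, -1, -1, 1, -1, -1, 1]

reference = sum(a * w for a, w in zip(acts, wts))  # ordinary multiply-accumulate
fast = binary_dot(pack(acts), pack(wts), len(acts))  # bitwise equivalent
```

Replacing multiply-accumulate with XOR and a bit count is one commonly cited reason binarized networks can be cheap in hardware; whether the accelerator in this record uses exactly this formulation is not stated in the abstract.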
349	Computer Vision and Machine Learning for Autonomous Vehicles Chen, Zhilu 22 October 2017 "Autonomous vehicles are an engineering technology that can improve transportation safety, alleviate traffic congestion and reduce carbon emissions. Research on autonomous vehicles can be categorized by functionality, for example, object detection or recognition, path planning, navigation, lane keeping, speed control and driver status monitoring. The research topics can also be categorized by the equipment or techniques used, for example, image processing, computer vision, machine learning, and localization. This dissertation primarily reports on computer vision and machine learning algorithms and their implementations for autonomous vehicles. The vision-based system can effectively detect and accurately recognize multiple objects on the road, such as traffic signs, traffic lights, and pedestrians. In addition, an autonomous lane keeping system has been proposed using end-to-end learning. In this dissertation, a road simulator is built using data collection and augmentation, which can be used for training and evaluating autonomous driving algorithms. The Graphic Processing Unit (GPU) based traffic sign detection and recognition system can detect and recognize 48 traffic signs. The implementation has three stages: pre-processing, feature extraction, and classification. A highly optimized and parallelized version of Histogram of Oriented Gradients (HOG) and Support Vector Machine (SVM) is used. The system can process 27.9 frames per second with the active pixels of a 1,628 by 1,236 resolution, and with minimal loss of accuracy. In an evaluation using the BelgiumTS dataset, the experimental results indicate that the detection rate is about 91.69% with false positives per window of 3.39e-5, and the recognition rate is about 93.77%. We report on two traffic light detection and recognition systems. The first system detects and recognizes red circular lights only, using image processing and SVM. Its performance is better than that of traditional detectors and it achieves the best performance with 96.97% precision and 99.43% recall. The second system is more complicated. It detects and classifies different types of traffic lights, including green and red lights in both circular and arrow forms. In addition, it employs image processing techniques, such as color extraction and blob detection, to locate the candidates. Subsequently, a pre-trained PCA network is used as a multi-class classifier for obtaining frame-by-frame results. Furthermore, an online multi-object tracking technique is applied to overcome occasional misses and a forecasting method is used to filter out false positives. Several additional optimization techniques are employed to improve the detector performance and to handle the traffic light transitions. A multi-spectral data collection system is implemented for pedestrian detection, which includes a thermal camera and a pair of stereo color cameras. The three cameras are first aligned using the trifocal tensor, and the aligned data are processed by using computer vision and machine learning techniques. Convolutional channel features (CCF) and the traditional HOG+SVM approach are evaluated over the data captured from the three cameras. Through the use of the trifocal tensor and CCF, training becomes more efficient. The proposed system achieves only a 9% log-average miss rate on our dataset. The autonomous lane keeping system employs an end-to-end learning approach for obtaining the proper steering angle for maintaining a car in a lane. The convolutional neural network (CNN) model uses raw image frames as input, and it outputs the steering angles corresponding to the input frames. Unlike the traditional approach, which manually decomposes the problem into several parts, such as lane detection, path planning, and steering control, the model learns to extract useful features on its own and learns to steer from human behavior. More importantly, we find that having a simulator for data augmentation and evaluation is important. We then build the simulator using image projection, vehicle dynamics, and vehicle trajectory tracking. The test results reveal that the model trained with augmented data using the simulator has better performance and achieves about a 98% autonomous driving time on our dataset. Furthermore, a vehicle data collection system is developed for building our own datasets from recorded videos. These datasets are used in the above studies and have been released to the public for autonomous vehicle research. The experimental datasets are available at http://computing.wpi.edu/Dataset.html." computer vision machine learning autonomous vehicles
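Illustrative note on the precision and recall figures quoted in records 348 and 349: both metrics follow from counts of true positives, false positives, and false negatives. The sketch below uses the standard definitions; the function name `precision_recall` and the counts are hypothetical and are not taken from either dissertation.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from detection counts.

    precision = TP / (TP + FP): share of reported detections that are correct.
    recall    = TP / (TP + FN): share of real objects that were detected.
    A zero denominator is reported as 0.0 rather than raising an error.
    """
    reported = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / reported if reported else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Hypothetical counts, chosen only for illustration.
p, r = precision_recall(true_pos=90, false_pos=10, false_neg=30)
```

A detector can score high on one metric and low on the other, which is presumably why the records report the two figures side by side.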
350	GPU Based Real-Time Trinocular Stereovision Yao, Yuanbin 24 August 2012 "Stereovision has been applied in many fields including UGV (Unmanned Ground Vehicle) navigation and surgical robotics. Traditionally, most stereovision applications are binocular, using information from a horizontal 2-camera array to perform stereo matching and compute the depth image. Trinocular stereovision with a 3-camera array has been shown to provide higher accuracy in stereo matching, which could benefit applications like distance finding, object recognition and detection. However, the additional information from the extra camera increases the computational burden, making the approach impractical in many time critical applications like robotic navigation and surgical robots. Due to the nature of the GPU's highly parallelized SIMD (Single Instruction Multiple Data) architecture, GPGPU (General Purpose GPU) computing can effectively be used to parallelize the large data processing and greatly accelerate the computation of algorithms used in trinocular stereovision. So the combination of trinocular stereovision and GPGPU would be an innovative and effective method for the development of stereovision applications. This work focuses on designing and implementing a real-time trinocular stereovision algorithm with a GPU (Graphics Processing Unit). The goal involves the use of the Open Source Computer Vision Library (OpenCV) in C++ and the NVidia CUDA GPGPU Solution. Algorithms were developed with many different basic image processing methods, and a winner-take-all method is applied to perform fusion of disparities in different directions. The results are compared in accuracy and speed to verify the improvement." OpenCV Computer Vision Trinocular Stereovision GPGPU CUDA
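Illustrative note on the winner-take-all fusion mentioned in record 350: one simple way to fuse disparities from two directions is to add the matching costs of a horizontal and a vertical camera pair for each candidate disparity and keep the candidate with the lowest total. The sketch below is plain CPU Python written under several assumptions that are not stated in the abstract: an L-shaped camera layout with equal baselines, an absolute-difference cost on single pixels, and a particular shift direction. The function name `wta_fused_disparity` and the synthetic images are hypothetical, and this is not the thesis's CUDA implementation.

```python
def wta_fused_disparity(center, right, top, max_disp):
    """Per-pixel disparity that minimises the summed horizontal + vertical cost."""
    h, w = len(center), len(center[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best_cost, best_d = None, 0
            for d in range(max_disp + 1):
                if x - d < 0 or y - d < 0:
                    break  # candidate would fall outside one of the images
                cost = (abs(center[y][x] - right[y][x - d])    # horizontal pair
                        + abs(center[y][x] - top[y - d][x]))   # vertical pair
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d  # winner-take-all: keep the minimum
            disp[y][x] = best_d
    return disp


# Synthetic 6x6 scene with a constant true shift of 2 pixels in both directions.
SIZE = 6
center = [[10 * y + x for x in range(SIZE)] for y in range(SIZE)]
right = [[center[y][x + 2] if x + 2 < SIZE else 999 for x in range(SIZE)]
         for y in range(SIZE)]
top = [[center[y + 2][x] if y + 2 < SIZE else 999 for x in range(SIZE)]
       for y in range(SIZE)]

disp = wta_fused_disparity(center, right, top, max_disp=3)
```

Each pixel's search is independent of every other pixel's, which is the property that makes this kind of computation a natural fit for the SIMD-style parallelism the abstract describes.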
