41 |
Contextual models for object detection using boosted random fields
Torralba, Antonio, Murphy, Kevin P., Freeman, William T. 25 June 2004 (has links)
We seek to both detect and segment objects in images. To exploit both local image data and contextual information, we introduce Boosted Random Fields (BRFs), which use boosting to learn the graph structure and local evidence of a conditional random field (CRF). The graph structure is learned by assembling graph fragments in an additive model. The connections between individual pixels are not very informative, but by using dense graphs, we can pool information from large regions of the image; dense models also support efficient inference. We show how contextual information from other objects can improve detection performance, both in terms of accuracy and speed, by using a computational cascade. We apply our system to detect stuff and things in office and street scenes.
|
42 |
A Formulation for Active Learning with Applications to Object Detection
Sung, Kah Kay, Niyogi, Partha 06 June 1996 (has links)
We discuss a formulation for active example selection for function learning problems. This formulation is obtained by adapting Fedorov's optimal experiment design to the learning problem. We specifically show how to analytically derive example selection algorithms for certain well defined function classes. We then explore the behavior and sample complexity of such active learning algorithms. Finally, we view object detection as a special case of function learning and show how our formulation reduces to a useful heuristic to choose examples to reduce the generalization error.
|
43 |
Fast Face Finding / Snabb ansiktsdetektering
Westerlund, Tomas January 2004 (has links)
Face detection is a classical application of object detection. There are many practical applications in which face detection is the first step: face recognition, video surveillance, image database management, and video coding. This report presents the results of an implementation of the AdaBoost algorithm to train a strong classifier to be used for face detection. The AdaBoost algorithm is fast and shows a low false detection rate, two characteristics which are important for face detection algorithms. The application is an implementation of the AdaBoost algorithm with several command-line executables that support testing of the algorithm. The training and detection algorithms are separated from the rest of the application by a well-defined interface to allow reuse as a software library. The source code is documented using the JavaDoc standard, and CppDoc is then used to produce detailed information on classes and relationships in HTML format. The implemented algorithm is found to achieve a relatively high detection rate and a low false alarm rate, considering the poorly suited training data used.
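The idea of boosting weak classifiers into a strong classifier can be illustrated with a minimal sketch. This is not the report's implementation: it is plain Python, it uses simple threshold stumps on raw feature values as the weak classifiers (an assumption, since the abstract does not name the weak learner), and the names `train_adaboost`, `stump_predict`, and `strong_predict` are hypothetical.

```python
import math

def stump_predict(sample, f, threshold, polarity):
    # Weak classifier: vote `polarity` if feature f reaches the threshold.
    return polarity if sample[f] >= threshold else -polarity

def train_adaboost(samples, labels, rounds):
    """Train a strong classifier as a weighted vote of threshold stumps.

    samples: list of feature vectors (lists of floats)
    labels:  list of +1 / -1
    Returns a list of (alpha, feature_index, threshold, polarity).
    """
    n = len(samples)
    weights = [1.0 / n] * n
    strong = []
    for _ in range(rounds):
        best = None  # (weighted error, feature_index, threshold, polarity)
        for f in range(len(samples[0])):
            for threshold in sorted({s[f] for s in samples}):
                for polarity in (1, -1):
                    err = sum(
                        w for s, y, w in zip(samples, labels, weights)
                        if stump_predict(s, f, threshold, polarity) != y
                    )
                    if best is None or err < best[0]:
                        best = (err, f, threshold, polarity)
        err, f, threshold, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        strong.append((alpha, f, threshold, polarity))
        # Re-weight: misclassified samples gain weight, correct ones lose it.
        weights = [
            w * math.exp(-alpha * y * stump_predict(s, f, threshold, polarity))
            for s, y, w in zip(samples, labels, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return strong

def strong_predict(strong, sample):
    # Strong classifier: sign of the alpha-weighted vote of all stumps.
    score = sum(a * stump_predict(sample, f, t, p) for a, f, t, p in strong)
    return 1 if score >= 0 else -1
```

A face detector in this style would typically evaluate `strong_predict` on features computed from each candidate image window; the feature extraction step is outside the scope of this sketch.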
|
44 |
Multi-camera Human Tracking on Realtime 3D Immersive Surveillance System
Hsieh, Meng-da 23 June 2010 (has links)
Conventional surveillance systems present video to a user from more than one camera on a single display. Such a display allows the user to observe different parts of the scene, or to observe the same part of the scene from different viewpoints. Each video is usually labeled by a fixed textual annotation displayed under the video segment to identify the image. With the growing number of surveillance cameras and the expansion of the surveillance area, the conventional split-screen display approach cannot provide an intuitive correspondence between the images acquired and the areas under surveillance. Such a system has a number of inherent flaws: low correlation between the split videos; difficulty tracking new activities; low resolution of surveillance videos; and difficulty achieving total surveillance. To address these defects, "Immersive Surveillance for Total Situational Awareness" uses computer graphics techniques to construct 3D models of buildings on 2D satellite images. Users can construct the floor platform by defining the information for each floor or building and the position of each camera. This information is combined to construct a 3D surveillance scene, and the images acquired by surveillance cameras are pasted into the constructed 3D model to provide an intuitive visual presentation. Users can also walk through the scene by a fixed-frequency, self-defined business model to perform virtual surveillance.
Multi-camera Human Tracking on Realtime 3D Immersive Surveillance System is based on "Immersive Surveillance for Total Situational Awareness" and comprises three parts. 1. Salient object detection: The system converts videos to corresponding image sequences and analyzes the videos provided by each camera. To filter out the foreground pixels, the background model of each image is calculated by a pixel-stability-based background update algorithm. 2. Nighttime image fusion: A fuzzy enhancement method is used to enhance the dark areas in nighttime images while maintaining the saturation information. The salient object detection algorithm is then applied to extract salient objects in the dark areas. The system divides the fusion results into three parts (wall, ceiling, and floor), then pastes them as materials into the corresponding parts of the 3D scene. 3. Multi-camera human tracking: Connected component labeling is applied to filter out small areas and save each block's information. The RGB-weight percentage information in each block and a 5-state status (Enter, Leave, Match, Occlusion, Fraction) are used to draw the trajectory of each person in every camera's field of view on the 3D surveillance scene. Finally, the results from all cameras are fused to complete the multi-camera real-time people tracking. In summary, every person can be tracked in the 3D immersive surveillance system without watching each of thousands of camera views.
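The connected component labeling step in part 3 can be sketched as follows. This is a minimal illustration in plain Python, not the thesis code: it assumes a binary foreground mask is already available from the background model, uses 4-connectivity, and stores only area and bounding box per block. The function name `label_components` and the `min_area` parameter are hypothetical.

```python
from collections import deque

def label_components(mask, min_area):
    """Find 4-connected foreground regions and drop those below min_area.

    mask: list of rows of 0/1 (1 = foreground pixel)
    Returns a list of blocks, each a dict with 'area' and 'bbox'
    (top, left, bottom, right), ordered by raster scan of their first pixel.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blocks = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            area, top, left, bottom, right = 0, r, c, r, c
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Small regions are treated as noise and discarded.
            if area >= min_area:
                blocks.append({"area": area, "bbox": (top, left, bottom, right)})
    return blocks
```

Each surviving block could then carry further per-block information, such as the color statistics the abstract mentions, for matching people across frames and cameras.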
|
45 |
Interest Curves: Concept, Evaluation, Implementation and Applications
Li, Bo January 2015 (has links)
Image features play important roles in a wide range of computer vision applications, such as image registration, 3D reconstruction, object detection and video understanding. These image features include edges, contours, corners, regions, lines, curves, interest points, etc. However, the research is fragmented in these areas, especially when it comes to line and curve detection. In this thesis, we aim to discover, integrate, evaluate and summarize past research as well as our contributions in the area of image features. This thesis provides a comprehensive framework of concept, evaluation, implementation, and applications for image features. Firstly, this thesis proposes a novel concept of interest curves. Interest curves is a concept derived and extended from interest points. Interest curves are significant lines and arcs in an image that are repeatable under various image transformations. Interest curves bring clear guidelines and structures for future curve and line detection algorithms and related applications. Secondly, this thesis presents an evaluation framework for detecting and describing interest curves. The evaluation framework provides a new paradigm for comparing the performance of state-of-the-art line and curve detectors under image perturbations and transformations. Thirdly, this thesis proposes an interest curve detector (Distinctive Curves, DICU), which unifies the detection of edges, corners, lines and curves. DICU represents our state-of-the-art contribution in the areas concerning the detection of edges, corners, curves and lines. Our research efforts cover the most important attributes required by these features with respect to robustness and efficiency. Interest curves preserve richer geometric information than interest points. This advantage gives new ways of solving computer vision problems. We propose a simple description method for curve matching applications. 
We have found that our proposed interest curve descriptor outperforms all state-of-the-art interest point descriptors (SIFT, SURF, BRISK, ORB, FREAK). Furthermore, in our research we design a novel object detection algorithm that only utilizes DICU geometries without using local feature appearance. We organize image objects as curve chains; to detect an object, we search for this curve chain in the target image using dynamic programming. The curve chain matching is scale- and rotation-invariant as well as robust to image deformations. These properties have given us the possibility of resolving the rotation-variance problem in object detection applications. In our face detection experiments, the curve chain matching method proves to be scale- and rotation-invariant and very computationally efficient. / INTRO – INteractive RObotics research network
|
46 |
An Energy Efficient FPGA Hardware Architecture for the Acceleration of OpenCV Object Detection
Brousseau, Braiden 21 November 2012 (has links)
The use of Computer Vision in programmable mobile devices could lead to novel and creative applications. However, the computational demands of Computer Vision are ill-suited to low performance mobile processors. Also the evolving algorithms, due to active research in this field, are ill-suited to dedicated digital circuits. This thesis proposes the inclusion of an FPGA co-processor in smartphones as a means of efficiently computing tasks such as Computer Vision. An open source object detection algorithm is run on a mobile device and implemented on an FPGA to motivate this proposal. Our hardware implementation presents a novel memory architecture and a SIMD processing style that achieves both high performance and energy efficiency. The FPGA implementation outperforms a mobile device by 59 times while being 13.5 times more energy efficient.
|
48 |
The evolution of snake toward automation for multiple blob-object segmentation
Saha, Baidya Nath Unknown Date
No description available.
|
49 |
Sharing visual features for multiclass and multiview object detection
Torralba, Antonio, Murphy, Kevin P., Freeman, William T. 14 April 2004 (has links)
We consider the problem of detecting a large number of different classes of objects in cluttered scenes. Traditional approaches require applying a battery of different classifiers to the image, at multiple locations and scales. This can be slow and can require a lot of training data, since each classifier requires the computation of many different image features. In particular, for independently trained detectors, the (run-time) computational complexity and the (training-time) sample complexity scale linearly with the number of classes to be detected. It seems unlikely that such an approach will scale up to allow recognition of hundreds or thousands of objects. We present a multi-class boosting procedure (joint boosting) that reduces the computational and sample complexity by finding common features that can be shared across the classes (and/or views). The detectors for each class are trained jointly, rather than independently. For a given performance level, the total number of features required, and therefore the computational cost, is observed to scale approximately logarithmically with the number of classes. The jointly selected features are closer to edges and generic features typical of many natural structures than to specific object parts. Those generic features generalize better and considerably reduce the computational cost of an algorithm for multi-class object detection.
|
50 |
Contextual Influences on Saliency
Torralba, Antonio 14 April 2004 (has links)
This article describes a model for including scene/context priors in attention guidance. In the proposed scheme, visual context information can be available early in the visual processing chain, in order to modulate the saliency of image regions and to provide an efficient short cut for object detection and recognition. The scene is represented by means of a low-dimensional global description obtained from low-level features. The global scene features are then used to predict the probability of presence of the target object in the scene, and its location and scale, before exploring the image. Scene information can then be used to modulate the saliency of image regions early during the visual processing in order to provide an efficient short cut for object detection and recognition.
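One simple way to picture how scene information might modulate saliency is to weight a bottom-up saliency map by a location prior predicted from the global scene features. The sketch below assumes a multiplicative combination followed by renormalization; the abstract only says the saliency is modulated, so the combination rule, the function name `modulate_saliency`, and both input maps are illustrative assumptions.

```python
def modulate_saliency(saliency, location_prior):
    """Weight a bottom-up saliency map by a contextual location prior.

    Both inputs are equally sized 2D lists of non-negative numbers.
    The result is renormalized to sum to 1 (unless every entry is zero).
    """
    combined = [
        [s * p for s, p in zip(s_row, p_row)]
        for s_row, p_row in zip(saliency, location_prior)
    ]
    total = sum(sum(row) for row in combined)
    if total == 0:
        return combined  # nothing is salient where the prior allows the target
    return [[v / total for v in row] for row in combined]
```

With a uniform saliency map, the output simply follows the prior, so regions where the context makes the target unlikely are suppressed before any local detector is run.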
|