1. Virtual image sensors to track human activity in a smart house
Tun, Min Han, January 2007
With the advancement of computer technology, the demand for more accurate and intelligent monitoring systems has also risen. Uses of computer vision and video analysis range from industrial inspection to surveillance. Object detection and segmentation are the first and most fundamental tasks in the analysis of dynamic scenes. Traditionally, this detection and segmentation are done through temporal differencing or statistical modelling methods. One of the most widely used background modelling and segmentation algorithms is the Mixture of Gaussians method developed by Stauffer and Grimson (1999). During the past decade many such algorithms have been developed, ranging from parametric to non-parametric. Many of them use pixel intensities to model the background, while some use texture properties such as Local Binary Patterns. These algorithms function quite well under normal environmental conditions, and each has its own set of advantages and shortcomings. However, they share two drawbacks. The first is the stationary object problem: when moving objects become stationary, they are merged into the background. The second is the problem of light changes: when rapid illumination changes occur in the environment, these background modelling algorithms produce large areas of false positives.

The algorithms are capable of adapting to the change, but the quality of the segmentation is very poor during the adaptation phase. In this thesis, a framework to suppress these false positives is introduced. Image properties such as edges and textures are used to reduce the number of false positives during the adaptation phase. The framework is built on the idea of sequential pattern recognition. In any background modelling algorithm, the importance of multiple image features, as well as of different spatial scales, cannot be overlooked; neglecting these two factors makes it difficult to detect and reduce the false alarms caused by rapid light changes and other conditions. The use of edge features in false alarm suppression is also explored. Edges are comparatively resistant to environmental changes in video scenes; the assumption here is that regardless of environmental changes such as illumination change, the edges of objects should remain the same. The edge-based approach is tested on several videos containing rapid light changes and shows promising results. Texture is then used to analyse video images and remove false alarm regions: a texture gradient approach and Laws' Texture Energy Measures are used to find and remove false positives, and Laws' measures are found to perform better than the gradient approach. The results of using edges, texture and different combinations of the two in false positive suppression are also presented in this work. Finally, the false positive suppression framework is applied to a smart house scenario that uses cameras to model "virtual sensors" detecting interactions of occupants with devices. Results show that the accuracy of the virtual sensors, compared with the ground truth, is improved.
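To make the pipeline concrete, here is a minimal sketch pairing OpenCV's Mixture-of-Gaussians subtractor (a descendant of the Stauffer-Grimson model) with a naive edge-overlap check in the spirit of the edge-based suppression described above. The suppression rule and the `edge_overlap_thresh` value are illustrative assumptions, not the thesis's actual algorithm.

```python
import cv2
import numpy as np

# MOG2 is OpenCV's refinement of the Stauffer-Grimson Mixture of Gaussians.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

def foreground_with_edge_check(frame, edge_overlap_thresh=0.6):
    fg_mask = subtractor.apply(frame)
    fg_mask = cv2.threshold(fg_mask, 200, 255, cv2.THRESH_BINARY)[1]  # drop shadow label (127)

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    bg_gray = cv2.cvtColor(subtractor.getBackgroundImage(), cv2.COLOR_BGR2GRAY)
    frame_edges = cv2.Canny(gray, 50, 150)
    bg_edges = cv2.Canny(bg_gray, 50, 150)

    n_labels, labels = cv2.connectedComponents(fg_mask)
    for label in range(1, n_labels):
        blob = labels == label
        blob_edges = frame_edges[blob] > 0
        if blob_edges.sum() == 0:
            continue
        # If the edges inside a foreground blob coincide with the background's
        # edges, the blob is likely an illumination artefact, not a real object.
        overlap = (blob_edges & (bg_edges[blob] > 0)).sum() / blob_edges.sum()
        if overlap > edge_overlap_thresh:
            fg_mask[blob] = 0
    return fg_mask
```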
2. Semantic Movie Scene Segmentation Using Bag-of-Words Representation
Luo, Sai, 07 December 2017
No description available.
3. Development of advanced 3D medical analysis tools for clinical training, diagnosis and treatment
Skounakis, Emmanouil D., January 2013
The objective of this PhD research was the development of novel 3D interactive medical platforms for medical image analysis, simulation and visualisation, with a focus on oncology images, to support clinicians in managing the increasing amount of data provided by several medical imaging modalities. The DoctorEye and Automatic Tumour Detector platforms were developed through constant interaction and feedback from expert clinicians, integrating a number of innovations in algorithms and methods concerning image handling, segmentation, annotation, visualisation and plug-in technologies. DoctorEye is already being used in a related tumour modelling EC project (ContraCancrum) and offers several robust algorithms and tools for fast annotation, 3D visualisation and measurements to assist the clinician in better understanding the pathology of the brain area and defining the treatment. It is free to use upon request and offers a user-friendly environment for clinicians, as it simplifies the application of complex algorithms and methods. It integrates a sophisticated, simple-to-use plug-in technology allowing researchers to add algorithms and methods (e.g. tumour growth and simulation algorithms for improving therapy planning) and interactively check the results. Apart from diagnostic and research purposes, it supports clinical training, as it allows an expert clinician to evaluate delineations made by different clinical users. The Automatic Tumour Detector focuses on abdominal images, which are more complex than those of the brain. It supports fully automatic 3D detection of kidney pathology in real time, as well as advanced 3D visualisation and measurements. This is achieved through an innovative method implementing Templates: these contain rules and parameters for the Automatic Recognition Framework, defined interactively by engineers based on clinicians' 3D gold-standard models. The Templates enable the automatic detection of kidneys and their possible abnormalities (tumours, stones and cysts). The system also supports the transmission of these Templates to another expert for a second opinion. Future versions of the proposed platforms could integrate even more sophisticated algorithms and tools and offer fully computer-aided identification of a variety of other organs and their dysfunctions.
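As a rough, hypothetical illustration of the Template idea (the thesis does not publish the Template format, so every field name and value below is invented), detection rules might be encoded as per-structure intensity and volume ranges:

```python
from dataclasses import dataclass

@dataclass
class Template:
    """Hypothetical rule set for recognising one organ or abnormality."""
    name: str
    min_intensity: float    # expected mean intensity of the target tissue
    max_intensity: float
    min_volume_mm3: float   # plausible size range for the structure
    max_volume_mm3: float

    def matches(self, mean_intensity: float, volume_mm3: float) -> bool:
        return (self.min_intensity <= mean_intensity <= self.max_intensity
                and self.min_volume_mm3 <= volume_mm3 <= self.max_volume_mm3)

kidney = Template("kidney", 20.0, 45.0, 120_000.0, 230_000.0)
print(kidney.matches(mean_intensity=32.0, volume_mm3=150_000.0))  # True
```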
4. An Algorithm For Multiscale License Plate Detection And Rule-based Character Segmentation
Karali, Ali Onur, 01 October 2011
License plate recognition (LPR) technology has great importance for the development of Intelligent Transportation Systems, automatically identifying vehicles using image processing and pattern recognition techniques. Conventional LPR systems consist of license plate detection (LPD), character segmentation (CS) and character recognition (CR) steps. Successful detection of the license plate and character locations plays a vital role in proper LPR. Most LPD and CS techniques in the literature assume a fixed distance and orientation from the vehicle to the imaging system; hence, the application areas of LPR systems using these techniques are limited to stationary platforms. However, installation of LPR systems on mobile platforms is required in many applications, and algorithms that are invariant to distance, orientation and illumination should be developed for this purpose. In this thesis work, an LPD algorithm based on a multi-scale vertical edge density feature, and a character segmentation algorithm based on local thresholding and connected component analysis, are proposed. Performance of the proposed algorithms is measured using ground truth positions of the license plate and characters, and algorithm parameters are optimized using recall and precision curves. The proposed techniques give satisfying results for different license plate datasets, and the algorithm complexity is suitable for real-time implementation if optimized.
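A minimal sketch of the two proposed stages follows, with assumed window sizes and thresholds (the thesis's tuned parameters are not reproduced, and the full method would repeat the edge-density search over several scales to obtain the distance invariance discussed above):

```python
import cv2
import numpy as np

def detect_plate(gray, win_w=51, win_h=15):
    # Character strokes on a plate respond strongly to a vertical edge filter;
    # the plate is located at the peak of the local edge-density map.
    vert_edges = np.abs(cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3))
    density = cv2.boxFilter(vert_edges, cv2.CV_32F, (win_w, win_h))
    y, x = np.unravel_index(np.argmax(density), density.shape)
    return gray[max(0, y - win_h):y + win_h, max(0, x - win_w):x + win_w]

def segment_characters(plate, min_area=30):
    # Local (adaptive) thresholding copes with uneven illumination on the plate.
    binary = cv2.adaptiveThreshold(plate, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY_INV, blockSize=25, C=10)
    n, _, stats, _ = cv2.connectedComponentsWithStats(binary)
    boxes = [tuple(stats[i, :4]) for i in range(1, n)
             if stats[i, cv2.CC_STAT_AREA] >= min_area]
    return sorted(boxes)  # left-to-right by x coordinate
```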
5. Simultaneous object detection and segmentation using top-down and bottom-up processing
Sharma, Vinay, 07 January 2008
No description available.
6. Mathematical Expression Detection and Segmentation in Document Images
Bruce, Jacob Robert, 19 March 2014
Various document layout analysis techniques are employed to enhance the accuracy of optical character recognition (OCR) in document images. Type-specific document layout analysis involves localizing and segmenting specific zones in an image so that they may be recognized by specialized OCR modules. Zones of interest include titles, headers/footers, paragraphs, images, mathematical expressions, chemical equations, musical notations, tables and circuit diagrams, among others. False positive/negative detections, oversegmentations, and undersegmentations made during the detection and segmentation stage will confuse a specialized OCR system and may thus result in garbled, incoherent output. In this work, a mathematical expression detection and segmentation (MEDS) module is implemented and thoroughly evaluated. The module is fully integrated with the open-source OCR software Tesseract and is designed to function as a component of it. Evaluation is carried out on freely available public-domain images so that future and existing techniques may be objectively compared.
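As a loose illustration of what flagging candidate math zones can look like (this is not the thesis's MEDS module, which works inside Tesseract itself; the regex heuristic below is an assumption), one might OCR a page with pytesseract and keep word boxes containing math-like glyphs:

```python
import re
import pytesseract
from pytesseract import Output

# Naive heuristic: operators, brackets, and letter-digit pairs such as "x2".
MATHY = re.compile(r"[=+\-^_\\{}()<>/|]|\b[a-zA-Z]\d\b")

def candidate_math_boxes(image):
    data = pytesseract.image_to_data(image, output_type=Output.DICT)
    boxes = []
    for i, text in enumerate(data["text"]):
        if text.strip() and MATHY.search(text):
            boxes.append((data["left"][i], data["top"][i],
                          data["width"][i], data["height"][i]))
    return boxes  # rough seeds to be merged and validated by a real MEDS stage
```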
7. Segmentation and structuring of video documents for indexing applications
Tapu, Ruxandra Georgina, 07 December 2012
Recent advances in telecommunications, combined with the development of image and video processing and acquisition devices, have led to a spectacular growth in the amount of visual content stored, transmitted and exchanged over the Internet. Within this context, developing efficient tools to access, browse and retrieve video content has become a crucial challenge. In Chapter 2 we introduce and validate a novel shot boundary detection algorithm able to identify both abrupt and gradual transitions. The technique is based on an enhanced graph partition model, combined with multi-resolution analysis and a non-linear filtering operation; the overall computational complexity is reduced by a two-pass strategy. In Chapter 3 the video abstraction problem is considered: we developed a keyframe representation system that extracts a variable number of images from each detected shot, depending on the variation of the visual content. Chapter 4 deals with high-level semantic segmentation into scenes. Here, a novel scene/DVD chapter detection method is introduced and validated, in which spatio-temporally coherent shots are clustered into the same scene based on a set of temporal constraints, adaptive thresholds and neutralized shots. Chapter 5 considers the issue of object detection and segmentation. Here we introduce a novel spatio-temporal visual saliency system based on region contrast, interest point correspondence, geometric transforms, motion class estimation and the temporal consistency of regions. The proposed technique is extended to 3D videos by representing the stereoscopic content as a 2D video plus its associated depth map.
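The detector itself rests on the enhanced graph-partition model; as a point of reference only, a much simpler histogram-difference baseline for the same shot-boundary task might look like the sketch below (the Bhattacharyya threshold is an assumption, and gradual transitions would need more than a single-frame test):

```python
import cv2

def shot_boundaries(video_path, threshold=0.5):
    cap = cv2.VideoCapture(video_path)
    boundaries, prev_hist, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0, 1], None, [32, 32], [0, 180, 0, 256])
        hist = cv2.normalize(hist, None).flatten()
        if prev_hist is not None:
            # Bhattacharyya distance: near 0 for similar frames, near 1 at a cut.
            d = cv2.compareHist(prev_hist, hist, cv2.HISTCMP_BHATTACHARYYA)
            if d > threshold:
                boundaries.append(idx)
        prev_hist, idx = hist, idx + 1
    cap.release()
    return boundaries
```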
8. Semantic content analysis for effective video segmentation, summarisation and retrieval
Ren, Jinchang, January 2009
This thesis focuses on four main research themes, namely shot boundary detection, fast frame alignment, activity-driven video summarisation, and highlights-based video annotation and retrieval. A number of novel algorithms have been proposed to address these issues, highlighted as follows. Firstly, accurate and robust shot boundary detection is achieved through modelling of cuts into sub-categories and appearance-based modelling of several gradual transitions, along with some novel features extracted from compressed video. Secondly, fast and robust frame alignment is achieved via the proposed subspace phase correlation (SPC) and an improved sub-pixel strategy. The SPC is proved to be insensitive to zero-mean noise, and its gradient-based extension is robust even to non-zero-mean noise and can handle non-overlapping regions for robust image registration. Thirdly, hierarchical modelling of rush videos using formal language techniques is proposed, which guides the modelling and removal of several kinds of junk frames as well as adaptive clustering of retakes. With an extracted activity-level measurement, shots and sub-shots are detected for content-adaptive video summarisation. Fourthly, highlights-based video annotation and retrieval is achieved, employing statistical modelling of skin pixel colours, knowledge-based shot detection, and improved determination of camera motion patterns. Throughout these techniques, one important principle is to integrate various kinds of feature evidence and to incorporate prior knowledge in modelling the given problems. A high-level hierarchical representation is extracted from the original linear structure for effective management and content-based retrieval of video data. As most of the work is implemented in the compressed domain, an additional benefit is high efficiency, which is useful for many online applications.
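A minimal NumPy sketch of plain phase correlation, the baseline that SPC extends, may help make the frame-alignment step concrete; the subspace projection and the improved sub-pixel strategy are not reproduced here.

```python
import numpy as np

def phase_correlation_shift(a, b):
    """Estimate the integer (dy, dx) translation between two same-sized frames."""
    A, B = np.fft.fft2(a), np.fft.fft2(b)
    cross = A * np.conj(B)
    cross /= np.abs(cross) + 1e-12            # keep phase only
    corr = np.real(np.fft.ifft2(cross))
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    if dy > a.shape[0] // 2:
        dy -= a.shape[0]
    if dx > a.shape[1] // 2:
        dx -= a.shape[1]
    return dy, dx
```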
9. Characterization of electric power networks as a medium for data transmission [Caracterização de redes de energia elétrica como meio de transmissão de dados]
Oliveira, Thiago Rodrigues, 28 September 2010
Previous issue date: 2010-09-28 / Esta dissertação apresenta, de forma detalhada, um conjunto de metodologias e técnicas destinadas à análise de redes de energia elétrica como meio de transmissão de dados (power line communication - PLC). As características das redes elétricas que influenciam um sistema de comunicação de dados consideradas neste trabalho são as seguintes: a impedância de acesso à rede elétrica, a resposta ao impulso e o ruído. Para tanto, técnicas de processamento de sinais para estimação da resposta em frequência, estimação do comprimento efetivo da resposta ao impulso, detecção e segmentação de ruídos impulsivos e análise espectral de ruídos aditivos são propostas e discutidas na presente contribuição. Os desempenhos objetivos e a apreciação subjetiva das técnicas propostas, a partir de dados sintéticos e medidos, evidenciam a adequação destas técnicas para a análise em questão. Além disso, formulações matemáticas para a resposta ao impulso de canais PLC invariantes, variantes e periodicamente variantes no tempo, derivadas a partir do modelo de multi-propagação para canais PLC, são apresentadas. Tais formulações proporcionam de forma simples e objetiva a emulação dos possíveis comportamentos temporais de canais PLC reais e, portanto, podem se constituir como ferramentas de grande utilidade para o projeto e a avaliação de sistemas de comunicações baseados na tecnologia PLC. / This thesis addresses a set of methodologies and techniques for the analysis of electric grids as a medium for data communications (power line communications - PLC). The main features influencing a communication system that are considered in this work are the input impedance, the channel impulse response, and the noise. In this regards, signal processing-based techniques are investigated, proposed and analyzed for the estimations of the channel frequency response and the effective length of the channel impulse response; the detection and segmentation of impulsive noise; and the power spectral analysis of the additive noise at the channel output. The numerical performance and subjective analysis regarding the use of the proposed techniques in synthetic and measured data indicate that those techniques fit well in the thesis purposes. In addition, mathematical formulation for invariant, time-varying, and periodically time-varying PLC channel models, which are based on multi-path channel model approach, are presented. These formulations are simple and elegant ones for the emulation of possible temporal behavior of existing PLC channels and, as a result, can constitute a useful tool for the design and analysis of PLC systems.
10. Kdy kdo mluví? [Who speaks when?] / Speaker Diarization
Tomášek, Pavel, January 2011
This work addresses the task of speaker diarization: the goal is to implement a system able to decide "who spoke when". The particular components of the implementation are described; the main parts are feature extraction, voice activity detection, speaker segmentation and clustering, and finally postprocessing. The work also reports results of the implemented system on test data, together with a description of the evaluation. The test data come from the NIST RT evaluations 2005-2007, and the lowest error rate achieved on this dataset is 18.52% DER. The results are compared with the diarization system implemented by Marijn Huijbregts of the Netherlands, who worked on the same data in 2009 and reached 12.91% DER.
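For reference, the DER figures quoted above can be illustrated with a frame-level computation. Real NIST scoring adds forgiveness collars around segment boundaries and an optimal reference-to-hypothesis speaker mapping; the sketch below assumes that mapping is already given.

```python
import numpy as np

def frame_der(ref, hyp, silence=-1):
    """ref/hyp: per-frame speaker ids, with `silence` marking non-speech."""
    ref, hyp = np.asarray(ref), np.asarray(hyp)
    speech = ref != silence
    missed = np.sum(speech & (hyp == silence))          # speech labelled as silence
    false_alarm = np.sum(~speech & (hyp != silence))    # silence labelled as speech
    confusion = np.sum(speech & (hyp != silence) & (ref != hyp))
    return (missed + false_alarm + confusion) / max(np.sum(speech), 1)

ref = [0, 0, 0, 1, 1, -1, -1, 1]
hyp = [0, 0, 1, 1, 1, -1, 1, 1]
print(f"DER = {frame_der(ref, hyp):.2%}")  # one confusion + one false alarm over 6 speech frames
```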