Global ETD Search

21	Navigation and Automatic Ground Mapping by Rover Robot
Wang, Xuerui, Zhao, Li January 2010
This project is based mainly on image mosaicing and on similarity measurements computed with different methods. A map of a floor is created from a database of small images captured by a camera-mounted robot scanning the wooden floor of a living room. We call this ground mapping. After the ground mapping, the robot can position itself on the map by using new small images that it captures as it moves across the floor. Similarity measurements based on the Schwartz inequality are used both to build the ground map and to position the robot once the map is available. Because natural light affects the gray values of the images, this effect must be accounted for in the similarity measurements. A new approach to mosaicing is suggested: it uses the local texture orientation, instead of the original gray values, in ground mapping as well as in positioning. Additionally, we report ground mapping results obtained with gray values as features. Using the new approach and similarity measurements based on the Schwartz inequality, the robot can find its position to within a few pixels.
Keywords: Image mosaicing; Ground mapping; Robot positioning; Schwartz inequality; Texture orientation; Structure tensor; Linear symmetry
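As an illustration of entry 21, the following is a minimal sketch of a similarity measure derived from the Schwartz (Cauchy-Schwarz) inequality, which bounds the normalized inner product of two patches by 1 and reaches 1 only when one patch is a scaled copy of the other. It assumes raw gray values as features, whereas the thesis favors local texture orientation; the function names `schwarz_similarity` and `locate` and the toy map are hypothetical and not taken from the thesis.

```python
import math


def schwarz_similarity(f, g):
    """Return |<f, g>| / (||f|| * ||g||), which lies in [0, 1] by Cauchy-Schwarz."""
    dot = sum(a * b for a, b in zip(f, g))
    norm_f = math.sqrt(sum(a * a for a in f))
    norm_g = math.sqrt(sum(b * b for b in g))
    if norm_f == 0.0 or norm_g == 0.0:
        return 0.0  # an all-zero patch carries no information to match
    return abs(dot) / (norm_f * norm_g)


def locate(patch, ground_map):
    """Slide `patch` over `ground_map` (lists of rows); return (row, col, score) of the best match."""
    ph, pw = len(patch), len(patch[0])
    mh, mw = len(ground_map), len(ground_map[0])
    flat_patch = [v for row in patch for v in row]
    best = (0, 0, -1.0)
    for r in range(mh - ph + 1):
        for c in range(mw - pw + 1):
            window = [ground_map[r + i][c + j] for i in range(ph) for j in range(pw)]
            score = schwarz_similarity(flat_patch, window)
            if score > best[2]:
                best = (r, c, score)
    return best


# Toy ground map (hypothetical values) and a patch cut at row 1, column 2,
# then dimmed by a factor of 0.5 to mimic a change in illumination.
ground_map = [
    [1, 2, 3, 4],
    [5, 1, 9, 2],
    [3, 7, 1, 8],
    [2, 4, 6, 1],
]
patch = [[4.5, 1.0], [0.5, 4.0]]
row, col, score = locate(patch, ground_map)
```

Because the measure is normalized, a uniform multiplicative change in brightness leaves the score unchanged, which is one simple way to reduce the influence of lighting on the match.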
22	Feature Based Image Mosaicing using Regions of Interest for Wide Area Surveillance Camera Arrays with Known Camera Ordering
Ballard, Brett S. 16 May 2011
No description available.
Keywords: Electrical Engineering; Remote Sensing; Scientific Imaging; regions of interest; roi; known camera ordering; image mosaicing; feature based; homography; image stitching; stereo; computer vision; wide area surveillance; camera array
23	Total variational optical flow for robust and accurate bladder image mosaicing / Calcul du flot optique dans une approche variationnelle totale pour le mosaïquage robuste et précis d’images de la vessie
Ali, Sharib 04 January 2016
Cystoscopy is the reference procedure for the diagnosis and treatment of bladder cancer. The small field of view (FOV) of endoscopes makes both the diagnosis and the follow-up of lesions difficult. Image mosaics are a solution to this problem since they visualize large FOVs of the bladder scene. However, image mosaicing is challenging for these sequences because of low contrast, weak texture, inter- and intra-patient texture variability and illumination changes. The same difficulties arise in other endoscopic modalities and in non-medical scenes such as underwater videos. In this thesis, a total variational energy is first minimized with a first-order primal-dual convex optimization algorithm to obtain optical flow fields that give a dense and accurate correspondence between homologous points of the image pairs. The correspondences are then used to determine the transformation parameters needed to register the images in one global mosaic coordinate system. The proposed methods for dense optical flow estimation include a data term modeled to minimize the number of outlier vectors and a regularizer designed to preserve the discontinuities of the flow field. An optical flow algorithm that is robust to strong illumination changes (and suitable for different modalities) has also been developed in this framework. The registration accuracy and robustness of the proposed methods are tested on publicly available optical flow datasets and on simulated bladder and skin phantoms.
Results on patient data acquired with rigid and flexible cystoscopes, under white light or fluorescence, show the robustness of the proposed approaches. These results are complemented by those obtained on other real endoscopic data, dermoscopic sequences, underwater scenes and space exploration data.
Keywords: Total variational approach; Optical flow; Structure constancy; Neighborhood descriptors; Anisotropic regularization; Convex optimization; Endoscopic image mosaicing
621.367 006.6
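For orientation on entry 23, a total variational optical flow energy is often written in the classical TV-L1 form below. This is the textbook formulation and is given only as an assumption about the general shape of such an energy; the thesis describes its own data term and regularizer, which this expression does not reproduce. Here $I_0$ and $I_1$ are the two images, $\mathbf{u} = (u_1, u_2)$ is the flow field over the image domain $\Omega$, and $\lambda$ weights the data term against the regularizer.

```latex
E(\mathbf{u}) =
  \int_{\Omega} \lambda \,\bigl| I_1\bigl(\mathbf{x} + \mathbf{u}(\mathbf{x})\bigr) - I_0(\mathbf{x}) \bigr| \, d\mathbf{x}
  \;+\; \int_{\Omega} \bigl( |\nabla u_1| + |\nabla u_2| \bigr) \, d\mathbf{x}
```

The first integral is the data term (brightness constancy under an L1 penalty, which tolerates outliers); the second is the total variation regularizer, which allows sharp discontinuities in the flow field.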
24	AVALIAÇÃO DE MÉTODOS DE MOSAICO DE IMAGENS APLICADOS EM IMAGENS AGRÍCOLAS OBTIDAS POR MEIO DE RPA / Evaluation of image mosaicing methods applied to agricultural images obtained by RPA
Almeida, Pedro Henrique Soares de 15 May 2018
Image mosaicing is the alignment of multiple images into larger compositions that represent portions of a 3D scene. A number of image mosaicing algorithms have been proposed over the last two decades. At the same time, the continuous arrival of new mosaicing methods makes it very difficult to choose an appropriate algorithm for a specific purpose. This study aimed to evaluate low-level feature-based mosaicing methods using agricultural images obtained by Remotely Piloted Aircraft (RPA). Low-level feature detection algorithms can be invariant to scale and rotation, among other transformations that commonly occur in agricultural images obtained by RPA.
The Harris corner detector, FAST corner detector, SIFT feature detector and SURF detector were evaluated according to computational performance and the quality of the generated mosaic. To evaluate computational performance, factors such as the average number of features detected per image, the number of images used to compose the mosaic and the processing time (user time) were taken into account. To evaluate quality, the mosaics generated by each method were used to estimate the severity of Asian soybean rust, and a comparison with the commercial software Pix4Dmapper was performed. Regarding quality, there was no significant difference and all methods proved to be on the same level. The SURF detector had the worst performance of all methods, using on average only 33.1% of the input images to compose the mosaics. The Harris corner detector proved to be the fastest solution, composing the mosaic up to 7.27% faster. However, in its final mosaic, its use of the input images was poor: only 52%. The FAST corner detector made the best use of the input images, but significant object discontinuities occurred in its final mosaic. In addition, its processing time was considerably longer than that of the other methods: it was up to 6.42 times slower to compose the mosaic. The SIFT feature detector had the second-best processing time and the second-best utilization of the input images, without object discontinuity problems. It therefore proved to be the most suitable method for agricultural images obtained by RPA.
Keywords: Image mosaicing; Agricultural images; RPA; Low-level feature; OpenCV; Asian rust severity
25	Efficient topology estimation for large scale optical mapping
Elibol, Armagan 29 July 2011
Large scale image mosaicing methods are in great demand among scientists who study different aspects of the seabed, and have been fostered by impressive advances in the capabilities of underwater robots in gathering optical data from the seafloor. Cost and weight constraints mean that low-cost remotely operated vehicles (ROVs) usually have a very limited number of sensors. When a low-cost robot carries out a seafloor survey using a down-looking camera, it usually follows a predetermined trajectory that provides several non time-consecutive overlapping image pairs. Finding these pairs (a process known as topology estimation) is indispensable to obtaining globally consistent mosaics and accurate trajectory estimates, which are necessary for a global view of the surveyed area, especially when optical sensors are the only data source. This thesis presents a set of consistent methods aimed at creating large area image mosaics from optical data obtained during surveys with low-cost underwater vehicles. First, a global alignment method developed within a feature-based image mosaicing (FIM) framework, where nonlinear minimisation is substituted by two linear steps, is discussed. Then, a simple four-point mosaic rectifying method is proposed to reduce distortions that might occur due to lens distortions, error accumulation and the difficulties of optical imaging in an underwater medium. The topology estimation problem is addressed by means of a combined augmented state and extended Kalman filter framework, aimed at minimising the total number of matching attempts and simultaneously obtaining the best possible trajectory. Potential image pairs are predicted by taking into account the uncertainty in the trajectory. The contribution of matching an image pair is investigated using information theory principles.
Lastly, a different solution to the topology estimation problem is proposed in a bundle adjustment framework. Innovative aspects include the use of a fast image similarity criterion combined with a minimum spanning tree (MST) solution to obtain a tentative topology. This topology is improved by attempting image matching with the pairs for which there is the most overlap evidence. Unlike previous approaches for large-area mosaicing, our framework is able to deal naturally with cases where time-consecutive images cannot be matched successfully, such as completely unordered sets. Finally, the efficiency of the proposed methods is discussed and a comparison is made with other state-of-the-art approaches, using a series of challenging datasets in underwater scenarios.
Keywords: Optical mapping; Underwater robotics; Image mosaicing; Visual navigation; Global alignment; Topology estimation
004 68
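The tentative-topology step mentioned in entry 25 (a fast similarity criterion combined with a minimum spanning tree) can be sketched as follows. This is a generic illustration using Kruskal's algorithm, assuming pairwise similarity scores in [0, 1] are already available and taking the edge cost to be 1 minus the similarity; the function name `tentative_topology`, the cost definition and the sample scores are hypothetical and are not taken from the thesis.

```python
def tentative_topology(num_images, similarities):
    """Build a spanning tree over images with Kruskal's algorithm.

    `similarities` maps an image pair (i, j) to a score in [0, 1].
    The edge cost is assumed to be 1 - score, so the minimum spanning tree
    keeps the most similar pairs that still connect all images.
    """
    parent = list(range(num_images))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    # Visit candidate pairs from lowest cost (highest similarity) upward.
    for (i, j), score in sorted(similarities.items(), key=lambda item: 1.0 - item[1]):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # joining two components cannot create a cycle
            parent[root_i] = root_j
            tree.append((i, j))
    return tree


# Hypothetical similarity scores for four images.
scores = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.85, (2, 3): 0.7, (0, 3): 0.2}
tree = tentative_topology(4, scores)
```

The resulting tree links every image with only `num_images - 1` pairs, which gives an initial set of matching attempts; additional pairs with strong overlap evidence can then be tried to refine the topology, as the abstract describes.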
26	Recalage hétérogène pour la reconstruction 3D de scènes sous-marines / Heterogeneous Registration for 3D Reconstruction of Underwater Scene
Mahiddine, Amine 30 June 2015
The survey and 3D reconstruction of underwater scenes are becoming ever more indispensable given our growing interest in the study of the seabed. Most existing work in this area relies on acoustic sensors, with images often serving only as illustration. The objective of this thesis is to develop techniques for the fusion of heterogeneous data from a photogrammetric system and an acoustic system. The presented work is organized in three parts. The first is devoted to the processing of 2D data to improve the colors of underwater images, in order to increase the repeatability of the feature descriptors at each 2D point. We then propose a system for visualizing the scene in 2D as a mosaic. In the second part, a method for 3D reconstruction from an unordered set of several images is proposed. The 3D data computed in this way are merged with data from the acoustic system in order to reconstruct the underwater site. In the last part of this thesis, we propose an original 3D registration method that is distinguished by the nature of the descriptor extracted at each point. The descriptor that we propose is invariant to isometric transformations (rotation, transformation) and overcomes the multi-resolution problem. We validate our approach with a study on synthetic and real data, in which we show the limits of the existing registration methods in the literature. Finally, we propose an application of our method to the recognition of 3D objects.
Keywords: Underwater image processing; Color enhancement; Mosaicing; Multiview 3D reconstruction; Triangulation; Stereo vision; 3D registration; 3D object recognition
004
27	Image blending techniques and their application in underwater mosaicing
Prados Gutiérrez, Ricard 02 May 2013
The fusion of several images of the same scene into a single, larger composite is known as a photo-mosaic. Unfortunately, the seams along image boundaries are often noticeable, due to photometric and geometric registration inaccuracies. Image blending is the merging step in which those artefacts are minimized. Processing bottlenecks and the lack of medium-specific processing tools have restricted underwater photo-mosaics to small areas, despite the hundreds of thousands of square meters that modern surveys can cover. Producing these mosaics is difficult due to the challenging nature of the underwater environment and the image acquisition conditions. This thesis proposes strategies and solutions to tackle the problems of very large underwater optical surveys (Giga-mosaics), presenting contributions in the image preprocessing, enhancement and blending steps that result in improved visual quality in the final photo-mosaic.
Keywords: Image blending; Optical mapping; Image mosaicing; Underwater surveys; Large scale mapping; Underwater robotics
004 62 68
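To make the idea of blending in entry 27 concrete, here is a minimal sketch of linear feathering across the overlap between two registered scanlines. It is a generic textbook technique shown under simplifying assumptions (1-D gray-value rows, a known overlap width, perfect geometric registration) and is not the method proposed in the thesis; the function name `feather_blend` and the sample values are hypothetical.

```python
def feather_blend(left, right, overlap):
    """Blend two scanlines whose last/first `overlap` samples show the same scene points.

    Inside the overlap the weight of `right` rises linearly, so the seam is
    spread over the whole overlap instead of appearing as a single step.
    """
    if not 0 < overlap <= min(len(left), len(right)):
        raise ValueError("overlap must be positive and fit inside both scanlines")
    blended = list(left[:-overlap])  # samples seen only by the left image
    start = len(left) - overlap
    for k in range(overlap):
        w = (k + 1) / (overlap + 1)  # weight of the right image at this sample
        blended.append((1.0 - w) * left[start + k] + w * right[k])
    blended.extend(right[overlap:])  # samples seen only by the right image
    return blended


# Two hypothetical scanlines with a photometric offset and a two-sample overlap.
row = feather_blend([10, 10, 10, 10], [20, 20, 20, 20], overlap=2)
```

With these inputs the brightness step between the two images becomes a gradual ramp across the overlap, which is the kind of seam artefact reduction the abstract attributes to the blending step.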
28	Camera-Captured Document Image Analysis
Kasar, Thotreingam 11 1900
Text is no longer confined to scanned pages and often appears in camera-based images originating from text on real-world objects. Unlike images from conventional flatbed scanners, which have a controlled acquisition environment, camera-based images pose new challenges such as uneven illumination, blur, poor resolution, perspective distortion and 3D deformations that can severely affect the performance of any optical character recognition (OCR) system. Because of the variations in imaging conditions and in the target document type, traditional OCR systems designed for scanned images cannot be applied directly to camera-captured images, and a new level of processing is needed. In this thesis, we study some of the issues commonly encountered in camera-based image analysis and propose novel methods to overcome them. All the methods make use of color connected components.
1. Connected component descriptor for document image mosaicing. Document image analysis often requires mosaicing when it is not possible to capture a large document at a reasonable resolution in a single exposure. Such a document is captured in parts, and mosaicing stitches them into a single image. Since connected components (CCs) in a document image can easily be extracted regardless of image rotation, scale and perspective distortion, we design a robust feature, named the connected component descriptor, that is tailored for mosaicing camera-captured document images. The method involves extracting a circular measurement region around each CC and describing it using the angular radial transform (ART). To ensure geometric consistency during feature matching, the ART coefficients of a CC are augmented with those of its 2 nearest neighbors. Our method addresses two critical issues often encountered in correspondence matching: (i) the stability of features and (ii) robustness against false matches due to multiple instances of many characters in a document image. We illustrate the effectiveness of the proposed method on camera-captured document images exhibiting large variations in viewpoint, illumination and scale.
2. Font and background color independent text binarization. The first step in an OCR system, after document acquisition, is binarization, which converts a gray-scale or color image into a two-level image: the foreground text and the background. We propose two methods for binarization of color documents whereby the foreground text is output as black and the background as white, regardless of the polarity of the foreground-background shades.
(a) Hierarchical CC analysis: The method employs an edge-based connected component approach and automatically determines a threshold for each component. It overcomes several limitations of existing locally adaptive thresholding techniques. Firstly, it can handle documents with multi-colored text on different background shades. Secondly, the method is applicable to documents having text of widely varying sizes, which local binarization methods usually do not handle. Thirdly, the method automatically computes the binarization threshold and the logic for inverting the output from the image data and does not require any input parameter. However, the method is sensitive to complex backgrounds since it relies on edge information to identify CCs. It also uses script-specific characteristics to filter out edge components before binarization and currently works well for Roman script only.
(b) Contour-based color clustering (COCOCLUST): To overcome the above limitations, we introduce a novel unsupervised color clustering approach that operates on a ‘small’ representative set of color pixels identified using the contour information. Based on the assumption that every character is of a uniform color, we analyze each color layer individually and identify potential text regions for binarization. Experiments on several complex images having large variations in font, size, color, orientation and script illustrate the robustness of the method.
3. Multi-script and multi-oriented text extraction from scene images. Scene text understanding normally involves a pre-processing step of text detection and extraction before the acquired image is passed to the character recognition task. The subsequent recognition is performed only on the detected text regions so as to mitigate the effect of background complexity. We propose a color-based CC labeling for robust text segmentation from natural scene images. Text CCs are identified using a combination of support vector machine and neural network classifiers trained on a set of low-level features derived from boundary, stroke and gradient information. We develop a semiautomatic annotation toolkit to generate pixel-accurate ground truth for 100 scenic images containing text in various layout styles and multiple scripts. The overall precision, recall and f-measure obtained on our dataset are 0.8, 0.86 and 0.83, respectively. The proposed method is also compared with others in the literature using the ICDAR 2003 robust reading competition dataset, which, however, has only horizontal English text. The overall precision, recall and f-measure obtained are 0.63, 0.59 and 0.61, respectively, which is comparable to the best performing methods in the ICDAR 2005 text locating competition. A recent method proposed by Epshtein et al. [1] achieves better results but cannot handle arbitrarily oriented text. Our method, however, works well for generic scene images having arbitrary text orientations.
4. Alignment of curved text lines. Conventional OCR systems perform poorly on document images that contain multi-oriented text lines. We propose a technique that first identifies individual text lines by grouping adjacent CCs based on their proximity and regularity. For each identified text string, a B-spline curve is fitted to the centroids of the constituent characters and normal vectors are computed along the fitted curve. Each character is then individually rotated such that the corresponding normal vector is aligned with the vertical axis. The method has been tested on a data set of 50 images with text laid out in various ways, namely along arcs, waves, triangles and combinations of these with linearly skewed text lines. It yields 95.9% recognition accuracy on text strings where, before alignment, state-of-the-art OCRs fail to recognize any text.
The CC-based pre-processing algorithms developed are well suited for processing camera-captured images. We demonstrate the feasibility of the algorithms on the publicly available ICDAR 2003 robust reading competition dataset and on our own database of camera-captured document images that contain multiple scripts and arbitrary text layouts.
Keywords: Image Processing; Document Image Mosaicing; Color Text Binarization; Camera-based Document Image Analysis; Scene Images - Text Localization; Images - Curved Text Strings - Alignment; Connected Component Descriptor (CCD); Scenic Text; Curved Character Strings; OCR Readability; Camera-based Images; Camera-Captured Document Images; Applied Optics
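The f-measures quoted in entry 28 are consistent with the reported precision and recall values if the f-measure is taken to be the usual harmonic mean of the two. That definition is an assumption here, since the abstract does not state it, and the function name `f_measure` is illustrative. A short check:

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed definition of the f-measure)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is detected
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract: the authors' own dataset, then ICDAR 2003.
own_dataset = round(f_measure(0.8, 0.86), 2)   # 0.83
icdar_2003 = round(f_measure(0.63, 0.59), 2)   # 0.61
```

Both rounded results agree with the figures given in the abstract.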
29	Video anatomy : spatial-temporal video profile
Cai, Hongyuan 31 July 2014
Indiana University-Purdue University Indianapolis (IUPUI)
As a massive number of videos are uploaded to video websites, smooth video browsing, editing, retrieval, and summarization are in demand. Most videos employ several types of camera operations to expand the field of view, emphasize events, and express cinematic effects. To digest the heterogeneous videos found on video websites and in databases, video clips are profiled into a 2D image scroll containing both spatial and temporal information for video preview. The video profile is visually continuous, compact, scalable, and indexed to each frame. This work analyzes camera kinematics, including zoom, translation, and rotation, and categorizes camera actions as combinations of them. An automatic video summarization framework is proposed and developed. After conventional video clip segmentation and video segmentation for smooth camera operations, the global flow field under all camera actions is investigated for profiling various types of video. A new algorithm has been designed to extract the major flow direction and the convergence factor using condensed images. This work then proposes a uniform scheme to segment video clips and sections, sample the video volume across the major flow, and compute the flow convergence factor, in order to obtain an intrinsic scene space less influenced by the camera ego-motion. A motion blur technique is also used to render dynamic targets in the profile. The resulting video profile can be displayed in a video track to guide access to video frames, help video editing, and facilitate applications such as surveillance, visual archiving of environments, video retrieval, and online video preview.
Keywords: camera motion understanding; major flow; mosaicing; profile of video; spatial-temporal synthesis; video indexing; Information display systems; Digital video -- Research; Digital cameras; Video compression; Multimedia systems; Image analysis; Video surveillance; Computer algorithms; Human-computer interaction; Image transmission; Visual perception; Automatic abstracting; Mechatronics; Pattern recognition systems; Content-based image retrieval
