Global ETD Search

1	Video Indexing and Retrieval in Compressed Domain Using Fuzzy-Categorization. Fang, H., Qahwaji, Rami S.R., Jiang, Jianmin January 2006 There has been increased interest in video indexing and retrieval in recent years. In this work, the indexing and retrieval system for visual content is based on features extracted from the compressed domain. Processing the compressed domain directly saves decoding time, which is extremely important when indexing large multimedia archives. A fuzzy-categorizing structure is designed in this paper to improve the retrieval performance. For our experiments, a database of basketball videos has been constructed. This database includes three categories: full-court match, penalty and close-up. First, spatial and temporal feature extraction is applied to train the fuzzy membership functions using the minimum entropy optimal algorithm. Then, the max composition operation is used to generate a new fuzzy feature to represent the content of the shots. Finally, the fuzzy-based representation becomes the indexing feature for the content-based video retrieval system. The experimental results show that the proposed algorithm is quite promising for semantic-based video retrieval. Video indexing and retrieval Fuzzy-Categorization
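As a rough illustration of the max composition step mentioned in the abstract above, the sketch below composes a vector of feature membership degrees with a fuzzy relation using the common max-min rule. The abstract does not specify the relation, the features or the exact composition variant, so the function name `max_min_composition`, the two features and every numeric value here are assumptions made only for illustration.

```python
def max_min_composition(memberships, relation):
    """Compose a fuzzy vector with a fuzzy relation using the max-min rule.

    memberships: membership degrees in [0, 1], one per feature.
    relation: relation[i][j] is the degree to which feature i supports category j.
    Returns one degree per category: max over i of min(memberships[i], relation[i][j]).
    """
    n_categories = len(relation[0])
    return [
        max(min(m, row[j]) for m, row in zip(memberships, relation))
        for j in range(n_categories)
    ]


# Category names come from the abstract; all numbers below are hypothetical.
CATEGORIES = ["full-court match", "penalty", "close-up"]
feature_memberships = [0.2, 0.9]  # assumed: one spatial and one temporal feature
relation = [
    [0.8, 0.3, 0.1],  # assumed support of the spatial feature for each category
    [0.4, 0.7, 0.6],  # assumed support of the temporal feature for each category
]

fuzzy_feature = max_min_composition(feature_memberships, relation)
for name, degree in zip(CATEGORIES, fuzzy_feature):
    print(f"{name}: {degree}")
```

The resulting vector, one degree per category, is the kind of compact fuzzy descriptor that could serve as an index key for a shot.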
2	Automatic Affective Video Indexing: Identification of Slapstick Comedy Using Low-level Video Characteristics French, Jean Helen 01 January 2011 Recent advances in multimedia technologies have helped create extensive digital video repositories. Users need to be able to search these large video repositories in order to find videos that have preferred content. In order to meet the needs of users, videos in these repositories need to be indexed. Manual indexing is not an appropriate method due to the time and effort involved. Instead, videos need to be accurately indexed by utilizing computer-based methods. Automatic video indexing techniques use computer technology to analyze low-level video features to identify the content that exists in videos. The type of indexing used in this study is automatic affective video indexing, which is an attempt to index videos by automatically detecting content that elicits an emotional response from individuals. The specific affect-related content of interest in this study is slapstick comedy, a technique that is used in videos with humor. The methodology of this study analyzed the audio stream as well as the motion of targeted objects in videos. The relationship between the changes in the two low-level features was used to identify whether slapstick comedy was present in the video and where the instance of slapstick could be found. There were three research questions presented in the study, which were associated with the two goals. Research Question 1 determined whether or not the targeted content could be identified using low-level features. Research Question 2 measured the relationship between the experimental results and the ground truth in terms of identifying the location of the targeted content in video. Research Question 3 determined whether one type of low-level feature was more strongly associated with the target content than the other.
Goal 1 was to utilize sound and motion to predict the existence of slapstick comedy in videos. Goal 2 was to utilize sound and motion to predict the location of slapstick comedy in videos. The results of the study showed that Goals 1 and 2 were partially met, prompting an investigation into methodology improvements as part of this research. The results also showed that motion was more strongly related to the target content than sound. Affective Content Multimedia Slapstick Video Indexing Computer Sciences
3	Exploiting Information Extraction Techniques For Automatic Semantic Annotation And Retrieval Of News Videos In Turkish Kucuk, Dilek 01 February 2011 Information extraction (IE) is known to be an effective technique for automatic semantic indexing of news texts. In this study, we propose a text-based, fully automated system for the semantic annotation and retrieval of news videos in Turkish, which exploits several IE techniques on the video texts. The IE techniques employed by the system include named entity recognition, automatic hyperlinking, person entity extraction with coreference resolution, and event extraction. The system utilizes the outputs of the components implementing these IE techniques as the semantic annotations for the underlying news video archives. Apart from the IE components, the proposed system comprises a news video database in addition to components for news story segmentation, sliding text recognition, and semantic video retrieval. We also propose a semi-automatic counterpart of the system, in which the only manual intervention takes place during text extraction. Both systems are executed on genuine video data sets consisting of videos broadcast by the Turkish Radio and Television Corporation. The current study is significant as it proposes the first fully automated system to facilitate semantic annotation and retrieval of news videos in Turkish, yet the proposed system and its semi-automated counterpart are quite generic and hence could be customized to build similar systems for video archives in other languages as well. Moreover, IE research on Turkish texts is known to be rare, and within the course of this study we have proposed and implemented novel techniques for several IE tasks on Turkish texts. As an application example, we have demonstrated the utilization of the implemented IE components to facilitate multilingual video retrieval. QA Computer Software 76.75-76.765
4	Design and Implementation of Query Processing Strategies for Video Data Yang, Wen-Haur 09 July 2002 Traditional database systems only support textual and numerical data. Video data stored in these database systems can only be retrieved through their video identifiers, titles or descriptions. In video data, frame-by-frame object change is one of the most obvious kinds of information. Each video contains temporal and spatial relationships between content objects. The temporal relationships can be specified between frame sequences, and the spatial relationships can be specified between objects in a single frame. The difficulty in designing a content-based video database system is how to store and describe the relationships between moving objects completely. Many studies on content-based video retrieval have represented the content of a video as a set of frames, but they either left out the temporal ordering of frames in the shot or only stored the relationships between objects in a single frame. From these observations, we conclude that a content-based video database system requires video indexing, query processing and a convenient user interface that fit the requirements and characteristics of videos. In this thesis, we design and implement a query processing strategy for video data. In the proposed strategy, we consider three query types: the exact object match, the spatial-temporal object retrieval and the motion query. An exact object match finds the video files that contain the specified objects, a spatial-temporal object retrieval retrieves the object pairs that satisfy given spatial-temporal relationships, and a motion query finds the set of frames that contain the object movements. Moreover, we consider three design issues: video indexing, video query processing and the video query interface.
When a video database holds a large number of videos and each video contains many shots, frames and objects, the processing time for content retrieval is very long. Thus, we need a proper video indexing strategy to reduce the search time. To capture the spatial-temporal relationships of objects between different frames, we build indexes along both the spatial and temporal axes. In the temporal index file structure, we propose the shot-based B+-tree to index the temporal data. In the spatial index file structure, we use an R-tree to store not only the relationships between objects in one frame, but also the relationships of an object at its first and last appearances in the shot. Based on this strategy, we can describe the status of a moving object in detail. For query processing, we propose a signature file structure to filter out the videos that cannot possibly be answers. After that, to determine whether the answer exists in the candidate videos, we use a multi-dimensional string, called a binary string, to represent the spatial-temporal relationships between objects. The video query processing problem then becomes a binary string matching problem. Finally, we design and implement a user-friendly user interface. Our system runs on a Pentium III machine with a CPU clock rate of 550 MHz and 256 MB of main memory, under Windows 2000 Professional; it uses an Access 2000 database and is coded in Delphi 6 in about 10,000 lines. From our experience, we show that the proposed system supports efficient query processing, fast search and a user-friendly user interface. Video Query Processing Video Data Spatial-Temporal Relationships Video Indexing shot-based B+-tree
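The signature-file filtering step described in the abstract above can be sketched roughly as follows. This is a generic superimposed-coding signature file, not the thesis's actual structure: the signature width, the number of bits set per object, the use of SHA-256 and all function names are assumptions chosen for illustration.

```python
import hashlib

SIGNATURE_BITS = 64   # assumed signature width
BITS_PER_OBJECT = 3   # assumed number of bits set for each object name


def object_signature(name):
    """Map an object name to a bit mask with up to BITS_PER_OBJECT bits set."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    signature = 0
    for k in range(BITS_PER_OBJECT):
        position = int.from_bytes(digest[2 * k:2 * k + 2], "big") % SIGNATURE_BITS
        signature |= 1 << position
    return signature


def video_signature(objects):
    """Superimpose (bitwise OR) the signatures of all objects in a video."""
    signature = 0
    for name in objects:
        signature |= object_signature(name)
    return signature


def candidate_videos(index, query_objects):
    """Return the videos whose signature covers every bit of the query signature.

    Videos that fail this test cannot contain all query objects. Videos that
    pass may still be false positives and need an exact check afterwards.
    """
    query_signature = video_signature(query_objects)
    return [
        video_id
        for video_id, signature in index.items()
        if (signature & query_signature) == query_signature
    ]


# Hypothetical video contents, for illustration only.
index = {
    "video_a": video_signature(["ball", "player", "referee"]),
    "video_b": video_signature(["car", "road"]),
}
print(candidate_videos(index, ["ball", "player"]))
```

The filter never drops a true answer, which is why it can safely run before a more expensive matching stage such as the binary string matching the abstract describes.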
5	Sumarização de vídeos de histerocopias diagnósticas / Content-based summarization of diagnostic hysteroscopy videos Gavião Neto, Wilson Pires January 2009 Consider a library containing thousands of diagnostic hysteroscopy videos that are indexed only by patient ID and exam date. Users typically browse this library to answer queries such as "retrieve images of submucosal myomas" or "retrieve images whose diagnosis is endometrial polyp". This is the context of this work. Specialists use diagnostic hysteroscopy videos to inspect the appearance of the uterus; the images are important for diagnosis as well as for medical research in fields such as human reproduction and fertility studies. These videos contain a great deal of information, but only a small number of frames are actually useful for diagnosis or prognosis. This thesis proposes a technique to automatically identify clinically relevant information in diagnostic hysteroscopy videos, creating a rich video summary. We propose a hierarchical representation based on robust tracking of geometrically consistent image points through the frame sequence. We demonstrate that this representation is a helpful way to organize hysteroscopy video content, allowing specialists to browse quickly without introducing spurious information into the video summary. The experimental results indicate that the method produces compact video summaries (data-rate reduction of around 97.5%) without discarding clinically relevant information. Computação gráfica Processamento : Imagem Informática médica Video summarization Video indexing Video browsing Hysteroscopy Medical video
8	HIERARCHICAL SUMMARIZATION OF VIDEO DATA LI, WEI 09 October 2007 No description available. Computer Science Video data processing Video summarization Video content structure video indexing
9	Indexation de la vidéo portée : application à l’étude épidémiologique des maladies liées à l’âge / Indexing of activities in wearable videos : application to epidemiological studies of aged dementia Karaman, Svebor 12 December 2011 The research of this PhD thesis was carried out in the context of medical monitoring of patients with age-related dementia using video cameras worn by the patients. The idea is to provide medical practitioners with a new tool for the early diagnosis of age-related dementia such as Alzheimer's disease. More precisely, Instrumental Activities of Daily Living (IADL) have to be indexed automatically in videos recorded with a wearable recording device.
Such videos present specific characteristics, such as strong motion or strong lighting changes. Furthermore, the targeted recognition task is at a very high semantic level. In this difficult context, the first step of the analysis is to define an equivalent of the notion of "shot" in edited videos. We therefore developed a method for partitioning a continuously recorded video into "viewpoints" according to the apparent motion in the image plane.
For the recognition of IADLs, we developed a solution based on the formalism of Hidden Markov Models (HMM). A two-level hierarchical HMM was introduced, modeling semantic activities or intermediate states. A complex set of features (dynamic, static, low-level and mid-level) was exploited, and the most effective joint description spaces were identified experimentally.
Among the mid-level features for activity recognition, we focused on the semantic objects that the person manipulates in the camera's field of view. We proposed a new concept for describing objects or images that uses local features (SURF) and the underlying topological structure of local graphs. We introduced a nested approach to graph construction in which the same scene can be described by several levels of graphs with an increasing number of nodes. We build these graphs by Delaunay triangulation on SURF points, thus preserving the good properties of local features, i.e. their invariance with regard to affine transformations in the image plane such as rotation, translation or change of scale.
We use these graph features in the Bag-of-Visual-Words framework. This necessarily raises the problem of defining a distance, or dissimilarity, between graphs for clustering and recognition. We propose a dissimilarity measure based on the Context-Dependent Kernel (CDK) proposed by H. Sahbi and show its relation to the classical L2 norm when comparing trivial graphs (the SURF points).
For HMM-based activity recognition, the experiments are conducted on the world's first corpus of wearable-camera videos intended for the observation of IADLs, and on publicly available datasets such as SIVAL and Caltech-101 for object recognition.
Indexation Video Modèles de Markov Cachés Reconnaissance d'Objets Mots-Graphes Visuels Video Indexing Hidden Markov Models Object Recognition Visual Graph Words
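To make the HMM-based recognition idea in the abstract above more concrete, here is a minimal sketch of Viterbi decoding for a plain, single-level HMM. The thesis describes a two-level hierarchical HMM over rich visual features; this sketch does not reproduce that model. The activity names, the observed object labels and all probabilities are hypothetical values chosen only to show how a most likely activity sequence is recovered from per-segment observations.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence for a discrete-observation HMM."""
    # best[t][s] = (probability of the best path ending in state s at time t,
    #               previous state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        layer = {}
        for s in states:
            layer[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(layer)

    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for layer in reversed(best[1:]):
        path.append(layer[path[-1]][1])
    path.reverse()
    return path


# Hypothetical model: two activities, observed through the object seen in view.
states = ["cooking", "reading"]
start_p = {"cooking": 0.5, "reading": 0.5}
trans_p = {
    "cooking": {"cooking": 0.8, "reading": 0.2},
    "reading": {"cooking": 0.2, "reading": 0.8},
}
emit_p = {
    "cooking": {"kettle": 0.9, "book": 0.1},
    "reading": {"kettle": 0.2, "book": 0.8},
}

decoded = viterbi(["kettle", "kettle", "book", "book"], states, start_p, trans_p, emit_p)
print(decoded)
```

With these assumed parameters the decoder labels the first two segments as cooking and the last two as reading, since staying in a state is far more probable than switching.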
10	Décompositions spatio-temporelles pour l'étude des textures dynamiques : contribution à l'indexation vidéo / Spatio-temporal decompositions for the study of Dynamic Textures : contribution to video indexing Dubois, Sloven 19 November 2010 This thesis focuses on the study and characterization of Dynamic Textures (DTs), with video indexing in large databases as the target application. As this research topic is new and emerging, we propose a definition of DTs, a taxonomy and a state of the art. The most representative class of DTs is described by a formal model that considers DTs as the superposition of carrier waves (wavefronts) and local oscillating phenomena. The design of spatio-temporal analysis tools adapted to DTs is our main contribution. First, we show that the 2D+T curvelet transform is relevant for representing the carrier wave. Second, in order to decompose video sequences, we propose to use the Morphological Component Analysis approach; our contribution consists of introducing and studying new thresholding strategies. These methods are tested on several applications: spatio-temporal segmentation, decomposition of DTs, global motion estimation of a DT, ... We have also shown that Morphological Component Analysis and multi-scale approaches give significant results for content-based retrieval and dynamic texture indexing on the DynTex database. This thesis thus constitutes a first step towards automatic indexing of DTs in image sequences and opens the way for many developments on this new topic. Finally, the proposed approaches are generic and could be applied in a broader context, for instance the processing of 3D data. Textures dynamiques Décompositions multi-échelles 2D+T Analyse en composantes morphologiques Indexation vidéo Dynamic textures 2D+T multiscale decompositions Morphological component analysis Video indexing
