Global ETD Search

1	Multiple feature temporal models for the characterization of semantic video contents Sánchez Secades, Juan María 11 December 2003 The high-level structure of a video can be obtained once we have knowledge about the domain plus a representation of the contents that provides semantic information. In this context, intermediate-level semantic representations are defined in terms of low-level features and the information they convey about the contents of the video. Intermediate-level representations allow us to obtain semantically meaningful clusterings of shots, which are then used together with high-level domain-specific knowledge in order to obtain the structure of the video. Intermediate-level representations are usually domain-dependent as well. The descriptors involved in the representation are specifically tailored for the application, taking into account the requirements of the domain and the knowledge we have about it. This thesis proposes an intermediate-level representation of video contents that allows us to obtain semantically meaningful clusterings of shots. This representation does not depend on the domain, but still provides enough information to obtain the high-level structure of the video by combining the contributions of different low-level image features to the intermediate-level semantics. Intermediate-level semantics are implicitly supplied by low-level features, given that a specific semantic concept generates some particular combination of feature values. The problem is to bridge the gap between observed low-level features and their corresponding hidden intermediate-level semantic concepts. Computer vision and image processing techniques are used to establish relationships between them.
Other disciplines such as filmmaking and semiotics also provide important clues to discover how low-level features are used to create semantic concepts. A proper descriptor of low-level features can provide a representation of their corresponding semantic contents. In particular, color summarized as a histogram is used to represent the appearance of objects. When the object is the background, color provides information about location. In the same way, the semantics conveyed by a description of motion have been analyzed in this thesis. A summary of motion features as a temporal cooccurrence matrix provides information about camera operation and the type of shot in terms of the relative distance of the camera to the subject matter. The main contribution of this thesis is a representation of visual contents in video based on summarizing the dynamic behavior of low-level features as temporal processes described by Markov chains (MC). The states of the MC are given by the values of an observed low-level feature. Unlike keyframe-based representations of shots, information from all the frames is considered in the MC modeling. Natural similarity measures such as likelihood ratios and Kullback-Leibler divergence are used to compare MCs, and thus the contents of the shots they represent. In this framework, multiple image features can be combined in the same representation by coupling their corresponding MCs. Different ways of coupling MCs are presented, particularly the one called Coupled Markov Chains (CMC). A method to find the optimal coupling structure in terms of minimal cost and minimal loss of information is detailed in this dissertation. The loss of information is directly related to the loss of accuracy of the coupled structure in representing video contents.
During the same process of computing shot representations, the boundaries between shots are detected using the same modeling of contents and similarity measures. When color and motion features are combined, the CMC representation provides an intermediate-level semantic descriptor that implicitly contains information about objects (their identities, sizes and motion patterns), camera operation, location, type of shot, temporal relationships between elements of the scene, and global activity, understood as the amount of action. More complex semantic concepts emerge from the combination of these intermediate-level descriptors, such as a "talking head" that combines a close-up with the skin color of a face. Adding the location component in the News domain, talking heads can be further classified into "anchors" (located in the studio) and "correspondents" (located outdoors). These and many other semantically meaningful categories are discovered when shots represented using the CMC model are clustered in an unsupervised way. Well-defined concepts are given by compact clusters, which can be determined by a measure of their density. High-level domain knowledge can then be defined by simple rules on these salient concepts, which establish boundaries in the semantic structure of the video. The CMC modeling of video shots unifies the first steps of the video analysis process, providing an intermediate-level, semantically meaningful representation of contents without prior shot boundary detection. Semantic content analysis Computer vision Video analysis Ciències Experimentals 68
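As an illustration of the shot comparison described in the abstract above, the following is a minimal sketch, not the thesis's implementation. It assumes each shot has already been reduced to a sequence of quantized feature values (integer state indices), estimates a first-order Markov chain per shot, and compares two shots with a symmetrized Kullback-Leibler divergence rate. The function names, the smoothing constant `eps`, and the use of the empirical state frequencies in place of the stationary distribution are all assumptions made for this example.

```python
import math

def transition_matrix(states, n_states, eps=1e-6):
    """Estimate first-order transition probabilities from a state sequence.

    eps is an assumed smoothing constant that keeps every entry positive,
    so the logarithms below are always defined.
    """
    counts = [[eps] * n_states for _ in range(n_states)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1.0
    return [[c / sum(row) for c in row] for row in counts]

def state_distribution(states, n_states, eps=1e-6):
    """Empirical state frequencies, used here as a stand-in for the stationary distribution."""
    counts = [eps] * n_states
    for s in states:
        counts[s] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

def kl_rate(states_p, states_q, n_states):
    """KL divergence rate: sum_i pi(i) * sum_j P[i][j] * log(P[i][j] / Q[i][j])."""
    P = transition_matrix(states_p, n_states)
    Q = transition_matrix(states_q, n_states)
    pi = state_distribution(states_p, n_states)
    return sum(
        pi[i] * P[i][j] * math.log(P[i][j] / Q[i][j])
        for i in range(n_states)
        for j in range(n_states)
    )

def symmetric_kl(states_a, states_b, n_states):
    """Symmetrized divergence, usable as a dissimilarity between two shots."""
    return kl_rate(states_a, states_b, n_states) + kl_rate(states_b, states_a, n_states)

# Hypothetical shots: two-state feature sequences with different dynamics.
shot_a = [0, 0, 1, 1, 0, 0, 1, 1]
shot_b = [0, 1, 0, 1, 0, 1, 0, 1]
print(symmetric_kl(shot_a, shot_a, 2))  # identical chains give zero divergence
print(symmetric_kl(shot_a, shot_b, 2))  # different dynamics give a positive value
```

Because the whole sequence contributes to the transition counts, every frame of the shot influences the comparison, which is the contrast with keyframe-based representations that the abstract draws.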
2	The reference and content of proper names: a social and pragmatic approach Kui, Yimin 17 May 2005 No description available. Philosophy reference meaning semantic content proper name philosophy of language
3	Automatic Semantic Content Extraction In Videos Using A Spatio-temporal Ontology Model Yildirim, Yakup 01 March 2009 The recent increase in the use of video in many applications has revealed the need for extracting the content of videos. Raw data and low-level features alone are not sufficient to fulfill the user's need; that is, a deeper understanding of the content at the semantic level is required. Currently, manual techniques are used to bridge the gap between low-level representative features and high-level semantic content. These techniques are inefficient, subjective and costly in time, and they limit querying capabilities. Therefore, there is an urgent need for automatic semantic content extraction from videos. To meet this requirement, we propose an automatic semantic content extraction system for videos in terms of object, event and concept extraction. We introduce a general-purpose ontology-based video semantic content model that uses object definitions, spatial relations and temporal relations in event and concept definitions. Various relation types are defined to describe fuzzy spatio-temporal relations between ontology classes. The video semantic content model is then used to construct domain ontologies. In addition, domain ontologies are enriched with rule definitions to lower the cost of computing spatial relations and to define some complex situations more effectively. As a case study, we have performed a number of experiments on event and concept extraction in videos for the basketball and surveillance domains. We have obtained satisfactory precision and recall rates for object, event and concept extraction. A domain-independent application for the proposed framework has been fully implemented and tested.
4	Enhancing Content Management Systems With Semantic Capabilities Gonul, Suat 01 August 2012 Content Management Systems (CMS) generally store data in such a way that the content is distributed among several relational database tables or stored in files as a whole without any distinctive characteristics. These storage mechanisms cannot manage semantic information about the data. They lack semantic retrieval, search and browsing of the stored content. To enhance non-semantic CMSes with advanced semantic features, the semantics within the CMS itself and additional semantic information related to the actual managed content should also be taken into account. However, extracting implicit knowledge from legacy CMSes, lifting it to a semantic content management system environment and providing semantic operations on the content is a challenging task that involves adopting several recent advances in information extraction (IE), information retrieval (IR) and the Semantic Web. In this study, we propose an integrative approach including automatic lifting of content from legacy systems, automatic annotation of data with information retrieved from the Linked Open Data (LOD) cloud, and several semantic operations on the content in terms of storage and search. We use a simple RDF path language to create custom semantic indexes and to filter annotations obtained from the LOD cloud in a way that is suitable for specific use cases. Filtered annotations are materialized along with the actual content of the document in dedicated indexes. This semantic indexing infrastructure allows semantically meaningful search facilities to be built on top of it. We realize our approach in the scope of the Apache Stanbol project, which is a subproject developed in the scope of the IKS project, by focusing on its document storage and retrieval parts.
We evaluate our approach in the healthcare domain with different domain ontologies (SNOMED/CT, ART, RXNORM), in addition to DBpedia, as parts of the LOD cloud that are used to annotate documents and content obtained from different health portals. QA Computer Software 76.75-76.765
5	Combining Social Network and Semantic Content Analysis to Improve Knowledge Translation in Online Communities of Practice Stewart, Samuel Alan 11 December 2013 Establishing online communities of practice is an important part of the knowledge translation process in the modern healthcare system, but these online communities are a new kind of entity, inherently different from traditional communities of practice that depend on existing social structures. The objective of this thesis is to combine communication analysis and content analysis to delve deeper into the communications within an online community, in order to determine how online communities exist and how that information can be leveraged to improve online knowledge translation. Using a novel approach, this project will map the contents of online conversations to a structured medical lexicon (MeSH), and then use the inherent relationships of that lexicon to calculate term, user and thread similarities within an online community. These similarities, combined with connection analysis results, will provide a much deeper understanding of how online communities function. The methods developed here will then be tested on two separate mailing lists: the Pediatric Pain Mailing List (PPML) and SURGINET, a mailing list of general surgeons.
6	Implication relative des traits de haut niveau et de bas niveau des stimuli dans la catégorisation, chez l'homme et le singe / Relative contribution of low level and high level features of stimuli in categorization in humans and monkeys Collet, Anne-Claire 12 February 2016 In this thesis, we explored the relative contributions of high level and low level features of stimuli used in object categorization tasks. This work consists of three studies in humans and monkeys. The originality of this thesis lies in the construction of the stimuli. Our first study aimed to characterize the neural correlates of image recognition in monkeys, using ECoG recordings. For that purpose we developed a categorization task using the SWIFT technique (created by Roger Koenig and Rufin VanRullen). Stimuli were visual sequences in which object contours (semantic content, high level feature) were cyclically modulated while luminance, contrasts and spatial frequencies (low level features) remained stable. By analyzing evoked potentials, we brought to light a late electrophysiological activity, in an "all or none" fashion, specifically related to target recognition in monkeys. But because objects are rarely isolated in real conditions, we explored in a second study the contextual congruency effect in a visual categorization task in humans and monkeys.
We compared the contribution of the Fourier transform amplitude spectrum to this congruency effect in both species. We found a divergence in strategy, with monkeys appearing more sensitive to the low level features of stimuli than humans. Finally, in the last study, we tried to quantify the multisensory semantic congruency effect during an audiovisual categorization task in humans. In that experiment, we equalized as many low level features as possible in both sensory modalities, which were always stimulated jointly. In the visual domain, we again used the SWIFT technique, whereas in the auditory domain we used a snippet randomization technique. We found a large multisensory gain in congruent trials (i.e. image and sound related to the same object), specifically linked to the semantic content of the stimuli. This thesis offers new perspectives both for comparative cognition between human and non-human primates and for the importance of controlling the physical features of stimuli used in object recognition tasks. Categorization Low level features Semantic content Congruency effect
7	Rigid Designation, the Modal Argument, and the Nominal Description Theory Isenberg, Jillian January 2005 In this thesis, I describe and evaluate two recent accounts of naming. These accounts are motivated by Kripke's response to Russell's Description Theory of Names (DTN). Particularly, I consider Kripke's Modal Argument (MA) and various arguments that have been given against it, as well as Kripke's responses to these arguments. Further, I outline a version of MA that has recently been presented by Scott Soames, and consider how he responds to the criticisms that the argument faces. In order to evaluate the claim that MA is decisive against all description theories, I outline the Nominal Description Theory (NDT) put forth by Kent Bach and consider whether it constitutes a principled response to MA. I do so by exploring how Bach both responds to Kripke's arguments against descriptivism and highlights the problems with rigid designation as a purely semantic thesis. Finally, I consider the relative merits of the accounts put forth by Bach and Soames. Upon doing so, I argue that MA is not as decisive against description theories as it has long been thought to be. In fact, NDT seems to provide a better account of our uses of proper names than the rigid designation thesis as presented by Kripke and Soames. Philosophy Philosophy of Language Description Theory of Names Nominal Description Theory rigid designation Modal Argument Saul Kripke Scott Soames Kent Bach proper names meaning reference semantic content semantics pragmatics
9	An Ontology-driven Video Annotation And Retrieval System Demirdizen, Goncagul 01 October 2010 In this thesis, a system called the Ontology-Driven Video Annotation and Retrieval System (OntoVARS) is developed to provide a video management system for ontology-driven semantic content annotation and querying. The proposed system is based on the MPEG-7 ontology, which provides interoperability and a common communication platform with other MPEG-7 ontology compatible systems. The Rhizomik MPEG-7 ontology is used as the core ontology, and domain-specific ontologies are integrated into the core ontology in order to provide ontology-based video content annotation and querying capabilities to the user. The proposed system supports content-based annotation and spatio-temporal data modeling in video databases by using the domain ontology concepts. Moreover, the system enables ontology-driven query formulation and processing according to the domain ontology instances and concepts. In the developed system, ontology-driven concept querying, spatio-temporal querying, region-based querying and time-based querying are provided as simple query types. Besides these simple query types, compound queries are also generated by combining simple queries with the "(", ")", "AND" and "OR" operators. For all these query types, the system supports both general and video-specific query processing. By this means, the user is able to pose queries on all videos in the video databases as well as on the details of a specific video of interest.
10	Samband mellan fonetiska aspekter och bedömningar av känslor i barnriktat tal / Relationships between phonetic aspects and ratings of affects in infant-directed speech Karlsson, Denise January 2018 Infant-directed speech (IDS) is a special way for adults to talk to children, characterized among other things by certain affects and intentions often being expressed more strongly than in adult-directed speech (ADS). This study examined the subjectively rated affects in infant-directed speech and their correlations with acoustic parameters. Men and women rated affects in 25-second utterances of infant-directed speech by mothers and fathers speaking to their infants aged 3, 6, 9 and 12 months. The mothers' utterances were in both Swedish and Australian English, while the fathers' utterances were only in Swedish. The affects that were rated were positive/negative affect, express affection, soothe/calm, encourage attention and direct behaviour. The acoustic parameters that were correlated with the ratings were mean fundamental frequency, range of fundamental frequency and means of the first and the second formant. The ratings of the utterances on the positive/negative scale were compared with ratings of the same utterances after low-pass filtering (to 400 Hz), which were used in a different study. The ratings of positive/negative affect were also compared between the utterances of the two genders and the two languages. The result was that the ratings did not differ significantly between the filtered and unfiltered utterances. The correlation of rated affects and acoustic parameters indicates that most affects are rated higher when the fundamental frequency is higher, and the range of the fundamental frequency also appears to have some bearing on the ratings. The first formant did not correlate with any affects, but the second formant correlated with express affection. The ratings of the Australian English and the Swedish utterances did not differ significantly, despite differences in the acoustic parameters, nor did the ratings of the mothers' and the fathers' utterances.
Together the results indicate that mainly the height of the mean fundamental frequency and the width of its range are relevant to which affects are perceived. Acoustic parameters affects gender differences infant directed speech semantic content General Language Studies and Linguistics

Search results