Global ETD Search

21	Indexation sémantique des images et des vidéos par apprentissage actif / Semantic indexing of images and videos by active learning
Safadi, Bahjat, 17 September 2012
The general framework of this thesis is semantic indexing and information retrieval applied to multimedia documents. More specifically, we are interested in the semantic indexing of concepts in images and videos using active learning approaches, which we use to build annotated corpora. Throughout this thesis, we show that the main difficulties of this task are generally related to the semantic gap. They are also related to the class-imbalance problem in large-scale datasets, where most concepts are sparse. For corpus annotation, the main objective of active learning is to increase system performance while using as few labeled samples as possible, thereby minimizing the cost of labeling data (e.g. money and time). We contribute at several levels of multimedia indexing and propose three approaches that outperform state-of-the-art systems: i) the multi-learner (ML) approach, which overcomes the class-imbalance problem in large-scale datasets; ii) a re-ranking method that improves video indexing; iii) an evaluation of power-law normalization and PCA that shows their effectiveness in multimedia indexing. Furthermore, we propose the ALML approach, which combines the multi-learner with active learning, and an incremental method that speeds up ALML. We also propose the active cleaning approach, which tackles the quality of annotations. The proposed methods were validated through several experiments conducted and evaluated on large-scale collections of the well-known international benchmark TRECVID. Finally, we present our real-world annotation system based on active learning, which was used to carry out the annotation of the development set of the TRECVID 2011 campaign, and our participation in the semantic indexing task of that campaign, in which we ranked 3rd out of 19 participants.
Keywords: Multimedia indexing; Semantic indexing; Active learning
22	UNDERSTANDING AND IDENTIFYING LARGE-SCALE ADAPTIVE CHANGES FROM VERSION HISTORIES
Meqdadi, Omar Mohammed, 30 July 2013
No description available.
Keywords: Computer Science; Adaptive Maintenance; Commit; Software Engineering; Stereotype; Latent Semantic Indexing; Traceability; Version History; Topic Modeling
23	Sociological Applications of Topic Extraction Techniques: Two Case Studies
Zougris, Konstantinos, 08 1900
Limited research has been conducted on the applicability of topic extraction techniques in Sociology. Addressing modern methodological opportunities, and responding to skepticism about the absence of theoretical foundations supporting the use of text analytics, I argue that Latent Semantic Analysis (LSA), complemented by other text analysis and multivariate techniques, can constitute a unique hybrid method that facilitates the sociological interpretation of web-based textual data. To illustrate the applicability of the hybrid technique, I developed two case studies. The first case study belongs to the Sociology of media. It focuses on topic extraction and sentiment polarization among partisan texts posted on two major news sites. I find evidence of highly polarized opinions in comments posted on the Huffington Post and the Daily Caller. The most polarizing topic was associated with a commentator’s reference to hoodies in the context of the Trayvon Martin incident. My findings support contemporary research suggesting that media pundits frequently use tactics of outrage to provoke polarization of public opinion. The second case study contributes to the Sociology of knowledge. The hybrid method revealed evidence of topical divides and topical “bridges” in the intellectual landscape of British and American sociological journals. My findings confirm the theoretical assertions describing Sociology as a fractured field, and partially support the existence of more globalized topics in the discipline.
Keywords: text analytics; sociology of media; sociology of knowledge; hybrid methods; Latent semantic indexing; Content analysis (Communication); Mass media -- Sociological aspects; Knowledge, Sociology of
24	Applications of Linear Algebra to Information Retrieval
Vasireddy, Jhansi Lakshmi, 28 May 2009
Some of the theory of nonnegative matrices is first presented, with the Perron-Frobenius theorem highlighted. Some of the important linear algebraic methods of information retrieval are then surveyed. Latent Semantic Indexing (LSI), which uses the singular value decomposition, is discussed. The Hyper-Text Induced Topic Search (HITS) algorithm is considered next; here the power method for finding dominant eigenvectors is employed. Through the use of a theorem by Sinkhorn and Knopp, a modified HITS method is developed. Lastly, the PageRank algorithm is discussed. Numerical examples and MATLAB programs are also provided.
Keywords: PageRank; Power method; Hyper-Text Induced Topic Search; Perron-Frobenius theorem; Latent Semantic Indexing; Eigenvector; Nonnegative matrix; Eigenvalue; Mathematics
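The power method named in the abstract above can be illustrated with a minimal sketch. This is not taken from the thesis (whose programs are in MATLAB); the function name `power_method`, its parameters, and the sample matrix are assumptions chosen for illustration. The sketch assumes a nonnegative square matrix with a strictly dominant eigenvalue, the setting covered by the Perron-Frobenius theorem, and normalizes in the 1-norm so that the norm of the product estimates that eigenvalue.

```python
def power_method(matrix, iterations=200, tol=1e-12):
    """Approximate the dominant eigenvalue and eigenvector of a
    nonnegative square matrix (given as a list of rows).

    Assumption: the matrix has a strictly dominant eigenvalue, so the
    iteration converges. The eigenvector is scaled to sum to 1.
    """
    n = len(matrix)
    vector = [1.0 / n] * n  # uniform, strictly positive starting vector
    eigenvalue = 0.0
    for _ in range(iterations):
        # Multiply the matrix by the current vector.
        product = [sum(row[j] * vector[j] for j in range(n)) for row in matrix]
        norm = sum(abs(x) for x in product)
        if norm == 0.0:
            raise ValueError("matrix maps the iterate to the zero vector")
        new_vector = [x / norm for x in product]
        # For a nonnegative iterate that sums to 1, the 1-norm of the
        # product approaches the dominant eigenvalue.
        eigenvalue = norm
        change = max(abs(a - b) for a, b in zip(new_vector, vector))
        vector = new_vector
        if change < tol:
            break
    return eigenvalue, vector


# Hypothetical 2-page "web": column j holds the probabilities of moving
# from page j to each page, so every column sums to 1.
transition = [[0.5, 0.2],
              [0.5, 0.8]]
value, ranking = power_method(transition)
```

For a column-stochastic matrix such as `transition`, the dominant eigenvalue is 1 and the returned vector is the stationary distribution, which is the quantity PageRank-style methods rank pages by.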
25	Human-centered semantic retrieval in multimedia databases
Chen, Xin, January 2008
Thesis (Ph. D.)--University of Alabama at Birmingham, 2008. / Additional advisors: Barrett R. Bryant, Yuhua Song, Alan Sprague, Robert W. Thacker. Description based on contents viewed Oct. 8, 2008; title from PDF t.p. Includes bibliographical references (p. 172-183).
26	SCRIBE: a clustering approach to semantic information retrieval
Langley, Joseph R., January 2006
Thesis (M.S.) -- Mississippi State University. Department of Computer Science and Engineering. / Title from title screen. Includes bibliographical references.
27	Evaluating Semantic Internalization Among Users of an Online Review Platform
Zaras, Dimitrios, 08 1900
The present study draws on recent sociological literature that argues that the study of cognition and culture can benefit from theories of embodied cognition. The concept of semantic internalization is introduced and conceptualized as the ability to perceive and articulate the topics that are of most concern to a community as they are manifested in social discourse. Semantic internalization is partly an application of emotional intelligence in the context of community-level discourse. It is measured through the application of Latent Semantic Analysis. Furthermore, the study investigates whether this ability is related to an individual’s social capital and habitus. The analysis is based on data collected from the online review platform yelp.com.
Keywords: Semantic internalization; latent semantic analysis; cognitive sociology; social media; Cognition and culture; Latent semantic indexing; User-generated content; Social media; Emotional intelligence
28	Možnosti využití netradičních kvantitativních metod při předpovídání finančních krizí / Usage possibilities of nontraditional quantitative methods for financial crises prediction
Hájek, Petr, January 2007
The thesis is divided into three parts. The theoretical part outlines major crises of the past several hundred years, the typology of crises, financial market failures according to P. Krugman, generation models, price bubbles, the link between capital flows and the debt problem, contagion, crisis prevention, and crisis management. The second part describes the current state of research on financial crisis prediction, citing dozens of studies and then comparing their results. Attention is also paid to the definition of a financial crisis. The third part applies the Latent Semantic Indexing (LSI) method to the task of predicting financial crises. The tested hypothesis is that stock markets are able, within one quarter (64 stock market observations), to reflect future developments in monetary policy (over the following 128 observations). The dissertation confirmed this hypothesis on a sample of 39 countries over the period 1985-2007, based on an interpretation of the development of interest rates and of the exchange rate of the domestic currency against the USD. Although the LSI method, in its studied application to the stock market, managed to identify several crises to the exact day, it is better suited to specifying and analyzing fragile periods in which a crisis may occur than to predicting crises directly.
29	Analyse et interprétation de scènes visuelles par approches collaboratives / Analysis and interpretation of visual scenes through collaborative approaches / Analiza si interpretarea scenelor vizuale prin abordari colaborative
Strat, Sabin Tiberius, 04 December 2013
In recent years, the size of digital video collections has increased greatly. Efficient searching and browsing through such collections requires indexing according to meaningful terms, which brings us to the focus of this thesis: the automatic semantic indexing of videos. Within this topic, the Bag of Words (BoW) model, often employing SIFT or SURF features, has shown good performance, especially on static images. As our first contribution, we propose to improve the results of SIFT/SURF BoW descriptors on videos by pre-processing the videos with a model of the human retina, thereby making these descriptors more robust to video degradations and sensitive to spatio-temporal information. Our second contribution is a set of BoW descriptors based on trajectories. These provide additional motion information, leading to a richer description of the video. Our third contribution, motivated by the availability of complementary descriptors, is a late fusion approach that automatically determines how to combine a large set of descriptors and significantly increases the average precision of detected concepts. All the proposed approaches are validated on the TRECVid challenge datasets, which focus on visual concept detection in very large and uncontrolled multimedia content.
Keywords: Semantic indexing; Video; Bag of Words; SIFT; SURF; Retina; Spatio-temporal; Trajectories; Late fusion
30	Contribution à la construction d’ontologies et à la recherche d’information : application au domaine médical / Contribution to ontology building and to semantic information retrieval : application to medical domain
Drame, Khadim, 10 December 2014
This work aims to provide efficient access to relevant information amid the increasing volume of digital data. To this end, we studied the benefit of using an ontology to support an information retrieval (IR) system. We first described a methodology for constructing ontologies: we proposed a mixed method that combines natural language processing techniques for extracting knowledge from text with the reuse of existing semantic resources for the conceptualization step. We also developed a method for aligning French and English terms in order to enrich the terminology of the resulting ontology. Applying our methodology resulted in a bilingual ontology dedicated to Alzheimer’s disease. We then proposed algorithms to support ontology-based semantic IR, using concepts from an ontology to describe documents automatically and to reformulate queries. We were particularly interested in: 1) the extraction of concepts from texts, 2) the disambiguation of terms, 3) a vector-space weighting scheme adapted to concepts, and 4) query expansion. These algorithms were used to implement a semantic portal about Alzheimer’s disease. Further, because the content of documents is not always fully available, we exploited incomplete information to identify the concepts relevant for indexing the whole content of documents. To this end, we proposed two classification methods: the first is based on the k-nearest-neighbors algorithm and the second on explicit semantic analysis. The two methods were evaluated on large standard collections of biomedical documents within an international challenge.
Keywords: Ontology construction; TOR reuse; Information retrieval; Semantic indexing; Biomedical document classification; Alzheimer’s disease
