Global ETD Search

251	A presença das literaturas portuguesa e africana de língua portuguesa no Suplemento Literário Minas Gerais (1966/1988) : indexação, coletânea de textos e banco de dados / Camargos, Léia Patrícia. January 2004 (has links) Orientador: Rosane Gazolla Alves Feitosa / Banca: Tania Celestino de Macêdo / Banca: Álvaro Santos Simões Junior / Resumo: Indexação de textos de crítica e de criação literária das literaturas portuguesa e africanas de língua portuguesa publicadas no Suplemento Literário Minas Gerais (1966-1988), com o objetivo de: a) resgatar a memória das referidas literaturas; b) traçar o percurso do periódico Suplemento Literário Minas Gerais; c) indexar os textos das literaturas mencionadas; d) elaborar uma coletânea de textos integrais (impressa) de crítica e de criação literária com os textos referentes ao item c; e) criar um Banco de Dados informatizado (coletânea de textos integrais digitalizados, em formato PDF, com possibilidade de acesso por meio de fichas catalográficas) com os textos do item d. Por meio do contato com as fontes primárias, procedeu-se à indexação dos textos referentes às literaturas acima, tendo sido estes organizados em fichas catalográficas e em índices remissivos, em formato de quadros,observando-se os itens: cronologia de publicação, colaboradores, escritores e frequência. O produto da pesquisa democratizará e disponibilizará o acesso a periódicos brasileiros e a um número considerável de textos integrais digitalizados das literaturas portuguesa e africanas de língua portuguesa. 
/ Abstract: This study indexes the critical and literary texts of Portuguese literature and of African literatures in Portuguese published in the newspaper Suplemento Literário Minas Gerais (Literary Supplement Minas Gerais) between 1966 and 1988, with the aims of: a) preserving the memory of these literatures; b) tracing the history of the Brazilian periodical Suplemento Literário Minas Gerais; c) indexing the texts of these literatures; d) compiling a printed collection of the full critical and literary texts referred to in item c; e) creating a database (the collected texts digitized in full, in PDF format, accessible through catalogue cards) of the texts in item d. Working from the primary sources, the texts of Portuguese literature and of African literatures in Portuguese were indexed and organized into catalogue cards and cross-reference indexes, in table format, covering the following items: publication chronology, contributors, critical articles, literary articles, writers and literary texts. The final products of the research, the database and the collected texts, will democratize access to a Brazilian periodical, the Suplemento Literário Minas Gerais, and to a large number of digitized full texts of Portuguese literature and of African literatures in Portuguese. / Mestre Periódicos brasileiros. Literatura portuguesa. Literatura africana. Portuguese literature. Indexation.
252	O Fundo de Garantia do Tempo de Serviço (FGTS) e o desenvolvimento brasileiro - propostas legislativas em face da Ação Declaratória de Inconstitucionalidade (ADI) 5090/DF / Guarantee Fund for Time os Service (FGTS) and Brazilian development - Legislative proposals in the face of Direct Action of Unconstitutionality (ADI) 5090/DF Francisco Sergio Nunes 09 March 2017 (has links) Esta dissertação trata do nascimento e evolução do Fundo de Garantia do Tempo de Serviço (FGTS), seu momento histórico, sua evolução, o tratamento constitucional dado ao instituto na Constituição de 1988 e a sua efetividade através da lei 8.036/90. Analisa o tema da proteção do trabalho, demissão sem justa causa, no mundo contemporâneo, com passagens por vários países dentro de um contexto de plena globalização. A proteção do emprego no Brasil como conhecemos, através de um Fundo com natureza jurídica híbrida, com função de direito social do trabalhador no momento de sua despedida e a aplicação dos recursos das contas vinculadas em programas habitacionais, de saneamento ambiental e infraestrutura tornam o FGTS um Fundo de natureza única no mundo todo. Os investimentos realizados com recursos do FGTS são imprescindíveis para o desenvolvimento da economia brasileira, com reflexos diretos na geração de empregos e na melhoria do bem estar social, não só dos trabalhadores filiados ao sistema FGTS, mas a toda população brasileira. A Ação Direta de Inconstitucionalidade 5090/DF trata da inconstitucionalidade do índice de correção do FGTS, a Taxa Referencial (TR), informando que o índice não é apto a representar o fenômeno inflacionário e por isso deve ser considerada sua inconstitucionalidade, por ferir o direito de propriedade (art.5º, XXII, CF), a moralidade administrativa (art. 37, X) e o próprio Fundo de Garantia do Tempo de Serviço (art. 7º, III). 
A proposta legislativa que encerra o trabalho deve levar em consideração a natureza jurídica polivalente do FGTS, atendendo tanto quanto possível aos interesses dos cotistas do Fundo e ao mesmo tempo os tomadores de empréstimos com recursos do FGTS, não quebrando o equilíbrio econômico financeiro do Fundo. / This dissertation deals with the origin and evolution of the Guarantee Fund for Time of Service (FGTS), its historical moment, its treatment under the Brazilian Constitution of 1988 and its implementation through Law 8.036/90. It analyses employment protection and dismissal without just cause in the contemporary world, surveying the law of several countries in the context of globalization. Brazil's employment protection policy, which relies on a fund of hybrid legal nature that both serves as a social right of the dismissed worker and invests the resources of the linked accounts in housing, environmental sanitation and infrastructure programs, makes the FGTS a fund with unique characteristics. Investments made with FGTS resources are indispensable to the development of the Brazilian economy, resulting in job creation and improved welfare for the whole Brazilian population. Direct Action of Unconstitutionality 5090/DF challenges the constitutionality of the FGTS correction index, the Referential Rate (TR), alleging that the index does not reflect inflation and should therefore be declared unconstitutional for violating the right to property (art. 5º, XXII, CF), administrative morality (art. 37, X), and the Guarantee Fund for Time of Service itself (art. 7º, III). The legislative proposal that closes this dissertation takes into account the polyvalent legal nature of the FGTS, serving the interests of the fund's quota holders and of the borrowers of FGTS resources while preserving the fund's economic and financial balance. 
Ação Direta de Inconstitucionalidade Correção monetária Desenvolvimento FGTS Judicialização Natureza jurídica TR Development Direct Action of Unconstitutionality FGTS Indexation Judicialization Juridical nature TR
253	利用企業投資指標建構投資組合 - 以台灣科技業為例 / Portfolio Construction Using Corporate Investment Metrics - An Empirical Study on Taiwan Technology Sector 吳永丞, Wu, Yung Cheng Unknown Date (has links) 本研究以985筆台灣科技業公司為樣本，並且使用企業投資指標作為指數加權基礎，探討以有形和無形資產投資規模進行基本面指數化的績效表現與可行性。我們發現即使在考慮了價值風險和規模風險之後，以研究發展費用相關指標建構的基本面指數仍可以產生超額報酬。此外，研究結果顯示部分的基本面指數具有市場擇時能力，能避免投資組合績效受到價格不效率的影響。在對樣本進行流動性的篩選以及考慮投資組合的交易成本之後，我們仍得到一樣的結果。 / We use a sample of 985 companies in Taiwan's technology industry to examine the performance and feasibility of fundamental indices weighted by corporate investment metrics (covering both tangible and intangible investment). We find that fundamental indices constructed from R&D-expenditure-related metrics still generate excess returns (significant Fama-French alpha) after controlling for value and size risk. In addition, the evidence shows that some of the fundamental indices have market-timing ability, which keeps portfolio performance from being dragged down by price inefficiency. We reach the same conclusions after screening out companies with low liquidity and adjusting for transaction costs. 基本面指數化 企業投資 投資組合績效 Fundamental indexation Corporate investment Portfolio performance
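The weighting idea in record 253 (index weights set by a corporate investment metric rather than by market capitalization) can be illustrated with a minimal sketch. The company names, the R&D figures, and the helper `fundamental_weights` are hypothetical, and excluding non-positive values is an assumption; the abstract does not describe the study's actual construction rules.

```python
def fundamental_weights(metric: dict[str, float]) -> dict[str, float]:
    """Weight each company in proportion to a fundamental metric.

    Companies with a non-positive metric are excluded (an assumption
    made for this sketch, not a rule taken from the thesis).
    """
    positive = {name: value for name, value in metric.items() if value > 0}
    total = sum(positive.values())
    if total == 0:
        raise ValueError("no company has a positive metric value")
    return {name: value / total for name, value in positive.items()}


# Hypothetical R&D expenditure per company, in arbitrary currency units.
rd_spend = {"A": 50.0, "B": 30.0, "C": 20.0}
weights = fundamental_weights(rd_spend)

for name, weight in weights.items():
    print(f"{name}: {weight:.2f}")
```

Because the weights depend only on the chosen metric, a price run-up in one stock does not by itself increase its share of the portfolio, which is the property the abstract links to price inefficiency.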
254	Indoor location estimation using a wearable camera with application to the monitoring of persons at home / Localisation à partir de caméra vidéo portée Dovgalecs, Vladislavs 05 December 2011 (has links) L’indexation par le contenu de lifelogs issus de capteurs portées a émergé comme un enjeu à forte valeur ajoutée permettant l’exploitation de ces nouveaux types de donnés. Rendu plus accessible par la récente disponibilité de dispositifs miniaturisés d’enregistrement, les besoins pour l’extraction automatique d’informations pertinents générées par autres applications, la localisation en environnement intérieur est un problème difficile à l’analyse de telles données.Beaucoup des solutions existantes pour la localisation fonctionnent insuffisamment bien ou nécessitent une intervention important à l’intérieur de bâtiment. Dans cette thèse, nous abordons le problème de la localisation topologique à partir de séquences vidéo issues d’une camera portée en utilisant une approche purement visuelle. Ce travail complète d’extraction des descripteurs visuels de bas niveaux jusqu’à l’estimation finale de la localisation à l’aide d’algorithmes automatiques.Dans ce cadre, les contributions principales de ce travail ont été faites pour l’exploitation efficace des informations apportées par descripteurs visuels multiples, par les images non étiquetées et par la continuité temporelle de la vidéo. Ainsi, la fusion précoce et la fusion tardive des données visuelles ont été examinées et l’avantage apporté par la complémentarité des descripteurs visuels a été mis en évidence sur le problème de la localisation. 
En raison de la difficulté à obtenir des données étiquetées en quantités suffisantes, l’ensemble des données a été exploité ; d’une part les approches de réduction de dimensionnalité non-linéaire ont été appliquées, afin d’améliorer la taille des données à traiter et la complexité associée ; d’autre part des approches semi-supervisées ont été étudiées pour utiliser l’information supplémentaire apportée par les images non étiquetées lors de la classification. Ces éléments ont été analysés séparément et ont été mis en œuvre ensemble sous la forme d’une nouvelle méthode par co-apprentissage temporel. Finalement nous avons également exploré la question de l’invariance des descripteurs, en proposant l’utilisation d’un apprentissage invariant à la transformation spatiale, comme une autre réponse possible à un manque de données annotées et à la variabilité visuelle. Ces méthodes ont été évaluées sur des séquences vidéo en environnement contrôlé accessibles publiquement pour évaluer le gain spécifique de chaque contribution. Ce travail a également été appliqué dans le cadre du projet IMMED, qui concerne l’observation et l’indexation d’activités de la vie quotidienne dans un objectif d’aide au diagnostic médical, à l’aide d’une caméra vidéo portée. Nous avons ainsi pu mettre en œuvre le dispositif d’acquisition vidéo portée, et montrer le potentiel de notre approche pour l’estimation de la localisation topologique sur un corpus présentant des conditions difficiles représentatives des données réelles. / Visual lifelog indexing by content has emerged as a high-reward application. Enabled by the recent availability of miniaturized recording devices, the demand for automatic extraction of relevant information from content generated by wearable sensors has grown. Among many other applications, indoor localization is one challenging problem to be addressed. Many standard solutions perform unreliably in indoor conditions or require significant intervention. 
In this thesis we address topological localization from the perspective of wearable video camera sensors, using an image-based approach. The key contribution of this work is the development and the study of a location estimation system composed of diverse modules, which perform tasks ranging from low-level visual information extraction to final topological location estimation with the aid of automatic indexing algorithms. Within this framework, important contributions have been made by efficiently leveraging information brought by multiple visual features, unlabeled image data and the temporal continuity of the video. Early and late data fusion were considered, and shown to take advantage of the complementarities of multiple visual features describing the images. Due to the difficulty of obtaining annotated data in our context, semi-supervised approaches were investigated to use unlabeled data as an additional source of information, both for non-linear data-adaptive dimensionality reduction and for improving classification. We developed a time-aware co-training approach that combines late data fusion with the semi-supervised exploitation of both unlabeled data and time information. Finally, we proposed applying transformation-invariant learning to adapt non-invariant descriptors to our localization framework. The methods were tested on controlled, publicly available datasets to evaluate the gain of each contribution. This work has also been applied to the IMMED project, which deals with activity recognition and monitoring of daily living using a wearable camera. In this context, the developed framework was used to estimate location on the real-world IMMED project video corpus, which showed the potential of the approaches in such challenging conditions. 
Lifelogging Indexation de vidéo Suivi des activités Apprentissage semi-supervisé Apprentissage invariante Lifelogging Video indexing Indoors location estimation Activity monitoring Semi-supervised learning Invariant learning
255	Grammatical gender in New Guinea Svärd, Erik January 2015 (has links) The present study investigates the gender systems of 20 languages in the New Guinea region, an often overlooked area in typological research. The languages were classified with five criteria used by Di Garbo (2014) to classify gender systems of African languages. The results showed that the gender systems were diverse, although around half of the languages have two-gendered sex-based systems with semantic assignment, more than four gender-indexing targets, and no gender marking on nouns. The gender systems of New Guinea are remarkably representative of the world, although formal assignment is much less common. However, the gender systems of New Guinea and Africa are very different. The most significant difference is the prevalence of non-sex-based gender systems and gender marking on nouns in Africa, whereas the opposite is true in New Guinea. However, gender in Africa is also less diverse largely due to the numerous Bantu languages. Finally, four typologically rare characteristics were found in the sample: (1) size and shape as important criteria of gender assignment, with large/long being masculine and small/short feminine, (2) the presence of two separate nominal classification systems, (3) no gender distinctions in pronouns, and (4) verbs as the most common indexing target. / Denna studie undersöker genussystemen hos 20 språk i Nya Guinea-regionen, vilken ofta förbises i typologisk forskning. Språken klassificerades utifrån fem kriterier som användes av Di Garbo (2014) för att klassificera genussystem i Afrika. Resultaten visade att genussystemen var varierade, men ungefär hälften av språken har könsbaserade genussystem med två genus, semantisk genustilldelning, fler än fyra genusindex och ingen genusmarkering på substantiv. Genussystemen är anmärkningsvärt representativa för världen, men formell genustilldelning är mycket mindre vanlig. 
Jämfört med genussystemen i Afrika är dock Nya Guinea väldigt annorlunda. Den viktigaste skillnaden är den större utbredningen av icke-könsbaserade genussystem och genusmarkering på substantiv i Afrika, medan motsatsen gäller i Nya Guinea. Genus i Afrika är dock till stor del mindre varierat på grund av de talrika bantuspråken. Slutligen hittades fyra typologiskt sällsynta karaktärsdrag i urvalet: (1) storlek och form som viktiga kriterier för genustilldelning, där stort/långt är maskulint och litet/kort feminint, (2) närvaron av två separata nominalklassificeringssystem, (3) inga genusdistinktioner i pronomen och (4) verb som det vanligaste genusindexet. agreement grammatical gender indexation New Guinea Papuan languages typology grammatiskt genus indexering kongruens Nya Guinea papuanska språk typologi General Language Studies and Linguistics
256	Apprentissage automatique pour simplifier l’utilisation de banques d’images cardiaques / Machine Learning for Simplifying the Use of Cardiac Image Databases Margeta, Ján 14 December 2015 (has links) L'explosion récente de données d'imagerie cardiaque a été phénoménale. L'utilisation intelligente des grandes bases de données annotées pourrait constituer une aide précieuse au diagnostic et à la planification de thérapie. En plus des défis inhérents à la grande taille de ces banques de données, elles sont difficilement utilisables en l'état. Les données ne sont pas structurées, le contenu des images est variable et mal indexé, et les métadonnées ne sont pas standardisées. L'objectif de cette thèse est donc le traitement, l'analyse et l'interprétation automatique de ces bases de données afin de faciliter leur utilisation par les spécialistes de cardiologie. Dans ce but, la thèse explore les outils d'apprentissage automatique supervisé, ce qui aide à exploiter ces grandes quantités d'images cardiaques et trouver de meilleures représentations. Tout d'abord, la visualisation et l'interprétation d'images est améliorée en développant une méthode de reconnaissance automatique des plans d'acquisition couramment utilisés en imagerie cardiaque. La méthode se base sur l'apprentissage par forêts aléatoires et par réseaux de neurones à convolution, en utilisant des larges banques d'images, où des types de vues cardiaques sont préalablement établies. La thèse s'attache dans un deuxième temps au traitement automatique des images cardiaques, avec en perspective l'extraction d'indices cliniques pertinents. La segmentation des structures cardiaques est une étape clé de ce processus. A cet effet une méthode basée sur les forêts aléatoires qui exploite des attributs spatio-temporels originaux pour la segmentation automatique dans des images 3Det 3D+t est proposée. 
En troisième partie, l'apprentissage supervisé de sémantique cardiaque est enrichi grâce à une méthode de collecte en ligne d'annotations d'usagers. Enfin, la dernière partie utilise l'apprentissage automatique basé sur les forêts aléatoires pour cartographier des banques d'images cardiaques, tout en établissant les notions de distance et de voisinage d'images. Une application est proposée afin de retrouver dans une banque de données, les images les plus similaires à celle d'un nouveau patient. / The recent growth of data in cardiac databases has been phenomenal. Clever use of these databases could help find supporting evidence for better diagnosis and treatment planning. In addition to the challenges inherent to the large quantity of data, the databases are difficult to use in their current state. Data coming from multiple sources are often unstructured, the image content is variable and the metadata are not standardised. The objective of this thesis is therefore to simplify the use of large databases for cardiology specialists with automated image processing, analysis and interpretation tools. The proposed tools are largely based on supervised machine learning techniques, i.e. algorithms which can learn from large quantities of cardiac images with ground-truth annotations and which automatically find the best representations. First, the inconsistent metadata are cleaned, and the interpretation and visualisation of images are improved by automatically recognising commonly used cardiac magnetic resonance imaging views from image content. The method is based on decision forests and convolutional neural networks trained on a large image dataset. Second, the thesis explores ways to use machine learning for extraction of relevant clinical measures (e.g. volumes and masses) from 3D and 3D+t cardiac images. 
New spatio-temporal image features are designed and classification forests are trained to learn how to automatically segment the main cardiac structures (left ventricle and left atrium) from voxel-wise label maps. Third, a web interface is designed to collect pairwise image comparisons and to learn how to describe the hearts with semantic attributes (e.g. dilation, kineticity). In the last part of the thesis, a forest-based machine learning technique is used to map cardiac image databases and to establish distances and neighborhoods between images. One application is the retrieval of the images most similar to those of a new patient. L'indexation Recherche d'image par le contenu Analyse des images médicales Informatique décisionnelle IRM cardiaque Indexation Content-based image retrieval Medical image analysis Clinical decision support systems Cardiac MRI 004.3
257	FreeCore : un système d'indexation de résumés de document sur une Table de Hachage Distribuée (DHT) / FreeCore : an index system of summary of documents on an Distributed Hash Table (DHT) Ngom, Bassirou 13 July 2018 (has links) Cette thèse étudie la problématique de l’indexation et de la recherche dans les tables de hachage distribuées –Distributed Hash Table (DHT). Elle propose un système de stockage distribué des résumés de documents en se basant sur leur contenu. Concrètement, la thèse utilise les Filtre de Blooms (FBs) pour représenter les résumés de documents et propose une méthode efficace d’insertion et de récupération des documents représentés par des FBs dans un index distribué sur une DHT. Le stockage basé sur contenu présente un double avantage, il permet de regrouper les documents similaires afin de les retrouver plus rapidement et en même temps, il permet de retrouver les documents en faisant des recherches par mots-clés en utilisant un FB. Cependant, la résolution d’une requête par mots-clés représentée par un filtre de Bloom constitue une opération complexe, il faut un mécanisme de localisation des filtres de Bloom de la descendance qui représentent des documents stockés dans la DHT. Ainsi, la thèse propose dans un deuxième temps, deux index de filtres de Bloom distribués sur des DHTs. Le premier système d’index proposé combine les principes d’indexation basée sur contenu et de listes inversées et répond à la problématique liée à la grande quantité de données stockée au niveau des index basés sur contenu. En effet, avec l’utilisation des filtres de Bloom de grande longueur, notre solution permet de stocker les documents sur un plus grand nombre de serveurs et de les indexer en utilisant moins d’espace. Ensuite, la thèse propose un deuxième système d’index qui supporte efficacement le traitement des requêtes de sur-ensembles (des requêtes par mots-clés) en utilisant un arbre de préfixes. 
Cette dernière solution exploite la distribution des données et propose une fonction de répartition paramétrable permettant d’indexer les documents avec un arbre binaire équilibré. De cette manière, les documents sont répartis efficacement sur les serveurs d’indexation. En outre, la thèse propose dans la troisième solution, une méthode efficace de localisation des documents contenant un ensemble de mots-clés donnés. Comparé aux solutions de même catégorie, cette dernière solution permet d’effectuer des recherches de sur-ensembles en un moindre coût et constitue une base solide pour la recherche de sur-ensembles sur les systèmes d’index construits au-dessus des DHTs. Enfin, la thèse propose le prototype d’un système pair-à-pair pour l’indexation de contenus et la recherche par mots-clés. Ce prototype, prêt à être déployé dans un environnement réel, est expérimenté dans l’environnement de simulation peersim qui a permis de mesurer les performances théoriques des algorithmes développés tout au long de la thèse. / This thesis examines the problem of indexing and searching in distributed hash tables (DHTs). It provides a distributed system for storing document summaries based on their content. Concretely, the thesis uses Bloom filters (BFs) to represent document summaries and proposes an efficient method for inserting and retrieving documents represented by BFs in an index distributed over a DHT. Content-based storage has a dual advantage: it groups similar documents together so that they can be found more quickly, and at the same time it allows documents to be retrieved through keyword searches using a Bloom filter. However, processing a keyword query represented by a Bloom filter is a difficult operation and requires a mechanism to locate the Bloom filters that represent documents stored in the DHT. The thesis therefore goes on to propose two Bloom filter index schemes distributed over DHTs. 
The first proposed index system combines the principles of content-based indexing and inverted lists and addresses the issue of the large amount of data stored by content-based indexes. By using long Bloom filters, this solution makes it possible to store documents on a larger number of servers and to index them using less space. Next, the thesis proposes a second index system that efficiently supports the processing of superset queries (keyword queries) using a prefix tree. This solution exploits the distribution of the data and proposes a configurable distribution function that makes it possible to index documents with a balanced binary tree. In this way, documents are distributed efficiently across the indexing servers. In addition, as a third solution, the thesis proposes an efficient method for locating documents that contain a given set of keywords. Compared to solutions of the same category, this last solution makes it possible to perform superset searches at a lower cost and can be considered a solid foundation for superset query processing on index systems built on top of DHTs. Finally, the thesis proposes a prototype of a peer-to-peer system for indexing content and searching by keywords. This prototype, ready to be deployed in a real environment, was evaluated in the PeerSim simulation environment, which made it possible to measure the theoretical performance of the algorithms developed throughout the thesis. Table de hachage distribuée Indexation Recherche par mots-clés Filtres de Blooms Arbres de préfixe FreeCore Distributed Hash Table Indexing Keywords search Bloom filters Prefix tree FreeCore 025.3
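Record 257 describes representing document summaries as Bloom filters and answering a keyword query by locating stored filters that are supersets of the query filter. The containment test behind such a superset query can be sketched as follows. The filter length `M`, the number of hash positions `K`, the SHA-256-based position function, and the names `bloom` and `may_contain` are all assumptions made for illustration; this is not FreeCore's implementation, and the DHT routing and prefix tree are left out.

```python
import hashlib

M = 256  # filter length in bits (assumed for illustration)
K = 3    # bit positions set per keyword (assumed for illustration)


def positions(word: str) -> list[int]:
    """Derive K bit positions for a keyword from salted SHA-256 digests."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{word}".encode()).digest()[:8], "big") % M
        for i in range(K)
    ]


def bloom(words) -> int:
    """Build a Bloom filter, stored as a Python int used as a bit set."""
    bits = 0
    for word in words:
        for p in positions(word):
            bits |= 1 << p
    return bits


def may_contain(doc_filter: int, query_filter: int) -> bool:
    """True when every bit of the query is also set in the document filter.

    False positives are possible; false negatives are not.
    """
    return (doc_filter & query_filter) == query_filter


# Hypothetical document summaries.
docs = {
    "doc1": bloom(["dht", "bloom", "index"]),
    "doc2": bloom(["video", "camera"]),
}
query = bloom(["dht", "index"])
matches = [name for name, f in docs.items() if may_contain(f, query)]
print(matches)
```

A document whose keywords include all query keywords always passes the test, so a match is a candidate that still has to be checked against the real document. Scanning every filter, as this sketch does, is exactly the cost a distributed index would aim to avoid.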
258	Webová aplikace pro fulltextové vyhledávání nad PDF dokumenty / Web Application for Fulltext Search in PDF Documents Svoboda, Ondřej January 2012 (has links) This master's thesis describes the principles of full-text search engines and the design and implementation of a web application for referencing and full-text searching in PDF documents. It also contains an overview of, and comparison with, currently available reference management software. The possibilities for exporting bibliographic information in various citation styles and formats are discussed. The final application is written in the PHP scripting language and uses a MySQL database.
259	Contribution à la construction d’ontologies et à la recherche d’information : application au domaine médical / Contribution to ontology building and to semantic information retrieval : application to medical domain Drame, Khadim 10 December 2014 (has links) Ce travail vise à permettre un accès efficace à des informations pertinentes malgré le volume croissant des données disponibles au format électronique. Pour cela, nous avons étudié l’apport d’une ontologie au sein d’un système de recherche d'information (RI).Nous avons tout d’abord décrit une méthodologie de construction d’ontologies. Ainsi, nous avons proposé une méthode mixte combinant des techniques de traitement automatique des langues pour extraire des connaissances à partir de textes et la réutilisation de ressources sémantiques existantes pour l’étape de conceptualisation. Nous avons par ailleurs développé une méthode d’alignement de termes français-anglais pour l’enrichissement terminologique de l’ontologie. L’application de notre méthodologie a permis de créer une ontologie bilingue de la maladie d’Alzheimer.Ensuite, nous avons élaboré des algorithmes pour supporter la RI sémantique guidée par une ontologie. Les concepts issus d’une ontologie ont été utilisés pour décrire automatiquement les documents mais aussi pour reformuler les requêtes. Nous nous sommes intéressés à : 1) l’identification de concepts représentatifs dans des corpus, 2) leur désambiguïsation, 3), leur pondération selon le modèle vectoriel, adapté aux concepts et 4) l’expansion de requêtes. Ces propositions ont permis de mettre en œuvre un portail de RI sémantique dédié à la maladie d’Alzheimer. Par ailleurs, le contenu des documents à indexer n’étant pas toujours accessible dans leur ensemble, nous avons exploité des informations incomplètes pour déterminer les concepts pertinents permettant malgré tout de décrire les documents. 
Pour cela, nous avons proposé deux méthodes de classification de documents issus d’un large corpus, l’une basée sur l’algorithme des k plus proches voisins et l’autre sur l’analyse sémantique explicite. Ces méthodes ont été évaluées sur de larges collections de documents biomédicaux fournies lors d’un challenge international. / This work aims at providing efficient access to relevant information despite the increasing volume of digital data. To this end, we studied the benefit of using an ontology within an information retrieval (IR) system. We first described a methodology for constructing ontologies: we proposed a mixed method that combines natural language processing techniques for extracting knowledge from text with the reuse of existing semantic resources for the conceptualization step. We also developed a method for aligning French and English terms in order to enrich the terminology of the resulting ontology. Applying our methodology produced a bilingual ontology dedicated to Alzheimer’s disease. We then proposed algorithms to support ontology-based semantic IR, using concepts from an ontology to describe documents automatically and to reformulate queries. We were particularly interested in: 1) the extraction of concepts from texts, 2) the disambiguation of terms, 3) a vector-space weighting scheme adapted to concepts and 4) query expansion. These algorithms were used to implement a semantic portal about Alzheimer’s disease. Further, because the content of documents is not always fully available, we exploited incomplete information to identify the concepts that are relevant for indexing the whole content of documents. To this end, we proposed two classification methods: the first is based on the k-nearest-neighbors algorithm and the second on explicit semantic analysis. The two methods were evaluated on large standard collections of biomedical documents within an international challenge. 
Construction d’ontologie Réutilisation de RTO Recherche d’information Indexation sémantique Classification de documents biomédicaux Maladie d’Alzheimer Ontology construction TOR reuse Information retrieval Semantic indexing Biomedical document classification Alzheimer’s disease
260	Scalable location-temporal range query processing for structured peer-to-peer networks / Traitement de requêtes spatio-temporelles pour les réseaux pair-à-pair structurés Cortés, Rudyar 06 April 2017 (has links) La recherche et l'indexation de données en fonction d'une date ou d'une zone géographique permettent le partage et la découverte d'informations géolocalisées telles que l'on en trouve sur les réseaux sociaux comme Facebook, Flickr, ou Twitter. Cette réseau social connue sous le nom de Location Based Social Network (LBSN) s'applique à des millions d'utilisateurs qui partagent et envoient des requêtes ciblant des zones spatio-temporelles, permettant d'accéder à des données géolocalisées générées dans une zone géographique et dans un intervalle de temps donné. Un des principaux défis pour de telles applications est de fournir une architecture capable de traiter la multitude d'insertions et de requêtes spatio-temporelles générées par une grande quantité d'utilisateurs. A ces fins, les Tables de Hachage Distribué (DHT) et le paradigme Pair-à-Pair (P2P) sont autant de primitives qui forment la base pour les applications de grande envergure. Cependant, les DHTs sont mal adaptées aux requêtes ciblant des intervalles donnés; en effet, l'utilisation de fonctions de hachage sacrifie la localité des données au profit d'un meilleur équilibrage de la charge. Plusieurs solutions ajoutent le support de requêtes ciblant des ensembles aux DHTs. En revanche ces solutions ont tendance à générer un nombre de messages et une latence élevée pour des requêtes qui ciblent des intervalles. Cette thèse propose deux solutions à large échelle pour l'indexation des données géolocalisées. / Indexing and retrieving data by location and time allows people to share and explore massive geotagged datasets observed on social networks such as Facebook, Flickr, and Twitter. 
This scenario, known as a Location Based Social Network (LBSN), involves millions of users who share data and perform location-temporal range queries in order to retrieve geotagged data generated inside a given geographic area and time interval. A key challenge is to provide a scalable architecture that can handle insertions and location-temporal range queries from a large number of users. To achieve this, Distributed Hash Tables (DHTs) and the Peer-to-Peer (P2P) computing paradigm provide powerful building blocks for implementing large-scale applications. However, DHTs are ill-suited to range queries because the use of hash functions destroys data locality for the sake of load balance. Existing solutions that use a DHT as a building block do support range queries. Nonetheless, they do not target location-temporal range queries, and they exhibit poor performance in terms of query response time and message traffic. This thesis proposes two scalable solutions for indexing and retrieving geotagged data based on location and time. Scalabilité Indexation spatio-Temporelle Pair à pair Table de hachage distribuée Données géolocalisées Traitement des requêtes Scalability Location-temporal indexing Peer-to-Peer 004
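Record 260 states that hashing destroys data locality, which is why plain DHTs handle range queries poorly. The contrast can be sketched by comparing a hashed key with a locality-preserving one. Z-order (Morton) bit interleaving is used here only as one common locality-preserving encoding; the abstract does not say which encoding the thesis uses, and the grid coordinates, key widths and function names are hypothetical. Only two spatial dimensions are shown; a time bucket could be interleaved as a third.

```python
import hashlib


def hashed_key(value: int, bits: int = 16) -> int:
    """DHT-style key: a SHA-1 hash of the value, reduced to `bits` bits."""
    digest = hashlib.sha1(str(value).encode()).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


def interleave(x: int, y: int, bits: int = 8) -> int:
    """Z-order (Morton) key: interleave the low `bits` bits of x and y."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # x takes the even bit positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # y takes the odd bit positions
    return key


# The four cells of an aligned 2x2 block of a hypothetical grid.
cells = [(x, y) for y in range(2) for x in range(2)]

z_keys = sorted(interleave(x, y) for x, y in cells)
print(z_keys)  # -> [0, 1, 2, 3]: neighbouring cells get contiguous keys

# After hashing, the same four cells are typically scattered across the
# key space, so a range query can no longer be served by one key interval.
h_keys = sorted(hashed_key(interleave(x, y)) for x, y in cells)
print(h_keys)
```

With contiguous keys a range query maps to a small number of key intervals, but the load-balancing benefit of uniform hashing is lost, which is the trade-off the abstract points to.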
