  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Estudo sobre o impacto da adição de vocabulários estruturados da área de ciências da saúde no Currículo Lattes / Study on the impact of adding structured health sciences vocabularies to the Currículo Lattes

Araújo, Charles Henrique de January 2016 (has links)
Searching the databases of institutions that hold large volumes of data increasingly requires more efficient retrieval processes. Problems of spelling, language, synonymy, abbreviation, and lack of standardization of terms, both in search arguments and in document indexing, directly affect the results. This study therefore aimed to evaluate the impact of adding structured Health Sciences vocabularies to the Lattes database (Currículo Lattes) on the retrieval of similar profiles of researchers in the Biological Sciences and Health Sciences, using data mining, query expansion, vector models, and a trigram phrase-matching algorithm. Keywords of published articles registered in the Currículo Lattes were cross-checked against the terms of the Medical Subject Headings (MeSH) and the Health Sciences Descriptors (DeCS), and query results obtained with the original keywords were compared with those obtained after adding the terms produced by query expansion. The results show that the methodology adopted in this study can qualitatively increase the set of retrieved profiles and may thus contribute to improving the information systems of the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq).
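The trigram matching mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: it assumes character trigrams compared with Jaccard overlap, and the function names, the padding scheme, and the threshold value are all hypothetical choices made for illustration.

```python
def trigrams(text: str) -> set:
    """Return the character trigrams of a lower-cased string.

    The string is padded with two leading spaces and one trailing space
    (an assumed convention) so that word boundaries contribute trigrams.
    """
    padded = f"  {text.lower().strip()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Jaccard overlap of the two trigram sets, a value in [0, 1]."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def expand_keyword(keyword: str, vocabulary: list, threshold: float = 0.4) -> list:
    """Return vocabulary terms whose similarity to the keyword meets the threshold."""
    return [t for t in vocabulary if trigram_similarity(keyword, t) >= threshold]
```

A misspelled keyword such as "diabetis" would be matched to the controlled term "diabetes" under this scheme, while an unrelated term such as "hypertension" shares no trigrams with it and is ignored.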
23

Cell assemblies para expansão de consultas / Cell assemblies for query expansion

Volpe, Isabel Cristina January 2011 (has links)
One of the main tasks in Information Retrieval is to match a user query to the documents that are relevant to it. This matching is challenging because in many cases the keywords the user chooses differ from the words the authors of the relevant documents have used. Over the years, many approaches have been proposed to deal with this problem. One of the most popular is query expansion, which adds related terms to the query with the goal of retrieving more relevant documents. In this work, we propose a new method in which a Cell Assembly model is applied to query expansion. Cell Assemblies are reverberating circuits of connected neurons whose activity can persist long after the initial stimulus has ceased. They learn through Hebbian learning rules and have been used to simulate the formation and use of human concepts. We adapted the Cell Assembly model to learn relationships between the terms in a document collection. These relationships are then used to augment the original queries with related terms. Our experiments on standard Information Retrieval test collections show that some queries significantly improved their results with the proposed technique.
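The idea of learning term relationships with a Hebbian rule can be sketched in a heavily simplified form. The code below is not the Cell Assembly model of the thesis (it has no neurons or firing dynamics); it only assumes that terms co-occurring in a document strengthen a shared weight, in the spirit of "fire together, wire together". All names and the learning rate are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def learn_associations(documents, rate=1.0):
    """Strengthen the weight between every pair of terms that co-occur in a document."""
    weights = defaultdict(lambda: defaultdict(float))
    for doc in documents:
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            weights[a][b] += rate
            weights[b][a] += rate
    return weights


def expand_query(query, weights, top_k=2):
    """Append the top_k terms most strongly associated with the query terms."""
    terms = query.lower().split()
    scores = defaultdict(float)
    for t in terms:
        for other, w in weights.get(t, {}).items():
            if other not in terms:
                scores[other] += w
    # Sort by descending weight, then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return terms + [term for term, _ in ranked[:top_k]]
```

A query term that never appears in the collection has no learned associations, so the query is returned unchanged in that case.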
24

Modelo fuzzy para recuperação de informação utilizando multiplas ontologias relacionadas / Fuzzy information retrieval model using multiple related ontologies

Leite, Maria Angelica de Andrade 13 August 2018 (has links)
Advisor: Ivan Luiz Marques Ricarte / Thesis (doctorate) - Universidade Estadual de Campinas, Faculdade de Engenharia Eletrica e de Computação / Previous issue date: 2009 / Abstract: With the growing popularity of the World Wide Web, more people have access to information, and the volume of that information keeps expanding over time. The information retrieval area faces a new challenge: searching information resources by the meaning of the information they contain. One way to retrieve information by its meaning is to use a knowledge base that encodes the concepts of a domain and their relationships. Ontologies are currently used to model knowledge bases. To deal with the imprecision and uncertainty present in the knowledge and in the information retrieval process, techniques from fuzzy set theory are employed. Preceding works encode the knowledge base using just one ontology. However, a document collection can deal with themes from different domains, expressed by distinct ontologies, that can be related. In this work, a way of organizing and representing knowledge in multiple related ontologies was investigated, and a new query expansion method was developed. The knowledge organization and the query expansion method were integrated in the fuzzy model for information retrieval using multiple related ontologies. The performance of the model was compared with another fuzzy model for information retrieval and with the Apache Lucene search engine. In both cases the proposed model improved the precision and recall measures. / Doctorate / Computer Engineering / Doctor of Electrical Engineering
25

Query Expansion Research and Application in Search Engine Based on Concepts Lattice

Cui, Jun January 2009 (has links)
Formal concept analysis is increasingly applied to query expansion and data mining problems. In this paper I analyze and compare current concept lattice construction algorithms and choose the iPred and Border algorithms to adapt for query expansion. After adapting these two concept lattice construction algorithms, I apply the four algorithms to a query expansion prototype system. The calculation times of the four algorithms are recorded and analyzed. The adapted algorithms perform well. Moreover, I find that the efficiency of concept lattice construction is not consistent with the results of complexity analysis. Instead, it depends heavily on the structure of the data set that serves as the data source of the concept lattice.
26

Rozšiřování dotazů pro vyhledávání medicínských informací / Query expansion for medical information retrieval

Bibyna, Feraena January 2015 (has links)
One of the challenges in medical information retrieval is the terminology gap between the documents (commonly written by medical professionals using medical jargon) and the queries (commonly composed by non-professionals using lay terms). In this thesis, we investigate the effect of query expansion using a domain-specific knowledge resource to address this challenge. We use the Unified Medical Language System (UMLS), a repository of biomedical vocabularies, and utilize two of its resources: the Metathesaurus and the Semantic Network. We use the query sets and document set provided by the CLEF eHealth organizers. The query sets, provided for the medical information retrieval shared task, represent two different use cases of medical information retrieval. We experiment with query expansion using synonymous terms and non-synonymous concepts, blind relevance feedback, field weighting, and linear interpolation of different systems.
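Linear interpolation of retrieval systems, the last technique named in the abstract, can be sketched as a weighted sum of per-document scores. The sketch assumes that the two systems' scores are already on a comparable scale and that a document missing from one run scores zero there; the function name and the weight `lam` are hypothetical, and the thesis may combine its systems differently.

```python
def interpolate(scores_a, scores_b, lam=0.5):
    """Combine two score dictionaries as lam * a + (1 - lam) * b and rank the result.

    A document absent from one system is treated as having score 0.0 there
    (an assumption of this sketch).
    """
    docs = set(scores_a) | set(scores_b)
    combined = {
        d: lam * scores_a.get(d, 0.0) + (1 - lam) * scores_b.get(d, 0.0)
        for d in docs
    }
    # Highest combined score first; ties broken by document id.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))
```

Setting `lam` to 1.0 or 0.0 reduces the ranking to that of a single system, which makes the weight easy to tune on held-out queries.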
27

Chinese-English cross-lingual information retrieval in biomedicine using ontology-based query expansion

Wang, Xinkai January 2011 (has links)
In this thesis, we propose a new approach to Chinese-English Biomedical cross-lingual information retrieval (CLIR) using query expansion based on the eCMeSH Tree, a Chinese-English ontology extended from the Chinese Medical Subject Headings (CMeSH) Tree. The CMeSH Tree is not designed for information retrieval (IR), since it only includes heading terms and has no term weighting scheme for these terms. Therefore, we design an algorithm, which employs a rule-based parsing technique combined with the C-value term extraction algorithm and a filtering technique based on mutual information, to extract Chinese synonyms for the corresponding heading terms. We also develop a term-weighting mechanism. Following the hierarchical structure of CMeSH, we extend the CMeSH Tree to the eCMeSH Tree with synonymous terms and their weights. We propose an algorithm to implement CLIR using the eCMeSH Tree terms to expand queries. In order to evaluate the retrieval improvements obtained from our approach, the results of the query expansion based on the eCMeSH Tree are individually compared with the results of the experiments of query expansion using the CMeSH Tree terms, query expansion using pseudo-relevance feedback, and document translation. We also evaluate the combinations of these three approaches. This study also investigates the factors which affect the CLIR performance, including a stemming algorithm, retrieval models, and word segmentation.
28

Automatisk synonymgenerering med Word2Vec for query expansion inom e-handel / Automatic synonym generation with Word2Vec for query expansion in e-commerce

Kojic, Kemal, Petersson, Emil January 2018 (has links)
In this thesis, we examine how well automatic synonym generation with the machine learning method Word2Vec, trained on a Google News data set of one hundred billion words, is suited to query expansion in e-commerce. We use product and event data from a well-known fashion company: synonyms are generated by different methods from search queries logged in the event data, and the resulting thesauri are used in later searches through query expansion. To answer the research questions, a quantitative analysis is performed first, on data such as matched purchases, product hits, no-hits, and search time. This data is produced by a search simulator that simulates logged events from user sessions in an e-commerce system. The generated thesauri are then filtered by removing synonyms linked to search queries that produced worse results in the simulation with synonyms than without. To validate the quantitative results, a qualitative analysis is also performed on the differences between the search results produced by the different methods, in which we examine which products the synonyms bring up in order to assess their relevance. Our tests show that a lower threshold leads to more product hits and fewer no-hits: the number of product hits increased by between 4% and 10%, and the number of no-hits was reduced by between 11% and 22%. Where a search query has been assigned good synonyms, product relevance improves because more relevant products appear in the search result. Where it has been assigned poorer synonyms, relevance suffers because some irrelevant products appear that the user probably does not want to see. In all tests using automatically generated synonyms, the majority of purchased products appear in the first half of the search result; however, the number of purchased products in the first position of the search result decreased in every case.
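The role of the threshold can be illustrated with a small sketch of similarity-based synonym selection. It assumes that the threshold is a minimum cosine similarity between word vectors, which the abstract does not state explicitly, and it uses hand-made two-dimensional toy vectors rather than a trained Word2Vec model. The function names and the vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def synonyms(word, vectors, threshold):
    """Return, in alphabetical order, the words whose similarity to `word` meets the threshold."""
    target = vectors[word]
    return sorted(
        w for w, vec in vectors.items()
        if w != word and cosine(target, vec) >= threshold
    )


# Toy vectors for illustration only; real embeddings have hundreds of dimensions.
toy_vectors = {
    "trousers": [1.0, 0.0],
    "pants": [0.9, 0.1],
    "jeans": [0.6, 0.8],
    "lamp": [0.0, 1.0],
}
```

With these toy vectors, a threshold of 0.9 admits only "pants" as a synonym of "trousers", while lowering it to 0.5 also admits "jeans". This mirrors the trade-off described above: a lower threshold yields more hits but risks less relevant expansions.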
29

Utilising semantic technologies for intelligent indexing and retrieval of digital images

Osman, T., Thakker, Dhaval, Schaefer, G. 15 October 2013 (has links)
The proliferation of digital media has led to a huge interest in classifying and indexing media objects for generic search and usage. In particular, we are witnessing a colossal growth in digital image repositories that are difficult to navigate using free-text search mechanisms, which often return inaccurate matches as they in principle rely on statistical analysis of query keyword recurrence in the image annotation or surrounding text. In this paper we present a semantically-enabled image annotation and retrieval engine that is designed to satisfy the requirements of the commercial image collections market in terms of both accuracy and efficiency of the retrieval process. Our search engine relies on methodically structured ontologies for image annotation, thus allowing for more intelligent reasoning about the image content and subsequently obtaining a more accurate set of results and a richer set of alternatives matching the original query. We also show how our well-analysed and designed domain ontology contributes to the implicit expansion of user queries as well as the exploitation of lexical databases for explicit semantic-based query expansion.
30

Automated Vocabulary Building for Characterizing and Forecasting Elections using Social Media Analytics

Mahendiran, Aravindan 12 February 2014 (has links)
Twitter has become a popular data source in the recent decade and garnered a significant amount of attention as a surrogate data source for many important forecasting problems. Strong correlations have been observed between Twitter indicators and real-world trends spanning elections, stock markets, book sales, and flu outbreaks. A key ingredient to all methods that use Twitter for forecasting is to agree on a domain-specific vocabulary to track the pertinent tweets, which is typically provided by subject matter experts (SMEs). The language used in Twitter drastically differs from other forms of online discourse, such as news articles and blogs. It constantly evolves over time as users adopt popular hashtags to express their opinions. Thus, the vocabulary used by forecasting algorithms needs to be dynamic in nature and should capture emerging trends over time. This thesis proposes a novel unsupervised learning algorithm that builds a dynamic vocabulary using Probabilistic Soft Logic (PSL), a framework for probabilistic reasoning over relational domains. Using eight presidential elections from Latin America, we show how our query expansion methodology improves the performance of traditional election forecasting algorithms. Through this approach we demonstrate how we can achieve close to a two-fold increase in the number of tweets retrieved for predictions and a 36.90% reduction in prediction error. / Master of Science
