Global ETD Search

1	Improving the relevance of search results via search-term disambiguation and ontological filtering
Zhu, Dengya, January 2007

With the exponential growth of the Web and the inherent polysemy and synonymy of natural languages, search engines face many challenges: information overload, mismatched search results, missing relevant documents, poorly organized search results, and clustering engines whose groupings do not match the user's mental model. To address these issues, much effort has been devoted to improving the relevance of search results, including different information retrieval (IR) models, information categorization and clustering, personalization, the Semantic Web, and ontology-based IR. The major focus of this study is to dynamically reorganize Web search results under a socially constructed hierarchical knowledge structure, so that information seekers can more easily access and manipulate the retrieved results, and consequently to improve the relevance of search results.

To achieve this goal, a special search-browser was developed and its retrieval effectiveness evaluated. The hierarchical structure of the Open Directory Project (ODP) serves as the socially constructed knowledge structure and is represented by the Java Tree component. The Yahoo! Search Web Services API is used to obtain search results directly from the Yahoo! search engine databases. The Lucene text search engine calculates the similarity between each returned search result and the semantic characteristics of each ODP category, and a Majority Voting algorithm then assigns the search results to the corresponding ODP categories. When a user selects a category of interest, only the search results categorized under it are presented, and the quality of the search results is consequently improved.

Experiments demonstrate that the proposed approach can improve the precision of Yahoo! search results at the 11 standard recall levels from an average of 41.7 per cent to 65.2 per cent, an improvement of 23.5 percentage points. This conclusion is verified by comparing P@5 and P@10 for the Yahoo! search results with those for the categorized results of the search-browser: P@5 improves by 38.3 percentage points (from 46.7 per cent to 85 per cent) and P@10 by 28 percentage points (from 42 per cent to 70 per cent). The experiment is well designed and controlled. To minimize the subjectivity of relevance judgments, five judges (experts) were asked to make their judgments independently, and the final relevance judgment is a combination of the five judges' judgments. The judges were given only the search-terms, the information needs, and the 50 search results returned by the Yahoo! Search Web Services API; no categorization information was provided.

The first contribution of this research is the use of an extracted category-document to represent the semantic characteristics of each ODP category. A category-document is composed of the topic of the category, the description of the category, and the titles and brief descriptions of the Web pages submitted under that category. Experimental results demonstrate that the category-documents can represent the semantic characteristics of the ODP in most cases. Furthermore, the extracted category-documents can serve as training data for machine learning algorithms; such data would otherwise demand much human labor to create if the learning algorithm is to be properly trained. The second contribution is the introduction of two new concepts, the relevance judgment convergent degree and the relevance judgment divergent degree, which measure how well different judges agree with each other when asked to judge the relevance of a list of search results. When the relevance judgment convergent degree of a search-term is high, an IR algorithm should obtain a higher precision as well. Conversely, if the convergent degree is low, or the divergent degree is high, it is questionable whether the data should be used to evaluate the IR algorithm. This intuition is supported by the experiment of this research. The last contribution is that the developed search-browser is, to the best of my knowledge, the first IR system (IRS) to use the ODP hierarchical structure to categorize and filter search results.
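
The abstract above reports precision at the 11 standard recall levels together with P@5 and P@10. As a minimal sketch of how these standard IR measures are usually computed (this is not the thesis's own evaluation code; the function names and the toy relevance list are assumptions made for illustration):

```python
from typing import List, Sequence


def precision_at_k(relevant: Sequence[bool], k: int) -> float:
    """Fraction of the top-k ranked results that were judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(relevant[:k]) / k


def eleven_point_interpolated_precision(
    relevant: Sequence[bool], total_relevant: int
) -> List[float]:
    """Interpolated precision at recall levels 0.0, 0.1, ..., 1.0.

    `relevant[i]` is the judgment for the result at rank i + 1, and
    `total_relevant` is the number of relevant documents for the query.
    """
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    points = []  # (recall, precision) at each rank holding a relevant result
    hits = 0
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            hits += 1
            points.append((hits / total_relevant, hits / rank))
    levels = [i / 10 for i in range(11)]
    # Interpolated precision at a level is the best precision observed
    # at any recall greater than or equal to that level.
    return [
        max((p for r, p in points if r >= level - 1e-12), default=0.0)
        for level in levels
    ]


# Hypothetical judgments for a ranked list of five results.
judgments = [True, False, True, False, False]
curve = eleven_point_interpolated_precision(judgments, total_relevant=2)
average_precision_11pt = sum(curve) / len(curve)
```

Averaging the eleven interpolated values per query, and then across queries, gives a single figure of the kind the abstract quotes for the Yahoo! results and for the categorized results.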
2	Um sistema inteligente baseado em ontologia para apoio ao esclarecimento de dúvida [An ontology-based intelligent system to support the clarification of doubts]
Amorim, Marta Talitha Carvalho Freire de, 31 August 2012

When people want to learn a concept, the most common approach is to use a search engine such as Google, Yahoo, or Bing. A natural-language query is submitted to the search tool, which returns a large number of pages related to the concept. The returned pages are usually listed and ranked mainly by keyword matching rather than by the interpretation and relevance of the terms in the query. The user must then read many pages and select the one most appropriate to their needs. This takes time and draws the learner's focus away from their goal. Intelligent systems that support the clarification of doubts aim to solve this problem by presenting the most accurate answers to questions or sentences posed in natural language. Examples of such systems include question answering systems and intelligent help desks. This work adopts an architecture for a question answering system based on three steps: question analysis, answer selection and extraction, and answer generation. One merit of this architecture is that it combines complementary techniques, namely ontologies, information retrieval techniques, and a knowledge base written in AIML, to extract the answer quickly. The focus of this work is answering English WH-questions (What, Who, When, Where, Which).

Keywords: Question answering system; Ontology and information retrieval
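
The three-step question answering architecture described above (question analysis, answer selection and extraction, answer generation) can be illustrated with a toy pipeline. This is only a sketch under strong assumptions: the thesis relies on ontologies, information retrieval techniques, and an AIML knowledge base, whereas here a plain dictionary stands in for all three, and every name below is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical mapping from WH-word to the expected kind of answer.
WH_TYPES = {
    "what": "definition",
    "who": "person",
    "when": "time",
    "where": "place",
    "which": "choice",
}
STOP_WORDS = {"is", "are", "was", "were", "do", "does", "did", "the", "a", "an"}

KnowledgeBase = Dict[Tuple[str, str], str]


@dataclass
class AnalyzedQuestion:
    wh_word: str
    answer_type: str
    topic: str


def analyze_question(question: str) -> Optional[AnalyzedQuestion]:
    """Step 1: find the WH-word and reduce the rest to a topic string."""
    words = question.strip().rstrip("?").lower().split()
    if not words or words[0] not in WH_TYPES:
        return None
    topic = " ".join(w for w in words[1:] if w not in STOP_WORDS)
    return AnalyzedQuestion(words[0], WH_TYPES[words[0]], topic)


def select_answer(q: AnalyzedQuestion, kb: KnowledgeBase) -> Optional[str]:
    """Step 2: look up a stored fact (stand-in for ontology and IR lookup)."""
    return kb.get((q.answer_type, q.topic))


def generate_answer(q: AnalyzedQuestion, fact: Optional[str]) -> str:
    """Step 3: wrap the extracted fact in a short reply."""
    if fact is None:
        return "Sorry, I do not know."
    return f"{q.topic.capitalize()}: {fact}"


def answer(question: str, kb: KnowledgeBase) -> str:
    analyzed = analyze_question(question)
    if analyzed is None:
        return "Only WH-questions are supported."
    return generate_answer(analyzed, select_answer(analyzed, kb))


# Hypothetical one-entry knowledge base.
kb = {("definition", "ontology"): "a formal description of concepts and their relations"}
reply = answer("What is an ontology?", kb)
```

Keeping the three steps as separate functions mirrors the architecture in the abstract: each stage could be swapped for a richer component without changing the others.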
