Global ETD Search

411	Recuperação de informação baseada em ontologia: uma proposta utilizando o modelo vetorial / Ontology based information retrieval: a proposal using the vector space model
Janaite Neto, Jorge [UNESP], 30 May 2018
No funding was received.
Information retrieval works by comparing the representations of the documents in a collection with the representation of the user's information need. A document is retrieved when its representation matches, fully or partially, the representation of that need. The retrieval process can be seen as a linguistic problem in which both the information content of the documents and the user's information need are represented by sets of terms. Its efficiency depends on the quality of the document representations and of the terms the user chooses to express the information need: the more compatible these representations are, the more efficient the retrieval process. Based on exploratory and descriptive research grounded in the specific literature, this work proposes the use of computational ontologies in information retrieval systems based on the Vector Space Model. The ontologies serve as external terminological structures, used both to expand the indexing terms and to expand the terms that make up the query expression. Indexing-term expansion takes place during indexing, right after the most representative terms of the document under analysis have been extracted, and consists of adding new, conceptually related terms to enrich the document representation. Query expansion adds new terms related to those already in the query expression in order to better contextualize them. The proposal uses only the terminological and hierarchical structure offered by an OWL computational ontology, disregarding the other possible types of relations and the logical restrictions that can be described; these resources are left for future work as a way to further improve retrieval efficiency. The proposal presented here can be implemented and eventually become a fully operational information retrieval system.
Subjects: Recuperação de informação; Ontologia; Indexação automática; Expansão de consulta; OWL; OWL2; Information retrieval; Ontology; Automatic indexing; Query expansion
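The term expansion described in entry 411 can be illustrated with a minimal sketch. It is not the thesis's implementation: the `BROADER` hierarchy is a hypothetical stand-in for the subclass structure of an OWL ontology, the decaying `weight` for more distant ancestors is an assumption, and raw term counts are used in place of any particular weighting scheme.

```python
import math
from collections import Counter

# Hypothetical toy hierarchy standing in for the subclass structure of an
# OWL ontology: each term maps to its direct broader term.
BROADER = {"poodle": "dog", "dog": "mammal", "cat": "mammal", "mammal": "animal"}

def expand(terms, weight=0.5):
    """Return a weighted term vector: the original terms plus their ancestors."""
    vec = Counter()
    for term in terms:
        vec[term] += 1.0
        w, cur = weight, term
        while cur in BROADER:      # walk up the hierarchy
            cur = BROADER[cur]
            vec[cur] += w
            w *= weight            # assumption: farther ancestors count less
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

doc = ["poodle", "grooming"]
query = ["dog"]
plain = cosine(Counter(doc), Counter(query))      # no shared terms
expanded = cosine(expand(doc), expand(query))     # shared ancestors
```

Without expansion the document and the query share no term, so their similarity is zero; after both are expanded along the hierarchy they overlap on `dog` and its ancestors and the similarity becomes positive.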
412	Možnosti využití Microsoft Power BI v prostředí malých a středních firem / Possibilities of using Microsoft Power BI in small and medium-sized businesses
Plánička, Ondřej, January 2015
This thesis describes modern approaches to business intelligence. It explains in detail the Power BI platform and the business intelligence add-ins for Microsoft Excel 2013, and shows how to create an application that communicates with the free billing service iDoklad. The resulting application is evaluated from an economic and an implementation standpoint.
413	Suporte a consultas temporais por palavras-chave em documentos XML / Supporting temporal keyword queries on XML documents
Manica, Edimar, January 2010
Keyword queries give users easy access to XML data, since the user does not need to learn a structured query language or study possibly complex data schemas. Accordingly, several XML search engines have been proposed to extract relevant XML fragments in response to keyword queries. However, these search engines treat temporal expressions like any other keyword. This approach causes several problems; for example, nodes from the price or code domain may be considered matches for a temporal expression. This work describes TPI (Two Phase Interception), an approach that supports temporal keyword queries on data-centric XML documents. Temporal query support is provided by an additional software layer that performs two interceptions in the query processing carried out by an XML search engine. This layer is responsible for the adequate treatment of the temporal information contained in the query and in the contents of the XML documents. The work also specifies TKC (Temporal Keyword Classification), a classification of temporal queries that serves as a guide for any keyword query mechanism, including TPI. We present algorithms that map the different forms of temporal predicates expressed by keywords, as specified in TKC, to relational expressions in order to guide the implementation of temporal query processing. We propose a temporal index together with strategies for temporal path identification, disambiguation of temporal value formats, identification of dates represented by several elements, and detection of temporal intervals. Experiments compare the quality, processing time and scalability of an XML search engine with and without TPI. The main contribution of this work is a significant improvement in the quality of the results of temporal keyword queries on XML documents.
Subjects: Recuperacao : Informacao; XML (Linguagem de marcação); Banco : Dados; Temporal query; Keyword search; XML
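The problem raised in entry 413, a temporal keyword matching price or code nodes, can be sketched as follows. This is only an illustration and not the TPI algorithm: the `NODES` list, the `TEMPORAL_PATHS` set (assumed to be known in advance) and the year-only keyword pattern are all hypothetical.

```python
import re
from datetime import date

# Hypothetical flattened XML content: (path, text value) pairs.
NODES = [
    ("/order/date", "2010-03-15"),
    ("/order/price", "2010"),
    ("/order/code", "2010"),
    ("/order/item", "printer"),
]
# Assumed to be given; identifying temporal paths is itself part of the thesis.
TEMPORAL_PATHS = {"/order/date"}

YEAR = re.compile(r"^([1-9]\d{3})$")

def year_interval(keyword):
    """Map a keyword such as '2010' to a closed date interval, else None."""
    m = YEAR.match(keyword)
    if not m:
        return None
    y = int(m.group(1))
    return date(y, 1, 1), date(y, 12, 31)

def search(keyword):
    """Match temporal keywords only on temporal paths, other keywords by equality."""
    interval = year_interval(keyword)
    hits = []
    for path, value in NODES:
        if interval and path in TEMPORAL_PATHS:
            d = date.fromisoformat(value)
            if interval[0] <= d <= interval[1]:
                hits.append(path)
        elif not interval and keyword == value:
            hits.append(path)
    return hits
```

Here `search("2010")` returns only the date node, whereas a plain string comparison would also have matched the price and code nodes.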
414	Indexing and querying dataspaces
Mergen, Sérgio Luis Sardi, January 2011
Over the Web, distributed and heterogeneous sources with structured and related content form rich repositories of information commonly referred to as dataspaces. To provide access to this heterogeneous data, information integration systems have traditionally relied on the availability of a mediated schema, along with mappings between this schema and the schemas of the sources. On dataspaces, where sources are plentiful, autonomous and extremely volatile, a system based on the existence of a pre-defined mediated schema and mapping information presents several drawbacks. Notably, the cost of keeping the mappings up to date as new sources are found or existing sources change can be prohibitively high. We propose a novel querying architecture that requires neither a mediated schema nor source mappings, and which is based mainly on indexing mechanisms and on-the-fly rewriting algorithms. Our indexes are designed for data that is represented as relations, and are able to capture the structure of the sources, their instances and the connections between them. In the absence of a mediated schema, the user formulates structured queries based on what she expects to find. These queries are rewritten using a best-effort approach: the proposed rewriting algorithms compare a user query against the source schemas and produce a set of rewritings based on the matches found. Based on this architecture, two different querying approaches are tested. Experiments show that the indexing and rewriting algorithms are scalable, i.e., able to handle a very large number of structured Web sources, and that they support simple, yet expressive queries that exploit the inherent structure of the data.
Subjects: Recuperacao : Informacao; Banco : Dados; Dataspaces; Data integration; Search engine; Indexing; Query rewriting
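The best-effort rewriting mentioned in entry 414 can be pictured with a small sketch. It is not the thesis's algorithm: the `SOURCES` schemas, the exact-name attribute matching and the `min_match` threshold are hypothetical simplifications.

```python
# Hypothetical source schemas: relation name -> attribute names.
SOURCES = {
    "books_a": {"title", "author", "year"},
    "library_b": {"title", "writer", "isbn"},
    "films_c": {"name", "director"},
}

def rewrite(query_attrs, min_match=0.5):
    """Return (relation, matched attributes) for sources covering enough of the query."""
    rewritings = []
    for rel, attrs in SOURCES.items():
        matched = sorted(query_attrs & attrs)
        if len(matched) / len(query_attrs) >= min_match:
            rewritings.append((rel, matched))
    # Best effort: sources that match more of the query come first.
    return sorted(rewritings, key=lambda r: (-len(r[1]), r[0]))
```

A query over `{"title", "author"}` yields one full rewriting against `books_a` and one partial rewriting against `library_b`, while `films_c` is dropped because it matches nothing.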
415	Cell assemblies para expansão de consultas / Cell assemblies for query expansion
Volpe, Isabel Cristina, January 2011
One of the main tasks in Information Retrieval is to match a user query to the documents that are relevant to it. This matching is challenging because in many cases the keywords the user chooses differ from the words used by the authors of the relevant documents. Over the years, many approaches have been proposed to deal with this problem. One of the most popular is Query Expansion, which adds related terms to the query with the goal of retrieving more relevant documents. In this work, we propose a new method in which a Cell Assembly model is applied to query expansion. Cell Assemblies are reverberating circuits of neurons whose activity can persist long after the initial stimulus has ceased. They learn through Hebbian learning rules and have been used to simulate the formation and use of human concepts. We adapted the Cell Assembly model to learn relationships between the terms in a document collection. These relationships are then used to augment the original queries. Our experiments use standard Information Retrieval test collections and show that some queries significantly improved their results with the proposed technique.
Subjects: Recuperacao : Informacao; Redes neurais; Query expansion; Information retrieval; Neural networks; Hebbian learning
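The idea in entry 415 of learning term relationships with a Hebbian rule and using them to expand a query can be sketched in a heavily simplified form. This is not the Cell Assembly model of the thesis (there are no neurons or firing dynamics here): it only shows a Hebbian-style update in which terms that are active together get a stronger link. The learning `rate`, the saturating update and the sample `docs` are assumptions.

```python
from collections import defaultdict
from itertools import combinations

def learn_associations(docs, rate=0.1):
    """Hebbian-style update: terms that co-occur in a document get a stronger link."""
    weights = defaultdict(float)
    for doc in docs:
        for a, b in combinations(sorted(set(doc)), 2):
            # Move the link strength towards 1.0 each time both terms are active.
            weights[(a, b)] += rate * (1.0 - weights[(a, b)])
    return weights

def expand_query(query, weights, k=2):
    """Append the k terms most strongly linked to the query terms."""
    scores = defaultdict(float)
    for (a, b), w in weights.items():
        if a in query and b not in query:
            scores[b] += w
        elif b in query and a not in query:
            scores[a] += w
    best = sorted(scores, key=lambda t: (-scores[t], t))[:k]
    return list(query) + best

docs = [["car", "engine", "fuel"], ["car", "engine"], ["fuel", "price"]]
weights = learn_associations(docs)
expanded = expand_query(["car"], weights, k=1)
```

Because `car` and `engine` co-occur in two documents, their link is the strongest one involving `car`, so `engine` is the term added to the query.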
416	Extension d'ASP pour couvrir des fragments DL traitables : étude théorique et implémentation / Extension of ASP to cover treatable DL fragments : theorical study and implementation
Garreau, Fabien, 24 November 2016
Ontologies are used to represent and query knowledge about a specific domain and can be represented, in part, by logic formalisms such as lightweight description logics. These ontologies can come from several sources whose knowledge is more or less complete, so some data may be incomplete or inconsistent, preventing the deduction of other data. Answer Set Programming (ASP) is a rule-based non-monotonic logic programming language, often used in knowledge representation, that can represent incomplete data. However, lightweight description logics cannot be represented in ASP because of the existential variables in rules. Existential rules generalize lightweight description logics and also form a logic programming language, but one that does not allow exceptions to be represented. Based on a theoretical study of ASP and existential rules, we propose to bring both languages together in a single formalism: we define non-monotonic existential programs, which make it possible to handle ontologies with exceptions. This extension aims to generalize both ASP and existential rule programs and to use the efficiency of ASP solvers to reason over ontologies with exceptions. This thesis deepens existing work on entailment and decidability for non-monotonic existential programs. Further results of this study are an improvement of query answering in ASP and the implementation of an extension of the ASPeRiX solver to deal with non-monotonic existential programs.
Subjects: Règles existentielles; Interrogation; Answer Set Programming; Existential rules; Query Answering; Decidability; Inconsistency; 004
417	Implementação de consultas para um modelo de dados temporal orientado a objetos / Implementation of queries for a temporal object data model
Carvalho, Tanisi Pereira de, January 1997
The TF-ORM model (Temporal Functionality in Objects with Roles Model) is an object-oriented temporal data model that uses the concept of roles to represent the different behaviors of objects. The model supports modelling both the static and the dynamic aspects of an application because it considers all the states of the objects throughout their evolution. Its query language is based on SQL and makes it possible to retrieve different histories of the database. This work presents a visual query system for the TF-ORM model. VQS TF-ORM (Visual Query System TF-ORM) is an environment for retrieving temporal information. The system allows queries to be written in three alternative ways: textually, graphically, or through forms. The graphical language has the same expressive power as the textual language and lets the query be built directly on the graphical conceptual schema of the model, with the help of a set of windows and visual elements. Retrieval through forms does not have the same expressive power as the textual language, but it allows the property values of a given object to be retrieved through a hierarchy of windows. Retrieval through the visual query system offers several facilities, including: visual representation of the model's temporal operators, definition of levels of detail and navigation over the graphical schema, storage of queries for later use, and the possibility of representing a textual query in visual form and vice versa. Besides the definition of temporal constraints, the environment also considers the different forms of presenting query results, which the user can select. In the system presented here, the TF-ORM model is implemented on a relational database that uses SQL for information retrieval. To implement the model on a relational database, a mapping was defined that determines how the concepts of object orientation, role and time are mapped to tables and attributes in the relational model. Queries written in the TF-ORM language are then translated into the query language of the relational database. The environment was implemented using the Delphi application development tool and the Watcom database, a relational database that supports information retrieval in the SQL/ANSI standard.
Subjects: Banco : Dados; Banco : Dados temporais; Orientacao : Objetos; Database; Information recovery; Visual query language; Temporal model
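Entry 417 describes mapping objects, roles and time onto relational tables and querying them with SQL. The sketch below shows one common way such a mapping can look, using Python's built-in `sqlite3` module rather than the Watcom database of the thesis. The table name, the columns and the validity-interval layout are hypothetical and do not reproduce the TF-ORM mapping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per state of an object in a role, with its validity interval.
conn.execute(
    "CREATE TABLE employee_salary ("
    " object_id INTEGER, role TEXT, salary REAL,"
    " valid_from TEXT, valid_to TEXT)"   # valid_to NULL means still current
)
conn.executemany(
    "INSERT INTO employee_salary VALUES (?, ?, ?, ?, ?)",
    [
        (1, "engineer", 3000.0, "1995-01-01", "1996-06-30"),
        (1, "manager", 4500.0, "1996-07-01", None),
    ],
)

def state_at(object_id, day):
    """Return (role, salary) valid for the object on the given ISO date, or None."""
    return conn.execute(
        "SELECT role, salary FROM employee_salary"
        " WHERE object_id = ? AND valid_from <= ?"
        " AND (valid_to IS NULL OR valid_to >= ?)",
        (object_id, day, day),
    ).fetchone()
```

Because every past state stays in the table, a query can ask for the state on any day: `state_at(1, "1996-01-15")` finds the engineer row and `state_at(1, "1997-01-01")` the manager row. ISO-formatted date strings are used so that plain string comparison orders them correctly.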
418	O estudo e desenvolvimento do protótipo de uma ferramenta de apoio a formulação de consultas a bases de dados na área da saúde / The study and development of the prototype of a tool for supporting query formulation to databases in the health area
Webber, Carine Geltrudes, January 1997
The goal of this work is, through the study of several technologies, to develop the prototype of a tool that supports the user in formulating a query to MEDLINE (Medical Literature Analysis and Retrieval System On Line). MEDLINE is a bibliographic information retrieval system for biomedicine developed by the National Library of Medicine. Its use has grown in this field as health care professionals make increasing use of electronically available literature. People generally look for information and expect to find it exactly according to their expectations, quickly and using every available source. It was for this purpose that the first Information Retrieval Systems (IRS) emerged, in which, put simply, a user builds a query that expresses an information need, the system processes it, and the results are returned to the user. Many users find it difficult to express their information need in a way that yields satisfactory results from an IRS. The terms the user chooses to compose the query are not always the ones the system recognizes. To define the query terms successfully, the user should know the terminology used to index the items to be retrieved, or be able to rely on an intermediary who has that knowledge. When neither is possible, resources that make a successful query feasible are needed. This work first presents a general study of IRS, covering the processes involved in storage, organization and retrieval itself. It then highlights aspects of the medical vocabularies and classifications in use, which help in understanding the difficulties users face when interacting with such a system. Finally, it presents the prototype of the Query Formulation System to MEDLINE, together with its components and functionalities. The system was developed to let the user employ any term when formulating a query to MEDLINE. It integrates different medical terminologies, drawn from vocabularies and classifications available in Portuguese and currently in use. This approach allows the creation of a more complete biomedical terminology in which each term maintains relationships with other terms that describe its semantics.
Subjects: Armazenamento : Dados; Recuperacao : Informacao; Formulacao : Consulta; Tesauro; Informática médica; Information retrieval; Query formulation; Medical terminology; Thesaurus
419	Evaluating conjunctive and graph queries over the EL profile of OWL 2
Stefanoni, Giorgio, January 2015
OWL 2 EL is a popular ontology language that is based on the EL family of description logics and supports regular role inclusions, axioms that can capture compositional properties of roles such as role transitivity and reflexivity. In this thesis, we present several novel complexity results and algorithms for answering expressive queries over OWL 2 EL knowledge bases (KBs) with regular role inclusions. We first focus on the complexity of conjunctive query (CQ) answering in OWL 2 EL and show that the problem is PSpace-complete in combined complexity, the complexity measured in the total size of the input. All the previously known approaches encode the regular role inclusions using finite automata that can be worst-case exponential in size, and thus are not optimal. In our PSpace procedure, we address this problem by using a novel, succinct encoding of regular role inclusions based on pushdown automata with a bounded stack. Moreover, we strengthen the known PSpace lower complexity bound and show that the problem is PSpace-hard even if we consider only the regular role inclusions as part of the input and the query is acyclic; thus, our algorithm is optimal in knowledge base complexity, the complexity measured in the size of the KB, as well as for acyclic queries. We then study graph queries for OWL 2 EL and show that answering positive, converse-free conjunctive graph queries is PSpace-complete. Thus, from a theoretical perspective, we can add navigational features to CQs over OWL 2 EL without an increase in complexity. Finally, we present a practicable algorithm for answering CQs over OWL 2 EL KBs with only transitive and reflexive composite roles. None of the previously known approaches target transitive and reflexive roles specifically, and so they all run in PSpace and do not provide a tight upper complexity bound. In contrast, our algorithm is optimal: it runs in NP in combined complexity and in PTime in KB complexity. We also show that answering CQs is NP-hard in combined complexity if the query is acyclic and the KB contains one transitive role, one reflexive role, or nominals (concepts containing precisely one individual).
Subjects: 006.3
420	Implementace suggesteru pro vyhledávač OpenGrok / Suggester implementation for the OpenGrok search engine
Hornáček, Adam, January 2018
The suggester functionality is an important feature of modern search engines. The aim of the thesis is to implement it for the OpenGrok project. The OpenGrok search engine is based on Apache Lucene and supports its query syntax. The presented suggester implementation supports this query syntax and provides suggestions not only for prefixes but also for wildcards, regular expressions, and phrases. The implementation also takes into account the possibility of grouping queries: if one query is already specified and the user is typing another, the first query restricts the suggestions for the second. The promotion of specific suggestions is based on the underlying Lucene index data structure and on users' previous searches.
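Two behaviours described in entry 420, prefix suggestions and an earlier query restricting later suggestions, can be sketched without any Lucene code. The `INDEX` dictionary, the `must_contain` parameter and the ranking by document frequency are hypothetical; the actual implementation works on the Lucene index and also takes previous searches into account.

```python
from collections import Counter

# Hypothetical mini-index: document id -> terms it contains.
INDEX = {
    1: ["print", "printer", "main"],
    2: ["printf", "main"],
    3: ["private", "parse"],
}

def suggest(prefix, must_contain=None, limit=5):
    """Suggest indexed terms starting with `prefix`, most frequent first.

    If `must_contain` is given, only documents containing that term are
    considered, mimicking how an earlier query restricts later suggestions.
    """
    counts = Counter()
    for terms in INDEX.values():
        if must_contain is not None and must_contain not in terms:
            continue
        counts.update(t for t in set(terms) if t.startswith(prefix))
    # Rank by document frequency, breaking ties alphabetically.
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    return ranked[:limit]
```

With no restriction, the prefix `pri` produces suggestions from all three documents; once `main` has been fixed as an earlier query term, `private` disappears because its document does not contain `main`.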
