1 |
BioEve: User Interface Framework Bridging IE and IR
January 2010
abstract: Continuous advancements in biomedical research have resulted in the production of vast amounts of scientific data and literature discussing them. The ultimate goal of computational biology is to translate these large amounts of data into actual knowledge of the complex biological processes and accurate life science models. The ability to rapidly and effectively survey the literature is necessary for the creation of large scale models of the relationships among biomedical entities as well as hypothesis generation to guide biomedical research. To reduce the effort and time spent in performing these activities, an intelligent search system is required. Even though many systems aid in navigating through this wide collection of documents, the vastness and depth of this information overload can be overwhelming. An automated extraction system coupled with a cognitive search and navigation service over these document collections would not only save time and effort, but also facilitate discovery of the unknown information implicitly conveyed in the texts. This thesis presents the different approaches used for large scale biomedical named entity recognition, and the challenges faced in each. It also proposes BioEve: an integrative framework to fuse a faceted search with information extraction to provide a search service that addresses the user's desire for "completeness" of the query results, not just the top-ranked ones. This information extraction system enables discovery of important semantic relationships between entities such as genes, diseases, drugs, and cell lines and events from biomedical text on MEDLINE, which is the largest publicly available database of the world's biomedical journal literature. It is an innovative search and discovery service that makes it easier to search/navigate and discover knowledge hidden in life sciences literature. 
To demonstrate the utility of this system, this thesis also details a prototype enterprise-quality search and discovery service that guides researchers through step-by-step query refinement by suggesting concepts enriched in intermediate results, thereby facilitating the "discover more as you search" paradigm. / Dissertation/Thesis / M.S. Computer Science 2010
|
2 |
Using clickthrough data to optimize search result ranking : An evaluation of clickthrough data in terms of relevancy and efficiency / Användning av clickthrough data för att optimera rankning av sökresultat : En utvärdering av clickthrough data gällande relevans och effektivitet
Paulsson, Anton January 2017
Search engines are in constant need of improvement, as the rapid growth of information affects their ability to return highly relevant documents. Search results are being lost between pages, and search algorithms are being exploited to gain a higher ranking for documents. This study attempts to minimize those two issues and to increase the relevancy of search results by using clickthrough data to add another layer of weighting to the search results. Results from the evaluation indicate that clickthrough data can in fact be used to obtain more relevant search results.
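As a rough illustration of what "another layer of weighting" could look like, the sketch below adds a dampened click count to each document's base relevance score and re-sorts the results. This is not the method evaluated in the thesis, which the abstract does not describe; the function name `rerank_with_clicks`, the `weight` parameter, and the logarithmic dampening are all assumptions made for the example.

```python
from math import log1p


def rerank_with_clicks(results, clicks, weight=0.5):
    """Re-rank (doc_id, base_score) pairs using click counts.

    Hypothetical scoring rule (an assumption, not taken from the thesis):
    final = base_score + weight * log(1 + clicks).
    The logarithm dampens very large click counts so that popularity
    cannot completely drown out the base relevance score.
    """
    rescored = [
        (doc_id, base + weight * log1p(clicks.get(doc_id, 0)))
        for doc_id, base in results
    ]
    # Highest combined score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Illustrative data only: base scores from the search engine and click counts.
results = [("a", 2.0), ("b", 1.8), ("c", 1.5)]
clicks = {"b": 40, "c": 3}

ranked = rerank_with_clicks(results, clicks)
print([doc_id for doc_id, _ in ranked])
```

Setting `weight=0` leaves the original order unchanged, which makes it easy to compare rankings with and without the clickthrough layer.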
|
3 |
Investigations of Free Text Indexing Using NLP : Comparisons of Search Algorithms and Models in Apache Solr / Undersöka hur fritextindexering kan förbättras genom NLP
Sundstedt, Alfred January 2023
As Natural Language Processing (NLP) progresses and applications such as those from OpenAI gain popularity in society, businesses encourage the integration of NLP into their systems, both to improve the user experience and to provide users with the information they request. For case management systems, providing the user with relevant documents is a complicated task, since customers often have large databases containing similar information. This means that the user's query needs to match the requested topic perfectly. Imagine if there were a solution to search by context, instead of formulating the perfect prompt, via established NLP models like BERT. Imagine if the system understood its content. This thesis aims to investigate, from a user perspective, how a free text index can be improved using NLP, and to implement the improvement. Using AI to support a free text index, in this case Apache Solr, can make it easier for users to find the specific content they are looking for. It is interesting to see how the search can be improved with the help of NLP models to present more relevant results to the user. NLP can improve user prompts, known as queries, and assist in indexing the information. The task is to conduct a practical investigation by configuring the free text database Apache Solr with and without NLP support. This is done by having the search models learn the content, letting them return their relevant search results for a set of user queries, and evaluating the results. The investigated search models were a string-based model, an OpenNLP model, and BERT models segmented at paragraph level and sentence level. A hybrid search model of OpenNLP and BERT, at paragraph level, was the best solution overall.
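The abstract does not say how the hybrid model combines its two components, so the following is only a generic sketch of one common approach: normalize the scores from a lexical ranker and a semantic (embedding-based) ranker to a shared range, then take a weighted sum. The names `min_max` and `hybrid_rank`, the `alpha` weight, and the sample scores are hypothetical and not taken from the thesis.

```python
def min_max(scores):
    """Scale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so no document is preferred over another.
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(lexical, semantic, alpha=0.5):
    """Rank documents by alpha * lexical + (1 - alpha) * semantic.

    Assumption: both inputs map doc_id -> raw score, and a document
    missing from one ranker contributes 0 for that component.
    """
    lex, sem = min_max(lexical), min_max(semantic)
    combined = {
        doc: alpha * lex.get(doc, 0.0) + (1 - alpha) * sem.get(doc, 0.0)
        for doc in set(lex) | set(sem)
    }
    # Sort by descending score; break ties by doc_id for a stable order.
    return sorted(combined, key=lambda doc: (-combined[doc], doc))


# Illustrative scores only, for example keyword scores and cosine similarities.
lexical = {"doc1": 12.0, "doc2": 4.0, "doc3": 8.0}
semantic = {"doc1": 0.30, "doc2": 0.90, "doc3": 0.80}

print(hybrid_rank(lexical, semantic))
```

Normalizing first matters because keyword scores and embedding similarities usually live on different scales; without it, one component would dominate regardless of `alpha`.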
|