In this work I argue that field of text search has focused mostly on long text documents, but there is a growing need for efficient short text search, which has different user expectations. Due to this reduced data set size requirements different algorithmic techniques become more computationally affordable. The focus of this work is on approximate and prefix search and purely text based ranking methods, which are needed due to lower precision of text statistics on short text. A basic prototype search engine has been created using the researched techniques. Its capabilities were demonstrated on example search scenarios and the implementation was compared to two other open source systems representing currently recommended approaches for short text search problem. The results show feasibility of the implemented prototype regarding both user expectations and performance. Several options of future direction of the system are proposed.
Identifer | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:262227 |
Date | January 2016 |
Creators | Maršálek, Tomáš |
Contributors | Palovská, Helena, Strossa, Petr |
Publisher | Vysoká škola ekonomická v Praze |
Source Sets | Czech ETDs |
Language | Czech |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |
Page generated in 0.0019 seconds