This master's thesis presents methods for mining important pieces of information from text. It analyses the problem of terms extraction from large document collection and describes the implementation using C# language and Microsoft SQL Server. The system uses stemming and a number of statistical methods for term extraction. This project also compares used methods and suggests the process of the dictionary derivation.
Identifer | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:236591 |
Date | January 2012 |
Creators | Pavlín, Václav |
Contributors | Masařík, Karel, Kreslíková, Jitka |
Publisher | Vysoké učení technické v Brně. Fakulta informačních technologií |
Source Sets | Czech ETDs |
Language | Czech |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |
Page generated in 0.0015 seconds