Most existing text compression methods are based on the same underlying concept. First, the input text is divided into a sequence of text units. These text units can be single symbols, syllables, or words. When compressing large text files, searching for redundancies over longer text units is usually more effective than searching over shorter ones. However, if we choose words as the base units, we can no longer capture redundancies over symbols and syllables. In this paper we propose a new text compression method called Hierarchical compression. It constructs a hierarchical grammar to store redundancies over syllables, words, and higher levels of the text. The code of the text then consists of the code of this grammar. We propose a strategy for constructing the hierarchical grammar for a given input text and an efficient way to encode it. The proposed method is compared with several other common text compression methods.
Identifier | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:297887 |
Date | January 2011 |
Creators | Kreibichová, Lenka |
Contributors | Lánský, Jan, Dvořák, Tomáš |
Source Sets | Czech ETDs |
Language | Czech |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |