Global ETD Search

1	Higher Compression from the Burrows-Wheeler Transform with New Algorithms for the List Update Problem Chapin, Brenton 08 1900 (has links) Burrows-Wheeler compression is a three stage process in which the data is transformed with the Burrows-Wheeler Transform, then transformed with Move-To-Front, and finally encoded with an entropy coder. Move-To-Front, Transpose, and Frequency Count are some of the many algorithms used on the List Update problem. In 1985, Competitive Analysis first showed the superiority of Move-To-Front over Transpose and Frequency Count for the List Update problem with arbitrary data. Earlier studies due to Bitner assumed independent identically distributed data, and showed that while Move-To-Front adapts to a distribution faster, incurring less overwork, the asymptotic costs of Frequency Count and Transpose are less. The improvements to Burrows-Wheeler compression this work covers are increases in the amount, not speed, of compression. Best x of 2x-1 is a new family of algorithms created to improve on Move-To-Front's processing of the output of the Burrows-Wheeler Transform which is like piecewise independent identically distributed data. Other algorithms for both the middle stage of Burrows-Wheeler compression and the List Update problem for which overwork, asymptotic cost, and competitive ratios are also analyzed are several variations of Move One From Front and part of the randomized algorithm Timestamp. The Best x of 2x - 1 family includes Move-To-Front, the part of Timestamp of interest, and Frequency Count. Lastly, a greedy choosing scheme, Snake, switches back and forth as the amount of compression that two List Update algorithms achieves fluctuates, to increase overall compression. The Burrows-Wheeler Transform is based on sorting of contexts. The other improvements are better sorting orders, such as “aeioubcdf...” instead of standard alphabetical “abcdefghi...” on English text data, and an algorithm for computing orders for any data, and Gray code sorting instead of standard sorting. Both techniques lessen the overwork incurred by whatever List Update algorithms are used by reducing the difference between adjacent sorted contexts. Data compression (Computer science) Burrows-Wheeler data compression list update
2	Alternative Measures for the Analysis of Online Algorithms Dorrigiv, Reza 26 February 2010 (has links) In this thesis we introduce and evaluate several new models for the analysis of online algorithms. In an online problem, the algorithm does not know the entire input from the beginning; the input is revealed in a sequence of steps. At each step the algorithm should make its decisions based on the past and without any knowledge about the future. Many important real-life problems such as paging and routing are intrinsically online and thus the design and analysis of online algorithms is one of the main research areas in theoretical computer science. Competitive analysis is the standard measure for analysis of online algorithms. It has been applied to many online problems in diverse areas ranging from robot navigation, to network routing, to scheduling, to online graph coloring. While in several instances competitive analysis gives satisfactory results, for certain problems it results in unrealistically pessimistic ratios and/or fails to distinguish between algorithms that have vastly differing performance under any practical characterization. Addressing these shortcomings has been the subject of intense research by many of the best minds in the field. In this thesis, building upon recent advances of others we introduce some new models for analysis of online algorithms, namely Bijective Analysis, Average Analysis, Parameterized Analysis, and Relative Interval Analysis. We show that they lead to good results when applied to paging and list update algorithms. Paging and list update are two well known online problems. Paging is one of the main examples of poor behavior of competitive analysis. We show that LRU is the unique optimal online paging algorithm according to Average Analysis on sequences with locality of reference. Recall that in practice input sequences for paging have high locality of reference. It has been empirically long established that LRU is the best paging algorithm. Yet, Average Analysis is the first model that gives strict separation of LRU from all other online paging algorithms, thus solving a long standing open problem. We prove a similar result for the optimality of MTF for list update on sequences with locality of reference. A technique for the analysis of online algorithms has to be effective to be useful in day-to-day analysis of algorithms. While Bijective and Average Analysis succeed at providing fine separation, their application can be, at times, cumbersome. Thus we apply a parameterized or adaptive analysis framework to online algorithms. We show that this framework is effective, can be applied more easily to a larger family of problems and leads to finer analysis than the competitive ratio. The conceptual innovation of parameterizing the performance of an algorithm by something other than the input size was first introduced over three decades ago [124, 125]. By now it has been extensively studied and understood in the context of adaptive analysis (for problems in P) and parameterized algorithms (for NP-hard problems), yet to our knowledge this thesis is the first systematic application of this technique to the study of online algorithms. Interestingly, competitive analysis can be recast as a particular form of parameterized analysis in which the performance of opt is the parameter. In general, for each problem we can choose the parameter/measure that best reflects the difficulty of the input. We show that in many instances the performance of opt on a sequence is a coarse approximation of the difficulty or complexity of a given input sequence. Using a finer, more natural measure we can separate paging and list update algorithms which were otherwise indistinguishable under the classical model. This creates a performance hierarchy of algorithms which better reflects the intuitive relative strengths between them. Lastly, we show that, surprisingly, certain randomized algorithms which are superior to MTF in the classical model are not so in the parameterized case, which matches experimental results. We test list update algorithms in the context of a data compression problem known to have locality of reference. Our experiments show MTF outperforms other list update algorithms in practice after BWT. This is consistent with the intuition that BWT increases locality of reference. Analysis of Algorithms Online Algorithms Paging List Update Adaptive Analysis Data Compression Computer Science
3	Alternative Measures for the Analysis of Online Algorithms Dorrigiv, Reza 26 February 2010 (has links) In this thesis we introduce and evaluate several new models for the analysis of online algorithms. In an online problem, the algorithm does not know the entire input from the beginning; the input is revealed in a sequence of steps. At each step the algorithm should make its decisions based on the past and without any knowledge about the future. Many important real-life problems such as paging and routing are intrinsically online and thus the design and analysis of online algorithms is one of the main research areas in theoretical computer science. Competitive analysis is the standard measure for analysis of online algorithms. It has been applied to many online problems in diverse areas ranging from robot navigation, to network routing, to scheduling, to online graph coloring. While in several instances competitive analysis gives satisfactory results, for certain problems it results in unrealistically pessimistic ratios and/or fails to distinguish between algorithms that have vastly differing performance under any practical characterization. Addressing these shortcomings has been the subject of intense research by many of the best minds in the field. In this thesis, building upon recent advances of others we introduce some new models for analysis of online algorithms, namely Bijective Analysis, Average Analysis, Parameterized Analysis, and Relative Interval Analysis. We show that they lead to good results when applied to paging and list update algorithms. Paging and list update are two well known online problems. Paging is one of the main examples of poor behavior of competitive analysis. We show that LRU is the unique optimal online paging algorithm according to Average Analysis on sequences with locality of reference. Recall that in practice input sequences for paging have high locality of reference. It has been empirically long established that LRU is the best paging algorithm. Yet, Average Analysis is the first model that gives strict separation of LRU from all other online paging algorithms, thus solving a long standing open problem. We prove a similar result for the optimality of MTF for list update on sequences with locality of reference. A technique for the analysis of online algorithms has to be effective to be useful in day-to-day analysis of algorithms. While Bijective and Average Analysis succeed at providing fine separation, their application can be, at times, cumbersome. Thus we apply a parameterized or adaptive analysis framework to online algorithms. We show that this framework is effective, can be applied more easily to a larger family of problems and leads to finer analysis than the competitive ratio. The conceptual innovation of parameterizing the performance of an algorithm by something other than the input size was first introduced over three decades ago [124, 125]. By now it has been extensively studied and understood in the context of adaptive analysis (for problems in P) and parameterized algorithms (for NP-hard problems), yet to our knowledge this thesis is the first systematic application of this technique to the study of online algorithms. Interestingly, competitive analysis can be recast as a particular form of parameterized analysis in which the performance of opt is the parameter. In general, for each problem we can choose the parameter/measure that best reflects the difficulty of the input. We show that in many instances the performance of opt on a sequence is a coarse approximation of the difficulty or complexity of a given input sequence. Using a finer, more natural measure we can separate paging and list update algorithms which were otherwise indistinguishable under the classical model. This creates a performance hierarchy of algorithms which better reflects the intuitive relative strengths between them. Lastly, we show that, surprisingly, certain randomized algorithms which are superior to MTF in the classical model are not so in the parameterized case, which matches experimental results. We test list update algorithms in the context of a data compression problem known to have locality of reference. Our experiments show MTF outperforms other list update algorithms in practice after BWT. This is consistent with the intuition that BWT increases locality of reference. Analysis of Algorithms Online Algorithms Paging List Update Adaptive Analysis Data Compression Computer Science

1

Page generated in 0.0535 seconds