Global ETD Search

11	DNA PATTERN MATCHING ON LOOSELY COUPLED RECONFIGURABLE SYSTEMS SARELLA, HANANIEL 27 May 2005 (has links) No description available. Throughput
12	Implementation of Pattern Matching Calculus Using Type-Indexed Expressions Ji, Xiaoheng 09 1900 (has links) The pattern matching calculus introduced by Kahl provides a fine-grained mechanism for modelling non-strict pattern matching in modern functional programming languages. By changing the rule for interpreting the empty expression that results from matching failures, the pattern matching calculus can be transformed into another calculus that abstracts a "more successful" evaluation. Kahl also showed that the two calculi both have a confluent reduction system and the same normalising strategy, which constitute the operational semantics of the pattern matching calculi. As a new technique based on Haskell's language extensions of type-safe cast, arbitrary-rank polymorphism and generalised algebraic data types, type-indexed expressions introduced by Kahl demonstrate a uniform way of defining all expressions as type-indexed to guarantee type safety. In this thesis, we implemented the type-indexed syntax and operational semantics of the pattern matching calculi using type-indexed expressions. Our type-indexed syntax mirrors the definition of the pattern matching calculi. We implemented the operational semantics of the two calculi perfectly and provided reduction and normalisation examples that show that the pattern matching calculus can be a useful basis for modelling non-strict pattern matching. We formalised and implemented the bimonadic semantics of the pattern matching calculi using categorical concepts and type-indexed expressions respectively. The bimonadic semantics employs two monads to reflect two kinds of computational effects, which correspond to the two major syntactic categories of the pattern matching calculi, i.e. expressions and matchings. 
Thus, the resulting implementation provides the denotational model of non-strict pattern matching with more accuracy. Finally, from a practical programming viewpoint, our implementation is a good demonstration of how to program in the pure type-indexed setting by taking full advantage of Haskell's language extensions of type-safe cast, arbitrary-rank polymorphism and generalised algebraic data types. / Thesis / Master of Science (MSc)
13	Compressed Pattern Matching For Text And Images Tao, Tao 01 January 2005 (has links) The amount of information that we are dealing with today is being generated at an ever-increasing rate. On one hand, data compression is needed to efficiently store, organize and transport the data over limited-bandwidth networks. On the other hand, efficient information retrieval is needed to speedily find the relevant information in this huge mass of data using available resources. The compressed pattern matching problem can be stated as: given the compressed format of a text or an image and a pattern string or a pattern image, report the occurrence(s) of the pattern in the text or image with minimal (or no) decompression. The main advantages of compressed pattern matching over the naïve decompress-then-search approach are: First, reduced storage cost. Since there is no need to decompress the data, or only minimal decompression is required, the disk space and memory cost are reduced. Second, less search time. Since the compressed data is smaller than the original data, a search performed on the compressed data takes less time. The challenge of efficient compressed pattern matching can be met from two inseparable aspects: First, to utilize effectively the full potential of compression for information retrieval systems, there is a need to develop search-aware compression algorithms. Second, for data that is compressed using a particular compression technique, regardless of whether the compression is search-aware or not, we need to develop efficient searching techniques. This means that techniques must be developed to search the compressed data with no or minimal decompression and without too much extra cost. Compressed pattern matching algorithms can be categorized as either for text compression or for image compression. 
Although compressed pattern matching for text compression has been studied for a few years and many publications are available in the literature, there is still room to improve the efficiency in terms of both compression and searching. None of the search engines available today make explicit use of compressed pattern matching. Compressed pattern matching for image compression, on the other hand, has been relatively unexplored. However, it is getting more attention because lossless compression has become more important for the ever-increasing volume of medical images, satellite images and aerospace photos, which must be stored losslessly. Developing efficient information retrieval techniques for losslessly compressed data is therefore a fundamental research challenge. In this dissertation, we have studied the compressed pattern matching problem for both text and images. We present a series of novel compressed pattern matching algorithms, which are divided into two major parts. The first major work is done for the popular LZW compression algorithm. The second major work is done for the current lossless image compression standard JPEG-LS. Specifically, our contributions from the first major work are: 1. We have developed an "almost-optimal" compressed pattern matching algorithm that reports all pattern occurrences. An earlier "almost-optimal" algorithm reported in the literature is only capable of detecting the first occurrence of the pattern, and the practical performance of that algorithm is not clear. We have implemented our algorithm and provide extensive experimental results measuring its speed. We also developed a faster implementation for so-called "simple patterns". Simple patterns are patterns in which no symbol appears more than once. The algorithm takes advantage of this property and runs in optimal time. 2. 
We have developed a novel compressed pattern matching algorithm for multiple patterns using the Aho-Corasick algorithm. The algorithm takes O(mt+n+r) time with O(mt) extra space, where n is the size of the compressed file, m is the total size of all patterns, t is the size of the LZW trie and r is the number of occurrences of the patterns. The algorithm is particularly efficient when applied to archival search if the archives are compressed with a common LZW trie. All the above algorithms have been implemented and extensive experiments have been conducted to test the performance of our algorithms and to compare them with the best existing algorithms. The experimental results show that our compressed pattern matching algorithm for multiple patterns is competitive among the best algorithms and is practically the fastest among all approaches when the number of patterns is not very large. Therefore, our algorithm is preferable for general string matching applications. LZW is one of the most efficient and popular compression algorithms and is used extensively, and both of our algorithms require no modification of the compression algorithm. Our work, therefore, has great economic and market potential. Our contributions from the second major work are: 1. We have developed a new global-context variation of the JPEG-LS compression algorithm and the corresponding compressed pattern matching algorithm. Compared with the original JPEG-LS, the global-context variation is search-aware and has faster encoding and decoding speeds. The searching algorithm based on the global-context variation requires partial decompression of the compressed image. The experimental results show that it improves the search speed by about 30% compared with the decompress-then-search approach. To the best of our knowledge, this is the first two-dimensional compressed pattern matching work for the JPEG-LS standard. 
2. We have developed a two-pass variation of the JPEG-LS algorithm and the corresponding compressed pattern matching algorithm. The two-pass variation achieves search-awareness through a common compression technique called semi-static dictionary. Compared with the original algorithm, the new algorithm compresses equally well, but encoding takes slightly longer. The searching algorithm based on the two-pass variation requires no decompression at all and therefore works in the fully compressed domain. It runs in time O(nc+mc+nm+m^2) with extra space O(n+m+mc), where n is the number of columns of the image, m is the number of rows and columns of the pattern, nc is the compressed image size and mc is the compressed pattern size. The algorithm is the first known two-dimensional algorithm that works in the fully compressed domain. compression pattern matching JPEG-LS LZW information retrieval compressed pattern matching Computer Sciences Engineering
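The multi-pattern contribution in entry 13 builds on the Aho-Corasick algorithm. As a rough illustration only, the sketch below runs plain Aho-Corasick over uncompressed text; it is not the thesis's LZW-domain algorithm, and the function names `build_automaton` and `find_all` are hypothetical.

```python
from collections import deque


def build_automaton(patterns):
    """Build the trie, failure links and output lists for the patterns."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state for the longest proper suffix in the trie
    out = [[]]    # out[state] -> patterns that end at this state
    for p in patterns:
        state = 0
        for ch in p:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(p)

    # Breadth-first traversal so that shallower failure links are ready first.
    pending = deque(goto[0].values())
    while pending:
        s = pending.popleft()
        for ch, t in goto[s].items():
            pending.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]
    return goto, fail, out


def find_all(text, patterns):
    """Return (start_index, pattern) for every occurrence in text."""
    goto, fail, out = build_automaton(patterns)
    state = 0
    hits = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for p in out[state]:
            hits.append((i - len(p) + 1, p))
    return hits


if __name__ == "__main__":
    print(find_all("ushers", ["he", "she", "his", "hers"]))
```

The scan visits each text character once and follows failure links instead of restarting, which is the property that makes the automaton attractive for searching many patterns at the same time.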
14	Parallelization of a software based intrusion detection system - Snort Zhang, Huan January 2011 (has links) Computer networks are already ubiquitous in people’s lives and work, and network security is becoming a critical concern. A simple firewall, which can only scan the bottom four OSI layers, cannot satisfy all security requirements. An intrusion detection system (IDS) with deep packet inspection, which can filter all seven OSI layers, is becoming necessary for more and more networks. However, the processing throughput of IDSs is far behind current network speeds. People have begun to improve the performance of IDSs by implementing them on different hardware platforms, such as Field-Programmable Gate Arrays (FPGAs) or special network processors. Nevertheless, all of these options are either less flexible or more expensive to deploy. This research focuses on the possibilities of implementing a parallelized IDS in a general computer environment based on Snort, which is the most popular open-source IDS at the moment. In this thesis, some possible methods have been analyzed for the parallelization of the pattern-matching engine on a multicore computer. However, owing to the small granularity of the network packets, the pattern-matching engine of Snort is unsuitable for parallelization. In addition, a pipelined structure of Snort has been implemented and analyzed. The universal packet capture API, LibPCAP, has been modified with a new feature that can capture a packet directly to an external buffer. With this change, the performance of the pipelined Snort improves by up to 60% on an Intel i7 multicore computer for jumbo frames. The primary limitation is memory bandwidth. With higher bandwidth, the performance of the parallelization can be further improved. Snort IDS Intrusion Detection Multicore Parallelization Pattern Matching
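The pipelined structure described in entry 14 can be pictured as a capture stage handing packets to a detection stage through a buffer. The following is a minimal sketch of that shape only, under several assumptions: the stage names, the bounded queue size and the substring-based signature check are hypothetical and do not come from Snort or LibPCAP. Python threads also do not provide the CPU parallelism of the multicore setup in the abstract, so this shows the structure rather than the speedup.

```python
import queue
import threading

SENTINEL = None  # marks the end of the packet stream


def capture_stage(packets, out_q):
    """Stage 1: hand each captured packet to the next stage."""
    for pkt in packets:
        out_q.put(pkt)
    out_q.put(SENTINEL)


def detect_stage(in_q, signatures, alerts):
    """Stage 2: check each packet against every signature."""
    while True:
        pkt = in_q.get()
        if pkt is SENTINEL:
            break
        for sig in signatures:
            if sig in pkt:
                alerts.append((pkt, sig))


def run_pipeline(packets, signatures):
    buf = queue.Queue(maxsize=64)  # bounded buffer between the two stages
    alerts = []
    producer = threading.Thread(target=capture_stage, args=(packets, buf))
    consumer = threading.Thread(target=detect_stage, args=(buf, signatures, alerts))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return alerts


if __name__ == "__main__":
    pkts = [b"GET /index", b"cmd.exe /c", b"hello"]
    print(run_pipeline(pkts, [b"cmd.exe", b"GET"]))
```

Because a single consumer drains a FIFO queue, alerts come out in packet order; adding more detection threads would require an explicit ordering or merging step.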
15	Efficient and Flexible Search in Large Scale Distributed Systems Ahmed, Reaz January 2007 (has links) Peer-to-peer (P2P) technology has triggered a wide range of distributed systems beyond simple file-sharing. Distributed XML databases, distributed computing, server-less web publishing and networked resource/service sharing are only a few examples. Despite the diversity in applications, these systems share a common problem regarding searching and discovery of information. This commonality stems from the transitory node population and volatile information content in the participating nodes. In such a dynamic environment, users are not expected to have exact information about the available objects in the system. Rather, queries are based on partial information, which requires the search mechanism to be flexible. On the other hand, to scale with network size the search mechanism is required to be bandwidth efficient. Since the advent of P2P technology, experts from industry and academia have proposed a number of search techniques - none of which is able to provide a satisfactory solution to the conflicting requirements of search efficiency and flexibility. Structured search techniques, mostly Distributed Hash Table (DHT)-based, are bandwidth efficient, while semi(un)-structured techniques are flexible. But neither achieves both ends. This thesis defines the Distributed Pattern Matching (DPM) problem. The DPM problem is to discover a pattern (i.e., a bit-vector) using any subset of its 1-bits, under the assumption that the patterns are distributed across a large population of networked nodes. The search problem in many distributed systems can be reduced to the DPM problem. This thesis also presents two distinct search mechanisms, named Distributed Pattern Matching System (DPMS) and Plexus, for solving the DPM problem. DPMS is a semi-structured, hierarchical architecture aiming to discover a predefined number of matches by visiting a small number of nodes. 
Plexus, on the other hand, is a structured search mechanism based on the theory of Error Correcting Code (ECC). The design goal behind Plexus is to discover all the matches by visiting a reasonable number of nodes. Distributed Pattern Matching DPM DPMS Plexus Computer Science
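The matching condition in the DPM problem (a query built from any subset of a pattern's 1-bits must find that pattern) reduces to a bitwise subset test. The sketch below checks that condition by brute force over every node; it is neither DPMS nor Plexus, and the `nodes` layout and function names are hypothetical.

```python
def matches(pattern: int, query: int) -> bool:
    """True if every 1-bit of the query is also set in the pattern."""
    return (pattern & query) == query


def search(nodes, query):
    """Visit every node and collect (node_id, pattern) pairs that match.

    nodes maps a node id to the list of bit-vector patterns it stores.
    """
    return sorted(
        (node_id, p)
        for node_id, stored in nodes.items()
        for p in stored
        if matches(p, query)
    )


if __name__ == "__main__":
    nodes = {"a": [0b1011, 0b0100], "b": [0b1111]}
    print(search(nodes, 0b0011))
```

Visiting every node is exactly the cost that the two mechanisms in the abstract aim to avoid; the sketch only pins down what counts as a match.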
16	OOMatch: Pattern Matching as Dispatch in Java Richard, Adam January 2007 (has links) We present a new language feature, specified as an extension to Java. The feature is a form of dispatch, which includes and subsumes multimethods, but which is not as powerful as general predicate dispatch. It is, however, intended to be more practical and easier to use than the latter. The extension, dubbed OOMatch, allows method parameters to be specified as patterns, which are matched against the arguments to the method call. When matches occur, the method applies; if multiple methods apply, the method with the more specific pattern overrides the others. The pattern matching is very similar to that found in the "case" constructs of many functional languages, with an important difference: functional languages normally allow pattern matching over variant types (and other primitives such as tuples), while OOMatch allows pattern matching on Java objects. Indeed, the wider goal here is the study of the combination of functional and object-oriented programming paradigms. Maintaining encapsulation while allowing pattern matching is of special importance. Class designers should have the control needed to prevent implementation details (such as private variables) from being exposed to clients of the class. We here present both an informal "tutorial" description of OOMatch, as well as a formal specification of the language, and a proof that the conditions specified guarantee run-time safety. predicate dispatch dynamic dispatch pattern matching multimethods Java Computer Science
19	A Unified Model of Pattern-Matching Circuits for Field-Programmable Gate Arrays Clark, Christopher R. 28 August 2006 (has links) The objective of this dissertation is to develop a methodology for describing the functionality, analyzing the complexity, and evaluating the performance of a large class of pattern-matching circuit design approaches for field-programmable gate arrays (FPGAs). The developed methodology consists of three elements. The first is a functional model and associated nomenclature that unifies a significant portion of published circuit design approaches while also illuminating many novel approaches. The second is a set of analytical expressions that model the area and time complexity of each circuit design approach based on attributes of a given pattern set. Third, software tools are developed that facilitate architectural design space exploration and circuit implementation. This methodology is used to conduct an extensive evaluation and comparison of design approaches under a wide range of conditions using pattern sets from multiple application domains as well as synthetic pattern sets. The results indicate strong dependences between pattern set properties and circuit performance and provide new insights on the fundamental nature of various design approaches. A number of techniques have been proposed for designing pattern-matching hardware circuits with reconfigurable FPGA chips. The use of FPGAs enables high performance because the circuits can be customized for a particular application and pattern set. A relatively unstudied consequence of tailoring circuits for specific patterns is that circuit area and performance are affected by various properties of the patterns used. Most previous work in this field only considers a single design approach and a small number of pattern sets. Therefore, it is not clear how each design is affected by pattern set properties. 
For a given set of patterns, it is difficult to determine which approach would be the most efficient or provide the highest performance. Previous attempts to compare approaches using results from different publications are conflicting and inconclusive due to variations in the FPGA devices, patterns, and circuit optimizations used. There has been no attempt to evaluate a wide range of designs under a common set of conditions. The methodology presented in this dissertation provides a framework for studying multiple aspects of FPGA pattern-matching circuits in a controlled and consistent manner. Finite automata Brute force Pattern-matching FPGA NFA DFA
20	The Implementation and Applications of Multi-pattern Matching Algorithm over General Purpose GPU Cheng, Yan-Hui 08 July 2011 (has links) As technology advances, we increasingly rely on a variety of computer equipment, in both research and everyday work, to handle frequently used data. The types and quantity of data keep growing, for example satellite imaging data, genetic engineering data, global climate forecasting data, and complex event processing. Certain types of data require both accuracy and timeliness; that is, we want to find data in a shorter time. According to a report in MIT Technology Review in August 2010, complex event processing has become a new research area, and it includes data search. Data search often means data comparison. Given specified keywords or key information that we are looking for, we design a pattern matching algorithm to find the results within a shorter time, or even in real time. In our research, the purpose is to use a general-purpose GPU, the NVIDIA Tesla C2050, with its parallel computing architecture to parallelize pattern matching. Finally, we construct a service to handle a large amount of real-time data. We also run performance tests and compare the results with the well-known software "Apache Solr" to find the differences and possible future applications. real-time GPU parallel compute Solr pattern matching

Search results