Practical Earley parsing and the SPARK toolkit

Aycock, John Daniel, 23 May 2018
Domain-specific, “little” languages are commonplace in computing. So too is the need to implement such languages; to meet this need, we have created SPARK (Scanning, Parsing, And Rewriting Kit), a toolkit for little language implementation in Python, an object-oriented scripting language. SPARK greatly simplifies the task of little language implementation. It requires little code to be written, and accommodates a wide range of users—even those without a background in compiler theory. Our toolkit is seeing increasing use on a variety of projects. SPARK was designed to be easy to use with few limitations, and relies heavily on Earley's general parsing algorithm internally, which helps in meeting these design goals. Earley's algorithm, in its standard form, can be hard to use; indeed, experience with SPARK has highlighted several problems with the practical use of Earley's algorithm. Our research addresses and provides solutions for these problems, making some significant improvements to the implementation and use of Earley's algorithm. First, Earley's algorithm suffers from the performance problem. Even under optimum conditions, a standard Earley parser is burdened with overhead. We extend directly-executable parsing techniques for use in Earley parsers, the results of which run in time comparable to the much more specialized LALR(1) parsing algorithm. Second is what we call the delayed action problem. General parsers like Earley must, in the worst case, read the entire input before executing any semantic actions associated with the grammar rules. We attack this problem in two ways. We have identified conditions under which it is safe to execute semantic actions on the fly during recognition; as a side effect, this has yielded space savings of over 90% for some grammars. The other approach to the delayed action problem deals with the difficulty of handling context-dependent tokens. Such tokens are easy to handle using what we call “Schrödinger's tokens,” a superposition of token types. Finally, Earley parsers are complicated by the need to process grammar rules with empty right-hand sides. We present a simple, efficient way to handle these empty rules, and prove that our new method is correct. We also show how our method may be used to create a new type of LR(0) automaton which is ideally suited for use in Earley parsers. Our work has made Earley parsing faster and more space-efficient, turning it into an excellent candidate for practical use in many applications. / Graduate
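
For orientation, the sketch below is a minimal textbook Earley recognizer in Python (SPARK's implementation language). It is not SPARK's API and not the thesis's directly-executable variant; the grammar encoding and function name are illustrative assumptions. It shows the predictor/scanner/completer loop whose overhead, delayed actions, and empty-rule handling the abstract's improvements target.

```python
def earley_recognize(grammar, start, tokens):
    """Return True if `tokens` is derivable from `start` under `grammar`.

    `grammar` maps a nonterminal to a list of right-hand sides (tuples of
    symbols); a symbol is either a nonterminal (a grammar key) or a terminal.
    Note: grammars with empty right-hand sides need extra care in the
    completer, which is one of the problems the thesis addresses.
    """
    # An item is (lhs, rhs, dot position, origin chart index).
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))

    for i in range(len(tokens) + 1):
        worklist = list(chart[i])
        while worklist:
            lhs, rhs, dot, origin = worklist.pop()
            if dot < len(rhs):
                sym = rhs[dot]
                if sym in grammar:
                    # Predictor: expand the nonterminal after the dot.
                    for alt in grammar[sym]:
                        new = (sym, alt, 0, i)
                        if new not in chart[i]:
                            chart[i].add(new)
                            worklist.append(new)
                elif i < len(tokens) and tokens[i] == sym:
                    # Scanner: the next token matches the terminal after the dot.
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
            else:
                # Completer: lhs is fully recognized; advance items waiting on it.
                for p_lhs, p_rhs, p_dot, p_origin in list(chart[origin]):
                    if p_dot < len(p_rhs) and p_rhs[p_dot] == lhs:
                        new = (p_lhs, p_rhs, p_dot + 1, p_origin)
                        if new not in chart[i]:
                            chart[i].add(new)
                            worklist.append(new)

    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[len(tokens)])


# Example: a small left-recursive expression grammar, which Earley handles directly.
grammar = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "n"), ("n",)],
}
print(earley_recognize(grammar, "E", ["n", "+", "n", "*", "n"]))  # True
```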

New Primitives for Tackling Graph Problems and Their Applications in Parallel Computing

Zhong, Peilin, January 2021
We study fundamental graph problems under parallel computing models. In particular, we consider two parallel computing models: the Parallel Random Access Machine (PRAM) and Massively Parallel Computation (MPC). The PRAM model is a classic model of parallel computation. The efficiency of a PRAM algorithm is measured by its parallel time and the number of processors needed to achieve that parallel time. The MPC model is an abstraction of modern massively parallel computing systems such as MapReduce, Hadoop and Spark. The MPC model captures coarse-grained computation on large data well: data is distributed to processors, each of which has a sublinear (in the input data) amount of local memory, and the computation alternates between rounds of local computation and rounds of communication, where each machine can communicate an amount of data as large as the size of its memory. We usually desire fully scalable MPC algorithms, i.e., algorithms that work for any local memory size. The efficiency of a fully scalable MPC algorithm is measured by its parallel time and the total space usage (the local memory size times the number of machines). Consider an 𝑛-vertex, 𝑚-edge undirected graph 𝐺 (either weighted or unweighted) with diameter 𝐷 (the largest diameter of its connected components). Let 𝑁 = 𝑚 + 𝑛 denote the size of 𝐺. We present a series of efficient (randomized) parallel graph algorithms with theoretical guarantees. Several results are listed as follows: 1) Fully scalable MPC algorithms for graph connectivity and spanning forest using 𝑂(𝑁) total space and 𝑂(log 𝐷 · log log_{𝑁/𝑛} 𝑛) parallel time. 2) Fully scalable MPC algorithms for 2-edge and 2-vertex connectivity using 𝑂(𝑁) total space, where the 2-edge connectivity algorithm needs 𝑂(log 𝐷 · log log_{𝑁/𝑛} 𝑛) parallel time and the 2-vertex connectivity algorithm needs 𝑂(log 𝐷 · log² log_{𝑁/𝑛} 𝑛 + log 𝐷' · log log_{𝑁/𝑛} 𝑛) parallel time. Here 𝐷' denotes the bi-diameter of 𝐺. 3) PRAM algorithms for graph connectivity and spanning forest using 𝑂(𝑁) processors and 𝑂(log 𝐷 · log log_{𝑁/𝑛} 𝑛) parallel time. 4) PRAM algorithms for (1 + 𝜖)-approximate shortest path and (1 + 𝜖)-approximate uncapacitated minimum cost flow using 𝑂(𝑁) processors and poly(log 𝑛) parallel time. These algorithms are built on a series of new graph algorithmic primitives which may be of independent interest.
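
The algorithms summarized above are considerably more involved; as a point of reference, the sketch below is a toy sequential simulation, in Python, of the classic hook-and-shortcut pattern for connected components, the kind of baseline that fully scalable MPC and PRAM connectivity algorithms improve upon. The names and graph encoding are illustrative assumptions, not the thesis's primitives.

```python
def connected_components(n, edges):
    """Return a component label for each of the n vertices (min vertex id per component)."""
    label = list(range(n))  # every vertex starts as its own component root

    changed = True
    while changed:
        changed = False
        # Hooking: relabel each endpoint's root to the smaller of the two roots' labels.
        for u, v in edges:
            m = min(label[label[u]], label[label[v]])
            if label[label[u]] != m or label[label[v]] != m:
                label[label[u]] = m
                label[label[v]] = m
                changed = True
        # Shortcutting: compress each vertex's label chain to its root
        # (a sequential stand-in for the parallel pointer-jumping step).
        for v in range(n):
            while label[v] != label[label[v]]:
                label[v] = label[label[v]]

    return label


# Example: two components {0, 1, 2, 3} and {4, 5}.
print(connected_components(6, [(0, 1), (1, 2), (2, 3), (4, 5)]))
# -> [0, 0, 0, 0, 4, 4]
```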
