About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
241

Scalable Dynamic Big Data Geovisualization With Spatial Data Structure

Siqi Gu (8779961) 29 April 2020 (has links)
Compared to traditional cartography, processing geographic information at big-data scale is not a simple task; it requires specialized methods. When existing geovisualization systems face millions of data points, the zoom function and the dynamic data-adding function usually cannot be supported at the same time. This research classifies existing geovisualization methods, analyzes their functions and bottlenecks and their applicability in a big data environment, and proposes a method that combines a spatial data structure with on-demand iterative calculation. It also shows that this method effectively balances zooming performance against the cost of adding new data, and that it is significantly faster than existing libraries for both adding data and zooming.
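For illustration only, the sketch below shows one generic way a spatial data structure can support both zooming and incremental data additions: a quadtree that stores an aggregate count per node, so coarse zoom levels read pre-aggregated cells while new points are inserted without recomputing the structure. The class and parameter names are hypothetical; this is not the method proposed in the thesis.

```python
# Illustrative sketch only: a point quadtree keeping a count per node, so a
# zoomed-out view can be rendered from aggregates while new points are added
# incrementally.  Names and parameters are hypothetical, not the thesis's code.
class QuadNode:
    def __init__(self, x0, y0, x1, y1, depth=0, max_depth=8):
        self.bounds = (x0, y0, x1, y1)
        self.depth, self.max_depth = depth, max_depth
        self.count = 0          # aggregate used when rendering at coarse zoom
        self.children = None    # four sub-quadrants, created lazily

    def insert(self, x, y):
        self.count += 1
        if self.depth == self.max_depth:
            return
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        if self.children is None:
            self.children = [QuadNode(x0, y0, mx, my, self.depth + 1, self.max_depth),
                             QuadNode(mx, y0, x1, my, self.depth + 1, self.max_depth),
                             QuadNode(x0, my, mx, y1, self.depth + 1, self.max_depth),
                             QuadNode(mx, my, x1, y1, self.depth + 1, self.max_depth)]
        idx = (1 if x >= mx else 0) + (2 if y >= my else 0)
        self.children[idx].insert(x, y)

    def aggregates(self, zoom):
        """Yield (bounds, count) cells at the given depth, or at shallower leaves."""
        if self.depth == zoom or self.children is None:
            yield self.bounds, self.count
        else:
            for child in self.children:
                yield from child.aggregates(zoom)

root = QuadNode(0.0, 0.0, 1.0, 1.0)
root.insert(0.2, 0.7)             # new points are added without rebuilding the tree
cells = list(root.aggregates(zoom=2))
```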
242

Towards Dynamic Programming on Generalized Data Structures: and Applications of Dynamic Programming in Bioinformatics

Berkemer, Sarah Juliane 11 March 2020 (has links)
Dynamic programming (DP) is a method for solving optimization problems. The problem is divided into overlapping subproblems, and an optimal solution is computed for each subproblem; these partial solutions are then combined into an overall solution. Partial solutions are stored in a table so that each one needs to be computed only once. In this way, a search space of exponential size can be searched in polynomial time and an optimal solution found. Dynamic programming was developed by Bellman in 1952, and one of its first applications was the detection of typing errors in programming. DP algorithms are used frequently and in many different ways in bioinformatics, for example in the comparison of gene sequences, known as sequence alignment, or in the prediction of molecular structures. The amount of data, and with it the amount of analysis, grows steadily, which makes new and more complex data structures ever more important. One goal is therefore to develop DP algorithms that can be applied to data structures more complex than strings. Through the principle of algebraic dynamic programming (ADP), DP algorithms can be decomposed into smaller components that can then be further developed and modified independently of one another. The thesis is divided into two parts; the first part contains the theoretical work on the development of dynamic programming algorithms. It first presents principles and definitions of dynamic programming (Chapter 2) to provide a better understanding of the subsequent chapters. The second part of the thesis presents different bioinformatic applications of DP algorithms to biological data. An introductory chapter (Chapter 5) presents fundamentals of biological data and algorithms, which are then used in the following chapters.
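To make the table-filling idea concrete, here is a minimal sketch of the textbook string-to-string case, edit distance. It only illustrates the DP principle described above; it is not code from the thesis, which targets more general data structures than strings.

```python
# Textbook illustration of the DP principle described above: sequence comparison
# via edit distance.  Each cell dp[i][j] is the optimal solution to a subproblem
# (the first i symbols of a vs. the first j of b), computed once and reused, so
# the exponential space of alignments is searched in O(len(a) * len(b)) time.
def edit_distance(a: str, b: str) -> int:
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i                      # delete all of a[:i]
    for j in range(m + 1):
        dp[0][j] = j                      # insert all of b[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = dp[i - 1][j - 1] + (a[i - 1] != b[j - 1])   # match or mismatch
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[n][m]

assert edit_distance("GATTACA", "GCATGCU") == 4
```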
243

Saving Energy in Network Hosts With an Application Layer Proxy: Design and Evaluation of New Methods That Utilize Improved Bloom Filters

Jimeno, Miguel 11 December 2009 (has links)
One of the most urgent challenges of the 21st century is to investigate new technologies that can enable a transition towards a society with a reduced CO2 footprint. Information technology generates about 2% of global CO2 emissions, which is comparable to the aviation industry. Being connected to the Internet requires active participation in responding to protocol messages. Billions of dollars' worth of electricity is used every year to keep network hosts fully powered on at all times solely to maintain network presence. Most network hosts are idle most of the time, presenting a huge opportunity for energy savings and reduced CO2 emissions. Proxying has previously been explored as a means of allowing idle hosts to sleep while still maintaining network presence. This dissertation develops general requirements for proxying and is the first exploration of application-level proxying. Proxying for TCP connections, SIP, and Gnutella P2P was investigated. The TCP proxy keeps TCP connections open (while a host is sleeping) and buffers and/or discards packets as appropriate. The SIP proxy handles all communication with the SIP server and wakes up a sleeping SIP phone on an incoming call. The P2P proxy enables a Gnutella leaf node to sleep when not actively uploading or downloading files by handling all query messages and keyword lookups in a list of shared files. All proxies were prototyped and experimentally evaluated. Proxying for P2P led to the exploration of space- and time-efficient data structures to reduce the computational requirements of keyword search in the proxy. The use of pre-computation and hierarchical structures for reducing the false positive rate of a Bloom filter was explored. A Best-of-N Bloom filter was developed and shown to have a lower false positive rate than both a standard Bloom filter and the Power-of-2 Bloom filter. An analysis of the Best-of-N Bloom filter using order statistics was completed to predict its false positive rate. Potential energy savings are shown to be in the hundreds of millions of dollars per year assuming a modest adoption rate of the methods investigated in this dissertation. Future directions could lead to greater savings.
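As an illustration of the data-structure side of this work, the sketch below shows a standard Bloom filter together with a simplified "best-of-N"-style selection that builds several candidate filters with different hash seeds and keeps the one with the fewest set bits, a rough proxy for the lowest false positive rate. This is only an approximation of the general idea; the dissertation's actual Best-of-N construction and its order-statistics analysis are not reproduced here.

```python
# Minimal Bloom filter sketch plus a simplified best-of-N-style selection among
# candidate hash seeds.  Illustrative only; not the dissertation's implementation.
import hashlib

class BloomFilter:
    def __init__(self, m_bits: int, k_hashes: int, seed: int = 0):
        self.m, self.k, self.seed = m_bits, k_hashes, seed
        self.bits = bytearray(m_bits)        # one byte per bit, for clarity

    def _positions(self, item: str):
        for i in range(self.k):
            h = hashlib.sha256(f"{self.seed}:{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item: str):
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item: str) -> bool:
        return all(self.bits[p] for p in self._positions(item))

def best_of_n(items, m_bits=4096, k_hashes=4, n_candidates=8):
    """Build n candidate filters and keep the one with the fewest set bits."""
    candidates = []
    for seed in range(n_candidates):
        bf = BloomFilter(m_bits, k_hashes, seed)
        for it in items:
            bf.add(it)
        candidates.append(bf)
    return min(candidates, key=lambda bf: sum(bf.bits))

shared_files = [f"song{i}.mp3" for i in range(500)]
bf = best_of_n(shared_files)
assert "song42.mp3" in bf                        # a Bloom filter has no false negatives
print("false positive?", "missing.mp3" in bf)    # occasionally True
```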
244

Optimizing Query Processing Under Skew

Zhang, Wangda January 2020 (has links)
Big data systems such as relational databases, data science platforms, and scientific workflows all process queries over large and complex datasets. Skew is common in these real-world datasets and workloads. Different types of skew can have different impacts on the performance of query processing. Although skew sometimes causes load imbalance in a parallel execution environment, negatively impacting query performance, we demonstrate in this thesis that in many cases we can actually improve query performance in the presence of skew. To optimize query processing under skew, we develop a set of techniques to exploit the positive effects of skew and to avoid the negative effects. In order to exploit skew, we propose techniques including: (a) intentionally creating skew and clustering data in a distributed database system; (b) optimizing data layout for better caching in main-memory databases; and (c) adaptive execution techniques that are responsive to the underlying data in the context of compilers. In order to ameliorate skew, we study optimized hash-based partitioning that alleviates outliers in a genomic data context, as well as parallel prefix sum algorithms that are used to develop skew-insensitive algorithms. We evaluate the effectiveness of our techniques over synthetic data, standard benchmarks, and empirical datasets, and show that the performance of query processing under skew can be greatly improved. Overall, this thesis makes concrete contributions to skew-related query processing.
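As one concrete illustration of the ameliorating techniques mentioned above, the sketch below shows how a histogram plus an exclusive prefix sum yields exact output offsets for hash partitioning, so heavily skewed keys do not lead to unknown bucket sizes or write contention. It is illustrative only, not the thesis's implementation.

```python
# Illustrative sketch (not the thesis's code): histogram + exclusive prefix sum
# gives each partition an exact start offset, so records can be scattered directly
# to their final positions regardless of how skewed the key distribution is.
def partition_by_key(records, key, num_parts):
    # Pass 1: count records per partition (the histogram).
    counts = [0] * num_parts
    for r in records:
        counts[hash(key(r)) % num_parts] += 1

    # Exclusive prefix sum over the counts gives each partition's start offset.
    offsets = [0] * num_parts
    for p in range(1, num_parts):
        offsets[p] = offsets[p - 1] + counts[p - 1]

    # Pass 2: scatter each record directly to its final position.
    out = [None] * len(records)
    cursor = offsets[:]                 # per-partition write cursors
    for r in records:
        p = hash(key(r)) % num_parts
        out[cursor[p]] = r
        cursor[p] += 1
    return out, offsets

rows = [("us", 1), ("us", 2), ("de", 3), ("us", 4), ("fr", 5)]
partitioned, starts = partition_by_key(rows, key=lambda r: r[0], num_parts=4)
```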
245

Performance Study of Concurrent Search Trees and Hash Algorithms on Multiprocessors Systems

Demuynck, Marie-Anne 05 1900 (has links)
This study examines the performance of concurrent algorithms for B-trees and linear hashing. B-trees are widely used as an access method for large, single-key database files stored in lexicographic order on secondary storage devices. Linear hashing is a fast and reliable hash algorithm, suitable for accessing records stored unordered in buckets. This dissertation presents performance results on implementations of concurrent Blink-tree and linear hashing algorithms, using lock-based, partitioned, and distributed methods on the Sequent Symmetry shared-memory multiprocessor system and on a network of distributed processors created with PVM (Parallel Virtual Machine) software. Initial experiments, which started with empty data structures, show good results for the partitioned implementations and lock-based linear hashing, but poor ones for lock-based Blink-trees. A subsequent test, which started with loaded data structures, shows similar results, but with much improved performance for locked Blink-trees. The data also highlighted the high cost of split operations, which reached up to 70% of the total insert time.
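For readers unfamiliar with linear hashing, the following compact, single-threaded sketch illustrates the insert-and-split procedure at the heart of the algorithms studied here; the lock-based, partitioned, and distributed concurrent variants evaluated in the dissertation are not shown, and the parameter names are illustrative.

```python
# Compact sketch of classic (single-threaded) linear hashing: buckets are split
# one at a time, in round-robin order, as the load factor grows.
class LinearHashTable:
    def __init__(self, initial_buckets=4, max_load=0.75):
        self.n0 = initial_buckets       # number of buckets at level 0
        self.level = 0                  # current round of splitting
        self.next = 0                   # index of the next bucket to split
        self.max_load = max_load
        self.buckets = [[] for _ in range(initial_buckets)]
        self.size = 0

    def _bucket_index(self, key):
        h = hash(key) % (self.n0 * (2 ** self.level))
        if h < self.next:               # bucket already split in this round
            h = hash(key) % (self.n0 * (2 ** (self.level + 1)))
        return h

    def insert(self, key, value):
        self.buckets[self._bucket_index(key)].append((key, value))
        self.size += 1
        if self.size / len(self.buckets) > self.max_load:
            self._split()

    def _split(self):
        old = self.buckets[self.next]
        self.buckets[self.next] = []
        self.buckets.append([])         # the new "image" bucket for this split
        self.next += 1
        if self.next == self.n0 * (2 ** self.level):
            self.level += 1             # round finished: start the next level
            self.next = 0
        for key, value in old:          # rehash only the split bucket's records
            self.buckets[self._bucket_index(key)].append((key, value))

    def lookup(self, key):
        return [v for k, v in self.buckets[self._bucket_index(key)] if k == key]

t = LinearHashTable()
for i in range(100):
    t.insert(i, str(i))
assert t.lookup(42) == ["42"]
```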
246

Evaluating Memory Models for Graph‐Like Data Structures in the Rust Programming Language: Performance and Usability / Utvärdering av Minnesmodeller för Graf‐Liknande Datastrukturer i Programmeringsspråket Rust: Användbarhet och Prestanda

Viitanen, Rasmus January 2020 (has links)
Representing graphs in Rust is a problematic issue, as ownership forbids typical representations found in e.g. C++. A common approach is to use reference counting to represent graphs, but this can easily lead to memory leaks if cycles are present in the graph. As naïve reference counting is not sufficient, we must search for alternative representations. In this thesis, we explore different memory models that allow safe representations of graph-like data structures in Rust. These memory models are later evaluated in terms of performance and usability. We find that region-based allocation is, in most cases, the best model to use when performance is of importance. In cases where usability is more important, either reference-counting with cycle collection or tracing garbage collection is a solid choice. When it comes to multi-threading, we propose a new implementation of a lock-free transactional graph in Rust. To our knowledge, this is the first lock-free graph representation in Rust. The model demonstrates poor scalability, but for certain graph topologies and sizes, it offers performance that exceeds the other graph models.
247

Simple, safe, and efficient memory management using linear pointers

Liu, Likai 22 January 2016 (has links)
Efficient and safe memory management is a hard problem. Garbage collection promises automatic memory management but comes at the cost of increased memory footprint, reduced parallelism in multi-threaded programs, unpredictable pause times, and intricate tuning parameters balancing the program's workload and designated memory usage in order for an application to perform reasonably well. Existing research mitigates the above problems to some extent, but programmer error could still cause memory leaks by erroneously keeping memory references when they are no longer needed. We need a methodology for programmers to become resource aware, so that efficient, scalable, predictable, and high-performance programs may be written without fear of resource leaks. Linear logic has been recognized as the formalism of choice for resource tracking. It requires explicit introduction and elimination of resources and guarantees that a resource cannot be implicitly shared or abandoned, hence must be linear. Early languages based on linear logic focused on the Curry-Howard correspondence. They began by limiting the expressive power of the language and then reintroduced it by allowing controlled sharing, which is necessary for recursive functions. However, only by deviating from the Curry-Howard correspondence could later developments actually address programming errors in resource usage. The contribution of this dissertation is a simple, safe, and efficient approach that introduces linear resource ownership semantics into C++ (still a widely used language more than 30 years after its inception) through the linear pointer, a smart pointer inspired by linear logic. By implementing various linear data structures and a parallel, multi-threaded memory allocator based on these data structures, this work shows that the linear pointer is practical and efficient in the real world, and that it is possible to build a memory management stack that is entirely leak-free. The dissertation offers some closing remarks on the difficulties a formal system would encounter when reasoning about a concurrent linear data structure algorithm, and what might be done to solve these problems.
248

Practical Type Inference for the GADT Type System

Lin, Chuan-kai 01 January 2010 (has links)
Generalized algebraic data types (GADTs) are a type system extension to algebraic data types that allows the type of an algebraic data value to vary with its shape. The GADT type system allows programmers to express detailed program properties as types (for example, that a function should return a list of the same length as its input), and a general-purpose type checker will automatically check those properties at compile time. Type inference for the GADT type system and the properties of the type system are both currently areas of active research. In this dissertation, I attack both problems simultaneously by exploiting the symbiosis between type system research and type inference research. Deficiencies of GADT type inference algorithms motivate research on specific aspects of the type system, and discoveries about the type system bring in new insights that lead to improved GADT type inference algorithms. The technical contributions of this dissertation are therefore twofold: in addition to new GADT type system properties (such as the prevalence of pointwise type information flow in GADT patterns, a generalized notion of existential types, and the effects of enforcing the GADT branch reachability requirement), I will also present a new GADT type inference algorithm that is significantly more powerful than existing algorithms. These contributions should help programmers use the GADT type system more effectively, and they should also enable language implementers to provide better support for the GADT type system.
249

Data views for a programming environment

Robson, R. January 1988 (has links)
No description available.
250

Online Covering: Efficient and Learning-Augmented Algorithms

Young-san Lin (12868319) 14 June 2022 (has links)
We start by slightly modifying the generic framework for solving online covering and packing linear programs (LP) proposed in the seminal work of Buchbinder and Naor (Mathematics of Operations Research, 34, 2009) to obtain efficient implementations in settings in which one has access to a separation oracle.

We then apply the generic framework to several online network connectivity problems with LP formulations, namely pairwise spanners and directed Steiner forests. Our results are comparable to the previous state-of-the-art results for these problems in the offline setting.

Further, we extend the generic frameworks to online optimization problems enhanced with machine-learning predictions. In particular, we present learning-augmented algorithms for online covering LPs and semidefinite programs (SDP), which outperform any optimal online algorithm when the prediction is accurate while maintaining reasonable guarantees when the prediction is misleading. Specifically, we obtain general online learning-augmented algorithms for covering LPs with fractional advice and general constraints, and we initiate the study of learning-augmented algorithms for covering SDPs.
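The toy sketch below illustrates the multiplicative-update idea behind online covering LPs in the spirit of the Buchbinder-Naor framework cited above: when a constraint arrives unsatisfied, the variables it involves are raised multiplicatively, faster when they are cheap relative to their coefficients, until the constraint is covered. The step size, exact update rule, competitive analysis, and the learning-augmented variants studied in the thesis are all simplified away; names and numbers are illustrative.

```python
# Toy sketch of online covering:  min c^T x  s.t. each arriving constraint
# sum_j a[j]*x[j] >= 1 and x >= 0.  Simplified multiplicative update only; not
# the thesis's algorithm or its learning-augmented extensions.
def online_covering(costs, constraints, step=0.01):
    n = len(costs)
    x = [0.0] * n
    for a in constraints:                       # constraints arrive one at a time
        support = [j for j in range(n) if a[j] > 0]
        d = len(support)
        while sum(a[j] * x[j] for j in range(n)) < 1.0:
            # Raise every variable in the constraint; the additive 1/d term lets
            # variables grow away from zero, the a[j]/costs[j] factor favors
            # cheap variables with large coefficients.
            for j in support:
                x[j] += step * (a[j] / costs[j]) * (x[j] + 1.0 / d)
    return x

# Two constraints over three variables; the returned x covers both fractionally.
costs = [1.0, 2.0, 4.0]
constraints = [
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
]
x = online_covering(costs, constraints)
print([round(v, 3) for v in x],
      "cost =", round(sum(c * v for c, v in zip(costs, x)), 3))
```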
