501
Exploring the design space of cooperative streaming multicast (January 2009)
Video streaming over the Internet is rapidly rising in popularity, but the availability and quality of video content is currently limited by the high bandwidth costs and infrastructure needs of server-based solutions. Recently, however, cooperative end-system multicast (CEM) has emerged as a promising paradigm for content distribution in the Internet, because the bandwidth overhead of disseminating content is shared among the participants of the CEM overlay network. In this thesis, we identify the dimensions in the design space of CEMs, explore the design space, and seek to understand the inherent tradeoffs of different design choices.
In the first part of the thesis, we study the control mechanisms for CEM overlay maintenance. We demonstrate that the control task of neighbor acquisition in CEMs can be factored out into a separate control overlay that provides a single primitive: a configurable anycast for peer selection. The separation of control from the data overlay avoids the efficiency tradeoffs that afflict some current systems. The anycast primitive can be used to build and maintain different data overlay organizations, such as single-tree, multi-tree, mesh-based, and hybrid designs, by expressing appropriate policies. We built SAAR, a reusable, shared control overlay that efficiently implements this anycast primitive and thereby serves the control needs of CEMs.
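The idea of one configurable anycast primitive serving many overlay organizations can be sketched as follows. This is an illustrative sketch under my own assumptions; the class and policy names are hypothetical, not SAAR's actual interface.

```python
# Sketch (not SAAR's API): a control overlay exposing one operation,
# anycast(group, policy), which returns a peer chosen from the group's
# members according to a caller-supplied policy. Different data-overlay
# organizations then become different policies over the same primitive.

class ControlOverlay:
    def __init__(self):
        self.members = {}          # group id -> list of peer-state dicts

    def join(self, group, peer):
        self.members.setdefault(group, []).append(peer)

    def anycast(self, group, policy):
        """Return the best member under `policy`, or None.

        `policy` maps a peer-state dict to a score; ineligible peers
        return None. Tree, mesh, or hybrid neighbor selection can all
        be expressed this way.
        """
        scored = [(policy(p), p) for p in self.members.get(group, [])]
        scored = [(s, p) for s, p in scored if s is not None]
        return max(scored, key=lambda sp: sp[0])[1] if scored else None

# Example policy for a single-tree overlay: prefer parents with spare
# forwarding capacity, shallower in the tree first.
def tree_parent_policy(peer):
    if peer["children"] >= peer["capacity"]:
        return None                      # saturated: not eligible
    return -peer["depth"]                # shallower parents score higher

overlay = ControlOverlay()
overlay.join("movie", {"id": "a", "depth": 0, "children": 1, "capacity": 4})
overlay.join("movie", {"id": "b", "depth": 2, "children": 0, "capacity": 4})
parent = overlay.anycast("movie", tree_parent_policy)
```

A mesh-based overlay would instead supply a policy scoring, say, bandwidth or content availability; the control overlay itself is unchanged.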
In the second part of the thesis, we focus on techniques for data dissemination. We built a common framework in which different CEM data delivery techniques can be faithfully compared. A systematic empirical comparison of CEM design choices demonstrates that there is no single approach that is best in all scenarios. In fact, our results suggest that every CEM protocol is inherently limited in certain aspects of its performance. We distill our observations into a novel model that explains the inherent tradeoffs of CEM design choices and provides bounds on the practical performance limits of any future CEM protocol. In particular, the model asserts that no CEM design can simultaneously achieve all three of low overhead, low lag, and high streaming quality.
502
Compiling dynamic languages via statically typed functional languages (January 2009)
Dynamic languages enable rapid prototyping, but are generally not viewed as providing the best performance. As a result, software developers often build a prototype in a dynamic language and then rewrite the application in C or Fortran for high performance. This costly rewriting step can be avoided by improving the performance of dynamic languages. Dynamic languages are usually interpreted for ease of implementation, and the traditional approach to improving their performance is to build an optimizing compiler. However, building a compiler from scratch is much more time-consuming than implementing an interpreter. Our thesis is that we can build effective compilers for dynamic languages by translating them into statically typed functional languages, which have good compilers and automatic memory management. In particular, we believe that modern statically typed languages provide precise control over data representations and come with runtime systems that have competitive performance.
To investigate the viability of this approach, we have built a compiler for the dynamic language Python by translating it into the statically typed functional language OCaml. An interesting practical advantage of using modern statically typed functional languages is that they use Hindley-Milner type systems, which means that there is no need for the translation to construct type terms.
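The reason the translation need not construct type terms can be illustrated with a toy translator. This sketch is my own, not Monty's actual scheme: every dynamic value is boxed into one OCaml variant type, and OCaml's Hindley-Milner inference handles the rest.

```python
# Toy illustration: compile a tiny dynamic expression language to OCaml
# source text. All dynamic values share one OCaml variant type, so the
# translator emits no type annotations at all.

OCAML_PRELUDE = """type value = VInt of int | VStr of string
let add a b = match a, b with
  | VInt x, VInt y -> VInt (x + y)
  | VStr x, VStr y -> VStr (x ^ y)
  | _ -> failwith "unsupported operand types"
"""

def translate(expr):
    """Translate a tiny Python-like AST into an OCaml expression string."""
    if isinstance(expr, int):
        return f"VInt {expr}"
    if isinstance(expr, str):
        return f'VStr "{expr}"'
    op, lhs, rhs = expr                  # e.g. ('+', e1, e2)
    assert op == "+"
    return f"add ({translate(lhs)}) ({translate(rhs)})"

# The dynamic expression 1 + 2 becomes an OCaml term over `value`:
code = translate(("+", 1, 2))
```

Dynamic dispatch (here, `add` choosing integer addition or string concatenation at runtime) moves into ordinary pattern matching in the target language.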
We compare the performance of our implementation, Monty, with that of CPython, the reference Python implementation, and with Jython, a Java implementation of Python, using a suite of 370 benchmarks. Our experiments show that some programs compiled using our approach run up to 4.6 times faster than CPython. However, due to a number of engineering reasons, some programs also run significantly slower than CPython. We pinpoint the specific causes of performance degradation and assess the potential for removing these causes in future work. Our implementation is significantly faster than Jython, up to a factor of 100 in some cases.
A byproduct of our research is a proposal for an improved array-copying implementation in OCaml.
503
Efficient duty cycle MAC protocols for dynamic traffic loads in wireless sensor networks (January 2009)
Idle listening is one of the most significant causes of energy consumption in wireless sensor networks (WSNs), and many protocols have been proposed based on duty cycling to reduce this cost. These protocols, either synchronous or asynchronous, are mainly optimized for light traffic loads. A WSN, however, could often experience bursty and high traffic loads, as may happen for example with broadcast or convergecast traffic. In this thesis, I design and evaluate a new synchronous protocol, DW-MAC (Demand Wakeup MAC), and a new asynchronous protocol, RI-MAC (Receiver Initiated MAC), that are both efficient under dynamic traffic loads, including light or heavy loads. I also design and evaluate ADB (Asynchronous Duty-cycle Broadcasting), a new protocol for efficient multihop broadcasting in WSNs using asynchronous duty cycling.
DW-MAC introduces a new low-overhead scheduling algorithm that allows nodes to wake up on demand during the Sleep period of an operational cycle and ensures that data transmissions do not collide at their intended receivers; this demand wakeup adaptively increases effective channel capacity as the traffic load increases. RI-MAC, instead, uses receiver-initiated transmissions, in which each transmitter passively waits until its intended receiver wakes up and transmits a beacon frame; this technique minimizes the time a sender and its intended receiver occupy the wireless medium to find a rendezvous time for exchanging data. ADB is integrated with RI-MAC to exploit information available only at this layer. Rather than treating the data transmission from a node to all of its neighbors as the basic unit of progress for the multihop broadcast, ADB dynamically optimizes the broadcast at the level of transmissions to each individual neighbor as the neighbors asynchronously wake up, avoiding redundant transmissions and transmissions over poor links, and allowing a transmitter to go to sleep as early as possible. In detailed simulations of all three protocols using ns-2, each substantially outperforms earlier competing protocols in terms of reduced energy consumption, lower latency, and increased packet delivery ratio. I also implemented RI-MAC and ADB in a testbed of MICAz motes using TinyOS and further demonstrated the significant performance improvements over prior protocols.
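The medium-occupancy advantage of receiver-initiated rendezvous can be conveyed with a minimal timing model. This is my own sketch, not the protocols' full logic, and the comparison baseline (a sender-initiated long-preamble scheme) is an assumption for illustration.

```python
# Minimal model of receiver-initiated rendezvous (RI-MAC's core idea):
# a sender stays silent until its intended receiver wakes and broadcasts
# a beacon, then transmits immediately. Compare with a sender-initiated
# low-power-listening scheme that occupies the medium with a preamble
# from the moment the sender is ready until the receiver wakes.

def rendezvous_time(send_ready, receiver_wakeups):
    """First receiver wakeup (beacon) at or after send_ready, or None."""
    for t in sorted(receiver_wakeups):
        if t >= send_ready:
            return t
    return None

def medium_occupancy(send_ready, receiver_wakeups, data_time=1):
    """Time units the medium is busy under each scheme (toy model)."""
    t = rendezvous_time(send_ready, receiver_wakeups)
    sender_initiated = (t - send_ready) + data_time   # preamble + data
    receiver_initiated = data_time                    # beacon triggers data
    return sender_initiated, receiver_initiated

# Sender ready at t=3; receiver wakes at t=0, 10, 20.
busy_lpl, busy_rimac = medium_occupancy(3, [0, 10, 20])
```

The longer the receiver sleeps, the larger the gap between the two schemes, which is why passive waiting pays off under low duty cycles.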
504
Providing incentive to peer-to-peer applications (January 2009)
Cooperative peer-to-peer applications are designed to share the resources of participating computers for the common good of all users. However, users do not necessarily have an incentive to donate resources to the system if they can use the system's resources for free. As commonly observed in deployed applications, this situation adversely affects the applications' performance and sometimes even their availability and usability.
While resource management is traditionally handled by a centralized enforcement entity, adopting a similar solution raises new concerns for distributed peer-to-peer systems. This dissertation proposes to solve the incentive problem in peer-to-peer applications by designing fair sharing policies and enforcing these policies in a distributed manner. The feasibility and practicality of this approach are demonstrated through numerous applications, namely archival storage systems, streaming systems, content distribution systems, and anonymous communication systems.
505
Array optimizations for high productivity programming languages (January 2009)
While the HPCS languages (Chapel, Fortress and X10) have introduced improvements in programmer productivity, several challenges still remain in delivering high performance. In the absence of optimization, the high-level language constructs that improve productivity can result in order-of-magnitude runtime performance degradations.
This dissertation addresses the problem of efficient code generation for high-level array accesses in the X10 language. The X10 language supports rank-independent specification of loop and array computations using regions and points. Three aspects of high-level array accesses in X10 are important for productivity but also pose significant performance challenges: high-level accesses are performed through Point objects rather than integer indices, variables containing references to arrays are rank-independent, and array subscripts are verified as legal array indices during runtime program execution.
Our solution to the first challenge is to introduce new analyses and transformations that enable automatic inlining and scalar replacement of Point objects. Our solution to the second challenge is a hybrid approach. We use an interprocedural rank analysis algorithm to automatically infer ranks of arrays in X10. We use rank analysis information to enable storage transformations on arrays. If rank-independent array references still remain after compiler analysis, the programmer can use X10's dependent type system to safely annotate array variable declarations with additional information for the rank and region of the variable, and to enable the compiler to generate efficient code in cases where the dependent type information is available. Our solution to the third challenge is to use a new interprocedural array bounds analysis approach using regions to automatically determine when runtime bounds checks are not needed.
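The first of these transformations, scalar replacement of Point objects, can be illustrated outside X10. The sketch below is in Python and the function names are mine; it shows only the shape of the before/after code, not the compiler's actual analyses.

```python
# Illustration of the cost being optimized away: high-level
# rank-independent access goes through Point objects (one allocation
# per access), while the scalar-replaced form uses plain integer
# indices. The transformation preserves semantics.

class Point:
    def __init__(self, *coords):
        self.coords = coords

def sum_highlevel(array, rows, cols):
    total = 0
    for i in range(rows):
        for j in range(cols):
            p = Point(i, j)              # one object allocation per access
            total += array[p.coords[0]][p.coords[1]]
    return total

def sum_scalar_replaced(array, rows, cols):
    total = 0
    for i in range(rows):                # Point inlined away: indices are
        for j in range(cols):            # scalar ints, no allocation
            total += array[i][j]
    return total

a = [[1, 2], [3, 4]]
```

In a managed runtime, eliminating the per-iteration allocation and the indirection through the object is precisely what closes the gap to hand-written rank-specific loops.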
Our performance results show that our optimizations deliver performance that rivals the performance of hand-tuned code with explicit rank-specific loops and lower-level array accesses, and is up to two orders of magnitude faster than unoptimized, high-level X10 programs. These optimizations also result in scalability improvements of X10 programs as we increase the number of CPUs. While we perform the optimizations primarily in X10, these techniques are applicable to other high-productivity languages such as Chapel and Fortress.
506
A type-based prototype compiler for telescoping languages (January 2009)
Scientists want to encode their applications in domain languages with high-level operators that reflect the way they conceptualize computations in their domains. The telescoping-languages approach calls for automatically generating optimizing compilers for these languages by pre-compiling the underlying libraries that define them to generate multiple variants optimized for use in different possible contexts, including different argument types. The resulting compiler replaces calls to the high-level constructs with calls to the optimized variants. This approach aims to automatically derive high-performance executables from programs written in high-level domain-specific languages.
TeleGen is a prototype telescoping-languages compiler that performs type-based specializations. For the purposes of this dissertation, types include any set of variable properties such as intrinsic type, size and array sparsity pattern. Type inference and specialization are cornerstones of the telescoping-languages strategy. Because optimization of library routines must occur before their full calling contexts are available, type inference gives critical information needed to determine which specialized variants to generate as well as how to best optimize each variant to achieve the highest performance. To build the prototype compiler, we developed a precise type-inference algorithm that infers all legal type tuples, or type configurations, for the program variables, including routine arguments, for all legal calling contexts.
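The notion of a "type configuration" can be made concrete with a toy enumeration. This is my own rendition under simplifying assumptions, not TeleGen's algorithm: candidate type tuples for a routine's variables are filtered against each operation's signature, and each surviving tuple names a variant worth specializing.

```python
# Toy type-configuration inference for a routine containing c = a * b,
# over a two-element "type" domain. Every tuple consistent with the
# operator's signature is a legal configuration (a candidate variant).

from itertools import product

TYPES = ("scalar", "matrix")

# Signature table for c = a * b: (type of a, type of b) -> type of c.
MUL = {
    ("scalar", "scalar"): "scalar",
    ("scalar", "matrix"): "matrix",
    ("matrix", "scalar"): "matrix",
    ("matrix", "matrix"): "matrix",
}

def legal_configs():
    """All (a, b, c) type tuples consistent with c = a * b."""
    return [(a, b, c)
            for a, b, c in product(TYPES, repeat=3)
            if MUL.get((a, b)) == c]
```

In the real setting the domain also carries sizes and sparsity patterns, and inference must work interprocedurally, but the output plays the same role: it tells the specializer which variants to generate ahead of time.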
We use the type information inferred by our algorithm to drive specialization and optimization. We demonstrate the practical value of our type-inference algorithm and the type-based specialization strategy in TeleGen.
507
Online social networks: Measurement, analysis, and applications to distributed information systems (January 2009)
Recently, online social networking sites have exploded in popularity. Numerous sites are dedicated to finding and maintaining contacts and to locating and sharing different types of content. Online social networks represent a new kind of information network that differs significantly from existing networks like the Web. For example, in the Web, hyperlinks between content form a graph that is used to organize, navigate, and rank information. The properties of the Web graph have been studied extensively and have led to useful algorithms such as PageRank. In contrast, few links exist between content items in online social networks; instead, links exist between content and users, and between users themselves. However, little is known in the research community about the properties of online social network graphs at scale, the factors that shape their structure, or the ways they can be leveraged in information systems.
In this thesis, we use novel measurement techniques to study online social networks at scale, and use the resulting insights to design new information systems. First, we examine the structure and growth patterns of online social networks, focusing on how users connect to one another. We conduct the first large-scale measurement study of multiple online social networks, capturing information about over 50 million users and 400 million links. Our analysis identifies a common structure across multiple networks, characterizes the underlying processes that are shaping the network structure, and exposes the rich community structure.
Second, we leverage our understanding of the properties of online social networks to design new information systems. Specifically, we build two distinct applications that leverage different properties of online social networks. We present and evaluate Ostra, a novel system for preventing unwanted communication that leverages the difficulty of establishing and maintaining relationships in social networks. We also present, deploy, and evaluate PeerSpective, a system for enhancing Web search using the natural community structure in social networks. Each of these systems has been evaluated on data from real online social networks or in a deployment with real users.
508
Workload shaping for QoS and power efficiency of storage systems (January 2009)
The growing popularity of hosted storage services and shared storage infrastructure in data centers is driving the recent interest in resource management and QoS in storage systems. The bursty nature of storage workloads raises significant performance and provisioning challenges, leading to increased resource requirements, management costs, and energy consumption. We present a novel dynamic workload-shaping framework to handle bursty server workloads, in which the arrival stream is dynamically decomposed to isolate its bursty component, which is then rescheduled to exploit available slack. An optimal decomposition algorithm, RTT, and a recombination algorithm, Miser, make up the scheduling framework. We evaluate this framework using several real-world storage workload traces. The results show that workload shaping (i) dramatically reduces server capacity requirements and power consumption while minimally affecting QoS guarantees, (ii) provides better response-time distributions than traditional scheduling of the non-decomposed workload, and (iii) yields more accurate capacity estimates for multiplexing several clients on a shared server.
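The decomposition step can be sketched with a simple traffic-envelope model. This is an illustrative simplification of my own; the actual RTT and Miser algorithms are defined in the thesis.

```python
# Simplified workload decomposition: requests beyond what a token
# bucket of rate `rho` and depth `sigma` admits are split off as the
# bursty component, to be rescheduled later into available slack.

def decompose(arrivals, rho, sigma):
    """Split arrival times into (smooth, bursty) under a (sigma, rho) envelope."""
    tokens, last_t = sigma, 0.0
    smooth, bursty = [], []
    for t in arrivals:
        tokens = min(sigma, tokens + (t - last_t) * rho)
        last_t = t
        if tokens >= 1.0:
            tokens -= 1.0
            smooth.append(t)             # conforms to the envelope
        else:
            bursty.append(t)             # deferred into later slack
    return smooth, bursty

# Four simultaneous arrivals, then a lull: the burst beyond the
# envelope is isolated for rescheduling.
smooth, bursty = decompose([0, 0, 0, 0, 10], rho=0.1, sigma=2)
```

The server then only needs capacity for the smooth stream's rate rather than the peak burst, which is where the provisioning and power savings come from.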
509
Büchi containment and size-change termination (January 2009)
We compare tools for complementing nondeterministic Büchi automata with a recent termination-analysis algorithm. Complementation of Büchi automata is a well-explored problem in program verification. Early solutions using a Ramsey-based combinatorial argument have been supplanted by rank-based constructions with exponentially better bounds. In 2001, Lee et al. presented the size-change termination (SCT) problem, along with both a reduction to Büchi automata and a Ramsey-based algorithm. This algorithm strongly resembles the initial complementation constructions for Büchi automata, which leads us to wonder whether the theoretical gains in efficiency are mirrored in empirical performance.
We prove that the SCT algorithm is a specialized realization of the Ramsey-based complementation construction. Doing so allows us to generalize SCT solvers to handle Büchi automata. We experimentally demonstrate that, surprisingly, Ramsey-based approaches are superior on the domain of SCT problems, while rank-based approaches dominate on automata universality tests. This reveals several interesting properties of the problem spaces and of both approaches.
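The Ramsey-based SCT algorithm of Lee et al. is compact enough to sketch directly. The version below is my own simplification (it ignores the call-graph structure and treats all graph pairs as composable): take the closure of the size-change graphs under composition, then check that every idempotent graph has a strictly decreasing self-arc.

```python
# Size-change graphs as sets of edges (src, dst, strict), where strict
# means the value strictly decreases along the call. Termination holds
# iff every idempotent graph in the composition closure has an arc
# x -> x that is strict.

def compose(g1, g2):
    """Compose two size-change graphs; an edge is strict if either hop is."""
    out = set()
    for (a, b, s1) in g1:
        for (b2, c, s2) in g2:
            if b == b2:
                out.add((a, c, s1 or s2))
    return frozenset(out)

def sct_terminates(graphs):
    closure = {frozenset(g) for g in graphs}
    while True:                          # Ramsey-based closure computation
        new = {compose(g, h) for g in closure for h in closure} - closure
        if not new:
            break
        closure |= new
    return all(any(a == b and s for (a, b, s) in g)
               for g in closure if compose(g, g) == g)

# One parameter strictly decreasing on every call: terminates.
dec = [{("x", "x", True)}]
# A call that merely copies x never decreases it: not SCT-terminating.
copy = [{("x", "x", False)}]
```

The closure can contain exponentially many graphs, which is exactly the gap the rank-based constructions close in theory, and what makes the observed empirical advantage of the Ramsey-based approach on SCT instances surprising.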
510
VoteBox: A tamper-evident, verifiable voting machine (January 2009)
This thesis details the design and implementation of VoteBox, a new software platform for building and evaluating secure and reliable electronic voting machines. Current electronic voting systems have experienced many high-profile software, hardware, and usability failures in real elections. Recent research has revealed systemic flaws in commercial voting systems that can cause malfunctions, lose votes, and possibly even allow outsiders to influence the outcome of a national election. These failures and flaws cast doubt on the accuracy of elections conducted with electronic systems and threaten to undermine public trust in the electoral system.
While some consequently argue for total abandonment of electronic voting, VoteBox shows how a combination of security, distributed systems, and cryptographic principles can yield trustworthy and usable voting systems. It employs a pre-rendered user interface to reduce the size of the runtime system that must be absolutely trusted. VoteBox machines keep secure logs of essential election events, allowing credible audits during or after the election; they are connected using the Auditorium, a novel peer-to-peer network that replicates and intertwines secure logs in order to survive failure, attack, and poll worker error. While the election is ongoing, any voter may choose to challenge a VoteBox to immediately produce cryptographic proof that it will correctly and faithfully cast ballots.
This work uniquely demonstrates how these disparate approaches can be used in concert to increase assurance in a voting system; the resulting design also offers a number of pragmatic benefits that can help reduce the frequency and impact of poll worker or voter errors. VoteBox is a model for new implementations, but its component techniques can be practically applied to existing systems. VoteBox ideas should therefore find their way into commercial electronic voting machines as well as other problem domains in which tamper-evidence, robustness, and verifiability are crucial.
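The tamper-evidence property of secure logs rests on hash chaining, which can be shown in a few lines. This is an illustrative sketch in the spirit of VoteBox's logs, not the Auditorium protocol itself; the event names are invented.

```python
# Minimal hash-chained log: each entry commits to the previous entry's
# hash, so altering any past record breaks every later link and the
# tampering becomes evident on audit.

import hashlib

GENESIS = "0" * 64

def append(log, event):
    prev = log[-1][1] if log else GENESIS
    digest = hashlib.sha256((prev + event).encode()).hexdigest()
    log.append((event, digest))

def audit(log):
    """Recompute the chain; any mismatch means the log was tampered with."""
    prev = GENESIS
    for event, digest in log:
        if hashlib.sha256((prev + event).encode()).hexdigest() != digest:
            return False
        prev = digest
    return True

log = []
for e in ["poll-open", "ballot-cast", "poll-close"]:
    append(log, e)
ok_before = audit(log)
log[1] = ("ballot-altered", log[1][1])   # tamper with a past record
ok_after = audit(log)
```

Replicating and intertwining such chains across machines, as the Auditorium does, means an attacker must consistently rewrite every copy to escape detection.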