41 |
Dynamic Binding of Names in Calculi for Mobile Processes
Vivas Frontana, Jose Luis January 2001
No description available.
|
42 |
Performance Isolation in Cloud Storage Systems
Singh, Akshay K. 09 1900
Cloud computing enables data centres to provide resource sharing across multiple tenants.
This sharing, however, usually comes at a cost in the form of reduced isolation
between tenants, which can lead to inconsistent and unpredictable performance. This variability
in performance becomes an impediment for clients whose services rely on consistent,
responsive performance in cloud environments. The problem is exacerbated for applications
that rely on cloud storage systems, as performance in these systems is affected by disk
access times, which often dominate overall request service times for these types of data
services.
In this thesis we introduce MicroFuge, a new distributed caching and scheduling middleware
that provides performance isolation for cloud storage systems. To provide performance
isolation, MicroFuge's cache eviction policy is tenant- and deadline-aware, which
enables the provision of isolation to tenants and ensures that data for queries with more
urgent deadlines, which are most likely to be affected by competing requests, are less likely
to be evicted than data for other queries. MicroFuge also provides simplified, intelligent
scheduling in addition to request admission control, which uses a performance model of the
underlying storage system to reject requests whose deadlines are unlikely to be satisfied.
The middleware approach of MicroFuge distinguishes it from other systems that
provide performance isolation in cloud storage systems. Rather than providing performance
isolation for one particular cloud storage system, MicroFuge can be deployed on top of
any already deployed storage system without modifying it. Given the wide
spectrum of cloud storage systems available today, this approach makes MicroFuge easy
to adopt.
In this thesis, we show that MicroFuge can provide significantly better performance
isolation between tenants with different latency requirements than Memcached and, with
admission control enabled, can ensure that more than a certain percentage of requests meet
their deadlines.
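The admission-control idea described above can be illustrated with a minimal sketch. It is not MicroFuge's implementation: the class name `AdmissionController`, the fixed per-request service time, and the queue-length latency model are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AdmissionController:
    """Hypothetical model: every queued request costs a fixed service time."""

    service_time_ms: float
    pending: int = 0  # requests admitted but not yet completed

    def estimated_latency_ms(self) -> float:
        # Requests already queued, plus the new request itself.
        return (self.pending + 1) * self.service_time_ms

    def try_admit(self, deadline_ms: float) -> bool:
        # Reject when the model predicts the deadline would be missed.
        if self.estimated_latency_ms() > deadline_ms:
            return False
        self.pending += 1
        return True

    def complete(self) -> None:
        # Called when an admitted request finishes.
        self.pending = max(0, self.pending - 1)
```

With a 10 ms service time, a first request with a 15 ms deadline is admitted, while a second one with the same deadline is rejected because the model now predicts 20 ms.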
|
43 |
Diagnosing performance changes in distributed systems by comparing request flows
Sambasivan, Raja R. 01 May 2013
Diagnosing performance problems in modern datacenters and distributed systems is challenging, as the root cause could be contained in any one of the system’s numerous components or, worse, could be a result of interactions among them. As distributed systems continue to increase in complexity, diagnosis tasks will only become more challenging. There is a need for a new class of diagnosis techniques capable of helping developers address problems in these distributed environments.
As a step toward satisfying this need, this dissertation proposes a novel technique, called request-flow comparison, for automatically localizing the sources of performance changes from the myriad potential culprits in a distributed system to just a few potential ones. Request-flow comparison works by contrasting the workflow of how individual requests are serviced within and among every component of the distributed system between two periods: a non-problem period and a problem period. By identifying and ranking performance-affecting changes, request-flow comparison provides developers with promising starting points for their diagnosis efforts. Request workflows are obtained with less than 1% overhead via use of recently developed end-to-end tracing techniques.
To demonstrate the utility of request-flow comparison in various distributed systems, this dissertation describes its implementation in a tool called Spectroscope and describes how Spectroscope was used to diagnose real, previously unsolved problems in the Ursa Minor distributed storage service and in select Google services. It also explores request-flow comparison’s applicability to the Hadoop File System. Via a 26-person user study, it identifies effective visualizations for presenting request-flow comparison’s results and further demonstrates that request-flow comparison helps developers quickly identify starting points for diagnosis. This dissertation also distills design choices that will maximize an end-to-end tracing infrastructure’s utility for diagnosis tasks and other use cases.
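A rough sketch of the comparison step may help make the idea concrete. This is not Spectroscope's algorithm: the function name `rank_flow_changes`, the representation of a request flow as a tuple of component names with a latency, and the ranking score (mean latency increase times request count) are assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean


def rank_flow_changes(baseline, problem):
    """Rank request flows by how much extra latency they add in the problem period.

    Both arguments are lists of (path, latency_ms) pairs, where path is a
    tuple of component names visited by one request (a hypothetical format).
    """

    def group(flows):
        by_path = defaultdict(list)
        for path, latency in flows:
            by_path[path].append(latency)
        return by_path

    before, after = group(baseline), group(problem)
    changes = []
    for path, latencies in after.items():
        if path not in before:
            continue  # flows with no baseline counterpart are skipped in this sketch
        delta = mean(latencies) - mean(before[path])
        # Weight the per-request slowdown by how many requests took this path.
        changes.append((path, delta * len(latencies)))
    changes.sort(key=lambda change: change[1], reverse=True)
    return changes
```

The highest-ranked paths are the candidate starting points for diagnosis; a real tool would also need to handle flows whose structure changed between the two periods.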
|
44 |
DistNeo4j: Scaling Graph Databases through Dynamic Distributed Partitioning
Nicoara, Daniel 14 October 2014
Social networks are large graphs which require multiple servers to store and manage them. Providing performant, scalable systems that store these graphs by partitioning them into subgraphs is an important issue. In such systems each partition is hosted by a server to satisfy multiple objectives. These objectives include balancing server loads, reducing remote traversals (number of edges cut), and adapting the partitioning to changes in the structure of the graph in the face of changing workloads. To address these issues, a dynamic repartitioning algorithm is required to modify an existing partitioning to maintain good quality partitions. Such a repartitioner should not impose a significant overhead on the system. This thesis introduces a greedy repartitioner, which dynamically modifies a partitioning using a small amount of resources. In contrast to the existing repartitioning algorithms, the greedy repartitioner is performant (in terms of time and memory), making it suitable for use in a real system. The greedy repartitioner is integrated into DistNeo4j, which is designed as an extension of the open source Neo4j graph database system, to support workloads over partitioned graph data distributed over multiple servers. Using real-world data sets, this thesis shows that DistNeo4j leverages the greedy repartitioner to maintain high quality partitions and provides a 2 to 3 times performance improvement over the de facto hash-based partitioning.
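To illustrate what a greedy repartitioning pass might look like, here is a small sketch. It is not the thesis's algorithm: the function names, the move rule (relocate a vertex to the partition holding strictly more of its neighbours), and the `max_size` capacity limit are assumptions made for illustration.

```python
from collections import Counter


def greedy_repartition_step(adjacency, assignment, max_size):
    """Run one greedy pass over all vertices and return the number of moves.

    adjacency maps each vertex to a list of neighbours (symmetric);
    assignment maps each vertex to a partition id and is updated in place.
    """
    sizes = Counter(assignment.values())
    moved = 0
    for vertex, neighbours in adjacency.items():
        current = assignment[vertex]
        counts = Counter(assignment[n] for n in neighbours)
        if not counts:
            continue  # isolated vertex: nothing to gain by moving it
        target, target_count = counts.most_common(1)[0]
        # Move only if it strictly reduces cut edges and keeps the load balanced.
        if (
            target != current
            and target_count > counts.get(current, 0)
            and sizes[target] < max_size
        ):
            assignment[vertex] = target
            sizes[current] -= 1
            sizes[target] += 1
            moved += 1
    return moved


def edge_cut(adjacency, assignment):
    """Count edges whose endpoints sit in different partitions."""
    crossing = sum(
        1
        for vertex, neighbours in adjacency.items()
        for n in neighbours
        if assignment[vertex] != assignment[n]
    )
    return crossing // 2  # each undirected edge is seen from both ends
```

Each pass uses only local neighbour counts and partition sizes, which is the kind of low-overhead decision the abstract attributes to a greedy approach.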
|
45 |
Coding and Maintenance Strategies for Cloud Storage: Correlated Failures, Mobility and Architecture Awareness
Calis, Gokhan January 2017
As a result of ever-growing data and recent interest in storing and analyzing it, distributed storage systems (DSS), which are also known as cloud storage, have become one of the most important research areas in the literature. Not only are such systems used as backbone systems for companies like Google, Microsoft and Facebook, but they have also accelerated the growth of cloud computing, which is an essential business line for institutions such as IBM, Amazon and Salesforce. In this dissertation, the focus is on the storage side of the cloud in order to address the important questions in designing such systems. First, a coding-theoretic approach is taken to handle correlated failures of multiple storage nodes. In particular, this dissertation studies distributed storage systems that can provide resilience against correlated failure patterns that affect the availability of multiple storage nodes, e.g., a power loss that may affect multiple disks. The maximum file size that can be stored in such DSS is studied and then optimal code constructions are provided. This dissertation also studies cloud storage systems that prevent data loss from mixed failure patterns of disks and sectors in disk drives. Specifically, a general code construction is proposed to overcome such failures for any given parameter set. Because the proposed construction requires a large field size, a relaxation on the efficiency of the storage system is considered to provide codes with smaller field sizes. Maintenance of cloud storage systems is also studied. To that end, this dissertation first studies the maintenance of DSS that include a backup node, which is called hierarchical DSS. Hierarchical DSS can model cellular networks such as femtocells as well as caching in wireless networks. In particular, we present an upper bound on the file size that can be stored over hierarchical DSS and propose optimal code constructions. Then, maintenance cost and data access cost for users of such DSS are studied.
Lastly, mobility effects of cloud storage over wireless devices are studied. Specifically, an analysis is performed of a mobile cloud storage system that initiates the maintenance process once a certain number of devices remain in the network, and different maintenance strategies are proposed that are optimal with respect to average cost in certain mobility regimes.
|
46 |
Distributed Crawling of Rich Internet Applications
Mir Taheri, Seyed Mohammad January 2015
Web crawlers visit internet applications, collect data, and learn about new web pages from visited pages. Web crawlers have a long and interesting history. The quick expansion of the web and the complexity added to web applications have made the process of crawling a very challenging one. Different solutions have been proposed to reduce the time and cost of crawling. A new generation of web applications, known as Rich Internet Applications (RIAs), poses major challenges to web crawlers. RIAs shift a portion of the computation to the client side. Shifting a portion of the application to the client browser influences the web crawler in two ways: First, the one-to-one correlation between the URL and the state of the application, which exists in traditional web applications, is broken. Second, reaching a state of the application is no longer a simple operation of navigating to the target URL, but often means navigating to a seed URL and executing a chain of events from it. Due to these challenges, crawling a RIA can take a prohibitively long time. This thesis studies applying distributed computing and parallel processing principles to the field of RIA crawling to reduce the time. We propose different algorithms to concurrently crawl a RIA over several nodes. The proposed algorithms are used as a building block to construct a distributed crawler of RIAs. The different algorithms proposed represent different trade-offs between communication and computation. This thesis explores how these trade-offs affect the time it takes to crawl RIAs. We study the cost of running a distributed RIA crawl with a client-server architecture and compare it with a peer-to-peer architecture. We further study the distribution of different crawling strategies, namely: Breadth-First search, Depth-First search, Greedy algorithm, and Probabilistic algorithm. To measure the effect of different design decisions in practice, a prototype of each algorithm is implemented.
The implemented prototypes are used to obtain empirical performance measurements and to refine the algorithms. The final refined algorithm is used for experimentation with a wide range of applications under different circumstances. Finally, this thesis includes two theoretical studies of load balancing algorithms and distributed component-based crawling and sets the stage for future studies.
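The point that a RIA state is reached by a chain of events from a seed URL, rather than by a URL of its own, can be sketched as follows. This is an illustrative toy, not one of the thesis's algorithms: the application is modelled as a hypothetical dictionary from state id to `{event: next_state}`, and `owner` shows one assumed way of assigning states to crawler nodes with a stable hash.

```python
import zlib
from collections import deque


def owner(state_id, num_nodes):
    """Assign a state to a crawler node with a hash that is stable across processes."""
    return zlib.crc32(state_id.encode("utf-8")) % num_nodes


def crawl_breadth_first(app, seed_state):
    """Return, for every reachable state, the event chain that reaches it from the seed.

    app is a hypothetical model of a RIA: a dict mapping each state id to a
    dict of {event_name: next_state_id}.
    """
    paths = {seed_state: []}
    queue = deque([seed_state])
    while queue:
        state = queue.popleft()
        for event, next_state in app[state].items():
            if next_state not in paths:
                # The new state is only reachable by replaying this event chain.
                paths[next_state] = paths[state] + [event]
                queue.append(next_state)
    return paths
```

In a distributed setting, each node would explore only the states for which `owner` returns its own index and forward newly discovered states to their owners, which is where the communication versus computation trade-off arises.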
|
47 |
Collaborative detection of cyberbullying behavior in Twitter data
Mangaonkar, Amrita January 2017
Indiana University-Purdue University Indianapolis (IUPUI) / As the size of Twitter© data is increasing, so are undesirable behaviors of its users. One such undesirable behavior is cyberbullying, which could lead to catastrophic consequences. Hence, it is critical to efficiently detect cyberbullying behavior by analyzing tweets, in real-time if possible. Prevalent approaches to identifying cyberbullying are mainly stand-alone, and thus, are time-consuming. This thesis proposes a new approach, called the distributed-collaborative approach, for cyberbullying detection. It contains a network of detection nodes, each of which is independent and capable of classifying tweets it receives. These detection nodes collaborate with each other in case they need help in classifying a given tweet. The study empirically evaluates various collaborative patterns, and it assesses the performance of each pattern in detail. Results indicate an improvement in recall and precision of the detection mechanism over the stand-alone paradigm. Further, this research analyzes scalability of the approach by increasing the number of nodes in the network. The empirical results obtained from experimentation show that the system is scalable. The study also includes experiments that analyze the behavior of the distributed-collaborative approach in case of failures in the system. Additionally, the thesis tests this approach on a different domain, such as politics, to explore the possibility of the generalization of results.
|
48 |
ARTS and CRAFTS: Predictive Scaling for Request-Based Services in the Cloud
Guenther, Andrew 01 June 2014
Modern web services can see well over a billion requests per day. Data and services at such scale require advanced software and large amounts of computational resources to process requests in reasonable time. Advancements in cloud computing now allow us to acquire additional resources faster than in traditional capacity planning scenarios. Companies can scale systems up and down as required, allowing them to meet the demand of their customers without having to purchase their own expensive hardware. Unfortunately, these now-routine scaling operations remain a primarily manual task. To solve this problem, we present CRAFTS (Cloud Resource Anticipation For Timing Scaling), a system for automatically identifying application throughput and predictively scaling cloud computing resources based on historical data. We also present ARTS (Automated Request Trace Simulator), a request-based workload generation tool for constructing diverse and realistic request patterns for modern web applications. ARTS allows us to evaluate CRAFTS' algorithms on a wide range of scenarios. In this thesis, we outline the design and implementation of both ARTS and CRAFTS and evaluate the effectiveness of various prediction algorithms applied to real-world request data and artificial workloads generated by ARTS.
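As a simple illustration of predictive scaling from historical data, the sketch below forecasts the next request rate with a moving average and converts it into an instance count. It is not one of the prediction algorithms evaluated in the thesis: the function names, the moving-average forecast, and the `headroom` safety margin are assumptions made for illustration.

```python
import math


def forecast_next(history, window=3):
    """Predict the next request rate as the mean of the last `window` samples."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def instances_needed(predicted_rps, per_instance_rps, headroom=0.2):
    """Instances required to serve the predicted load plus a safety margin.

    per_instance_rps is the measured throughput of one instance; headroom is
    an assumed fractional buffer against under-prediction.
    """
    required = predicted_rps * (1 + headroom) / per_instance_rps
    return max(1, math.ceil(required))
```

For example, a history of 100, 200, 300 and 400 requests per second gives a forecast of 300; with instances that each handle 100 requests per second and 20% headroom, four instances would be provisioned.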
|
49 |
PRIMA - Privilege Management and Authorization in Grid Computing Environments
Lorch, Markus 28 April 2004
Computational grids and other heterogeneous, large-scale distributed systems require more powerful and more flexible authorization mechanisms to realize fine-grained access-control of resources. Computational grids are increasingly used for collaborative problem-solving and advanced science and engineering applications. Usage scenarios for advanced grids require support for small, dynamic working groups, direct delegation of access privileges among users, procedures for establishing trust relationships without requiring organizational level agreements, precise management by individuals of their privileges, and retention of authority by resource providers. Existing systems fail to provide the necessary flexibility and granularity to support these scenarios. The reasons include the overhead imposed by required administrator intervention, coarse granularity that only allows for all-or-nothing access control decisions, and the inability to implement finer-grained access control without requiring trusted application code.
PRIMA, the model and system developed in this research, focuses on management and enforcement of fine-grained privileges. The PRIMA model introduces novel approaches that can be used in place of, or in combination with, existing access control mechanisms. PRIMA enables the users of a system to manage access to their own assets directly without the need for, and costs of, intervention by technical personnel. System administrators benefit from more flexible and fine-grained definition of access privileges and policies. A novel access control decision and enforcement model with support for legacy applications has been developed. The model uses on-demand account leasing and implements expressive enforcement mechanisms built on existing low-overhead security primitives of the operating systems. The combination of the PRIMA components constitutes a comprehensive security model that facilitates highly dynamic authorization scenarios and increases security through least privilege access to resources. In summary, PRIMA mechanisms enable the use of fine-grained access rights, reduce administrative costs to resource providers, enable ad-hoc and dynamic collaboration scenarios, and provide improved security service to long-lived grid communities. / Ph. D.
|
50 |
Rich Cloud-based Web Applications with Cloudbrowser 2.0
Pan, Xiaozhong 21 June 2015
When developing web applications using traditional methods, developers need to partition the application logic between client side and server side, then implement these two parts separately (often using two different programming languages) and write the communication code to synchronize the application's state between the two parts. CloudBrowser is a server-centric web framework that eliminates this need for partitioning applications entirely. In CloudBrowser, the application code is executed in server-side virtual browsers which preserve the application's presentation state. The client web browsers act like rendering devices, fetching and rendering the presentation state from the virtual browsers. The client-server communication and user interface rendering are implemented by the framework under the hood. CloudBrowser applications are developed in a way similar to regular web pages, using no more than HTML, CSS and JavaScript. Since the user interface state is preserved, the framework also provides a continuous experience for users, who can disconnect from the application at any time and reconnect to pick up where they left off.
The original implementation of CloudBrowser was single-threaded and supported deployment on only one process. We implemented CloudBrowser 2.0, a multi-process implementation of CloudBrowser. CloudBrowser 2.0 can be deployed on a cluster of servers as well as a single multi-core server. It distributes the virtual browsers to multiple processes and dispatches client requests to the associated virtual browsers. CloudBrowser 2.0 also refines the CloudBrowser application deployment model to make the framework a PaaS platform. The developers can develop and deploy different types of applications and the platform will automatically scale them to multiple servers. / Master of Science
|