Global ETD Search

1	Balancing Money and Time for OLAP Queries on Cloud Databases
Sabih, Rafia, January 2016
Enterprise Database Management Systems (DBMSs) have to contend with resource-intensive and time-varying workloads, which makes them well-suited candidates for migration to cloud platforms: they can dynamically leverage resource elasticity while retaining affordability through the pay-as-you-go rental interface. The current design of database engine components emphasizes maximizing computing efficiency, but to fully capitalize on the cloud's benefits, the outlays of these computations also need to be factored into the planning exercise. In this thesis, we investigate this contemporary problem in the context of industrial-strength deployments of relational database systems on real-world cloud platforms. Specifically, we consider how the traditional metric used to compare query execution plans, namely response-time, can be augmented to incorporate monetary costs in the decision process. The challenge here is that execution-time and monetary cost are adversarial metrics, with a decrease in one entailing a rise in the other. For instance, a Virtual Machine (VM) with rich physical resources (RAM, cores, etc.) decreases the query response-time, but is expensive with regard to rental rates. In a nutshell, there is a tradeoff between money and time, and our goal therefore is to identify the VM that offers the best tradeoff between these two competing considerations. In our study, we profile the behavior of money versus time for a given query, and define the best tradeoff as the "knee", that is, the location on the profile with the minimum Euclidean distance from the origin. To study the performance of industrial-strength database engines on real-world cloud infrastructure, we have deployed a commercial DBMS on Google cloud services.
On this platform, we have carried out extensive experimentation with the TPC-DS decision-support benchmark, an industry-wide standard for evaluating database system performance. Our experiments demonstrate that the choice of VM for hosting the database server is a crucial decision, because: (i) the variation in time and money across VMs is significant for a given query, and (ii) no one VM offers the best money-time tradeoff across all queries. To efficiently identify the VM with the best tradeoff from a large suite of available configurations, we propose a technique to characterize the money-time profile for a given query. The core of this technique is a VM pruning mechanism that exploits the fact that the VMs form a partially ordered set (poset) with respect to their resources. It processes the minimal and maximal VMs of this poset to obtain their estimated query response-times. If the response-times on these extreme VMs are similar, then all the VMs sandwiched between them are pruned from further consideration. Otherwise, the already processed VMs are set aside, and the minimal and maximal VMs of the remaining unprocessed VMs are evaluated for their response-times. Finally, the knee VM is identified from the processed VMs as the one with the minimum Euclidean distance from the origin in the money-time space. We theoretically prove that this technique always identifies the knee VM; further, if it is acceptable to find a "near-optimal" knee by providing a relaxation-factor on the response-time distance from the optimal knee, then it can find a satisfactory knee more efficiently under these relaxed conditions. We propose two flavors of this approach: the first prunes the VMs using complete plan information received from the database engine API, and is named Plan-based Identification of Knee (PIK).
To further increase the efficiency of identifying the knee VM, we also propose a sub-plan-based pruning algorithm called Sub-Plan-based Identification of Knee (SPIK), which requires modifications to the query optimizer. We have evaluated PIK on a commercial system and found that it often requires processing only 20% of the total VMs. The efficiency of the algorithm increases significantly further when a 10-20% relaxation in response-time is allowed. To evaluate SPIK, we prototyped it on an open-source engine, PostgreSQL 9.3, and also implemented it as a Java wrapper program with the commercial engine. Experimentally, the processing done by SPIK is found to be only 40% of that of the PIK approach. Therefore, from an overall perspective, this thesis facilitates the desired migration of enterprise databases to cloud platforms by identifying the VM(s) that offer competitive tradeoffs between money and time for the given query.
Database Management System (DBMS); Virtual Machine; Google Cloud Services; Cloud Platforms; Cloud Databases; Cloud Query Processing Model; Plan-based Identification of Knee (PIK); Knee VM; Computational and Data Sciences
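
The pruning loop and knee selection described in the abstract above can be illustrated with a small Python sketch. This is not the thesis's PIK or SPIK implementation. It assumes that a VM is described only by cores and RAM, that rental price does not decrease as resources grow, that money is response-time multiplied by a per-unit-time rate, and that "similar" response-times means a relative difference within a `relaxation` parameter. The names `VM`, `below`, `find_knee`, `estimate_time` and the toy catalogue are all hypothetical.

```python
from math import hypot
from typing import Callable, Dict, Iterable, NamedTuple, Set


class VM(NamedTuple):
    name: str
    cores: int
    ram_gb: int
    rate: float  # hypothetical rental price per unit of time


def below(a: VM, b: VM) -> bool:
    """True if a has no more of any resource than b and differs from b."""
    return (a.cores <= b.cores and a.ram_gb <= b.ram_gb
            and (a.cores, a.ram_gb) != (b.cores, b.ram_gb))


def find_knee(vms: Iterable[VM],
              estimate_time: Callable[[VM], float],
              relaxation: float = 0.0) -> VM:
    """Simplified sketch: evaluate extreme VMs, prune the ones in between."""
    remaining: Set[VM] = set(vms)
    times: Dict[VM, float] = {}
    while remaining:
        # Minimal and maximal elements of the poset of unprocessed VMs.
        lows = [v for v in remaining if not any(below(u, v) for u in remaining)]
        highs = [v for v in remaining if not any(below(v, u) for u in remaining)]
        for v in set(lows + highs):
            times[v] = estimate_time(v)  # e.g. an optimizer estimate
        remaining.difference_update(times)
        # If two extremes have similar times, drop every VM sandwiched between.
        for lo in lows:
            for hi in highs:
                if abs(times[lo] - times[hi]) <= relaxation * times[hi]:
                    remaining -= {v for v in remaining
                                  if below(lo, v) and below(v, hi)}
    # Knee: processed VM closest to the origin in (time, money) space.
    return min(times, key=lambda v: hypot(times[v], times[v] * v.rate))


# Toy catalogue and estimator, invented purely for illustration.
catalogue = [
    VM("small", 1, 4, 1.0),
    VM("mid_a", 2, 8, 2.0),
    VM("mid_b", 4, 8, 3.0),
    VM("mid_c", 2, 16, 2.5),
    VM("large", 8, 32, 8.0),
]
knee = find_knee(catalogue, lambda vm: 10 + 80 / vm.cores)
print(knee.name)  # mid_b
```

Because time and money are in different units, the distance depends on how the two axes are scaled; the sketch uses raw values and leaves any normalization as an open choice.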
2	En jämförelse i kostnad och prestanda för molnbaserad datalagring / A comparison in cost and performance for cloud-based data storage
Burgess, Olivia; Oucif, Sara, January 2024
As data volumes grow and the demands for scalability and availability within cloud services increase, the need for studies of their performance and cost-effectiveness becomes more pressing.
These analyses are crucial for optimizing services and for giving businesses the recommendations they need to make well-grounded decisions about cloud data storage. This thesis examines the cost and performance of relational and non-relational data storage solutions implemented on Microsoft Azure and Google Cloud Platform. The tool Hyperfine is used to measure latency, and the cost efficiency of each cloud service is calculated from this result together with its estimated monthly cost. For the relational services evaluated, the results indicate that Azure SQL Database initially exhibits low latency, which then increases proportionally with the data volume, while Google Cloud SQL shows slightly higher latency at smaller data volumes but more consistent latency at larger ones. Azure SQL Database is more cost-effective, making it a more favorable option than Google Cloud SQL for companies seeking high performance at lower cost. For the non-relational services evaluated, Azure Cosmos DB consistently demonstrates lower latency and superior cost efficiency compared to Google Cloud Datastore, making it the preferred solution for companies prioritizing economic efficiency in their database management.
Azure Cosmos DB; Azure SQL Database; Google Cloud Datastore; Google Cloud SQL; Cloud databases; Cloud services; Cost efficiency; NoSQL; Performance; SQL; Computer Sciences
