Global ETD Search

1	Workload Management for Data-Intensive Services. Lim, Harold Vinson Chao. January 2013.

Data-intensive web services are typically composed of three tiers: i) a display tier that interacts with users and serves rich content to them, ii) a storage tier that stores the user-generated or machine-generated data used to create this content, and iii) an analytics tier that runs data analysis tasks in order to create and optimize new content. Each tier has different workloads and requirements that result in a diverse set of systems being used in modern data-intensive web services.

Servers are provisioned dynamically in the display tier to ensure that interactive client requests are served as per the latency and throughput requirements. The challenge is not only deciding automatically how many servers to provision but also when to provision them, while ensuring stable system performance and high resource utilization. To address these challenges, we have developed a new control policy for provisioning resources dynamically in coarse-grained units (e.g., adding or removing servers or virtual machines in cloud platforms). Our new policy, called proportional thresholding, converts a user-specified performance target value into a target range in order to account for the relative effect of provisioning a server on the overall workload performance.

The storage tier is similar to the display tier in some respects, but poses the additional challenge of needing redistribution of stored data when new storage nodes are added or removed. Thus, there will be some delay before the effects of changing a resource allocation appear. Moreover, redistributing data can interfere with the current workload because it uses resources that could otherwise be used for processing requests. We have developed a system, called Elastore, that addresses the new challenges found in the storage tier. Elastore not only coordinates resource allocation and data redistribution to preserve stability during dynamic resource provisioning, but it also finds the best tradeoff between workload interference and data redistribution time.

The workload in the analytics tier consists of data-parallel workflows that can either be run in a batch fashion or continuously as new data becomes available. Each workflow is composed of smaller units that have producer-consumer relationships based on data. These workflows are often generated from declarative specifications in languages like SQL, so there is a need for a cost-based optimizer that can generate an efficient execution plan for a given workflow. There are a number of challenges when building a cost-based optimizer for data-parallel workflows, which include characterizing the large execution plan space, developing cost models to estimate the execution costs, and efficiently searching for the best execution plan. We have built two cost-based optimizers: Stubby for batch data-parallel workflows running on MapReduce systems, and Cyclops for continuous data-parallel workflows where the choice of execution system is made a part of the execution plan space.

We have conducted a comprehensive evaluation that shows the effectiveness of each tier's automated workload management solution.

Dissertation. Computer science; Automated Control; Data-Intensive Services; Optimization; Workload Management
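
The abstract above describes proportional thresholding only in words, so the following is a minimal sketch of one plausible reading, not the dissertation's actual formula. It assumes the performance metric is average per-server utilization, that load is spread evenly across servers, and that servers are added or removed one at a time. The function names `target_range` and `next_server_count` are hypothetical. The point it illustrates is that one server is a large fraction of a small cluster, so the lower bound of the target range must sit further below the target when few servers are running.

```python
def target_range(target_high: float, servers: int) -> tuple[float, float]:
    """Turn a single utilization target into a (low, high) range.

    Assumption (not from the source): with n servers at average
    utilization u, removing one raises the average to about u * n / (n - 1).
    Scaling in is therefore only safe while u <= target_high * (n - 1) / n.
    """
    if servers < 1:
        raise ValueError("need at least one server")
    if servers == 1:
        # The last server is never removed, so the range extends down to 0.
        return 0.0, target_high
    return target_high * (servers - 1) / servers, target_high


def next_server_count(utilization: float, servers: int, target_high: float) -> int:
    """Add a server above the range, remove one below it, else hold steady."""
    low, high = target_range(target_high, servers)
    if utilization > high:
        return servers + 1
    if utilization < low:
        return servers - 1
    return servers


if __name__ == "__main__":
    # The range narrows as the cluster grows: each server matters less.
    for n in (2, 4, 10):
        low, high = target_range(0.6, n)
        print(f"{n:2d} servers: keep utilization in [{low:.2f}, {high:.2f}]")
```

A fixed lower threshold would either oscillate on small clusters (removing a server immediately overshoots the target) or waste capacity on large ones; tying the lower bound to the server count is one way to avoid both under the assumptions stated above.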
2	Advancing information privacy concerns evaluation in personal data intensive services. Rohunen, A. (Anna). 04 December 2019.

Abstract

When personal data are collected and utilised to produce personal data intensive services, users of these services are exposed to the possibility of privacy losses. Users’ information privacy concerns may lead to non-adoption of new services and technologies, affecting the quality and the completeness of the collected data. These issues make it challenging to fully reap the benefits brought by the services. The evaluation of information privacy concerns makes it possible to address these concerns in the design and the development of personal data intensive services. This research investigated how privacy concerns evaluations should be developed to make them valid in the evolving data collection contexts. The research was conducted in two phases: employing a mixed-method research design and using a literature review methodology. In Phase 1, two empirical studies were conducted, following a mixed-method exploratory sequential design. In both studies, the data subjects’ privacy behaviour and privacy concerns that were associated with mobility data collection were first explored qualitatively, and quantitative instruments were then developed based on the qualitative results to generalise the findings. Phase 2 was planned to provide an extensive view on privacy behaviour and some possibilities to develop privacy concerns evaluation in new data collection contexts. Phase 2 consisted of two review studies: a systematic literature review of privacy behaviour models and a review of the EU data privacy legislation changes. The results show that in evolving data collection contexts, privacy behaviour and concerns have characteristics that differ from earlier ones. Privacy concerns have aspects specific to these contexts, and their multifaceted nature appears emphasised.
Because privacy concerns are related to other privacy behaviour antecedents, it may be reasonable to incorporate some of these antecedents into evaluations. The existing privacy concerns evaluation instruments serve as valid starting points for evaluations in evolving personal data collection contexts. However, these instruments need to be revised and adapted to the new contexts. The development of privacy concerns evaluation may be challenging due to the incoherence of the existing privacy behaviour research. More overarching research is called for to facilitate the application of the existing knowledge.

Tiivistelmä (Finnish abstract, translated)

When personal data are collected and used to produce data-intensive services, the privacy of the services' users may be weakened. Users' privacy concerns can slow the adoption of new services and technologies and affect the quality and coverage of the collected data. This makes it harder to take full advantage of the services. Evaluating privacy concerns makes it possible to take them into account in the design and development of services based on personal data. This research examined how the evaluation of privacy concerns should be developed in changing data collection environments. The two-phase research first carried out an empirical mixed-method study and then a systematic literature study. In the first phase, two empirical studies were conducted following the exploratory sequential design of mixed-method research. These studies first used qualitative methods to examine privacy behaviour and privacy concerns in the collection of mobility data. Based on the qualitative results, quantitative instruments were developed to study the generalisability of the findings. In the second phase, two review-type studies were carried out to obtain a comprehensive view of privacy behaviour and of the possibilities for developing privacy concerns evaluation in new data collection environments. These were a systematic literature review of privacy behaviour models and a review of changes in EU data protection legislation. The results show that in evolving data collection environments, privacy behaviour and privacy concerns differ from those in earlier environments. These environments give rise to privacy concerns specific to them, and the multifaceted nature of the concerns is emphasised. Because privacy concerns are linked to other variables that predict privacy behaviour, it may be appropriate to include these variables in evaluations as well. Existing privacy concerns evaluation instruments are a justified starting point for evaluation in evolving data collection environments too, but they must be adapted to suit the new environments. Developing the evaluation can be challenging, because earlier privacy research is fragmented. For that research to be applied properly in developing evaluations, it must move in a more holistic direction.

information privacy; mixed-method research; personal data; personal data intensive services; privacy behaviour; privacy concerns; privacy concerns evaluation

Finnish keywords (translated): data-intensive services; personal data; mixed-method research; data protection; privacy concerns; privacy concerns evaluation; privacy behaviour
