Global ETD Search

11	Adaptyvūs duomenų modeliai projektavime / Adaptive data models in design
Pliuskuvienė, Birutė. 27 June 2008
The dissertation addresses adaptivity problems in the software tools that implement solutions to applied problems whose data are expressed as relational sets. The main objects of research are adaptive data models: a data selection model, a data aggregation model, and a data processing design model. The aim of the work is to create an adaptive technology for designing data processing that makes it possible to select, aggregate, and process data by changing only the parameters of the formal expressions of the adaptive data models that make up the technology. With this technology, one and the same data processing principle is applied to different problems. In other words, the whole system of algorithms and the program modules implementing them can be adapted to solve different applied problems. This reduces the volume and cost of developing new software.
Informatics Engineering; Relation set; Adaptive data models; Data transformation; Algorithmic data dependency
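As a rough illustration of what "changing only the parameters" could look like in practice, the following minimal Python sketch runs two different tasks through one fixed selection-and-aggregation routine. The function `run_model`, its parameter names, and the sample `orders` rows are all hypothetical and are not taken from the dissertation; they only show the general idea of parameter-driven data processing over relational rows.

```python
from collections import defaultdict

def run_model(rows, select, group_key, value_key, aggregate):
    """Select rows, group them by one attribute, and aggregate another.

    Hypothetical helper: the processing logic is fixed, and only the
    parameters (predicate, attribute names, reducer) change per task.
    """
    groups = defaultdict(list)
    for row in rows:
        if select(row):
            groups[row[group_key]].append(row[value_key])
    return {key: aggregate(values) for key, values in groups.items()}

# Invented sample relation, used only for this sketch.
orders = [
    {"region": "north", "year": 2007, "amount": 120},
    {"region": "north", "year": 2008, "amount": 80},
    {"region": "south", "year": 2008, "amount": 50},
    {"region": "south", "year": 2008, "amount": 70},
]

# Task 1: total amount per region for 2008.
totals_2008 = run_model(orders, lambda r: r["year"] == 2008, "region", "amount", sum)

# Task 2: largest single amount per year, using the same routine.
largest_by_year = run_model(orders, lambda r: True, "year", "amount", max)
```

Both tasks reuse the same code path; only the arguments differ, which is the property the abstract attributes to the adaptive models.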
12	Adaptive data models in design / Adaptyvūs duomenų modeliai projektavime
Pliuskuvienė, Birutė. 27 June 2008
The dissertation examines the problem of adapting software that implements solutions to applied problems, where the software's instability is caused by changes in the content and structure of the primary data and in the algorithms of the applied problems being solved. The solution is based on a methodology of adapting models for data expressed as relational sets.
Informatics Engineering; Relation set; Adaptive data models; Data transformation; Algorithmic data dependency
13	Empirically Examining the Roadblocks to the Automatic Parallelization and Analysis of Open Source Software Systems
Alnaeli, Saleh M. 20 April 2015
No description available.
Computer Science; parallelization inhibitors; data dependency; function calls; function pointers; virtual functions; empirical study; static analysis; inter-procedural analysis; automatic parallelization
14	Covering or complete? Discovering conditional inclusion dependencies
Bauckmann, Jana; Abedjan, Ziawasch; Leser, Ulf; Müller, Heiko; Naumann, Felix. January 2012
Data dependencies, or integrity constraints, are used to improve the quality of a database schema, to optimize queries, and to ensure consistency in a database. In recent years, conditional dependencies have been introduced to analyze and improve data quality. In short, a conditional dependency is a dependency with a limited scope defined by conditions over one or more attributes; only the matching part of the instance must adhere to the dependency. In this paper we focus on conditional inclusion dependencies (CINDs). We generalize the definition of CINDs, distinguishing covering and completeness conditions. We present a new use case for such CINDs, showing their value for solving complex data quality tasks. Further, we define quality measures for conditions inspired by precision and recall. We propose efficient algorithms that identify covering and completeness conditions conforming to given quality thresholds. Our algorithms automatically choose not only the condition values but also the condition attributes. Finally, we show that our approach efficiently provides meaningful and helpful results for our use case.
Data Dependency; Conditional Inclusion Dependency; Metadata Discovery; Linked Open Data; Link Discovery; Association Rule Mining; Data processing; Computer science
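To make the idea of a condition-scoped inclusion dependency concrete, here is a minimal Python sketch that scores one candidate condition with a precision-like and a recall-like measure. The function `condition_quality`, the sample rows, and the exact form of both measures are assumptions made for illustration; the paper's own definitions of its quality measures may differ.

```python
def condition_quality(dependent_rows, referenced_values, attr, condition):
    """Score a candidate condition for an inclusion dependency on `attr`.

    Illustrative measures (assumed, not the paper's definitions):
      precision-like: share of rows matching the condition whose `attr`
                      value appears in the referenced set
      recall-like:    share of all included rows that the condition selects
    """
    rows = list(dependent_rows)
    included = [r for r in rows if r[attr] in referenced_values]
    selected = [r for r in rows if condition(r)]
    hits = [r for r in selected if r[attr] in referenced_values]
    precision = len(hits) / len(selected) if selected else 0.0
    recall = len(hits) / len(included) if included else 0.0
    return precision, recall

# Invented sample data: person records and the ids known to a second source.
persons = [
    {"id": "p1", "lang": "de"},
    {"id": "p2", "lang": "de"},
    {"id": "p3", "lang": "en"},
    {"id": "p4", "lang": "en"},
    {"id": "p5", "lang": "en"},
]
referenced_ids = {"p1", "p2", "p4"}

# Candidate condition: the inclusion is claimed only for rows with lang == "de".
precision, recall = condition_quality(
    persons, referenced_ids, "id", lambda r: r["lang"] == "de"
)
```

In this toy instance every row selected by the condition is included in the referenced set, but the condition misses one included row (`p4`), so the precision-like score is perfect while the recall-like score is not. A threshold on each score would decide whether the condition is accepted.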
15	Efficient placement design and storage cost saving for big data workflow in cloud datacenters / Conception d'algorithmes de placement efficaces et économie des coûts de stockage pour les workflows du big data dans les centres de calcul de type cloud
Ikken, Sonia. 14 December 2017
Typical cloud big data systems are workflow-based, including MapReduce, which has emerged as the paradigm of choice for developing large-scale data-intensive applications. The data generated by such systems are huge, valuable, and stored at multiple geographical locations for reuse. Workflow systems, composed of jobs using collaborative task-based models, present new dependency and intermediate data exchange needs. This gives rise to new issues when selecting distributed data and storage resources so that tasks or jobs execute on time and resource usage is cost-efficient. Furthermore, the performance of task processing is governed by the efficiency of intermediate data management. In this thesis we tackle the problem of intermediate data management across multiple cloud datacenters by considering the requirements of the workflow applications that generate the data. To this end, we design and develop models and algorithms for the big data placement problem in the underlying geo-distributed cloud infrastructure so that the data management cost of these applications is minimized. The first problem addressed is the intermediate data access behavior of tasks running in a MapReduce-Hadoop cluster. Our approach develops and explores a Markov model that uses the spatial locality of intermediate data blocks and analyzes spill file sequentiality through a prediction algorithm. Second, the thesis deals with minimizing the storage cost of intermediate data placement in federated cloud storage. Through a federation mechanism, we propose an exact ILP algorithm to assist multiple cloud datacenters hosting the generated intermediate data dependencies of pairs of files. The proposed algorithm takes into account scientific user requirements, data dependency, and data size. Finally, a more generic problem is addressed that involves two variants of the placement problem: splittable and unsplittable intermediate data dependencies. The main goal is to minimize the operational data cost according to inter- and intra-job dependencies.
Big data workflow; Data access and placement; Storage cost minimization; Cloud datacenters; Hadoop MapReduce; Data-driven application; Data dependency; Optimization
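The placement problem described in this abstract can be pictured with a small Python sketch. It is not the thesis's ILP model: it uses exhaustive enumeration as a stand-in for an exact solver, and the cost model (per-unit storage price, a flat transfer price for dependent files placed in different datacenters, and a capacity limit) as well as every name and number below are assumptions chosen for illustration.

```python
from itertools import product

def cheapest_placement(sizes, storage_price, capacity, dependencies, transfer_price):
    """Enumerate every file-to-datacenter assignment and keep the cheapest.

    Assumed cost model (hypothetical): storage cost is size times the
    datacenter's price; each dependent pair split across datacenters
    adds transfer_price times the exchanged volume.
    """
    files = sorted(sizes)
    centers = sorted(storage_price)
    best_cost, best_plan = float("inf"), None
    for choice in product(centers, repeat=len(files)):
        plan = dict(zip(files, choice))
        # Skip assignments that exceed any datacenter's capacity.
        load = {c: 0 for c in centers}
        for f in files:
            load[plan[f]] += sizes[f]
        if any(load[c] > capacity[c] for c in centers):
            continue
        cost = sum(sizes[f] * storage_price[plan[f]] for f in files)
        cost += sum(
            transfer_price * volume
            for (a, b), volume in dependencies.items()
            if plan[a] != plan[b]
        )
        if cost < best_cost:
            best_cost, best_plan = cost, plan
    return best_cost, best_plan

# Invented instance: two dependent files, two datacenters.
best_cost, best_plan = cheapest_placement(
    sizes={"a": 10, "b": 5},
    storage_price={"dc1": 1.0, "dc2": 2.0},
    capacity={"dc1": 12, "dc2": 20},
    dependencies={("a", "b"): 4},
    transfer_price=3.0,
)
```

Here the cheaper datacenter cannot hold both files, and splitting the pair would incur the transfer charge, so the cheapest feasible plan co-locates both files in the more expensive datacenter. Enumeration grows exponentially with the number of files, which is why an ILP formulation with a proper solver is the practical choice for anything beyond toy instances.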
