1

Karst Database Development in Minnesota: Design and Data Assembly

Gao, Y., Alexander, E. C., Tipping, R. G. 01 May 2005 (has links)
The Karst Feature Database (KFD) of Minnesota is a relational, GIS-based database management system (DBMS). Previous karst feature datasets used inconsistent attributes to describe karst features in different areas of Minnesota, so existing metadata were modified and standardized into comprehensive metadata covering all karst features in the state. Microsoft Access 2000 and ArcView 3.2 were used to develop this working database. Existing county and sub-county karst feature datasets have been assembled into the KFD, which supports visualization and analysis of the entire data set. As of November 17, 2002, 11,682 karst features were stored in the KFD. Data tables are stored in a Microsoft Access 2000 DBMS and linked to corresponding ArcView applications. The current KFD has been moved from a Windows NT server to a Windows 2000 Citrix server accessible to researchers and planners through networked interfaces.
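As a minimal sketch of what one standardized karst feature table might look like, the snippet below uses Python's sqlite3 in place of the Access 2000 DBMS described above; all column names are hypothetical illustrations, not the actual KFD schema.

    import sqlite3

    # In-memory stand-in for the Access 2000 data tables linked to ArcView.
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE karst_feature (
            feature_id   INTEGER PRIMARY KEY,
            feature_type TEXT NOT NULL,   -- e.g. 'sinkhole', 'spring', 'stream sink'
            county       TEXT NOT NULL,
            utm_easting  REAL,            -- coordinates enable the GIS linkage
            utm_northing REAL,
            date_mapped  TEXT
        )
    """)
    conn.execute(
        "INSERT INTO karst_feature VALUES (?, ?, ?, ?, ?, ?)",
        (1, "sinkhole", "Fillmore", 567100.0, 4840200.0, "2002-11-17"),
    )
    # A whole-dataset query of the kind the assembled KFD makes possible.
    for row in conn.execute(
            "SELECT county, COUNT(*) FROM karst_feature GROUP BY county"):
        print(row)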
2

Dynamic Energy-Aware Database Storage and Operations

Behzadnia, Peyman 29 March 2018 (has links)
Energy consumption has become a first-class optimization goal in the design and implementation of data-intensive computing systems. This is particularly true of database management systems (DBMS), among the most important servers in the software stack of modern data centers. The storage system is an essential component of a database and has been the subject of many research efforts aimed at reducing its energy consumption. Prior work typically saves energy in storage systems through dynamic power management (DPM) techniques that make real-time decisions to transition disks into low-power modes. In this research, we address the limitations of these DPM proposals and design a dynamic energy-aware disk storage system for database servers. We introduce a DPM optimization model, integrated with a model predictive control (MPC) strategy, that minimizes the power consumption of the disk-based storage system while satisfying given performance requirements. It dynamically determines the state of each disk and plans inter-disk migration of data fragments to strike a desirable balance between power consumption and query response time. Furthermore, by analyzing the optimization model to identify structural properties of optimal solutions, we propose a fast heuristic DPM algorithm that reaches near-optimal power savings within a short computational time, making it suitable for large-scale disk storage systems where computing the exact optimum would take too long. The proposed ideas are evaluated through simulations over an extensive set of synthetic workloads. The results show that our solution achieves up to 1.65 times more energy savings while providing up to 1.67 times shorter response times than the best existing algorithm in the literature. Stream join is a dynamic and expensive database operation that joins continuous data streams in real time. Stream joins, also known as window joins, impose high computational cost and potentially higher energy consumption than other database operations, so we also tackle the energy efficiency of stream join processing. Given the strong linear correlation between the energy efficiency and the performance of in-memory parallel join algorithms in database servers, we study the parallelization of stream join algorithms on multicore processors to achieve both energy efficiency and high performance. Equi-join is the most frequent type of join in query workloads, and the symmetric hash join (SHJ) algorithm is the most effective way to evaluate equi-joins over data streams. To the best of our knowledge, we are the first to propose a shared-memory parallel symmetric hash join algorithm on multicore CPUs. Furthermore, we introduce a novel parallel hash-based stream join algorithm, chunk-based pairing hash join, that aims at raising data throughput and scalability. We also tackle parallel processing of multi-way stream joins, where more than two input data streams are involved in the join; to the best of our knowledge, we are likewise the first to propose an in-memory parallel multi-way hash-based stream join on multicore processors. Experimental evaluation of the proposed parallel algorithms demonstrates high throughput, significant scalability, and low latency, together with reduced energy consumption. Our parallel symmetric hash join and chunk-based pairing hash join achieve up to 11 times and 12.5 times the throughput, respectively, of the state-of-the-art parallel stream join algorithm, and up to roughly 22 times and 24.5 times the throughput of non-parallel (sequential) stream join computation with a single processing thread.
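The symmetric hash join at the core of this work is easy to state sequentially; below is a minimal single-threaded sketch of its probe-then-insert logic, in Python for illustration. The thesis parallelizes this across cores (and the chunk-based pairing variant changes how work is grouped); that machinery is omitted, and the field names are hypothetical.

    from collections import defaultdict

    def symmetric_hash_join(stream_a, stream_b, key_a, key_b):
        """Equi-join two finite streams of dicts on key_a == key_b."""
        table_a, table_b = defaultdict(list), defaultdict(list)
        # Arrival order does not matter: each tuple first probes the other
        # stream's hash table, then inserts itself into its own, so every
        # matching pair is emitted exactly once, when its later tuple arrives.
        events = [("A", t) for t in stream_a] + [("B", t) for t in stream_b]
        for side, tup in events:
            own, other, key = (
                (table_a, table_b, key_a) if side == "A"
                else (table_b, table_a, key_b))
            k = tup[key]
            for match in other[k]:                 # probe
                yield (tup, match) if side == "A" else (match, tup)
            own[k].append(tup)                     # insert

    orders  = [{"oid": 1, "cust": 7}, {"oid": 2, "cust": 9}]
    clients = [{"cust": 7, "name": "acme"}]
    print(list(symmetric_hash_join(orders, clients, "cust", "cust")))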
3

Compression Selection for Columnar Data using Machine-Learning and Feature Engineering

Persson, Douglas, Juelsson Larsen, Ludvig January 2023 (has links)
There is a continuously growing demand for solutions that provide both efficient storage and efficient retrieval of big data for analytical purposes. This thesis investigates the use of machine learning together with feature engineering to recommend the most cost-effective compression algorithm and encoding combination for columns in a columnar database management system (DBMS). The framework is built around a cost function over compression time, decompression time, and compression ratio. An XGBoost model is trained on labels provided by the cost function to recommend the most cost-effective combination for columnar data within a column- or vector-oriented DBMS. While the methods are applied to ClickHouse, one of the most popular open-source column-oriented DBMSs on the market, the results are broadly applicable to column-oriented data that share data types and characteristics with IoT telemetry data. Using billions of rows of numeric real business data obtained at Axis Communications in Lund, Sweden, a set of features is engineered to accurately describe the characteristics of a given column. The proposed framework allows the business interests (compression time, decompression time, and compression ratio) to be weighted to determine the individually optimal cost-effective solution. The model reaches an accuracy of 99% on the test dataset and 90.1% on unseen data by leveraging features that are predictive of compression algorithm and encoding performance. Following ClickHouse strategies and the best practices in the field, combinations of general-purpose compression algorithms and data encodings are analysed that together yield the best results in efficiently compressing the data of certain columns. Applying the unweighted recommended combinations to all columns increased the average compression speed by 95.46%, reducing the time to compress the columns from 31.17 seconds to 13.17 seconds. Decompression speed increased by 59.87%, reducing the time to decompress the columns from 2.63 seconds to 2.02 seconds, at the cost of decreasing the compression ratio by 66.05% and increasing storage requirements by 94.9 MB. In column and vector databases, chunks of data belonging to a certain column are often stored together on disk, so choosing the right compression algorithm can lower storage requirements and boost database throughput.
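A sketch of the weighted cost function described above, assuming compression time, decompression time, and compression ratio have been measured for each (codec, encoding) candidate; the weights, candidates, and measurements are hypothetical. In the thesis an XGBoost model is trained on such cost-derived labels so the measurements need not be rerun for every new column.

    def cost(comp_time_s, decomp_time_s, ratio, w_c=1.0, w_d=1.0, w_r=1.0):
        # Lower is better; a higher compression ratio reduces the cost term.
        return w_c * comp_time_s + w_d * decomp_time_s + w_r / ratio

    # Hypothetical per-column measurements: (compression s, decompression s, ratio).
    candidates = {
        ("LZ4",  "DoubleDelta"): (0.8, 0.3, 3.1),
        ("ZSTD", "Gorilla"):     (2.4, 0.6, 5.8),
        ("NONE", "Delta"):       (0.1, 0.1, 1.4),
    }
    # The cost-minimizing combination is the training label for this column.
    best = min(candidates, key=lambda c: cost(*candidates[c]))
    print(best)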
4

Sistema de gerenciamento da informação: alterações neurológicas em chagásicos crônicos não-cardíacos / Information management system: neurological alterations in chronic non-cardiac Chagas disease patients

Carmo, Samuel Sullivan 27 April 2010 (has links)
This work develops a computerized information management system to support scientific studies of the nervous system of chronic non-cardiac Chagas disease patients. The goal is to develop the required system on the premise that it will make the analyses arising from the investigation more practical. The method used to develop this system, dedicated to managing the research information on the neurological alterations of its subjects, was to: compose the archetype of goals and the requirements-elicitation matrix for the system's variants; list the attributes, domains, and qualifications of its variables; draw up the selection framework of equipment and applications required for its physical and logical implementation; and deploy it through data modeling, an adapted entity-relationship diagram, and logical programming of algorithms. As a result, the system was developed. The analytical discussion is that computerization makes data processing faster and safer: registration, queries, and field validation, as well as the formatting and export of pre-treated tables for statistical analysis, become more efficient, so the system acts as a tool of the scientific method. The logical argument is that the reliability of computationally recorded information increases because human error is reduced in most processing steps. In closing, studies with a reasonably large number of variables and research subjects are better managed when they have a dedicated information management system.
5

Role-based Data Management

Jäkel, Tobias 24 March 2017 (has links)
Database systems form an integral component of today's software systems: they are the central point for storing and sharing a software system's data while at the same time ensuring global data consistency. Introducing the primitives of roles and their accompanying metatype distinction into modeling and programming languages results in a novel paradigm for designing, extending, and programming modern software systems. In detail, roles as a modeling concept enable a separation of concerns within an entity. Along with its rigid core, an entity may acquire various roles in different contexts during its lifetime and thus adapt its behavior and structure dynamically at runtime. Unfortunately, database systems, as an important component and the global consistency provider of such systems, have not kept pace with this trend. The absence of a metatype distinction in the database system, in terms of an entity's separation of concerns, creates various problems for the software system in general, for application developers, and for the database system itself. For relational database systems, these problems are gathered under the term role-relational impedance mismatch. In particular, the software system as a whole is designed using different semantics on its various layers, and for role-based software systems combined with relational database systems this semantic gap between applications and the database system widens dramatically. Consequently, the database system cannot directly represent the richer semantics of roles or the accompanying consistency constraints. Those constraints have to be ensured by the applications, and the database system loses its single-point-of-truth characteristic in the software system. Because the applications are in charge of guaranteeing global consistency, their development requires more data management effort, and the software system's data management ends up distributed over several layers, yielding an unstructured architecture. To overcome the role-relational impedance mismatch and restore the database system to its rightful position as the single point of truth, this thesis introduces the novel, tripartite RSQL approach. It combines a database model that represents the metatype distinction as a first-class citizen in the database system, a query language adapted to that model, and a proper result representation. Precisely, RSQL's logical database model introduces Dynamic Data Types to directly represent the separation of concerns within an entity type at the schema level. At the instance level, the database model defines the notion of a Dynamic Tuple, which combines an entity with its roles and thus allows dynamic structural adaptation at runtime without changing the entity's overall type. These definitions form the main data structures on which the database system operates. Formal operators connecting the query language statements with the database model's data structures complete the model. The query language, as the external interface of the database system, features individual data definition, data manipulation, and data query sublanguages whose statements directly express the metatype distinction, addressing Dynamic Data Types and Dynamic Tuples respectively. As a consequence of the novel data structures, the query processing of Dynamic Tuples is completely redesigned. As the last piece of a full database integration of the role notion and its accompanying metatype distinction, we specify the RSQL Result Net as the result representation: it provides a novel result structure together with functionality for navigating query results. Finally, we evaluate all three RSQL components against a relational database system; the assessment clearly demonstrates the benefits of fully integrating the roles concept into the database.
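As a language-level sketch of the role notion behind Dynamic Tuples, the Python below gives an entity a rigid core plus roles it can acquire and drop at runtime without changing its overall type; the classes and names are illustrative, not RSQL's actual model.

    class Role:
        """Context-dependent state and behavior an entity can play."""

    class Student(Role):
        def __init__(self, matriculation_no):
            self.matriculation_no = matriculation_no

    class Employee(Role):
        def __init__(self, salary):
            self.salary = salary

    class Person:
        def __init__(self, name):
            self.name = name      # rigid core, fixed for the entity's lifetime
            self._roles = {}      # dynamic part, adapted at runtime

        def acquire(self, role):
            self._roles[type(role).__name__] = role

        def drop(self, role_name):
            self._roles.pop(role_name, None)

        def as_(self, role_name):
            return self._roles[role_name]

    p = Person("Ada")
    p.acquire(Student("M-1234"))
    p.acquire(Employee(42000))    # same entity, two contexts at once
    print(p.name, p.as_("Student").matriculation_no, p.as_("Employee").salary)
    p.drop("Student")             # structure changes; the Person type does not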
6

DJ: Bridging Java and Deductive Databases

Hall, Andrew Brian 07 July 2008 (has links)
Modern society is intrinsically dependent on the ability to manage data effectively. While relational databases have been the industry standard for the past quarter century, recent growth in data volumes and complexity requires novel data management solutions. These trends have revitalized interest in deductive databases and highlighted the need for column-oriented data storage. However, programming technologies for enterprise computing were designed for the relational data management model (i.e., row-oriented data storage), so developers cannot easily incorporate emerging data management solutions into enterprise systems. To address this problem, this thesis presents Deductive Java (DJ), a system that enables enterprise programmers to use a column-oriented deductive database in their Java applications. DJ does so without requiring the programmer to become proficient in deductive databases and their non-standardized, vendor-specific APIs. The design of DJ incorporates three novel features: (1) tailoring orthogonal persistence technology to the needs of a deductive database with column-oriented storage; (2) using Java interfaces as the primary mapping construct, thereby simplifying method call interception; (3) providing facilities to deploy lightweight business rules. DJ was developed in partnership with LogicBlox Inc., an Atlanta-based technology startup. / Master of Science
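The second design feature, interface-based mapping with call interception, can be sketched in a few lines; the Python below plays the part of DJ's Java interface proxies, routing each field access to per-column storage. The names and storage layout are illustrative only, not DJ's actual API.

    # One array per field (column-oriented), rather than one record per row.
    columns = {
        "name":   ["alice", "bob"],
        "salary": [50000, 62000],
    }

    class ColumnBackedRecord:
        """Proxy standing in for a mapped interface; row_id picks the tuple."""
        def __init__(self, row_id):
            object.__setattr__(self, "_row_id", row_id)

        def __getattr__(self, field):
            # Intercept reads and serve them from the column store.
            return columns[field][self._row_id]

        def __setattr__(self, field, value):
            # Intercept writes symmetrically.
            columns[field][self._row_id] = value

    emp = ColumnBackedRecord(1)
    print(emp.name, emp.salary)   # reads routed to the column arrays
    emp.salary = 65000            # write lands in columns["salary"][1]
    print(columns["salary"])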
7

L’irrigation dans le bassin du Rhône : gestion de l’information géographique sur les ressources en eau et leurs usages / Irrigation in the Rhône basin: managing geographic information on water resources and their uses

Richard-Schott, Florence 06 December 2010 (has links)
Over the last thirty years of the twentieth century, irrigation in the French basin of the Rhône underwent substantial change. The implementation of an information system for the Rhône basin (SIR) demonstrates the existence of four main irrigation systems, individualized within several "irrigation regions." These regions reveal contrasting dynamics, challenging the idea that irrigation expanded continuously and homogeneously, even though the total irrigated area increased overall. These spatial dynamics are explained by the deep transformation of a modernized practice that relies on ever more water-efficient techniques. This is the second lesson of the study: the general increase in irrigated area did not lead to an increase in water demand. On the contrary, water demand has tended to diminish, by about 30% over thirty years. Encouraged by water managers, irrigators make increasingly reasoned use of water resources, and in the long run irrigation should certainly not be considered a general threat to environmental balance. The thesis is accompanied by a geographic information management system and an electronic atlas.
