81 |
Análise e desenvolvimento de um novo algoritmo de junção espacial para SGBD geográficos / Analysis and design of a new algorithm to perform spatial join in geographic DBMS. Fornari, Miguel Rodrigues, January 2006.
A Geographic Information System (GIS) stores geographic data and combines them to obtain new representations of geographic space. The spatial join operation combines two sets of spatial features, A and B, based on a spatial predicate, such as intersection or distance between objects. It is a fundamental operation and also one of the most expensive in a GIS: combining pairs of georeferenced objects from two different, and probably large, data sets implies a significant number of Input/Output (I/O) operations as well as a large amount of CPU work. This work presents a study of the performance of spatial join algorithms. First, the algorithms already published in the literature are analyzed, yielding cost expressions for the number of disk operations and for processing. Then, some of the algorithms (Nested Loops, the Partition Based Spatial Join Method (PBSM), Synchronized Tree Traversal (STT) for R*-trees, and the Iterative Spatial Stripped Join (ISSJ)) are implemented in a test environment that lets the user vary several input parameters (set cardinality, available memory, and join predicate) over both synthetic and real data sets. The tests showed that STT is suitable for small data sets; ISSJ when there is enough memory to sort the sets internally; and PBSM when little memory is available for data buffering.
Based on this analysis, a new algorithm, called Histogram-based Hash Stripped Join (HHSJ), is proposed. The partitioning of the space is defined by histograms of the spatial distribution of the objects; a hash file is created for each input data set to improve both the storage of and the access to the minimum bounding rectangles (MBRs) of the set elements; and the space is further divided into strips to reduce processing time. The results show that the new algorithm is faster in most scenarios, and the advantage grows with the number of objects involved in the join. Finally, a cost-based query optimizer, capable of choosing the best algorithm to perform the filter step of a spatial join, is presented. It uses statistical information kept in the data dictionary to estimate the response time of each algorithm and indicates the fastest one for a specific operation. This optimizer made the right choice in 88.9% of the cases, failing only for joins of small data sets, where the impact is minor.
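For illustration, the sketch below shows the strip-based filter step that ISSJ and HHSJ build on: MBRs are assigned to horizontal strips and candidate pairs are tested per strip. It is a minimal in-memory sketch, assuming MBRs are (xmin, ymin, xmax, ymax) tuples; all names are illustrative, and the histograms, hash files, and disk handling of the actual algorithms are omitted.

```python
from collections import defaultdict

def mbr_intersects(a, b):
    # Axis-aligned overlap test for MBRs given as (xmin, ymin, xmax, ymax).
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def strip_join(set_a, set_b, y_min, y_max, n_strips):
    # Filter step only: assign each MBR to every strip its y-range overlaps,
    # then test candidate pairs strip by strip.
    height = (y_max - y_min) / n_strips
    strips_a, strips_b = defaultdict(list), defaultdict(list)
    for objs, strips in ((set_a, strips_a), (set_b, strips_b)):
        for mbr in objs:
            lo = max(int((mbr[1] - y_min) // height), 0)
            hi = min(int((mbr[3] - y_min) // height), n_strips - 1)
            for s in range(lo, hi + 1):
                strips[s].append(mbr)
    seen = set()
    for s, candidates_a in strips_a.items():
        for a in candidates_a:
            for b in strips_b[s]:
                if mbr_intersects(a, b) and (a, b) not in seen:
                    seen.add((a, b))  # avoid duplicates from shared strips
                    yield a, b
```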
|
82 |
Parallélisme et équilibrage de charges dans le traitement de la jointure sur des architectures distribuées / Parallelism and load balancing in join processing on distributed architectures. Al Hajj Hassan, Mohamad, 16 December 2009.
The appeal of parallel processing is very strong in applications that require ever higher performance, particularly in applications such as data warehousing, decision support, On-Line Analytical Processing (OLAP), and DBMS more generally. A linear speed-up is the main objective of parallel join algorithms; however, in real applications it is hard to reach, owing to the high communication costs of parallel and distributed systems and to possible load imbalance among processors. In addition, on heterogeneous multi-user architectures, the load of each processor may vary in a dynamic and unpredictable way. In this thesis, we are interested in processing join and multi-join queries on heterogeneous distributed multi-user systems, grid systems, and distributed file systems. We propose several algorithms based on distributed histograms. They rely on dynamic data redistribution and task allocation, which makes them insensitive to data skew (in particular, skew in the values of the join attribute) and ensures near-perfect load balancing during all stages of the join computation, even in a heterogeneous multi-user environment, while keeping communication costs to a minimum. The complexity analysis of our algorithms and the experimental results show that they achieve a near-linear speed-up.
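As a rough illustration of the general idea behind histogram-driven skew handling (not the thesis's exact algorithms), the toy sketch below builds a frequency histogram of join-key values and fans frequent values out across processors while hashing the rest to a single one; the threshold and placement policy are hypothetical.

```python
from collections import Counter

def histogram_plan(join_keys_r, join_keys_s, n_procs, threshold):
    # Build a global frequency histogram over the join attribute, then
    # decide placement per value: skewed values are spread over all
    # processors, regular values are hashed to exactly one.
    hist = Counter(join_keys_r) + Counter(join_keys_s)
    placement = {}
    for value, freq in hist.items():
        if freq > threshold:
            placement[value] = list(range(n_procs))   # replicate / split
        else:
            placement[value] = [hash(value) % n_procs]
    return placement
```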
|
83 |
Dynamic Energy-Aware Database Storage and Operations. Behzadnia, Peyman, 29 March 2018.
Energy consumption has become a first-class optimization goal in the design and implementation of data-intensive computing systems. This is particularly true for database management systems (DBMS), among the most important server software in the stack of modern data centers. The data storage system is an essential component of a database and has been the focus of many research efforts aimed at reducing its energy consumption. In prior work, dynamic power management (DPM) techniques that make real-time decisions to transition disks to low-power modes are typically used to save energy in storage systems. In this research, we tackle the limitations of previous DPM proposals and design a dynamic energy-aware disk storage system for database servers. We introduce a DPM optimization model, integrated with a model predictive control (MPC) strategy, that minimizes the power consumption of the disk-based storage system while satisfying given performance requirements: it dynamically determines the state of each disk and plans inter-disk migration of data fragments to strike the desired balance between power consumption and query response time. Furthermore, by analyzing the optimization model to identify structural properties of optimal solutions, we propose a fast heuristic DPM algorithm that can be integrated into large-scale disk storage systems, where computing an exact optimum may take too long, and that reaches a near-optimal power-saving solution within a short computation time. The proposed ideas are evaluated through simulations using an extensive set of synthetic workloads. The results show that our solution achieves up to 1.65 times more energy savings while providing up to 1.67 times shorter response times than the best existing algorithm in the literature.
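To make the kind of decision involved concrete, here is the classic DPM break-even rule in a few lines of Python. It only illustrates the class of decision that the MPC-based optimization model refines; all names are hypothetical.

```python
def plan_disk_states(predicted_idle_s, break_even_s):
    # Break-even heuristic: spin a disk down only when its predicted idle
    # period is long enough that the energy saved outweighs the energy and
    # latency cost of the power-mode transition.
    return {disk: "standby" if idle > break_even_s else "active"
            for disk, idle in predicted_idle_s.items()}
```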
A stream join is a dynamic and expensive database operation that joins continuous data streams in real time. Stream joins, also known as window joins, incur high computational cost and potentially higher energy consumption than other database operations, so we also tackle the energy efficiency of stream join processing in this research. Given the strong linear correlation between energy efficiency and performance of in-memory parallel join algorithms in database servers, we study the parallelization of stream join algorithms on multicore processors to achieve both energy efficiency and high performance. Equi-joins are the most frequent joins in query workloads, and the symmetric hash join (SHJ) algorithm is the most effective way to evaluate equi-joins over data streams. To the best of our knowledge, we are the first to propose a shared-memory parallel symmetric hash join algorithm on multicore CPUs. Furthermore, we introduce a novel parallel hash-based stream join algorithm, called chunk-based pairing hash join, that aims at raising data throughput and scalability. We also tackle parallel processing of multi-way stream joins, where more than two input streams participate in the join; again to the best of our knowledge, we are the first to propose an in-memory parallel multi-way hash-based stream join on multicore processors. Experimental evaluation of the proposed algorithms demonstrates high throughput, significant scalability, and low latency, while also reducing energy consumption. Our parallel symmetric hash join and chunk-based pairing hash join achieve up to 11 times and 12.5 times the throughput, respectively, of the state-of-the-art parallel stream join algorithm, and up to roughly 22 times and 24.5 times the throughput of single-threaded (sequential) stream join computation.
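The sequential core of the symmetric hash join is easy to sketch; the parallel multicore versions discussed above build on it. A minimal sketch, assuming a toy round-robin interleaving of tuple arrivals and omitting window expiry:

```python
from collections import defaultdict
from itertools import zip_longest

def symmetric_hash_join(stream_a, stream_b, key):
    # Every arriving tuple is inserted into its own side's hash table and
    # immediately probed against the other side's table, so each result is
    # emitted as soon as both matching tuples have arrived.
    table_a, table_b = defaultdict(list), defaultdict(list)
    for a, b in zip_longest(stream_a, stream_b):  # toy arrival order
        if a is not None:
            table_a[key(a)].append(a)
            for match in table_b.get(key(a), []):
                yield a, match
        if b is not None:
            table_b[key(b)].append(b)
            for match in table_a.get(key(b), []):
                yield match, b
```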
|
85 |
Trigraphes de Berge apprivoisés / Tame Berge trigraphs. Trunck, Théophile, 17 September 2014.
The goal of this thesis is to use graph decompositions to solve algorithmic problems on graphs. We study the class of tame Berge graphs. A Berge graph is a graph containing no odd hole (induced cycle of odd length at least 5) and no odd antihole (complement of an odd hole). In the 1960s, Claude Berge conjectured that Berge graphs are perfect: in every induced subgraph, the size of the largest clique equals the minimum number of colors needed for a proper coloring. In 2002, Chudnovsky, Robertson, Seymour, and Thomas proved this conjecture using a structure theorem: every Berge graph is either basic or admits a decomposition. This result is very useful for proofs by induction. Unfortunately, one of the decompositions, the balanced skew-partition, is very hard to use algorithmically. We therefore focus on tame Berge graphs, i.e., Berge graphs with no balanced skew-partition. To be able to do induction, we must first adapt the structure theorem of Chudnovsky et al. to our class. We prove a stronger result: every tame Berge graph is basic or admits a decomposition such that one side is always basic, and we give an algorithm computing this decomposition. We then use our theorem to prove that tame Berge graphs have the big-bipartite property and the clique-stable set separation property, and that a maximum stable set can be computed in polynomial time.
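For reference, the perfection property at stake can be stated compactly in standard notation (a restatement, not taken from the thesis):

```latex
% G is perfect iff, for every induced subgraph H of G, the chromatic
% number chi equals the clique number omega:
\forall H \leq_{\mathrm{ind}} G :\quad \chi(H) = \omega(H)
% Berge: G has no odd hole (induced odd cycle of length >= 5)
% and no odd antihole (complement of an odd hole).
```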
|
86 |
Processamento de consultas SOLAP drill-across e com junção espacial em data warehouses geográficos / Processing of drill-across and spatial join SOLAP queries over geographic data warehouses. Jaqueline Joice Brito, 28 November 2012.
A geographic data warehouse (GDW) is a special kind of multidimensional database: subject-oriented, integrated, historical, non-volatile, and usually organized in levels of aggregation. In addition, a GDW stores spatial data in one or more dimensions, or in at least one numerical measure. Aiming at decision support, GDWs allow SOLAP (spatial online analytical processing) queries, i.e., multidimensional analytical queries (e.g., drill-down, roll-up, drill-across) extended with spatial predicates (e.g., intersects, contains, is contained) defined for range and spatial join queries. A challenging issue in processing these complex queries is how to efficiently retrieve the spatial and conventional data stored in huge GDWs. In the literature there are few access methods dedicated to indexing GDWs, and none of them targets drill-across or spatial join SOLAP queries. In this master's thesis, we propose novel strategies for processing these complex queries.
We introduce two strategies for processing SOLAP drill-across queries (namely, Divide and Unique), define a set of guidelines for the design of a GDW schema that enables the execution of these queries, and determine a set of classes of such queries to be issued over a schema that follows the proposed guidelines. For the processing of spatial join SOLAP queries, we propose the SJB strategy, identify the characteristics a GDW schema must have to enable these queries, and define their format. We validated the proposed strategies through performance tests, under different configurations, that compared them with star join computation and with the use of materialized views. The results showed that our strategies are very efficient. For SOLAP drill-across queries, the Divide and Unique strategies reduced the elapsed time by 82.7% to 98.6% with respect to star join computation and materialized views. For spatial join SOLAP queries, the SJB strategy delivered the best results for most of the analyzed queries, with performance gains ranging from 0.3% to 99.2%.
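To fix ideas, the sketch below shows the generic shape of a spatial join SOLAP query: a spatial filter step followed by multidimensional aggregation. It is illustrative only and does not reproduce the SJB strategy; all names are hypothetical.

```python
def solap_spatial_join(fact_rows, spatial_dim, intersects, group_by, measure):
    # Filter step: pair each fact row with the spatial dimension members
    # satisfying the spatial predicate. Aggregation step: sum a numeric
    # measure per (conventional group, spatial member) cell.
    totals = {}
    for row in fact_rows:
        for member_id, geom in spatial_dim.items():
            if intersects(row["geom"], geom):      # spatial predicate
                cell = (group_by(row), member_id)
                totals[cell] = totals.get(cell, 0) + measure(row)
    return totals
```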
|
88 |
Finanční analýza vybraného zdravotnického zařízení / Financial Analysis in a chosen Health-Care Facility. Konopčíková, Petra, January 2009.
The aim of this work is to give a comprehensive overview of the financial situation of a health-care facility, analyzing the data available over a longer time series using the methods of financial analysis, in order to find the weakest points in the financing of the facility and to apply a structural analysis of cost centers as a tool for managerial decision-making.
|
89 |
Data Warehouses na era do Big Data: processamento eficiente de Junções Estrela no Hadoop / Data Warehouses in the Big Data era: efficient processing of Star Joins in Hadoop. Jaqueline Joice Brito, 12 December 2017.
The era of Big Data is here: the combination of unprecedented amounts of data collected every day with the spread of open-source solutions for massively parallel processing has shifted the industry toward data-driven solutions. From recommendation systems that help you find your next significant other to the dawn of self-driving cars, Cloud Computing has enabled companies of all sizes and areas to achieve their full potential with minimal overhead. In particular, the use of these technologies for Data Warehousing applications has greatly decreased costs and provided remarkable scalability, empowering business-oriented applications such as Online Analytical Processing (OLAP). One of the most essential primitives in Data Warehouses is the Star Join, i.e., the join of a central fact table with satellite dimension tables. As the volume of the database grows, Star Joins become impractical and may seriously limit applications. In this thesis, we propose specialized solutions to optimize the processing of Star Joins, using the Hadoop software family on a cluster of 21 nodes. We show that the primary bottleneck in the computation of Star Joins on Hadoop lies in excessive disk spill and in the overhead of network communication. To mitigate these negative effects, we propose two solutions based on combining the Spark framework with either Bloom filters or the Broadcast technique, which reduce the computation time by at least 38%. Furthermore, we show that full scans may significantly hinder the performance of queries with low selectivity. We therefore propose a distributed Bitmap Join Index that can be processed as a loosely-bound secondary index and used with random access in the Hadoop Distributed File System (HDFS), together with three versions (one in MapReduce, two in Spark) of a processing algorithm that uses the distributed index; they reduced the total computation time by up to 88% for low-selectivity Star Joins from the Star Schema Benchmark (SSB). Because, ideally, the system should be able to perform both random access and full scans, our solution relies on a framework-agnostic, two-layer architecture in which a query optimizer selects which approach to use as a function of the query. Given the ubiquity of joins as primitive queries, our solutions are likely to fit a broad range of applications. Our contributions not only leverage the strengths of massively parallel frameworks but also exploit more efficient access methods, providing scalable and robust solutions to Star Joins with a significant drop in total computation time.
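A minimal sketch of the Bloom-filter idea in plain Python rather than Spark: a compact filter built from the dimension keys prunes fact rows that cannot join, before any shuffle would take place. Sizes and hash choices are placeholders, not those used in the thesis.

```python
import hashlib

class BloomFilter:
    # Minimal Bloom filter using double hashing over md5/sha1 digests.
    def __init__(self, n_bits=8192, n_hashes=3):
        self.n_bits, self.n_hashes = n_bits, n_hashes
        self.bits = bytearray(n_bits // 8)

    def _positions(self, item):
        h1 = int(hashlib.md5(str(item).encode()).hexdigest(), 16)
        h2 = int(hashlib.sha1(str(item).encode()).hexdigest(), 16)
        return ((h1 + i * h2) % self.n_bits for i in range(self.n_hashes))

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, item):
        # May return false positives, never false negatives.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

def prune_fact_rows(fact_rows, dim_keys, fk_column):
    # Build a filter over the surviving dimension keys, then drop fact
    # rows whose foreign key cannot possibly match.
    bf = BloomFilter()
    for k in dim_keys:
        bf.add(k)
    return [row for row in fact_rows if bf.might_contain(row[fk_column])]
```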
|
90 |
Postup opravy beranu lisu / The repair procedure of the press frame. Švarc, Lukáš, January 2018.
This diploma thesis deals with the procedure for repairing a cracked press ram, a machine part produced by casting from gray cast iron. The ram belongs to the forming press LET 160 under repair. The work contains an overview and analysis of technologies for repairing cast iron by welding, and a repair procedure is designed on the basis of this theoretical part. The aim of the experimental tests is to select the most suitable filler material and to verify the proposed welding process; the chosen method was manual metal arc welding with low preheating. Preliminary welding procedure specifications (pWPS) were prepared for the welding of all test specimens. The experimental samples were subjected to visual inspection, macroscopic and microscopic analysis, and hardness measurement. Based on the results of the experiments, a welding repair procedure with a pWPS was proposed.
|