Generalizing association rules in n-ary relations: application to dynamic graph analysis

Nguyen, Thi Kim Ngan 23 October 2012
Pattern discovery in large binary relations has been extensively studied. An emblematic success in this area concerns frequent itemset mining and its post-processing that derives association rules. In this case, we mine binary relations that encode whether some properties are satisfied or not by some objects. It is however clear that many datasets correspond to n-ary relations where n > 2. For example, adding spatial and/or temporal dimensions (location and/or time when the properties are satisfied by the objects) leads to the 4-ary relation Objects x Properties x Places x Times. Therefore, we study the generalization of association rule mining within arbitrary n-ary relations: the datasets are now Boolean tensors and not only Boolean matrices. Unlike standard rules that involve subsets of only one domain of the relation, in our setting the head and the body of a rule can include arbitrary subsets of some selected domains. A significant contribution of this thesis concerns the design of interestingness measures for such generalized rules: besides a frequency measure, two different views on rule confidence are considered. The concept of non-redundant rules and the efficient extraction of the non-redundant rules satisfying minimal frequency and minimal confidence constraints are also studied. To increase the subjective interestingness of rules, we then introduce disjunctions in their heads. This requires redefining the interestingness measures and revisiting the redundancy issues. Finally, we apply our new rule discovery techniques to dynamic relational graph analysis. Such graphs can be encoded into n-ary relations (n ≥ 3). Our use case concerns bicycle renting in the Vélo'v system (the self-service bicycle rental system in Lyon). It illustrates the added value of some rules that can be computed thanks to our software prototypes.
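As a concrete illustration of the kind of measure discussed above, here is a minimal sketch (not taken from the thesis, whose exact definitions may differ) of one plausible frequency/confidence semantics for a generalized rule over a 4-ary relation Objects x Properties x Places x Times: the rule body and head are sets of values drawn from selected domains, and an object supports a pattern if every combination of the pattern's values occurs with that object. The relation, domain names and rule are invented toy data.

```python
from itertools import product

# Toy 4-ary relation Objects x Properties x Places x Times, stored as a set of
# tuples (object, property, place, time).  All values are invented.
R = {
    ("o1", "pA", "lyon",  "t1"), ("o1", "pB", "lyon",  "t1"),
    ("o1", "pA", "lyon",  "t2"), ("o1", "pB", "lyon",  "t2"),
    ("o2", "pA", "lyon",  "t1"), ("o2", "pB", "lyon",  "t1"),
    ("o3", "pA", "paris", "t1"),
}
DOMAINS = {"property": 1, "place": 2, "time": 3}   # column index of each named domain
                                                   # (column 0 holds the objects we count over)

def supports(obj, pattern):
    """True if every combination of the pattern's values occurs together with obj."""
    names = list(pattern)
    for combo in product(*(pattern[n] for n in names)):
        constrained = [obj, None, None, None]
        for name, value in zip(names, combo):
            constrained[DOMAINS[name]] = value
        if not any(all(row[i] == v for i, v in enumerate(constrained) if v is not None)
                   for row in R):
            return False
    return True

def frequency_and_confidence(body, head):
    objects = {row[0] for row in R}
    body_objs = {o for o in objects if supports(o, body)}
    both_objs = {o for o in body_objs if supports(o, head)}
    freq = len(both_objs) / len(objects)
    conf = len(both_objs) / len(body_objs) if body_objs else 0.0
    return freq, conf

# Rule "{pA} x {lyon}  =>  {t1, t2}": the body spans two domains, the head one.
print(frequency_and_confidence({"property": {"pA"}, "place": {"lyon"}},
                               {"time": {"t1", "t2"}}))
```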

Functional analysis of high-throughput data for dynamic modeling in eukaryotic systems

Flöttmann, Max 20 June 2013
The behavior of all biological systems is governed by numerous regulatory mechanisms, acting on different levels of time and space. The study of these regulations has greatly benefited from the immense amount of data that has become available from high-throughput experiments in recent years. To interpret this mass of data and gain new knowledge about the studied systems, mathematical modeling has proven to be an invaluable method. Nevertheless, before data can be integrated into a model it needs to be aggregated and analyzed, and the most important aspects need to be extracted. We present four systems biology studies on different cellular organizational levels and in different organisms. Additionally, we describe two software applications that enable easy comparison of data and model results. We use these in two of our studies on mitogen-activated protein (MAP) kinase signaling in Saccharomyces cerevisiae to generate model alternatives and adapt our representation of the system to biological data. In the two remaining studies we apply bioinformatic methods to analyze two high-throughput time series of protein and mRNA expression in mammalian cells. We combine the results with network data and use annotations to identify modules and pathways that change in expression over time, in order to interpret the datasets. In the case of the human somatic cell reprogramming (SCR) system, this analysis leads to a probabilistic Boolean model which we use to generate new hypotheses about the system. In the last system we examined, the infection of mammalian (Canis familiaris) cells by the influenza A virus, we find new interconnections between host and virus and are able to integrate our data with existing networks. In summary, many of our findings show the importance of data integration into mathematical models and the high degree of connectivity between different levels of regulation.
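The probabilistic Boolean modeling mentioned above can be illustrated with a tiny sketch: each node has several candidate Boolean update functions with selection probabilities, and one function per node is sampled before every synchronous update. The genes, rules and probabilities below are invented placeholders, not the reprogramming model from the thesis.

```python
import random

# Minimal probabilistic Boolean network (PBN) sketch.  The genes, update rules
# and probabilities are invented placeholders, not the model from the thesis.
rules = {
    # gene: list of (probability, Boolean update function over the state dict)
    "OCT4": [(0.7, lambda s: s["SOX2"]), (0.3, lambda s: s["OCT4"])],
    "SOX2": [(1.0, lambda s: s["OCT4"] and not s["DIFF"])],
    "DIFF": [(0.5, lambda s: not s["OCT4"]), (0.5, lambda s: s["DIFF"])],
}

def step(state, rng):
    """One synchronous update: sample one rule per gene, then apply them all."""
    new_state = {}
    for gene, alternatives in rules.items():
        r, acc = rng.random(), 0.0
        chosen = alternatives[-1][1]           # fallback in case of rounding
        for p, f in alternatives:
            acc += p
            if r <= acc:
                chosen = f
                break
        new_state[gene] = chosen(state)
    return new_state

rng = random.Random(0)
state = {"OCT4": True, "SOX2": False, "DIFF": False}
for t in range(5):
    state = step(state, rng)
    print(t, state)
```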

From birth to birth: A cell cycle control network of S. cerevisiae

Münzner, Ulrike Tatjana Elisabeth 23 November 2017
The survival of a species depends on the correct transmission of an intact genome from one generation to the next. The cell cycle regulates this process, and its correct execution is vital. Cell cycle progression is subject to strict molecular control, as aberrations are often linked to serious defects and diseases such as cancer. Understanding this regulatory machinery offers insights into how life functions on a molecular level, and also provides a better understanding of diseases and possible approaches to control them. Cell cycle control is furthermore a complex mechanism, and studying it holistically reveals its collective properties. Computational approaches facilitate such holistic studies. However, the properties of the cell cycle control network challenge large-scale in silico studies with respect to scalability, model execution and parameter estimation. This thesis presents a mechanistically detailed and executable large-scale reconstruction of the Saccharomyces cerevisiae cell cycle control network based on reaction-contingency language. The reconstruction accounts for 229 proteins and consists of three individual cycles corresponding to the macroscopic events of DNA replication, spindle pole body duplication, and bud emergence and growth. Translated into a bipartite Boolean model with 2506 nodes and initialized with a state determined from a priori knowledge, the reconstruction has a cyclic attractor which reproduces the cyclic behavior of a wild-type yeast cell. The Boolean model also responds correctly to four cell cycle arrest chemicals, and in a mutational study it reproduced known phenotypes for 32 of 37 tested mutants. The reconstruction of the cell cycle control network of S. cerevisiae demonstrates the power of the reaction-contingency based approach, and paves the way for extending the network with the cell cycle machinery itself and with several signal transduction pathways that interfere with the cell cycle.
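The cyclic-attractor statement above can be made concrete with a small sketch: simulate the synchronous Boolean update until a state repeats, and report the cycle. The three-node toy network stands in for the 2506-node model from the thesis, which is far larger and derived from the reaction-contingency reconstruction.

```python
# Detect a cyclic attractor of a (toy) Boolean network by iterating the
# synchronous update until a state repeats.  The three-node negative-feedback
# loop below is a stand-in, not the 2506-node model from the thesis.
update = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["B"],
}

def find_attractor(state):
    seen, trajectory = {}, []
    while True:
        key = tuple(sorted(state.items()))
        if key in seen:                        # state revisited: attractor reached
            return trajectory[seen[key]:]      # the repeating suffix is the cycle
        seen[key] = len(trajectory)
        trajectory.append(dict(state))
        state = {node: f(state) for node, f in update.items()}

attractor = find_attractor({"A": True, "B": False, "C": False})
print(f"cyclic attractor of length {len(attractor)}")
for s in attractor:
    print(s)
```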

Flow modeling in highly fissured media such as karsts: morphological study, hydraulics and upscaling

Bailly, David 24 June 2009
Karstic aquifers contain large subsurface water resources. These aquifers are complex and heterogeneous over a large range of scales, and their management requires appropriate numerical tools and approaches. Various tools and numerical methodologies have been developed to characterize and model the geometry and hydraulic properties of karstic aquifers and, more generally, of highly fissured 2D and 3D porous media. In this study, we emphasize morphological characterization, and we analyze hydrodynamic behavior through the concept of upscaling ("second upscaling"). Concerning the morphology of fissured porous media, several directions are explored: random media, Boolean media with statistical fissure networks, and morphogenetic models. Hydrodynamic upscaling is developed using the macro-permeability concept. This upscaling is based either on Darcy's linear law, or on a combination of Darcy's law and Ward-Forchheimer's quadratic law (inertial effects). First, the study focuses on Darcy's linear head-loss law: Darcian effective permeabilities are computed numerically in terms of the volume fractions of fissures and the fissure/matrix permeability contrast. The results are analyzed and compared with analytical results and bounds. A special study of percolation and quasi-percolation effects for high contrasts leads to the definition of three critical fractions, which are related to percolation thresholds. Secondly, in order to account for inertial effects in the fissures, the study is extended to a local law with a quadratic velocity term (Darcy/Ward-Forchheimer). An equivalent nonlinear macroscopic permeability is then defined and analyzed using a generalized inertial model (linear/power). Finally, the large-scale hydraulic anisotropy of the fissured medium is studied, in terms of directional permeabilities, using an "immersion" numerical method.
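For reference, the local flow laws named in the abstract have the following standard textbook forms; the notation (viscosity μ, permeability k, density ρ, inertial coefficient β) is conventional and not necessarily the thesis's own.

```latex
% Darcy's linear head-loss law, its quadratic (inertial) extension, and the
% block-averaged law that defines an effective permeability tensor.
\begin{align}
  -\nabla P &= \frac{\mu}{k}\,\mathbf{q}
      && \text{(Darcy)}\\
  -\nabla P &= \frac{\mu}{k}\,\mathbf{q} \;+\; \beta\,\rho\,\lvert\mathbf{q}\rvert\,\mathbf{q}
      && \text{(Darcy/Ward--Forchheimer)}\\
  \langle\mathbf{q}\rangle &= -\frac{1}{\mu}\,\mathbf{K}_{\mathrm{eff}}\,\nabla\langle P\rangle
      && \text{(upscaled law defining } \mathbf{K}_{\mathrm{eff}}\text{)}
\end{align}
```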

Automating Component-Based System Assembly

Subramanian, Gayatri 23 May 2006
Owing to advancements in component re-use technology, component-based software development (CBSD) has come a long way in developing complex commercial software systems while reducing software development time and cost. However, assembling distributed, resource-constrained and safety-critical systems using current assembly techniques is a challenge. Within complex systems, when there are numerous ways to assemble the components, determining the correct assembly that satisfies the system assembly constraints is difficult unless the software architecture clearly defines how the components should be composed. Component technologies like CORBA and .NET do a very good job of integrating components, but they do not automate component assembly; it is the system developer's responsibility to ensure that the components are assembled correctly. In this thesis, we first define a component-based system assembly (CBSA) technique called the "Constrained Component Assembly Technique" (CCAT), which is useful when the system has complex assembly constraints and the system architecture specifies component composition as assembly constraints. The technique poses the question: does there exist a way of assembling the components that satisfies all the connection, performance, reliability, and safety constraints of the system, while optimizing the objective constraint? To implement CCAT, we present a powerful framework called "CoBaSA". The CoBaSA framework includes an expressive language for declaratively describing component functional and extra-functional properties, component interfaces, and system-level and component-level connection, performance, reliability, safety, and optimization constraints. To perform CBSA, we first write a program (in the CoBaSA language) describing the CBSA specifications and constraints, and an interpreter then translates the CBSA program into a satisfiability and optimization problem. Solving the generated problem is equivalent to answering the question posed by CCAT; if a satisfiable solution is found, we deduce that the system can be assembled without violating any constraints. Since CCAT and CoBaSA provide a mechanism for assembling systems that have complex assembly constraints, they can be utilized in several industries such as avionics. We demonstrate the merits of CoBaSA by assembling an actual avionic system that could be used on board a Boeing aircraft. The empirical evaluation shows that our approach is promising and can scale to handle complex industrial problems.
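The question CCAT poses ("does a valid assembly exist, and which one is best?") can be illustrated with a brute-force sketch. CoBaSA itself uses a declarative language and a satisfiability/optimization back end rather than enumeration, and the roles, parts, buses and budgets below are invented.

```python
from itertools import product

# Brute-force illustration of "does a valid assembly exist?": enumerate every
# mapping of abstract roles to concrete implementations, keep the assemblies
# that satisfy the connection and performance constraints, and pick the best
# one under a simple objective.  Roles, parts and budgets are invented.
implementations = {
    "sensor":     [{"name": "sensorA", "cpu": 2, "bus": "can"},
                   {"name": "sensorB", "cpu": 1, "bus": "ethernet"}],
    "controller": [{"name": "ctrlX", "cpu": 3, "bus": "can"},
                   {"name": "ctrlY", "cpu": 2, "bus": "ethernet"}],
    "actuator":   [{"name": "actP", "cpu": 1, "bus": "can"}],
}
CPU_BUDGET = 6                                # performance constraint

def valid(assembly):
    bus = assembly["controller"]["bus"]       # connection constraint: a shared bus
    if any(part["bus"] != bus for part in assembly.values()):
        return False
    return sum(part["cpu"] for part in assembly.values()) <= CPU_BUDGET

roles = list(implementations)
solutions = [dict(zip(roles, choice))
             for choice in product(*(implementations[r] for r in roles))
             if valid(dict(zip(roles, choice)))]

# objective constraint: among valid assemblies, minimize the total CPU load
best = min(solutions, key=lambda a: sum(p["cpu"] for p in a.values()), default=None)
print(best and {role: part["name"] for role, part in best.items()})
```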

A new algorithm for the quantified satisfiability problem, based on zero-suppressed binary decision diagrams and memoization

Ghasemzadeh, Mohammad January 2005
Quantified Boolean formulas (QBFs) play an important role in theoretical computer science. QBF extends propositional logic in such a way that many advanced forms of reasoning can be easily formulated and evaluated. In this dissertation we present ZQSAT, an algorithm for evaluating quantified Boolean formulas. ZQSAT is based on ZBDDs (zero-suppressed binary decision diagrams), a variant of BDDs, and on an adapted version of the DPLL algorithm. It has been implemented in C using the CUDD (Colorado University Decision Diagram) package. The capability of ZBDDs to store sets of subsets efficiently enabled us to store the clauses of a QBF very compactly and let us embed the notion of memoization into the DPLL algorithm. These points led us to implement the search algorithm in such a way that we could store and reuse the results of all previously solved subformulas with little overhead. ZQSAT can solve some sets of standard QBF benchmark problems (known to be hard for DPLL-based algorithms) faster than the best existing solvers. In addition to prenex-CNF, ZQSAT accepts prenex-NNF formulas, and we show and prove how this capability can be exponentially beneficial. / In this dissertation we present a new algorithm that solves formulas of quantified propositional logic (quantified Boolean formulas, QBFs). QBFs extend classical propositional logic with quantification over propositional variables. Quantified propositional logic is a conservative extension of propositional logic, i.e. no more theorems can be proven than in ordinary propositional logic; the advantage of using QBFs lies in the possibility of representing problems more compactly. SAT (the satisfiability question for propositional formulas) and QSAT (the satisfiability question for QBFs) are central problems in computer science with a wealth of applications, for example in graph theory, planning problems, non-monotonic logics, and verification. In particular, the verification of hardware and software is a very active and important research area in computer science. Our algorithm for solving QBFs is based on so-called ZBDDs (zero-suppressed binary decision diagrams), a variant of BDDs (binary decision diagrams), which are a compact representation of propositional formulas. The algorithm combines known techniques for solving QBFs with the ZBDD representation, using suitable heuristics and memoization. Memoization makes it easy to reuse already solved subproblems. The algorithm was implemented using the CUDD (Colorado University Decision Diagram) package and has been released under the name ZQSAT. In our tests we were able to show that ZQSAT is competitive with existing QBF solvers and in some cases even delivers better results.
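The two ideas the abstract highlights, recursive evaluation of the quantifier prefix and memoization of solved subformulas, can be illustrated with a deliberately naive sketch over a plain clause list; ZQSAT itself operates on ZBDD-encoded clause sets inside a DPLL-style search, which this toy does not attempt to reproduce. The example formula is invented.

```python
from functools import lru_cache

# Naive QSAT sketch: a prenex QBF given as a quantifier prefix over variables
# 1..n and a CNF matrix (tuples of integer literals, negative = negated).
# Only the recursion and the memoization are illustrated here.
PREFIX = (("forall", 1), ("exists", 2), ("exists", 3))
MATRIX = ((1, -2), (-1, 2), (2, 3))        # (x1 <-> x2) and (x2 or x3), invented

@lru_cache(maxsize=None)
def solve(level, assignment):
    """assignment is a frozenset of decided literals; results are memoized."""
    remaining = []
    for clause in MATRIX:
        if any(lit in assignment for lit in clause):
            continue                                    # clause already satisfied
        reduced = tuple(lit for lit in clause if -lit not in assignment)
        if not reduced:
            return False                                # clause falsified
        remaining.append(reduced)
    if level == len(PREFIX):
        return not remaining                            # every clause satisfied?
    quantifier, var = PREFIX[level]
    branches = (solve(level + 1, assignment | {var}),
                solve(level + 1, assignment | {-var}))
    return all(branches) if quantifier == "forall" else any(branches)

print(solve(0, frozenset()))   # True: for every x1 a suitable x2, x3 exists
```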

Procedural modeling with components

Leblanc, Luc 08 1900
The realism of computer graphics images requires the creation of objects (or scenes) of increasing complexity, which leads to considerable costs. Procedural modeling can help to automate the creation process, to simplify the modification process, or to generate multiple variations of an object instance. However, although several procedural methods exist, no single method allows the creation of all types of complex objects, including in particular a complete building. This thesis proposes two solutions to the problem of procedural modeling: one solution addressing the geometry level, and the other introducing a general system suitable for complex object modeling. First, we present a simple and general modeling primitive, called a block, based on a generalized cuboid shape. Blocks are laid out and connected together to constitute the base shape of complex objects, from which is extracted a control mesh that can contain both smooth and sharp edges. The volumetric nature of the blocks allows for easy topology specification, as well as CSG operations between blocks. The surface parameterization inherited from the block faces provides support for texturing and displacement functions to apply surface details. A variety of examples illustrate the generality of our blocks in both interactive and procedural modeling contexts. Second, we present a novel procedural modeling system which unifies various techniques into a common framework. Our system relies on the concept of components to spatially and semantically define various elements. Through a series of successive statements executed on a subset of queried components, we grow a tree of components ultimately defining an object whose geometry is made from blocks. We applied our concept and representation of components to the generation of complete buildings, with coherent interiors and exteriors. The system proves general and well adapted to support partitioning of spaces, insertion of openings (doors and windows), embedding of staircases, decoration of façades and walls, layout of furniture, and various other operations required when constructing a complete building.
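A minimal sketch of the component-tree idea described above: components carry a semantic tag and a spatial extent, queries select tagged subsets of the tree, and successive statements refine the selected components into children. The tags, dimensions and the single split operation are invented simplifications of the far richer system described in the thesis (blocks, control meshes, CSG, facade and furniture operators).

```python
# Minimal component-tree sketch: components carry a semantic tag and a 2D box,
# queries select tagged subsets of the tree, and successive statements refine
# the selection into children.  Tags, sizes and the split operation are invented.
class Component:
    def __init__(self, tag, box, parent=None):
        self.tag, self.box, self.children = tag, box, []   # box = (x, y, width, height)
        if parent is not None:
            parent.children.append(self)

    def query(self, tag):
        """All components with the given semantic tag in this subtree."""
        found = [self] if self.tag == tag else []
        for child in self.children:
            found.extend(child.query(tag))
        return found

    def split_x(self, parts, tag):
        """Refine this component into `parts` equal slices along x."""
        x, y, w, h = self.box
        return [Component(tag, (x + i * w / parts, y, w / parts, h), self)
                for i in range(parts)]

building = Component("building", (0, 0, 20, 10))
for wing in building.split_x(2, "wing"):     # statement 1: two wings
    wing.split_x(4, "room")                  # statement 2: four rooms per wing
print([room.box for room in building.query("room")])   # eight rooms in the tree
```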

An Efficient, Extensible, Hardware-aware Indexing Kernel

Sadoghi Hamedani, Mohammad 20 June 2014
Modern hardware has the potential to play a central role in scalable data management systems. A realization of this potential arises in the context of indexing queries, a recurring theme in real-time data analytics, targeted advertising, algorithmic trading, and data-centric workflows, and of indexing data, a challenge in multi-version analytical query processing. To enhance query and data indexing, in this thesis we present an efficient, extensible, and hardware-aware indexing kernel. This indexing kernel rests upon novel data structures and (parallel) algorithms that utilize the capabilities offered by modern hardware, especially the abundance of main memory, multi-core architectures, hardware accelerators, and solid-state drives. The thesis focuses on query indexing techniques for processing queries in data-intensive applications that are subject to ever-increasing data volume and velocity. At the core of our query indexing kernel lies the BE-Tree family of memory-resident indexing structures, which scales by overcoming the curse of dimensionality through a novel two-phase space-cutting technique, effective top-k processing, and adaptive parallel algorithms that operate directly on compressed data (exploiting the multi-core architecture). Furthermore, we achieve line-rate processing by harnessing the unprecedented degrees of parallelism and pipelining only available through low-level logic design using FPGAs. We present a comprehensive evaluation that establishes the superiority of BE-Tree in comparison with state-of-the-art algorithms. The thesis then expands the scope of the indexing kernel and describes how to accelerate analytical queries on (multi-version) databases by enabling indexes on the most recent data. Our goal is to reduce the overhead of index maintenance, so that indexes can be used effectively for analytical queries without being a heavy burden on transaction throughput. To achieve this end, we re-design the data structures in the storage hierarchy to employ an extra level of indirection over solid-state drives. This indirection layer dramatically reduces the number of magnetic disk I/Os needed for updating indexes and localizes index maintenance. As a result, by rethinking how data is indexed, we eliminate the dilemma between update and query performance and substantially reduce index maintenance and query processing costs.
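The indirection idea mentioned at the end of the abstract can be sketched as a logical-to-physical mapping table: index nodes reference stable logical page ids, so relocating or rewriting a page updates one table entry instead of every referencing index node. The structure below is an invented illustration, not the thesis's storage layout.

```python
# Toy logical-to-physical indirection table: index nodes store stable logical
# page ids, and relocating a page touches one mapping entry rather than every
# index node pointing at it.  Invented structure, not the thesis's layout.
class IndirectionLayer:
    def __init__(self):
        self.table = {}                       # logical page id -> physical location

    def register(self, logical_id, physical_loc):
        self.table[logical_id] = physical_loc

    def relocate(self, logical_id, new_physical_loc):
        self.table[logical_id] = new_physical_loc   # index nodes stay untouched

    def resolve(self, logical_id):
        return self.table[logical_id]

layer = IndirectionLayer()
layer.register("leaf-42", ("ssd", 0x1F00))
# an index node keeps only the id "leaf-42"; after a page rewrite we remap it:
layer.relocate("leaf-42", ("ssd", 0x2A40))
print(layer.resolve("leaf-42"))
```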

Towards Next Generation Sequential and Parallel SAT Solvers

Manthey, Norbert 08 January 2015
This thesis focuses on improving SAT solving technology in two major areas: sequential SAT solving and parallel SAT solving. To better understand sequential SAT algorithms, the abstract reduction system Generic CDCL is introduced. With Generic CDCL, the soundness of solving techniques can be modeled. Next, the conflict-driven clause learning algorithm is extended with three techniques, local look-ahead, local probing and all-UIP learning, that allow more global reasoning during search. These techniques improve the performance of the sequential SAT solver Riss. Then, the formula simplification techniques bounded variable addition, covered literal elimination and an advanced cardinality constraint extraction are introduced. By using these techniques, the reasoning of the overall SAT solving tool chain becomes stronger than plain resolution. When these three techniques are used in the formula simplification tool Coprocessor before Riss solves a formula, the performance improves further. Due to the increasing number of cores in CPUs, the scalable parallel SAT solving approach iterative partitioning has been implemented in Pcasso for the multi-core architecture. Related work on parallel SAT solving has been studied to extract the main ideas that can improve Pcasso. Besides parallel formula simplification with bounded variable elimination, the major extension is level-based clause tagging for extended clause sharing, which builds the basis for conflict-driven node killing. The latter allows unsatisfiable search space partitions to be identified more reliably. Another improvement is to combine scattering and look-ahead as a superior search space partitioning function. In combination with Coprocessor, the introduced extensions increase the performance of the parallel solver Pcasso. The implemented system turns out to be scalable for the multi-core architecture, so iterative partitioning is interesting for future parallel SAT solvers. The implemented solvers participated in international SAT competitions. In 2013 and 2014, Pcasso showed good performance, and Riss in combination with Coprocessor won several first, second and third prizes, including two Kurt Gödel medals. Hence, the introduced algorithms improve modern SAT solving technology.
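For readers unfamiliar with the vocabulary above, a bare-bones DPLL sketch (unit propagation plus splitting) shows the search core that CDCL solvers such as Riss build on; clause learning, look-ahead, probing and the in-processing techniques named in the abstract are all absent from this toy, and the formula is invented.

```python
# Bare-bones DPLL: unit propagation plus splitting on an unassigned variable.
# Clauses are tuples of integer literals (negative = negated); the assignment
# is a frozenset of decided literals.  The formula is an invented example.
def unit_propagate(clauses, assignment):
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assignment for lit in clause):
                continue                               # clause already satisfied
            open_lits = [lit for lit in clause if -lit not in assignment]
            if not open_lits:
                return None                            # conflict: clause falsified
            if len(open_lits) == 1:                    # unit clause forces its literal
                assignment = assignment | {open_lits[0]}
                changed = True
    return assignment

def dpll(clauses, assignment=frozenset()):
    assignment = unit_propagate(clauses, assignment)
    if assignment is None:
        return None
    free = {abs(l) for c in clauses for l in c} - {abs(l) for l in assignment}
    if not free:
        return assignment                              # model found
    var = min(free)                                    # naive decision heuristic
    return dpll(clauses, assignment | {var}) or dpll(clauses, assignment | {-var})

cnf = [(1, -2), (-1, 2), (2, 3), (-3,)]
print(dpll(cnf))                                       # e.g. frozenset({1, 2, -3})
```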
