Spelling suggestions: "subject:"algorithmen"" "subject:"basisalgorithmen""
111 |
Top-k Entity Augmentation using Consistent Set CoveringEberius, Julian, Thiele, Maik, Braunschweig, Katrin, Lehner, Wolfgang 19 September 2022 (has links)
Entity augmentation is a query type in which, given a set of entities and a large corpus of possible data sources, the values of a missing attribute are to be retrieved. State of the art methods return a single result that, to cover all queried entities, is fused from a potentially large set of data sources. We argue that queries on large corpora of heterogeneous sources using information retrieval and automatic schema matching methods can not easily return a single result that the user can trust, especially if the result is composed from a large number of sources that user has to verify manually. We therefore propose to process these queries in a Top-k fashion, in which the system produces multiple minimal consistent solutions from which the user can choose to resolve the uncertainty of the data sources and methods used. In this paper, we introduce and formalize the problem of consistent, multi-solution set covering, and present algorithms based on a greedy and a genetic optimization approach. We then apply these algorithms to Web table-based entity augmentation. The publication further includes a Web table corpus with 100M tables, and a Web table retrieval and matching system in which these algorithms are implemented. Our experiments show that the consistency and minimality of the augmentation results can be improved using our set covering approach, without loss of precision or coverage and while producing multiple alternative query results.
|
112 |
Complexity of Normal Forms on Structures of Bounded DegreeHeimberg, Lucas 04 June 2018 (has links)
Normalformen drücken semantische Eigenschaften einer Logik durch syntaktische Restriktionen aus. Sie ermöglichen es Algorithmen, Grenzen der Ausdrucksstärke einer Logik auszunutzen. Ein Beispiel ist die Lokalität der Logik erster Stufe (FO), die impliziert, dass Graph-Eigenschaften wie Erreichbarkeit oder Zusammenhang nicht FO-definierbar sind. Gaifman-Normalformen drücken die Bedeutung einer FO-Formel als Boolesche Kombination lokaler Eigenschaften aus. Sie haben eine wichtige Rolle in Model-Checking Algorithmen für Klassen dünn besetzter Graphen, deren Laufzeit durch die Größe der auszuwertenden Formel parametrisiert ist. Es ist jedoch bekannt, dass Gaifman-Normalformen im Allgemeinen nur mit nicht-elementarem Aufwand konstruiert werden können. Dies führt zu einer enormen Parameterabhängigkeit der genannten Algorithmen. Ähnliche nicht-elementare untere Schranken sind auch für Feferman-Vaught-Zerlegungen und für die Erhaltungssätze von Lyndon, Łoś und Tarski bekannt.
Diese Arbeit untersucht die Komplexität der genannten Normalformen auf Klassen von Strukturen beschränkten Grades, für welche die nicht-elementaren unteren Schranken nicht gelten. Für diese Einschränkung werden Algorithmen mit elementarer Laufzeit für die Konstruktion von Gaifman-Normalformen, Feferman-Vaught-Zerlegungen, und für die Erhaltungssätze von Lyndon, Łoś und Tarski entwickelt, die in den ersten beiden Fällen worst-case optimal sind.
Wichtig hierfür sind Hanf-Normalformen. Es wird gezeigt, dass eine Erweiterung von FO durch unäre Zählquantoren genau dann Hanf-Normalformen erlaubt, wenn alle Zählquantoren ultimativ periodisch sind, und wie Hanf-Normalformen in diesen Fällen in elementarer und worst-case optimaler Zeit konstruiert werden können.
Dies führt zu Model-Checking Algorithmen für solche Erweiterungen von FO sowie zu Verallgemeinerungen der Algorithmen für Feferman-Vaught-Zerlegungen und die Erhaltungssätze von Lyndon, Łoś und Tarski. / Normal forms express semantic properties of logics by means of syntactical restrictions. They allow algorithms to benefit from restrictions of the expressive power of a logic. An example is the locality of first-order logic (FO), which implies that properties like reachability or connectivity cannot be defined in FO. Gaifman's local normal form expresses the satisfaction conditions of an FO-formula by a Boolean combination of local statements. Gaifman normal form serves as a first step in fixed-parameter model-checking algorithms, parameterised by the size of the formula, on sparse graph classes. However, it is known that in general, there are non-elementary lower bounds for the costs involved in transforming a formula into Gaifman normal form. This leads to an enormous parameter-dependency of the aforementioned algorithms. Similar non-elementary lower bounds also hold for Feferman-Vaught decompositions and for the preservation theorems by Lyndon, Łoś, and Tarski.
This thesis investigates the complexity of these normal forms when restricting attention to classes of structures of bounded degree, for which the non-elementary lower bounds are known to fail. Under this restriction, the thesis provides
algorithms with elementary and even worst-case optimal running time for the construction of Gaifman normal form and Feferman-Vaught decompositions. For the preservation theorems, algorithmic versions with elementary running time and non-matching lower bounds are provided.
Crucial for these results is the notion of Hanf normal form. It is shown that an extension of FO by unary counting quantifiers allows Hanf normal forms if, and only if, all quantifiers are ultimately periodic, and furthermore, how Hanf normal form can be computed in elementary and worst-case optimal time in these cases. This leads to model-checking algorithms for such extensions of FO and also allows generalisations of the constructions for Feferman-Vaught decompositions and preservation theorems.
|
113 |
Scheduling algorithms for saving energy and balancing loadAntoniadis, Antonios 16 August 2012 (has links)
Diese Arbeit beschäftigt sich mit Scheduling von Tasks in Computersystemen. Wir untersuchen sowohl die in neueren Arbeiten betrachtete Zielfunktion zur Energieminimierung als auch die klassische Zielfunktion zur Lastbalancierung auf mehreren Prozessoren. Beim Speed-Scaling mit Sleep-State darf ein Prozessor, der zu jedem Zeitpunkt seine Geschwindigkeit anpassen kann, auch in einen Schlafmodus übergehen. Unser Ziel ist es, den Energieverbrauch zu minimieren. Wir zeigen die NP-Härte des Problems und klären somit den Komplexitätsstatus. Wir beweisen eine untere Schranke für die Approximationsgüte für eine spezielle natürliche Klasse von Schedules. Ferner entwickeln wir eine Familie von Algorithmen, die gute Approximationsfaktoren liefert, und zeigen, dass diese sogar Lösungen liefert, die optimal für die zuvor erwähnte Klasse von Schedules sind. Anschließend widmen wir unsere Aufmerksamkeit dem folgenden Termin-basierten Scheduling-Problem. Es seien mehrere Prozessoren gegeben, wobei jeder einzelne Prozessor zu jedem Zeitpunkt seine Geschwindigkeit anpassen kann. Ziel ist es wie zuvor, den Energieverbrauch des erzeugten Schedules zu minimieren. Für den Offline-Fall entwickeln wir einen optimalen Polynomialzeit-Algorithmus. Für das Online-Problem erweitern wir die zwei bekannten Ein-Prozessor-Algorithmen Optimal Available und Average Rate. Wir zeigen, dass diese den gleichen bzw. einen um die additive Konstante von eins vergrößerten kompetiven Faktor haben. Bei der Lastbalancierung auf mehreren Prozessoren betrachten wir Offline-Load-Balancing auf identischen Maschinen. Unser Ziel ist es, die Current-Load für temporäre Tasks mit identischem Gewicht zu minimieren. Wir zeigen, dass eine Lösung mit maximaler Imbalance von eins immer existiert und entwickeln einen effizienten Algorithmus, der solche Lösungen liefert. Zum Schluss beweisen wir die NP-Härte von zwei Verallgemeinerungen des Problems. / This thesis studies problems of scheduling tasks in computing environments. We consider both the modern objective function of minimizing energy consumption, and the classical objective of balancing load across machines. We first investigate offline deadline-based scheduling in the setting of a single variable-speed processor that is equipped with a sleep state. The objective is that of minimizing the total energy consumption. Apart from settling the complexity of the problem by showing its NP-hardness, we provide a lower bound of 2 for general convex power functions, and a particular natural class of schedules. We also present an algorithmic framework for designing good approximation algorithms. Furthermore, we give tight bounds for the aforementioned particular class of schedules. We then focus on the multiprocessor setting where each processor has the ability to vary its speed. We first study the offline problem and show that optimal schedules can be computed efficiently in polynomial time. Regarding the online problem and a natural class of power functions, we extend the two well-known single-processor algorithms Optimal Available and Average Rate. We prove that Optimal Available has the same competitive ratio as in the single-processor case. For Average Rate we show a competitive factor that increases by an additive constant of one compared to the single-processor result. With respect to load balancing, we consider offline load balancing on identical machines, with the objective of minimizing the current load, for temporary unit-weight jobs. The problem can be seen as coloring n intervals with k colors, such that for each point on the line, the maximal difference between the number of intervals of any two colors is minimal. We prove that a coloring with maximal difference at most one is always possible, and develop a fast polynomial-time algorithm for generating such a coloring. Lastly, we prove that two generalizations of the problem are NP-hard.
|
114 |
Parallele dynamische Adaption hybrider Netze für effizientes verteiltes Rechnen / Parallel dynamic adaptation of hybrid grids for efficient distributed computingAlrutz, Thomas 17 September 2008 (has links)
No description available.
|
115 |
Computation with finitely L-presented groups / Algorithmen für endlich L-präsentierte GruppenHartung, René 01 June 2012 (has links)
No description available.
|
116 |
Proximal Splitting Methods in Nonsmooth Convex OptimizationHendrich, Christopher 25 July 2014 (has links) (PDF)
This thesis is concerned with the development of novel numerical methods for solving nondifferentiable convex optimization problems in real Hilbert spaces and with the investigation of their asymptotic behavior. To this end, we are also making use of monotone operator theory as some of the provided algorithms are originally designed to solve monotone inclusion problems.
After introducing basic notations and preliminary results in convex analysis, we derive two numerical methods based on different smoothing strategies for solving nondifferentiable convex optimization problems. The first approach, known as the double smoothing technique, solves the optimization problem with some given a priori accuracy by applying two regularizations to its conjugate dual problem. A special fast gradient method then solves the regularized dual problem such that an approximate primal solution can be reconstructed from it. The second approach affects the primal optimization problem directly by applying a single regularization to it and is capable of using variable smoothing parameters which lead to a more accurate approximation of the original problem as the iteration counter increases. We then derive and investigate different primal-dual methods in real Hilbert spaces. In general, one considerable advantage of primal-dual algorithms is that they are providing a complete splitting philosophy in that the resolvents, which arise in the iterative process, are only taken separately from each maximally monotone operator occurring in the problem description. We firstly analyze the forward-backward-forward algorithm of Combettes and Pesquet in terms of its convergence rate for the objective of a nondifferentiable convex optimization problem. Additionally, we propose accelerations of this method under the additional assumption that certain monotone operators occurring in the problem formulation are strongly monotone. Subsequently, we derive two Douglas–Rachford type primal-dual methods for solving monotone inclusion problems involving finite sums of linearly composed parallel sum type monotone operators. To prove their asymptotic convergence, we use a common product Hilbert space strategy by reformulating the corresponding inclusion problem reasonably such that the Douglas–Rachford algorithm can be applied to it. Finally, we propose two primal-dual algorithms relying on forward-backward and forward-backward-forward approaches for solving monotone inclusion problems involving parallel sums of linearly composed monotone operators.
The last part of this thesis deals with different numerical experiments where we intend to compare our methods against algorithms from the literature. The problems which arise in this part are manifold and they reflect the importance of this field of research as convex optimization problems appear in lots of applications of interest.
|
117 |
Frequent itemset mining on multiprocessor systemsSchlegel, Benjamin 08 May 2014 (has links) (PDF)
Frequent itemset mining is an important building block in many data mining applications like market basket analysis, recommendation, web-mining, fraud detection, and gene expression analysis. In many of them, the datasets being mined can easily grow up to hundreds of gigabytes or even terabytes of data. Hence, efficient algorithms are required to process such large amounts of data. In recent years, there have been many frequent-itemset mining algorithms proposed, which however (1) often have high memory requirements and (2) do not exploit the large degrees of parallelism provided by modern multiprocessor systems. The high memory requirements arise mainly from inefficient data structures that have only been shown to be sufficient for small datasets. For large datasets, however, the use of these data structures force the algorithms to go out-of-core, i.e., they have to access secondary memory, which leads to serious performance degradations. Exploiting available parallelism is further required to mine large datasets because the serial performance of processors almost stopped increasing. Algorithms should therefore exploit the large number of available threads and also the other kinds of parallelism (e.g., vector instruction sets) besides thread-level parallelism.
In this work, we tackle the high memory requirements of frequent itemset mining twofold: we (1) compress the datasets being mined because they must be kept in main memory during several mining invocations and (2) improve existing mining algorithms with memory-efficient data structures. For compressing the datasets, we employ efficient encodings that show a good compression performance on a wide variety of realistic datasets, i.e., the size of the datasets is reduced by up to 6.4x. The encodings can further be applied directly while loading the dataset from disk or network. Since encoding and decoding is repeatedly required for loading and mining the datasets, we reduce its costs by providing parallel encodings that achieve high throughputs for both tasks. For a memory-efficient representation of the mining algorithms’ intermediate data, we propose compact data structures and even employ explicit compression. Both methods together reduce the intermediate data’s size by up to 25x. The smaller memory requirements avoid or delay expensive out-of-core computation when large datasets are mined.
For coping with the high parallelism provided by current multiprocessor systems, we identify the performance hot spots and scalability issues of existing frequent-itemset mining algorithms. The hot spots, which form basic building blocks of these algorithms, cover (1) counting the frequency of fixed-length strings, (2) building prefix trees, (3) compressing integer values, and (4) intersecting lists of sorted integer values or bitmaps. For all of them, we discuss how to exploit available parallelism and provide scalable solutions. Furthermore, almost all components of the mining algorithms must be parallelized to keep the sequential fraction of the algorithms as small as possible. We integrate the parallelized building blocks and components into three well-known mining algorithms and further analyze the impact of certain existing optimizations. Our algorithms are already single-threaded often up an order of magnitude faster than existing highly optimized algorithms and further scale almost linear on a large 32-core multiprocessor system. Although our optimizations are intended for frequent-itemset mining algorithms, they can be applied with only minor changes to algorithms that are used for mining of other types of itemsets.
|
118 |
Entwicklung eines Evolutionären Algorithmus zur Preisoptimierung für kleine und mittlere Handelsunternehmen / Development of an evolutionary algorithm for price optimization for small and medium sized enterprisesLüders, Sören Oliver 20 April 2018 (has links)
No description available.
|
119 |
Frequent itemset mining on multiprocessor systemsSchlegel, Benjamin 30 May 2013 (has links)
Frequent itemset mining is an important building block in many data mining applications like market basket analysis, recommendation, web-mining, fraud detection, and gene expression analysis. In many of them, the datasets being mined can easily grow up to hundreds of gigabytes or even terabytes of data. Hence, efficient algorithms are required to process such large amounts of data. In recent years, there have been many frequent-itemset mining algorithms proposed, which however (1) often have high memory requirements and (2) do not exploit the large degrees of parallelism provided by modern multiprocessor systems. The high memory requirements arise mainly from inefficient data structures that have only been shown to be sufficient for small datasets. For large datasets, however, the use of these data structures force the algorithms to go out-of-core, i.e., they have to access secondary memory, which leads to serious performance degradations. Exploiting available parallelism is further required to mine large datasets because the serial performance of processors almost stopped increasing. Algorithms should therefore exploit the large number of available threads and also the other kinds of parallelism (e.g., vector instruction sets) besides thread-level parallelism.
In this work, we tackle the high memory requirements of frequent itemset mining twofold: we (1) compress the datasets being mined because they must be kept in main memory during several mining invocations and (2) improve existing mining algorithms with memory-efficient data structures. For compressing the datasets, we employ efficient encodings that show a good compression performance on a wide variety of realistic datasets, i.e., the size of the datasets is reduced by up to 6.4x. The encodings can further be applied directly while loading the dataset from disk or network. Since encoding and decoding is repeatedly required for loading and mining the datasets, we reduce its costs by providing parallel encodings that achieve high throughputs for both tasks. For a memory-efficient representation of the mining algorithms’ intermediate data, we propose compact data structures and even employ explicit compression. Both methods together reduce the intermediate data’s size by up to 25x. The smaller memory requirements avoid or delay expensive out-of-core computation when large datasets are mined.
For coping with the high parallelism provided by current multiprocessor systems, we identify the performance hot spots and scalability issues of existing frequent-itemset mining algorithms. The hot spots, which form basic building blocks of these algorithms, cover (1) counting the frequency of fixed-length strings, (2) building prefix trees, (3) compressing integer values, and (4) intersecting lists of sorted integer values or bitmaps. For all of them, we discuss how to exploit available parallelism and provide scalable solutions. Furthermore, almost all components of the mining algorithms must be parallelized to keep the sequential fraction of the algorithms as small as possible. We integrate the parallelized building blocks and components into three well-known mining algorithms and further analyze the impact of certain existing optimizations. Our algorithms are already single-threaded often up an order of magnitude faster than existing highly optimized algorithms and further scale almost linear on a large 32-core multiprocessor system. Although our optimizations are intended for frequent-itemset mining algorithms, they can be applied with only minor changes to algorithms that are used for mining of other types of itemsets.
|
120 |
A 2TnC ferroelectric memory gain cell suitable for compute-in-memory and neuromorphic applicationSlesazeck, Stefan, Ravsher, Taras, Havel, Viktor, Breyer, Evelyn T., Mulaosmanovic, Halid, Mikolajick, Thomas 20 June 2022 (has links)
A 2TnC ferroelectric memory gain cell consisting of two transistors and two or more ferroelectric capacitors (FeCAP) is proposed. While a pre-charge transistor allows to access the single cell in an array, the read transistor amplifies the small read signals from small-scaled FeCAPs that can be operated either in FeRAM mode by sensing the polarization reversal current, or in ferroelectric tunnel junction (FTJ) mode by sensing the polarization dependent leakage current. The simultaneous read or write operation of multiple FeCAPs is used to realize compute-in-memory (CiM) algorithms that enable processing of data being represented by both, non-volatilely internally stored data and externally applied data. The internal gain of the cell mitigates the need for 3D integration of the FeCAPs, thus making the concept very attractive especially for embedded memories. Here we discuss design constraints of the 2TnC cell and present the proof-of-concept for realizing versatile (CiM) approaches by means of electrical characterization results.
|
Page generated in 0.0413 seconds