61. Column-specific Context Extraction for Web Tables

Braunschweig, Katrin, Thiele, Maik, Eberius, Julian, Lehner, Wolfgang 14 June 2022
Relational Web tables have become an important resource for applications such as factual search and entity augmentation. A major challenge for the automatic identification of relevant tables on the Web is that many of these tables have missing or non-informative column labels. Research has focused largely on recovering the meaning of columns by inferring class labels from the instances using external knowledge bases. The table context, which often contains additional information on the table's content, is frequently treated as an indicator of a table's general content, but not as a source of column-specific details. In this paper, we propose a novel approach to identify and extract column-specific information from the context of Web tables. In our extraction framework, we consider different techniques to extract both directly and indirectly related phrases. We perform a number of experiments on Web tables extracted from Wikipedia. The results show that column-specific information extracted using our simple heuristic significantly boosts precision and recall for table and column search.
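The paper's actual heuristic is not reproduced in the abstract; as a hedged illustration of the underlying idea, the sketch below ranks context sentences by lexical overlap with a column's label and instance values (function and parameter names are mine, not the paper's):

```python
import re

def column_context(context, label, instances):
    """Rank context sentences by overlap with a column's label and instances.

    A naive stand-in for column-specific context extraction: a sentence is
    considered column-specific if it mentions the column label or any of
    the column's instance values; sentences with more matches rank higher.
    """
    sentences = re.split(r"(?<=[.!?])\s+", context)
    keywords = {label.lower()} | {i.lower() for i in instances}
    scored = []
    for s in sentences:
        tokens = set(re.findall(r"\w+", s.lower()))
        score = len(tokens & keywords)
        if score > 0:
            scored.append((score, s))
    return [s for _, s in sorted(scored, reverse=True)]
```

A real system would also need to resolve indirect references (pronouns, paraphrases), which is the harder part the paper addresses.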
62. Sample synopses for approximate answering of group-by queries

Lehner, Wolfgang, Rösch, Philipp 22 April 2022
With the amount of data in current data warehouse databases growing steadily, random sampling is continuously gaining in importance. In particular, interactive analyses of large datasets can benefit greatly from the significantly shorter response times of approximate query processing. Typically, such analytical queries partition the data into groups and aggregate the values within the groups. Furthermore, the commonly used roll-up and drill-down operations pose a broad range of group-by queries to the system, which makes the construction of highly specialized synopses difficult. In this paper, we propose a general-purpose sampling scheme that is biased in order to answer group-by queries with high accuracy. While existing techniques focus on a group's size when computing its sample size, our technique is based on its standard deviation. The basic idea is that the more homogeneous a group is, the fewer representatives are required to give a good estimate. With an extensive set of experiments, we show that our approach reduces both the estimation error and the construction cost compared to existing techniques.
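The allocation idea (fewer representatives for more homogeneous groups) can be sketched as a variance-proportional split of a sample budget, in the spirit of Neyman allocation. This is a simplified stand-in, not the paper's actual scheme; names and the minimum-one-representative rule are my assumptions:

```python
import statistics

def allocate_samples(groups, budget):
    """Split a total sample budget across groups in proportion to each
    group's standard deviation: homogeneous groups need fewer representatives.

    Every group gets at least one representative so it can be estimated at
    all; due to rounding, the allocations may not sum exactly to the budget.
    """
    stdevs = {g: statistics.pstdev(vals) for g, vals in groups.items()}
    total = sum(stdevs.values())
    if total == 0:  # all groups constant: fall back to uniform allocation
        base = max(1, budget // len(groups))
        return {g: base for g in groups}
    return {g: max(1, round(budget * sd / total)) for g, sd in stdevs.items()}
```

Classical Neyman allocation additionally weights by group size; the paper's scheme differs in its details, so treat this purely as the intuition.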
63. To and Fro Between Tableaus and Automata for Description Logics

Hladik, Jan 14 November 2007
Description Logics (DLs) are a family of knowledge representation languages with well-defined logic-based semantics and decidable inference problems, e.g. satisfiability. Two of the most widely used decision procedures for the satisfiability problem are tableau- and automata-based algorithms. Because they operate quite differently, these two classes have complementary properties: tableau algorithms are well suited for implementation and for showing PSPACE and NEXPTIME complexity results, whereas automata algorithms are particularly useful for showing EXPTIME results. Additionally, automata allow for a theoretically elegant handling of infinite structures, but are far less suited for implementation.
The aim of this thesis is to analyse the reasons for these differences and to find ways of transferring properties between the two approaches in order to reconcile the positive properties of both. For this purpose, we develop methods that enable us to show PSPACE results with the help of automata and to automatically derive an EXPTIME result from a tableau algorithm.
64. A Genetic-Based Search for Adaptive Table Recognition in Spreadsheets

Lehner, Wolfgang, Koci, Elvis, Thiele, Maik, Romero, Oscar 22 June 2023
Spreadsheets are very successful content generation tools, used in almost every enterprise to create a wealth of information. However, this information is often intermingled with formatting, layout, and textual metadata, making it hard to identify and interpret the tabular payload. Previous work addressed this problem mainly with heuristics. Although fast to implement, such approaches fail to capture the high variability of user-generated spreadsheet tables. In this paper, we therefore propose a supervised approach that is able to adapt to arbitrary spreadsheet datasets. We use a graph model to represent the contents of a sheet, carrying layout and spatial features. We then apply genetic algorithms for graph partitioning to recognize the parts of the graph corresponding to tables in the sheet. The search for tables is guided by an objective function that is tuned to match the specific characteristics of a given dataset. We demonstrate the feasibility of this approach with an experimental evaluation on a large, real-world spreadsheet corpus.
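The actual system works on sheet graphs with a per-dataset objective; as a hedged sketch of just the search component, a generic genetic loop over binary assignments looks like this (population size, rates, and the toy fitness used in the test are illustrative assumptions, not the paper's configuration):

```python
import random

def genetic_search(fitness, length, pop_size=40, generations=60,
                   mutation_rate=0.02, seed=0):
    """Generic genetic search over fixed-length binary strings: tournament
    selection, one-point crossover, per-bit mutation. The paper applies a
    far richer variant to partition a sheet graph into table regions.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        def pick():
            # binary tournament: the fitter of two random individuals survives
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, length)            # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)
```

In the paper's setting, the individuals encode graph partitions and the fitness is the dataset-tuned objective; the loop itself is standard.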
66. Concept development for quality management and predictive maintenance in the area of hydroforming (IHU): SFU 2023

Reuter, Thomas, Massalsky, Kristin, Burkhardt, Thomas 06 March 2024
Series manufacturers in the field of hydroforming face intense competition from alternative conventional manufacturing methods and their cost criteria. Changing production requirements in a globalized market environment demand flexible responses at the highest quality and low cost. Cost savings can be achieved by reducing inventory and work-in-progress stock. Malfunction-related downtimes of hydroforming systems must be reduced to a minimum in order to meet agreed delivery dates on time and avoid contractual penalties. The required productivity and the desired quality level can only be maintained through adapted maintenance strategies, which led to the development of a concept for predictive maintenance with integrated quality management specifically for the IHU domain. Dynamic process and maintenance adaptations are a central component of this development effort.
67. A Modelling Study to Examine Threat Assessment Algorithms Performance in Predicting Cyclist Fall Risk in Safety Critical Bicycle-Automated Vehicle Interactions

Reijne, Marco M., Dehkordi, Sepehr G., Glaser, Sebastien, Twisk, Divera, Schwab, A. L. 19 December 2022
Falls are responsible for a large proportion of serious injuries and deaths among cyclists [1-4]. A common fall scenario is loss of balance during an emergency braking maneuver to avoid another vehicle [5-7]. Automated Vehicles (AVs) have the potential to prevent these critical scenarios between bicycles and cars. However, the current Threat Assessment Algorithms (TAAs) used by AVs consider only collision avoidance when deciding on safe gaps and decelerations while interacting with cyclists, and do not account for bicycle-specific balance-related constraints. To date, no studies have addressed the risk of falls in safety-critical scenarios. Yet, given the bicycle's dynamics, we hypothesized that existing TAAs may be inaccurate in predicting the threat of cyclist falls and may misclassify unsafe interactions. To test this hypothesis, this study developed a simple Newtonian-mechanics-based model that evaluates the performance of two existing TAAs in four critical scenarios under two road conditions. The four scenarios are: (1) a crossing scenario, and a bicycle-following-lead-car scenario in which the car either (2) suddenly brakes, (3) halts, or (4) accelerates from standstill. These scenarios have been identified by bicycle-car conflict studies as common situations in which the car driver elicits an emergency braking response from the cyclist [8-11], and are illustrated in Figure 1. The two TAAs are Time-to-Collision (TTC) and Headway (H), which are commonly used by AVs in the four modelled scenarios. The two road conditions are a flat dry road and a downhill wet road, the latter serving as a worst-case condition for loss of balance during emergency braking [12].
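TTC and Headway are simple kinematic quantities; a minimal sketch of the fall-risk check this abstract motivates might look as follows. The balance-limited deceleration threshold is an illustrative assumption, not a value from the study:

```python
def time_to_collision(gap_m, cyclist_speed, car_speed):
    """TTC: time until the gap closes at the current closing speed.
    Returns infinity when the cyclist is not closing in on the car."""
    closing = cyclist_speed - car_speed
    return gap_m / closing if closing > 0 else float("inf")

def required_deceleration(gap_m, cyclist_speed, car_speed):
    """Constant deceleration needed to shed the speed difference within
    the gap, from v_rel**2 = 2 * a * gap."""
    closing = cyclist_speed - car_speed
    return closing ** 2 / (2 * gap_m) if closing > 0 else 0.0

def fall_risk(gap_m, cyclist_speed, car_speed, max_balance_decel=2.5):
    """Flag interactions where avoiding collision demands braking harder
    than a balance-limited threshold (illustrative default, in m/s^2)."""
    return required_deceleration(gap_m, cyclist_speed, car_speed) > max_balance_decel
```

The study's point is precisely that a collision-only check can look safe while the balance-limited threshold is still exceeded, e.g. on a wet downhill where that threshold drops.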
68. Memory-Efficient Frequent-Itemset Mining

Schlegel, Benjamin, Gemulla, Rainer, Lehner, Wolfgang 15 September 2022
Efficient discovery of frequent itemsets in large datasets is a key component of many data mining tasks. In-core algorithms---which operate entirely in main memory and avoid expensive disk accesses---and in particular the prefix-tree-based algorithm FP-growth are generally among the most efficient of the available algorithms. Unfortunately, their excessive memory requirements render them inapplicable for large datasets with many distinct items and/or itemsets of high cardinality. To overcome this limitation, we propose two novel data structures---the CFP-tree and the CFP-array---which reduce memory consumption by about an order of magnitude. This allows us to process significantly larger datasets in main memory than previously possible. Our data structures are based on structural modifications of the prefix tree that increase compressibility, an optimized physical representation, lightweight compression techniques, and intelligent node ordering and indexing. Experiments with both real-world and synthetic datasets show the effectiveness of our approach.
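The CFP-tree itself is too involved for a short sketch, but the prefix sharing it builds on is easy to illustrate: inserting transactions with sorted items into a trie stores common prefixes only once. A toy model, not the paper's data structure:

```python
def build_prefix_tree(transactions):
    """Insert each transaction, items sorted, into a nested-dict trie.
    Shared prefixes collapse into shared paths, which is the source of
    the compression that FP-growth-style structures exploit; returns the
    trie and the number of nodes it actually allocated."""
    root = {}
    node_count = 0
    for t in transactions:
        node = root
        for item in sorted(t):
            if item not in node:
                node[item] = {}
                node_count += 1
            node = node[item]
    return root, node_count
```

The paper's contribution is going well beyond this baseline: reordering and re-indexing nodes and compressing their physical representation so the tree fits datasets that defeat a plain FP-tree.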
69. Scalable frequent itemset mining on many-core processors

Schlegel, Benjamin, Karnagel, Thomas, Kiefer, Tim, Lehner, Wolfgang 19 September 2022
Frequent-itemset mining is an essential part of the association rule mining process, which has many application areas. It is a computation- and memory-intensive task with many opportunities for optimization. Many efficient sequential and parallel algorithms have been proposed in recent years. Most of the parallel algorithms, however, cannot cope with the huge number of threads provided by large multiprocessor or many-core systems. In this paper, we provide mcEclat, a highly parallel version of the well-known Eclat algorithm. It runs on both multiprocessor systems and many-core coprocessors, and scales well up to a very large number of threads---244 in our experiments. To evaluate mcEclat's performance, we conducted extensive experiments on realistic datasets. mcEclat achieves high speedups of up to 11.5x and 100x on a 12-core multiprocessor system and a 61-core Xeon Phi many-core coprocessor, respectively. Furthermore, mcEclat is competitive with highly optimized existing frequent-itemset mining implementations from the FIMI repository.
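mcEclat's parallelization is beyond a short sketch, but the sequential Eclat it builds on, which grows itemsets by intersecting their transaction-id sets, fits in a few lines:

```python
def eclat(transactions, min_support):
    """Classic Eclat: convert to a vertical layout (item -> tidset), then
    recursively extend itemsets, intersecting tidsets and pruning any
    extension whose support drops below min_support.
    Returns a dict mapping frozenset itemsets to their support counts."""
    tidsets = {}
    for tid, t in enumerate(transactions):
        for item in t:
            tidsets.setdefault(item, set()).add(tid)
    frequent = {}

    def recurse(candidates):
        for i, (itemset, tids) in enumerate(candidates):
            frequent[itemset] = len(tids)
            suffix = []
            for other_set, other_tids in candidates[i + 1:]:
                new_tids = tids & other_tids   # support via tidset intersection
                if len(new_tids) >= min_support:
                    suffix.append((itemset | other_set, new_tids))
            if suffix:
                recurse(suffix)

    recurse([(frozenset([i]), t) for i, t in tidsets.items()
             if len(t) >= min_support])
    return frequent
```

Roughly, the recursive calls on disjoint equivalence classes are the independent units of work that a parallel version like mcEclat can distribute across threads.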
70. Cutting plane methods and dual problems

Gladin, Egor 28 August 2024
The present thesis studies cutting plane methods, a group of iterative algorithms for minimizing a (possibly nonsmooth) convex function over a compact convex set. We consider two prominent examples, namely the ellipsoid method and Vaidya's method, and show that their convergence rate is preserved even when an inexact oracle is used. Furthermore, we demonstrate that these methods can be used efficiently in the context of stochastic optimization. Another setting where cutting plane methods are useful is Lagrangian dual problems, in which the objective and its derivatives can typically only be computed approximately; here, the methods' insensitivity to errors in the subgradients comes in handy. As an application example, we propose a linearly converging dual method for a constrained Markov decision process (CMDP) based on Vaidya's algorithm, and demonstrate its performance in a simple RL environment. The thesis also investigates the concept of accuracy certificates for convex minimization problems, which allow for online verification of the accuracy of approximate solutions. We generalize the notion of accuracy certificates to the setting of an inexact first-order oracle and propose an explicit way to construct accuracy certificates for a large class of cutting plane methods. As a by-product, we show that the considered methods can be used efficiently with a noisy oracle even though they were originally designed for an exact oracle. Finally, we examine the proposed certificates in numerical experiments, showing that they provide a tight upper bound on the objective residual.
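A minimal central-cut ellipsoid method, one of the two schemes the thesis studies, can be sketched as follows for dimension n >= 2; this version assumes an exact subgradient oracle and a simple ball-shaped starting region, with no inexactness handling:

```python
import numpy as np

def ellipsoid_method(f, subgrad, x0, radius, iterations):
    """Minimize a convex f over the ball of given radius around x0 using
    central-cut ellipsoid updates; returns the best iterate seen.
    Each step cuts the current ellipsoid with the subgradient half-space
    and replaces it by the minimum-volume ellipsoid containing the cut."""
    n = len(x0)
    x = np.array(x0, dtype=float)
    P = radius ** 2 * np.eye(n)          # shape matrix of current ellipsoid
    best_x, best_f = x.copy(), f(x)
    for _ in range(iterations):
        g = subgrad(x)
        norm = np.sqrt(g @ P @ g)
        if norm < 1e-12:                 # subgradient ~ 0: at a minimizer
            break
        Pg = P @ (g / norm)
        x = x - Pg / (n + 1)             # center of the new ellipsoid
        P = (n ** 2 / (n ** 2 - 1.0)) * (P - (2.0 / (n + 1)) * np.outer(Pg, Pg))
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x.copy(), fx
    return best_x, best_f
```

The thesis's results concern what happens when `subgrad` is only approximate: the ellipsoid and Vaidya updates tolerate such errors without losing their convergence rate.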
