41

Exploring the Genomic Basis of Antibiotic Resistance in Wastewater E. coli: Positive Selection, GWAS, and AI Language Model Analyses

Malekian Boroujeni, Negin 24 October 2023 (has links)
Antibiotic resistance is a critical threat to global health. This thesis examines the relationship between antibiotic resistance and genomic variations in E. coli from wastewater. E. coli is of interest as it causes urinary tract and other infections. Wastewater is a good source because it is a melting pot for E. coli from diverse origins. The research addresses two key aspects: whether to include or exclude antibiotic resistance data, and which level of granularity to use when representing genomic variations. The former matters because far more genomic data is available than antibiotic resistance data. Consequently, relying solely on genomic data, this thesis studies positive selection in E. coli to identify mutations and genes favored by evolution. This study demonstrates the preferential selection of known antibiotic resistance genes and mutations, particularly mutations located at functionally important sites of outer membrane porins, which may hence have a direct effect on structure and function. Encouraged by these results, the study was expanded to include antibiotic resistance data and to examine genomic variations at three resolution levels: single mutations; unitigs (genome words), which may contain multiple mutations; and the whole coding genome, using machine learning classifier models that capture dependencies among multiple mutations and other genomic variations. The single-mutation representation detects well-known resistance mutations as well as potentially novel mechanisms related to biofilm formation and translation. Exploring larger genomic units such as genome words confirms the findings from single mutations and additionally uncovers joint mutations in both known and novel genes. Finally, machine learning models, including AI language models, were trained to predict antibiotic resistance from the whole coding genome, achieving an accuracy of over 90% when sufficient data were available. Overall, this thesis unveils new antibiotic resistance mechanisms, conducts one of the largest studies of positive selection in E. coli, and stands out as one of the pioneering studies that utilize AI language models for antibiotic resistance prediction.
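As a rough, purely illustrative sketch of classifier-based resistance prediction from genomic features (here a made-up unitig presence/absence matrix and a random forest, not the thesis's actual GWAS or language-model pipeline):

```python
# Hypothetical sketch: predicting antibiotic resistance from unitig
# presence/absence features. Data, labels, and model choice are illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_isolates, n_unitigs = 500, 2000
X = rng.integers(0, 2, size=(n_isolates, n_unitigs))  # 1 = unitig present in isolate
y = (X[:, 0] | X[:, 1]).astype(int)                   # toy "resistant" label driven by two unitigs

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))

# Unitigs with the highest importance point to candidate resistance loci.
top = np.argsort(clf.feature_importances_)[::-1][:5]
print("top candidate unitigs:", top)
```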
42

Merging Queries in OLTP Workloads

Rehrmann, Robin 30 May 2023 (has links)
OLTP applications are usually executed by a high number of clients in parallel and typically face high throughput demands as well as constrained latency requirements for individual statements. In enterprise scenarios, they often have to deal with overload spikes resulting from events such as Cyber Monday or Black Friday. The traditional solution to avoid running out of resources, and thus to cope with such spikes, is significant over-provisioning of the underlying infrastructure. In this thesis, we analyze real enterprise OLTP workloads with respect to statement types, complexity, and hot-spot statements. Interestingly, our findings reveal that workloads are often read-heavy and comprise similar query patterns, which offers the potential to share work among statements belonging to different transactions. In the past, resource sharing has been extensively studied for OLAP workloads. Naturally, the question arises why such studies mainly focus on OLAP rather than OLTP workloads. At first sight, OLTP queries often consist of simple calculations, such as index look-ups, with little sharing potential. Consequently, such queries – due to their short execution time – may not offer enough potential to justify the additional overhead. In addition, OLTP workloads do not only execute read operations but also updates. Therefore, sharing work needs to obey transactional semantics, such as the given isolation level and read-your-own-writes. This thesis presents THE LEVIATHAN, a novel batching scheme for OLTP workloads: an approach for merging read statements within interactively submitted multi-statement transactions consisting of reads and updates. Our main idea is to merge the execution of statements by merging their plans, which makes it possible to merge not only complex calculations but also simple ones, such as the aforementioned index look-up. We identify mergeable statements by pattern matching of prepared statement plans, which comes with low overhead. To obey the isolation level properties and provide read-your-own-writes, we first define a formal framework for merging transactions running under a given isolation level and provide insights into a prototypical implementation of merging within a commercial database system. Our experimental evaluation shows that, depending on the isolation level, the load in the system, and the read share of the workload, an improvement of the transaction throughput by up to a factor of 2.5x is possible without compromising the transactional semantics.
Another interesting effect we show is that with our strategy, we can increase the throughput of a real enterprise workload by 20%. Contents: 1 INTRODUCTION 1.1 Summary of Contributions 1.2 Outline 2 WORKLOAD ANALYSIS 2.1 Analyzing OLTP Benchmarks 2.1.1 YCSB 2.1.2 TATP 2.1.3 TPC Benchmark Scenarios 2.1.4 Summary 2.2 Analyzing OLTP Workloads from Open Source Projects 2.2.1 Characteristics of Workloads 2.2.2 Summary 2.3 Analyzing Enterprise OLTP Workloads 2.3.1 Overview of Reports about OLTP Workload Characteristics 2.3.2 Analysis of SAP Hybris Workload 2.3.3 Summary 2.4 Conclusion 3 RELATED WORK ON QUERY MERGING 3.1 Merging the Execution of Operators 3.2 Merging the Execution of Subplans 3.3 Merging the Results of Subplans 3.4 Merging the Execution of Full Plans 3.5 Miscellaneous Works on Merging 3.6 Discussion 4 MERGING STATEMENTS IN MULTI STATEMENT TRANSACTIONS 4.1 Overview of Our Approach 4.1.1 Examples 4.1.2 Why Naïve Merging Fails 4.2 THE LEVIATHAN Approach 4.3 Formalizing THE LEVIATHAN Approach 4.3.1 Transaction Theory 4.3.2 Merging Under MVCC 4.4 Merging Reads Under Different Isolation Levels 4.4.1 Read Uncommitted 4.4.2 Read Committed 4.4.3 Repeatable Read 4.4.4 Snapshot Isolation 4.4.5 Serializable 4.4.6 Discussion 4.5 Merging Writes Under Different Isolation Levels 4.5.1 Read Uncommitted 4.5.2 Read Committed 4.5.3 Snapshot Isolation 4.5.4 Serializable 4.5.5 Handling Dependencies 4.5.6 Discussion 5 SYSTEM MODEL 5.1 Definition of the Term “Overload” 5.2 Basic Queuing Model 5.2.1 Option (1): Replacement with a Merger Thread 5.2.2 Option (2): Adding Merger Thread 5.2.3 Using Multiple Merger Threads 5.2.4 Evaluation 5.3 Extended Queue Model 5.3.1 Option (1): Replacement with a Merger Thread 5.3.2 Option (2): Adding Merger Thread 5.3.3 Evaluation 6 IMPLEMENTATION 6.1 Background: SAP HANA 6.2 System Design 6.2.1 Read Committed 6.2.2 Snapshot Isolation 6.3 Merger Component 6.3.1 Overview 6.3.2 Dequeuing 6.3.3 Merging 6.3.4 Sending 6.3.5 Updating MTx State 6.4 Challenges in the Implementation of Merging Writes 6.4.1 SQL String Implementation 6.4.2 Update Count 6.4.3 Error Propagation 6.4.4 Abort and Rollback 7 EVALUATION 7.1 Benchmark Settings 7.2 System Settings 7.2.1 Experiment I: End-to-end Response Time Within a SAP Hybris System 7.2.2 Experiment II: Dequeuing Strategy 7.2.3 Experiment III: Merging Improvement on Different Statement, Transaction and Workload Types 7.2.4 Experiment IV: End-to-End Latency in YCSB 7.2.5 Experiment V: Breakdown of Execution in YCSB 7.2.6 Discussion of System Settings 7.3 Merging in Interactive Transactions 7.3.1 Experiment VI: Merging TATP in Read Uncommitted 7.3.2 Experiment VII: Merging TATP in Read Committed 7.3.3 Experiment VIII: Merging TATP in Snapshot Isolation 7.4 Merging Queries in Stored Procedures Experiment IX: Merging TATP Stored Procedures in Read Committed 7.5 Merging SAP Hybris 7.5.1 Experiment X: CPU-time Breakdown on HANA Components 7.5.2 Experiment XI: Merging Media Query in SAP Hybris 7.5.3 Discussion of our Results in Comparison with Related Work 8 CONCLUSION 8.1 Summary 8.2 Future Research Directions REFERENCES A UML CLASS DIAGRAMS
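An illustrative sketch of the statement-merging idea in its simplest form: batching identical point-lookup statements from different transactions into one query and demultiplexing the results per client. THE LEVIATHAN merges at the level of prepared statement plans inside the database engine, which this toy example does not attempt; table, data, and transaction names are hypothetical.

```python
# Toy illustration: merge identical point lookups from several transactions
# into a single IN-list query, then hand each transaction only its own row.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE subscriber(id INTEGER PRIMARY KEY, name TEXT)")
con.executemany("INSERT INTO subscriber VALUES (?, ?)",
                [(i, f"user{i}") for i in range(10)])

# Queued statements share the pattern "SELECT name FROM subscriber WHERE id = ?"
queued = {"tx1": 3, "tx2": 7, "tx3": 3}            # transaction -> parameter

ids = sorted(set(queued.values()))
placeholders = ",".join("?" * len(ids))
rows = con.execute(
    f"SELECT id, name FROM subscriber WHERE id IN ({placeholders})", ids).fetchall()
by_id = dict(rows)

# Each transaction receives only its own result, preserving statement semantics.
for tx, key in queued.items():
    print(tx, "->", by_id.get(key))
```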
43

Knowledge Integration of Generic and Case-Based Knowledge, Uniform Representation, Use of Relational Database Systems, and Problem Solving with Concept Based and Case Based Reasoning as well as Bayesian Networks in Medical Knowledge-Based Systems

Zimmer, Sandra 27 June 2023 (has links)
A knowledge-based system is designed to support medical professionals in the diagnostic process by providing relevant information. A reliable diagnosis and the associated medical measures are to be derived from complex symptom constellations. The basis for this is the knowledge adequately represented in the system, which is processed efficiently. This knowledge is very heterogeneous in the medical domain and often not well structured. In this work, a methodology is developed that enables the conceptual capture and structuring of the application domain via concepts, concept hierarchies, multiaxial composition of concepts, and concept declarations. Complex concepts can thus be mapped completely, unambiguously, and with practical relevance. Furthermore, building on this representation, dialogue systems, case-based and generic problem-solving methods, and their interaction with relational databases are presented in one system. This is particularly important in the medical domain, since both generic knowledge (textbook knowledge) and experiential knowledge (treated cases) are necessary for problem solving. Both bodies of knowledge can be stored uniformly in relational databases. In order to process the available knowledge efficiently, a method for semantic indexing is presented and its application in the field of knowledge representation is described. The starting point of semantic indexing is the knowledge represented by concept hierarchies. The goal is to assign keys to the nodes (concepts) that are hierarchically ordered and syntactically as well as semantically correct. With the indexing algorithm, the keys are calculated in such a way that the concepts are unifiable with the more specific concepts and only semantically correct concepts may be added to the knowledge base. The correctness and completeness of the indexing algorithm are proven. For knowledge processing, an integrative approach combining the problem-solving methods of Concept Based and Case Based Reasoning is presented. Concept Based Reasoning can be used for diagnosis, therapy, and medication recommendation and evaluation based on generic knowledge. Case Based Reasoning can be used to process the experiential knowledge of patient cases. Furthermore, two new similarity measures (compromise sets for similarity measures and multiaxial similarity) are developed for the retrieval of similar patient cases that adequately consider the semantic context. There are limits to exclusively deterministic concept-based reasoning in the medical domain. For diagnostic inference under uncertainty, vagueness, and incompleteness, Bayesian networks are investigated, based on the same adequate uniform representation of the necessary knowledge. The valid general concepts can thus be output according to their probability. To this end, various inference mechanisms are introduced and subsequently evaluated within the context of a developed prototype. Tests are employed to assess the classification of diagnoses by the network. Contents: 1 Introduction 2 Medical knowledge-based systems 3 Medical treatment process and extended knowledge-based agent 4 Methods of knowledge representation 5 Uniform representation with concept hierarchies, concepts, and generic and case-based reasoning 6 Semantic indexing 7 A medical system as an example application 8 Similarity measures, compromise sets, multiaxial similarity 9 Inference with Bayesian networks 10 Summary and outlook A Selected medical knowledge-based decision-support systems from the literature B Realization with software tools C Causal statistic modeling and calculation of distribution functions of classification features
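The key-assignment idea behind the semantic indexing can be sketched in a few lines, assuming a toy concept hierarchy; the actual algorithm, its handling of multiaxial composition, and the correctness proof in the thesis are considerably more involved, and the concept names below are illustrative.

```python
# Minimal sketch: assign hierarchically ordered keys to concepts so that a more
# specific concept's key extends the key of every more general concept.
hierarchy = {
    "finding": ["pain", "fever"],
    "pain": ["abdominal pain", "headache"],
    "abdominal pain": [],
    "headache": [],
    "fever": [],
}

def assign_keys(root, prefix=(1,), keys=None):
    """Depth-first key assignment: children get their parent's key plus a position."""
    if keys is None:
        keys = {}
    keys[root] = prefix
    for i, child in enumerate(hierarchy.get(root, []), start=1):
        assign_keys(child, prefix + (i,), keys)
    return keys

keys = assign_keys("finding")

def subsumes(general, specific):
    """A general concept subsumes a specific one iff its key is a prefix of the other's."""
    g, s = keys[general], keys[specific]
    return s[:len(g)] == g

print(keys["abdominal pain"])              # (1, 1, 1)
print(subsumes("pain", "abdominal pain"))  # True
print(subsumes("fever", "headache"))       # False
```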
44

Prediction of Updates on Social Media Platforms

Keller, Max-Emanuel 28 June 2023 (has links)
Social media platforms such as Facebook, Twitter, and YouTube have been very popular for years, not only with end users but also with companies. Companies use these platforms especially for marketing purposes, so traditional marketing instruments are increasingly receding into the background. Besides companies, political parties, universities, research institutions, and many other organizations also use social media for their purposes. The great interest of end users and institutions in social media makes it attractive for many applications in business and science. Market observation and research on social media require data, which is usually collected and analyzed with dedicated tools, subject to the limitations of the technical interfaces offered by the social media platforms. For selected research questions, aspects such as the volume and timeliness of the data are of particular importance. With the means available today, updates on social media platforms can only be retrieved via polling. Statistical models are frequently used to compute the polling intervals. The goal of this thesis is to determine suitable points in time for retrieving given feeds on social media platforms in order to fetch and process new posts promptly. Computing suitable update times serves to optimize resource usage and to reduce processing delay, and many applications can benefit from this. The thesis makes several contributions towards this goal. First, work on social media and adjacent data sources in the World Wide Web that aims at determining change rates or predicting updates was transferred to the problem at hand. Furthermore, the suitability of the prediction algorithms from existing approaches was determined by means of quantitative measurements. To this end, the approaches were applied to real data from Facebook, Twitter, and YouTube and evaluated with suitable metrics. The findings show that the quality of the predictions depends substantially on the choice of algorithm. A research gap was identified regarding the selection of suitable algorithms, since according to previous findings this selection is usually made only manually or by static rules. A new prediction approach forms the core of this thesis; it incorporates the individual update patterns of existing social media feeds in order to select suitable prediction algorithms, with appropriate parametrization, for new feeds. According to the evaluation results, a higher prediction quality is achieved compared to the state of the art while simultaneously reducing the selection effort.
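One simple baseline of the kind such prediction approaches build on can be sketched as follows: estimate the next polling time from the average gap between a feed's recent posts. The timestamps are hypothetical, and the thesis goes further by learning, per feed, which prediction algorithm and parametrization to use.

```python
# Moving-average polling baseline: predict the next fetch time of a feed from
# the observed gaps between its recent posts (timestamps are made up).
from datetime import datetime, timedelta

post_times = [
    datetime(2023, 5, 1, 8, 0),
    datetime(2023, 5, 1, 11, 30),
    datetime(2023, 5, 1, 15, 10),
    datetime(2023, 5, 2, 9, 45),
]

gaps = [(b - a).total_seconds() for a, b in zip(post_times, post_times[1:])]
avg_gap = sum(gaps) / len(gaps)

next_poll = post_times[-1] + timedelta(seconds=avg_gap)
print(f"average gap: {avg_gap / 3600:.1f} h, next poll at {next_poll}")
```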
45

In Situ Visualization of Performance Data in Parallel CFD Applications

Falcao do Couto Alves, Rigel 19 January 2023 (has links)
This thesis summarizes the work of the author on the visualization of performance data in parallel Computational Fluid Dynamics (CFD) simulations. Current performance analysis tools are unable to show their data on top of complex simulation geometries (e.g. an aircraft engine). But in CFD simulations, performance is expected to be affected by the computations being carried out, which in turn are tightly related to the underlying computational grid. It is therefore imperative that performance data be visualized on top of the same computational geometry from which it originates. However, performance tools have no native knowledge of the underlying mesh of the simulation. This scientific gap can be filled by merging the branches of HPC performance analysis and in situ visualization of CFD simulation data, which is done by integrating existing, well-established state-of-the-art tools from each field. To this end, an extension for the open-source performance tool Score-P was designed and developed, which intercepts an arbitrary number of manually selected code regions (mostly functions) and sends their respective measurements – number of executions and cumulative time spent – to the visualization software ParaView, through its in situ library Catalyst, as if they were any other flow-related variable. Subsequently, the tool was extended with the capacity to also show communication data (messages sent between MPI ranks) on top of the CFD mesh. Testing and evaluation are done with two industry-grade codes: Rolls-Royce's CFD code Hydra and the CFD code CODA from Onera, DLR, and Airbus. It has also been noticed that current performance tools have limited capacity for displaying their data on top of three-dimensional, framed (i.e. time-stepped) representations of the cluster's topology. In parallel, so that the approach is not limited to codes which already have the in situ adapter, it was extended to take the performance data and display it – also in codes without in situ support – on a three-dimensional, framed representation of the hardware resources used by the simulation. Testing is done with the Multi-Grid and Block Tri-diagonal NAS Parallel Benchmarks (NPB), as well as with Hydra and CODA again. The benchmarks are used to explain how the new visualizations work, while real performance analyses are done with the industry-grade CFD codes. The proposed solution is able to provide concrete performance insights which would not have been reached with current performance tools and which motivated beneficial changes in the respective source code in real life. Finally, its overhead is discussed and shown to be suitable for use with CFD codes. The dissertation provides a valuable addition to the state of the art of highly parallel CFD performance analysis and serves as a basis for further suggested research directions.
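The two per-region quantities the extension forwards, number of executions and cumulative time spent, can be illustrated with plain Python timers; the actual instrumentation via Score-P and the transfer to ParaView through Catalyst are not reproduced here, and the kernel name is a placeholder.

```python
# Sketch of per-region measurements: execution count and cumulative time.
import time
from collections import defaultdict
from contextlib import contextmanager

stats = defaultdict(lambda: {"count": 0, "time": 0.0})

@contextmanager
def region(name):
    start = time.perf_counter()
    try:
        yield
    finally:
        stats[name]["count"] += 1
        stats[name]["time"] += time.perf_counter() - start

def compute_flux():          # stand-in for a CFD kernel
    time.sleep(0.01)

for _ in range(5):
    with region("compute_flux"):
        compute_flux()

for name, s in stats.items():
    print(f"{name}: {s['count']} executions, {s['time']:.3f} s cumulative")
```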
46

Model-Driven Teaching for the Automatic Generation of Course Material

Geisel, Oliver 28 February 2024 (has links)
This thesis presents a process whose goal is to automatically generate course materials from knowledge and assemble them into a course. The process runs through four phases. In the first phase, the knowledge is collected and structured. In the second phase, the desired material for a course is generated from the available knowledge, and in the third phase it is assembled into a course. The last phase can take up new knowledge over time and update the course materials. This process is introduced as Model-Driven Teaching (MDTea for short). So far, this term has not been defined. This thesis explains the individual phases of MDTea, surveys related work on the generation of course materials and courses, and presents a first prototype which partially implements the second and third phases as a proof of concept. Contents: 1. Introduction 1.1. Motivation 1.2. Research Questions 2. Fundamentals 2.1. E-Learning and Learning Platforms 2.2. Model-Driven Development 2.3. Knowledge Generation 3. Model-Driven Teaching 3.1. Architecture 3.2. Defined Terms 3.3. Aggregation 3.3.1. Required Materials 3.3.2. Procedure 3.3.3. Produced Materials 3.4. Generation 3.4.1. Required Materials 3.4.2. Procedure 3.4.3. Material Groups 3.4.4. Produced Materials 3.5. Finalization 3.5.1. Required Materials 3.5.2. Procedure 3.5.3. Produced Materials 3.6. Synchronization 3.6.1. Required Materials 3.6.2. Procedure 3.6.3. Produced Materials 4. Related Work 4.1. Phases of MDTea 4.1.1. Work on Aggregation 4.1.2. Work on Generation 4.1.3. Work on Finalization 4.1.4. Work on Synchronization 4.2. Type of Realization 4.2.1. Model-Driven 4.2.2. Model-Based 4.2.3. Other Approaches 4.3. Further Concepts from the Literature 4.4. Summary 5. Design 5.1. Knowledge Model 5.1.1. Structure 5.1.2. Elements and Relations 5.1.3. Sources 5.2. The Course Plan 5.2.1. General Structure 5.2.2. Metadata 5.2.3. Content 5.2.4. Structure 5.3. Generator 5.4. Templates 5.5. Synchronization Information 6. Implementation 6.1. The Knowledge Model 6.1.1. Data Storage 6.1.2. Knowledge Model Generator 6.2. The Course Plan 6.2.1. Data Storage 6.2.2. The Curriculum Generator 6.3. The Tool - MDTea-Gen 6.3.1. Generation with MDTea-Gen 6.3.2. Finalization with MDTea-Gen 7. Evaluation 7.1. Test Environment 7.1.1. Populating the Knowledge Model 7.1.2. Creating the Test Course 7.2. Possibilities and Limitations of the Prototype 7.3. Possibilities and Limitations of the Generated Course 7.4. Evaluation of a Case Study 7.4.1. Experimental Setup 7.4.2. Evaluation of the Participants 7.5. Summary 8. Outlook and Summary 8.1. Answering the Research Questions 8.2. Outlook 8.3. Summary References A. Figures B. Lists B.1. Answers to Question 9 B.2. Answers to Question 10 C. Miscellaneous C.1. Further Comments from P2 C.2. Questionnaire
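A purely conceptual sketch of the four phases as a pipeline; the function names, data structures, and material types below are hypothetical placeholders, not the prototype's actual interfaces.

```python
# Conceptual sketch of the four MDTea phases chained together.
def aggregation(sources):
    """Phase 1: collect and structure knowledge."""
    return {"terms": sorted(set(sources))}

def generation(knowledge):
    """Phase 2: generate materials from the knowledge model."""
    return [f"Definition slide for '{t}'" for t in knowledge["terms"]]

def finalization(materials):
    """Phase 3: assemble materials into a course."""
    return {"course": materials}

def synchronization(new_sources):
    """Phase 4: pick up new knowledge over time and rebuild the materials."""
    return finalization(generation(aggregation(new_sources)))

course = finalization(generation(aggregation(["sorting", "recursion"])))
course = synchronization(["sorting", "recursion", "hashing"])
print(course)
```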
47

Automatic methods for distribution of data-parallel programs on multi-device heterogeneous platforms

Moreń, Konrad 07 February 2024 (has links)
This thesis deals with the problem of finding effective methods for programming and distributing data-parallel applications on heterogeneous multiprocessor systems. These systems are ubiquitous today, ranging from embedded devices with low power consumption to high-performance distributed systems, and demand for them is growing steadily due to the increasing number of data-intensive applications and the general growth of digital applications. Systems with multiple devices offer higher performance but unfortunately add complexity to the software development for such systems. Programming heterogeneous multiprocessor systems presents several unique challenges compared to single-device systems. The first challenge is the programmability of such systems. Despite constant innovations, programming languages and frameworks are still limited: they are either platform-specific, like CUDA, which supports only NVIDIA GPUs, or operate at a low level of abstraction, such as OpenCL. Application developers who design OpenCL programs must manually distribute data to the different devices and synchronize the distributed computations. These limitations affect developer productivity. To reduce the programming complexity and the development time, this thesis introduces two approaches that automatically distribute and synchronize data-parallel workloads. Another challenge is multi-device hardware utilization. In contrast to single-device platforms, the application optimization process for a multi-device system is even more complicated: application designers need to apply not only optimization strategies specific to a single-device architecture, but must also focus on carefully balancing the workload between all processors of the platform. For the balancing problem, this thesis proposes a method based on a platform model created with machine learning techniques. Using machine learning, this thesis automatically builds a reliable platform model that is portable and adaptable to different platform setups, with minimal manual involvement of the programmers.
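The workload-balancing idea can be sketched as a throughput-proportional split of a data-parallel index range across devices; in the thesis the per-device estimates come from the machine-learning-based platform model, whereas the device names and numbers below are made up.

```python
# Illustrative partitioning: split a data-parallel index range across devices
# in proportion to their predicted throughput.
def partition(n_items, predicted_throughput):
    total = sum(predicted_throughput.values())
    chunks, start = {}, 0
    devices = list(predicted_throughput)
    for i, dev in enumerate(devices):
        share = predicted_throughput[dev] / total
        # The last device takes the remainder so that every item is assigned.
        end = n_items if i == len(devices) - 1 else start + round(n_items * share)
        chunks[dev] = (start, end)
        start = end
    return chunks

throughput = {"cpu": 1.0, "gpu0": 6.5, "gpu1": 4.0}   # hypothetical items per ms
print(partition(1_000_000, throughput))
# {'cpu': (0, 86957), 'gpu0': (86957, 652174), 'gpu1': (652174, 1000000)}
```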
48

Design and implementation of a workflow for quality improvement of the metadata of scientific publications

Wolff, Stefan 07 November 2023 (has links)
In this paper, a detailed workflow for analyzing and improving the quality of metadata of scientific publications is presented and tested. The workflow was developed based on approaches from the literature. Frequently occurring types of errors from the literature were compiled and mapped to the data-quality dimensions most relevant for publication data – completeness, correctness, and consistency – and made measurable. Based on the identified data errors, a process for improving data quality was developed. This process includes parsing hidden data, correcting incorrectly formatted attribute values, enriching with external data, carrying out deduplication, and filtering erroneous records. The effectiveness of the workflow was confirmed in an exemplary application to publication data from Open Researcher and Contributor ID (ORCID), with 56% of the identified data errors corrected. The workflow will be applied to publication data from other source systems in the future to further increase its performance.
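As a small illustration of how one of the three quality dimensions can be made measurable, the following sketch computes the completeness of publication metadata records; field names and records are hypothetical examples rather than ORCID's actual schema.

```python
# Completeness of publication metadata: share of required fields that are filled.
REQUIRED = ["title", "year", "doi", "authors"]

records = [
    {"title": "A Study", "year": 2021, "doi": "10.1000/x1", "authors": ["A. Smith"]},
    {"title": "Another Study", "year": None, "doi": "", "authors": ["B. Jones"]},
]

def completeness(record):
    filled = sum(1 for f in REQUIRED if record.get(f) not in (None, "", []))
    return filled / len(REQUIRED)

for r in records:
    print(r["title"], f"{completeness(r):.0%}")   # 100% and 50%
```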
49

Analytical Exploration and Quantification of Nanowire-based Reconfigurable Digital Circuits

Raitza, Michael 22 December 2022 (has links)
Integrated circuit development is an industry-driven, high-risk, high-stakes environment. The time from the concept of a new transistor technology to a market-ready product is measured in decades rather than months or years, which increases the risk for any company embarking on the journey of driving a new concept. In addition to the return on investment lying far in the future, it can only be expected at all in high-volume production, which increases the upfront investment. What makes the undertaking worthwhile are the exceptional gains to be expected when the production reaches the market and enables better products. For these reasons, the adoption of new transistor technologies is usually based on small increments with a foreseeable impact on the production process. Emerging semiconductor device development must be able to prove its value to its customers, the chip-producing industry, the earlier the better. With this thesis, I provide a new approach for the early evaluation of emerging reconfigurable transistors in reconfigurable digital circuits. Reconfigurable transistors are a type of MOSFET that features a controllable conduction polarity, i.e., they can be configured by other input signals to work as PMOS or NMOS devices. Early device and circuit characterisation poses some challenges that are currently largely neglected by the development community. Firstly, to drive transistor development in the right direction, early feedback is necessary, which requires a method that can provide quantitative and qualitative results over a variety of circuit designs and must run mostly automatically. It should also require as little expert knowledge as possible to enable early experimentation on the device and new circuit designs together. Secondly, to be applicable early, its device model should need as little data as possible to provide meaningful results. The approach proposed in this thesis tackles both challenges and employs model checking, a formal method, to provide a framework for automated quantitative and qualitative analysis. It pairs a simple transistor device model with a charge transport model of the electrical network. In this thesis, I establish the notion of transistor-level reconfiguration and show the kinds of reconfigurable standard cell designs the device facilitates. Early investigation resulted in the discovery of certain modes of reconfiguration that the transistor features and their application to the design of reconfigurable standard cells. Experiments with device parameters and the design of improved combinational circuits that integrate new reconfigurable standard cells further highlight the need for a thorough investigation and quantification of the new devices and newly available standard cells. As their performance improvements are inconclusive when compared to established CMOS technology, a design space exploration of the possible reconfigurable standard cell variants and a context-aware quantitative analysis turn out to be required. I show that a charge transport model of the analogue transistor circuit provides the necessary abstraction, precision, and compatibility with an automated analysis. Formalised in a DSL, it enables designers to freely characterise and combine parametrised transistor models, device-independent circuit descriptions, and re-usable experiment setups that enable the analysis of large families of circuit variants.
The language is paired with a design space exploration algorithm that explores all implementation variants of a Boolean function employing various degrees and modes of reconfiguration. The precision of the device models and circuit performance calculations is validated against state-of-the-art FEM and SPICE simulations of production transistors. Lastly, I show that the exploration and analysis can be done efficiently, using two important Boolean functions as case studies. The analysis ranges from worst-case measures, like delay, power dissipation and energy consumption, to the detection and quantification of output hazards and the verification of the functionality of a circuit implementation. It concludes with average performance results that depend on the statistical characterisation of application scenarios. This makes the approach particularly interesting for measures like energy consumption, where average results are more relevant, and for asynchronous circuit designs, which depend strongly on average delay performance. I perform the quantitative analysis under various input and output load conditions in over 900 fully automated experiments. It shows that the complexity of the results warrants an extension to electronic design automation flows to fully exploit the capabilities of reconfigurable standard cells. The high degree of automation enables a researcher to start with as little as a Boolean function of interest, a transistor model, and a set of experiment conditions and queries in order to perform a wide range of quantitative analyses and acquire early results. Contents: 1 Introduction 1.1 Emerging Reconfigurable Transistor Technology 1.2 Testing and Standard Cell Characterisation 1.3 Research Questions 1.4 Design Space Exploration and Quantitative Analysis 1.5 Contribution 2 Fundamental Reconfigurable Circuits 2.1 Reconfiguration Redefined 2.1.1 Common Understanding of Reconfiguration 2.1.2 Reconfiguration is Computation 2.2 Reconfigurable Transistor 2.2.1 Device geometry 2.2.2 Electrical properties 2.3 Fundamental Circuits 3 Combinational Circuits and Higher-Order Functions 3.1 Programmable Logic Cells 3.1.1 Critical Path Delay Estimation using Logical Effort Method 3.1.2 Multi-Functional Circuits 3.2 Improved Conditional Carry Adder 4 Constructive DSE for Standard Cells Using MC 4.1 Principle Operation of Model Checking 4.1.1 Model Types 4.1.2 Query Types 4.2 Overview and Workflow 4.2.1 Experiment setup 4.2.2 Quantitative Analysis and Results 4.3 Transistor Circuit Model 4.3.1 Direct Logic Network Model 4.3.2 Charge Transport Network Model 4.3.3 Transistor Model 4.3.4 Queries for Quantitative Analysis 4.4 Circuit Variant Generation 4.4.1 Function Expansion 5 Quantitative Analysis of Standard Cells 5.1 Analysis of 3-Input Minority Logic Gate 5.1.1 Circuit Variants 5.1.2 Worst-Case Analysis 5.2 Analysis of 3-Input Exclusive OR Gate 5.2.1 Worst-Case Analysis 5.2.2 Functional Verification 5.2.3 Probabilistic Analysis 6 Conclusion and Future Work 6.1 Future Work A Notational conventions B prism-gen Programming Interfaces Bibliography Terms & Abbreviations
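A toy illustration of transistor-level reconfiguration, i.e., a single cell whose program input selects which Boolean function it computes, is sketched below; the NAND/NOR behaviour and the cell itself are hypothetical and abstract away the charge transport model and model-checking machinery used in the thesis.

```python
# Toy reconfigurable cell: the program input selects the computed function,
# in analogy to selecting the conduction polarity of reconfigurable transistors.
from itertools import product

def reconfigurable_cell(a, b, program):
    """program = 0 -> behaves like NAND, program = 1 -> behaves like NOR."""
    return int(not (a and b)) if program == 0 else int(not (a or b))

for program in (0, 1):
    table = {(a, b): reconfigurable_cell(a, b, program)
             for a, b in product((0, 1), repeat=2)}
    print(f"program={program}: {table}")

# A design space exploration would enumerate such configuration choices (and
# transistor-level variants) for every implementation of a target Boolean
# function and quantify each variant, as the thesis does via model checking.
```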
50

Quality-of-Service Aware Design and Management of Embedded Mixed-Criticality Systems

Ranjbar, Behnaz 12 April 2024 (has links)
Nowadays, implementing complex systems that execute various applications with different levels of assurance is a growing trend in modern embedded real-time systems, driven by cost, timing, and power consumption requirements. Medical devices, automotive, and avionics are the most common safety-critical domains exploiting such systems, known as Mixed-Criticality (MC) systems. MC applications are real-time, and to ensure their correctness it is essential to meet strict timing requirements as well as functional specifications. The correct design of such MC systems requires a thorough understanding of the system's functions and their importance to the system. A failure or deadline miss has a different impact depending on the criticality level of the function, from no effect to catastrophic consequences. Failure in the execution of tasks with higher criticality levels (HC tasks) may lead to system failure and cause irreparable damage, while Low-Criticality (LC) tasks assist the system in carrying out its mission successfully, but their failure has less impact on the system's functionality and does not cause the system itself to fail. In order to guarantee MC system safety, tasks are analyzed under different assumptions to obtain different Worst-Case Execution Times (WCETs) corresponding to the multiple criticality levels and the operation mode of the system. If the execution time of at least one HC task exceeds its low WCET, the system switches from low-criticality mode (LO mode) to high-criticality mode (HI mode). All HC tasks then continue executing under their high WCET to guarantee the system's safety. In this HI mode, all or some LC tasks are dropped or degraded in favor of HC tasks to ensure the correct execution of the HC tasks. Determining an appropriate low WCET for each HC task is crucial in designing efficient MC systems and ensuring QoS maximization. However, even when the low WCETs are set correctly, dropping or degrading LC tasks in the HI mode is undesirable because of its negative impact on other functions or on the entire system's ability to accomplish its mission correctly. Therefore, how to analyze task dropping in the HI mode is a significant challenge in designing efficient MC systems: the successful execution of all HC tasks must be guaranteed to prevent catastrophic damage while the QoS is improved. Due to the continuous rise in computational demand of MC tasks in safety-critical applications, such as controlling autonomous driving, designers are motivated to deploy MC applications on multi-core platforms. Although the parallel execution capability of multi-core platforms helps to improve QoS and meet real-time constraints, the high power consumption and temperature of the cores may make the system more susceptible to failures and instability, which is not desirable in MC applications. Therefore, improving QoS while managing power consumption and guaranteeing real-time constraints is the critical issue in designing such MC systems on multi-core platforms. This thesis addresses the challenges associated with efficient MC system design. We first focus on application analysis by determining appropriate WCETs, proposing a novel approach that provides a reasonable trade-off between the number of LC tasks schedulable at design time and the probability of mode switching at run time, in order to improve system utilization and QoS.
The approach presents an analytic scheme to obtain low WCETs at design time based on the Chebyshev theorem. We also show the relationship between the low WCETs and the mode switching probability, and formulate and solve the problem of improving resource utilization while reducing the mode switching probability. Further, we analyze LC task dropping in the HI mode to improve QoS. We first propose a heuristic in which a new metric is defined that determines the number of allowable drops in the HI mode; the task schedulability analysis is then developed based on this new metric. Since the occurrence of the worst-case scenario at run time is a rare event, a learning-based drop-aware task scheduling mechanism is then proposed, which carefully monitors changes in the behavior of the MC system at run time and exploits the dynamic slack to improve the QoS. Another critical design challenge is how to improve QoS using the parallelism of multi-core platforms while managing the power consumption and temperature of these platforms. We develop a tree of possible task mappings and schedules at design time to cover all possible scenarios of task overrun and to reduce the LC task drop rate in the HI mode, while managing power and temperature in each scheduling scenario. Since dynamic slack is generated by the early completion of tasks at run time, we propose an online approach to reduce the power consumption and maximum temperature by using low-power techniques like DVFS and task re-mapping while preserving the QoS. Specifically, our approach looks multiple tasks ahead to determine the most appropriate task for the slack assignment, i.e., the one with the most significant effect on power consumption and temperature. However, changing the frequency, selecting a proper task for slack assignment, and choosing a suitable core for task re-mapping at run time can be time-consuming and may cause deadline violations. Therefore, we analyze and optimize the run-time scheduler. Contents: 1. Introduction 1.1. Mixed-Criticality Application Design 1.2. Mixed-Criticality Hardware Design 1.3. Certain Challenges and Questions 1.4. Thesis Key Contributions 1.4.1. Application Analysis and Modeling 1.4.2. Multi-Core Mixed-Criticality System Design 1.5. Thesis Overview 2. Preliminaries and Literature Reviews 2.1. Preliminaries 2.1.1. Mixed-Criticality Systems 2.1.2. Fault-Tolerance, Fault Model and Safety Requirements 2.1.3. Hardware Architectural Modeling 2.1.4. Low-Power Techniques and Power Consumption Model 2.2. Related Works 2.2.1. Mixed-Criticality Task Scheduling Mechanisms 2.2.2. QoS Improvement Methods in Mixed-Criticality Systems 2.2.3. QoS-Aware Power and Thermal Management in Multi-Core Mixed-Criticality Systems 2.3. Conclusion 3. Bounding Time in Mixed-Criticality Systems 3.1. BOT-MICS: A Design-Time WCET Adjustment Approach 3.1.1. Motivational Example 3.1.2. BOT-MICS in Detail 3.1.3. Evaluation 3.2. A Run-Time WCET Adjustment Approach 3.2.1. Motivational Example 3.2.2. ADAPTIVE in Detail 3.2.3. Evaluation 3.3. Conclusion 4. Safety- and Task-Drop-Aware Mixed-Criticality Task Scheduling 4.1. Problem Objectives and Motivational Example 4.2. FANTOM in detail 4.2.1. Safety Quantification 4.2.2. MC Tasks Utilization Bounds Definition 4.2.3. Scheduling Analysis 4.2.4. System Upper Bound Utilization 4.2.5. A General Design Time Scheduling Algorithm 4.3. Evaluation 4.3.1. Evaluation with Real-Life Benchmarks 4.3.2. Evaluation with Synthetic Task Sets 4.4. Conclusion 5.
Learning-Based Drop-Aware Mixed-Criticality Task Scheduling 5.1. Motivational Example and Problem Statement 5.2. Proposed Method in Detail 5.2.1. An Overview of the Design-Time Approach 5.2.2. Run-Time Approach: Employment of SOLID 5.2.3. LIQUID Approach 5.3. Evaluation 5.3.1. Evaluation with Real-Life Benchmarks 5.3.2. Evaluation with Synthetic Task Sets 5.3.3. Investigating the Timing and Memory Overheads of ML Technique 5.4. Conclusion 6. Fault-Tolerance and Power-Aware Multi-Core Mixed-Criticality System Design 6.1. Problem Objectives and Motivational Example 6.2. Design Methodology 6.3. Tree Generation and Fault-Tolerant Scheduling and Mapping 6.3.1. Making Scheduling Tree 6.3.2. Mapping and Scheduling 6.3.3. Time Complexity Analysis 6.3.4. Memory Space Analysis 6.4. Evaluation 6.4.1. Experimental Setup 6.4.2. Analyzing the Tree Construction Time 6.4.3. Analyzing the Run-Time Timing Overhead 6.4.4. Peak Power Management and Thermal Distribution for Real-Life and Synthetic Applications 6.4.5. Analyzing the QoS of LC Tasks 6.4.6. Analyzing the Peak Power Consumption and Maximum Temperature 6.4.7. Effect of Varying Different Parameters on Acceptance Ratio 6.4.8. Investigating Different Approaches at Run-Time 6.5. Conclusion 7. QoS- and Power-Aware Run-Time Scheduler for Multi-Core Mixed-Criticality Systems 7.1. Research Questions, Objectives and Motivational Example 7.2. Design-Time Approach 7.3. Run-Time Mixed-Criticality Scheduler 7.3.1. Selecting the Appropriate Task to Assign Slack 7.3.2. Re-Mapping Technique 7.3.3. Run-Time Management Algorithm 7.3.4. DVFS governor in Clustered Multi-Core Platforms 7.4. Run-Time Scheduler Algorithm Optimization 7.5. Evaluation 7.5.1. Experimental Setup 7.5.2. Analyzing the Relevance Between a Core Temperature and Energy Consumption 7.5.3. The Effect of Varying Parameters of Cost Functions 7.5.4. The Optimum Number of Tasks to Look-Ahead and the Effect of Task Re-mapping 7.5.5. The Analysis of Scheduler Timings Overhead on Different Real Platforms 7.5.6. The Latency of Changing Frequency in Real Platform 7.5.7. The Effect of Latency on System Schedulability 7.5.8. The Analysis of the Proposed Method on Peak Power, Energy and Maximum Temperature Improvement 7.5.9. The Analysis of the Proposed Method on Peak power, Energy and Maximum Temperature Improvement in a Multi-Core Platform Based on the ODROID-XU3 Architecture 7.5.10. Evaluation of Running Real MC Task Graph Model (Unmanned Air Vehicle) on Real Platform 7.6. Conclusion 8. Conclusion and Future Work 8.1. Conclusions 8.2. Future Work
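As an illustration of how Chebyshev-style bounding can yield a low WCET with a bounded mode-switch probability (an assumption about the general idea, not necessarily the thesis's exact formulation), the one-sided Chebyshev (Cantelli) inequality P(C >= mu + k*sigma) <= 1/(1 + k^2) can be inverted to pick k for a target switch probability:

```python
# Sketch: choose the low WCET of an HC task as mu + k*sigma so that the
# Cantelli bound keeps the probability of overrunning it (and hence of a
# LO -> HI mode switch) below a target p. Execution times are hypothetical.
from math import sqrt
from statistics import mean, stdev

def low_wcet(samples, p_switch):
    mu, sigma = mean(samples), stdev(samples)
    k = sqrt(1.0 / p_switch - 1.0)       # solves 1/(1 + k^2) = p_switch
    return mu + k * sigma

exec_times_ms = [2.1, 2.4, 2.0, 2.6, 2.2, 2.3, 2.5, 2.1]   # measured LO-mode times
for p in (0.1, 0.01):
    print(f"target switch probability {p}: low WCET = {low_wcet(exec_times_ms, p):.2f} ms")
```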
