About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Root Cause Localization for Unreproducible Builds

Liu, Changlin 07 September 2020 (has links)
No description available.
2

Scheduling workflows to optimize for execution time

Peters, Mathias January 2018 (has links)
Many functions in today's society are immensely dependent on data. Data drives everything from business decisions to self-driving cars to intelligent home assistants like Amazon Echo and Google Home. To make good decisions based on data, of which exabytes are generated every day, that data somehow has to be processed. Data processing can be complex and time-consuming. One way of reducing the complexity is to create workflows that consist of several steps which together produce the right result. Klarna is an example of a company that relies on workflows for transforming and analyzing data. As a company whose core business involves analyzing customer data, being able to do those analyses faster leads to direct business value in the form of more well-informed decisions. The workflows Klarna uses are currently all written in sequential form. However, workflows where independent tasks are executed in parallel perform better than workflows where only one task is executed at any point in time. Because of limits on human attention, parallelized workflows are harder for humans to write than sequential ones. In this work, a computer application was created that automates the parallelization of a workflow, letting humans write sequential workflows while still getting the performance of parallelized workflows. The application takes a simple sequential workflow, identifies the dependencies in it, and then schedules it in a way that is as parallel as possible given those dependencies. Such a solution has not been created before. Experimental evaluation shows that parallelizing a sequential workflow used in daily production at Klarna can reduce execution time by up to 80%, showing that the application can bring value to Klarna and other organizations that use workflows to analyze big data.
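The core idea, deriving a dependency graph from a sequential workflow and then running every step whose dependencies are satisfied concurrently, can be sketched with the Python standard library (hypothetical workflow steps; an illustration of the general technique, not the application built in this thesis):

    # Dependency-driven parallel execution of a declared workflow
    # (hypothetical steps; illustration of the general technique only).
    from concurrent.futures import ThreadPoolExecutor
    from graphlib import TopologicalSorter

    workflow = {                       # step -> steps it depends on
        "load": set(),
        "clean": {"load"},
        "enrich": {"load"},            # independent of "clean"
        "report": {"clean", "enrich"},
    }

    def run_step(name):
        print("running", name)         # stand-in for the real task body

    ts = TopologicalSorter(workflow)
    ts.prepare()
    with ThreadPoolExecutor() as pool:
        while ts.is_active():
            ready = list(ts.get_ready())               # deps all finished
            futures = [pool.submit(run_step, s) for s in ready]
            for step, fut in zip(ready, futures):
                fut.result()                           # wait for this wave
                ts.done(step)

Here "clean" and "enrich" run concurrently because neither depends on the other.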
3

Automatic Parallelization of Simulation Code from Equation Based Simulation Languages

Aronsson, Peter January 2002 (has links)
Modern state-of-the-art equation based object oriented modeling languages such as Modelica have enabled easy modeling of large and complex physical systems. When such complex models are to be simulated, simulation tools typically perform a number of optimizations on the underlying set of equations in the modeled system, with the goal of gaining better simulation performance by decreasing the equation system size and complexity. The tools then typically generate efficient code to obtain fast execution of the simulations. However, with increasing complexity of modeled systems the number of equations and variables is increasing. Therefore, parallel computing can be exploited to simulate these large, complex systems efficiently. This thesis presents the work of building an automatic parallelization tool that produces an efficient parallel version of the simulation code by building a data dependency graph (task graph) from the simulation code and applying efficient scheduling and clustering algorithms on the task graph. Various scheduling and clustering algorithms, adapted to the requirements of this type of simulation code, have been implemented and evaluated. The scheduling and clustering algorithms presented and evaluated can also be used for functional dataflow languages in general, since the algorithms work on a task graph with dataflow edges between nodes. Results are given in the form of speedup measurements and task graph statistics produced by the tool. The conclusion drawn is that some of the algorithms investigated and adapted in this work give reasonable measured speedups for some specific Modelica models, e.g. a model of a thermofluid pipe gave a speedup of about 2.5 on 8 processors in a PC cluster. However, future work lies in finding a good algorithm that works well in general. / Report code: LiU-Tek-Lic-2002:06.
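As background for readers unfamiliar with task-graph scheduling, a minimal list-scheduling sketch follows (hypothetical tasks and costs; far simpler than the scheduling and clustering algorithms evaluated in the thesis):

    # Minimal list scheduling over an acyclic task graph (hypothetical
    # tasks and costs; not an algorithm from the thesis).
    tasks = {"a": 2, "b": 3, "c": 1, "d": 4}    # task -> execution cost
    deps = {"c": ["a", "b"], "d": ["c"]}        # task -> predecessors

    def list_schedule(tasks, deps, n_procs):
        finish = {}                             # task -> finish time
        proc_free = [0.0] * n_procs             # next free time per processor
        pending = dict(tasks)
        while pending:                          # assumes the graph is a DAG
            ready = [t for t in pending
                     if all(p in finish for p in deps.get(t, []))]
            for t in sorted(ready):
                earliest = max((finish[p] for p in deps.get(t, [])), default=0.0)
                proc = min(range(n_procs),
                           key=lambda i: max(proc_free[i], earliest))
                start = max(proc_free[proc], earliest)
                finish[t] = start + pending.pop(t)
                proc_free[proc] = finish[t]
        return finish

    print(list_schedule(tasks, deps, 2))  # {'a': 2.0, 'b': 3.0, 'c': 4.0, 'd': 8.0}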
4

Synthèse automatique de circuits numériques à partir de spécifications temporelles / Automatic synthesis of digital circuits from temporal specifications

Javaheri, Fatemeh Negin 01 October 2015 (has links)
The work presented in this thesis aims at automatically prototyping communication and control designs from declarative temporal specifications. From a set of properties written in the PSL language, we automatically produce a synthesizable RTL design. The proposed method is modular, in contrast to previously published methods based on automata theory. From each property, we produce a component that observes some operands and generates waveforms for the other operands: the reactant. First, a library of primitive reactants was developed for the FL and SERE operators. To this end, a dependency relation was defined for each operator: based on the operator's semantics, it expresses the dependency among its operands. The dependency relation of each operator is then interpreted as a hardware component that implements the operator: the operator's primitive reactant. Using this formalization, we propose a method to automatically decide which signals of a property are observed and which are generated. When the direction of a signal cannot be determined, a solver is added to identify the signal's value. The solver also determines the value of a signal that is generated by several properties. The final circuit is the interconnection of the reactants and solvers for the whole set of properties. A prototype tool, SyntHorus2, an extension of HORUS, has been developed. It takes PSL properties as input and generates synthesizable VHDL code for the circuit. In addition, it generates complementary properties to check whether the set of specifications is coherent and complete. The method is efficient and synthesizes control circuits in a few seconds. Results obtained on classical benchmarks show that our technique compiles properties more efficiently than previous prototype tools.
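The reactant idea can be illustrated on a single hypothetical property, always (req -> next ack): a component that observes req and must generate ack one cycle later. The cycle-level behaviour is sketched below in Python for readability; the actual tool emits synthesizable VHDL:

    # Cycle-level sketch of a reactant for the hypothetical property
    # always (req -> next ack); illustration only, not SyntHorus2 output.
    def reactant(req_stream):
        pending = False                  # an ack owed from the previous cycle
        for req in req_stream:
            yield 1 if pending else 0    # generate ack when one is owed
            pending = bool(req)          # req now obliges ack next cycle

    print(list(reactant([1, 0, 0, 1, 0])))  # -> [0, 1, 0, 0, 1]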
5

Functional distributional semantics : learning linguistically informed representations from a precisely annotated corpus

Emerson, Guy Edward Toh January 2018 (has links)
The aim of distributional semantics is to design computational techniques that can automatically learn the meanings of words from a body of text. The twin challenges are: how do we represent meaning, and how do we learn these representations? The current state of the art is to represent meanings as vectors - but vectors do not correspond to any traditional notion of meaning. In particular, there is no way to talk about 'truth', a crucial concept in logic and formal semantics. In this thesis, I develop a framework for distributional semantics which answers this challenge. The meaning of a word is not represented as a vector, but as a 'function', mapping entities (objects in the world) to probabilities of truth (the probability that the word is true of the entity). Such a function can be interpreted both in the machine learning sense of a classifier, and in the formal semantic sense of a truth-conditional function. This simultaneously allows both the use of machine learning techniques to exploit large datasets, and also the use of formal semantic techniques to manipulate the learnt representations. I define a probabilistic graphical model, which incorporates a probabilistic generalisation of model theory (allowing a strong connection with formal semantics), and which generates semantic dependency graphs (allowing it to be trained on a corpus). This graphical model provides a natural way to model logical inference, semantic composition, and context-dependent meanings, where Bayesian inference plays a crucial role. I demonstrate the feasibility of this approach by training a model on WikiWoods, a parsed version of the English Wikipedia, and evaluating it on three tasks. The results indicate that the model can learn information not captured by vector space models.
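The notion of a word's meaning as a truth-conditional function can be sketched as a per-word probabilistic classifier (a deliberate simplification: the thesis embeds such functions in a probabilistic graphical model rather than training isolated classifiers on hand-built features):

    # Sketch: a word's meaning as a function from entities to P(true).
    # Hypothetical features and weights; not the model from the thesis.
    import math

    def word_meaning(weights, bias):
        """Map an entity's feature vector to the probability the word is true of it."""
        def prob_true(entity):
            z = sum(w * x for w, x in zip(weights, entity)) + bias
            return 1.0 / (1.0 + math.exp(-z))    # logistic classifier
        return prob_true

    # Hypothetical 3-dimensional entities: [has_fur, barks, meows]
    dog = word_meaning([2.0, 3.0, -3.0], -2.5)
    print(dog([1.0, 1.0, 0.0]))   # ~0.92: probably true of a furry, barking entity
    print(dog([0.0, 0.0, 1.0]))   # ~0.004: probably false of a meowing entity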
6

Measurement and comparison of clustering algorithms

Javar, Shima January 2007 (has links)
In this project, a number of different clustering algorithms are described and their workings explained. They are compared to each other by implementing them on a number of graphs with a known architecture. These clustering algorithms, in the order they are implemented, are as follows: Nearest neighbour hillclimbing, Nearest neighbour big step hillclimbing, Best neighbour hillclimbing, Best neighbour big step hillclimbing, Gem 3D, K-means simple, K-means Gem 3D, One cluster and One cluster per node. The graphs are Unconnected, Directed KX, Directed Cycle KX and Directed Cycle. The results of these clusterings are compared with each other according to three criteria: time, quality, and extremity of node distribution. This enables us to find out which algorithm is most suitable for which graph. These artificial graphs are then compared with the reference architecture graph to reach the conclusions.
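Of the algorithms listed, K-means simple is the most standard; a textbook sketch follows (illustration only, not the project's graph-clustering implementation):

    # Textbook k-means sketch (hypothetical 2-D points; the project
    # applies clustering to graph nodes, not raw coordinates).
    import random

    def kmeans(points, k, iters=20, seed=0):
        rng = random.Random(seed)
        centroids = rng.sample(points, k)          # initial centroids
        for _ in range(iters):
            clusters = [[] for _ in range(k)]
            for p in points:                       # assign to nearest centroid
                nearest = min(range(k), key=lambda c: sum(
                    (a - b) ** 2 for a, b in zip(p, centroids[c])))
                clusters[nearest].append(p)
            centroids = [tuple(sum(xs) / len(xs) for xs in zip(*cl))
                         if cl else centroids[i]   # keep an empty cluster's centroid
                         for i, cl in enumerate(clusters)]
        return clusters

    pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(kmeans(pts, 2))   # two well-separated clusters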
7

Simplifying Software Testing in Microservice Architectures through Service Dependency Graphs / Förenkla mjukvarutestningen i mikrotjänstarkitekturer genom tjänsteberoendegrafer

Alevärn, Marcus January 2023 (has links)
A popular architecture for developing large-scale systems is the microservice architecture, which is currently in use by companies such as Amazon, LinkedIn, and Uber. There are many benefits to the microservice architecture with respect to maintainability, resilience, and scalability. Despite these benefits, however, the microservice architecture presents its own unique set of challenges, particularly related to software testing. Software testing is exacerbated in the microservice architecture by its complexity and distributed nature. To mitigate this problem, this project investigated the use of a graph-based visualization system to simplify the software testing process for microservice systems. More specifically, the role of the visualization system was to provide an analysis platform for identifying the root cause of failing test cases. The developed visualization system was evaluated in a usability test with 22 participants, each of whom was asked to use the system to solve five tasks. Participants solved on average 70.9% of the five tasks correctly, with an average effort rating of 3.5 on a scale from one to ten. The perceived average satisfaction with the visualization system was 8.0, also on a scale from one to ten. The project concludes that graph-based visualization systems can simplify the process of identifying the root cause of failing test cases for at least five different error types. The visualization system is an effective analysis tool that enables users to follow communication flows and pinpoint problematic areas. However, the results also show that the visualization system cannot automatically identify the root cause of failing test cases; manual analysis and an adequate understanding of the microservice system are still necessary.
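The kind of analysis the system supports, following a failing request along the service dependency graph until reaching a service whose own downstream calls all succeeded, can be sketched as follows (hypothetical trace data; the thesis provides an interactive visualization and stresses that the final judgement remains manual):

    # Sketch: walk a failing request through a service dependency graph
    # (hypothetical trace data; not the tool built in the thesis).
    calls = {                        # service -> downstream services it called
        "gateway": ["orders"],
        "orders": ["payments", "inventory"],
        "payments": [],
        "inventory": [],
    }
    status = {"gateway": 500, "orders": 500, "payments": 200, "inventory": 503}

    def root_cause(service):
        # A failing service whose callees all succeeded is a candidate root cause.
        failing = [s for s in calls[service] if status[s] >= 500]
        if not failing:
            return service
        return root_cause(failing[0])    # simplification: follow one failing edge

    print(root_cause("gateway"))   # -> inventory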
8

Extracting and Aggregating Temporal Events from Texts

Döhling, Lars 11 October 2017 (has links)
Finding reliable information about given events in large and dynamic text collections, such as the web, is a topic of great interest. For instance, rescue teams and insurance companies are interested in concise facts about damages after disasters, which can be found today in web blogs, online newspaper articles, social media, etc. Knowing these facts helps to determine the required scale of relief operations and supports their coordination. However, finding, extracting, and condensing specific facts is a highly complex undertaking: it requires identifying appropriate textual sources and their temporal alignment, recognizing relevant facts within these texts, and aggregating extracted facts into a condensed answer despite inconsistencies, uncertainty, and changes over time. In this thesis, we present and evaluate techniques and solutions for each of these problems, embedded in a four-step framework. The applied methods are pattern matching, natural language processing, and machine learning. We also report results for two case studies applying our entire framework: gathering data on earthquakes and floods from web documents. Our results show that it is, under certain circumstances, possible to automatically obtain reliable and timely data from the web.
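The pattern-matching step of such a framework can be sketched with regular expressions (hypothetical patterns and example text; the thesis combines pattern matching with natural language processing and machine learning):

    # Regex sketch of fact extraction from a disaster report
    # (hypothetical patterns and example sentence).
    import re

    text = ("A magnitude 6.1 earthquake struck on 24 August 2016, "
            "leaving 4,500 homeless.")

    magnitude = re.search(r"magnitude\s+(\d+(?:\.\d+)?)", text)
    date = re.search(r"\b(\d{1,2}\s+\w+\s+\d{4})\b", text)
    homeless = re.search(r"([\d,]+)\s+homeless", text)

    print(magnitude.group(1), date.group(1), homeless.group(1))
    # -> 6.1 24 August 2016 4,500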
