Mahmoud Mohamedin, Mohamed Ahmed
21 March 2012
As chip vendors increasingly manufacture a new generation of multi-processor chips called multicores, improving software performance requires exposing greater concurrency in software. Since code that must run sequentially is often due to the need for synchronization, the synchronization abstraction has a significant effect on program performance. Lock-based synchronization, the most widely used synchronization method, suffers from programmability, scalability, and composability challenges. Transactional memory (TM) is an emerging synchronization abstraction that promises to alleviate the difficulties with lock-based synchronization. With TM, code that reads or writes shared memory objects is organized as transactions, which execute speculatively. When two transactions conflict (e.g., read/write, write/write), one of them is aborted, while the other commits, yielding (the illusion of) atomicity. Aborted transactions are restarted after rolling back the changes made to objects. In addition to a simple programming model, TM provides performance comparable to lock-based synchronization. Software transactional memory (STM) implements TM entirely in software, without any special hardware support, and is usually implemented as a library, or supported by a compiler or by a virtual machine. In this thesis, we present ByteSTM, a virtual machine-level Java STM implementation. ByteSTM implements two STM algorithms, TL2 and RingSTM, and transparently supports implicit transactions. Program bytecode is automatically modified to support transactions: memory load/store bytecode instructions automatically switch to transactional mode when a transaction starts, and switch back to normal mode when the transaction successfully commits. Being implemented at the VM level, it accesses memory directly and uses absolute memory addresses to handle memory uniformly.
Moreover, it avoids Java garbage collection (which has a negative impact on STM performance) by manually allocating and recycling memory for transactional metadata. ByteSTM uses field-based granularity, and uses the thread header to store transactional metadata, instead of the slower Java ThreadLocal abstraction. We conducted experimental studies comparing ByteSTM with other state-of-the-art Java STMs including Deuce, ObjectFabric, Multiverse, DSTM2, and JVSTM on a set of micro-benchmarks and macro-benchmarks. Our results reveal that ByteSTM's transactional throughput improvement over competitors ranges from 20% to 75% on micro-benchmarks and from 36% to 100% on macro-benchmarks. / Master of Science
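The speculate, validate, abort and retry cycle described in this abstract can be sketched in a few lines. This is a minimal illustration only, not ByteSTM, TL2, or RingSTM: it is written in Python rather than at the JVM level, it uses a single global commit lock, and the names TVar, Transaction, Conflict, and atomically are hypothetical. A transaction may briefly observe an inconsistent snapshot, which is only caught at commit time.

```python
import threading


class Conflict(Exception):
    """Raised when a transaction detects a conflicting concurrent update."""


class TVar:
    """A shared memory cell with a version number (hypothetical name)."""

    def __init__(self, value):
        self.value = value
        self.version = 0


# Assumption: one global lock serializes commits. Simple, but not scalable.
_commit_lock = threading.Lock()


class Transaction:
    def __init__(self):
        self.reads = {}   # TVar -> version observed at first read
        self.writes = {}  # TVar -> buffered new value

    def read(self, tvar):
        if tvar in self.writes:            # read-your-own-writes
            return self.writes[tvar]
        with _commit_lock:                 # consistent (value, version) pair
            value, version = tvar.value, tvar.version
        if self.reads.setdefault(tvar, version) != version:
            raise Conflict()               # the cell changed since first read
        return value

    def write(self, tvar, value):
        self.writes[tvar] = value          # nothing is visible until commit

    def commit(self):
        with _commit_lock:
            # Validate: every cell read must still have the version we saw.
            for tvar, seen in self.reads.items():
                if tvar.version != seen:
                    raise Conflict()
            # Publish the buffered writes and bump their versions.
            for tvar, value in self.writes.items():
                tvar.value = value
                tvar.version += 1


def atomically(body):
    """Run body(tx) speculatively, retrying from scratch after each abort."""
    while True:
        tx = Transaction()
        try:
            result = body(tx)
            tx.commit()
            return result
        except Conflict:
            continue  # buffered writes are simply discarded (the rollback)
```

With this sketch, an increment such as atomically(lambda tx: tx.write(counter, tx.read(counter) + 1)) either commits as a whole or is discarded and retried, which is the illusion of atomicity the abstract refers to.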
Factorisation des régions cubiques et application à la concurrence / Factorization of cubical area and application to concurrency
Ninin, Nicolas
11 December 2017
This thesis studies factorization problems for cubical areas. In the analysis of concurrent programs through methods from algebraic topology, cubical areas are a simple but expressive geometric model of concurrency. Any concurrent program (without loops or branching) is represented as a subset of R^n from which forbidden cubes are removed; these cubes contain the states forbidden by the concurrency constraints (a mutex, for example), and n is the number of processes. The first part of the thesis addresses the independence of processes. This question is crucial in program analysis, because separating a program into groups of independent processes greatly reduces the complexity of the analysis. In the geometric model, independence appears as a factorization up to permutation of the processes. The goal of this part is therefore to give an effective algorithm that factorizes cubical areas and to prove it correct. The algorithm is fairly simple and generalizes the following intuitive algorithm, called the syntactic algorithm: from the program text, group together the processes that share a resource, then take the transitive closure of this relation. The syntactic algorithm is not always optimal, since it can group together processes that could in fact be separated: because of inclusions between forbidden cubes, two processes can share a resource and still be independent. The new algorithm proceeds in the same way but drops some of these relations. The new relation is read off the maximal cubes of the forbidden area: two coordinates are related when both differ from R in the same maximal cube. Taking the transitive closure of this relation yields the optimal factorization. The second part of the thesis studies a categorical invariant defined on a cubical area. The invariant slices the cubical area into cubes called "dice", to which a category is associated, called the minced category of the cubical area. Each cube is an object of the category, and there is an arrow between two adjacent cubes. This category can be seen as a finite intermediate between the components category and the fundamental category. We show that when the cubical area factorizes, so does the associated minced category. The converse is harder: several counterexamples rule out a full converse. The third and last part of the thesis considers the tensor product structure that can be put on cubical areas. Observing how the Boolean operations on a cubical area arise from the same operations on lower-dimensional cubical areas, we try to see cubical areas as a tensor product of lower-dimensional ones. A tensor product depends heavily on the category in which it is taken. The product in the category of Boolean algebras is too restrictive and gives a trivial result, whereas the tensor product in the category of semilattices with zero gives the desired result.
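The factorization step described in this abstract (relate two coordinates when both differ from R in some maximal forbidden cube, then take the transitive closure) can be sketched with a union-find structure. This is a minimal sketch under stated assumptions, not the thesis's implementation: the input is assumed to already be the list of maximal cubes, each cube is encoded as a tuple whose entry is None when that coordinate spans all of R, and the function name factorize is hypothetical.

```python
def factorize(n, maximal_cubes):
    """Group the n processes (coordinates 0..n-1) into independent blocks.

    Each cube is a tuple of length n. Entry i is None when the cube spans
    all of R along coordinate i, otherwise an (lo, hi) interval.
    Assumption: the cubes passed in are the maximal forbidden cubes.
    """
    parent = list(range(n))

    def find(i):
        # Follow parent links to the representative, halving the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for cube in maximal_cubes:
        constrained = [i for i, side in enumerate(cube) if side is not None]
        # Linking consecutive constrained coordinates is enough: union-find
        # supplies the transitive closure.
        for a, b in zip(constrained, constrained[1:]):
            parent[find(a)] = find(b)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

For four processes where one maximal cube constrains coordinates 0 and 1 and another constrains 2 and 3, factorize(4, [((1, 2), (1, 2), None, None), (None, None, (3, 4), (0, 1))]) separates the program into the two groups [0, 1] and [2, 3]. A cube that constrains a single coordinate relates nothing, so it never merges groups.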
Rungta, Neha Shyam
14 September 2009
The quality and reliability of software systems, in terms of their functional correctness, critically rely on the effectiveness of the testing tools and techniques to detect errors in the system before deployment. A lack of testing tools for concurrent programs that systematically control thread scheduling choices has not allowed concurrent software development to keep abreast of hardware trends of multi-core and multi-processor technologies. This motivates a need for the development of systematic testing techniques that detect errors in concurrent programs. The work in this dissertation presents a potentially scalable technique that can be used to detect concurrency errors in production code. The technique is a viable solution for software engineers and testers to detect errors in multi-threaded programs before deployment. We present a guided testing technique that combines static analysis techniques, systematic verification techniques, and heuristics to efficiently detect errors in concurrent programs. An abstraction-refinement technique lies at the heart of the guided test technique. The abstraction-refinement technique uses as input potential errors in the program generated by imprecise, but scalable, static analysis tools. The abstraction further leverages static analyses to generate a set of program locations relevant in verifying the reachability of the potential error. Program execution is guided along these points by ranking both thread and data non-determinism. The set of relevant locations is refined when program execution is unable to make progress. The dissertation also discusses various heuristics for effectively guiding program execution. We implemented the guided test technique to detect errors in Java programs.
Guided test successfully detects errors caused by thread schedules and data input values in Java benchmarks and the JDK concurrent libraries for which other state-of-the-art analysis and testing tools for concurrent programs are unable to find an error.
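To make concrete what systematically controlling thread scheduling choices means, the sketch below enumerates every interleaving of two threads that each perform a non-atomic increment, and reports the schedules that lose an update. It is an exhaustive enumeration, not the guided, ranked search of the dissertation, and it is written in Python rather than Java; the thread names, the read/write step encoding, and the function names are all hypothetical.

```python
def interleavings(a, b):
    """Yield every merge of step lists a and b that keeps each one's order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Execute one schedule of read/write steps on a shared counter."""
    shared = 0
    local = {}  # per-thread register holding the value last read
    for thread, op in schedule:
        if op == "read":
            local[thread] = shared
        else:  # "write": store the incremented local copy back
            shared = local[thread] + 1
    return shared


def find_failing_schedules():
    """Return the schedules in which one of the two increments is lost."""
    t1 = [("T1", "read"), ("T1", "write")]
    t2 = [("T2", "read"), ("T2", "write")]
    return [s for s in interleavings(t1, t2) if run(s) != 2]
```

Only the two serial schedules leave the counter at 2; every schedule in which both reads happen before the first write loses an update. Exhaustive enumeration grows combinatorially with program size, which is the motivation for guiding the search toward potential error locations instead.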
Concurrency-induced transitions in epidemic dynamics on temporal networks / テンポラルネットワーク上の感染症ダイナミクスにおけるコンカレンシーがもたらす転移
Onaga, Tomokatsu
26 March 2018
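Kyoto University / 0048 / New-system doctorate by coursework / Doctor of Science / Kō No. 20893 / Rihaku No. 4345 / 新制||理||1624 (University Library) / Division of Physics and Astronomy, Graduate School of Science, Kyoto University / (Chief examiner) Associate Professor Shigeru Shinomoto, Professor Shin-ichi Sasa, Professor Norio Kawakami / Qualified under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Science / Kyoto University / DFAM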
Morse, Everett Allen
09 December 2010
Complex concurrent APIs are difficult to reason about manually due to the exponential growth in the number of feasible schedules. Testing against reference solutions of these APIs is equally difficult, as reference solutions implement an unknown set of allowed behaviors, and programmers have no way to directly control schedules or API internals to expose or reproduce errors. The work in this paper mechanically generates a drop-in replacement for a concurrent API from a formal specification. The specification is a guarded command system with first-order logic that is compiled into a core calculus. The term rewriting system is connected to actual C programs written against the API through lightweight wrappers in a role-based relationship with the rewriting system. The drop-in replacement supports putative what-if queries over API scenarios for behavior exploration, reproducibility for test and debug, full exhaustive search, and other advanced model checking analysis methods for C programs using the API. We provide a Racket instantiation of the rewriting system with a C/Racket implementation of the role-based architecture and validate the process with an API from the Multicore Association.
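A guarded command system and the exhaustive search it enables can be illustrated with a toy example. The sketch below is an assumption-laden illustration in Python, not the Racket rewriting system or the core calculus of this work: the bounded message queue, the State fields, the CAPACITY constant, and the send/recv commands are all hypothetical. Each command pairs a guard (when it may fire) with an effect (the successor state), and explore visits every state reachable under any schedule.

```python
from collections import deque, namedtuple

# Hypothetical API state: messages not yet sent, queued, and received.
State = namedtuple("State", "pending queued received")
CAPACITY = 2  # assumed queue bound

# A guarded command is (name, guard, effect).
COMMANDS = [
    ("send",
     lambda s: s.pending > 0 and s.queued < CAPACITY,
     lambda s: s._replace(pending=s.pending - 1, queued=s.queued + 1)),
    ("recv",
     lambda s: s.queued > 0,
     lambda s: s._replace(queued=s.queued - 1, received=s.received + 1)),
]


def explore(initial):
    """Breadth-first search over every schedule of enabled commands.

    Returns the set of reachable states and the list of terminal states
    (states in which no guard holds).
    """
    seen = {initial}
    frontier = deque([initial])
    terminal = []
    while frontier:
        state = frontier.popleft()
        enabled = [effect for _, guard, effect in COMMANDS if guard(state)]
        if not enabled:
            terminal.append(state)
        for effect in enabled:
            successor = effect(state)
            if successor not in seen:
                seen.add(successor)
                frontier.append(successor)
    return seen, terminal
```

Starting from State(pending=2, queued=0, received=0), the search reaches six states and a single terminal state in which both messages have been received, so no schedule of this toy API deadlocks with a message still in flight. Because the search is deterministic, any state it reports can be reproduced by replaying the same sequence of commands.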
Khandrika, Ananth Viswa Sai Kalyan
04 September 2018
No description available.
13 February 2013
The distributed transactional memory (DTM) abstraction aims to simplify the development of distributed concurrent programs. It frees programmers from the complicated and error-prone task of explicit concurrency control based on locks (e.g., deadlocks, livelocks, non-scalability, non-composability), which are aggravated in a distributed environment due to the complexity of multi-node concurrency. At its core, DTM's atomic section-based synchronization abstraction enables the execution of a sequence of multi-node object operations with the classical serializability property, which significantly increases the programmability of distributed systems. In this thesis, we present the first-ever DTM framework for distributed concurrency control in C++, called HyflowCPP. HyflowCPP provides distributed atomic sections, and pluggable support for concurrency control algorithms, directory lookup protocols, contention management policies, and network communication protocols. The framework uses the Transaction Forwarding Algorithm (or TFA) for concurrency control. While implementations of TFA and other DTM concurrency control algorithms exist in Scala and Java, along with concomitant DTM frameworks (e.g., HyflowJava, HyflowScala, D2STM, GenRSTM), HyflowCPP provides a unique TFA/DTM implementation for C++. Also, HyflowCPP supports strong atomicity, transactional nesting models including closed and open nesting (supported using modifications to TFA), and checkpointing. We evaluated HyflowCPP through an experimental study that measured transactional throughput for a set of micro- and macro-benchmarks, and compared it with competitor DTM frameworks. Our results revealed that HyflowCPP achieves up to 600% performance improvement over competitor Java DTM frameworks including D2STM, GenRSTM, HyflowScala and HyflowJava, which can be attributed to the competitors' JVM overhead and rudimentary networking support.
Additionally, our experimental studies revealed that checkpointing achieves up to 100% performance improvement over flat nesting and 50% over closed nesting. The open nesting model achieves up to 140% performance improvement over flat nesting and 90% over closed nesting. / Master of Science
Glantz, Isac, Hurtig, Hampus
As more and more companies use the internet to grow their businesses and sales, it is crucial to have a fast and responsive site that keeps customers on the site. Hence, comparing two web frameworks with respect to response time is vital, as it is a significant part of delivering the page. The comparison will help developers to choose between Express.js and Ktor. Our research shows how the two frameworks, Ktor and Express.js, compare in response times for static and dynamic pages for a set of concurrent users. The comparison explains how the frameworks' response times change with the number of concurrent users and with static versus dynamic content. An experiment with Locust was conducted to obtain the data needed to show the differences in response time for the two frameworks. Additionally, a literature study was conducted to find the best way to structure the servers, design the tests, and find information on how the frameworks should perform. We found that Express.js has an overall better response time than Ktor. At the same time, it was found that the Object Relational Mapper used with Ktor affected the results more than the Object Relational Mapper used with Express.js. Hence, we conclude that Express.js is the better choice, but since both frameworks had low response times, we would say that even Ktor is a valid choice.
01 January 2015
My research has been on the development of concurrent algorithms for shared memory systems that provide guarantees of progress. Research into such algorithms is important to developers implementing applications on mission critical and time sensitive systems. These guarantees of progress provide safety properties and freedom from many hazards, such as deadlock, livelock, and thread starvation. In addition to the safety concerns, the fine-grained synchronization used in implementing these algorithms promises to provide scalable performance in massively parallel systems. My research has resulted in the development of wait-free versions of the stack, hash map, ring buffer, and vector, and of a multi-word compare-and-swap algorithm. Through this experience, I have learned and developed new techniques and methodologies for implementing non-blocking and wait-free algorithms. I have worked with and refined existing techniques to improve their practicality and applicability. In the creation of the aforementioned algorithms, I have developed an association model for use with descriptor-based operations. This model, originally developed for the multi-word compare-and-swap algorithm, has been applied to the design of the vector and ring buffer algorithms. To unify these algorithms and techniques, I have released Tervel, a wait-free library of common algorithms and containers. This library includes a framework that simplifies and improves the design of non-blocking algorithms. I have reimplemented several algorithms using this framework and the resulting implementation exhibits less code duplication and fewer perceivable states. When reimplementing algorithms, I have adapted their Application Programming Interface (API) specification to remove ambiguity and non-deterministic behavior found when using a sequential API in a concurrent environment.
To improve the performance of my algorithm implementations, I extended the data collection and transport system of OVIS's Lightweight Distributed Metric Service (LDMS) to support performance monitoring using the perf_event and PAPI libraries. These libraries have provided me with deeper insights into the behavior of my algorithms, and I was able to use these insights to improve the design and performance of my algorithms.
17 June 2016
Standard operational semantics of the majority of concurrency models is defined in terms of either sequences or step sequences, while standard concurrent history semantics is usually defined in terms of partial orders, stratified order structures (or structures equivalent to them as net processes). It is commonly assumed (first argued by N. Wiener in 1914) that any system run (execution) that can be observed by a single observer must be an interval order of event occurrences. However, generating interval orders directly is problematic for most models of concurrency, as the only feasible sequence representation of an interval order relies on Fishburn's theorem (1970) and on appropriate sequences of the beginnings and endings of the events involved. It was shown by Janicki and Koutny in 1997 that concurrent histories involving interval orders can be represented by interval order structures, but how these interval order structures could be derived for particular concurrent systems was not clear. My original contribution to knowledge is defining an interval order semantics for Petri Nets with Inhibitor Arcs. We start by introducing operational interval order semantics, and then we generalize the concept of net process to represent the set of equivalent executions modelled by interval orders. Next we show that our interval processes correspond to appropriate interval order structures. Finally, we prove that our model is equivalent to that of Janicki and Yin (2015), where novel interval traces are used to represent equivalent executions. We also demonstrate that our model covers simpler cases where sequences or step sequences were used to represent system runs. / Thesis / Doctor of Philosophy (PhD)
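The link between sequences of event beginnings and endings and interval orders can be made concrete with a small sketch. It is an illustration under stated assumptions, not the interval order semantics for Petri nets defined in the thesis: the event encoding and both function names are hypothetical. The first function derives the precedence relation from a begin/end sequence; the second checks the characteristic condition of interval orders (for a strict partial order, a < b and c < d imply a < d or c < b).

```python
def interval_order(events):
    """Derive the precedence relation from a sequence of begin/end events.

    events is a list such as [("begin", "a"), ("end", "a"), ...] in which
    every event name has exactly one begin and one later end.
    Event a precedes b exactly when a's end occurs before b's begin.
    """
    begin, end = {}, {}
    for position, (kind, name) in enumerate(events):
        (begin if kind == "begin" else end)[name] = position
    names = sorted(begin)
    return {(a, b) for a in names for b in names
            if a != b and end[a] < begin[b]}


def is_interval_order(less):
    """Check the interval-order condition on a strict partial order.

    less is a set of pairs (x, y) meaning x < y. Assumption: less is
    already irreflexive and transitive; only the condition
    'a < b and c < d imply a < d or c < b' is tested here.
    """
    return all((a, d) in less or (c, b) in less
               for (a, b) in less for (c, d) in less)
```

For the sequence begin a, begin b, end a, begin c, end b, end c, the derived relation contains only a < c: b overlaps both a and c, so it is ordered with neither. Any relation derived this way passes is_interval_order, whereas two unrelated chains a < b and c < d fail it.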