Global ETD Search

91	Roko: Balancing Performance and Usability in Coarse-grain Parallelization Segulja, Cedomir 06 April 2010 (has links) We present Roko, a system that allows parallelization of sequential C codes with a modest user intervention. The user exposes parallelism at the function level by annotating the code with pragmas. Roko defines only two pragmas: the parallel pragma is used to denote function calls that will be executed asynchronously, and the exposed pragma is used to describe data usage of the marked function calls. Architecturally, Roko consists of three components: a compiler that analyzes pragmas, a software environment that spreads the execution over multiple processors, and a hardware support that implements a novel synchronization scheme, versioning. We have designed, implemented and evaluated an FPGA-based prototype of Roko. Our experimental evaluation shows: (i) that few simple pragmas are all that is needed to expose parallelism in benchmark applications and (ii) that Roko can deliver good performance in terms of application speedup. Programming Model Parallelization Synchronization Concurrency Control Multi-core Systems FPGA Applications 0984
92	Relaxing Concurrency Control in Transactional Memory Aydonat, Utku 05 January 2012 (has links) Transactional memory (TM) systems have gained considerable popularity in the last decade driven by the increased demand for tools that ease parallel programming. TM eliminates the need for user-locks that protect accesses to shared data. It offers performance close to that of fine-grain locking with the programming simplicity of coarse-grain locking. Today’s TM systems implement the two-phase-locking (2PL) algorithm which aborts transactions every time a conflict occurs. 2PL is a simple algorithm that provides fast transactional operations. However, it limits concurrency in applications with high contention because it increases the rate of aborts. We propose the use of a more relaxed concurrency control algorithm to provide better concurrency. This algorithm is based on the conflict-serializability (CS) model. Unlike 2PL, it allows some transactions to commit successfully even when they make conflicting accesses. We implement this algorithm both in a software TM system as well as in a simulator of a hardware TM system. Our evaluation using TM benchmarks shows that the algorithm improves the performance of applications with long transactions and high abort rates. Performance is improved by up to 299% in the software TM, and up to 66% in the hardware simulator. We argue that these improvements come with little additional complexity and require no changes to the transactional programming model. This makes our implementation feasible parallel programming multi threaded programs transactional memory synchronization serializability concurrency control 0544 0984
93	Relaxing Concurrency Control in Transactional Memory Aydonat, Utku 05 January 2012 (has links) Transactional memory (TM) systems have gained considerable popularity in the last decade driven by the increased demand for tools that ease parallel programming. TM eliminates the need for user-locks that protect accesses to shared data. It offers performance close to that of fine-grain locking with the programming simplicity of coarse-grain locking. Today’s TM systems implement the two-phase-locking (2PL) algorithm which aborts transactions every time a conflict occurs. 2PL is a simple algorithm that provides fast transactional operations. However, it limits concurrency in applications with high contention because it increases the rate of aborts. We propose the use of a more relaxed concurrency control algorithm to provide better concurrency. This algorithm is based on the conflict-serializability (CS) model. Unlike 2PL, it allows some transactions to commit successfully even when they make conflicting accesses. We implement this algorithm both in a software TM system as well as in a simulator of a hardware TM system. Our evaluation using TM benchmarks shows that the algorithm improves the performance of applications with long transactions and high abort rates. Performance is improved by up to 299% in the software TM, and up to 66% in the hardware simulator. We argue that these improvements come with little additional complexity and require no changes to the transactional programming model. This makes our implementation feasible parallel programming multi threaded programs transactional memory synchronization serializability concurrency control 0544 0984
94	Efecto de la distribución de trabajo en aplicaciones paralelas irregulares sobre clusters heterogéneos Chichizola, Franco 20 August 2013 (has links) El objetivo de este Trabajo Final es comparar el efecto de la distribución de trabajo estática y dinámica sobre arquitecturas de cluster heterogéneo, analizando al mismo tiempo el speedup paralelo teórico y el obtenido experimentalmente para un determinado tipo de problema. En particular, se ha elegido una aplicación clásica (Parallel N-Queens) con un algoritmo de solución paralela en la que predomina el procesamiento sobre el tamaño de los datos, de modo de profundizar en los aspectos del balance de carga (estático o dinámico) sin una distorsión de los resultados producida por aspectos relacionados al uso de la memoria y/o al tamaño de los mensajes a comunicar. Para la experimentación se ha utilizado una combinación de 4 clusters interconectados, donde las máquinas dentro de cada grupo poseen procesadores homogéneos, pero diferentes entre clusters. De este modo el conjunto puede verse como un cluster heterogéneo de 43 procesadores. El problema se ha resuelto utilizando el paradigma master/worker donde el procesamiento se descompone en tareas irregulares que atentan contra el balance de carga entre los procesadores. Por esta razón se han analizado tres estrategias de distribución de trabajo calculando en cada caso el desbalance de carga y el rendimiento obtenido, comparando los resultados para determinar la que tiene mejor comportamiento, y finalmente estudiar la escalabilidad para esa solución. La solución paralela pura (sin tener en cuenta la distribución del trabajo) para el tipo de problemas donde Tp>>Tc, en particular el de N-Reinas requiere mínima comunicación entre máquinas, lo que hace esencial la elección de la distribución de datos entre los procesadores, para alcanzar un speedup cercano al óptimo (es decir un buen rendimiento). distribución de trabajo arquitecturas heterogéneas balance de carga Clustering Parallelism and concurrency Ciencias Informáticas
95	A Concurrent IFDS Dataflow Analysis Algorithm Using Actors Rodriguez, Jonathan David January 2010 (has links) There has recently been a resurgence in interest in techniques for effective programming of multi-core computers. Most programmers find general-purpose concurrent programming to be extremely difficult. This difficulty severely limits the number of applications that currently benefit from multi-core computers. There already exist many concurrent solutions for the class of regular applications, which include various algorithms for linear algebra. For the class of irregular applications, which operate on dynamic and pointer- and graph-based structures, efficient concurrent solutions have so far remained elusive. Dataflow analysis applications, which are often found in compilers and program analysis tools, have received particularly little attention with regard to execution on multi-core machines. Operating on the theory that the Actor model, which structures computations as systems of asynchronously-communicating entities, is a more appropriate method for representing irregular algorithms than the shared-memory model, this work presents a concurrent Actor-based formulation of the IFDS, or Interprocedural Finite Distributive Subset, dataflow analysis algorithm. The implementation of this algorithm is done using the Scala language and its Actors library. This algorithm achieves significant speedup on multi-core machines without using any optimistic execution. This work contributes to Actor research by showing how the Actor model can be practically applied to a dataflow analysis problem. This work contributes to static analysis research by showing how a dataflow analysis algorithm can effectively make use of multi-core machines, allowing the possibility of faster and more precise analyses. concurrency static analysis actors irregular parallelism actor model dataflow analysis IFDS Computer Science
96	Arquitectura asimétrica multicore con procesador de Petri Micolini, Orlando January 2015 (has links) Se ha determinado, en una arquitectura multi-Core SMP, el lugar donde incorporar el PP o el HPP sin alterar el ISA del resto de los core. Se ha obtenido una familia de procesadores que ejecutan los algoritmos de Petri para dar solución a sistemas reactivos y concurrentes, con una sólida verificación formal que permite la programación directa de los procesadores. Para esto, se ha construido el hardware de un PP y un HPP, con un IP-Core en una FPGA, integrado a un sistema multi-Core SMP, que ejecuta distintos tipo de RdP. Esta familia de procesadores es configurable en distintos aspectos: - Tamaño del procesador (cantidad de plazas y transiciones). - Procesadores con tiempo y procesadores temporales. - Arquitectura heterogénea, que permite distribuir los recursos empleados para instanciar el procesador según se requiera, y obtener un ahorro sustancial. - La posibilidad de configurar el procesador en pos de obtener los requerimientos y minimizar los recursos. Muy valorado en la construcción de sistemas embebidos. En los sistemas con alta necesidad de concurrencia y sincronización, donde se ha evaluado este procesador, las prestaciones han mostrado una importante mejora en el desempeño. El procesador tiene la capacidad de resolver simultáneamente, por conjuntos múltiples disparos, lo que disminuye los tiempos de consulta y decisión, además los programas ejecutados cumplen con los formalismos de las RdP extendidas y sincronizadas, y los resultados de su ejecución son determinísticos. Los tiempos de respuesta para determinar una sincronización son de dos ciclos por consulta (entre la solicitud de un disparo y la respuesta). sistemas concurrente procesadores multi-core heterógeneo sistemas de tiempo real Concurrency Processors Petri nets Ciencias Informáticas
97	Automatic Extraction of Program Models for Formal Software Verification de Carvalho Gomes, Pedro January 2015 (has links) In this thesis we present a study of the generation of abstract program models from programs in real-world programming languages that are employed in the formal verification of software. The thesis is divided into three parts, which cover distinct types of software systems, programming languages, verification scenarios, program models and properties.The first part presents an algorithm for the extraction of control flow graphs from sequential Java bytecode programs. The graphs are tailored for a compositional technique for the verification of temporal control flow safety properties. We prove that the extracted models soundly over-approximate the program behaviour w.r.t. sequences of method invocations and exceptions. Therefore, the properties that are established with the compositional technique over the control flow graphs also hold for the programs. We implement the algorithm as ConFlEx, and evaluate the tool on a number of test cases.The second part presents a technique to generate program models from incomplete software systems, i.e., programs where the implementation of at least one of the components is not available. We first define a framework to represent incomplete Java bytecode programs, and extend the algorithm presented in the first part to handle missing code. Then, we introduce refinement rules, i.e., conditions for instantiating the missing code, and prove that the rules preserve properties established over control flow graphs extracted from incomplete programs. We have extended ConFlEx to support the new definitions, and re-evaluate the tool, now over test cases of incomplete programs.The third part addresses the verification of multithreaded programs. We present a technique to prove the following property of synchronization with condition variables: "If every thread synchronizing under the same condition variables eventually enters its synchronization block, then every thread will eventually exit the synchronization". To support the verification, we first propose SyncTask, a simple intermediate language for specifying synchronized parallel computations. Then, we propose an annotation language for Java programs to assist the automatic extraction of SyncTask programs, and show that, for correctly annotated programs, the above-mentioned property holds if and only if the corresponding SyncTask program terminates. We reduce the termination problem into a reachability problem on Coloured Petri Nets. We define an algorithm to extract nets from SyncTask programs, and show that a program terminates if and only if its corresponding net always reaches a particular set of dead configurations. The extraction of SyncTask programs and their translation into Petri nets is implemented as the STaVe tool. We evaluate the technique by feeding annotated Java programs to STaVe, then verifying the extracted nets with a standard Coloured Petri Net analysis tool / Den här avhandlingen studerar automatisk konstruktion av abstrakta modeller för formell verifikation av program skrivna i verkliga programmeringsspråk. Avhandlingen består av tre delar som involverar olika typer av program, programmeringsspråk, verifikationsscenarier, programmodeller och egenskaper.Del ett presenterar en algoritm för generation av flödesgrafer från sekventiella program i Java bytekod. Graferna är skräddarsydda för en kompositionell teknik för verifikationen av temporala kontrollflödens säkerhetsegenskaper. Vi visar att de extraherade modellerna sunt överapproximerar programbeteenden med avseende på sekvenser av metodanrop och -undantag. Således gäller egenskaperna som kan fastställas genom kompositionstekniken över kontrollflöden även för programmen. Vi implementerar dessutom algoritmen i form av verktyget ConFlEx och utvärderar verktyget på ett antal testfall.Del två presenterar en teknik för att generera modeller av ofullständiga program. Det vill säga, program där implementationen av åtminstone en komponent inte är tillgänglig. Vi definierar ett ramverk för att representera ofullständiga Java bytekodsprogram och utökar algoritmen från del ett till att hantera ofullständig kod. Därefter presenterar vi raffineringsregler - villkor för att instansiera den saknade koden - och bevisar att reglerna bevarar relevanta egenskaper av kontrollflödesgrafer. Vi har dessutom utökat ConFlEx till att stödja de nya definitionerna och har omvärderat verktyget på testfall av ofullständiga program.Del tre angriper verifikation av multitrådade program. Vi presenterar en teknik för att bevisa följande egenskap för synkronisering med vilkorsvariabler: "Om varje trådsynkronisering under samma villkor så småningom stiger in i sitt synkroniseringsblock så kommer varje tråd också till slut lämna synkroniseringen". För att stödja verifikationen så introducerar vi först SyncTask - ett enkelt mellanliggande språk för att specificera synkronisering av parallella beräkningar. Därefter presenterar vi ett annoteringsspråk för Java som tillåter automatisk extrahering av SyncTask-program och visar att egenskapen gäller om och endast om motsvarande SyncTask-program terminerar. Vi reducerar termineringsproblemet till ett nåbarhetsproblem på färgade Petrinät samt definierar en algoritm som skapar Petrinät från SyncTask-program där programmet terminerar om och endast om nätet alltid når en särskild mängd av döda konfigurationer. Extraktionen av SyncTask-program och deras motsvarande Petrinät är implementerade i form av verktyget STaVe. Slutligen utvärderar vi verktyget genom att mata annoterade. / <p>QC 20151101</p> Software Verification Static Analysis Program Models Petri Nets Compositional Verification Concurrency
98	THE EVALUATION OF TINYOS WITH WIRELESS SENSOR NODE OPERATING SYSTEMS Famoriyo, Olusola January 2007 (has links) Wireless Sensor nodes fall somewhere in between the single application devices that do not need an operating system, and the more capable, general purpose devices with the resources to run a traditional embedded operating system. Sensor node operating system such as TinyOS, Contiki, MantisOS and SOS which is discussed in this paper exhibit characteristics of both traditional embedded systems and general-purpose operating systems providing a limited number of common services for application developers linking software and hardware. These common services typically include platform support, hardware management of sensors, radios, and I/O buses and application construction etc. They also provide services needed by applications which include task coordination, power management, adaptation to resource constraints, and networking. The evaluation was concentrated on TinyOS including an analysis on version 1.x and 2.x resource management and flexibility and its operation with the other wireless sensor node operating systems.
99	A Concurrent IFDS Dataflow Analysis Algorithm Using Actors Rodriguez, Jonathan David January 2010 (has links) There has recently been a resurgence in interest in techniques for effective programming of multi-core computers. Most programmers find general-purpose concurrent programming to be extremely difficult. This difficulty severely limits the number of applications that currently benefit from multi-core computers. There already exist many concurrent solutions for the class of regular applications, which include various algorithms for linear algebra. For the class of irregular applications, which operate on dynamic and pointer- and graph-based structures, efficient concurrent solutions have so far remained elusive. Dataflow analysis applications, which are often found in compilers and program analysis tools, have received particularly little attention with regard to execution on multi-core machines. Operating on the theory that the Actor model, which structures computations as systems of asynchronously-communicating entities, is a more appropriate method for representing irregular algorithms than the shared-memory model, this work presents a concurrent Actor-based formulation of the IFDS, or Interprocedural Finite Distributive Subset, dataflow analysis algorithm. The implementation of this algorithm is done using the Scala language and its Actors library. This algorithm achieves significant speedup on multi-core machines without using any optimistic execution. This work contributes to Actor research by showing how the Actor model can be practically applied to a dataflow analysis problem. This work contributes to static analysis research by showing how a dataflow analysis algorithm can effectively make use of multi-core machines, allowing the possibility of faster and more precise analyses. concurrency static analysis actors irregular parallelism actor model dataflow analysis IFDS Computer Science
100	Duomenų atnaujinimo lygiagretumo konfliktų sprendimas prekybos ir klientų aptarnavimo sistemose / Data update concurrency conflict sollutions in commerce and customer service systems Kėsas, Marius 26 May 2004 (has links) There are many benefits to upgrading your data access layer to ADO.NET, most of which involve using the intrinsic DataSet object. The DataSet object is basically a disconnected, in-memory replica of a database. DataSets provide many benefits, but also present a few challenges. Specifically, you can run into problems related to data concurrency exceptions. I've created a simple Windows® Forms customer service application that illustrates the potential pitfalls of this particular problem. I'll walk you through my research and show you ways to overcome the data concurrency issues that arose. DataSets provide a number of benefits. For example, you gain the ability to enforce rules of integrity in memory rather than at the database level. The most important benefit of using DataSets, however, is improved performance. Since the DataSet is disconnected from the underlying database, your code will make fewer calls to the database, significantly boosting performance. As with most performance optimizations, this one comes with a price. Since the DataSet object is disconnected from the underlying database, there is always a chance that the data is out of date. Since a DataSet doesn't hold live data, but rather a snapshot of live data at the time the DataSet was filled, problems related to data concurrency can occur. Informatics Engineering Atnaujinimo lygiagretumas DataSet concurrency Duomenų lygiagretumas DataSet update conflicts

Search results