Spelling suggestions: "subject:"message passing"" "subject:"essage passing""
121 |
Verification of asynchronous concurrency and the shaped stack constraintKochems, Jonathan Antonius January 2014 (has links)
In this dissertation, we study the verification of concurrent programs written in the programming language Erlang using infinite-state model-checking. Erlang is a widely used, higher order, dynamically typed, call-by-value functional language with algebraic data types and pattern-matching. It is further augmented with support for actor concurrency, i.e. asynchronous message passing and dynamic process creation. With decidable model-checking in mind, we identify actor communicating systems (ACS) as a suitable target model for an abstract interpretation of Erlang. ACS model a dynamic network of finite-state processes that communicate over a fixed, finite number of unordered, unbounded channels. Thanks to being equivalent to Petri nets, ACS enjoy good algorithmic properties. We develop a verification procedure that extracts a sound abstract model, in the form of an ACS, from a given Erlang program; the resulting ACS simulates the operational semantics of the input. Using this abstract model, we can conservatively verify coverability properties of the input program, i.e. a weak form of safety properties, with a Petri net model-checker. We have implemented this procedure in our tool Soter, which is the first sound verification tool for Erlang programs using infinite-state model-checking. In our experiments, we find that Soter is accurate enough to verify a range of interesting and non-trivial benchmarks. Even though ACS coverability is Expspace-complete, Soter's analysis of these verification problems is surprisingly quick. In order to improve the precision of our verification procedure with respect to recursion, we investigate an extension of ACS that allows pushdown processes: asynchronously communicating pushdown systems (ACPS). ACPS that satisfy the empty-stack constraint (a pushdown process may receive only when its stack is empty) are a popular subclass of ACPS with good decision and complexity properties. In the context of Erlang, the empty stack constraint is unfortunately not realistic. We introduce a relaxation of the empty-stack constraint for ACPS called the shaped stack constraint. Stacks that fit the shape constraint may reach arbitrary heights. Further, a process may execute any communication action (be it process creation, message send or retrieval) whether or not its stack is empty. We prove that coverability for shaped ACPS, i.e. ACPS that satisfy the shaped constraint, reduces to the decidable coverability problem for well-structured transition systems (WSTS). Thus, shaped ACPS enable the modelling and verification of a larger class of message passing programs. We establish a close connection between shaped ACPS and a novel extension of Petri nets: nets with nested coloured tokens (NNCT). Tokens in NNCT are of two types: simple and complex. Complex tokens carry an arbitrary number of coloured tokens. The rules of a NNCT can synchronise complex and simple tokens, inject coloured tokens into a complex token, and eject all tokens of a specified set of active colours to predefined places. We show that the coverability problem for NNCT is Tower-complete, a new complexity class for non-elementary decision problems introduced by Schmitz. To prove Tower-membership, we devise a geometrically inspired version of the Rackoff technique, and we obtain Tower-hardness by adapting Stockmeyer's ruler construction to NNCT. To our knowledge, NNCT is the first extension of Petri nets (belonging to the class of nets with an infinite set of token types) that is proven to have primitive recursive coverability. This result implies Tower-completeness of coverability for ACPS that satisfy the shaped stack constraint.
|
122 |
Investigation of the scalar variance and scalar dissipation rate in URANS and LESYe, Isaac Keeheon January 2011 (has links)
Large-eddy simulation (LES) and unsteady Reynolds-averaged Navier-Stokes (URANS) calculations have been performed to investigate the effects of different mathematical models for scalar variance and its dissipation rate as applied to both a non-reacting bluff-body turbulent flow and an extension to a reacting case. In the conserved scalar formalism, the mean value of a thermo-chemical variable is obtained through the PDF-weighted integration of the local description over the conserved scalar, the mixture fraction. The scalar variance, one of the key parameters for the determination of a presumed β-function PDF, is obtained by solving its own transport equation with the unclosed scalar dissipation rate modelled using either an algebraic expression or a transport equation. The proposed approach is first applied to URANS and then extended to LES. Velocity, length and time scales associated with the URANS modelling are determined using the standard two-equation k-ε transport model. In contrast, all three scales required by the LES modelling are based on the Smagorinsky subgrid scale (SGS) algebraic model. The present study proposes a new algebraic and a new transport LES model for the scalar dissipation rate required by the transport equation for scalar variance, with a time scale consistent with the Smagorinsky SGS model.
|
123 |
Area and energy efficient VLSI architectures for low-density parity-check decoders using an on-the-fly computationGunnam, Kiran Kumar 15 May 2009 (has links)
The VLSI implementation complexity of a low density parity check (LDPC)
decoder is largely influenced by the interconnect and the storage requirements. This
dissertation presents the decoder architectures for regular and irregular LDPC codes that
provide substantial gains over existing academic and commercial implementations. Several
structured properties of LDPC codes and decoding algorithms are observed and are used to
construct hardware implementation with reduced processing complexity. The proposed
architectures utilize an on-the-fly computation paradigm which permits scheduling of the
computations in a way that the memory requirements and re-computations are reduced.
Using this paradigm, the run-time configurable and multi-rate VLSI architectures for the
rate compatible array LDPC codes and irregular block LDPC codes are designed. Rate
compatible array codes are considered for DSL applications. Irregular block LDPC codes
are proposed for IEEE 802.16e, IEEE 802.11n, and IEEE 802.20. When compared with a
recent implementation of an 802.11n LDPC decoder, the proposed decoder reduces the
logic complexity by 6.45x and memory complexity by 2x for a given data throughput.
When compared to the latest reported multi-rate decoders, this decoder design has an area efficiency of around 5.5x and energy efficiency of 2.6x for a given data throughput. The
numbers are normalized for a 180nm CMOS process.
Properly designed array codes have low error floors and meet the requirements of
magnetic channel and other applications which need several Gbps of data throughput. A
high throughput and fixed code architecture for array LDPC codes has been designed. No
modification to the code is performed as this can result in high error floors. This parallel
decoder architecture has no routing congestion and is scalable for longer block lengths.
When compared to the latest fixed code parallel decoders in the literature, this design has
an area efficiency of around 36x and an energy efficiency of 3x for a given data throughput.
Again, the numbers are normalized for a 180nm CMOS process. In summary, the design
and analysis details of the proposed architectures are described in this dissertation. The
results from the extensive simulation and VHDL verification on FPGA and ASIC design
platforms are also presented.
|
124 |
One To Mant And Many To Many Collective Communication Operations On GridsGupta, Rakhi 12 1900 (has links)
Collective Communication Operations are widely used in MPI applications and play an important role in their performance. Hence, various projects have focused on optimization of collective communications for various kinds of parallel computing environments including LAN settings, heterogeneous networks and most recently Grid systems. The distinguishing factor of Grids from all the other environments is heterogeneity of hosts and network, and dynamically changing resource characteristics including load and availability.
The first part of the thesis develops a solution for MPI broadcast (one-to-many) on Grids. Some current strategies take into consideration static information about network topology for determining an efficient broadcast tree for Grids. Some other strategies take into account only transient network characteristics. We combined both these strategies and cluster the network dynamically on the basis of link bandwidths. Given a set of network parameters we use Simulated Annealing (SA) to obtain the best schedule. Also, we can time tune individual. SAs, to adapt the solution finding process, on the basis of estimated available times before next broadcast invocations in the application. We also developed software architecture for updation of schedules. We compared our algorithm with the earlier approaches under loaded network conditions, and obtained average performance improvement of 20%.
The second part of the thesis extends the work for MPI all gather (many-to-many) operation. Current popular techniques consider strict hierarchical schemes for this operation, wherein from each cluster a representative (or coordinator) node is chosen, and inter cluster communication is done through these representative nodes. This is non optimal as inter cluster communication is usually on high capacity links that can sustain more than one transfer with the same through- put. We developed a cluster based and incremental heuristic algorithm for allgather on Grids.
We compared the time taken by allgather schedules determined by this algorithm with current popular implementations. We also compared our algorithm with a strategy where allgather is constructed from a set of broadcast trees. We obtained average performance improvement of 67% over existing strategies.
|
125 |
Investigation of the scalar variance and scalar dissipation rate in URANS and LESYe, Isaac Keeheon January 2011 (has links)
Large-eddy simulation (LES) and unsteady Reynolds-averaged Navier-Stokes (URANS) calculations have been performed to investigate the effects of different mathematical models for scalar variance and its dissipation rate as applied to both a non-reacting bluff-body turbulent flow and an extension to a reacting case. In the conserved scalar formalism, the mean value of a thermo-chemical variable is obtained through the PDF-weighted integration of the local description over the conserved scalar, the mixture fraction. The scalar variance, one of the key parameters for the determination of a presumed β-function PDF, is obtained by solving its own transport equation with the unclosed scalar dissipation rate modelled using either an algebraic expression or a transport equation. The proposed approach is first applied to URANS and then extended to LES. Velocity, length and time scales associated with the URANS modelling are determined using the standard two-equation k-ε transport model. In contrast, all three scales required by the LES modelling are based on the Smagorinsky subgrid scale (SGS) algebraic model. The present study proposes a new algebraic and a new transport LES model for the scalar dissipation rate required by the transport equation for scalar variance, with a time scale consistent with the Smagorinsky SGS model.
|
126 |
Athapascan-0 : exploitation de la multiprogrammation légère sur grappes de multiprocesseursCarissimi, Alexandre da Silva January 1999 (has links)
L'accroissement d'efficacite des réseaux d'interconnexion et la vulgarisation des machines multiprocesseurs permettent la réalisation de machines parallèles a mémoire distribuée de faible coût: les grappes de multiprocesseurs. Elles nécessitent l'exploitation à la fois du parallélismeà grain fin, interne à un multiprocesseur offert par la multiprogrammation légère, et du parallélisme à gros grain entre les différents multiprocesseurs. L'exploitation simultanée de ces deux types de parallélisme exige une méthode de communication entre les processus légers qui ne partagent pas le mêmme espace d'adressage. Le travail de cette thèse porte sur le problème de l'Intégration de la multiprogrammation légère et des communications sur grappes de multiprocesseurs symétriques (SMP). II porte plus précisément sur evaluation et le reglage du noyau exécutif ATHAPASCAN-0 sur ce type d'architecture. ATHAPASCAN-0 est un noyau exécutif, portable, développé au sein du projet APACHE (CNRS-INPG-INRIA-UJF), qui combine la multiprogrammation légère et la communication par échange de messages. La portabilité est assurée par une organisation en couches basée sur les standards POSIX threads et MPI largement répandus. ATHAPASCAN-0 étend le modèle de réseau statique de processus «lourds» communicants tel que MPI, PVM, etc,à celui d'un réseau dynamique de processus légers communicants. La technique de base est la multiprogrammation lègere des communications et des calculs. La progression des communications exige la scrutation de état du reseau et l'enchainement des opérations de transferts. L'efficacité repose sur la minimisation de ces opérations. De plus, l'emploi de multiprocesseurs ajoute des problèmes spécifiques dus à l'apparition d'un parallélisme réel entre calcul et communication. Ces problèmes sont présentés et des solutions sont proposées pour l'environnement ATHAPASCAN-0. Ces solutions sont évaluées sur des grappes de multiprocesseurs. / The continuous price reduction for commodity PC multiprocessors and the availability of fast network interfaces have made cluster of multiprocessors an attractive low-price alternative to build parallel systems. Multiprocessor clusters offer two levels of parallelism: a fine grain parallelism inside a single multiprocessor and a coarse grain among them. A mechanism must be provided to exploit both levels of parallelism simultaneously. This requires to provide communications between threads belonging to different addresses spaces. This dissertation addresses the problem of integrating threads and communications on ATHAPASCAN-0 run time system. ATHAPASCAN-0 is a portable run time for cluster of multiprocessors developed as part of the APACHE project (CNRS-INPG-INRIA-UJF). Portability is achieved by a layered organization based on standards like POSIX threads and MPI. The ATHAPASCAN-0 run time system extends the heavy-weight process communication model of message passing libraries such as MPI, PVM, etc, into a lighter dynamic network of communicating threads. Multiprogramming is the key concept used. Communication progress is based on a network polling basis to handle incoming messages and to deliver outgoing communications requests. Performance is strongly dependent on the way these operations are implemented. Additionally, multiprocessors introduce some programming problems like overhead of cache coherency mechanisms, method of managing concurrent accesses and efficient mutex locking to avoid unnecessary context switching. These problems are analyzed and solutions are implemented in the ATHAPASCAN-0 run time system. An evaluation of these solutions is performed on a cluster of multiprocessors.
|
127 |
Athapascan-0 : exploitation de la multiprogrammation légère sur grappes de multiprocesseursCarissimi, Alexandre da Silva January 1999 (has links)
L'accroissement d'efficacite des réseaux d'interconnexion et la vulgarisation des machines multiprocesseurs permettent la réalisation de machines parallèles a mémoire distribuée de faible coût: les grappes de multiprocesseurs. Elles nécessitent l'exploitation à la fois du parallélismeà grain fin, interne à un multiprocesseur offert par la multiprogrammation légère, et du parallélisme à gros grain entre les différents multiprocesseurs. L'exploitation simultanée de ces deux types de parallélisme exige une méthode de communication entre les processus légers qui ne partagent pas le mêmme espace d'adressage. Le travail de cette thèse porte sur le problème de l'Intégration de la multiprogrammation légère et des communications sur grappes de multiprocesseurs symétriques (SMP). II porte plus précisément sur evaluation et le reglage du noyau exécutif ATHAPASCAN-0 sur ce type d'architecture. ATHAPASCAN-0 est un noyau exécutif, portable, développé au sein du projet APACHE (CNRS-INPG-INRIA-UJF), qui combine la multiprogrammation légère et la communication par échange de messages. La portabilité est assurée par une organisation en couches basée sur les standards POSIX threads et MPI largement répandus. ATHAPASCAN-0 étend le modèle de réseau statique de processus «lourds» communicants tel que MPI, PVM, etc,à celui d'un réseau dynamique de processus légers communicants. La technique de base est la multiprogrammation lègere des communications et des calculs. La progression des communications exige la scrutation de état du reseau et l'enchainement des opérations de transferts. L'efficacité repose sur la minimisation de ces opérations. De plus, l'emploi de multiprocesseurs ajoute des problèmes spécifiques dus à l'apparition d'un parallélisme réel entre calcul et communication. Ces problèmes sont présentés et des solutions sont proposées pour l'environnement ATHAPASCAN-0. Ces solutions sont évaluées sur des grappes de multiprocesseurs. / The continuous price reduction for commodity PC multiprocessors and the availability of fast network interfaces have made cluster of multiprocessors an attractive low-price alternative to build parallel systems. Multiprocessor clusters offer two levels of parallelism: a fine grain parallelism inside a single multiprocessor and a coarse grain among them. A mechanism must be provided to exploit both levels of parallelism simultaneously. This requires to provide communications between threads belonging to different addresses spaces. This dissertation addresses the problem of integrating threads and communications on ATHAPASCAN-0 run time system. ATHAPASCAN-0 is a portable run time for cluster of multiprocessors developed as part of the APACHE project (CNRS-INPG-INRIA-UJF). Portability is achieved by a layered organization based on standards like POSIX threads and MPI. The ATHAPASCAN-0 run time system extends the heavy-weight process communication model of message passing libraries such as MPI, PVM, etc, into a lighter dynamic network of communicating threads. Multiprogramming is the key concept used. Communication progress is based on a network polling basis to handle incoming messages and to deliver outgoing communications requests. Performance is strongly dependent on the way these operations are implemented. Additionally, multiprocessors introduce some programming problems like overhead of cache coherency mechanisms, method of managing concurrent accesses and efficient mutex locking to avoid unnecessary context switching. These problems are analyzed and solutions are implemented in the ATHAPASCAN-0 run time system. An evaluation of these solutions is performed on a cluster of multiprocessors.
|
128 |
A Modified Sum-Product Algorithm over Graphs with Short CyclesRaveendran, Nithin January 2015 (has links) (PDF)
We investigate into the limitations of the sum-product algorithm for binary low density parity check (LDPC) codes having isolated short cycles. Independence assumption among messages passed, assumed reasonable in all configurations of graphs, fails the most
in graphical structures with short cycles. This research work is a step forward towards
understanding the effect of short cycles on error floors of the sum-product algorithm.
We propose a modified sum-product algorithm by considering the statistical dependency
of the messages passed in a cycle of length 4. We also formulate a modified algorithm in
the log domain which eliminates the numerical instability and precision issues associated
with the probability domain. Simulation results show a signal to noise ratio (SNR) improvement for the modified sum-product algorithm compared to the original algorithm.
This suggests that dependency among messages improves the decisions and successfully
mitigates the effects of length-4 cycles in the Tanner graph. The improvement is significant at high SNR region, suggesting a possible cause to the error floor effects on such graphs. Using density evolution techniques, we analysed the modified decoding algorithm. The threshold computed for the modified algorithm is higher than the threshold computed for the sum-product algorithm, validating the observed simulation results. We also prove that the conditional entropy of a codeword given the estimate obtained using the modified algorithm is lower compared to using the original sum-product algorithm.
|
129 |
Athapascan-0 : exploitation de la multiprogrammation légère sur grappes de multiprocesseursCarissimi, Alexandre da Silva January 1999 (has links)
L'accroissement d'efficacite des réseaux d'interconnexion et la vulgarisation des machines multiprocesseurs permettent la réalisation de machines parallèles a mémoire distribuée de faible coût: les grappes de multiprocesseurs. Elles nécessitent l'exploitation à la fois du parallélismeà grain fin, interne à un multiprocesseur offert par la multiprogrammation légère, et du parallélisme à gros grain entre les différents multiprocesseurs. L'exploitation simultanée de ces deux types de parallélisme exige une méthode de communication entre les processus légers qui ne partagent pas le mêmme espace d'adressage. Le travail de cette thèse porte sur le problème de l'Intégration de la multiprogrammation légère et des communications sur grappes de multiprocesseurs symétriques (SMP). II porte plus précisément sur evaluation et le reglage du noyau exécutif ATHAPASCAN-0 sur ce type d'architecture. ATHAPASCAN-0 est un noyau exécutif, portable, développé au sein du projet APACHE (CNRS-INPG-INRIA-UJF), qui combine la multiprogrammation légère et la communication par échange de messages. La portabilité est assurée par une organisation en couches basée sur les standards POSIX threads et MPI largement répandus. ATHAPASCAN-0 étend le modèle de réseau statique de processus «lourds» communicants tel que MPI, PVM, etc,à celui d'un réseau dynamique de processus légers communicants. La technique de base est la multiprogrammation lègere des communications et des calculs. La progression des communications exige la scrutation de état du reseau et l'enchainement des opérations de transferts. L'efficacité repose sur la minimisation de ces opérations. De plus, l'emploi de multiprocesseurs ajoute des problèmes spécifiques dus à l'apparition d'un parallélisme réel entre calcul et communication. Ces problèmes sont présentés et des solutions sont proposées pour l'environnement ATHAPASCAN-0. Ces solutions sont évaluées sur des grappes de multiprocesseurs. / The continuous price reduction for commodity PC multiprocessors and the availability of fast network interfaces have made cluster of multiprocessors an attractive low-price alternative to build parallel systems. Multiprocessor clusters offer two levels of parallelism: a fine grain parallelism inside a single multiprocessor and a coarse grain among them. A mechanism must be provided to exploit both levels of parallelism simultaneously. This requires to provide communications between threads belonging to different addresses spaces. This dissertation addresses the problem of integrating threads and communications on ATHAPASCAN-0 run time system. ATHAPASCAN-0 is a portable run time for cluster of multiprocessors developed as part of the APACHE project (CNRS-INPG-INRIA-UJF). Portability is achieved by a layered organization based on standards like POSIX threads and MPI. The ATHAPASCAN-0 run time system extends the heavy-weight process communication model of message passing libraries such as MPI, PVM, etc, into a lighter dynamic network of communicating threads. Multiprogramming is the key concept used. Communication progress is based on a network polling basis to handle incoming messages and to deliver outgoing communications requests. Performance is strongly dependent on the way these operations are implemented. Additionally, multiprocessors introduce some programming problems like overhead of cache coherency mechanisms, method of managing concurrent accesses and efficient mutex locking to avoid unnecessary context switching. These problems are analyzed and solutions are implemented in the ATHAPASCAN-0 run time system. An evaluation of these solutions is performed on a cluster of multiprocessors.
|
130 |
Efficient Numerical Methods For Chemotaxis And Plasma Modulation Instability StudiesNguyen, Truong B. 08 August 2019 (has links)
No description available.
|
Page generated in 0.0743 seconds