Global ETD Search

1	Synthetic neural networks: a situated systems approach Scutt, Tom William January 1996 No description available. 005 Parallel systems
2	An Analysis for Evaluating the Cost/Profit Effectiveness of Parallel Systems Teran, Maria 13 December 2002 A new domain of commercial applications demands the development of inexpensive parallel computing platforms to lower the cost of operations and increase business profit. Calculating the return on an IT investment is now important for justifying a decision to upgrade or replace a parallel system. This thesis presents a framework of the performance and economic factors that are considered when evaluating a parallel system. We introduce a metric called the cost/profit effective metric, which measures the effectiveness of a parallel system in terms of performance, cost and profit. This metric describes the profit obtained from performance in three different scaling domains: speed-up, throughput and/or scale-up. Cost is measured by the actual costs of a parallel system. We present two case studies to demonstrate the application of this metric and analyze the results to support the evaluation of the parallel system in each case. Cost Profit Effectiveness Performance Parallel Systems
3	Static Execution Time Analysis of Parallel Systems Gustavsson, Andreas January 2016 The past trend of increasing processor throughput by increasing the clock frequency and the instruction level parallelism is no longer feasible due to extensive power consumption and heat dissipation. Therefore, the current trend in computer hardware design is to expose explicit parallelism to the software level. This is most often done using multiple, relatively slow and simple, processing cores situated on a single processor chip. The cores usually share some resources on the chip, such as some level of cache memory (which means that they also share the interconnect, e.g., a bus, to that memory and also all higher levels of memory). To fully exploit this type of parallel processor chip, programs running on it will have to be concurrent. Since multi-core processors are the new standard, even embedded real-time systems will (and some already do) incorporate this kind of processor and concurrent code. A real-time system is any system whose correctness is dependent both on its functional and temporal behavior. For some real-time systems, a failure to meet the temporal requirements can have catastrophic consequences. Therefore, it is crucial that methods to derive safe estimations on the timing properties of parallel computer systems are developed, if at all possible. This thesis presents a method to derive safe (lower and upper) bounds on the execution time of a given parallel system, thus showing that such methods exist. The interface to the method is a small concurrent programming language, based on communicating and synchronizing threads, that is formally (syntactically and semantically) defined in the thesis. 
The method is based on abstract execution, which is itself based on abstract interpretation techniques that have been commonly used within the field of timing analysis of single-core computer systems, to derive safe timing bounds in an efficient (although over-approximative) way. The thesis also proves the soundness of the presented method (i.e., that the estimated timing bounds are indeed safe) and evaluates a prototype implementation of it. / Worst-Case Execution Time Analysis of Parallel Systems / RALF3 - Software for Embedded High Performance Architectures WCET analysis parallel systems multi-core multicore threaded programming language
4	Real Time Data Reduction and Analysis Using Artificial Neural Networks Dionisi, Steven M. October 1993 International Telemetering Conference Proceedings / October 25-28, 1993 / Riviera Hotel and Convention Center, Las Vegas, Nevada / An artificial neural network (ANN) for use in real time data reduction and analysis will be presented. The use and advantages of hardware and software implementations of neural networks will be considered. The ability of neural networks to learn and store associations between different sets of data can be used to create custom algorithms for some of the data analysis done during missions. Once trained, the ANN can distill the signals from several sensors into a single output, such as safe/unsafe. Used on a neural chip, the trained ANN can eliminate the need for A/D conversions and multiplexing for processing of combined parameters, and the massively parallel nature of the network allows the processing time to remain independent of the number of parameters. As a software routine, the advantages of using an ANN over conventional algorithms include ease of use for engineers and the ability to handle nonlinear, noisy and imperfect data. This paper will apply the ANN to performance data from a T-38 aircraft. Neural Networks Real Time Data Analysis Massively Parallel Systems
5	Uma Abordagem, baseada em framework e na técnica de descrição formal Estelle, para o desenvolvimento de sistemas de arquivos paralelos distribuídos. / An approach, based on framework and the formal description technique Estelle, for the development of distributed parallel file systems. Mantovan, Ulisses 07 July 2006 The constant increase in processing speed, mainly due to the use of an ever larger number of processors, has enabled great advances in the design and construction of parallel computing systems. 
However, the performance of many applications is affected by the latency of data Input/Output operations. To solve this problem, parallel file systems, which offer parallel access to data stored on several disks, have been developed. The development of such complex systems can benefit from the adoption of Formal Description Techniques (FDTs) during the design and specification phases, combined with implementation techniques during the remaining phases. Aiming to introduce the use of FDTs in the development cycle of distributed parallel file systems, this work proposes an approach, based on frameworks and the FDT Extended State Transition Language (Estelle), for the formal specification, validation, implementation and testing of systems belonging to this domain. A conceptual framework that describes a basic functional system is presented, and two case studies are developed from it, giving rise to two file systems derived from the framework. A methodology for validating Estelle specifications that makes use of simulation tools is also proposed. One of the case-study systems is semi-automatically implemented from its Estelle formal specification, and its performance is compared with that of a hand-coded implementation of the same system. Formal specification Parallel systems Validation
6	Uma Abordagem, baseada em framework e na técnica de descrição formal Estelle, para o desenvolvimento de sistemas de arquivos paralelos distribuídos. / An approach, based on framework and the formal description technique Estelle, for the development of distributed parallel file systems. Ulisses Mantovan 07 July 2006 The constant increase in processing speed, mainly due to the use of an ever larger number of processors, has enabled great advances in the design and construction of parallel computing systems. 
However, the performance of many applications is affected by the latency of data Input/Output operations. To solve this problem, parallel file systems, which offer parallel access to data stored on several disks, have been developed. The development of such complex systems can benefit from the adoption of Formal Description Techniques (FDTs) during the design and specification phases, combined with implementation techniques during the remaining phases. Aiming to introduce the use of FDTs in the development cycle of distributed parallel file systems, this work proposes an approach, based on frameworks and the FDT Extended State Transition Language (Estelle), for the formal specification, validation, implementation and testing of systems belonging to this domain. A conceptual framework that describes a basic functional system is presented, and two case studies are developed from it, giving rise to two file systems derived from the framework. A methodology for validating Estelle specifications that makes use of simulation tools is also proposed. One of the case-study systems is semi-automatically implemented from its Estelle formal specification, and its performance is compared with that of a hand-coded implementation of the same system. Formal specification Parallel systems Validation
7	Performance Projections of HPC Applications on Chip Multiprocessor (CMP) Based Systems Shawky Sharkawi, Sameh Sh May 2011 Performance projections of High Performance Computing (HPC) applications onto various hardware platforms are important for hardware vendors and HPC users. The projections aid hardware vendors in the design of future systems and help HPC users with system procurement and application refinements. In this dissertation, we present an efficient method to project the performance of HPC applications onto Chip Multiprocessor (CMP) based systems using widely available standard benchmark data. The main advantage of this method is the use of published data about the target machine; the target machine need not be available. With the current trend in HPC platforms shifting towards cluster systems with CMPs, efficient and accurate performance projection becomes a challenging task. Typically, CMP-based systems are configured hierarchically, which significantly impacts the performance of HPC applications. The goal of this research is to develop an efficient method to project the performance of HPC applications onto systems that utilize CMPs. To provide for efficiency, our projection methodology is automated (projections are done using a tool) and fast (with small overhead). Our method, called the surrogate-based workload application projection method, utilizes surrogate benchmarks to project the performance of an HPC application onto target systems, where the computation component of the application is projected separately from the communication component. Our methodology was validated on a variety of systems utilizing different processor and interconnect architectures with high accuracy and efficiency. The average projection error on three target systems was 11.22 percent with a standard deviation of 1.18 percent for twelve HPC workloads. 
High Performance Computing Parallel Systems Parallel Applications Performance Projection Prediction
8	Static Timing Analysis of Parallel Systems Using Abstract Execution Gustavsson, Andreas January 2014 The Power Wall has stopped the past trend of increasing processor throughput by increasing the clock frequency and the instruction level parallelism. Therefore, the current trend in computer hardware design is to expose explicit parallelism to the software level. This is most often done using multiple processing cores situated on a single processor chip. The cores usually share some resources on the chip, such as some level of cache memory (which means that they also share the interconnect, e.g. a bus, to that memory and also all higher levels of memory), and to fully exploit this type of parallel processor chip, programs running on it will have to be concurrent. Since multi-core processors are the new standard, even embedded real-time systems will (and some already do) incorporate this kind of processor and concurrent code. A real-time system is any system whose correctness is dependent both on its functional and temporal output. For some real-time systems, a failure to meet the temporal requirements can have catastrophic consequences. Therefore, it is of utmost importance that methods to analyze and derive safe estimations on the timing properties of parallel computer systems are developed. 
This thesis presents an analysis that derives safe (lower and upper) bounds on the execution time of a given parallel system. The interface to the analysis is a small concurrent programming language, based on communicating and synchronizing threads, that is formally (syntactically and semantically) defined in the thesis. The analysis is based on abstract execution, which is itself based on abstract interpretation techniques that have been commonly used within the field of timing analysis of single-core computer systems, to derive safe timing bounds in an efficient (although over-approximative) way. Basically, abstract execution simulates several real executions of the analyzed program in one go. The thesis also proves the soundness of the presented analysis (i.e. that the estimated timing bounds are indeed safe) and includes some examples, each showing different features or characteristics of the analysis. / Worst-Case Execution Time Analysis of Parallel Systems / RALF3 - Software for Embedded High Performance Architectures WCET analysis parallel systems multi-core multicore threaded programming language
9	Parallel Mining of Association Rules Using a Lattice Based Approach Thomas, Wessel Morant 01 January 2009 The discovery of interesting patterns from database transactions is one of the major problems in knowledge discovery in databases. One such interesting pattern is the association rule extracted from these transactions. Parallel algorithms are required for the mining of association rules due to the very large databases used to store the transactions. In this paper we present a parallel algorithm for the mining of association rules. We implemented a parallel algorithm that uses a lattice approach for mining association rules. Dynamic Distributed Rule Mining (DDRM) is a lattice-based algorithm that partitions the lattice into sublattices to be assigned to processors for processing and identification of frequent itemsets. Experimental results show that DDRM utilizes the processors efficiently and performs better than the prefix-based and partition algorithms that use a static approach to assign classes to the processors. The DDRM algorithm scales well and shows good speedup. Apriori Association rules Data mining Distributed algorithms Lattice Parallel systems Computer Sciences
10	An efficient execution model for reactive stream programs Nguyen, Vu Thien Nga January 2015 Stream programming is a paradigm where a program is structured as a set of computational nodes connected by streams. Focusing on data moving between computational nodes via streams, this programming model is well suited to applications that process long sequences of data. We call such applications reactive stream programs (RSPs) to distinguish them from stream programs with rather small and finite input data. In stream programming, concurrency is expressed implicitly via communication streams. This helps to reduce the complexity of parallel programming. For this reason, stream programming has gained popularity as a programming model for parallel platforms. However, it is also challenging to analyse and improve the performance without an understanding of the program's internal behaviour. This thesis targets an efficient execution model for deploying RSPs on parallel platforms. This execution model includes a monitoring framework to understand the internal behaviour of RSPs; scheduling strategies for RSPs on uniform shared-memory platforms; and mapping techniques for deploying RSPs on heterogeneous distributed platforms. The foundation of the execution model is a study of the performance of RSPs in terms of throughput and latency. This study includes quantitative formulae for throughput and latency, and the identification of factors that influence these performance metrics. Based on the study of RSP performance, this thesis exploits characteristics of RSPs to derive effective scheduling strategies on uniform shared-memory platforms. Aiming to optimise both throughput and latency, these scheduling strategies are implemented in two heuristic-based schedulers. Both of them are designed to be centralised to provide load balancing for RSPs with dynamic behaviour as well as dynamic structures. 
The first one uses the notion of positive and negative data demands on each stream to determine the scheduling priorities. This scheduler is independent of the runtime system. The second one requires the runtime system to provide the position information for each computational node in the RSP, and uses that to decide the scheduling priorities. Our experiments show that both schedulers provide similar performance while being significantly better than a reference implementation without dynamic load balancing. Also based on the study of RSP performance, we present in this thesis two new heuristic partitioning algorithms which are used to map RSPs onto heterogeneous distributed platforms. These are Kernighan-Lin Adaptation (KLA) and Congestion Avoidance (CA), where the main objective is to optimise the throughput. This is a multi-parameter optimisation problem where existing graph partitioning algorithms are not applicable. Compared to the generic meta-heuristic Simulated Annealing algorithm, both proposed algorithms achieve equally good or better results. KLA is faster for small benchmarks while slower for large ones. In contrast, CA is always orders of magnitude faster even for very large benchmarks. 005.1
