About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Kommunikationstechnologien beim parallelen vorkonditionierten Schur-Komplement CG-Verfahren / Communication technologies in the parallel preconditioned Schur-complement CG method

Meisel, M., Meyer, A. 30 October 1998 (has links) (PDF)
Two alternative communication technologies inside a parallelized conjugate gradient algorithm are presented and compared to the well-known hypercube communication. The amount of communication is discussed in detail. A wide range of numerical results corroborates the theoretical investigations.
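As a rough illustration of the communication pattern such a comparison is concerned with, the following Python sketch shows a distributed (here unpreconditioned) conjugate gradient iteration whose only global communication is the allreduce behind the two inner products per iteration; mpi4py, the placeholder operator apply_A and the data distribution are illustrative assumptions, not the setup used in the thesis.

import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD

def global_dot(u, v):
    # The only inter-process communication per call: one allreduce of a scalar.
    return comm.allreduce(float(np.dot(u, v)), op=MPI.SUM)

def cg(apply_A, b_local, tol=1e-8, max_iter=200):
    # Each rank holds only its own slice of every vector.
    x = np.zeros_like(b_local)
    r = b_local.copy()
    p = r.copy()
    rho = global_dot(r, r)
    for _ in range(max_iter):
        Ap = apply_A(p)                      # may need an extra neighbour exchange
        alpha = rho / global_dot(p, Ap)      # 1st allreduce of the iteration
        x += alpha * p
        r -= alpha * Ap
        rho_new = global_dot(r, r)           # 2nd allreduce of the iteration
        if rho_new < tol * tol:
            break
        p = r + (rho_new / rho) * p
        rho = rho_new
    return x

if __name__ == "__main__":
    b = np.ones(1000)                        # trivial local slice
    x = cg(lambda p: 2.0 * p, b)             # diagonal operator so the sketch runs stand-alone
    if comm.rank == 0:
        print(x[:3])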
2

On-Board Memory Extension on Reconfigurable Integrated Circuits using External DDR3 Memory

Lodaya, Bhaveen 08 February 2018 (has links) (PDF)
User-programmable integrated circuits (ICs), e.g. Field Programmable Gate Arrays (FPGAs), are increasingly popular for embedded, high-performance data exploitation. They combine the parallelization capability and processing power of application-specific integrated circuits (ASICs) with the flexibility, scalability and adaptability of software-based processing solutions. FPGAs provide powerful processing resources due to an optimal adaptation to the target application and a well-balanced ratio of performance, efficiency and parallelization. One drawback of FPGA-based data exploitation is the limited memory capacity of reconfigurable integrated circuits. Large-scale digital signal processing (DSP) FPGAs provide approximately 4 MB of on-board random access memory (RAM), which is not sufficient to buffer the broadband sensor and result data. Hence, additional external memory is connected to the FPGA to increase the on-board storage capacity. External memory devices such as double data rate third-generation synchronous dynamic random access memory (DDR3-SDRAM) provide very fast, wide-bandwidth interfaces, but a single shared interface becomes a bottleneck in highly parallelized processing architectures in which independent processing modules demand concurrent read and write access. Within this master's thesis, a concept for integrating an external DDR3-SDRAM into an FPGA-based parallelized processing architecture is developed and implemented. The solution realizes time-division multiple access (TDMA) to the external memory and a virtual, low-latency extension of the on-board buffer capabilities. The integration of the external RAM does not change how the on-board buffers are used (control, data flow).
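As a software-level illustration of the TDMA access scheme described above, the Python toy model below serves the queued requests of independent modules in fixed, repeating time slots; the module names, slot granularity and memory model are invented for the sketch and do not reflect the FPGA implementation.

# Toy model of time-division multiple access (TDMA) to one shared external
# memory: each module owns a fixed slot in a repeating schedule and has at
# most one request served per slot.
from collections import deque

class TdmaMemoryArbiter:
    def __init__(self, modules):
        self.schedule = list(modules)          # fixed round-robin slot order
        self.queues = {m: deque() for m in modules}
        self.memory = {}                       # address -> value
        self.slot = 0

    def submit(self, module, op, addr, value=None):
        # Independent modules enqueue read/write requests at any time.
        self.queues[module].append((op, addr, value))

    def tick(self):
        # One memory slot: only the module owning this slot is served.
        owner = self.schedule[self.slot % len(self.schedule)]
        self.slot += 1
        if not self.queues[owner]:
            return owner, None                 # slot stays idle
        op, addr, value = self.queues[owner].popleft()
        if op == "write":
            self.memory[addr] = value
            return owner, ("write", addr)
        return owner, ("read", addr, self.memory.get(addr))

# Hypothetical modules sharing the DDR3-like external memory.
arb = TdmaMemoryArbiter(["sensor_in", "fft", "result_out"])
arb.submit("sensor_in", "write", 0x100, 42)
arb.submit("fft", "read", 0x100)
for _ in range(6):
    print(arb.tick())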
3

Reconstrução de imagens por tomografia por impedância elétrica utilizando recozimento simulado massivamente paralelizado. / Image reconstruction through electrical impedance tomography using massively parallelized simulated annealing.

Tavares, Renato Seiji 06 May 2016 (has links)
Electrical impedance tomography (EIT) is a recent medical imaging modality with remarkable advantages over other established modalities. Simulated annealing is an algorithm that renders quality solutions despite the use of simple regularization methods and the absence of a priori information, but its processing time still needs to be reduced. This work takes a step in that direction, presenting a method for the reconstruction of EIT images using simulated annealing and massive GPU parallelization. The parallelization of matrix operations on the GPU is explained, with a thread-scheduling strategy that allows the effective parallelization of algorithms previously considered non-parallelizable. Techniques for accelerating it are discussed, such as the proposed outside-in heuristic. A new sparse matrix representation tailored to the characteristics of the CUDA architecture is proposed, with improved global-memory access patterns and better thread utilization; it proved advantageous over the most commonly used formats. The massive parallelization of the EIT inverse problem using simulated annealing is then studied, with a proposed hybrid approach that parallelizes on both CPU and GPU. Results show that the performance gain for the inverse problem is higher than that obtained for the forward problem. The GPU saturates at meshes of approximately 7,000 nodes, beyond which the performance gain is approximately 5 times over serial implementations. GPU parallelization is therefore viable for the reconstruction of electrical impedance tomography images.
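To make the idea of a GPU-friendly sparse matrix layout concrete, the NumPy sketch below shows a generic ELLPACK-style padded, column-major storage in which the k-th entry of consecutive rows sits in consecutive memory locations (the access pattern GPUs coalesce well); this is a well-known format used here only as an illustration, not the new representation proposed in the thesis.

# ELLPACK-style padded sparse storage plus a matching matrix-vector product.
import numpy as np

def to_ell(dense):
    n = dense.shape[0]
    nnz_per_row = max(int((row != 0).sum()) for row in dense)
    cols = np.zeros((n, nnz_per_row), dtype=np.int64)
    vals = np.zeros((n, nnz_per_row), dtype=dense.dtype)
    for i, row in enumerate(dense):
        idx = np.flatnonzero(row)
        cols[i, :len(idx)] = idx
        vals[i, :len(idx)] = row[idx]
    # Column-major ("transposed") storage: entry k of every row is contiguous.
    return np.ascontiguousarray(cols.T), np.ascontiguousarray(vals.T)

def ell_matvec(cols_T, vals_T, x):
    y = np.zeros(vals_T.shape[1], dtype=vals_T.dtype)
    for k in range(vals_T.shape[0]):          # one pass per padded entry slot
        y += vals_T[k] * x[cols_T[k]]
    return y

A = np.array([[4.0, 0, 1, 0],
              [0, 3.0, 0, 0],
              [1, 0, 5.0, 2],
              [0, 0, 2, 6.0]])
cols_T, vals_T = to_ell(A)
x = np.arange(1.0, 5.0)
print(ell_matvec(cols_T, vals_T, x), A @ x)   # both results should match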
4

Metode i postupci ubrzavanja operacija i upita u velikim sistemima baza i skladišta podataka (Big Data sistemi) / The methods and procedures for accelerating operations and queries in large database systems and data warehouses (Big Data Systems)

Ivković, Jovan 29 September 2016 (has links)
The research topic of this doctoral thesis is the possibility of establishing a Big Data system model with a corresponding software-hardware architecture to support sensor networks and IoT devices. The developed model is based on energy-efficient, heterogeneous, massively parallelized SoC hardware platforms, supported by a software application architecture (such as OpenCL) for unified operation. In addition to the current hardware, software and network computing technologies and architectures intended to operate the subcomponents of the modeled system, the thesis presents a historical overview of their development, emphasizing the cyclic movement of computing paradigms through successive eras of centralization and decentralization. The thesis presents technologies and methods for accelerating operations in databases and data warehouses, and investigates how Big Data information systems can be better prepared to meet the needs of the newly announced general-purpose computing revolution known as ubiquitous computing and the Internet of Things (IoT).
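The point about OpenCL providing unified operation across heterogeneous, massively parallel SoC hardware can be illustrated with a minimal host/kernel pair; the sketch below uses pyopencl as an example binding (an assumption, not something the thesis prescribes), and the trivial element-wise kernel merely stands in for a real query or aggregation step.

# Minimal OpenCL host program: the same kernel source can be built for a
# CPU, GPU or embedded SoC device, which is the "unified operation" idea.
import numpy as np
import pyopencl as cl

ctx = cl.create_some_context()            # picks whatever OpenCL device is available
queue = cl.CommandQueue(ctx)

values = np.random.rand(1 << 16).astype(np.float32)

kernel_src = """
__kernel void scale_offset(__global const float *in,
                           __global float *out,
                           const float scale,
                           const float offset) {
    int gid = get_global_id(0);
    out[gid] = in[gid] * scale + offset;   // stand-in for a per-record transform
}
"""

mf = cl.mem_flags
in_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=values)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, values.nbytes)

prg = cl.Program(ctx, kernel_src).build()
prg.scale_offset(queue, values.shape, None,
                 in_buf, out_buf, np.float32(2.0), np.float32(1.0))

result = np.empty_like(values)
cl.enqueue_copy(queue, result, out_buf)
print(np.allclose(result, values * 2.0 + 1.0))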
