Global ETD Search

201	Scaling Up Large-scale Sparse Learning and Its Application to Medical Imaging January 2017 (has links) abstract: Large-scale $\ell_1$-regularized loss minimization problems arise in high-dimensional applications such as compressed sensing and high-dimensional supervised learning, including classification and regression problems. In many applications, it remains challenging to apply the sparse learning model to large-scale problems that have massive data samples with high-dimensional features. One popular and promising strategy is to scaling up the optimization problem in parallel. Parallel solvers run multiple cores on a shared memory system or a distributed environment to speed up the computation, while the practical usage is limited by the huge dimension in the feature space and synchronization problems. In this dissertation, I carry out the research along the direction with particular focuses on scaling up the optimization of sparse learning for supervised and unsupervised learning problems. For the supervised learning, I firstly propose an asynchronous parallel solver to optimize the large-scale sparse learning model in a multithreading environment. Moreover, I propose a distributed framework to conduct the learning process when the dataset is distributed stored among different machines. Then the proposed model is further extended to the studies of risk genetic factors for Alzheimer's Disease (AD) among different research institutions, integrating a group feature selection framework to rank the top risk SNPs for AD. For the unsupervised learning problem, I propose a highly efficient solver, termed Stochastic Coordinate Coding (SCC), scaling up the optimization of dictionary learning and sparse coding problems. The common issue for the medical imaging research is that the longitudinal features of patients among different time points are beneficial to study together. To further improve the dictionary learning model, I propose a multi-task dictionary learning method, learning the different task simultaneously and utilizing shared and individual dictionary to encode both consistent and changing imaging features. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2017 Computer science Dictionary Learning Distributed Computing Machine Learning Medical Imaging Parallel Computing Sparse Learning
202	Paralelização da ferramenta de alinhamento de sequências MUSCLE para um ambiente distribuído Marucci, Evandro Augusto [UNESP] 11 February 2009 (has links) (PDF) Made available in DSpace on 2014-06-11T19:24:01Z (GMT). No. of bitstreams: 0 Previous issue date: 2009-02-11Bitstream added on 2014-06-13T19:51:06Z : No. of bitstreams: 1 marucci_ea_me_sjrp.pdf: 2105093 bytes, checksum: 5b417abdc99cd4c7f9807768af1ab956 (MD5) / Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) / Devido a crescente quantidade de dados genômicos para comparação, a computação paralela está se tornando cada vez mais necessária para realizar uma das operaçoes mais importantes da bioinformática, o alinhamento múltiplo de sequências. Atualmente, muitas ferramentas computacionais são utilizadas para resolver alinhamentos e o uso da computação paralela está se tornando cada vez mais generalizado. Entretanto, embora diferentes algoritmos paralelos tenham sido desenvolvidos para suportar as pesquisas genômicas, muitos deles não consideram aspectos fundamentais da computação paralela. O MUSCLE [1] e uma ferramenta que realiza o alinhamento m ultiplo de sequências com um bom desempenho computacional e resultados biológicos signi cativamente precisos [2]. Embora os m etodos utilizados por ele apresentem diferentes versões paralelas propostas na literatura, apenas uma versão paralela do MUSCLE foi proposta [3]. Essa versão, entretanto, foi desenvolvida para sistemas de mem oria compartilhada. O desenvolvimento de uma versão paralela do MUSCLE para sistemas distribu dos e importante dado o grande uso desses sistemas em laboratórios de pesquisa genômica. Esta paralelização e o foco deste trabalho e ela foi realizada utilizando-se abordagens paralelas existentes e criando-se novas abordagens. Como resultado, diferentes estratégias paralelas foram propostas. Estas estratégias podem ser incorporadas a outras ferramentas de alinhamento que utilizam, em determinadas etapas, a mesma abordagem seq uencial. Em cada método paralelizado, considerou-se principalmente a e ciência, a escalabilidade e a capacidade de atender problemas reais da biologia. Os testes realizados mostram que, para cada etapa paralela, ao menos uma estratégia de nida atende bem todos esses crit erios. Al em deste trabalho realizar um paralelismo in edito, ao viabilizar a execução da ferramenta MUSCLE em... / Due to increasing amount of genetic data for comparison, parallel computing is becoming increasingly necessary to perform one of the most important operations in bioinformatics, the multiple sequence alignments. Nowadays, many software tools are used to solve sequence alignments and the use of parallel computing is becoming more and more widespread. However, although di erent parallel algorithms were developed to support genetic researches, many of them do not consider fundamental aspects of parallel computing. The MUSCLE [1] is a tool that performs multiple sequence alignments with good computational performance and biological results signi cantly precise [2]. Although the methods used by them have di erent parallel versions proposed in the literature, only one parallel version of the MUSCLE tool was proposed [3]. This version, however, was developed for shared memory systems. The development of a parallel MUSCLE tool for distributed systems is important given the wide use of such systems in laboratories of genomic researches. This parallelization is the aim of this work and it was done using existing parallel approaches and creating new approaches. Consequently, di erent parallel strategies have been proposed. These strategies can be incorporated into other alignment tools that use, in a given stage, the same sequential approach. In each parallel method, we considered mainly the e ciency, scalability and ability to meet real biological problems. The tests show that, for each parallel step, at least one de ned strategy meets all these criteria. In addition to the new MUSCLE parallelization, enabling it execute in a distributed systems, the results show that the de ned strategies have a better performance than the existing strategies. Processamento paralelo (Computadores) Computação paralela Sistemas distribuídos Análise de desempenho Alinhamento de sequências Parallel computing Distributed system
203	Algoritmos de alinhamento múltiplo e técnicas de otimização para esses algoritmos utilizando Ant Colony Zafalon, Geraldo Francisco Donega [UNESP] 30 April 2009 (has links) (PDF) Made available in DSpace on 2014-06-11T19:24:01Z (GMT). No. of bitstreams: 0 Previous issue date: 2009-04-30Bitstream added on 2014-06-13T19:10:03Z : No. of bitstreams: 1 zafalon_gfd_me_sjrp.pdf: 915240 bytes, checksum: 39a35a2fec9d70947eb907760544f707 (MD5) / A biologia, como uma ciência bastante desenvolvida, foi dividida em diversas areas, dentre elas, a genética. Esta area passou a crescer em importância nos ultimos cinquenta anos devido aos in umeros benefícios que ela pode trazer, principalmente, aos seres humanos. Como a gen etica passou a apresentar problemas com grande complexidade de resolução estratégias computacionais foram agregadas a ela, surgindo assim a bioinform atica. A bioinformática desenvolveu-se de forma bastante signi cativa nos ultimos anos e esse desenvolvimento vem se acentuando a cada dia, devido ao aumento da complexidade dos problemas genômicos propostos pelos biólogos. Assim, os cientistas da computação têm se empenhado no desenvolvimento de novas técnicas computacionais para os biólogos, principalmente no que diz respeito as estrat egias para alinhamentos m ultiplos de sequências. Quando as sequências estão alinhadas, os biólogos podem realizar mais inferências sobre elas, principalmente no reconhecimento de padrões que e uma outra area interessante da bioinformática. Atrav es do reconhecimento de padrãoes, os bi ologos podem identicar pontos de alta signi cância (hot spots) entre as sequências e, consequentemente, pesquisar curas para doençass, melhoramentos genéticos na agricultura, entre outras possibilidades. Este trabalho traz o desenvolvimento e a comparação entre duas técnicas computacionais para o alinhamento m ultiplo de sequências. Uma e baseada na técnica de alinhamento múltiplo de sequências progressivas pura e a outra, e uma técnica de alinhamento múltiplo de sequências otimizada a partir da heurística de colônia de formigas. Ambas as técnicas adotam em algumas de suas fases estratégias de paralelismo, focando na redu c~ao do tempo de execução dos algoritmos. Os testes de desempenho e qualidade dos alinhamentos que foram conduzidos com as duas estrat egias... / Biology as an enough developed science was divided in some areas, and genetics is one of them. This area has improved its relevance in last fty years due to the several bene ts that it can mainly bring to the humans. As genetics starts to show problems with hard resolution complexity, computational strategies were aggregated to it, leading to the start of the bioinformatics. The bioinformatics has been developed in a signi cant way in the last years and this development is accentuating everyday due to the increase of the complexity of the genomic problems proposed by biologists. Thus, the computer scientists have committed in the development of new computational techniques to the biologists, mainly related to the strategies to multiple sequence alignments. When the sequences are aligned, the biologists can do more inferences about them mainly in the pattern recognition that is another interesting area of the bioinformatics. Through the pattern recognition, the biologists can nd hot spots among the sequences and consequently contribute for the cure of diseases, genetics improvements in the agriculture and many other possibilities. This work brings the development and the comparison between two computational techniques for the multiple sequence alignments. One is based on the pure progressive multiple sequence alignment technique and the other one is an optimized multiple sequence alignment technique based on the ant colony heuristics. Both techniques take on some of its stages of parallel strategies, focusing on reducing the execution time of algorithms. Performance and quality tests of the alignments were conducted with both strategies and showed that the optimized approach presents better results when it is compared with the pure progressive approach. Biology as an enough developed science was divided in some areas, and genetics is one of them. This area has improved... (Complete abstract click electronic access below) Processamento paralelo (Computadores) Alinhamento de sequências Computação paralela Otimização de algoritmos Parallel computing
204	An?lise de desempenho da rede neural artificial do tipo multilayer perceptron na era multicore Souza, Francisco Ary Alves de 07 August 2012 (has links) Made available in DSpace on 2014-12-17T14:56:07Z (GMT). No. of bitstreams: 1 FranciscoAAS_DISSERT.pdf: 1526658 bytes, checksum: 7ba5b80f03a10eaf25a4f9e6a4c91372 (MD5) Previous issue date: 2012-08-07 / Coordena??o de Aperfei?oamento de Pessoal de N?vel Superior / Artificial neural networks are usually applied to solve complex problems. In problems with more complexity, by increasing the number of layers and neurons, it is possible to achieve greater functional efficiency. Nevertheless, this leads to a greater computational effort. The response time is an important factor in the decision to use neural networks in some systems. Many argue that the computational cost is higher in the training period. However, this phase is held only once. Once the network trained, it is necessary to use the existing computational resources efficiently. In the multicore era, the problem boils down to efficient use of all available processing cores. However, it is necessary to consider the overhead of parallel computing. In this sense, this paper proposes a modular structure that proved to be more suitable for parallel implementations. It is proposed to parallelize the feedforward process of an RNA-type MLP, implemented with OpenMP on a shared memory computer architecture. The research consistes on testing and analizing execution times. Speedup, efficiency and parallel scalability are analyzed. In the proposed approach, by reducing the number of connections between remote neurons, the response time of the network decreases and, consequently, so does the total execution time. The time required for communication and synchronization is directly linked to the number of remote neurons in the network, and so it is necessary to investigate which one is the best distribution of remote connections / As redes neurais artificiais geralmente s?o aplicadas ? solu??o de problemas comple- xos. Em problemas com maior complexidade, ao aumentar o n?mero de camadas e de neur?nios, ? poss?vel conseguir uma maior efici?ncia funcional, por?m, isto acarreta em um maior esfor?o computacional. O tempo de resposta ? um fator importante na decis?o de us?-las em determinados sistemas. Muitos defendem que o maior custo computacional est? na fase de treinamento. Por?m, esta fase ? realizada apenas uma ?nica vez. J? trei- nada, ? necess?rio usar os recursos computacionais existentes de forma eficiente. Diante da era multicore esse problema se resume ? utiliza??o eficiente de todos os n?cleos de processamento dispon?veis. No entanto, ? necess?rio considerar a sobrecarga existente na computa??o paralela. Neste sentido, este trabalho prop?e uma estrutura modular que ? mais adequada para as implementa??es paralelas. Prop?e-se paralelizar o processo feed- forward (passo para frente) de uma RNA do tipo MLP, implementada com o OpenMP em uma arquitetura computacional de mem?ria compartilhada. A investiga??o dar-se-? com a realiza??o de testes e an?lises dos tempos de execu??o. A acelera??o, a efici?ncia e a es- calabilidade s?o analisados. Na proposta apresentada ? poss?vel perceber que, ao diminuir o n?mero de conex?es entre os neur?nios remotos, o tempo de resposta da rede diminui e por consequ?ncia diminui tamb?m o tempo total de execu??o. O tempo necess?rio para comunica??o e sincronismo est? diretamente ligado ao n?mero de neur?nios remotos da rede, sendo ent?o, necess?rio observar sua melhor distribui??o CNPQ::ENGENHARIAS::ENGENHARIA ELETRICA
205	Study of load distribution measures for high-performance applications / Estudos de medidas de distribuição de carga para aplicação de alto desempenho Rodrigues, Flavio Alles January 2016 (has links) Balanceamento de carga é essencial para que aplicações paralelas tenham desempenho adequado. Conforme sistemas de computação paralelos crescem, o custo de uma má distribuição de carga também aumenta. Porém, o comportamento dinâmico que a carga computacional possui em certas aplicações pode induzir disparidades na carga atribuída a cada recurso. Portanto, o repetitivo processo de redistribuição de carga realizado durante a execução é crucial para que problemas de grande escala que possuam tais características possam ser resolvidos. Medidas que quantifiquem a distribuição de carga são um importante aspecto desse procedimento. Por estas razões, métricas frequentemente utilizadas como indicadores da distribuição de carga em aplicações paralelas são investigadas nesse estudo. Dado que balanceamento de carga é um processo dinâmico e recorrente, a investigação examina como tais métricas quantificam a distribuição de carga em intervalos regulares durante a execução da aplicação paralela. Seis métricas são avaliadas: percent imbalance, imbalance percentage, imbalance time, standard deviation, skewness e kurtosis. A análise revela virtudes e deficiências que estas medidas possuem, bem como as diferenças entres as mesmas como descritores da distribuição de carga em aplicações paralelas. Uma investigação como esta não tem precedentes na literatura especializada. / Load balance is essential for parallel applications to perform at their highest possible levels. As parallel systems grow, the cost of poor load distribution increases in tandem. However, the dynamic behavior the distribution of load possesses in certain applications can induce disparities in computational loads among resources. Therefore, the process of repeatedly redistributing load as execution progresses is critical to achieve the performance necessary to compute large scale problems with such characteristics. Metrics quantifying the load distribution are an important facet of this procedure. For these reasons, measures commonly used as load distribution indicators in HPC applications are investigated in this study. Considering the dynamic and recurrent aspect in load balancing, the investigation examines how these metrics quantify load distribution at regular intervals during a parallel application execution. Six metrics are evaluated: percent imbalance, imbalance percentage, imbalance time, standard deviation, skewness, and kurtosis. The analysis reveals the virtues and deficiencies each metric has, as well as the differences they register as descriptors of load distribution progress in parallel applications. As far as we know, an investigation as the one performed in this work is unprecedented. Processamento : Alto desempenho High-performance computing Parallel computing Performance analysis Load balance
206	Estimador subsemble espacial para dados massivos em geoestatística Barbian, Márcia Helena January 2016 (has links) Um problema que vem se tornando habitual em análise geoestatística é a quantidade crescente de observações. Em tais casos é comum que estimadores usualmente utilizados não possam ser empregados devido a dificuldades numéricas. Esta tese têm por objetivo propor um novo estimador para massivas observações em geoestatística: o estimador subsemble espacial. O estimador subsemble espacial seleciona várias subamostras, espacialmente estruturadas, do conjunto completo de dados. Cada subamostra estima com facilidade os parâmetros do modelo e as estimativas resultantes são ponderadas através de um subconjunto de validação. Em estudos simulados, compara-se a metodologia proposta com outros métodos e os resultados apresentam sua acurácia e rapidez. Além disso, uma aplicação em um banco de dados reais, com 11.000 observações, confirma essas características. / A problem that is becoming common in geostatistical analysis is the growing number of observations. In such cases, common estimators cannot be used due to numerical difficulties. This thesis proposes a new estimator for massive observations in geostatistics: the spatial subsemble estimator. The estimator selects small spatially structured subset of observations. The model parameters are estimated easily with each subsample, and the resulting estimates are weighted by a subset of validation. We compare the spatial subsemble with competing alternatives showing that it is faster and accurate. In addition, we present an application in a real database with 11000 observations. Geoestatística Programação paralela Large spatial data Subsampling U-statistics Parallel computing Geoestatistic
207	Uma plataforma para manuseio, análise e avaliação de dados de observação e controle de simulações Sidou, Pedro Niederhagebock January 2017 (has links) Atualmente, com a melhoria das técnicas de paralelização, cada vez mais vem sendo empregado o uso de clusteres de computadores para programas exigentes de alta performance. Porém, para um programa rodar adequadamente em paralelo, o mesmo deve ser escrito de maneira paralelizável. A proposta deste trabalho é o desenvolvimento de uma plataforma computacional que facilite a escrita de programas orientados à análise de dados climáticos e simulações de fenômenos atmosféricos escritos em paralelo. A plataforma foi pensada de maneira a ser o mais exível possível, permitindo que novas funcionalidades sejam adicionadas através de plugins. Em um futuro próximo, pretende-se disponibilizá-la para uso público com licença GNU General Public License (GNU GPL) e código aberto. Para demonstrar seu potencial de uso, foi realizado um estudo exploratório com dados provenientes do projeto de reanálise Twentieth Century Reanalysis version 2 (20CRv2). Tal estudo visa obter informações sobre que parâmetros e escalas de tempo são importantes para a descrição do fenômeno da zona de convergência do Atlântico sul (ZCAS). / Since parallel computing technologies are improving very fast, computer clusteres are getting more and more employed for highly demanding computational tasks. Nevertheless, for a software to run in a cluster, it needs to be written in a parallel way, which is not necessarily a simple task. Therefore, the aim of this work is to develop a computer platform capable of manipulating, analysing and evaluating climate observational data and controlling simulation in a parallel manner. Features can be added to the platform through plugins, making it very exible and extensible for a very large range of tasks. A simple application is purposed to demonstrate the platform's potential uses. The application consists of a exploratory study of data from the 20th century reanalysis project concerning the South American Convergence Zone (SACZ), a very important phenomena in the south hemisphere. Programas de computador Dispersão de poluentes Climatologia Modelos matemáticos Parallel computing Simulation Climate SACZ
208	Aurora : seamless optimization of openMP applications / Aurora: Otimização Transparente de Aplicações OpenMP Lorenzon, Arthur Francisco January 2018 (has links) A exploração eficiente do paralelismo no nível de threads tem sido um desafio para os desenvolvedores de softwares. Como muitas aplicações não escalam com o número de núcleos, aumentar cegamente o número de threads pode não produzir os melhores resultados em desempenho ou energia. No entanto, a tarefa de escolher corretamente o número ideal de threads não é simples: muitas variáveis estão envolvidas (por exemplo, saturação do barramento off-chip e sobrecarga de sincronização de dados), que mudam de acordo com diferentes aspectos do sistema (por exemplo, conjunto de entrada, micro-arquitetura) e mesmo durante a execução da aplicação. Para abordar esse complexo cenário, esta tese apresenta Aurora. Ela é capaz de encontrar automaticamente, em tempo de execução e com o mínimo de sobrecarga, o número ideal de threads para cada região paralela da aplicação e se readaptar nos casos em que o comportamento de uma região muda durante a execução. Aurora trabalha com o OpenMP e é completamente transparente tanto para o programador quanto para o usuário final: dado um binário de uma aplicação OpenMP, Aurora o otimiza sem nenhuma transformação ou recompilação de código. Através da execução de quinze benchmarks conhecidos em quatro processadores multi-core, mostramos que Aurora melhora o trade-off entre desempenho e energia em até: 98% sobre a execução padrão do OpenMP; 86% sobre o recurso interno do OpenMP que ajusta dinamicamente o número de threads; e 91% quando comparado a uma emulação do feedback-driven threading. / Efficiently exploiting thread-level parallelism has been challenging for software developers. As many parallel applications do not scale with the number of cores, blindly increasing the number of threads may not produce the best results in performance or energy. However, the task of rightly choosing the ideal amount of threads is not straightforward: many variables are involved (e.g. off-chip bus saturation and overhead of datasynchronization), which will change according to different aspects of the system at hand (e.g., input set, micro-architecture) and even during execution. To address this complex scenario, this thesis presents Aurora. It is capable of automatically finding, at run-time and with minimum overhead, the optimal number of threads for each parallel region of the application and re-adapt in cases the behavior of a region changes during execution. Aurora works with OpenMP and is completely transparent to both designer and end-user: given an OpenMP application binary, Aurora optimizes it without any code transformation or recompilation. By executing fifteen well-known benchmarks on four multi-core processors, Aurora improves the trade-off between performance and energy by up to: 98% over the standard OpenMP execution; 86% over the built-in feature of OpenMP that dynamically adjusts the number of threads; and 91% over a feedback-driven threading emulation. Processamento paralelo Open MP Parallel computing Energy and performance optimization Software tuning OpenMP
209	Asymptotic and Numerical Algorithms in Applied Electromagnetism January 2012 (has links) abstract: Asymptotic and Numerical methods are popular in applied electromagnetism. In this work, the two methods are applied for collimated antennas and calibration targets, respectively. As an asymptotic method, the diffracted Gaussian beam approach (DGBA) is developed for design and simulation of collimated multi-reflector antenna systems, based upon Huygens principle and independent Gaussian beam expansion, referred to as the frames. To simulate a reflector antenna in hundreds to thousands of wavelength, it requires 1E7 - 1E9 independent Gaussian beams. To this end, high performance parallel computing is implemented, based on Message Passing Interface (MPI). The second part of the dissertation includes the plane wave scattering from a target consisting of doubly periodic array of sharp conducting circular cones by the magnetic field integral equation (MFIE) via Coiflet based Galerkin's procedure in conjunction with the Floquet theorem. Owing to the orthogonally, compact support, continuity and smoothness of the Coiflets, well-conditioned impedance matrices are obtained. Majority of the matrix entries are obtained in the spectral domain by one-point quadrature with high precision. For the oscillatory entries, spatial domain computation is applied, bypassing the slow convergence of the spectral summation of the non-damping propagating modes. The simulation results are compared with the solutions from an RWG-MLFMA based commercial software, FEKO, and excellent agreement is observed. / Dissertation/Thesis / Ph.D. Electrical Engineering 2012 Electromagnetics Calibration target Coifman wavelets Gaussian beam method of moments parallel computing
210	Testing Independence of Parallel Pseudorandom Number Streams: Incorporating the Data's Multivariate Nature January 2013 (has links) abstract: Parallel Monte Carlo applications require the pseudorandom numbers used on each processor to be independent in a probabilistic sense. The TestU01 software package is the standard testing suite for detecting stream dependence and other properties that make certain pseudorandom generators ineffective in parallel (as well as serial) settings. TestU01 employs two basic schemes for testing parallel generated streams. The first applies serial tests to the individual streams and then tests the resulting P-values for uniformity. The second turns all the parallel generated streams into one long vector and then applies serial tests to the resulting concatenated stream. Various forms of stream dependence can be missed by each approach because neither one fully addresses the multivariate nature of the accumulated data when generators are run in parallel. This dissertation identifies these potential faults in the parallel testing methodologies of TestU01 and investigates two different methods to better detect inter-stream dependencies: correlation motivated multivariate tests and vector time series based tests. These methods have been implemented in an extension to TestU01 built in C++ and the unique aspects of this extension are discussed. A variety of different generation scenarios are then examined using the TestU01 suite in concert with the extension. This enhanced software package is found to better detect certain forms of inter-stream dependencies than the original TestU01 suites of tests. / Dissertation/Thesis / Ph.D. Statistics 2013 Statistics Computer science Mathematics parallel computing pseudorandom numbers testing of generators tests of vector independence

Search results