Global ETD Search

21	Escalabilidade Paralela de um Algoritmo de Migração Reversa no Tempo (RTM) Pré-empilhamento / Parallel Scalability of a Prestack Reverse Time Migration (RTM) Algorithm
Rosário, Desnes Augusto Nunes do
21 December 2012
The seismic method is of great importance in geophysics. Mainly associated with oil exploration, this line of research attracts a large share of all investment in the field. The acquisition, processing and interpretation of seismic data are the stages that make up a seismic study. Seismic processing in particular aims to produce an image that represents the geological structures in the subsurface. It has evolved significantly in recent decades because of the demands of the oil industry and because of hardware advances that brought higher storage and digital information processing capacity, which in turn enabled more sophisticated processing algorithms such as those that use parallel architectures. One of the most important steps in seismic processing is imaging. Migration of seismic data is one of the techniques used for imaging, with the goal of obtaining a seismic section that represents the geological structures as accurately and faithfully as possible. The result of migration is a 2D or 3D image in which it is possible to identify faults and salt domes, among other structures of interest such as potential hydrocarbon reservoirs.
However, a migration of high quality and accuracy can be a very time-consuming process, because of the mathematical heuristics of the algorithm and the extensive amount of data input and output involved. It may take days, weeks or even months of uninterrupted execution on supercomputers, representing large computational and financial costs that could make these methods impractical. Aiming at performance improvement, this work parallelized the core of a Reverse Time Migration (RTM) algorithm using the Open Multi-Processing (OpenMP) parallel programming model, motivated by the large computational effort this migration technique requires. In addition, performance analyses such as speedup and efficiency were carried out and, finally, the degree of algorithmic scalability was identified with respect to the technological advances expected in future processors.
Keywords: CNPQ::ENGENHARIAS::ENGENHARIA ELETRICA
22	Multifractal traffic generator modeled at the transaction level for integrates systems performance evaluation. / Gerador de tráfego multifractal modelado no nível de transações para a avaliação de desempenho de sistemas integrados.
Bueno Filho, José Eduardo Chiarelli
10 February 2017
This work aims to improve the efficiency of the design flow of integrated systems, focusing specifically on the performance evaluation of their communication structures. The use of Transaction Level Modeling (TLM) is proposed in order to benefit from the reduced design effort and time it offers. Among performance evaluation approaches, traffic generators have begun to replace full-system simulations because of their higher time efficiency. Early work on on-chip traffic generation focused on Poisson processes and classic Markovian models, which are unable to capture Long Range Dependence (LRD). This led to the adoption of fractal/self-similar models. Later advances showed that the traffic produced in multiprocessor systems can exhibit a higher degree of complexity, which can be attributed to the presence of multifractal characteristics. In this work, a methodology for evaluating on-chip traffic and for developing a transaction-level traffic generator is proposed. The main contributions are a detailed analysis of the traffic time series obtained from TLM simulations and a study of the effects of the traffic generator on these simulations, mainly concerning the speedup-accuracy trade-off. The proposed analyses follow the multifractal paradigm, allowing system developers (1) to understand the statistical nature of on-chip traffic, (2) to obtain accurate representations of this traffic and (3) to build traffic generators that mimic processing elements realistically.
Another contribution is a comparison between monofractal and multifractal models in terms of the accuracy of the synthetic traffic time series they produce. All of these contributions are grouped in the detailed methodology presented in this document, on which experiments were carried out.
Keywords: Fractals; Time series models; Multifractal model; On-chip traffic; Large-scale integrated systems; Speedup-accuracy trade-off; Traffic generator; Transaction level modeling
24	Um algoritmo paralelo eficiente de migração reversa no tempo (rtm) 3d com granularidade fina [An efficient parallel 3D reverse time migration (RTM) algorithm with fine granularity]
Assis, Ítalo Augusto Souza de
30 January 2015
Sponsorship: Conselho Nacional de Desenvolvimento Científico e Tecnológico - CNPq
The reverse time migration (RTM) algorithm has been widely used in the seismic industry to generate images of the subsurface and thus reduce the risk of oil and gas exploration. Its widespread use is due to the high quality of its subsurface imaging. RTM is also known for its high computational cost. For this reason, parallel computing techniques have been used in its implementations. In general, parallel approaches for RTM use a coarse granularity, distributing the processing of a subset of seismic shots among the nodes of distributed systems. Coarse-grained parallel approaches for RTM have proved very efficient, since each seismic shot can be processed independently. However, the nodes of current distributed systems are generally machines with several processing elements under a shared-memory architecture.
RTM performance can therefore be improved considerably by using a parallel approach with finer granularity for the processing assigned to each node. This work presents an efficient parallel algorithm for 3D reverse time migration with fine granularity, using OpenMP as the programming model. The 3D acoustic wave propagation algorithm makes up much of RTM. Different load-balancing schemes were analyzed in order to minimize possible losses of parallel performance at this stage. The results served as a basis for implementing the other phases of RTM: backpropagation and the imaging condition. The proposed algorithm was tested with synthetic data representing some of the possible subsurface structures. Metrics such as speedup and efficiency were used to analyze its parallel performance. The migrated sections show that the algorithm performed satisfactorily in identifying subsurface structures. The parallel performance analyses demonstrate the scalability of the algorithms, which achieved a speedup of 22.46 for the wave propagation and 16.95 for the RTM, both with 24 threads.
Keywords: CNPQ::ENGENHARIAS::ENGENHARIA ELETRICA; Parallel and distributed systems; 3D reverse time migration (RTM); Fine granularity; Speedup; Efficiency; Scalability
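The speedup figures quoted in the abstract above can be turned into parallel efficiencies with the usual definitions, speedup S = T_1 / T_p and efficiency E = S / p. The following is a minimal sketch of that arithmetic; the function names are illustrative and are not taken from the dissertation.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S = T_1 / T_p: serial run time divided by parallel run time."""
    return t_serial / t_parallel


def efficiency(s: float, threads: int) -> float:
    """Parallel efficiency E = S / p: the fraction of ideal linear speedup achieved."""
    return s / threads


# Speedups reported in the abstract, both measured with 24 threads.
reported = {"wave propagation": 22.46, "full RTM": 16.95}
efficiencies = {name: efficiency(s, 24) for name, s in reported.items()}

for name, e in efficiencies.items():
    print(f"{name}: efficiency = {e:.1%}")
# prints:
# wave propagation: efficiency = 93.6%
# full RTM: efficiency = 70.6%
```

Read this way, the wave propagation kernel runs close to ideal linear scaling at 24 threads, while the complete RTM retains roughly seven tenths of it.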
25	Comparison of Shared memory based parallel programming models
Ravela, Srikar Chowdary
January 2010
Parallel programming models are a challenging and emerging topic in the parallel computing era. These models allow a developer to port a sequential application to a platform with a larger number of processors so that the problem or application can be solved more easily. How applications are adapted in this way is often influenced by the type of application, the type of platform and many other factors. Several parallel programming models have been developed; the two main classes are shared memory based and distributed memory based parallel programming models. The recognition that some computing applications entail immense computing requirements leads to the problem of developing efficient programming models that bridge the gap between the hardware's ability to perform the computations and the software's ability to support that performance for those applications [25][9]. A better programming model is therefore needed, one that facilitates easy development while also delivering high performance. To answer this challenge, this thesis compares four shared memory based parallel programming models, weighing the development time of an application under each model against the performance achieved by that application in the same model. The models are evaluated on data parallel applications, to verify their ability to support data parallelism with respect to the development time of those applications. The data parallel applications are borrowed from the Dense Matrix dwarfs; the dwarfs used are Matrix-Matrix multiplication, Jacobi Iteration and Laplace Heat Distribution.
The experimental method consists of selecting three data parallel benchmarks and developing them under the four shared memory based parallel programming models considered. The performance of the applications under each programming model is recorded, and finally the results are used to compare the parallel programming models analytically. The results show that, at the cost of development time, better performance is achieved for the chosen data parallel applications when they are developed in Pthreads. Conversely, at the cost of a little performance, data parallel applications are extremely easy to develop in task based parallel programming models. The directive models are moderate from both perspectives and rank between the tasking models and the threading models.
From this study it is clear that the threading model Pthreads is the dominant programming model in performance, supporting high speedups for two of the three dwarfs, whereas the tasking models are dominant in development time and in reducing the number of errors, supporting high growth in speedup for applications without any communication and lower growth in self-relative speedup for applications involving communication. The performance of the tasking models degrades for communication-based problems because task based models are designed to execute tasks in parallel without any interruptions or preemptions during their computations. Introducing communication violates that purpose and thereby results in lower performance. The directive model OpenMP is moderate in both aspects and stands between these models. In general, the directive and tasking models offer better speedup than any other models for task based problems that follow the divide and conquer strategy. For data parallelism, however, the speedup growth achieved is low (i.e. they are less scalable for data parallel applications), although their execution times are comparable to those of the threading models. Development times are also considerably lower for data parallel applications, because these models require fewer functional routines to parallelize the applications. This thesis is concerned with the comparison of shared memory based parallel programming models in terms of speedup. Work of this type serves as a handy guide that programmers can consult when developing applications under shared memory based parallel programming models. We suggest that this work can be extended in two ways: one from the developer's perspective, the other as a cross-referential study of parallel programming models. The former can be done by having a different programmer carry out a similar study and comparing it with this one. The latter can be done by including multiple data points in the same programming model or by using a different set of parallel programming models.
Keywords: Parallel Programming models; Distributed memory; Shared memory; Dwarfs; Development time; Speedup; Data parallelism; Dense Matrix dwarfs; threading models; Tasking models; Directive models; Computer Sciences
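To make the benchmarks named in the abstract above more concrete, here is a minimal serial sketch of Jacobi iteration applied to a Laplace heat distribution problem. It is an illustration written for this listing, not the thesis code; the grid size, tolerance and function names are assumptions. Each interior cell is updated only from the previous grid, so the rows of one sweep are independent of each other, which is the property a shared memory model (Pthreads, OpenMP, or a tasking library) would exploit by giving different rows to different threads.

```python
def jacobi_step(grid):
    """One Jacobi sweep: every interior cell becomes the mean of its four neighbours.

    All reads come from the old grid, so the iterations of the outer loop
    are independent and could be split across threads.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # boundary cells are copied unchanged
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new


def solve_laplace(grid, tol=1e-6, max_iters=10_000):
    """Repeat Jacobi sweeps until the largest per-cell change drops below tol."""
    for iteration in range(1, max_iters + 1):
        new = jacobi_step(grid)
        change = max(abs(a - b)
                     for new_row, old_row in zip(new, grid)
                     for a, b in zip(new_row, old_row))
        grid = new
        if change < tol:
            break
    return grid, iteration


# Hypothetical test case: a 5x5 plate with the top edge held at 100 and the others at 0.
plate = [[100.0] * 5] + [[0.0] * 5 for _ in range(4)]
steady, iterations = solve_laplace(plate)
print(f"converged after {iterations} sweeps; centre temperature = {steady[2][2]:.2f}")
```

The convergence check is the only step that needs a reduction over the whole grid, which is why benchmarks of this kind are sensitive to how a programming model handles synchronization between sweeps.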
26	Evaluating Speedup in Parallel Compilers
Komathukattil, Deepa V
01 January 2012
Parallel programming is prevalent in every field, mainly to speed up computation. Advancements in multiprocessor technology fuel this trend toward parallel programming. However, modern compilers are still largely single threaded and do not take advantage of the machine resources available to them. A lot of work has been done on compilers that add parallel constructs to the programs they compile, enabling those programs to exploit parallelism at run time. Auto-parallelization of loops by a compiler is one such example. Researchers have done very little work on parallelizing the compilation process itself. The research done here focuses on parallel compilers that target computation speedup by parallelizing program compilation during the lexical analysis and semantic analysis phases. Parallelization brings with it issues such as synchronization, concurrency and communication overhead. In the semantic analysis phase, these issues are of particular relevance during the construction of the symbol table. Research on a concurrent compiler developed at the University of Toronto in 1991 proposed three techniques to address the generation of the symbol table [Seshadri91]. The goal here is to implement a parallel compiler using concepts from those techniques as references. The research done here will augment the earlier work and measure the performance speedup obtained.
Keywords: Thesis; University of North Florida; UNF; Speedup; Parallel Compilers; Top Down Parsing; Computer and Systems Architecture
27	De-quantizing quantum machine learning algorithms
Sköldhed, Stefanie
January 2022
Machine learning is a modern and interesting research area. Another new and exciting research area is quantum computation, the study of the information processing tasks that can be accomplished using quantum mechanical systems. This master thesis combines both areas and investigates quantum machine learning. Kerenidis and Prakash's quantum algorithm for recommendation systems, which offered an exponential speedup over the best classical algorithms known at the time, is examined together with Tang's classical algorithm for recommendation systems, which runs only polynomially slower than the quantum algorithm. The speedup in the quantum algorithm was achieved by assuming that the algorithm had quantum access to the data structure and that the mapping to the quantum state was performed in polylog(mn). The speedup in the classical algorithm was attained by assuming that sampling could be performed in O(log n) and O(log mn) for vectors and matrices, respectively.
Keywords: Quantum Machine Learning Algorithm; Classical Machine Learning Algorithm; Speedup; Recommendation Systems; Other electrical engineering and electronics
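The O(log n) sampling assumption mentioned in the abstract above can be met by a binary tree of partial sums of squared entries, from which an index i is drawn with probability v_i^2 / ||v||^2. The abstract does not say which structure the thesis uses, so the class below (`SampleTree`, a name made up for this sketch) is only one plausible way to satisfy the assumption.

```python
import random


class SampleTree:
    """Binary tree over squared entries; the root stores the squared norm."""

    def __init__(self, values):
        self.size = 1
        while self.size < len(values):
            self.size *= 2  # pad to a power of two with zero-weight leaves
        self.tree = [0.0] * (2 * self.size)
        for i, v in enumerate(values):
            self.tree[self.size + i] = v * v
        for node in range(self.size - 1, 0, -1):
            self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]

    def update(self, i, value):
        """Change entry i and refresh the sums on the path to the root: O(log n)."""
        node = self.size + i
        self.tree[node] = value * value
        node //= 2
        while node >= 1:
            self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]
            node //= 2

    def sample(self, rng=random):
        """Walk from the root to a leaf, choosing each child in proportion to its sum."""
        if self.tree[1] == 0.0:
            raise ValueError("cannot sample from an all-zero vector")
        node = 1
        while node < self.size:
            left = self.tree[2 * node]
            if rng.random() * self.tree[node] < left:
                node = 2 * node
            else:
                node = 2 * node + 1
        return node - self.size


# Hypothetical vector: index 2 should be drawn with probability 16/25.
vector_tree = SampleTree([3.0, 0.0, 4.0])
rng = random.Random(0)
draws = [vector_tree.sample(rng) for _ in range(10_000)]
print("fraction of draws equal to 2:", draws.count(2) / len(draws))
```

Both `update` and `sample` touch one node per tree level, which is where the logarithmic cost comes from; a matrix version would keep one such tree per row plus a tree over the row norms.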
28	Parallel Processing of Reactive Transport Models Using OpenMP
McLaughlin, Jared D.
20 March 2008
Transport codes are beginning to be parallelized in order to allow more complex add-ons, such as geochemical packages, to utilize finer, more accurate grids, and to reduce solution times, making stochastic and Monte Carlo simulations more feasible. Most codes parallelized via MPI (message passing interface) offer good results, but require the development of a new parallel code. OpenMP, the shared-memory standard, offers incremental parallelization, allowing sequential codes to remain relatively intact with minimal changes or additions. OpenMP allows speedup to be seen on personal computers with two or more processors, unlike some other parallelization approaches that require a supercomputer. An operator-split strategy creates an environment for easy parallelization by decoupling the transport and the reactions of species. The transport, when decoupled from the reactions, depends on surrounding nodes and not on other species. Therefore, the transport of each species can be solved on a different processor. The reactions, when decoupled from the transport, depend on the concentrations of the other species and not on the surrounding nodes, allowing the concentrations of all species to be solved at a given node as if in a batch reactor. This allows parallelization over the nodes. Two codes are parallelized in this work. The first is a 100-species 1D theoretical problem. The second is RT3D, a modular computer code for simulating reactive multi-species transport in 3-dimensional groundwater systems, written and developed by Dr. T. Prabhakar Clement. RT3D is a sub-component of a parent code, MT3DMS, which utilizes RT3D to solve reaction terms. A speedup factor of 3.91 is seen on four processors, corresponding to a processor efficiency of approximately 98% for the time spent in RT3D itself.
Keywords: reactive transport; OpenMP; RT3D; parallel; speedup; shared-memory; multiprocessing; modeling; TVD; flux limiters; operator-split; advection; dispersion; Amdahl; Civil and Environmental Engineering
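The operator-split idea described in the abstract above can be sketched as two passes over the same concentration field: a transport pass that is independent across species, and a reaction pass that is independent across nodes. The code below is a toy illustration under stated assumptions, not RT3D: the upwind advection scheme, the single first-order reaction A -> B, and all names and parameter values are invented for the example. The thread pool only makes the independence structure visible; in CPython it will not reproduce the OpenMP speedups reported in the thesis.

```python
from concurrent.futures import ThreadPoolExecutor


def transport_species(profile, courant=0.5):
    """Upwind advection of ONE species along a 1D grid.

    Each node needs only its upstream neighbour of the same species,
    so different species can be handled by different workers.
    """
    new = profile[:]  # node 0 is a fixed inlet concentration
    for i in range(1, len(profile)):
        new[i] = profile[i] - courant * (profile[i] - profile[i - 1])
    return new


def react_node(concs, k=0.1, dt=1.0):
    """Batch-reactor step at ONE node: first-order conversion A -> B.

    Only the species concentrations at this node are used,
    so different nodes can be handled by different workers.
    """
    a, b = concs
    converted = k * dt * a
    return [a - converted, b + converted]


def operator_split_step(field, pool):
    """field[s][i] is the concentration of species s at node i."""
    field = list(pool.map(transport_species, field))   # parallel over species
    by_node = [list(column) for column in zip(*field)]  # regroup per node
    by_node = list(pool.map(react_node, by_node))       # parallel over nodes
    return [list(row) for row in zip(*by_node)]         # back to per species


# Hypothetical initial state: species A enters at the inlet, species B is absent.
field = [[1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0]]
with ThreadPoolExecutor() as pool:
    field = operator_split_step(field, pool)
print(field)
```

The regrouping between the two passes is where a real code pays for the split, since the data layout that suits transport (per species) differs from the one that suits reactions (per node).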
29	Detecting quantum speedup for random walks with artificial neural networks / Att upptäcka kvantacceleration för slumpvandringar med artificiella neuronnät
Linn, Hanna
January 2020
Random walks on graphs are an essential basis for important algorithms that solve problems such as the boolean satisfiability problem. A speedup of random walks could improve these algorithms. The quantum version of the random walk, the quantum walk, is faster than the random walk in specific cases, e.g., on some linear graphs. An analysis of when the quantum walk is faster than the random walk can be carried out analytically or by simulating both walks on the graph. The problem arises when the graphs grow in size and connectivity. There are no known general rules for what an arbitrary graph without explicit symmetries should exhibit to favor the quantum walk. Simulations only answer the question for a single case and do not provide any general rules for the properties the graph should have. Artificial neural networks (ANNs) have been used before as an aid for detecting when the quantum walk is faster on average than the random walk on graphs, going from an initial node to a target node. The quantum speedup may not be more than polynomial if the initial state of the quantum walk is purely in the initial node of the graph. We investigate starting the quantum walk in various superposition states, with an additional auxiliary node, to possibly achieve a larger quantum speedup. We suggest different ways to add the auxiliary node and select one of these schemes for use in this thesis. The superposition states examined are two stabilizer states and two magic states, inspired by the Gottesman-Knill theorem.
According to this theorem, starting a quantum algorithm in a magic state may give an exponential speedup, but starting in a stabilizer state cannot, given that only gates from the Clifford group are used in the algorithm and that measurements are performed in the Pauli basis. We show that it is possible to train an ANN to classify graphs according to which quantum walk was fastest for various initial states of the quantum walk. The ANN classifies linear graphs and random graphs better than a random guess. We also show that a convolutional neural network (CNN) with a deeper architecture than previously proposed for the task classifies the graphs better than before. Our findings pave the way for automated research in novel quantum walk-based algorithms.
Keywords: Quantum machine learning; convolutional neural networks; magic state; random walk; quantum walk; quantum speedup; Computer and Information Sciences
30	Netlist Security Algorithm Acceleration Using OpenCL on FPGAs
Pelini, Nicholas Michael
28 August 2017
No description available.
Keywords: Computer Engineering; Electrical Engineering; OpenCL; netlist; FPGA; DFF; verification; security; Python; gate; ctypes; fan in; fan out; flatten; hash; integrated circuit; acceleration; speedup
