21

Higher order QCD corrections to diboson production at hadron colliders

Rontsch, Raoul Horst January 2012 (has links)
Hadronic collider experiments have played a major role in particle physics phenomenology over the last few decades. Data recorded at the Tevatron at Fermilab is still of interest, and its successor, the Large Hadron Collider (LHC) at CERN, has recently announced the discovery of a particle consistent with the Standard Model Higgs boson. Hadronic colliders look set to guide the field for the next fifteen years or more, with the discovery of more particles anticipated. The discovery and detailed study of new particles rely crucially on the availability of high-precision theoretical predictions for both the signal and background processes. This requires observables to be calculated to next-to-leading order (NLO) in perturbative quantum chromodynamics (QCD). Many hadroproduction processes of interest contain multiple particles in the final state. Until recently, this caused a bottleneck in NLO QCD calculations, owing to the difficulty of calculating one-loop corrections to processes involving three or more final-state particles. Spectacular developments in on-shell methods over the last six years have made these calculations feasible, allowing highly accurate predictions for final-state observables at the Tevatron and LHC. A particular realisation of on-shell methods, generalised unitarity, is used to compute the NLO QCD cross-sections and distributions for two processes: the hadroproduction of W⁺W⁺jj and the hadroproduction of W⁺W⁻jj. The NLO corrections to both processes reduce the scale dependence of the results significantly, while having a moderate effect on the cross-sections at the central scale choice and leaving the shapes of the kinematic distributions mostly unchanged. Additionally, the gluon fusion contribution to the next-to-next-to-leading order (NNLO) QCD corrections to W⁺W⁻j production is studied. These contributions are found to be highly dependent on the kinematic cuts used. For cuts used in Higgs searches, the gluon fusion effect can be as large as the NLO scale uncertainty and should not be neglected. All of the higher-order QCD corrections increase the accuracy and reliability of theoretical predictions at hadronic colliders.
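For context on the scale-dependence claims, a worked sketch of the conventional uncertainty estimate (standard practice, not taken from the thesis): fixing a central scale μ₀, the residual dependence of a fixed-order cross-section on the renormalisation/factorisation scale is quantified by the envelope over a factor-of-two variation,

```latex
\sigma^{\mathrm{NLO}}(\mu) \;=\; \sigma^{\mathrm{LO}}(\mu)\,\bigl[\,1 + \alpha_s(\mu)\,\delta_1(\mu)\,\bigr],
\qquad
\Delta\sigma \;=\; \max_{\mu\in[\mu_0/2,\,2\mu_0]} \sigma(\mu)\;-\;\min_{\mu\in[\mu_0/2,\,2\mu_0]} \sigma(\mu),
```

so the "significant reduction of scale dependence" reported at NLO corresponds to a smaller Δσ/σ than at leading order.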
22

Une étude formelle de la théorie des calculs locaux à l'aide de l'assistant de preuve Coq / A formal study of the theory of local computations using the Coq proof assistant

Filou, Vincent 21 December 2012 (has links)
L'objectif de cette thèse est de produire un environnement permettant de raisonner formellement sur la correction de systèmes de calculs locaux, ainsi que sur l'expressivité de ce modèle de calcul. Pour ce faire, nous utilisons l'assistant de preuve Coq. Notre première contribution est la formalisation en Coq de la sémantique des systèmes de réétiquetage localement engendrés, ou calculs locaux. Un système de calculs locaux est un système de réétiquetage de graphe dont la portée est limitée. Nous proposons donc tout d'abord une implantation succincte de la théorie des graphes en Coq, et utilisons cette dernière pour définir les systèmes de réétiquetage de graphes localement engendrés. Nous avons relevé, dans la définition usuelle des calculs locaux, certaines ambiguïtés. Nous proposons donc une nouvelle définition, et montrons formellement que celle-ci capture toutes les sous-classes d'algorithmes étudiées. Nous esquissons enfin une méthodologie de preuve des systèmes de calculs locaux en Coq. Notre seconde contribution consiste en l'étude formelle de l'expressivité des systèmes de calculs locaux. Nous formalisons un résultat de D. Angluin (repris par la suite par Y. Métivier et J. Chalopin) : l'inexistence d'un algorithme d'élection universelle. Nous proposons ensuite deux lemmes originaux concernant les calculs locaux sur les arêtes (ou systèmes LC0), et utilisons ceux-ci pour produire des preuves formelles d'impossibilité pour plusieurs problèmes : calcul du degré de chaque sommet, calcul d'arbre recouvrant, et élection. Nous proposons informellement une nouvelle classe de graphes pour laquelle l'élection est irréalisable par des calculs locaux sur les arêtes. Nous étudions ensuite les transformations de systèmes de calculs locaux et de leurs preuves. Nous adaptons le concept de Forward Simulation de N. Lynch aux systèmes de calculs locaux et utilisons ce dernier pour démontrer formellement l'inclusion de deux modes de détection de terminaison dans le cas des systèmes LC0. La preuve de cette inclusion est simplifiée par l'utilisation de transformations « standards » de systèmes, pour lesquelles des résultats génériques ont été démontrés. Finalement, nous réutilisons ces transformations standards pour étudier, en collaboration avec M. Tounsi, deux techniques de composition des systèmes de réétiquetage LC0. Une bibliothèque Coq d'environ 50 000 lignes, contenant les preuves formelles des théorèmes présentés dans le mémoire de thèse, a été produite en collaboration avec Pierre Castéran (dont environ 40 % produit en propre par V. Filou) au cours de cette thèse. / The goal of this work is to build a framework allowing the study, in a formal setting, of the correctness of local computation systems as well as the expressivity of this model. A local computation system is a set of graph relabelling rules with limited scope, corresponding to a class of distributed algorithms. Our first contribution is the formalisation, in the Coq proof assistant, of a relational semantics for local computation systems. This work is based on an original formal graph theory for Coq. Ambiguities inherent to a "pen and paper" definition of local computations are corrected, and we prove that our definition captures all sub-classes of relabelling relations studied in the remainder. We propose a draft of a proof methodology for local computation systems in Coq. Our second contribution is the study of the expressivity of classes of local computations inside our framework. We provide, for instance, a formal proof of D. Angluin's results on election and graph coverings. We propose original "meta-theorems" concerning the LC0 class of local computations, and use these theorems to produce formal impossibility proofs. Finally, we study possible transformations of local computation systems and of their proofs. To this end, we adapt the notion of Forward Simulation, originally formulated by N. Lynch, to local computations. We use this notion to define certified transformations of LC0 systems. We show how these certified transformations can be used to study the expressivity of certain classes of algorithms in our framework. We define, as certified transformations, two notions of composition for LC0 systems. A Coq library of roughly 50,000 lines of code, containing the formal proofs of the theorems presented in the thesis, has been produced in collaboration with Pierre Castéran.
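The record does not reproduce the Coq definitions, so the following is a minimal illustrative sketch (names and rule format are ours, not the thesis' library): one step of an LC0-style edge relabelling system, where a rule sees only the labels of the two endpoints of a single edge and rewrites them.

```python
# Minimal sketch of an LC0-style edge relabelling step (illustrative only;
# names and the rule encoding are hypothetical, not the thesis' Coq code).
import random

# A rule maps the pair of endpoint labels of one edge to new labels;
# its scope is limited to that single edge (the "locally generated" part).
def lc0_step(labels, edges, rule):
    """Apply the rule to one randomly chosen applicable edge.

    labels: dict vertex -> label; edges: list of (u, v) pairs.
    Returns True if a rewrite happened, False if the system is stuck."""
    applicable = [(u, v) for (u, v) in edges if (labels[u], labels[v]) in rule]
    if not applicable:
        return False
    u, v = random.choice(applicable)
    labels[u], labels[v] = rule[(labels[u], labels[v])]
    return True

# Toy flooding on a triangle: an 'I' (in-tree) vertex captures an
# adjacent 'O' (outside) vertex, as in a spanning-tree construction.
edges = [(0, 1), (1, 2), (0, 2)]
labels = {0: "I", 1: "O", 2: "O"}
rule = {("I", "O"): ("I", "I"), ("O", "I"): ("I", "I")}
while lc0_step(labels, edges, rule):
    pass
print(labels)  # every vertex ends up labelled "I"
```

The impossibility results mentioned above say that, for some problems (e.g. election on certain graph classes), no such rule set can succeed on every execution.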
23

Performance optimization of geophysics stencils on HPC architectures / Optimização de desempenho de estênceis geofísicos sobre arquiteturas HPC

Abaunza, Víctor Eduardo Martínez January 2018 (has links)
A simulação de propagação de onda é uma ferramenta crucial na pesquisa de geofísica (para análise eficiente dos terremotos, mitigação de riscos e a exploração de petróleo e gás). Devido à sua simplicidade e sua eficiência numérica, o método de diferenças finitas é uma das técnicas implementadas para resolver as equações da propagação das ondas. Estas aplicações são conhecidas como estênceis porque consistem num padrão que replica a mesma computação num domínio multidimensional de dados. A Computação de Alto Desempenho é requerida para solucionar este tipo de problemas, como consequência do grande número de pontos envolvidos nas simulações tridimensionais do subsolo. A otimização do desempenho dos estênceis é um desafio e depende da arquitetura usada. Neste contexto, focamos nosso trabalho em duas partes. Primeiro, desenvolvemos nossa pesquisa nas arquiteturas multicore; analisamos a implementação padrão em OpenMP dos modelos numéricos da transferência de calor (um estêncil Jacobi de 7 pontos) e o aplicativo Ondes3D (um simulador sísmico desenvolvido pelo Bureau de Recherches Géologiques et Minières); usamos dois algoritmos conhecidos (nativo e bloqueio espacial) para encontrar correlações entre os parâmetros da configuração de entrada, na execução, e o desempenho computacional; depois, propusemos um modelo baseado no Aprendizado de Máquina para avaliar, predizer e melhorar o desempenho dos modelos estênceis na arquitetura usada; também usamos um modelo de propagação da onda acústica fornecido pela empresa Petrobras; e predizemos o desempenho com uma alta precisão (até 99%) nas arquiteturas multicore. Segundo, orientamos nossa pesquisa nas arquiteturas heterogêneas; analisamos uma implementação padrão do modelo de propagação de ondas em CUDA, para encontrar os fatores que afetam o desempenho quando o número de aceleradores é aumentado; então, propusemos uma implementação baseada em tarefas para melhorar o desempenho, de acordo com um conjunto de configurações no tempo de execução (algoritmo de escalonamento, tamanho e número de tarefas), e comparamos o desempenho obtido com as versões só CPU ou só GPU e o impacto no desempenho das arquiteturas heterogêneas; nossos resultados demonstram um speedup significativo (até 25×) em comparação com a melhor implementação disponível para arquiteturas multicore. / Wave modeling is a crucial tool in geophysics, for efficient strong motion analysis, risk mitigation and oil & gas exploration. Due to its simplicity and numerical efficiency, the finite-difference method is one of the standard techniques implemented to solve the wave propagation equations. These applications are known as stencils, because they consist of a pattern that replicates the same computation over a multi-dimensional data domain. High Performance Computing is required to solve this class of problems, owing to the large number of grid points involved in three-dimensional simulations of the underground. The performance optimization of stencil computations is a challenge and strongly depends on the underlying architecture. In this context, this work was directed toward a twofold aim. Firstly, we led our research on multicore architectures: we analyzed the standard OpenMP implementation of numerical kernels from the 3D heat transfer model (a 7-point Jacobi stencil) and the Ondes3D code (a full-fledged application developed by the French Geological Survey). We considered two well-known implementations (naïve, and space blocking) to find correlations between parameters from the input configuration at runtime and the computing performance; we then proposed a Machine Learning-based approach to evaluate, predict, and improve the performance of these stencil models on the underlying architecture. We also used an acoustic wave propagation model provided by the Petrobras company and predicted the performance with high accuracy on multicore architectures. Secondly, we oriented our research toward heterogeneous architectures: we analyzed the standard CUDA implementation of the seismic wave propagation model to find which factors affect the performance; we then proposed a task-based implementation to improve the performance, according to the runtime configuration set (scheduling algorithm, size, and number of tasks), and we compared the performance obtained with the classical CPU-only or GPU-only versions against the results obtained on heterogeneous architectures.
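The record names a 7-point Jacobi heat stencil; a minimal sketch of that kernel shape (an illustrative NumPy version, not the thesis' OpenMP code) makes the access pattern concrete. The space-blocked variant the record mentions would tile these sweeps for cache reuse.

```python
# Illustrative 7-point 3D Jacobi stencil (heat-equation relaxation sweep);
# a sketch of the kernel shape, not the thesis' OpenMP implementation.
import numpy as np

def jacobi7(u):
    """One Jacobi sweep: each interior point becomes the average of itself
    and its six axis-aligned neighbours."""
    v = u.copy()
    v[1:-1, 1:-1, 1:-1] = (u[1:-1, 1:-1, 1:-1] +
                           u[:-2, 1:-1, 1:-1] + u[2:, 1:-1, 1:-1] +
                           u[1:-1, :-2, 1:-1] + u[1:-1, 2:, 1:-1] +
                           u[1:-1, 1:-1, :-2] + u[1:-1, 1:-1, 2:]) / 7.0
    return v

u = np.zeros((64, 64, 64))
u[32, 32, 32] = 1.0            # point heat source
for _ in range(10):
    u = jacobi7(u)             # heat diffuses outward sweep by sweep
```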
24

Molecular Computations for the Stabilization of Therapeutic Proteins

Trout, Bernhardt L. 01 1900 (has links)
Molecular computations based on quantum mechanics and statistical mechanics have been applied to the understanding and quantification of processes leading to the degradation of therapeutic proteins. In particular, we focus on oxidation and aggregation. Specifically, two reactions, the hydrogen transfer of hydrogen peroxide to form water oxide and the oxidation of dimethyl sulfide (DMS) by hydrogen peroxide to form dimethyl sulfoxide, were studied as models of these processes in general. Reaction barriers for the hydrogen transfer of H₂O₂ are on average 10 kcal/mol or more higher than those for the oxidation of DMS. Therefore, a two-step oxidation mechanism, in which the transfer of a hydrogen atom occurs first to form water oxide and the transfer of oxygen to the substrate occurs as the second step, is unlikely to be correct. Our proposed oxidation mechanism does not suggest a pH dependence of the oxidation rate within a moderate range around neutral pH (i.e. under conditions in which hydronium and hydroxide ions do not participate directly in the reaction), and it agrees with experimental observations over moderate pH values. In the field of aggregation, we have developed a relatively simple approach for computing the change in chemical potential of a protein upon addition of an excipient (cosolute) to the protein solution. We have also developed a general approach to the design of excipients to prevent aggregation and are currently testing it experimentally. / Singapore-MIT Alliance (SMA)
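To make the 10 kcal/mol figure concrete, a back-of-the-envelope estimate (ours, not from the record) via Arrhenius/transition-state scaling at room temperature, where RT ≈ 0.593 kcal/mol:

```latex
\frac{k_{\text{two-step}}}{k_{\text{direct}}}
\;\approx\; e^{-\Delta\Delta E^{\ddagger}/RT}
\;=\; e^{-10/0.593}
\;\approx\; 5\times 10^{-8},
```

i.e. a barrier 10 kcal/mol higher suppresses the rate by roughly seven orders of magnitude, which is why the two-step mechanism is judged unlikely.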
25

Scientific High Performance Computing (HPC) Applications On The Azure Cloud Platform

Agarwal, Dinesh 10 May 2013 (has links)
Cloud computing is emerging as a promising platform for compute- and data-intensive scientific applications. Thanks to its on-demand elastic provisioning capabilities, cloud computing has instigated curiosity among researchers from a wide range of disciplines. However, even though many vendors have rolled out their commercial cloud infrastructures, the service offerings are usually only best-effort based, without any performance guarantees. Utilization of these resources will be questionable if they cannot meet the performance expectations of deployed applications. Additionally, the lack of familiar development tools hampers the productivity of eScience developers writing robust scientific high performance computing (HPC) applications. There are no standard frameworks currently supported by any large set of vendors offering cloud computing services; consequently, porting scientific applications among different cloud platforms is hard. Among all clouds, the emerging Azure cloud from Microsoft in particular remains a challenge for HPC program development, both due to its lack of support for traditional parallel programming models such as the Message Passing Interface (MPI) and map-reduce, and due to its evolving application programming interfaces (APIs). We have designed new frameworks and runtime environments to help HPC application developers by providing them with easy-to-use tools similar to those known from traditional parallel and distributed computing settings, such as MPI, for scientific application development on the Azure cloud platform. It is challenging to create an efficient framework for any cloud platform, including Windows Azure, as they are mostly offered to users as a black box with a set of application programming interfaces (APIs) to access various service components. The primary contributions of this Ph.D. thesis are (i) creating a generic framework for bag-of-tasks HPC applications to serve as the basic building block for application development on the Azure cloud platform, (ii) creating a set of APIs for HPC application development over the Azure cloud platform, similar to the message passing interface (MPI) of traditional parallel and distributed settings, and (iii) implementing Crayons, using the proposed APIs, as the first end-to-end parallel scientific application to parallelize the fundamental GIS operations.
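As an illustration of the bag-of-tasks pattern the thesis builds on (a generic sketch in which a local thread-safe queue stands in for Azure's queue service; all names are ours, not the thesis' APIs):

```python
# Generic bag-of-tasks skeleton (illustrative; a local queue.Queue stands in
# for the cloud queue service a framework like the thesis' would use).
import queue, threading

def worker(tasks, results):
    # Each worker repeatedly pulls an independent task and records its result.
    # Workers never communicate, which is what makes bag-of-tasks workloads
    # trivially elastic across cloud nodes.
    while True:
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            return
        results.append(task())   # list.append is atomic under the GIL
        tasks.task_done()

tasks = queue.Queue()
for i in range(100):
    tasks.put(lambda i=i: i * i)   # stand-in for an independent work unit

results = []
threads = [threading.Thread(target=worker, args=(tasks, results))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(results))  # 100
```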
26

Improved Spectral Calculations for Discrete Schroedinger Operators

Puelz, Charles 16 September 2013 (has links)
This work details an O(n^2) algorithm for computing the spectra of discrete Schroedinger operators with periodic potentials. Spectra of these objects enhance our understanding of fundamental aperiodic physical systems and contain rich theoretical structure of interest to the mathematical community. Previous work on the Harper model led to an O(n^2) algorithm relying on properties not satisfied by other aperiodic operators. Physicists working with the Fibonacci Hamiltonian, a popular quasicrystal model, have instead used a problematic dynamical map approach or a sluggish O(n^3) procedure for their calculations. The algorithm presented in this work, a blend of well-established eigenvalue/vector algorithms, provides researchers with a more robust computational tool of general utility. Application to the Fibonacci Hamiltonian in the sparsely studied intermediate coupling regime reveals structure in canonical coverings of the spectrum that will prove useful in motivating conjectures regarding band combinatorics and fractal dimensions.
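As a concrete (and deliberately naive) reference point, not the blended algorithm of the thesis: the operator acts on sequences as (Hψ)(n) = ψ(n+1) + ψ(n−1) + λ·v(n)·ψ(n), so a finite section with a Fibonacci-word potential is a symmetric tridiagonal matrix whose eigenvalues a banded solver computes efficiently.

```python
# Finite-section spectrum of a discrete Schroedinger operator with a
# Fibonacci potential (an illustrative sketch, not the thesis' algorithm).
import numpy as np
from scipy.linalg import eigvals_banded

def fibonacci_word(n):
    """First n letters of the Fibonacci substitution sequence (0->01, 1->0)."""
    w = [0]
    while len(w) < n:
        w = [x for a in w for x in ([0, 1] if a == 0 else [0])]
    return np.array(w[:n])

n, lam = 987, 2.0                  # 987 is a Fibonacci number; lam = coupling
v = fibonacci_word(n)

# Symmetric tridiagonal H: off-diagonals 1, diagonal lam * v_n, stored in
# upper banded form so the banded eigensolver exploits the structure.
band = np.zeros((2, n))
band[0, 1:] = 1.0                  # superdiagonal
band[1, :] = lam * v               # diagonal
spectrum = eigvals_banded(band)    # all n eigenvalues
print(spectrum.min(), spectrum.max())
```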
27

Numerical Simulations Of Axisymmetric Near Wakes At High Reynolds Numbers

Devi, Ravindra G 08 1900 (has links)
The flow past the needle of a Pelton turbine injector is an axisymmetric wake embedded in a round jet. The wake does not fully relax to yield a uniform-velocity jet, owing to the short distance between the injector and the Pelton wheel buckets, and this non-uniformity affects the turbine efficiency. To minimize the non-uniformity, it is essential to predict the near wake accurately. While far-field wakes are well described by analytical expressions and well predicted by CFD codes, the quality of predictions of axisymmetric near wakes is not known. It is of practical interest to establish the applicability bounds of the Reynolds-Averaged Navier-Stokes (RANS) models, which are commonly used in industry, for axisymmetric near wakes, both for this specific problem and in general. Understanding of the near wake is also crucial in various aerospace applications. For example, the details of the aerodynamics of the near wake are crucial for the stabilization of a flame: the size of the recirculation zone affects the rate of production of hot burnt products, and the mixing between the products and reactants is governed by the turbulence in the free shear layers. Wakes behind two-dimensional bodies such as wedges and circular and square cylinders have been extensively studied at different Reynolds numbers (Re); this is not the case for three-dimensional axisymmetric bodies such as spheres, ellipsoids and disks. The most commonly investigated axisymmetric body is the sphere. The flow past a sphere is typically characterized in three regimes: subcritical, critical and supercritical. In the subcritical regime, Re < 3×10⁵, the boundary layer separation is laminar. The critical regime, Re ≈ 3×10⁵, is where the boundary layer transitions to turbulent and then separates, resulting in a sudden drag reduction; the critical Re may vary depending on flow conditions such as turbulence intensities and sphere surface variations. In the supercritical regime, Re > 3×10⁵, the boundary layer is turbulent before separation and the drag starts increasing beyond the critical value. Though the geometry and flow conditions are simple, the flow features involved are complex, especially the laminar-to-turbulent boundary layer transition and the high-speed transient vortex shedding. Experimentally it has been observed that the vortex shedding location changes randomly and perhaps rotates. All these features pose a significant challenge for experimental measurements as well as numerical modeling; thus most experimental measurements have been made below Re = 10³. Also, the data are measured over the sphere surface (e.g. skin friction and pressure), but almost no data are available in the near wake. Similarly, numerical investigations are primarily in the subcritical regime. DNS has been used for low Re, up to 800. RANS has been used in the subcritical regime at Re = 10⁴. For higher Re, LES and DES have been used, but they are computationally intensive. No numerical work has been reported for an ellipsoid at zero angle of attack; Chevray (1968) made measurements in the near wake of an ellipsoid at Re = 2.75×10⁶. Most experimental and numerical investigations of ellipsoids are at an angle of attack. Given the extensive industrial usage of RANS due to its economy, the focus of this work is to investigate the applicability of these models for flow prediction in the near wake in the supercritical regime. Simulations are performed using the commercial code CFX.
The code is validated against well-established results for laminar and turbulent boundary layer flow over a flat plate. Sufficient agreement is obtained for laminar flow past a sphere against measured quantities such as separation location, separation bubble length and drag coefficient. The changes in wake structure, as a function of Re, are validated against experimental observations: the wake is steady and axisymmetric up to Re = 200; from Re = 200 to 270 it remains steady and loses axisymmetry but retains planar symmetry; beyond Re = 290 the wake becomes unsteady owing to an unstable recirculation bubble, which leads to vortex shedding while still retaining planar symmetry. The formation of typical horseshoe vortices is observed. Before the simulations in the supercritical regime, the low-Re k-ε model is validated in the subcritical regime at Re = 10⁴ against measurements of skin friction, pressure coefficient and average drag coefficient. Very distinct wake fluctuations are observed, and the low-mode Strouhal number (St) agrees with past measurements. Vortex sheet fluctuations are observed, but the high-mode St calculation is based on a crude measurement of the fluctuations. At Re = 7.8×10⁴ the trends in the drag, skin friction coefficient and pressure coefficient are in the expected direction when compared with the data at Re = 10⁴. However, the near-wake velocity data do not match the measurements either qualitatively or quantitatively; the velocities in the present work are qualitatively justified based on the flow directions in the recirculation bubble. Various RANS models, namely the k-ε, k-ω and Reynolds stress models, are used to predict the flow past a sphere and an ellipsoid in the supercritical regime. The results for the sphere are compared against the measurements of Achenbach at Re = 1.14×10⁶, and those for the ellipsoid against the measurements of Chevray at Re = 2.75×10⁶. Four turbulence models are used: high-Re k-ε, high-Re k-ω, low-Re k-ε and low-Re RSM. All the models over-predict skin friction, owing to the simplistic treatment of the boundary layer: it is treated as fully turbulent, whereas in the experiments it transitions from laminar to turbulent. The k-ω model, being a high-Re model, does not capture the near-wall flow and hence predicts an almost steady wake; it over-predicts the drag and skin friction and results in delayed separation. However, it shows the vortex sheet roll-up and release mechanism prominently, in agreement with the experiments of Taneda; in all other models this mechanism is seen only intermittently, and the wake is unsteady. Owing to the highly random wake orientation, the low-mode St is not calculated; the RSM shows certain consistency, and the St based on it is 0.24. All models show vortex sheet fluctuations of almost equal magnitude and frequency, and the high-mode St based on these is about 20; a better understanding, both experimental and numerical, of the validity of this number is needed. High-frequency fluctuations appear in the time history of the streamwise drag force for all four models; the St based on this frequency is 4.32, and the origin of these fluctuations needs investigation. The RSM predicts the most accurate skin friction coefficient, pressure coefficient and drag. For the ellipsoid, two cases are computed: one without blockage (referred to as the base case) and one with 25% blockage (referred to as the blockage case), representing the typical blockage due to the Pelton injector needle.
The same models as for the sphere are evaluated. As for the sphere, the maximum drag is predicted by the k-ω model and the least by the RSM. Similarly, the skin friction is high and the separation is delayed, hence the k-ω model always predicts the smallest recirculation bubble. The differences in the form drag predictions are a direct result of the differences in upstream stagnation pressures, as there is no significant difference in the pressure curves obtained from the different models, including the rear stagnation pressure: the form drag is highest in the k-ω model and lowest in the RSM, and so are the upstream stagnation pressures. The velocities in the near wake are predicted well by all the models. Pressure is predicted accurately before separation, at x/D = -0.25; however, it is significantly over-predicted after separation. To validate the pressure prediction, an independent simulation is performed for an ellipsoid at an angle of attack of 10°; the pressures on the windward and leeward sides are in agreement with the measurements of Chesnakas et al. Similarly to the pressure prediction, the turbulence intensity is predicted correctly before separation; after separation the trends agree, but the intensities are higher than the measurements by about 10%. The results are not sensitive to the inlet intensity levels except in the far field; the dissipation of the intensities is under-predicted in the simulations. The results from the blockage case show trends similar to the base case. In the near wake, the generation of turbulent kinetic energy is higher and its decay slower in the k-ε and RSM models compared with the k-ω model; this in turn results in higher eddy viscosity and higher velocities in the near wake for these models. Considering overall prediction accuracy, the RSM predicts the drag, St and separation location most accurately; it is important to predict the separation accurately for valid downstream results. For cases with mild separation, such as the ellipsoid, there is no significant difference in the velocities; however, the pressure and drag predictions from the RSM are closer to the experiments. The RSM is more suitable for both the sphere and the ellipsoid at high Re. Validation of mean velocities and intensities in the near wake is needed to further support the choice of model.
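The Strouhal numbers quoted above come from dominant frequencies in force or velocity time histories; a generic post-processing sketch (ours, not the thesis' procedure) of extracting St = fD/U from a drag signal:

```python
# Illustrative extraction of a Strouhal number from a drag time history
# via FFT (generic post-processing, not the thesis' method).
import numpy as np

D, U, dt = 1.0, 1.0, 1e-3           # diameter, freestream speed, time step
t = np.arange(0.0, 100.0, dt)
# Synthetic drag coefficient standing in for a CFD force history:
cd = 0.4 + 0.05 * np.sin(2 * np.pi * 0.24 * t) + 0.01 * np.random.randn(t.size)

spec = np.abs(np.fft.rfft(cd - cd.mean()))   # remove the mean (DC) component
freqs = np.fft.rfftfreq(t.size, dt)
f_peak = freqs[spec.argmax()]                # dominant shedding frequency
print("St =", f_peak * D / U)                # ~0.24 for this synthetic signal
```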
28

On characteristics of stable boundary layer flow fields and their influence on wind turbine loads

Park, Jinkyoo 30 September 2011 (has links)
Fourier-based stochastic simulation of wind fields, commonly used in wind turbine load computations, is unable to account for contrasting states of atmospheric stability. Flow fields in the stable boundary layer (SBL), for instance, have characteristics such as enhanced wind shear and veering wind direction profiles; the influence of such characteristics on utility-scale wind turbine loads has not been studied. To investigate these influences, we use large-eddy simulation (LES) to generate inflow wind fields and to estimate load statistics for a 5-MW wind turbine model. In the first part of this thesis, we describe a procedure employing LES to generate SBL wind fields for wind turbine load computations. In addition, we study how large-scale atmospheric conditions affect the characteristics of wind fields and turbine loads. In the second part, we study the contrasting characteristics of LES-SBL and stochastic NBL flow fields and their influences on wind turbine load statistics by isolating the effects of the mean wind (shear) profile and of the variation in wind direction and turbulence levels over the rotor swept area. Among large-scale atmospheric conditions, the geostrophic wind speed and surface cooling rate have the greatest influence on flow field characteristics and, thus, on wind turbine loads. Higher geostrophic winds lead to increased mean and standard deviation values of the longitudinal wind speed at hub height. Increased surface cooling rates lead to steeper shear profiles and appear also to increase the fatigue damage associated with out-of-plane blade root moments. In summary, our studies suggest that LES may be effectively used to model wind fields in the SBL, to study characteristics of turbine-scale wind fields, and to assess turbine loads for conditions that are not typically examined in design standards.
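As a worked illustration of the SBL characteristics named above (our idealisation, not the thesis' LES fields): enhanced shear is often modelled as a power law and veer as a roughly linear rotation of wind direction with height across the rotor.

```python
# Idealised shear-and-veer inflow profile across a rotor disk
# (power law + linear veer; a sketch, not LES output).
import numpy as np

z_hub, U_hub = 90.0, 8.0   # hub height [m] and hub-height wind speed [m/s]
alpha = 0.3                # shear exponent (stable conditions exceed ~0.2)
veer_rate = 0.08           # wind-direction change [deg per m of height]
R = 63.0                   # rotor radius [m], 5-MW-class turbine

z = np.linspace(z_hub - R, z_hub + R, 7)   # heights across the rotor disk
U = U_hub * (z / z_hub) ** alpha           # power-law wind speed profile
theta = veer_rate * (z - z_hub)            # veer relative to hub height

for zi, ui, ti in zip(z, U, theta):
    print(f"z = {zi:6.1f} m   U = {ui:5.2f} m/s   veer = {ti:+5.1f} deg")
```

Blade sections thus see systematically different speeds and directions over each rotation, which is the mechanism behind the fatigue-load effects studied in the thesis.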
29

Deterministic and stochastic methods for molecular simulation

Minoukadeh, Kimiya 24 November 2010 (has links) (PDF)
Molecular simulation is an essential tool in understanding complex chemical and biochemical processes, as real-life experiments prove increasingly costly or infeasible in practice. This thesis is devoted to methodological aspects of molecular simulation, with a particular focus on computing transition paths and their associated free energy profiles. The first part is dedicated to computational methods for reaction path and transition state searches on a potential energy surface. In Chapter 3 we propose an improvement to a widely used transition state search method, the Activation-Relaxation Technique (ART); we also present a local convergence study of a prototypical algorithm. The second part is dedicated to free energy computations, focusing in particular on an adaptive importance sampling technique, the Adaptive Biasing Force (ABF) method. The first contribution to this field, presented in Chapter 5, consists in demonstrating the applicability of a new parallel implementation, multiple-walker ABF (MW-ABF), to a large molecular system. Numerical experiments demonstrated the robustness of MW-ABF against artefacts arising from poorly chosen or oversimplified reaction coordinates. These numerical findings inspired a new study of the long-time convergence of the ABF method, presented in Chapter 6: by studying a slightly modified model, we support our numerical results by proving a faster theoretical rate of convergence for ABF than previously established.
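A minimal caricature of the ABF idea (ours, not the thesis' implementation): along a binned reaction coordinate, the running average of the instantaneous force is accumulated and subtracted as a bias, so the dynamics become nearly diffusive and the free-energy profile is recovered by integrating the estimated mean force.

```python
# Minimal 1D caricature of Adaptive Biasing Force (ABF) on a double-well
# potential V(x) = (x^2 - 1)^2 (an illustrative sketch only).
import numpy as np

rng = np.random.default_rng(0)
beta, dt, nbins = 1.0, 1e-3, 50
edges = np.linspace(-2.0, 2.0, nbins + 1)
force_sum = np.zeros(nbins)        # accumulated instantaneous force per bin
counts = np.zeros(nbins)           # visits per bin

def force(x):                      # -dV/dx
    return -4.0 * x * (x**2 - 1.0)

x = -1.0
for _ in range(200_000):
    b = int(np.clip(np.searchsorted(edges, x) - 1, 0, nbins - 1))
    f = force(x)
    force_sum[b] += f
    counts[b] += 1
    mean_f = force_sum[b] / counts[b]
    # ABF: subtract the running mean force (flattening the sampled landscape),
    # then take an overdamped Langevin step under the biased force.
    x += dt * (f - mean_f) + np.sqrt(2.0 * dt / beta) * rng.standard_normal()
    x = float(np.clip(x, -2.0, 2.0))   # keep the walker in the binned range

# Free-energy profile: integrate the estimated mean force along x.
dx = edges[1] - edges[0]
F = -np.cumsum(force_sum / np.maximum(counts, 1.0)) * dx
print(F.round(2))                   # approximates the double-well shape
```

The multiple-walker variant studied in the thesis would share the `force_sum`/`counts` statistics across several such walkers.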
30

Fast and flexible hardware support for elliptic curve cryptography over multiple standard prime finite fields

Alrimeih, Hamad 29 March 2012 (has links)
Exchange of private information over a public medium must incorporate a method for data protection against unauthorized access. Elliptic curve cryptography (ECC) has become widely accepted as an efficient mechanism to secure private data using public-key protocols. Scalar multiplication (which translates into a sequence of point operations, each involving several modular arithmetic operations) is the main ECC computation, where the scalar value is secret and must be secured. In this dissertation, we consider ECC over five standard prime finite fields recommended by the National Institute of Standards and Technology (NIST), with the corresponding prime sizes of 192, 224, 256, 384, and 521 bits. This dissertation presents our general hardware-software approach and the technical details of our novel hardware processor design, aimed at accelerating scalar multiplications with flexible security-performance trade-offs. To enhance performance, our processor exploits parallelism by pipelining modular arithmetic computations and the associated input/output data transfers. To enhance security, modular arithmetic computations and associated data transfers are grouped into atomically executed computational blocks, in order to make curve point operations indistinguishable and thus mask the scalar value. The flexibility of our processor is achieved through software-controlled hardware programmability, which allows for different scenarios of computing atomic block sequences. Each scenario is characterized by a certain trade-off between the processor's security and performance. As the best trade-off scenario is specific to the user and/or application requirements, our approach allows such a scenario to be chosen dynamically by the system software, thus facilitating system adaptation to dynamically changing requirements. Since modular multiplication is the most critical low-level operation in ECC computations, we also propose a novel modular multiplier specifically optimized to take full advantage of the fast reduction algorithms associated with the five NIST primes. The proposed architecture has been prototyped on a Xilinx Virtex-6 FPGA and takes between 0.30 ms and 3.91 ms to perform a typical scalar multiplication. Such performance figures demonstrate both the flexibility and the efficiency of our proposed design and compare favourably against other systems reported in the literature.
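The record's security goal, making point operations indistinguishable, is commonly met with a fixed per-bit operation pattern. A minimal software sketch (ours, not the dissertation's atomic-block hardware design) using the Montgomery ladder, shown here over a toy curve rather than a NIST field:

```python
# Montgomery-ladder scalar multiplication: every scalar bit triggers exactly
# one point addition and one doubling, so the operation sequence does not
# depend on the secret bits -- the same goal as the dissertation's atomically
# executed blocks. Toy curve y^2 = x^3 + 2x + 3 over F_97 for illustration;
# a real implementation would work over the five NIST primes.
def point_add(P, Q, a, p):
    """Affine short-Weierstrass addition; None is the point at infinity.
    Uses pow(x, -1, p) for modular inverses (Python 3.8+)."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                       # P + (-P) = infinity
    if P == Q:
        m = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p
    else:
        m = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)

def scalar_mult(k, P, a, p):
    """Montgomery ladder: the invariant R1 - R0 == P holds at every step."""
    R0, R1 = None, P
    for bit in bin(k)[2:]:
        if bit == "0":
            R0, R1 = point_add(R0, R0, a, p), point_add(R0, R1, a, p)
        else:
            R0, R1 = point_add(R0, R1, a, p), point_add(R1, R1, a, p)
    return R0

p, a = 97, 2
P = (3, 6)                 # on the curve: 6^2 = 3^3 + 2*3 + 3 (mod 97)
print(scalar_mult(13, P, a, p))
```

In the dissertation's hardware setting the same idea is pushed further: the field operations inside each branch are themselves scheduled into identical atomic blocks, and the fast NIST-prime reduction replaces generic modular arithmetic.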
