Global ETD Search

1	High-Performance Matrix Multiplication: Hierarchical Data Structures, Optimized Kernel Routines, and Qualitative Performance Modeling
Wu, Wenhao, 02 August 2003
The optimal implementation of matrix multiplication on modern computer architectures is of great importance for scientific and engineering applications. However, achieving optimal performance for matrix multiplication has been continuously challenged both by the ever-widening performance gap between the processor and the memory hierarchy and by the introduction of new architectural features in modern architectures. The conventional way of dealing with these challenges relies heavily on the blocking algorithm, which improves data locality in the cache memory, and on highly tuned inner kernel routines, which in turn exploit the architectural features of the specific processor to deliver near-peak performance. A state-of-the-art improvement of the blocking algorithm is the self-tuning approach that utilizes "heroic" combinatorial optimization of parameter spaces. Other recent research approaches include one that explicitly blocks for the TLB (Translation Lookaside Buffer) and a hierarchical formulation that employs memory-friendly Morton Ordering (a space-filling curve methodology). This thesis compares and contrasts the TLB-blocking-based and Morton-Order-based methods for dense matrix multiplication, and offers a qualitative model to explain the performance behavior. Comparisons to the performance of a self-tuning library and the "vendor" library are also offered for the Alpha architecture. The practical benchmark experiments demonstrate that neither conventional blocking-based implementations nor the self-tuning libraries are optimal for achieving consistently high performance in dense matrix multiplication of relatively large square matrices.
Instead, architectural constraints and issues evidently restrict the critical path and the options available for optimal performance, so that the relatively simple strategy and framework presented in this study offer higher and flatter overall performance. Interestingly, maximal inner kernel efficiency is not a guarantee of globally minimal multiplication time. Also, efficient and flat performance is possible at all problem sizes that fit in main memory, in contrast to the "jagged" performance curves often observed in blocking and self-tuned blocking libraries.
performance tuning; matrix multiplication; hierarchical matrix storage; cache model
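The Morton Ordering mentioned in the abstract above can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `morton_index` and `to_morton_layout`, the bit width, and the choice of putting column bits in the even positions are all assumptions made for illustration. The idea is that interleaving the bits of the row and column indices places every aligned 2x2 (and 4x4, 8x8, ...) sub-block in contiguous memory.

```python
def morton_index(row: int, col: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of row and col into a Z-order index.

    Assumed convention: column bits land in even positions, row bits in odd ones.
    """
    index = 0
    for b in range(bits):
        index |= ((col >> b) & 1) << (2 * b)
        index |= ((row >> b) & 1) << (2 * b + 1)
    return index


def to_morton_layout(matrix):
    """Copy a square 2^k x 2^k list-of-lists matrix into a flat Z-ordered list."""
    n = len(matrix)
    flat = [0] * (n * n)
    for r in range(n):
        for c in range(n):
            flat[morton_index(r, c)] = matrix[r][c]
    return flat


if __name__ == "__main__":
    # Row-major 4x4 matrix holding the values 0..15.
    a = [[4 * r + c for c in range(4)] for r in range(4)]
    # The first four entries of the result form the top-left 2x2 block.
    print(to_morton_layout(a))
```

With this layout a recursive multiplication can hand each quadrant to a sub-call as one contiguous slice, which is the locality property the hierarchical formulation relies on.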
2	Application of advanced diagonalization methods to quantum spin systems.
Wang, Jieyu, 13 May 2014
Quantum spin models play an important role in theoretical condensed matter physics and quantum information theory. One numerical technique that is frequently used in studies of quantum spin systems is exact diagonalization. In this approach, numerical methods are used to find the lowest eigenvalues and associated eigenvectors of the Hamiltonian matrix of the quantum system. The computational problem is thus to determine the lowest eigenpairs of an extremely large, sparse matrix. Although many sophisticated iterative techniques for determining a small number of lowest eigenpairs can be found in the literature, most exact diagonalization studies of quantum spin systems have employed the Lanczos algorithm. In contrast, other methods have been applied very successfully to the similar problem of electronic structure calculations. The well-known VASP code, for example, uses a Block Davidson method as well as the residual minimization method with direct inversion in the iterative subspace (RMM-DIIS). The Davidson algorithm is closely related to the Lanczos method but usually needs fewer iterations. The RMM-DIIS method was originally proposed by Pulay and later modified by Wood and Zunger. It is particularly interesting if more than one eigenpair is sought, since it does not require orthogonalization of the trial vectors at each step. In this work I study the efficiency of the Lanczos, Block Davidson and RMM-DIIS methods when applied to basic quantum spin models such as the spin-1/2 Heisenberg chain, ladder and dimerized ladder. I have implemented all three methods and am currently applying them to the different models. In my presentation I will compare the three algorithms based on the number of iterations needed to achieve convergence and the required computational time.
Intel's Many Integrated Core architecture, in the form of an Intel Xeon Phi 5110P coprocessor that integrates 60 cores with 4 hardware threads per core, was used for the RMM-DIIS method, and the achieved parallel speedups were compared with those obtained on a conventional multi-core system.
quantum spin systems; matrix storage; Davidson and block Davidson method; Lanczos method; RMM-DIIS method; diagonalization
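As a rough illustration of the Lanczos approach described in the abstract above, the following sketch computes the ground-state energy of a small open spin-1/2 Heisenberg chain. It is not the author's code: the function names, the coupling J = 1, the open boundary, the Néel start vector, and the bisection solver for the tridiagonal matrix are all assumptions chosen to keep the example dependency-free. No reorthogonalization is performed, so it is only suitable for very small systems.

```python
import math


def heisenberg_apply(vec, n_sites):
    """Return H @ vec for the open spin-1/2 Heisenberg chain (J = 1), matrix-free.

    Basis states are integers whose bit i is the spin on site i.
    """
    out = [0.0] * len(vec)
    for state, amp in enumerate(vec):
        if amp == 0.0:
            continue
        for i in range(n_sites - 1):
            if ((state >> i) & 1) == ((state >> (i + 1)) & 1):
                out[state] += 0.25 * amp          # aligned pair: S^z S^z = +1/4
            else:
                out[state] -= 0.25 * amp          # anti-aligned pair: S^z S^z = -1/4
                out[state ^ (0b11 << i)] += 0.5 * amp  # spin-flip term swaps the pair
    return out


def lanczos_tridiagonal(apply_h, start, max_steps):
    """Run plain Lanczos and return the diagonal and off-diagonal of T."""
    norm = math.sqrt(sum(x * x for x in start))
    v = [x / norm for x in start]
    v_prev = [0.0] * len(v)
    alphas, betas, beta = [], [], 0.0
    for _ in range(max_steps):
        w = apply_h(v)
        alpha = sum(wi * vi for wi, vi in zip(w, v))
        w = [wi - alpha * vi - beta * pi for wi, vi, pi in zip(w, v, v_prev)]
        alphas.append(alpha)
        beta = math.sqrt(sum(x * x for x in w))
        if beta < 1e-12:  # Krylov space exhausted
            break
        betas.append(beta)
        v_prev, v = v, [x / beta for x in w]
    return alphas, betas[: len(alphas) - 1]


def lowest_eigenvalue(alphas, betas, tol=1e-12):
    """Lowest eigenvalue of a symmetric tridiagonal matrix via Sturm-count bisection."""
    n = len(alphas)
    off = [0.0] + list(betas) + [0.0]
    lo = min(alphas[i] - abs(off[i]) - abs(off[i + 1]) for i in range(n))
    hi = max(alphas[i] + abs(off[i]) + abs(off[i + 1]) for i in range(n))

    def count_below(x):
        count, q = 0, 1.0
        for i in range(n):
            q = alphas[i] - x - (off[i] ** 2 / q if i > 0 else 0.0)
            if q == 0.0:
                q = 1e-300
            if q < 0.0:
                count += 1
        return count

    for _ in range(200):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if count_below(mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def ground_state_energy(n_sites, max_steps=50):
    """Ground-state energy estimate, starting Lanczos from the Néel state."""
    start = [0.0] * (2 ** n_sites)
    start[sum(1 << i for i in range(0, n_sites, 2))] = 1.0
    alphas, betas = lanczos_tridiagonal(
        lambda v: heisenberg_apply(v, n_sites), start, max_steps
    )
    return lowest_eigenvalue(alphas, betas)


if __name__ == "__main__":
    print(ground_state_energy(2), ground_state_energy(4))
```

For two sites the result should match the singlet energy of -3/4, and for four sites the analytic value -(3 + 2√3)/4. A production code would store only one symmetry sector and use a library eigensolver for the tridiagonal matrix.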
3	Control of E. coli in biosolids
Fane, Sarah Elizabeth, January 2016
Achieving microbial compliance levels in biosolids storage is complicated by the unpredictable increase of Escherichia coli (E. coli), which serves as an important indicator for pathogen presence risk. Meeting required microbial specifications validates sludge treatment processes and ensures that a safe product is applied to agricultural land. Controlled indicator monitoring provides confidence for farmers, retailers and the food industry, safeguarding the sludge-to-land application route. Following mechanical dewatering, biosolids products are stored before microbial compliance testing permits agricultural application. During storage, concentrations of E. coli bacteria can become elevated and prevent the product from meeting the conventional or enhanced levels of treatment outlined in The Safe Sludge Matrix guidelines. Literature research identified innate characteristics of sludge and ambient environmental parameters of storage which are factors likely to influence E. coli behaviour in stored biosolids. The research hypothesis tested whether E. coli growth and death in dewatered sewage sludge can be controlled by the modification of physical-chemical factors in the cake storage environment. Parameters including nutrient availability, temperature, moisture content and atmospheric influences were investigated through a series of laboratory-scale experiments. Controlled dewatering and the assessment of modified storage environments using traditional microbial plating and novel flow cytometry analysis have been performed. At an operational scale, pilot trials and up-scaled monitoring of the sludge storage environment have been conducted, enabling verification of laboratory results.
Understanding the dynamics of cell health within the sludge matrix in relation to nutrient availability has provided a valuable understanding of the mechanisms that may be affecting bacterial growth post-dewatering. The importance of elevated storage temperatures on E. coli death rates and results showing the benefits of a controlled atmosphere storage environment provide important considerations for utilities.
628.3
4	Delninukų energijos suvartojimo apdorojant išretintas matricas saugomas stulpeliais modeliavimas / Pocket PC energy consumption using sparse matrix storage by columns modeling
Dičpinigaitis, Petras, 28 January 2008
Every mobile device has a battery, which means that its operating time is limited, since no long-lived battery has yet been invented. The current problem is therefore how to achieve the longest possible operating time of a mobile device without recharging. The mobile device used in this work is a Pocket PC. Of all the components that consume the Pocket PC's battery energy, attention is focused entirely on the processor and memory. During the study the processor and memory were put under load and the corresponding battery parameters were monitored. The load consisted of multiplication of ordinary matrices and of sparse matrices stored by columns. Sparse matrix multiplication occupies less memory, while the processor executes more instructions than with the ordinary method, which occupies more memory. The results showed that multiplication of sparse matrices stored by columns is much more efficient than ordinary matrix multiplication. Therefore, when developing programs that need matrices, it is better to use multiplication with column-wise sparse storage, since it can shorten execution time, consume less battery and save memory.
/ A major problem today is energy consumption in battery-powered portable devices. In this work we evaluated energy consumption for a Pocket PC. We wanted to see the influence of memory and the processor on battery energy consumption. We created a program that performs ordinary matrix multiplication and sparse matrix "storage by columns" multiplication. During multiplication the program reads battery information and saves it to a file. We then examined the results and found that sparse matrix storage-by-columns multiplication is much more efficient than ordinary matrix multiplication.
Sparse matrix storage-by-columns multiplication takes less memory but more processor instructions than ordinary matrix multiplication. We suggest using the sparse matrix storage-by-columns model instead of the simple model, because it can save considerable operation time, battery resources and memory.
Informatics Engineering; Išretintos matricos; Delninukų energijos suvartojimas; Sparse matrix; Sparse matrix storage by columns model; Pocket pc energy consumption
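The "storage by columns" scheme discussed in the abstract above can be sketched as follows. This is an illustrative example, not the program written for the thesis: it assumes the common compressed sparse column (CSC) layout, and the names `to_csc` and `csc_matmul` are hypothetical. Only nonzero values are stored, together with their row indices and a pointer array marking where each column starts.

```python
def to_csc(dense):
    """Convert a list-of-lists matrix to (values, row_idx, col_ptr, n_rows)."""
    n_rows, n_cols = len(dense), len(dense[0])
    values, row_idx, col_ptr = [], [], [0]
    for j in range(n_cols):
        for i in range(n_rows):
            if dense[i][j] != 0:
                values.append(dense[i][j])
                row_idx.append(i)
        col_ptr.append(len(values))  # column j occupies values[col_ptr[j]:col_ptr[j + 1]]
    return values, row_idx, col_ptr, n_rows


def csc_matmul(csc, b):
    """Multiply a CSC matrix A by a dense matrix b, touching only A's nonzeros."""
    values, row_idx, col_ptr, n_rows = csc
    n_out = len(b[0])
    c = [[0] * n_out for _ in range(n_rows)]
    for j in range(len(col_ptr) - 1):
        for p in range(col_ptr[j], col_ptr[j + 1]):
            i, a = row_idx[p], values[p]
            for k in range(n_out):
                c[i][k] += a * b[j][k]  # A[i][j] contributes to row i of the product
    return c


if __name__ == "__main__":
    a = [[1, 0, 2],
         [0, 0, 3],
         [4, 0, 0]]
    b = [[1, 2],
         [3, 4],
         [5, 6]]
    print(to_csc(a))
    print(csc_matmul(to_csc(a), b))
```

The trade-off described in the abstract is visible here: memory scales with the number of nonzeros rather than with the full matrix size, at the cost of extra index bookkeeping per stored element.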
