351

Suitability of Java for Solving Large Sparse Positive Definite Systems of Equations Using Direct Methods

Armstrong, Shea January 2004
The purpose of the thesis is to determine whether Java, a programming language that evolved out of a research project by Sun Microsystems in 1990, is suitable for solving large sparse linear systems using direct methods. That is, can a Java implementation achieve performance comparable to Fortran, the language traditionally used for sparse matrix computation? Performance evaluation criteria include execution speed and memory requirements. A secondary criterion is ease of development. Many attractive features, unique to the Java programming language, make it desirable for use in sparse matrix computation and provide the motivation for the thesis. The 'write once, run anywhere' proposition, coupled with nearly ubiquitous Java support, alleviates the need to rewrite programs in the event of a hardware change. Features such as garbage collection (automatic recycling of memory) and array-index bounds checking make Java programs more robust than those written in Fortran. Java has garnered a poor reputation as a high-performance computing platform, largely attributable to poor performance relative to Fortran in its early years. There is now a consensus among researchers that the Java language itself is not the problem, but rather its implementation. As such, improving compiler technology for numerical codes is critical to achieving high performance in numerical Java applications. Preliminary work involved converting SPARSPAK, a collection of Fortran 90 subroutines for solving large sparse systems of linear equations and least squares problems developed by Dr. Alan George, into Java (J-SPARSPAK). It is well known that the majority of the solution process is spent in the numeric factorization phase. Initial benchmarks showed Java performing, on average, 3.6 times slower than Fortran for this critical phase. We detail how we improved Java performance to within a factor of two of Fortran.
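For a sparse symmetric positive definite system, the numeric factorization phase referred to above is a Cholesky factorization. As a rough, self-contained illustration of the kind of floating-point kernel such benchmarks exercise, here is a minimal dense Cholesky sketch in Java; it is not J-SPARSPAK code, which operates on compressed sparse data structures.

```java
// Minimal sketch (not J-SPARSPAK code): a dense Cholesky factorization of a
// symmetric positive definite matrix, illustrating the kind of floating-point
// inner loops that dominate the numeric factorization phase benchmarked above.
public final class CholeskySketch {

    /** Overwrites the lower triangle of a with its Cholesky factor L (a = L * L^T). */
    static void choleskyFactor(double[][] a) {
        int n = a.length;
        for (int j = 0; j < n; j++) {
            double diag = a[j][j];
            for (int k = 0; k < j; k++) {
                diag -= a[j][k] * a[j][k];
            }
            if (diag <= 0.0) {
                throw new IllegalArgumentException("matrix is not positive definite");
            }
            a[j][j] = Math.sqrt(diag);
            for (int i = j + 1; i < n; i++) {
                double sum = a[i][j];
                for (int k = 0; k < j; k++) {
                    sum -= a[i][k] * a[j][k];
                }
                a[i][j] = sum / a[j][j];
            }
        }
    }

    public static void main(String[] args) {
        double[][] a = {{4, 2, 2}, {2, 5, 3}, {2, 3, 6}};
        choleskyFactor(a);
        System.out.printf("L[2][2] = %.4f%n", a[2][2]); // prints 2.0000
    }
}
```

In the sparse setting the same recurrence is applied only to nonzero entries, which is why data-structure overheads (and array bounds checks) weigh so heavily in the Java-versus-Fortran comparison.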
352

Research and development of accounting system in grid environment

Chen, Xiaoyn January 2010
The Grid has been recognised as the next-generation distributed computing paradigm, seamlessly integrating heterogeneous resources across administrative domains into a single virtual system. An increasing number of scientific and business projects employ Grid computing technologies for large-scale resource sharing and collaboration. Early adoptions of Grid computing technologies had custom middleware implemented to bridge gaps between heterogeneous computing backbones. These custom solutions form the basis of the emerging Open Grid Service Architecture (OGSA), which aims at addressing common concerns of Grid systems by defining a set of interoperable and reusable Grid services. One of the common concerns defined in OGSA is the Grid accounting service. The main objective of the Grid accounting service is to ensure that resources are shared within a Grid environment in an accountable manner, by metering and logging accurate resource usage information. This thesis discusses the origins and fundamentals of Grid computing and the accounting service in the context of the OGSA profile. A prototype was developed and evaluated, based on OGSA accounting-related standards, that enables the sharing of accounting data in a multi-Grid environment, the World-wide Large Hadron Collider Grid (WLCG). Based on this prototype and the lessons learned, a generic middleware solution was also implemented as a toolkit that eases the migration of existing accounting systems towards standards compliance.
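At the heart of such an accounting service is a usage record that meters what a job consumed so it can be logged and exchanged between Grids. The sketch below shows a minimal, hypothetical usage record in the spirit of formats such as the OGF Usage Record; the field names are illustrative assumptions and do not reproduce the actual standard schema or the thesis's prototype.

```java
import java.time.Duration;
import java.time.Instant;

// Minimal, hypothetical usage record for Grid accounting. Field names are
// illustrative only and do not reproduce any standard schema or the prototype
// described above.
public final class UsageRecordSketch {
    private final String globalJobId;
    private final String userIdentity;      // e.g. a certificate distinguished name
    private final String site;
    private final Duration cpuTime;
    private final Duration wallTime;
    private final Instant endTime;

    public UsageRecordSketch(String globalJobId, String userIdentity, String site,
                             Duration cpuTime, Duration wallTime, Instant endTime) {
        this.globalJobId = globalJobId;
        this.userIdentity = userIdentity;
        this.site = site;
        this.cpuTime = cpuTime;
        this.wallTime = wallTime;
        this.endTime = endTime;
    }

    /** Serialize to a simple line-based format suitable for logging or exchange. */
    public String toLogLine() {
        return String.join("|", globalJobId, userIdentity, site,
                Long.toString(cpuTime.getSeconds()),
                Long.toString(wallTime.getSeconds()),
                endTime.toString());
    }

    public static void main(String[] args) {
        UsageRecordSketch record = new UsageRecordSketch(
                "grid-job-42", "CN=Example User", "EXAMPLE-SITE",
                Duration.ofHours(3), Duration.ofHours(4), Instant.now());
        System.out.println(record.toLogLine());
    }
}
```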
353

High performance bioinformatics and computational biology on general-purpose graphics processing units

Ling, Cheng January 2012
Bioinformatics and Computational Biology (BCB) is a relatively new multidisciplinary field which brings together many aspects of the fields of biology, computer science, statistics, and engineering. Bioinformatics extracts useful information from biological data and makes it more intuitive and understandable by applying principles of information science, while computational biology harnesses computational approaches and technologies to answer biological questions conveniently. Recent years have seen an explosion in the size of biological data at a rate which outpaces the rate of increase in the computational power of mainstream computer technologies, namely general purpose processors (GPPs). The aim of this thesis is to explore the use of off-the-shelf Graphics Processing Unit (GPU) technology in the high performance and efficient implementation of BCB applications in order to meet the demands of biological data increases at affordable cost. The thesis presents detailed designs and implementations of GPU solutions for a number of BCB algorithms in two widely used BCB applications, namely biological sequence alignment and phylogenetic analysis. Biological sequence alignment can be used to determine the potential information about a newly discovered biological sequence from other well-known sequences through similarity comparison. Phylogenetic analysis, on the other hand, is concerned with the investigation of the evolution of and relationships among organisms, and has many uses in the fields of systems biology and comparative genomics. In molecular-based phylogenetic analysis, the relationship between species is estimated by inferring the common history of their genes, and phylogenetic trees are then constructed to illustrate evolutionary relationships among genes and organisms. However, both biological sequence alignment and phylogenetic analysis are computationally expensive applications, as their computing and memory requirements grow polynomially or even worse with the size of sequence databases. The thesis first presents a multi-threaded parallel design of the Smith-Waterman (SW) algorithm alongside an implementation on NVIDIA GPUs. A novel technique is put forward to solve the restriction on the length of the query sequence in previous GPU-based implementations of the SW algorithm. Based on this implementation, the difference between the two main task parallelization approaches (inter-task and intra-task parallelization) is presented. The resulting GPU implementation matches the speed of existing GPU implementations while providing more flexibility, i.e. flexible lengths of sequences in real-world applications. It also outperforms an equivalent GPP-based implementation by 15x-20x. After this, the thesis presents the first reported multi-threaded design and GPU implementation of the Gapped BLAST with Two-Hit method algorithm, which is widely used for aligning biological sequences heuristically. This achieved up to 3x speed-up over the most optimised GPP implementations. The thesis then presents a multi-threaded design and GPU implementation of a Neighbor-Joining (NJ)-based method for phylogenetic tree construction and multiple sequence alignment (MSA). This achieves an 8x-20x speed-up compared to an equivalent GPP implementation based on the widely used ClustalW software. The NJ method, however, only gives one possible tree, which strongly depends on the evolutionary model used.
A more advanced method uses maximum likelihood (ML) for scoring phylogenies with Markov Chain Monte Carlo (MCMC)-based Bayesian inference. The latter was the subject of another multi-threaded design and GPU implementation presented in this thesis, which achieved a 4x-8x speed-up compared to an equivalent GPP implementation based on the widely used MrBayes software. Finally, the thesis presents a general evaluation of the designs and implementations achieved in this work as a step towards the evaluation of GPU technology in BCB computing, in the context of other computer technologies including GPPs and Field Programmable Gate Array (FPGA) technology.
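Smith-Waterman fills a dynamic-programming matrix in which each cell takes the best of a diagonal match/mismatch move, a gap move, or zero; GPU implementations such as the one described here parallelize that recurrence across anti-diagonals or across many alignments. The following single-threaded Java sketch shows the basic scoring recurrence with linear gap penalties; the scoring parameters are illustrative assumptions, not those of the thesis.

```java
// Minimal sketch of Smith-Waterman local alignment scoring with linear gap
// penalties. Scoring parameters are illustrative only.
public final class SmithWatermanSketch {
    static final int MATCH = 2, MISMATCH = -1, GAP = -2;

    /** Returns the best local alignment score between sequences a and b. */
    static int score(String a, String b) {
        int[][] h = new int[a.length() + 1][b.length() + 1];
        int best = 0;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int sub = a.charAt(i - 1) == b.charAt(j - 1) ? MATCH : MISMATCH;
                int cell = Math.max(0, h[i - 1][j - 1] + sub); // diagonal move or restart
                cell = Math.max(cell, h[i - 1][j] + GAP);      // gap in b
                cell = Math.max(cell, h[i][j - 1] + GAP);      // gap in a
                h[i][j] = cell;
                best = Math.max(best, cell);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(score("ACACACTA", "AGCACACA")); // best local score
    }
}
```

Because every cell depends only on its left, upper, and upper-left neighbours, all cells on one anti-diagonal can be computed at once, which is what the inter-task and intra-task GPU parallelizations above exploit.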
354

Dissecting genetic interactions in complex traits

Hemani, Gibran January 2012
Of central importance in the dissection of the components that govern complex traits is understanding the architecture of natural genetic variation. Genetic interaction, or epistasis, constitutes one aspect of this, but epistatic analysis has been largely avoided in genome-wide association studies because of statistical and computational difficulties. This thesis explores both issues in the context of two-locus interactions. Initially, through simulation and deterministic calculations, it was demonstrated that not only can epistasis maintain deleterious mutations at intermediate frequencies when under selection, but that it may also have a role in the maintenance of additive variance. Based on the epistatic patterns that are evolutionarily persistent, and the frequencies at which they are maintained, it was shown that exhaustive two-dimensional search strategies are the most powerful approaches for uncovering both additive variance and the other genetic variance components that are co-precipitated. However, while these simulations demonstrate encouraging statistical benefits, two-dimensional searches are often computationally prohibitive, particularly with the marker densities and sample sizes that are typical of genome-wide association studies. To address this issue, different software implementations were developed to parallelise the two-dimensional triangular search grid across various types of high performance computing hardware. Of these, particularly effective was the use of the massively multi-core architecture of consumer-level graphics cards. While the performance will continue to improve as hardware improves, at the time of testing the speed was 2-3 orders of magnitude faster than CPU-based software solutions in current use. Not only does this software enable epistatic scans to be performed routinely at minimal cost, but it is now feasible to empirically explore the false discovery rates introduced by the high dimensionality of multiple testing. Through permutation analysis it was shown that the significance threshold for epistatic searches is a function of both marker density and population sample size, and that, because of the correlation structure that exists between tests, the threshold estimates currently used are overly stringent. Although the relaxed threshold estimates constitute an improvement in the power of two-dimensional searches, detection is still most likely limited to relatively large genetic effects. Through direct calculation it was shown that, in contrast to the additive case, where the decay of estimated genetic variance is proportional to falling linkage disequilibrium between causal variants and observed markers, for epistasis this decay is exponential. One way to rescue poorly captured causal variants is to parameterise association tests using haplotypes rather than single markers. A novel statistical method that uses a regularised parameter selection procedure on two-locus haplotypes was developed, and through extensive simulations it was shown to deliver a substantial gain in power over single-marker-based tests. Ultimately, this thesis seeks to demonstrate that many of the obstacles in epistatic analysis can be ameliorated, and with the current abundance of genomic data gathered by the scientific community, direct search may be a viable method for qualifying the importance of epistasis.
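The exhaustive two-dimensional search described above visits every pair of markers in a triangular grid and tests each pair for a joint effect on the phenotype. A minimal sketch of such a scan is given below; the per-pair score used here (the phenotypic sum of squares explained by the nine two-locus genotype classes) is a crude illustrative stand-in, not the epistasis test or the GPU kernel developed in the thesis.

```java
import java.util.Random;

// Minimal sketch of an exhaustive two-locus scan over the upper triangle of
// marker pairs. The per-pair score is a crude stand-in for a real epistasis test.
public final class PairScanSketch {

    /** Score for one marker pair: sum of squares explained by the 9 genotype classes (coded 0/1/2). */
    static double pairScore(int[] g1, int[] g2, double[] pheno) {
        double[] sum = new double[9];
        int[] count = new int[9];
        double total = 0.0;
        for (int n = 0; n < pheno.length; n++) {
            int cell = 3 * g1[n] + g2[n];
            sum[cell] += pheno[n];
            count[cell]++;
            total += pheno[n];
        }
        double grandMean = total / pheno.length;
        double ss = 0.0;
        for (int c = 0; c < 9; c++) {
            if (count[c] > 0) {
                double mean = sum[c] / count[c];
                ss += count[c] * (mean - grandMean) * (mean - grandMean);
            }
        }
        return ss;
    }

    public static void main(String[] args) {
        Random rng = new Random(1);
        int markers = 50, individuals = 500;
        int[][] geno = new int[markers][individuals];
        double[] pheno = new double[individuals];
        for (int m = 0; m < markers; m++)
            for (int n = 0; n < individuals; n++)
                geno[m][n] = rng.nextInt(3);
        for (int n = 0; n < individuals; n++)
            pheno[n] = rng.nextGaussian();

        double best = -1.0;
        int bestI = -1, bestJ = -1;
        for (int i = 0; i < markers; i++) {          // triangular grid: only j > i
            for (int j = i + 1; j < markers; j++) {
                double s = pairScore(geno[i], geno[j], pheno);
                if (s > best) { best = s; bestI = i; bestJ = j; }
            }
        }
        System.out.printf("best pair (%d,%d), score %.2f%n", bestI, bestJ, best);
    }
}
```

Each pair's score is independent of every other pair's, which is why the triangular grid maps so naturally onto the massively multi-core graphics hardware discussed above.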
355

Towards Design and Analysis For High-Performance and Reliable SSDs

Xia, Qianbin 01 January 2017
NAND Flash-based Solid State Disks (SSDs) have many attractive technical merits, such as low power consumption, light weight, shock resistance, the ability to sustain hotter operating regimes, and extraordinarily high performance for random read access, which make SSDs immensely popular and widely employed in different types of environments including portable devices, personal computers, large data centers, and distributed data systems. However, current SSDs still suffer from several critical inherent limitations, such as the inability to update data in place, asymmetric read and write performance, slow garbage collection processes, limited endurance, and degraded write performance with the adoption of MLC and TLC techniques. To alleviate these limitations, we propose optimizations both at the layer of specific outside applications and at the SSDs' internal layer. Because SSDs are a good compromise between performance and price, they are widely deployed as second-layer caches sitting between DRAM and hard disks to boost system performance. Due to special properties of SSDs such as the internal garbage collection processes and limited lifetime, optimizations designed for traditional cache devices like DRAM and SRAM might not work consistently for an SSD-based cache. Therefore, at the outside applications layer, our work focuses on integrating the special properties of SSDs into the optimization of SSD caches. Moreover, our work also addresses the increased Flash write latency and ECC complexity due to the adoption of MLC and TLC technologies by analyzing real-world workloads.
356

HPC scheduling in a brave new world

Gonzalo P., Rodrigo January 2017
Many breakthroughs in scientific and industrial research are supported by simulations and calculations performed on high performance computing (HPC) systems. These systems typically consist of uniform, largely parallel compute resources and high-bandwidth concurrent file systems interconnected by low-latency synchronous networks. HPC systems are managed by batch schedulers that order the execution of application jobs to maximize utilization while steering turnaround time. In the past, demands for greater capacity were met by building more powerful systems with more compute nodes, greater transistor densities, and higher processor operating frequencies. Unfortunately, the scope for further increases in processor frequency is restricted by the limitations of semiconductor technology. Instead, parallelism within processors and in numbers of compute nodes is increasing, while the capacity of single processing units remains unchanged. In addition, HPC systems' memory and I/O hierarchies are becoming deeper and more complex to keep up with the systems' processing power. HPC applications are also changing: the need to analyze large data sets and simulation results is increasing the importance of data processing and data-intensive applications. Moreover, composition of applications through workflows within HPC centers is becoming increasingly important. This thesis addresses the HPC scheduling challenges created by such new systems and applications. It begins with a detailed analysis of the evolution of the workloads of three reference HPC systems at the National Energy Research Scientific Computing Center (NERSC), with a focus on job heterogeneity and scheduler performance. This is followed by an analysis and improvement of a fairshare prioritization mechanism for HPC schedulers. The thesis then surveys the current state of the art and expected near-future developments in HPC hardware and applications, and identifies unaddressed scheduling challenges that they will introduce. These challenges include application diversity and issues with workflow scheduling and the scheduling of I/O resources to support applications. Next, a cloud-inspired HPC scheduling model is presented that can accommodate application diversity, takes advantage of malleable applications, and enables short wait times for applications. Finally, to support ongoing scheduling research, an open source scheduling simulation framework is proposed that allows new scheduling algorithms to be implemented and evaluated in a production scheduler using workloads modeled on those of a real system. The thesis concludes with the presentation of a workflow scheduling algorithm to minimize workflows' turnaround time without over-allocating resources. / Work also supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research (ASCR), and we used resources at the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility, supported by the Office of Science of the U.S. Department of Energy, both under Contract No. DE-AC02-05CH11231.
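Fairshare prioritization, mentioned above, ranks jobs by how far each user's recent, decayed usage departs from their allocated share. The following is a minimal Java sketch of one classic formulation (an exponential 2^(-usage/share) factor with per-period decay); it is an illustrative assumption, not the mechanism analyzed or improved in the thesis.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of a decayed fair-share priority factor. Users who have
// consumed less than their allocated share get a higher factor; historical
// usage decays at the end of each accounting period.
public final class FairShareSketch {
    private final Map<String, Double> decayedUsage = new HashMap<>();
    private final Map<String, Double> share = new HashMap<>();
    private final double decayPerPeriod;   // e.g. 0.5 means usage halves each period

    FairShareSketch(double decayPerPeriod) { this.decayPerPeriod = decayPerPeriod; }

    void setShare(String user, double normalizedShare) { share.put(user, normalizedShare); }

    /** Record core-hours consumed by a user in the current accounting period. */
    void recordUsage(String user, double coreHours) {
        decayedUsage.merge(user, coreHours, Double::sum);
    }

    /** Apply exponential decay at the end of each accounting period. */
    void endPeriod() {
        decayedUsage.replaceAll((user, usage) -> usage * decayPerPeriod);
    }

    /** Higher value means higher scheduling priority. */
    double priorityFactor(String user) {
        double totalUsage = decayedUsage.values().stream().mapToDouble(Double::doubleValue).sum();
        double used = decayedUsage.getOrDefault(user, 0.0);
        double normalizedUsage = totalUsage > 0 ? used / totalUsage : 0.0;
        double target = share.getOrDefault(user, 0.0);
        // 2^(-usage/share): 0.5 when usage matches share, near 1 for under-served
        // users, near 0 for heavily over-served ones.
        return target > 0 ? Math.pow(2.0, -normalizedUsage / target) : 0.0;
    }

    public static void main(String[] args) {
        FairShareSketch fs = new FairShareSketch(0.5);
        fs.setShare("alice", 0.5);
        fs.setShare("bob", 0.5);
        fs.recordUsage("alice", 900);
        fs.recordUsage("bob", 100);
        System.out.printf("alice %.3f, bob %.3f%n",
                fs.priorityFactor("alice"), fs.priorityFactor("bob"));
    }
}
```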
357

SFE Fractionation and RP-HPLC Characterization of Aquatic Fulvic Acid

Shao, Peimin 05 1900
The Supercritical Fluid Extraction (SFE) technique was used to fractionate Suwannee River reference fulvic acid (FA). The fractions were characterized by gas chromatography (GC) and reversed-phase high performance liquid chromatography (RP-HPLC). An SFE fractionation method was developed using a stepwise gradient of supercritical CO₂ and methanol. Three FA fractions were separated. The average mass recovery was 102%, with a coefficient of variation of 2.8%. The fractionation dynamics and the difference in the ratios of UV absorption to fluorescence emission indicate real fractionation of the FA. The HPLC chromatographic peak patterns and the spectra of the corresponding peaks were almost indistinguishable. The overall results of this research support the argument that FA exhibits a polymer-like molecular structure.
358

High Performance Portability with RAJA and Agency

Obermiller, Dan 01 January 2017
High performance and scientific computing take advantage of high-end and high-spec computer architectures. As these architectures evolve, and new architectures are created, applications may be able to run at greater and greater speeds. These changes present challenges to implementors who wish to take advantage of the newest features and machines. Portability layers such as RAJA and Agency seek to abstract away machine-specific details and allow scientists to take advantage of new features as they become available. We enhance RAJA with a lower-level framework, Agency, to determine whether these layered abstractions provide performance or maintainability benefits.
359

Reactive transport modeling at hillslope scale with high performance computing methods

He, Wenkui 07 December 2016
Reactive transport modeling is an important approach to understand water dynamics, mass transport and biogeochemical processes from the hillslope to the catchment scale. It has a wide range of applications in fields such as water resource management, contaminated site remediation and geotechnical engineering. Simulating reactive transport processes at hillslope or larger scales is a challenging task, which involves interactions of complex physical and biogeochemical processes, huge computational expense, as well as difficulties in numerical precision and stability. The primary goal of the work is to develop a practical, accurate and efficient tool to advance simulation techniques for reactive transport problems towards hillslope or larger scales. The first part of the work deals with the simulation of water flow in saturated and unsaturated porous media. The capability and accuracy of different numerical approaches were analyzed and compared using benchmark tests. The second part of the work introduces the coupling of the scientific software packages OpenGeoSys and IPhreeqc through a character-string-based interface. The accuracy and computational efficiency of the coupled tool were discussed on the basis of three benchmarks. It shows that OGS#IPhreeqc provides sufficient numerical accuracy to simulate reactive transport problems for both equilibrium and kinetic reactions in variably saturated porous media. The third part of the work describes the algorithm of a parallelization scheme using the MPI (Message Passing Interface) grouping concept, which enables a flexible allocation of computational resources for calculating geochemical reactions and physical processes such as groundwater flow and transport. The parallel performance of the approach was tested with three examples. It shows that the new approach has advantages over conventional ones for the calculation of geochemically dominated problems, especially when only limited benefit can be obtained through parallelization of the flow or solute transport calculations. The comparison between the character-string-based and the file-based coupling shows that the former approach produces less computational overhead in a distributed-memory system such as a computing cluster. The last part of the work shows the application of OGS#IPhreeqc for the simulation of the water dynamics and denitrification processes in the groundwater aquifer of a study site in Northern Germany. It demonstrates that OGS#IPhreeqc is able to simulate heterogeneous reactive transport problems at a hillslope scale within an acceptable time span. The model results show the importance of functional zones for the natural attenuation process.
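Couplings of a flow/transport simulator with a geochemical solver, such as OGS#IPhreeqc, typically advance the two sets of processes in alternating steps (sequential operator splitting), which is also what makes a separate allocation of MPI ranks to reactions and to flow/transport natural. The sketch below is a purely generic illustration of that pattern, not OpenGeoSys or IPhreeqc code: a 1-D explicit upwind advection step alternates with a placeholder first-order decay "reaction".

```java
// Generic sketch of sequential (operator-splitting) coupling between a
// transport step and a chemistry step. This is NOT OpenGeoSys or IPhreeqc
// code; the 1-D explicit advection and first-order decay are placeholders.
public final class SplitCouplingSketch {

    /** Explicit upwind advection of concentrations c with Courant number cr (0 < cr <= 1). */
    static void transportStep(double[] c, double cr) {
        double[] old = c.clone();
        for (int i = 1; i < c.length; i++) {
            c[i] = old[i] - cr * (old[i] - old[i - 1]);
        }
    }

    /** Placeholder "geochemistry": first-order decay over time step dt. */
    static void reactionStep(double[] c, double rate, double dt) {
        for (int i = 0; i < c.length; i++) {
            c[i] *= Math.exp(-rate * dt);
        }
    }

    public static void main(String[] args) {
        double[] conc = new double[100];
        conc[0] = 1.0;                       // constant-concentration inflow cell
        double courant = 0.5, rate = 0.01, dt = 1.0;
        for (int step = 0; step < 200; step++) {
            transportStep(conc, courant);
            conc[0] = 1.0;                   // re-impose inflow boundary
            reactionStep(conc, rate, dt);
        }
        System.out.printf("concentration at cell 50 after 200 steps: %.4f%n", conc[50]);
    }
}
```

In a parallel setting the transport step and the (embarrassingly parallel, per-cell) reaction step can be assigned to different MPI process groups, which is the flexibility the grouping scheme above provides.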
360

Radial Compression High Performance Liquid Chromatography as a Tool for The Measurement of Endogenous Nucleotides in Bacteria

Dutta, Probir Kumar 08 1900
High performance liquid chromatography was used to measure ribonucleoside triphosphates in microbial samples. Anion exchange columns in a radial compression module were used to separate and quantify purine and pyrimidine ribonucleotides. Endogenous ribonucleoside triphosphates were extracted from Escherichia coli and Pseudomonas aeruginosa using three different solvents, namely trifluoroacetic acid (TFA; 0.5 M), trichloroacetic acid (TCA; 6 per cent w/v) and formic acid (1.0 M). Extracts were assayed for uridine 5'-triphosphate (UTP), cytidine 5'-triphosphate (CTP), adenosine 5'-triphosphate (ATP) and guanosine 5'-triphosphate (GTP) using anion exchange radial compression high performance (pressure) liquid chromatography. The three extraction procedures were compared for yield of triphosphates. In E. coli, the TFA extraction procedure was more sensitive and reliable than the TCA and formic acid extraction procedures; in P. aeruginosa, the best yields of ATP and GTP were obtained following extraction with TFA, while yields of UTP and CTP increased when extraction was performed in TCA. These data illustrate that different extraction procedures produce different measures for different triphosphates, a point often overlooked.
