• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 16
  • 4
  • 4
  • 1
  • 1
  • 1
  • Tagged with
  • 36
  • 36
  • 8
  • 8
  • 8
  • 7
  • 7
  • 6
  • 5
  • 5
  • 5
  • 5
  • 5
  • 5
  • 4
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Analysis and Prediction of Community Structure Using Unsupervised Learning

Biradar, Rakesh 26 January 2016 (has links)
In this thesis, we perform analysis and prediction for community structures in graphs using unsupervised learning. The methods we use require the data matrices to be of low rank, and such matrices appear quite often in real world problems across a broad range of domains. Such a modelling assumption is widely considered by classical algorithms such as principal component analysis (PCA), and the same assumption is often used to achieve dimensionality reduction. Dimension reduction, which is a classic method in unsupervised learning, can be leveraged in a wide array of problems, including prediction of strength of connection between communities from unlabeled or partially labeled data. Accordingly, a low rank assumption addresses many real world problems, and a low rank assumption has been used in this thesis to predict the strength of connection between communities in Amazon product data. In particular, we have analyzed real world data across retail and cyber domains, with the focus being on the retail domain. Herein, our focus is on analyzing the strength of connection between the communities in Amazon product data, where each community represents a group of products, and we are given the strength of connection between the individual products but not between the product communities. We call the strength of connection between individual products first order data and the strength of connection between communities second order data. This usage is inspired by [1] where first order time series are used to compute second order covariance matrices where such covariance matrices encode the strength of connection between the time series. In order to find the strength of connection between the communities, we define various metrics to measure this strength, and one of the goals of this thesis is to choose a good metric, which supports effective predictions. However, the main objective is to predict the strength of connection between most of the communities, given measurements of the strength of connection between only a few communities. To address this challenge, we use modern extensions of PCA such as eRPCA that can provide better predictions and can be computationally efficient for large problems. However, the current theory of eRPCA algorithms is not designed to treat problems where the initial data (such as the second order matrix of communities strength) is both low rank and sparse. Therefore, we analyze the performance of eRPCA algorithm on such data and modify our approaches for the particular structure of Amazon product communities to perform the necessary predictions.
2

Sparse Matrix Belief Propagation

Bixler, Reid Morris 11 May 2018 (has links)
We propose sparse-matrix belief propagation, which executes loopy belief propagation in Markov random fields by replacing indexing over graph neighborhoods with sparse-matrix operations. This abstraction allows for seamless integration with optimized sparse linear algebra libraries, including those that perform matrix and tensor operations on modern hardware such as graphical processing units (GPUs). The sparse-matrix abstraction allows the implementation of belief propagation in a high-level language (e.g., Python) that is also able to leverage the power of GPU parallelization. We demonstrate sparse-matrix belief propagation by implementing it in a modern deep learning framework (PyTorch), measuring the resulting massive improvement in running time, and facilitating future integration into deep learning models. / Master of Science
3

Optimizing Sparse Matrix-Matrix Multiplication on a Heterogeneous CPU-GPU Platform

Wu, Xiaolong 16 December 2015 (has links)
Sparse Matrix-Matrix multiplication (SpMM) is a fundamental operation over irregular data, which is widely used in graph algorithms, such as finding minimum spanning trees and shortest paths. In this work, we present a hybrid CPU and GPU-based parallel SpMM algorithm to improve the performance of SpMM. First, we improve data locality by element-wise multiplication. Second, we utilize the ordered property of row indices for partial sorting instead of full sorting of all triples according to row and column indices. Finally, through a hybrid CPU-GPU approach using two level pipelining technique, our algorithm is able to better exploit a heterogeneous system. Compared with the state-of-the-art SpMM methods in cuSPARSE and CUSP libraries, our approach achieves an average of 1.6x and 2.9x speedup separately on the nine representative matrices from University of Florida sparse matrix collection.
4

Study and Design of an Intelligent Preconditioner Recommendation System

Xu, Shuting 01 January 2005 (has links)
There are many scientific applications in which there is a need to solve very large linear systems. The preconditioned Krylove subspace methods are considered the preferred methods in this field. The preconditioners employed in the preconditioned iterative solvers usually determine the overall convergence rate. However, choosing a good preconditioner for a specific sparse linear system arising from a particular application is the combination of art and science, and presents a formidable challenge for many design engineers and application scientists who do not have much knowledge of preconditioned iterative methods. We tackled the problem of choosing suitable preconditioners for particular applications from a nontraditional point of view. We used the techniques and ideas in knowledge discovery and data mining to extract useful information and special features from unstructured sparse matrices and analyze the relationship between these features and the solving status of the spearse linear systems generated from these sparse matrices. We have designed an Intelligent Preconditioner Recommendation System, which can provide advice on choosing a high performance preconditioner as well as suitable parameters for a given sparse linear system. This work opened a new research direction for a very important topic in large scale high performance scientific computing. The performance of the various data mining algorithms applied in the recommendation system is directly related to the set of matrix features used in the system. We have extracted more than 60 features to represent a sparse matrix. We have proposed to use data mining techniques to predict some expensive matrix features like the condition number. We have also proposed to use the combination of the clustering and classification methods to predict the solving status of a sparse linear system. For the preconditioners with multiple parameters, we may predict the possible combinations of the values of the parameters with which a given sparse linear system may be successfully solved. Furthermore, we have proposed an algorithm to find out which preconditioners work best for a certain sparse linear system with what parameters.
5

Parallel Sparse Matrix-Matrix Multiplication: A Scalable Solution With 1D Algorithm

Hoque, Mohammad Asadul, Raju, Md Rezaul Karim, Tymczak, Christopher John, Vrinceanu, Daniel, Chilakamarri, Kiran 01 January 2015 (has links)
This paper presents a novel implementation of parallel sparse matrix-matrix multiplication using distributed memory systems on heterogeneous hardware architecture. The proposed algorithm is expected to be linearly scalable up to several thousands of processors for matrices with dimensions over 106 (million). Our approach of parallelism is based on 1D decomposition and can work for both structured and unstructured sparse matrices. The storage mechanism is based on distributed hash lists, which reduces the latency for accessing and modifying an element of the product matrix, while reducing the overall merging time of the partial results computed by the processors. Theoretically, the time and space complexity of our algorithm is linearly proportional to the total number of non-zero elements in the product matrix C. The results of the performance evaluation show that the algorithm scales much better for sparse matrices with bigger dimensions. The speedup achieved using our algorithm is much better than other existing 1D algorithms. We have been able to achieve about 500 times speedup with only 672 processors. We also identified the impact of hardware architecture on scalability.
6

Efficient Evaluation of Makespan for a Manufacturing System Using Max-Plus Algebra

Patlola, Phanindher R. 26 July 2011 (has links)
No description available.
7

Delninukų energijos suvartojimo apdorojant išretintas matricas saugomas stulpeliais modeliavimas / Pocket PC energy consumption using sparse matrix storage by columns modeling

Dičpinigaitis, Petras 28 January 2008 (has links)
Kiekvienas mobilus įrenginys turi bateriją, o tai reiškia, kad jų darbo laikas ribotas, kadangi nėra išrasta ilgaamžė baterija. Todėl šiuo metu egzistuojanti problema - kaip pasiekti, kuo ilgesnį mobiliojo įrenginio darbo laiką, be papildomo pakrovimo. Darbo metu naudojamas mobilus įrenginys - delninukas. Iš visų delninuko baterijos energiją suvartojančių komponentų visas dėmesys skiriamas procesoriui ir atminčiai. Tyrimo metu buvo apkrautas procesorius ir atmintis ir stebimi atitinkami baterijos parametrai. Apkrovimui naudojama paprastų ir išretintų matricų saugomų stulpelių metodu daugyba. Išretintų matricų daugybos metu užimama mažiau atminties, o procesorius atlieka daugiau komandų lyginant su paprastu metodu, kuris užima daugiau atminties. Iš gautų rezultatų pamatėme, kad išretintų matricų saugomų stulpelių metodu daugyba yra daug efektyvesnė negu paprastų matricų daugyba. Todėl kuriant programas, kur reikia naudoti matricas geriau naudoti išretintų matricų stulpelių saugojimo metodo daugybą, kadangi galima sutrumpinti operacijos vykdymo laika, sunaudoti mažiau baterijos resursų ir sutaupyti atminties. / Nowadays major problem is energy consumtion in portable devices which has a battery. In this job we have evaluated energy consumption for Pocket PC. We wanted to see memory and processor influence in battery energy consumption. We have created a program which can do matrix multiplication and sparse matrix „storage by columns“ multiplication. During multiplication program takes battery information and saves it into the file. After that I have investigated the result and saw, that sparse matrix storage by columns multiplication is much more effectived than normal matrix multiplication. Sparce matrix storage by columns multiplication take less memory and more processor commands then normal matrix multiplication. We suggest to use sparse matrix storage by columns model instead simple model, because you can save much more operation time, battery resources and memory.
8

Magneto-hydrodynamics simulation study of high density thermal plasmas in plasma acceleration devices

Sitaraman, Hariswaran 17 October 2013 (has links)
The development of a Magneto-hydrodynamics (MHD) numerical tool to study high density thermal plasmas in plasma acceleration devices is presented. The MHD governing equations represent eight conservation equations for the evolution of density, momentum, energy and induced magnetic fields in a plasma. A matrix-free implicit method is developed to solve these conservation equations within the framework of an unstructured grid finite volume formulation. The analytic form of the convective flux Jacobian is derived for general unstructured grids. A Lower Upper Symmetric Gauss Seidel (LU-SGS) technique is developed as part of the implicit scheme. A coloring based algorithm for parallelization of this technique is also presented and its computational efficiency is compared with a global matrix solve technique that uses the GMRES (Generalized Minimum Residual) algorithm available in the PETSc (Portable Extensible Toolkit for Scientific computation) libraries. The verification cases used for this study are the MHD shock tube problem in one, two and three dimensions, the oblique shock and the Hartmann flow problem. It is seen that the matrix free method is comparatively faster and shows excellent scaling on multiple cores compared to the global matrix solve technique. The numerical model was thus verified against the above mentioned standard test cases and two application problems were studied. These include the simulation of plasma deflagration phenomenon in a coaxial plasma accelerator and a novel high speed flow control device called the Rail Plasma Actuator (RailPAc). Experimental studies on coaxial plasma accelerators have revealed two different modes of operation based on the delay between gas loading and discharge ignition. Longer delays lead to the detonation or the snowplow mode while shorter delays lead to the relatively efficient stationary or deflagration mode. One of the theories that explain the two different modes is based on plasma resistivity. A numerical modeling study is presented here in the context of a coaxial plasma accelerator and the effect of plasma resistivity is dealt with in detail. The simulated results pertaining to axial distribution of radial currents are compared with experimental measurements which show good agreement with each other. The simulations show that magnetic field diffusion is dominant at lower conductivities which tend to form a stationary region of high current density close to the inlet end of the device. Higher conductivities led to the formation of propagating current sheet like features due to greater convection of magnetic field. This study also validates the theory behind the two modes of operation based on plasma resistivity. The RailPAc (Rail Plasma Actuator) is a novel flow control device that uses the magnetic Lorentz forces for fluid flow actuation at atmospheric pressures. Experimental studies reveal actuation ~ 10-100 m/s can be achieved with this device which is much larger than conventional electro-hydrodynamic (EHD) force based plasma actuators. A magneto-hydrodynamics simulation study of this device is presented. The model is further developed to incorporate applied electric and magnetic fields seen in this device. The snowplow model which is typically used for studying pulsed plasma thrusters is used to predict the arc velocities which agrees well with experimental measurements. Two dimensional simulations were performed to study the effect of Lorentz forcing and heating effects on fluid flow actuation. Actuation on the order of 100 m/s is attained at the head of the current sheet due to the effect of Lorentz forcing alone. The inclusion of heating effects led to isotropic blast wave like actuation which is detrimental to the performance of RailPAc. This study also revealed the deficiencies of a single fluid model and a more accurate multi-fluid approach is proposed for future work. / text
9

Nonnegative matrix factorization for clustering

Kuang, Da 27 August 2014 (has links)
This dissertation shows that nonnegative matrix factorization (NMF) can be extended to a general and efficient clustering method. Clustering is one of the fundamental tasks in machine learning. It is useful for unsupervised knowledge discovery in a variety of applications such as text mining and genomic analysis. NMF is a dimension reduction method that approximates a nonnegative matrix by the product of two lower rank nonnegative matrices, and has shown great promise as a clustering method when a data set is represented as a nonnegative data matrix. However, challenges in the widespread use of NMF as a clustering method lie in its correctness and efficiency: First, we need to know why and when NMF could detect the true clusters and guarantee to deliver good clustering quality; second, existing algorithms for computing NMF are expensive and often take longer time than other clustering methods. We show that the original NMF can be improved from both aspects in the context of clustering. Our new NMF-based clustering methods can achieve better clustering quality and run orders of magnitude faster than the original NMF and other clustering methods. Like other clustering methods, NMF places an implicit assumption on the cluster structure. Thus, the success of NMF as a clustering method depends on whether the representation of data in a vector space satisfies that assumption. Our approach to extending the original NMF to a general clustering method is to switch from the vector space representation of data points to a graph representation. The new formulation, called Symmetric NMF, takes a pairwise similarity matrix as an input and can be viewed as a graph clustering method. We evaluate this method on document clustering and image segmentation problems and find that it achieves better clustering accuracy. In addition, for the original NMF, it is difficult but important to choose the right number of clusters. We show that the widely-used consensus NMF in genomic analysis for choosing the number of clusters have critical flaws and can produce misleading results. We propose a variation of the prediction strength measure arising from statistical inference to evaluate the stability of clusters and select the right number of clusters. Our measure shows promising performances in artificial simulation experiments. Large-scale applications bring substantial efficiency challenges to existing algorithms for computing NMF. An important example is topic modeling where users want to uncover the major themes in a large text collection. Our strategy of accelerating NMF-based clustering is to design algorithms that better suit the computer architecture as well as exploit the computing power of parallel platforms such as the graphic processing units (GPUs). A key observation is that applying rank-2 NMF that partitions a data set into two clusters in a recursive manner is much faster than applying the original NMF to obtain a flat clustering. We take advantage of a special property of rank-2 NMF and design an algorithm that runs faster than existing algorithms due to continuous memory access. Combined with a criterion to stop the recursion, our hierarchical clustering algorithm runs significantly faster and achieves even better clustering quality than existing methods. Another bottleneck of NMF algorithms, which is also a common bottleneck in many other machine learning applications, is to multiply a large sparse data matrix with a tall-and-skinny dense matrix. We use the GPUs to accelerate this routine for sparse matrices with an irregular sparsity structure. Overall, our algorithm shows significant improvement over popular topic modeling methods such as latent Dirichlet allocation, and runs more than 100 times faster on data sets with millions of documents.
10

Numerical linear algebra problems in structural analysis

Kannan, Ramaseshan January 2014 (has links)
A range of numerical linear algebra problems that arise in finite element-based structural analysis are considered. These problems were encountered when implementing the finite element method in the software package Oasys GSA. We present novel solutions to these problems in the form of a new method for error detection, algorithms with superior numerical effeciency and algorithms with scalable performance on parallel computers. The solutions and their corresponding software implementations have been integrated into GSA's program code and we present results that demonstrate the use of these implementations by engineers to solve real-world structural analysis problems.

Page generated in 0.0474 seconds