11 |
Phylogenetic analysis of multiple genes based on spectral methods. Abeysundera, Melanie, 28 October 2011
Multiple-gene phylogenetic analysis is of interest because single-gene analysis often
results in poorly resolved trees. Here the use of spectral techniques for analyzing
multi-gene data sets is explored. The protein sequences are treated as categorical
time series and a measure of similarity between a pair of sequences, the spectral
covariance, is used to build trees. Unlike other methods, the spectral covariance
method focuses on the relationship between the sites of genetic sequences.
We consider two methods with which to combine the dissimilarity or distance
matrices of multiple genes. The first method involves properly scaling the dissimilarity
measures derived from different genes between a pair of species and using the
mean of these scaled dissimilarity measures as a summary statistic to measure the
taxonomic distances across multiple genes. We introduce two criteria for computing
scale coefficients that can then be used to combine information across genes, namely
the minimum variance (MinVar) criterion and the minimum coefficient of variation
squared (MinCV) criterion. The scale coefficients obtained with the MinVar and
MinCV criteria can then be used to derive a combined-gene tree from the weighted
average of the distance or dissimilarity matrices of multiple genes.
The second method is based on the singular value decomposition of a matrix made
up of the p-vectors of pairwise distances for k genes. By decomposing such a
matrix, we extract the common signal present in multiple genes to obtain a single tree
representation of the relationship between a given set of taxa. Influence functions for
the components of the singular value decomposition are derived to determine which
genes are most influential in determining the combined-gene tree.
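To make the second method concrete, below is a minimal sketch (not the thesis's code) of extracting the common signal from the p-vectors of pairwise distances for k genes via the SVD. The per-gene dissimilarities are synthetic placeholders rather than spectral-covariance distances, and the final tree-building step (e.g., neighbor joining on the combined matrix) is omitted.

```python
import numpy as np
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
n_taxa, k_genes = 8, 5
p = n_taxa * (n_taxa - 1) // 2            # number of taxon pairs

# Placeholder per-gene dissimilarities: a shared phylogenetic signal plus
# gene-specific noise (in practice these would be spectral-covariance
# dissimilarities computed from aligned protein sequences).
shared = np.abs(rng.normal(size=(p, 1)))
D = shared + 0.1 * np.abs(rng.normal(size=(p, k_genes)))   # p x k matrix of pairwise distances

# SVD of the p x k matrix; the leading left singular vector carries the
# signal common to all k genes.
U, s, Vt = np.linalg.svd(D, full_matrices=False)
common = s[0] * U[:, 0] * np.sign(Vt[0].sum())   # resolve the sign ambiguity

combined = squareform(common)             # n_taxa x n_taxa combined distance matrix
print(np.round(combined, 3))
```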
|
12 |
Improved Document Summarization and Tag Clouds via Singular Value Decomposition. Provost, James, 25 September 2008
Automated summarization is a difficult task. World-class summarizers can provide only "best guesses" of which sentences encapsulate the important content within a set of documents. As automated systems continue to improve, users are still not given the means to observe complex relationships between seemingly independent concepts. In this research we used singular value decompositions to organize concepts and determine the best candidate sentences for an automated summary. The results of this straightforward approach were comparable to those of world-class summarizers. We then included a clustered tag cloud, using a singular value decomposition to measure term "interestingness" with respect to the set of documents. The combination of best candidate sentences and tag clouds provided a more inclusive summary than a traditionally developed summarizer alone. / Thesis (Master, Computing) -- Queen's University, 2008.
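The abstract gives no implementation details; the following hedged sketch shows the standard LSA-style use of the SVD it alludes to, scoring sentences and terms by their weight on the leading singular directions of a small term-by-sentence matrix. The sentences, tokenization, and scoring rule are placeholder assumptions, not the thesis's system.

```python
import numpy as np

sentences = [
    "Automated summarization remains a difficult task.",
    "Singular value decomposition organizes concepts across documents.",
    "Tag clouds highlight interesting terms for readers.",
    "Concepts that seem independent can share latent structure.",
]

tokenize = lambda s: s.lower().replace(".", " ").replace(",", " ").split()
vocab = sorted({w for s in sentences for w in tokenize(s)})

# Term-by-sentence frequency matrix.
A = np.zeros((len(vocab), len(sentences)))
for j, s in enumerate(sentences):
    for w in tokenize(s):
        A[vocab.index(w), j] += 1

# SVD: rows of Vt index latent "concepts" over sentences; rows of U over terms.
U, sv, Vt = np.linalg.svd(A, full_matrices=False)
k = 2  # number of concepts to keep

# Candidate sentences: weighted length of each sentence in the top-k concept space.
sent_scores = np.sqrt(((Vt[:k].T * sv[:k]) ** 2).sum(axis=1))
best = sorted(np.argsort(sent_scores)[::-1][:2])
print("Candidate summary sentences:", [sentences[i] for i in best])

# Term "interestingness" for the tag cloud: weight of each term on the top concepts.
term_scores = np.sqrt(((U[:, :k] * sv[:k]) ** 2).sum(axis=1))
print("Tag cloud candidates:", [vocab[i] for i in np.argsort(term_scores)[::-1][:5]])
```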
|
13 |
Extension of Kendall's tau Using Rank-Adapted SVD to Identify Correlation and Factions Among Rankers and Equivalence Classes Among Ranked Elements. Campbell, Kathlleen, January 2014
The practice of ranking objects, events, and people to determine relevance, importance, or competitive edge is ancient. Recently, the use of rankings has permeated daily life, especially in the fields of business and education. When determining the association among those creating the ranks (herein called sources), the traditional assumption is that all sources compare a list of the same items (herein called elements). In the twenty-first century it is rare that any two sources choose identical elements to rank, and, adding to the difficulty, the number of credible sources creating and releasing rankings is increasing. The statistical literature currently offers no methodology that adequately assesses the association among multiple sources.

We introduce rank-adapted singular value decomposition (R-A SVD), a new method that uses Kendall's tau as the underlying correlation measure. We begin with P, a matrix of data ranks. The first step is to factor the covariance matrix K as K = cov(P) = V D^2 V^T. Here V is an orthonormal basis for the rows, useful in identifying when sources agree on the rank order and specifically which sources, and D is a diagonal matrix of eigenvalues. By analogy with the singular value decomposition (SVD), we define U^* = P V D^(-1). The diagonal matrix D provides the factored eigenvalues in decreasing order. The largest eigenvalue is used to assess the overall association among the sources and gives a conservative, unbiased measure comparable to Kendall's W. Anderson's test (1963) determines whether this association is significant and identifies the a significantly large eigenvalues of D. When one or more eigenvalues are significant, there is evidence that the association among the sources is significant, and focusing on the a corresponding vectors of V identifies which sources agree. When more than one eigenvalue is significant, the a significant vectors of V provide insight into factions: groups of sources that agree with one another but not necessarily with other groups.

The a significant vectors of U^* provide different but equally important results. In many cases the ranked elements can be subdivided into equivalence classes, defined as subpopulations of ranked elements that are similar to one another but dissimilar from other classes. When such classes exist, U^* indicates how many classes there are and which elements belong to each.

In summary, the R-A SVD method gives the user the ability to assess whether there is any underlying association among multiple rank sources, identifies when and which sources agree, and allows for more useful and careful interpretation of rank data. / Statistics
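As a rough numeric illustration of the factorization described above, the sketch below applies it to a placeholder rank matrix P (columns assumed to be sources, rows the ranked elements). The Kendall-tau-based rank adaptation itself is omitted, and D is taken here as the square root of the eigenvalues of K so that U^* = P V D^(-1) has the stated form; both are interpretive assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_elements, n_sources = 10, 4

# Placeholder rank matrix P: each column is one source's ranking of the elements.
P = np.argsort(rng.normal(size=(n_elements, n_sources)), axis=0).argsort(axis=0) + 1.0

# Factor the covariance of the ranks: K = V D^2 V^T.
K = np.cov(P, rowvar=False)
eigvals, V = np.linalg.eigh(K)
order = np.argsort(eigvals)[::-1]              # eigenvalues in decreasing order
eigvals, V = eigvals[order], V[:, order]
D = np.sqrt(np.clip(eigvals, 0.0, None))

# By analogy with the SVD, U* = P V D^(-1), using only non-degenerate directions.
keep = D > 1e-10
U_star = P @ V[:, keep] / D[keep]

print("Largest eigenvalue (overall association):", round(float(eigvals[0]), 3))
print("Leading source-agreement vector:", np.round(V[:, 0], 3))
print("U* shape (elements x retained components):", U_star.shape)
```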
|
14 |
Improving the Performance of a Hybrid Classification Method Using a Parallel Algorithm and a Novel Data Reduction Technique. Phillips, Rhonda D., 21 August 2007
This thesis presents both a shared-memory parallel version of the hybrid classification algorithm IGSCR (iterative guided spectral class rejection) and a novel data reduction technique that can be used in conjunction with pIGSCR (parallel IGSCR). The parallel algorithm is motivated by a demonstrated need for more computing power driven by the increasing size of remote sensing datasets due to higher-resolution sensors, larger study regions, and the like. Even with a fast algorithm such as pIGSCR, the reduction of dimension in a dataset is desirable in order to decrease the processing time further and possibly improve overall classification accuracy.
pIGSCR was developed to produce fast and portable code using Fortran 95, OpenMP, and the Hierarchical Data Format version 5 (HDF5) and accompanying data access library. The applicability of the faster pIGSCR algorithm is demonstrated by classifying Landsat data covering most of Virginia, USA into forest and non-forest classes with approximately 90 percent accuracy. Parallel results are given using the SGI Altix 3300 shared memory computer and the SGI Altix 3700 with as many as 64 processors reaching speedups of almost 77. This fast algorithm allows an analyst to perform and assess multiple classifications to refine parameters. As an example, pIGSCR was used for a factorial analysis consisting of 42 classifications of a 1.2 gigabyte image to select the number of initial classes (70) and class purity (70%) used for the remaining two images.
A feature selection or reduction method may be appropriate for a specific classification method depending on the properties and training required for the classification method, or an alternative band selection method may be derived based on the classification method itself. This thesis introduces a feature reduction method based on the singular value decomposition (SVD). This feature reduction technique was applied to training data from two multitemporal datasets of Landsat TM/ETM+ imagery acquired over forested areas in Virginia, USA, and Rondonia, Brazil. Subsequent parallel iterative guided spectral class rejection (pIGSCR) forest/non-forest classifications were performed to determine the quality of the feature reduction. The classifications of the Virginia data were five times faster using SVD-based feature reduction without affecting the classification accuracy. Feature reduction using the SVD was also compared to feature reduction using principal components analysis (PCA). The highest average accuracies for the Virginia dataset (88.34%) and for the Amazon dataset (93.31%) were achieved using the SVD. The results presented here indicate that SVD-based feature reduction can produce statistically significantly better classifications than PCA. / Master of Science
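A hedged sketch of the general idea of SVD-based feature reduction on training data follows. The projection rule (keeping singular directions that explain roughly 99% of the variance) and the synthetic spectra are illustrative assumptions, not necessarily the thesis's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(2)
n_pixels, n_bands = 500, 12          # e.g., a multitemporal Landsat band stack

# Synthetic training spectra: a few underlying spectral factors plus noise
# (rows = labeled training pixels, columns = bands).
latent = rng.normal(size=(n_pixels, 4))
X = latent @ rng.normal(size=(4, n_bands)) + 0.1 * rng.normal(size=(n_pixels, n_bands))

# SVD of the column-centered training matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Keep enough singular directions to explain ~99% of the variance.
energy = np.cumsum(s**2) / np.sum(s**2)
r = int(np.searchsorted(energy, 0.99)) + 1
X_reduced = Xc @ Vt[:r].T            # reduced features fed to the classifier (e.g., pIGSCR)

print(f"Reduced {n_bands} bands to {r} features")
```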
|
15 |
On the Use of Uncalibrated Digital Phased Arrays for Blind Signal Separation for Interference Removal in Congested Spectral Bands. Lusk, Lauren O., 05 May 2023
With usable spectrum becoming increasingly congested, the need for robust, adaptive communications that take advantage of spatially separated signal sources is apparent. Traditional phased array beamforming techniques used for interference removal rely on perfect calibration between elements and precise knowledge of the array configuration; however, if the exact array configuration is not known (unknown or imperfect assumption of element locations, unknown mutual coupling between elements, etc.), these traditional beamforming techniques are not viable, so a blind beamforming approach is required. A novel blind beamforming approach is proposed to address complex narrow-band interference environments where the precise array configuration is unknown. The received signal is decomposed into orthogonal narrow-band partitions using a polyphase filter-bank channelizer, and a rank-reduced version of the received matrix on each sub-channel is computed through reconstruction by retaining a subset of its singular values. The wideband spectrum is synthesized through a near-perfect polyphase reconstruction filter, and a composite wideband spectrum is obtained from the maximum eigenvector of the resulting covariance matrix. The resulting process is shown to suppress numerous interference sources (in special cases, even when their number exceeds the degrees of freedom of the array), all without any knowledge of the primary signal of interest. Results are validated with both simulation and wireless laboratory over-the-air experimentation. / M.S. / As the number of devices using wireless communications increases, the amount of usable radio frequency spectrum becomes increasingly congested. As a result, the need for robust, adaptive communications to improve spectral efficiency and ensure reliable communication in the presence of interference is apparent. One solution is to use beamforming techniques on digital phased array receivers to maximize the energy in a desired direction and steer nulls to remove interference. However, traditional phased array beamforming techniques used for interference removal rely on perfect calibration between antenna elements and precise knowledge of the array configuration. Consequently, if the exact array configuration is not known (unknown or imperfect assumption of element locations, unknown mutual coupling between elements, etc.), these traditional beamforming techniques are not viable, so a beamforming approach with relaxed requirements (blind beamforming) is required. This thesis proposes a novel blind beamforming approach to address complex narrow-band interference in spectrally congested environments where the precise array configuration is unknown. The resulting process is shown to suppress numerous interference sources, all without any knowledge of the primary signal of interest. Results are validated with both simulation and wireless laboratory experimentation conducted with a two-element array, verifying that the proposed beamforming approach achieves performance similar to the theoretical bound of receiving packets in AWGN with no interference present.
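The sketch below is a simplified, hedged illustration of the processing chain described above, not the thesis's implementation: the polyphase channelizer and near-perfect reconstruction filter are replaced by a plain block FFT/IFFT, and the rule for which singular values to retain (here, discarding the single strongest component in each sub-channel) is an assumption made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n_el, n_ch, n_blk = 4, 16, 256                 # elements, sub-channels, blocks
n = n_ch * n_blk

def rand_steer():                              # unknown (uncalibrated) array response
    return np.exp(1j * 2 * np.pi * rng.random(n_el))

a_soi, a_int = rand_steer(), rand_steer()
soi = a_soi[:, None] * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
jam = a_int[:, None] * 30 * np.exp(1j * 2 * np.pi * 0.11 * np.arange(n))
noise = 0.1 * (rng.standard_normal((n_el, n)) + 1j * rng.standard_normal((n_el, n)))
X = soi + jam + noise

# Crude stand-in for the polyphase channelizer: block FFT giving n_ch narrow-band partitions.
S = np.fft.fft(X.reshape(n_el, n_blk, n_ch), axis=2)

# Rank reduction per sub-channel: as an illustrative rule, drop the strongest
# singular component (assumed to be dominated by interference) and keep the rest.
S_rr = np.empty_like(S)
for c in range(n_ch):
    U, s, Vt = np.linalg.svd(S[:, :, c], full_matrices=False)
    s[0] = 0.0
    S_rr[:, :, c] = (U * s) @ Vt

# Resynthesize wideband data; the maximum eigenvector of its covariance gives the
# blind beamforming weights (no element calibration or array geometry is used).
X_rr = np.fft.ifft(S_rr, axis=2).reshape(n_el, n)
R = X_rr @ X_rr.conj().T / n
w = np.linalg.eigh(R)[1][:, -1]

gain_soi = np.abs(w.conj() @ a_soi) ** 2
gain_int = np.abs(w.conj() @ a_int) ** 2
print(f"interference-to-signal gain ratio at the beamformer output: {gain_int / gain_soi:.2e}")
```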
|
16 |
Sparse and Orthogonal Singular Value Decomposition. Khatavkar, Rohan, January 1900
Master of Science / Department of Statistics / Kun Chen / The singular value decomposition (SVD) is a commonly used matrix factorization technique in statistics, and it is very effective in revealing many low-dimensional structures in a noisy data matrix or a coefficient matrix of a statistical model. In particular, it is often desirable to obtain a sparse SVD, i.e., only a few singular values are nonzero and their corresponding left and right singular vectors are also sparse. However, in several existing methods for sparse SVD estimation, the exact orthogonality among the singular vectors is often sacrificed due to the difficulty of incorporating the non-convex orthogonality constraint in sparse estimation. Imposing orthogonality in addition to sparsity, albeit difficult, can be critical in restricting and guiding the search for the sparsity pattern and facilitating model interpretation. Combining the ideas of penalized regression and Bregman iterative methods, we propose two methods that strive to achieve the dual goal of sparse and orthogonal SVD estimation, in the general framework of high-dimensional multivariate regression. We set up simulation studies to demonstrate the efficacy of the proposed methods.
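The thesis's penalized-regression/Bregman algorithms are not reproduced here. As a generic stand-in for the sparse-SVD idea only, the sketch below runs an alternating soft-thresholding iteration for a single sparse rank-1 layer on simulated data; the threshold level and warm start are arbitrary choices, and orthogonality across multiple layers is not addressed.

```python
import numpy as np

def sparse_rank1_svd(X, lam=0.3, n_iter=100):
    """Generic alternating soft-thresholding for a sparse rank-1 SVD layer.

    Illustrative stand-in only, not the penalized-regression/Bregman
    procedure proposed in the thesis."""
    soft = lambda z, t: np.sign(z) * np.maximum(np.abs(z) - t, 0.0)
    u = np.linalg.svd(X)[0][:, 0]              # warm start from the ordinary SVD
    for _ in range(n_iter):
        v = soft(X.T @ u, lam)
        if np.linalg.norm(v) == 0:
            break
        v /= np.linalg.norm(v)
        u = soft(X @ v, lam)
        if np.linalg.norm(u) == 0:
            break
        u /= np.linalg.norm(u)
    d = u @ X @ v                              # estimated singular value
    return u, d, v

rng = np.random.default_rng(4)
u0 = np.zeros(30); u0[:5] = 1.0                # sparse true singular vectors
v0 = np.zeros(20); v0[:4] = 1.0
X = 5 * np.outer(u0 / np.linalg.norm(u0), v0 / np.linalg.norm(v0)) \
    + 0.1 * rng.standard_normal((30, 20))

u, d, v = sparse_rank1_svd(X)
print("nonzeros in u:", int(np.count_nonzero(u)), "| nonzeros in v:", int(np.count_nonzero(v)))
```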
|
17 |
Essays on Computational Problems in Insurance. Ha, Hongjun, 31 July 2016
This dissertation consists of two chapters. The first chapter establishes an algorithm for calculating capital requirements. The calculation of capital requirements for financial institutions usually entails a reevaluation of the company's assets and liabilities at some future point in time for a (large) number of stochastic forecasts of economic and firm-specific variables. The complexity of this nested valuation problem leads many companies to struggle with its implementation. This chapter proposes and analyzes a novel approach to the computational problem based on least-squares regression and Monte Carlo simulations. Our approach is motivated by a well-known method for pricing non-European derivatives. We study convergence of the algorithm and analyze the resulting estimate for practically important risk measures. Moreover, we address the problem of how to choose the regressors, and show that an optimal choice is given by the left singular functions of the corresponding valuation operator. Our numerical examples demonstrate that the algorithm can produce accurate results at relatively low computational cost, particularly when relying on the optimal basis functions.

The second chapter discusses another application of regression-based methods, in the context of pricing variable annuities. Advanced life insurance products with exercise-dependent financial guarantees present challenging problems for pricing and risk management. In particular, due to the complexity of the guarantees, and since practical valuation frameworks include a variety of stochastic risk factors, conventional methods based on discretizing the underlying (Markov) state space may not be feasible. As a practical alternative, this chapter explores the applicability of Least-Squares Monte Carlo (LSM) methods familiar from American option pricing. Unlike the previous literature, we consider optionality beyond surrendering the contract, focusing on popular guaranteed minimum withdrawal benefits (GMWBs) within variable annuities. We introduce different LSM variants, particularly the regression-now and regression-later approaches, and explore their viability and potential pitfalls. We commence our numerical analysis in a basic Black-Scholes framework, where we compare the LSM results to those from a discretization approach, and then extend the model to include various relevant risk factors and compare the results to those from the basic framework.
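As a minimal illustration of the regression-based ("regression-now") idea, the sketch below regresses simulated discounted payoffs on polynomial basis functions of the time-1 state to approximate time-1 values and a quantile-type capital requirement. The Black-Scholes dynamics, put-style liability, quantile level, and basis choice are placeholder assumptions, not the dissertation's model.

```python
import numpy as np

rng = np.random.default_rng(5)
n_paths = 20000
r, sigma, S0 = 0.02, 0.2, 100.0

# Outer scenarios to time 1 and one inner continuation to time 2 per scenario,
# as in least-squares Monte Carlo.
Z1, Z2 = rng.standard_normal((2, n_paths))
S1 = S0 * np.exp((r - 0.5 * sigma**2) + sigma * Z1)
S2 = S1 * np.exp((r - 0.5 * sigma**2) + sigma * Z2)

# Liability: a put-like payoff at time 2 (placeholder for the insurer's liability).
payoff = np.maximum(110.0 - S2, 0.0) * np.exp(-r)

# Regress discounted payoffs on polynomial basis functions of the time-1 state
# to approximate the time-1 value.
basis = np.column_stack([np.ones_like(S1), S1, S1**2, S1**3])
beta, *_ = np.linalg.lstsq(basis, payoff, rcond=None)
V1_hat = basis @ beta

# Risk measure on the estimated time-1 values, e.g. a 99.5% quantile.
var_995 = np.quantile(V1_hat, 0.995)
print("Estimated 99.5% quantile of the time-1 liability value:", round(float(var_995), 3))
```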
|
18 |
Computational Tools and Methods for Objective Assessment of Image Quality in X-Ray CT and SPECT. Palit, Robin, January 2012
Computational tools of use in the objective assessment of image quality for tomography systems were developed for central processing units (CPUs) and graphics processing units (GPUs) in the image quality lab at the University of Arizona. Fast analytic x-ray projection code called IQCT was created to compute the mean projection image for cone beam multi-slice helical computed tomography (CT) scanners. IQCT was optimized to take advantage of the massively parallel architecture of GPUs. CPU code for computing single photon emission computed tomography (SPECT) projection images was written calling upon previous research in the image quality lab. IQCT and the SPECT modeling code were used to simulate data for multimodality SPECT/CT observer studies. The purpose of these observer studies was to assess the benefit in image quality of using attenuation information from a CT measurement in myocardial SPECT imaging. The observer chosen for these studies was the scanning linear observer. The tasks for the observer were localization of a signal and estimation of the signal radius. For the localization study, the area under the localization receiver operating characteristic curve (A(LROC)) was computed as A(LROC)^Meas = 0.89332 ± 0.00474 and A(LROC)^No = 0.89408 ± 0.00475, where "Meas" implies the use of attenuation information from the CT measurement and "No" indicates the absence of attenuation information. For the estimation study, the area under the estimation receiver operating characteristic curve (A(EROC)) was quantified as A(EROC)^Meas = 0.55926 ± 0.00731 and A(EROC)^No = 0.56167 ± 0.00731. Based on these results, it was concluded that the use of CT information did not improve the scanning linear observer's ability to perform the stated myocardial SPECT tasks. The risk to the patient of the CT measurement was quantified in terms of excess effective dose as 2.37 mSv for males and 3.38 mSv for females. Another image quality tool generated within this body of work was a singular value decomposition (SVD) algorithm that reduces the dimension of the eigenvalue problem for tomography systems with rotational symmetry. Agreement between the results of this reduced-dimension SVD algorithm and those of a standard SVD algorithm is shown for a toy problem. The use of the SVD for image quality metrics such as the measurement and null spaces is also presented.
|
19 |
A Comparison of Data Transformations in Image Denoising. Michael, Simon, January 2018
The study of signal processing has wide applications in areas such as hi-fi audio, television, and voice recognition. Signals are rarely observed without noise, which obstructs our analysis. Hence, it is of great interest to study the detection, approximation, and removal of noise. In this thesis we compare two methods for image denoising, each based on a data transformation: specifically, the Fourier transform and the singular value decomposition are utilized in the respective methods and compared on grayscale images. The comparison is based on the visual quality of the resulting images, the maximum peak signal-to-noise ratios attainable by the respective methods, and their computational time. We find that the methods are fairly equal in visual quality. However, the method based on the Fourier transform scores higher in peak signal-to-noise ratio and demands considerably less computational time.
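A compact sketch of the two compared approaches follows, using a synthetic "image" in place of the thesis's grayscale test images: truncated-SVD reconstruction versus an ideal low-pass Fourier filter, scored by PSNR. The rank, frequency cutoff, and noise level are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(6)

# Synthetic "grayscale image": smooth low-rank structure plus additive noise.
x = np.linspace(0, 1, 128)
clean = np.outer(np.sin(4 * np.pi * x), np.cos(2 * np.pi * x))
noisy = clean + 0.3 * rng.standard_normal(clean.shape)

def psnr(ref, img):
    mse = np.mean((ref - img) ** 2)
    return 10 * np.log10((ref.max() - ref.min()) ** 2 / mse)

# SVD denoising: keep only the k largest singular values.
U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
k = 2
svd_denoised = (U[:, :k] * s[:k]) @ Vt[:k]

# Fourier denoising: zero out high-frequency coefficients (ideal low-pass filter).
F = np.fft.fftshift(np.fft.fft2(noisy))
mask = np.zeros_like(F, dtype=bool)
c, w = 64, 10
mask[c - w:c + w, c - w:c + w] = True
fft_denoised = np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

print("PSNR (noisy):", round(psnr(clean, noisy), 2))
print("PSNR (SVD):  ", round(psnr(clean, svd_denoised), 2))
print("PSNR (FFT):  ", round(psnr(clean, fft_denoised), 2))
```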
|
20 |
Time Series Decomposition Using Singular Spectrum Analysis. Deng, Cheng, 01 May 2014
Singular Spectrum Analysis (SSA) is a method for decomposing and forecasting time series that has recently seen major developments but is not yet routinely included in introductory time series courses. An international conference on the topic was held in Beijing in 2012. The basic SSA method decomposes a time series into a trend, a seasonal component, and noise. However, there are other, more advanced extensions and applications of the method, such as change-point detection or the treatment of multivariate time series. The purpose of this work is to understand the basic SSA method through its application to the monthly average sea temperature at a point on the coast of South America, near where the "El Niño" phenomenon originates, and to artificial time series simulated using harmonic functions. The output of the basic SSA method is then compared with that of other decomposition methods included in some time series courses, such as classical seasonal decomposition, X-11 decomposition using moving averages, and seasonal decomposition by Loess (STL).
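A minimal sketch of the basic SSA steps described above (embedding, SVD of the trajectory matrix, grouping, and diagonal averaging), applied to a simulated trend-plus-harmonic series rather than the sea-temperature data. The window length and the grouping of components into trend and seasonal parts are illustrative assumptions.

```python
import numpy as np

def ssa_components(y, L, groups):
    """Basic SSA: embed, decompose, and reconstruct grouped components."""
    N = len(y)
    K = N - L + 1
    X = np.column_stack([y[i:i + L] for i in range(K)])      # trajectory matrix (L x K)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    out = []
    for g in groups:
        Xg = (U[:, g] * s[g]) @ Vt[g]                        # sum of grouped elementary matrices
        # Diagonal averaging (Hankelization) back to a series of length N.
        comp = np.array([np.mean(Xg[::-1].diagonal(k)) for k in range(-L + 1, K)])
        out.append(comp)
    return out

# Simulated series: linear trend + monthly harmonic seasonality + noise.
t = np.arange(240)
y = 0.02 * t + np.sin(2 * np.pi * t / 12) + 0.3 * np.random.default_rng(7).standard_normal(240)

# Assumed grouping: leading component as trend, the next pair as the seasonal harmonic.
trend, seasonal = ssa_components(y, L=60, groups=[[0], [1, 2]])
print("Trend slope estimate:", round(float(np.polyfit(t, trend, 1)[0]), 4))
```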
|