171 |
New tools for unsupervised learning
Xiao, Ying 12 January 2015
In an unsupervised learning problem, one is given an unlabelled dataset and hopes to find some hidden structure; the prototypical example is clustering similar data. Such problems often arise in machine learning and statistics, but also in signal processing, theoretical computer science, and any number of quantitative scientific fields. The distinguishing feature of unsupervised learning is that there are no privileged variables or labels which are particularly informative, and thus the greatest challenge is often to differentiate between what is relevant or irrelevant in any particular dataset or problem.
In the course of this thesis, we study a number of problems which span the breadth of unsupervised learning. We make progress on Gaussian mixtures and independent component analysis (where we solve the open problem of underdetermined ICA), and we formulate and solve a feature selection/dimension reduction model. Throughout, our goal is to give finite sample complexity bounds for our algorithms; these are essentially the strongest type of quantitative bound that one can prove for such algorithms. Some of our algorithmic techniques turn out to be very efficient in practice as well.
Our major technical tool is tensor spectral decomposition: tensors are generalisations of matrices, and often allow access to the "fine structure" of data. Thus, they are often the right tools for unravelling the hidden structure in an unsupervised learning setting. However, naive generalisations of matrix algorithms to tensors run into NP-hardness results almost immediately, and thus to solve our problems, we are obliged to develop two new tensor decompositions (with robust analyses) from scratch. Both of these decompositions are polynomial time, and can be viewed as efficient generalisations of PCA extended to tensors.
|
172 |
Species of Science Studies
Armstrong, Paul 02 August 2013
Following Merton (1942), science studies has moved from the philosophy of science to a more sociologically minded analysis of scientific activity. This largely involves a shift away from questions that bear on the context of justification (a question of rationality and philosophy) to those that deal with the context of discovery. This thesis investigates changes in science studies in three papers: sociocultural evolutionary theories of scientific change; general trends in science studies, especially concerning the sociology of science; and a principal component analysis (PCA) that details the development of, and interaction between, research programmes in science studies. This thesis describes the proliferation of research programmes in science studies and uses evolutionary theory to make sense of the pattern of change.
|
173 |
Applications of Principal Component Analysis of Fluorescence Excitation-emission Matrices for Characterization of Natural Organic Matter in Water Treatment
Peleato, Nicolas Miguel 16 July 2013
Quantification of natural organic matter (NOM) in water is limited by the complex and varied nature of compounds found in natural waters. Current characterization techniques, which identify and quantify fractions of NOM, are often expensive and time-consuming, suggesting the need for rapid and accurate characterization methods. In this work, principal component analysis of fluorescence excitation-emission matrices (FEEM-PCA) was investigated as a NOM characterization technique. Through the use of jar tests and disinfection by-product formation tests, FEEM-PCA was shown to be a good surrogate for disinfection by-product precursors. FEEM-PCA was also applied to characterize differences in humic-like, protein-like, and Rayleigh scattering fractions between multiple source waters and due to differing treatment processes. A decrease in Rayleigh scattering influence was observed for a deep lake intake, and multiple processes were found to significantly affect the humic-like, protein-like, and Rayleigh scattering fractions.
|
176 |
Production and fractionation of antioxidant peptides from soy protein isolate using sequential membrane ultrafiltration and nanofiltration
Ranamukhaarachchi, Sahan January 2012
Antioxidants are molecules capable of stabilizing and preventing oxidation. Certain peptides (protein hydrolysates) have shown antioxidant capacities, which are obtained once the peptides are liberated from the native protein structure. Soy protein isolates (SPI) were enzymatically hydrolyzed by pepsin and pancreatin mixtures. The soy protein hydrolysates (SPH) were fractionated with sequential ultrafiltration (UF) and nanofiltration (NF) membrane steps. Heat pre-treatment of SPI at 95 °C for 5 min prior to enzymatic hydrolysis was investigated for its effect on peptide distribution and antioxidant capacity. SPH were subjected to UF with a 10 kDa molecular weight cut-off (MWCO) polysulfone membrane. UF permeate fractions (molecular weight below 10 kDa) were fractionated by NF with a thin film composite membrane (2.5 kDa MWCO) at pH 4 and 8. Similar peptide content and antioxidant capacity (α = 0.05) were obtained in control and pre-heated SPH when comparing the respective UF and NF permeate and retentate fractions produced. FCR antioxidant capacities of the SPH fractions were significantly lower than their ORAC antioxidant capacities, and the distribution among the UF and NF fractions was generally different. Most UF and NF fractions displayed higher antioxidant capacities than the crude SPI hydrolysates, showing the importance of molecular weight for the antioxidant capacity of peptides. The permeate fractions produced by NF at pH 8 displayed the highest antioxidant capacity, expressed in terms of Trolox equivalents (TE) per total solids (TS): 5562 μmol TE/g TS for control SPH, and 5187 μmol TE/g TS for pre-heated SPH. Because NF at pH 8 improved the antioxidant capacity of the peptides, the potential of NF as a viable industrial fractionation process was demonstrated.
Principal component analysis (PCA) of fluorescence excitation-emission matrix (EEM) data for UF and NF peptide fractions, followed by multi-linear regression analysis, was assessed for its potential to monitor and identify the contributions to ORAC and FCR, two in vitro antioxidant capacity assays, of SPH during membrane fractionation. Two statistically significant principal components (PCs) were obtained for UF and NF peptide fractions. Multi-linear regression models (MLRM) were developed to estimate their fluorescence and PCA-captured ORAC (ORAC-FPCA) and FCR (FCR-FPCA) antioxidant capacities. The ORAC-FPCA and FCR-FPCA antioxidant capacities for NF samples displayed strong, linear relationships at different pH conditions (R² > 0.99). Such relationships are believed to reflect the individual and relative combined contributions of tryptophan and tyrosine residues present in the SPH fractions to ORAC and FCR antioxidant capacities. Therefore, the proposed method provides a tool for the assessment of fundamental parameters of antioxidant capacities captured by ORAC and FCR assays.
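The PCA-then-regression workflow described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the four "channels" stand in for unfolded EEM intensities, and the helper names (`principal_axes`, `project`, `regress_on_scores`) are hypothetical rather than taken from the thesis. The sketch keeps two principal components, as the abstract reports, and regresses a simulated assay value on their scores.

```python
import math
import random

def principal_axes(rows, k, iters=300, seed=0):
    """Column means and first k principal axes of `rows`, found by
    power iteration with deflation on the sample covariance matrix."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    rng = random.Random(seed)
    axes = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(p)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigval = sum(v[i] * cov[i][j] * v[j] for i in range(p) for j in range(p))
        axes.append(v)
        # Deflate: subtract the direction just found so the next pass finds the next axis.
        cov = [[cov[i][j] - eigval * v[i] * v[j] for j in range(p)] for i in range(p)]
    return means, axes

def project(rows, means, axes):
    """Principal component scores: centred rows projected onto each axis."""
    return [[sum((r[j] - means[j]) * a[j] for j in range(len(means))) for a in axes]
            for r in rows]

def regress_on_scores(scores, y):
    """Least-squares fit of y on PC scores. The scores are centred and mutually
    uncorrelated, so each slope reduces to a separate one-dimensional fit."""
    y_mean = sum(y) / len(y)
    slopes = []
    for k in range(len(scores[0])):
        t = [s[k] for s in scores]
        slopes.append(sum(ti * (yi - y_mean) for ti, yi in zip(t, y))
                      / sum(ti * ti for ti in t))
    return y_mean, slopes

# Synthetic data (assumption): two hidden contributions a and b drive four
# fluorescence channels and a simulated antioxidant assay reading.
rng = random.Random(42)
loadings = [(1.0, 0.2), (0.3, 1.0), (0.8, 0.6), (0.5, 0.9)]
spectra, assay = [], []
for _ in range(60):
    a, b = rng.random(), rng.random()
    spectra.append([la * a + lb * b + rng.gauss(0.0, 0.02) for la, lb in loadings])
    assay.append(2.0 + 3.0 * a + 1.5 * b + rng.gauss(0.0, 0.05))

means, axes = principal_axes(spectra, k=2)
scores = project(spectra, means, axes)
intercept, slopes = regress_on_scores(scores, assay)
predicted = [intercept + sum(s * t for s, t in zip(slopes, row)) for row in scores]
ss_res = sum((y - f) ** 2 for y, f in zip(assay, predicted))
ss_tot = sum((y - intercept) ** 2 for y in assay)
r_squared = 1.0 - ss_res / ss_tot
print(f"R^2 = {r_squared:.3f}")
```

Regressing on scores rather than on the raw channels avoids the collinearity between neighbouring fluorescence intensities, which is presumably one reason PCA precedes the regression step here.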
|
177 |
Towards Finding Optimal Mixture Of Subspaces For Data Classification
Musa, Mohamed Elhafiz Mustafa 01 October 2003
In pattern recognition, when data has different structures in different parts of the input space, fitting one global model can be slow and inaccurate. Learning methods can quickly learn the structure of the data in local regions, consequently offering faster and more accurate model fitting. Breaking the training data set into smaller subsets may lead to the curse of dimensionality problem, as a training sample subset may not be enough for estimating the required set of parameters for the submodels. Increasing the size of the training data may not be feasible in many situations. Interestingly, the data in local regions becomes more correlated. Therefore, by decorrelation methods we can reduce data dimensions and hence the number of parameters. In other words, we can find uncorrelated low-dimensional subspaces that capture most of the data variability. The current subspace modelling methods have shown better performance than the global modelling methods for the given type of training data structure. Nevertheless, these methods still need more research work, as they suffer from two limitations:
- There is no standard method to specify the optimal number of subspaces.
- There is no standard method to specify the optimal dimensionality for each subspace.
In the current models these two parameters are determined beforehand. In this dissertation we propose and test algorithms that try to find a suboptimal number of principal subspaces and a suboptimal dimensionality for each principal subspace automatically.
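To make the second limitation concrete, the sketch below shows one common heuristic for picking a subspace dimensionality: keep the smallest number of principal components whose eigenvalues account for a target share of the local variance. This is not the dissertation's algorithm, which the abstract does not describe; the function name `choose_dimension`, the 90% threshold, and the eigenvalue spectra are all assumptions for illustration.

```python
def choose_dimension(eigenvalues, target=0.90):
    """Smallest number of leading components whose eigenvalues sum to
    at least `target` of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, lam in enumerate(ordered, start=1):
        running += lam
        if running / total >= target:
            return k
    return len(ordered)

# Hypothetical covariance eigenvalues for two local regions of the input space.
cluster_spectra = {
    "cluster_a": [5.0, 3.0, 0.5, 0.3, 0.2],
    "cluster_b": [8.0, 0.4, 0.3, 0.2, 0.1],
}
dims = {name: choose_dimension(spec) for name, spec in cluster_spectra.items()}
print(dims)
```

The threshold is itself a parameter fixed beforehand, so a rule like this only moves the problem the abstract describes rather than solving it.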
|
178 |
Application of Nonmetric Principal Component Analysis to Categorical Data
村上, 隆, Murakami, Takashi 26 December 1997
Uses content digitized by the National Institute of Informatics.
|
179 |
Scale and Turbulence Distribution of the Vortex Structures Formed by the Impingement of a Two-Dimensional Jet on a Flat Plate Placed Parallel to It
河合, 勇太, KAWAI, Yuta, 辻, 義之, TSUJI, Yoshiyuki, 久木田, 豊, KUKITA, Yutaka 04 1900
No description available.
|
180 |
The Diversity of Variations in the Spectra of Type Ia Supernovae
Wagers, Andrew James 2012 August 1900
Type Ia supernovae (SNe Ia) are currently the best probe of the expansion history of the universe. Their usefulness is due chiefly to their uniformity between supernovae (SNe). However, there are some slight variations amongst SNe that have yet to be understood and accounted for. The goal of this work is to uncover relationships between the spectral features and the light curve decline rate, Δm₁₅. Wavelet decomposition has been used to develop a new spectral index that measures spectral line strengths independently of the continuum and is easily corrected for noise. This new method yields consistent results without the arbitrary uncertainties introduced by current methods and is particularly useful for spectra which do not have a clearly defined continuum. These techniques are applied to SN Ia spectra and correlations are found between the spectral features and light curve decline rate. The wavelet spectral indexes are used to measure the evolution of spectral features, which are characterized by 3 or 4 parameters for the most complicated evolution. The three absorption features studied here are associated with sulfur and silicon, and all show a transition in strength between 1 and 2 weeks after B-band maximum. Pearson correlation coefficients between spectral features and Δm₁₅ are found to be significant within a week of maximum brightness and 3 to 4 weeks post-maximum. These correlations are used to determine the principal components at each epoch among the set of SN spectra in this work. The variation contained in the first principal component (PC1) is found to be greater than 60% to 70% for most epochs and reaches as high as 80% to 90% for epochs with the highest correlations. The same first principal component can be used to relate spectral feature strengths to the decline rate.
These relations were used to estimate a SN light curve decline rate from a set of spectra taken over the course of the explosion, from a single spectrum, or from even a single spectral feature. These relationships could be used for future surveys to estimate spectral characteristics from light curve data, such as photometric redshift.
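The share of variation carried by PC1 can be computed from a Pearson correlation matrix as the largest eigenvalue divided by the number of features, since the trace of a correlation matrix equals its dimension. The sketch below illustrates this on synthetic data only; the three feature strengths, their coefficients, and the noise level are assumptions, not values from the thesis.

```python
import math
import random

def pearson_matrix(rows):
    """Pearson correlation matrix of the columns of `rows`."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    norms = [math.sqrt(sum(c[j] ** 2 for c in centred)) for j in range(p)]
    return [[sum(c[i] * c[j] for c in centred) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]

def leading_eigenvalue(matrix, iters=300, seed=0):
    """Largest eigenvalue of a symmetric positive semi-definite matrix,
    estimated by power iteration from a random starting vector."""
    p = len(matrix)
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return sum(v[i] * matrix[i][j] * v[j] for i in range(p) for j in range(p))

# Synthetic sample (assumption): three spectral feature strengths that each
# depend linearly on a simulated decline rate, plus measurement noise.
rng = random.Random(1)
coefficients = [1.0, -0.8, 0.6]
features = []
for _ in range(40):
    decline_rate = rng.uniform(0.8, 1.8)
    features.append([c * decline_rate + rng.gauss(0.0, 0.1) for c in coefficients])

corr = pearson_matrix(features)
pc1_fraction = leading_eigenvalue(corr) / len(corr)
print(f"PC1 carries {100 * pc1_fraction:.0f}% of the variation")
```

When every feature tracks a single underlying quantity, as in this toy setup, PC1 dominates; weaker correlations spread the variation over more components, consistent with the abstract's observation that the PC1 share is highest at the epochs with the strongest correlations.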
|