101 |
A Contribution To Modern Data Reduction Techniques And Their Applications By Applied Mathematics And Statistical Learning
Sakarya, Hatice 01 January 2010 (has links) (PDF)
High-dimensional data arise in areas ranging from digital image processing, gene expression microarrays and neuronal population activities to financial time series. Dimensionality reduction, that is, extracting low-dimensional structure from high-dimensional data, is a key problem in many areas such as information processing, machine learning, data mining, information retrieval and pattern recognition, where data reduction techniques are applied. In this thesis we give a survey of modern data reduction techniques, representing the state of the art in theory, methods and applications, by introducing the language of mathematics there. This requires special care concerning questions such as how to understand discrete structures as manifolds, how to identify their structure, how to prepare the dimension reduction, and how to face complexity in the algorithmic methods. Special emphasis is given to the Principal Component Analysis, Locally Linear Embedding and Isomap algorithms. These algorithms are studied by a research group from Vilnius, Lithuania, by Zeev Volkovich from the Software Engineering Department, ORT Braude College of Engineering, Karmiel, and by others. The main purpose of this study is to compare the results of the three algorithms. In making the comparison we focus on the results and the duration.
|
102 |
Functional data analysis: classification and regression
Lee, Ho-Jin 01 November 2005 (has links)
Functional data refer to data which consist of observed functions or curves evaluated
at a finite subset of some interval. In this dissertation, we discuss statistical
analysis, especially classification and regression when data are available in function
forms. Due to the nature of functional data, one considers function spaces in presenting
such type of data, and each functional observation is viewed as a realization
generated by a random mechanism in the spaces. The classification procedure in
this dissertation is based on dimension reduction techniques of the spaces. One commonly
used method is Functional Principal Component Analysis (Functional PCA) in
which eigendecomposition of the covariance function is employed to find the directions
of highest variability of the data in the function space. The reduced space of
functions spanned by a few eigenfunctions is thought of as a space where most of the
features of the functional data are contained. We also propose a functional regression
model for scalar responses. Infinite dimensionality of the spaces for a predictor causes
many problems, and one such problem is that there are infinitely many solutions. The
space of the parameter function is restricted to Sobolev-Hilbert spaces, and the
so-called ε-insensitive loss function is utilized. As a robust technique of
function estimation, we present a way to find a function that has at most ε deviation
from the observed values and at the same time is as smooth as possible.
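The ε-insensitive (e-insensitive) loss named in this abstract can be illustrated with a short sketch. It assumes the standard definition max(0, |y - f(x)| - ε), which the abstract does not spell out; the function names below are hypothetical and are not taken from the dissertation.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Assumed standard form: zero inside the eps-tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def mean_eps_insensitive_loss(ys, preds, eps):
    """Average the eps-insensitive loss over paired observations."""
    if len(ys) != len(preds) or not ys:
        raise ValueError("ys and preds must be non-empty and of equal length")
    total = sum(eps_insensitive_loss(y, p, eps) for y, p in zip(ys, preds))
    return total / len(ys)
```

Under this definition, deviations of at most ε cost nothing, which matches the abstract's goal of a function that stays within ε of the observed values; the smoothness requirement would enter through a separate penalty term that is not sketched here.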
|
103 |
Computational and experimental investigation of the enzymatic hydrolysis of cellulose
Bansal, Prabuddha 25 August 2011 (has links)
The enzymatic hydrolysis of cellulose to glucose by cellulases is one of the major steps in the conversion of lignocellulosic biomass to biofuel. This hydrolysis by cellulases, a heterogeneous reaction, currently suffers from some major limitations, most importantly a dramatic rate slowdown at high degrees of conversion in the case of crystalline cellulose. Various rate-limiting factors were investigated employing experimental as well as computational studies. Cellulose accessibility and the hydrolysable fraction of accessible substrate (a previously undefined and unreported quantity) were shown to decrease steadily with conversion, while cellulose reactivity, defined in terms of hydrolytic activity per amount of actively adsorbed cellulase, remained constant. Faster restart rates were observed on partially converted cellulose as compared to uninterrupted hydrolysis rates, supporting the presence of an enzyme clogging phenomenon.
Cellulose crystallinity is a major substrate property affecting the rates, but its quantification has suffered from lack of consistency and accuracy. Using multivariate statistical analysis of X-ray data from cellulose, a new method to determine the degree of crystallinity was developed. Cel7A CBD is a promising target for protein engineering as cellulose pretreated with Cel7A CBDs exhibits enhanced hydrolysis rates resulting from a reduction in crystallinity. However, for Cel7A CBD, a high throughput assay is unlikely to be developed. In the absence of a high throughput assay (required for directed evolution) and extensive knowledge of the role of specific protein residues (required for rational protein design), the mutations need to be picked wisely, to avoid the generation of inactive variants. To tackle this issue, a method utilizing the underlying patterns in the sequences of a protein family has been developed.
|
104 |
Användarverifiering från webbkamera [User verification from a webcam]
Alajarva, Sami January 2007 (has links)
The work presented in this report concerns face recognition from webcams using principal component analysis and feedforward artificial neural networks. The work improves the technique by means of filter-based methods that are used, among other things, in face detection. These filters are based on passing along redundant data from subregions of the face.
|
105 |
High-dimensional classification for brain decoding
Croteau, Nicole Samantha 26 August 2015 (has links)
Brain decoding involves the determination of a subject’s cognitive state or an associated stimulus from functional neuroimaging data measuring brain activity. In this setting the cognitive state is typically characterized by an element of a finite set, and the neuroimaging data comprise voluminous amounts of spatiotemporal data measuring some aspect of the neural signal. The associated statistical problem is one of classification from high-dimensional data. We explore the use of functional principal component analysis, mutual information networks, and persistent homology for examining the data through exploratory analysis and for constructing features characterizing the neural signal for brain decoding. We review each approach from this perspective, and we incorporate the features into a classifier based on symmetric multinomial logistic regression with elastic net regularization. The approaches are illustrated in an application where the task is to infer from brain activity measured with magnetoencephalography (MEG) the type of video stimulus shown to a subject. / Graduate
|
106 |
Optimization of an array of peptidic indicator displacement assays for the discrimination of cabernet sauvignon wines
Chong, Sally 06 January 2011 (has links)
The research project, Optimization of an Array of Peptidic Indicator Displacement Assays for the Discrimination of Cabernet Sauvignon Wines, describes the multistep laboratory trials conducted to optimize an array of ensembles composed of synthesized peptides and PCV:Cu²⁺ complexes for the differentiation of seven Cabernet Sauvignon wines with different tannin levels. This report also includes the methods and analysis used. The results were interpreted using principal component analysis. / text
|
107 |
Computerized model to forecast low-cost housing demand in urban area in Malaysia using Artificial Neural Networks (ANN)
Zainun, Noor Y. B. January 2011 (has links)
The proportion of urban population to total population in Malaysia is forecast to increase steadily from 26% in 1965 to 70% in 2020. Therefore, there is a need to fully appreciate the legacy of the urbanization of Malaysia by providing affordable housing. The main aim of this study is to develop a model to forecast the demand for low-cost housing in urban areas. The study focuses on eight states in Peninsular Malaysia, as most of these states are among the areas predicted to have achieved the highest urbanization level in the country. The states are Kedah, Penang, Perlis, Kelantan, Terengganu, Perak, Pahang and Johor. Monthly time-series data covering six to eight years for nine indicators (population growth, birth rate, child mortality rate, unemployment rate, household income rate, inflation rate, GDP, poverty rate and housing stock) were used to forecast the demand for low-cost housing using an Artificial Neural Network (ANN) approach. The data were collected from the Department of Malaysian Statistics, the Ministry of Housing and the Housing Department of the State Secretary. The Principal Component Analysis (PCA) method was adopted to analyze the data using the SPSS 18.0 package. The performance of the neural network is evaluated using R-squared (R²), and the accuracy of the model is measured using the Mean Absolute Percentage Error (MAPE). Lastly, a user-friendly interface was developed using Visual Basic. From the results, it was found that the best neural network to forecast the demand for low-cost housing in Kedah is 2-16-1, Pahang 2-15-1, Kelantan 2-25-1, Terengganu 2-30-1, Perlis 3-5-1, Pulau Pinang 3-7-1, Johor 3-38-1 and Perak 3-24-1.
In conclusion, the evaluation of the model through the MAPE value shows that the NN model can forecast the low-cost housing demand very well in Pulau Pinang, Johor, Pahang and Kelantan, and well in Kedah and Terengganu, while in Perlis and Perak it is not accurate due to the lack of data. The study has successfully developed a user-friendly interface to retrieve and view all the data easily.
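The two evaluation measures named in this abstract, R-squared and MAPE, can be sketched as follows. This is a minimal illustration that assumes the usual textbook definitions; the function names are hypothetical, and nothing here reproduces the thesis's own data or software.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; actual values must be nonzero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

A lower MAPE indicates a more accurate forecast, while an R-squared closer to 1 indicates that the forecasts explain more of the variation in the observed demand.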
|
108 |
Do Self-Sustainable MFI:s help alleviate relative poverty?
Stenbäcken, Rasmus January 2006 (has links)
The subject of this paper is microfinance and the question: do self-sustainable MFIs alleviate poverty? An MFI is a microfinance institution: a regular bank or an NGO that has transformed into a licensed financial institution, focused on microenterprises. To answer the question, data were gathered in Ecuador, South America. South America has a large number of self-sustainable MFIs. Ecuador was selected as the country to be studied because it has an intermediate level of market penetration in the microfinance sector. Interviews were used to determine relative poverty before and after access to microcredit. The data retrieved in the interviews were used to determine the impact of microcredit on different aspects of relative poverty using the Difference in Difference method. Significant differences are found between old and new clients as well as for the change over time, but no significant results are found for the difference in change over time for clients compared to non-clients. The author argues that the insignificant result can be a result of too small a sample size, of disturbances in the sample selection, or of this specific kind of institution having little or no effect on the current clients' economic development.
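The Difference in Difference estimate referred to in this abstract can be sketched in a few lines. The sketch assumes the simplest two-group, two-period form (change for clients minus change for non-clients); the function names are hypothetical, and the paper's own specification may differ.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def difference_in_differences(treated_before, treated_after,
                              control_before, control_after):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change
```

An estimate near zero means the clients' outcome changed over time about as much as the non-clients' outcome did, which is the kind of insignificant result the abstract reports for that comparison.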
|
109 |
Dimensions of Women’s Empowerment and Their Influence on the Utilization of Maternal Health Services in an Egyptian Village: A Multivariate Analysis
AOYAMA, ATSUKO, SANEYA RIZK EL BANNA, NAGAH MAHMOUD ABDOU, CHIANG, CHIFA, KAWAGUCHI, LEO, INASS HELMY HASSAN ELSHAIR, NAWAL ABDEL MONEIM FOUAD 02 1900 (has links)
No description available.
|
110 |
Logistic Regression Analysis to Determine the Significant Factors Associated with Substance Abuse in School-Aged Children
Maxwell, Kori Lloyd Hugh 17 April 2009 (has links)
Substance abuse is the overindulgence in and dependence on a drug or chemical leading to detrimental effects on the individual’s health and the welfare of those surrounding him or her. Logistic regression analysis is an important tool used in the analysis of the relationship between various explanatory variables and nominal response variables. The objective of this study is to use this statistical method to determine the factors which are considered to be significant contributors to the use or abuse of substances in school-aged children, and also to determine what measures can be implemented to minimize their effect. The logistic regression model was used to build models for the three main types of substances considered in this study (tobacco, alcohol and drugs), and this facilitated the identification of the significant factors which seem to influence their use in children.
|