11 |
Bias and Precision of the Squared Canonical Correlation Coefficient under Nonnormal Data Conditions
Leach, Lesley Ann Freeny 08 1900
This dissertation (a) investigated the degree to which the squared canonical correlation coefficient is biased in multivariate nonnormal distributions and (b) identified formulae that adjust the squared canonical correlation coefficient (Rc2) such that it most closely approximates the true population effect under normal and nonnormal data conditions. Five conditions were manipulated in a fully crossed design to determine the degree of bias associated with Rc2: distribution shape, variable sets, sample size to variable ratios, and within- and between-set correlations.
Very few of the condition combinations produced acceptable amounts of bias in Rc2, but those that did were all found with first function results. The sample size to variable ratio (n:v) was determined to have the greatest impact on the bias associated with Rc2 for the first, second, and third functions. The variable set condition also affected the accuracy of Rc2, but for the second and third functions only. The kurtosis levels of the marginal distributions (b2) and the between- and within-set correlations demonstrated little or no impact on the bias associated with Rc2. Therefore, it is recommended that researchers use n:v ratios of at least 10:1 in canonical analyses, although greater n:v ratios have the potential to produce even less bias. Furthermore, because it was determined that b2 did not impact the accuracy of Rc2, one can be somewhat confident that, with marginal distributions possessing homogeneous kurtosis levels ranging anywhere from -1 to 8, Rc2 will likely be as accurate as that resulting from a normal distribution.
Because the majority of Rc2 estimates were extremely biased, it is recommended that all Rc2 effects, regardless of the function from which they result, be adjusted using an appropriate adjustment formula. If no rationale exists for the use of another formula, the Rozeboom-2 would likely be a safe choice given that it produced the greatest number of unbiased Rc2 estimates for the greatest number of condition combinations in this study.
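The effect of the n:v ratio on Rc2 bias can be illustrated with a small simulation. The sketch below is not the dissertation's design: it assumes two independent normal variable sets (so the population Rc2 is exactly 0), counts v as the total number of variables in both sets, and uses hypothetical helper names. It applies no adjustment formula and does not implement Rozeboom-2.

```python
import numpy as np

def squared_canonical_correlations(X, Y):
    """Sample squared canonical correlations between column sets X and Y."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the two centered column spaces.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx' Qy are the sample canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0) ** 2

def mean_first_rc2(n, p=2, q=2, reps=200, seed=0):
    """Average first-function Rc2 over samples whose population Rc2 is 0."""
    rng = np.random.default_rng(seed)
    values = []
    for _ in range(reps):
        X = rng.normal(size=(n, p))
        Y = rng.normal(size=(n, q))  # independent of X, so the true Rc2 is 0
        values.append(squared_canonical_correlations(X, Y)[0])
    return float(np.mean(values))

# Assumption: v = p + q = 4, so n = 8 gives n:v = 2:1 and n = 40 gives 10:1.
low_ratio = mean_first_rc2(n=8)
high_ratio = mean_first_rc2(n=40)
print("mean first-function Rc2 at n:v = 2:1 :", low_ratio)
print("mean first-function Rc2 at n:v = 10:1:", high_ratio)
```

Because the true value is 0 in this setup, any positive average is pure upward bias, and it should shrink as the n:v ratio grows.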
|
12 |
Canonical correlation analysis of aggravated robbery and poverty in Limpopo Province
Rwizi, Tandanai 05 1900
The study was aimed at exploring the relationship between poverty and aggravated robbery in Limpopo Province. Sampled secondary data on aggravated robbery offenders, obtained from the South African Police (SAPS), Polokwane, were used in the analysis. Empirical research on poverty and crime suggests that poverty increases vulnerability to crime. The poverty set was categorised by gender, employment status, marital status, race, age and educational attainment. Variables for aggravated robbery were house robbery, bank robbery, street/common robbery, carjacking, truck hijacking, cash-in-transit and business robbery. Canonical correlation analysis was used to make some inferences about the relationship between these two sets. The results revealed a significant positive correlation of 0.219 (p-value = 0.025) between poverty and aggravated robbery at the five per cent significance level. Of the thirteen variables entered into the poverty-aggravated robbery model, five emerged as statistically significant. These were gender, marital status, employment status, common robbery and business robbery. / Mathematical Sciences / M. Sc. (Statistics)
|
13 |
Towards on-line domain-independent big data learning : novel theories and applications
Malik, Zeeshan January 2015
Feature extraction is an extremely important pre-processing step in pattern recognition and machine learning problems. This thesis highlights how one can best extract features from the data in an exhaustively online and purely adaptive manner. The solution to this problem is given for both labeled and unlabeled datasets, by presenting a number of novel on-line learning approaches. Specifically, the differential equation method for solving the generalized eigenvalue problem is used to derive a number of novel machine learning and feature extraction algorithms.
The incremental eigen-solution method is used to derive a novel incremental extension of linear discriminant analysis (LDA). Further, the proposed incremental version is combined with the extreme learning machine (ELM), in which the ELM is used as a pre-processor before learning. In this first key contribution, the dynamic random expansion characteristic of ELM is combined with the proposed incremental LDA technique, and shown to offer a significant improvement in maximizing the discrimination between points in two different classes, while minimizing the distance within each class, in comparison with other standard state-of-the-art incremental and batch techniques.
In the second contribution, the differential equation method for solving the generalized eigenvalue problem is used to derive a novel state-of-the-art purely incremental version of the slow feature analysis (SFA) algorithm, termed the generalized eigenvalue based slow feature analysis (GENEIGSFA) technique. Further, the time series expansion of the echo state network (ESN) and radial basis functions (RBF) are used as a pre-processor before learning. In addition, the higher order derivatives are used as a smoothing constraint in the output signal. Finally, an online extension of the generalized eigenvalue problem, derived from James Stone’s criterion, is tested, evaluated and compared with the standard batch version of the slow feature analysis technique, to demonstrate its comparative effectiveness.
In the third contribution, light-weight extensions of the statistical technique known as canonical correlation analysis (CCA), for both twinned and multiple data streams, are derived by using the same existing method of solving the generalized eigenvalue problem. Further, the proposed method is enhanced by maximizing the covariance between data streams while simultaneously maximizing the rate of change of variances within each data stream. The recurrent set of connections used by the ESN is used as a pre-processor between the inputs and the canonical projections in order to capture shared temporal information in two or more data streams.
A solution to the problem of identifying a low dimensional manifold on a high dimensional dataspace is then presented in an incremental and adaptive manner. Finally, an online locally optimized extension of Laplacian Eigenmaps is derived, termed the generalized incremental Laplacian eigenmaps technique (GENILE). Apart from exploiting the benefit of the incremental nature of the proposed manifold based dimensionality reduction technique, most of the time the projections produced by this method are shown to produce a better classification accuracy in comparison with standard batch versions of these techniques, on both artificial and real datasets.
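To make the link between CCA and the generalized eigenvalue problem concrete, here is a minimal batch sketch for two data streams. It is an illustration only: the thesis derives incremental, online solvers, whereas this version solves the standard textbook formulation in one shot with SciPy. The function name, the small ridge term `reg`, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
from scipy.linalg import eigh

def cca_generalized_eig(X, Y, reg=1e-8):
    """Batch CCA by solving A w = rho * B w (hypothetical helper)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n, p = X.shape
    q = Y.shape[1]
    # Within-set covariances (with a tiny ridge for stability) and cross-covariance.
    Sxx = X.T @ X / (n - 1) + reg * np.eye(p)
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(q)
    Sxy = X.T @ Y / (n - 1)
    # A holds the between-set blocks, B the within-set blocks.
    A = np.zeros((p + q, p + q))
    A[:p, p:] = Sxy
    A[p:, :p] = Sxy.T
    B = np.zeros((p + q, p + q))
    B[:p, :p] = Sxx
    B[p:, p:] = Syy
    eigvals, eigvecs = eigh(A, B)  # symmetric-definite generalized problem
    order = np.argsort(eigvals)[::-1]
    k = min(p, q)
    rho = eigvals[order[:k]]       # largest eigenvalues = canonical correlations
    W = eigvecs[:, order[:k]]
    return rho, W[:p], W[p:]

# Synthetic streams sharing one latent signal z (made-up data for illustration).
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X_demo = z @ np.array([[1.0, 0.5, 0.0]]) + 0.5 * rng.normal(size=(500, 3))
Y_demo = z @ np.array([[1.0, -1.0]]) + 0.5 * rng.normal(size=(500, 2))
rho, a, b = cca_generalized_eig(X_demo, Y_demo)
print("canonical correlations:", rho)
```

The eigenvalues of this problem come in plus and minus pairs, so only the largest min(p, q) are kept; the correlation between the projections `X_demo @ a[:, 0]` and `Y_demo @ b[:, 0]` should match `rho[0]` up to the effect of the ridge term.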
|
14 |
A framework for conducting mechanistic based reliability assessments of components operating in complex systems
Wallace, Jon Michael 02 December 2003
Reliability prediction of components operating in complex systems has historically been conducted in a statistically isolated manner. Current physics-based, i.e. mechanistic, component reliability approaches focus more on component-specific attributes and mathematical algorithms and not enough on the influence of the system. The result is that significant error can be introduced into the component reliability assessment process.
The objective of this study is the development of a framework that infuses the influence of the system into the process of conducting mechanistic-based component reliability assessments. The formulated framework consists of six primary steps. The first three steps, identification, decomposition, and synthesis, are qualitative in nature and employ system reliability and safety engineering principles for an appropriate starting point for the component reliability assessment.
The most distinctive steps of the framework are the step used to quantify the system-driven local parameter space and a subsequent step that uses this information to guide the reduction of the component parameter space. The local statistical space quantification step is accomplished using two newly developed multivariate probability tools: Multi-Response First Order Second Moment and Taylor-Based Inverse Transformation. Where existing joint probability models require preliminary statistical information about the responses, these models combine statistical information about the input parameters with an efficient sampling of the response analyses to produce the multi-response joint probability distribution.
Parameter space reduction is accomplished using Approximate Canonical Correlation Analysis (ACCA) employed as a multi-response screening technique. The novelty of this approach is that each individual local parameter and even subsets of parameters representing entire contributing analyses can now be rank ordered with respect to their contribution to not just one response, but the entire vector of component responses simultaneously.
The final step of the framework is the actual probabilistic assessment of the component. Variations of this final step are given to allow for the utilization of existing probabilistic methods such as response surface Monte Carlo and Fast Probability Integration.
The framework developed in this study is implemented to conduct the finite-element based reliability prediction of a gas turbine airfoil involving several failure responses. The framework, as implemented, resulted in a considerable improvement in the accuracy of the part reliability assessment and an increased statistical understanding of the component failure behavior.
|