91 |
Extensions of biplot methodology to discriminant analysis with applications of non-parametric principal components
Gardner, Sugnet January 2001
Dissertation (PhD)--Stellenbosch University, 2001. / ENGLISH ABSTRACT: Gower and Hand offer a new perspective on the traditional biplot. This perspective
provides a unified approach to principal component analysis (PCA) biplots based on
Pythagorean distance; canonical variate analysis (CVA) biplots based on Mahalanobis
distance; non-linear biplots based on Euclidean embeddable distances as well as
generalised biplots for use with both continuous and categorical variables.
The biplot methodology of Gower and Hand is extended and applied in statistical
discrimination and classification. This leads to discriminant analysis by means of PCA
biplots, CVA biplots, non-linear biplots as well as generalised biplots. Properties of these
techniques are derived in detail. Classification regions defined for linear discriminant
analysis (LDA) are applied in the CVA biplot leading to discriminant analysis using biplot
methodology. Situations where the assumptions of LDA are not met are considered, and
various existing alternative discriminant analysis procedures are formulated in terms of
biplots: apart from PCA biplots, QDA, FDA and DSM biplots are defined and constructed,
and their usage is illustrated.
It is demonstrated that biplot methodology naturally provides for managing categorical and
continuous variables simultaneously. It is shown through a simulation study that the
techniques based on biplot methodology can be applied successfully to the reversal
problem with categorical variables in discriminant analysis.
Situations occurring in practice where existing discriminant analysis procedures based on
distances from means fail are considered. After discussing self-consistency and principal
curves (a form of non-parametric principal components), discriminant analysis based on
distances from principal curves (a form of conditional mean) is proposed. This biplot
classification procedure based upon principal curves yields much better results.
Bootstrapping is considered as a means of describing variability in biplots. Variability in
samples as well as of axes in biplot displays receives attention. Bootstrap α-regions are
defined and the ability of these regions to describe biplot variability and to detect outliers
is demonstrated. Robust PCA and CVA biplots restricting the role of influential
observations on biplot displays are also considered.
An extensive library of S-PLUS computer programmes is provided for implementing the
various discriminant analysis techniques that were developed using biplot methodology.
The application of the above theoretical developments and computer software is illustrated
by analysing real-life data sets. Biplots are used to investigate the degree of capital
intensity of companies and to serve as an aid in risk management of a financial institution.
A particular application of the PCA biplot is the TQI biplot used in industry to determine
the degree to which manufactured items comply with multidimensional specifications. A
further interesting application is to determine whether an Old-Cape furniture item is
manufactured of stinkwood or embuia. A data set provided by the Western Cape Nature
Conservation Board consisting of measurements of tortoises from the species Homopus
areolatus is analysed by means of biplot methodology to determine if morphological
differences exist among tortoises from different geographical regions. Allometric
considerations need to be taken into account and the resulting small sample sizes in some
subgroups severely limit the use of conventional statistical procedures.
Biplot methodology is also applied to classification in a diabetes data set illustrating the
combined advantage of using classification with principal curves in a robust biplot or
biplot classification where covariance matrices are unequal. A discriminant analysis
problem where foraging behaviour of deer might eventually result in a change in the
dominant plant species is used to illustrate biplot classification of data sets containing both
continuous and categorical variables. As an example of the use of biplots with large data
sets, a data set consisting of 16828 lemons is analysed using biplot methodology to
investigate differences in fruit from various areas of production, cultivars and rootstocks.
The proposed α-bags also provide a means of quantifying the graphical overlap among
classes. This method is successfully applied to a multidimensional socio-economic data
set to quantify the degree of overlap among different race groups.
The application of the proposed biplot methodology in practice has an important byproduct:
it provides the impetus for many a new idea, e.g. applying a PCA biplot in
industry led to the development of quality regions; α-bags were constructed to represent
thousands of observations in the lemons data set, in turn leading to a means of quantifying
the degree of overlap. This illustrates the enormous flexibility of biplots: biplot
methodology provides an infrastructure for many novelties when applied in practice. / AFRIKAANSE OPSOMMING: Gower en Hand bied 'n nuwe perspektief op die tradisionele bistipping. Hierdie
perspektief verskaf 'n uniforme benadering tot hoofkomponent analise (HKA) bistippings
gebaseer op Pythagoras-afstand; kanoniese veranderlike analise (KVA) bistippings
gebaseer op Mahalanobis-afstand; nie-lineêre bistippings gebaseer op Euclidies inbedbare
afstande sowel as veralgemeende bistippings vir gebruik wanneer beide kontinue en
kategoriese veranderlikes voorkom.
Die bistippingsmetodologie van Gower en Hand word uitgebrei en toegepas in statistiese
diskriminasie en klassifikasie. Dit lei tot diskriminantanalise met behulp van HKA
bistippings, KVA bistippings, nie-lineêre bistippings sowel as veralgemeende bistippings.
Die eienskappe van hierdie tegnieke word in besonderhede afgelei. Die toepassing van
die konsep van 'n klassifikasiegebied in die KVA bistipping baan die weg vir lineêre
diskriminantanalise (LDA) met behulp van bistippingsmetodologie. Situasies waar daar
nie aan die aannames van LDA voldoen word nie kry aandag en verskeie bestaande
alternatiewe diskriminantanalise prosedures word in terme van bistippings geformuleer en
naas HKA bistippings, word QDA, FDA en DSM bistippings gedefinieer, gekonstrueer en
hul gebruike gedemonstreer.
Dit word aangetoon dat bistippingsmetodologie op 'n natuurlike wyse voorsiening maak om
kategoriese veranderlikes en kontinue veranderlikes gelyktydig te hanteer. Daar word met
behulp van 'n simulasie-studie aangetoon dat tegnieke gebaseer op die
bistippingsmetodologie wat ontwikkel is, suksesvol by die sogenaamde
omkeringsprobleem by diskriminantanalise met kategoriese veranderlikes gebruik kan
word.
Verder word aangevoer dat daar baie praktiese situasies voorkom waar bestaande
prosedures van diskriminantanalise faal omdat dit op afstande vanaf gemiddeldes gebaseer
is. Na 'n bespreking van self-konsekwentheid en hoofkrommes ('n vorm van nie-parametriese
hoofkomponente) word voorgestel om diskriminantanalise op afstand vanaf hoofkrommes
('n vorm van 'n voorwaardelike gemiddelde) te baseer. Sodoende is 'n
bistippingklassifikasie prosedure wat op afstand vanaf hoofkrommes gebaseer is en wat
baie beter resultate lewer, ontwikkel.
Die variasie in die posisies van datapunte in die bistipping sowel as van die bistippingsasse
word bestudeer met behulp van skoenlusmetodes. 'n Skoenlus α-gebied word gedefinieer
en dit word gedemonstreer hoe so 'n α-gebied aangewend kan word om variasie in
bistippings te beskryf en wegleers te identifiseer. Robuuste HKA en KVA bistippings wat
die rol van invloedryke waarnemings op die bistipping beperk, word bespreek.
'n Omvangryke biblioteek van S-PLUS rekenaarprogramme is geskryf vir die
implementering van die verskillende diskriminantanalise tegnieke wat met behulp van
bistippingsmetodologie ontwikkel is. Die toepassing van die voorafgaande teoretiese
ontwikkelinge en rekenaarprogramme word geïllustreer aan die hand van werklike
datastelle vanuit die praktyk. So word bistippings gebruik om die mate van
kapitaalintensiteit van ondernemings te ondersoek en om as hulpmiddel by risikobestuur
van 'n finansiële instelling te dien. 'n Besondere toepassing van die HKA bistipping is die
TQI bistipping wat in die industriële omgewing gebruik word ten einde te bepaal tot watter
mate vervaardigde artikels aan neergelegde meerdimensionele spesifikasies voldoen. 'n
Verdere interessante toepassing is om te bepaal of 'n Ou-Kaapse meubelstuk van stinkhout
of embuia gemaak is. 'n Datastel verskaf deur Wes-Kaap Natuurbewaring in verband met
die bekende padloper skilpad, Homopus areolatus, is met behulp van bistippings
geanaliseer om te bepaal of daar morfometriese verskille tussen die padlopers afkomstig
van bepaalde geografiese gebiede is. Allometriese beginsels moes ook in ag geneem word
en die min waarnemings in sommige van die subgroepe het tot gevolg dat konvensionele
statistiese tegnieke nie sonder meer gebruik kan word nie.
Die bistippingsmetodologie is ook toegepas op klassifikasie by 'n diabetes datastel om die
gekombineerde gebruik van hoofkrommes in 'n robuuste bistipping te illustreer en
bistippingklassifikasie waar daar sprake van ongelyke kovariansiematrikse is. 'n
Diskriminantanalise probleem waar die weidingsvoorkeure van wildsbokke 'n verandering
in die dominante plantegroei tot gevolg kan hê, word gebruik om bistippingklassifikasie
met data waar kontinue sowel as kategoriese veranderlikes verskaf word, te illustreer. As
voorbeeld van die gebruik van bistippings by 'n groot datastel is 'n datastel bestaande uit
waarnemings van 16828 suurlemoene met behulp van bistippingsmetodologie geanaliseer
ten einde verskille in vrugte afkomstig van verskillende produsente-streke, kultivars en
onderstamme te ondersoek. Die α-sakkies wat hier ontwikkel is, lei tot kwantifisering van
die grafiese oorvleueling van groepe. Hierdie beginsel word suksesvol toegepas in 'n
meerdimensionele sosio-ekonomiese datastel om die mate van oorvleueling van
verskillende bevolkingsgroepe te kwantifiseer.
Die toepassing van die voorgestelde bistippingsmetodologie in die praktyk lei tot 'n
belangrike newe-produk: Dit verskaf die stimulus tot die ontstaan van nuwe idees,
byvoorbeeld, die toepassing van 'n HKA bistipping in 'n industriële omgewing het tot die
ontwikkeling van die konsep van 'n kwaliteitsgebied aanleiding gegee; α-sakkies is
gekonstrueer om duisende waarnemings in die suurlemoendatastel te verteenwoordig wat
weer gelei het tot 'n metode om die graad van oorvleueling te kwantifiseer. Hierdeur is die
geweldige veelsydigheid van bistippings geïllustreer - bistippingsmetodologie verskaf die
infrastruktuur vir baie vindingryke toepassings in die praktyk.
|
92 |
Scales of macroinvertebrate-habitat relationships in fluvial systems, a case study of the River Frome
Cannan, Caroline Elizabeth January 1998
No description available.
|
93 |
General multivariate approximation techniques applied to the finite element method
Hassoulas, Vasilios 26 January 2015
No description available.
|
94 |
Approche métabolomique par résonance magnétique nucléaire du proton dans l'évaluation des hépatopathies stéatosiques non alcooliques et dans le suivi d'un traitement curatif du carcinome hépatocellulaire / 1H NMR-Metabolomics approaches in the assessment of the non-alcoholic fatty liver diseases and in the follow-up of the hepatocellular carcinoma curative treatment
Goossens, Corentine 10 December 2015
Les atteintes hépatiques, asymptomatiques pour la plupart d’entre elles et pouvant évoluer vers des complications sévères telles que le carcinome hépatocellulaire (CHC) sont responsables de plus de 15 000 décès par an en France. Le manque de marqueurs cliniques et biologiques fiables pour déterminer le degré de sévérité de l’hépatopathie ainsi que pour reconnaître les stades précoces du CHC constitue actuellement un obstacle majeur à une prise en charge optimale de la maladie. Grâce aux approches de type métabolomique et aux techniques analytiques telles que la résonance magnétique nucléaire, il est désormais possible d’obtenir une véritable cartographie des métabolites d’un individu. L’objectif de ce travail a été d’explorer, par une approche RMN métabolomique, les changements métaboliques dans le foie et dans le sérum causés par différentes pathologies hépatiques afin de proposer de nouvelles pistes dans l’amélioration du diagnostic et de la prise en charge de ces maladies. Une attention particulière a également été donnée à l’étude de la validité des paramètres de qualité des modèles de discrimination réalisés lors des analyses statistiques des données multivariées. / Most liver diseases remain asymptomatic and can progress to severe complications such as hepatocellular carcinoma (HCC); they are responsible for more than 15,000 deaths per year in France. Liver diseases are therefore a major public health concern. Clinicians lack non-invasive biomarkers that would allow them to better identify the stages of liver disease, detect the first signs of HCC and thereby improve clinical prognosis. Identifying new biomarkers sets new challenges for translational research aimed at refining prognosis and adapting therapeutic procedures. Proton nuclear magnetic resonance spectroscopy-based metabolomics makes it possible to identify and quantify such metabolites by defining individual metabolic fingerprints. The first part of this work explored the metabolic modifications of liver tissue in order to establish profiles of disease stages. The second part assessed metabolic variations in HCC patients by analysing sequential serum samples taken before and after curative radiofrequency ablation treatment. The third and last part focused on validating the quality parameters of the discriminant models used in multivariate statistical analysis.
|
95 |
Testing of homogeneity in distributions with ordered categories.
January 1995
by Lee Chi-ming. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1995. / Includes bibliographical references (leaves 103-104). / Chapter 1 --- Introduction / Chapter 2 --- Three Underlying Distributions / Chapter 2.1 --- Exponential distribution / Chapter 2.1.1 --- Estimation of Thresholds / Chapter 2.1.2 --- Estimation of Parameter / Chapter 2.1.3 --- Simulation Study / Chapter 2.2 --- Normal distribution / Chapter 2.2.1 --- Estimation of Thresholds / Chapter 2.2.2 --- Estimation of Parameter / Chapter 2.2.3 --- Simulation Study / Chapter 2.3 --- Weibull distribution / Chapter 2.3.1 --- Estimation of Thresholds / Chapter 2.3.2 --- Estimation of Parameter / Chapter 2.3.3 --- Simulation Study / Chapter 3 --- Goodness-of-fit Test / Chapter 3.1 --- Test for the Exponential distribution / Chapter 3.2 --- Test for the Normal distribution / Chapter 3.3 --- Test for the Weibull distribution / Chapter 3.4 --- Implication / Chapter 3.5 --- An Artificial Example / Chapter 3.5.1 --- Case 1 (s= 3) / Chapter 3.5.2 --- Case 2 (s= 4) / Chapter 4 --- Real Data Illustration / Chapter 4.1 --- Test for the Exponential distribution / Chapter 4.2 --- Test for the Normal distribution / Chapter 4.3 --- Test for the Weibull distribution / Chapter 4.4 --- Inferences from the Exponential distribution / Chapter 4.5 --- Inferences from the Normal distribution / Chapter 4.6 --- Inferences from the Weibull distribution / Chapter 5 --- Conclusion / Bibliography
|
96 |
Short-time independent component analysis for blind separation of speech sources. / CUHK electronic theses & dissertations collection
January 2007
Independent Component Analysis (ICA) has long been regarded as a powerful technique for speech source separation. In practice, however, moving speakers or reverberant environments may require ICA to be applied over short time intervals, which undermines ICA's fundamental assumption that the sources are independent. This leads to two important but often overlooked problems: (1) excursion of the global optimum from the desired solution and (2) diffusion of local optima in the search for the de-mixing matrix. These two problems occur in most practical situations and greatly degrade the performance of existing ICA algorithms. / Based on insight into how the input sources and the mixing channel affect these problems, three basic short-time Local Optima Distribution (LOD) types are investigated. Information is derived from the characteristics of these LOD types for: (1) choosing a simultaneous or sequential ICA algorithm; (2) shrinking the feasible search region; and (3) producing possible initial points in the search for the de-mixing matrix. As a result, the technique of LOD-based ICA is developed in this thesis to assign different procedures according to the LOD type of the observed short-time mixtures. The analytical and simulation results demonstrate that more accurate de-mixing matrix estimates can be obtained, thereby improving separation performance. / Among the three LOD types, the Dominant LOD proves comparatively more efficient at yielding accurate separation. The production mechanism of the Dominant LOD indicates that a higher energy ratio of sources helps to build this type of LOD. Given the sparse energy distribution of speech signals in the time-frequency domain, the Dominant LOD may arise in some short-time subbands even when the fullband signal exhibits a Non-dominant LOD. The proposed LOD-based ICA is therefore extended to frequency subbands to create more opportunities to attain the Dominant LOD type. / The effectiveness of the proposed short-time LOD-based ICA is validated by applying it to a speaker-moving model and to a mixing system with abrupt changes, which are closer to practical applications because the mixing system is not always constant as in the standard ICA model. The separation task with noise-contaminated speech signals is also explored. The results suggest that, besides long-time analysis, short-time analysis may provide an alternative means of separation, with extra information, when the independence information is impaired and fails to yield the desired separation performance. / Zhang, Jing. / "July 2007." / Adviser: Ching Pak Chung. / Source: Dissertation Abstracts International, Volume: 69-01, Section: B, page: 0579. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2007. / Includes bibliographical references. / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307.
|
97 |
Empirical investigation of the performance of Mplus for analyzing structural equation model with mixed continuous and ordered categorical variables.
January 2003
Lam Ho-Suen Joffee. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2003. / Includes bibliographical references (leaf 40). / Abstracts in English and Chinese. / Chapter 1 --- Introduction / Chapter 2 --- Review of Mplus / Chapter 3 --- Design of the Simulation Study / Chapter 3.1 --- Simulation Design / Chapter 3.2 --- Covariance Structure Analysis and Mplus Restriction / Chapter 3.3 --- Implementation / Chapter 4 --- Method of Evaluation / Chapter 4.1 --- Accuracy of Parameter Estimates / Chapter 4.2 --- Distribution of the Goodness-of-fit Statistic / Chapter 4.3 --- Precision of Standard Errors / Chapter 4.4 --- Number of Replications / Chapter 5 --- Results of the Simulation Study / Chapter 5.1 --- Accuracy of the Parameter Estimates / Chapter 5.2 --- Distribution of the Goodness-of-fit Statistic / Chapter 5.3 --- Precision of the Standard Error / Chapter 5.4 --- Results when the Sample Size is Extremely Large / Chapter 5.5 --- Conclusion / Chapter 6 --- Additional Simulation Study / Chapter 6.1 --- Precision of Standard Error when the Model Consists of Only Continuous and Only Ordinal Variables / Chapter 6.2 --- Comparison of the Simulation Results of Mplus and LISREL / Chapter 6.3 --- Conclusion / Chapter 7 --- Conclusion and Discussion / Chapter A --- Mplus Sample Program (Condition C1 S2 N=500) / Chapter B --- PRELIS Sample Program (Condition C1 S1 N=500)
|
98 |
Scale parameter modelling of the t-distribution
Taylor, Julian January 2005
This thesis considers location and scale parameter modelling of the heteroscedastic t-distribution. This new distribution is an extension of the heteroscedastic Gaussian; it provides robust analysis in the presence of outliers and accommodates possible heteroscedasticity by flexibly modelling the scale parameter using covariates existing in the data. To motivate components of the work in this thesis, the Gaussian linear mixed model is reviewed. The mixed model equations are derived for the location fixed and random effects, and this model is then used to introduce Restricted Maximum Likelihood (REML). From this, an algorithmic scheme to estimate the scale parameters is developed. A review of location and scale parameter modelling of the heteroscedastic Gaussian distribution is presented. In this thesis, the scale parameters are restricted to be a function of covariates existing in the data. Maximum Likelihood (ML) and REML estimation of the location and scale parameters is derived, and an efficient computational algorithm and software are presented. The Gaussian model is then extended by considering the heteroscedastic t-distribution. Initially, the heteroscedastic t is restricted to known degrees of freedom. Scoring equations for the location and scale parameters are derived and their intimate connection to the prediction of the random scale effects is discussed. Tools for detecting and testing heteroscedasticity are also derived and a computational algorithm is presented. A mini software package, "hett", using this algorithm is also discussed. To derive a REML equivalent for the heteroscedastic t, asymptotic likelihood theory is discussed. In this thesis an integral approximation, the Laplace approximation, is presented and two examples, with the inclusion of ML for the heteroscedastic t, are discussed. A new approximate integral technique called Partial Laplace is also discussed and is exemplified with linear mixed models.
Approximate marginal likelihood techniques using Modified Profile Likelihood (MPL), Conditional Profile Likelihood (CPL) and Stably Adjusted Profile Likelihood (SAPL) are also presented and offer an alternative to the approximate integration techniques. The asymptotic techniques are then applied to the heteroscedastic t with known degrees of freedom to form two distinct REMLs for the scale parameters. The first uses the Partial Laplace approximation to form a REML for the scale parameters, whereas the second uses the approximate marginal likelihood technique MPL. For each, the estimation of the location and scale parameters is discussed and computational algorithms are presented. For comparison, the heteroscedastic t for known degrees of freedom using ML and the two new REML equivalents are illustrated with an example and a comparative simulation study. The model is then extended to incorporate the estimation of the degrees of freedom parameter. The estimating equations for the location and scale parameters under ML are preserved and the estimation of the degrees of freedom parameter is integrated into the algorithm. The approximate REML techniques are also extended. For the Partial Laplace approximation, the degrees of freedom parameter is estimated simultaneously with the scale parameters and therefore the algorithm differs only slightly. The second approximation uses SAPL to estimate the parameters and produces approximate marginal likelihoods for the location, scale and degrees of freedom parameters. Computational algorithms for each of the techniques are also presented. Several extensive examples, as well as a comparative simulation study, are used to illustrate ML and the two REML equivalents for the heteroscedastic t with unknown degrees of freedom. The thesis is concluded with a discussion of the new techniques derived for the heteroscedastic t-distribution along with their advantages and disadvantages.
Topics of further research are also discussed. / Thesis (Ph.D.)--School of Agriculture and Wine, 2005.
|
99 |
A polytomous nonlinear mixed model for item analysis
Shin, Seon-hi. January 2003
Thesis (Ph. D.)--University of Texas at Austin, 2003. / Vita. Includes bibliographical references. Available also from UMI Company.
|
100 |
Cure models for univariate and multivariate survival data
Zhou, Feifei, 周飞飞. January 2011
published_or_final_version / Statistics and Actuarial Science / Doctoral / Doctor of Philosophy
|