1 |
Survey of Glyph-based Visualization Techniques for Spatial Multivariate Medical Data
Ropinski, Timo; Oeltze, Steffen; Preim, Bernhard January 2011
In this survey article, we review glyph-based visualization techniques, which have been exploited when visualizing spatial multivariate medical data. To classify these techniques, we derive a taxonomy of glyph properties, which is based on classification concepts established in information visualization. By considering both the glyph visualization as well as the interaction techniques that are employed to generate or explore the glyph visualization, we are able to classify glyph techniques into two main groups: those supporting pre-attentive and those supporting attentive processing. With respect to this classification, we review glyph-based techniques described in the medical visualization literature. Based on the outcome of the literature review, we propose design guidelines for glyph visualizations in the medical domain.
|
2 |
Consumer interests as market segmentation variables
Templeton, William James January 1986
No description available.
|
3 |
SMVCIR Dimensionality Test
Lindsey, Charles D. 2010 May 1900
The original SMVCIR algorithm was developed by Simon J. Sheather, Joseph
W. McKean, and Kimberly Crimin. The dissertation first presents a new version
of this algorithm that uses the scaling standardization rather than the Mahalanobis
standardization. This algorithm takes grouped multivariate data as input and then
outputs a new coordinate space that contrasts the groups in location, scale, and
covariance. The central goal of research is to develop a method to determine the
dimension of this space with statistical confidence. A dimensionality test is developed
that can be used to make this determination. The new SMVCIR algorithm is
compared with two other inverse regression algorithms, SAVE and SIR in the process
of developing the dimensionality test and testing it.
The dimensionality test is based on the singular values of the kernel of the spanning
set of the vector space. The asymptotic distribution of the spanning set is found
by using the central limit theorem, delta method, and finally Slutsky's Theorem with
a permutation matrix. This yields a mean adjusted asymptotic distribution of the
spanning set. Theory by Eaton, Tyler, and others is then used to show an equivalence
between the singular values of the mean adjusted spanning set statistic and the
singular values of the spanning set statistic. The test statistic is a sample size scaled
sum of squared singular values of the spanning set. This statistic is asymptotically
equivalent in distribution to a linear combination of independent $\chi^2_1$
random variables.
Simulations are performed to corroborate these theoretic findings. Additionally,
based on work by Bentler and Xie, an approximation to the test statistic reference
distribution is proposed and tested. This is also corroborated with simulations. Examples
are performed that demonstrate how SMVCIR is used and how the developed
tests for dimensionality are performed. Finally, further directions of research are
hinted at for SMVCIR and the dimensionality test. One of the more interesting
directions is explored by briefly examining how SMVCIR can be used to identify potentially
complex functions that link predictors and a continuous response variable.
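The dimensionality test described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the function names are hypothetical, the choice to sum the squared singular values beyond the `d` largest is assumed from how such tests are usually set up, and the `weights` of the reference distribution are taken as given inputs because the abstract does not say how they are estimated.

```python
import random


def dimension_test_statistic(n, singular_values, d):
    """Sample-size-scaled sum of squared singular values.

    Assumption: to test "dimension = d", the d largest singular values
    are dropped and the remaining ones are squared and summed.
    """
    tail = sorted(singular_values, reverse=True)[d:]
    return n * sum(s * s for s in tail)


def weighted_chi2_pvalue(statistic, weights, draws=20000, seed=0):
    """Monte Carlo tail probability of sum_i w_i * Z_i**2, Z_i iid N(0, 1).

    Each Z_i**2 is a chi-square variable with one degree of freedom, so the
    sum is a linear combination of independent chi-square(1) variables.
    The weights are hypothetical inputs here.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(draws):
        sample = sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        if sample >= statistic:
            exceed += 1
    return exceed / draws
```

A small p-value from `weighted_chi2_pvalue` would suggest that the candidate dimension `d` is too small. The abstract mentions an approximation to the reference distribution based on work by Bentler and Xie; the brute-force simulation above is a stand-in for it, not a reproduction.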
|
4 |
Mining for Lung Cancer Biomarkers in Plasma Metabolomics Data / Sökande efter Biomarkörer för Lungcancer genom Analys av Metabolitdata
Johnsson, Anna January 2010
<p>Lung cancer is the cancer form with the highest mortality worldwide, and survival is very low: only 15% of patients are alive five years after diagnosis. More research is needed to understand the biology of lung cancer and thus make it possible to discover the disease at an early stage. Early diagnosis leads to an increased chance of survival. In this thesis, 179 lung cancer and 116 control samples of blood serum were analyzed to identify metabolomic biomarkers. The control samples were derived from patients with benign lung diseases.</p><p>Data were obtained from GC/TOF-MS analysis and analyzed with the multivariate analysis methods PCA and OPLS/OPLS-DA. The thesis investigates how best to pre-treat and analyze the data in order to discover biomarkers. One part of the aim was to give directions for how to select samples from a biobank for further biological validation of suspected biomarkers. Models for different stages of lung cancer versus control samples were computed and validated. The most influential metabolites in the models were selected, and confounding with other clinical characteristics such as gender and hemoglobin levels was studied. Thirteen lung cancer biomarkers were identified and validated using raw data and new OPLS models based solely on the biomarkers.</p><p>In summary, the identified biomarkers separate control samples from late lung cancer fairly well, but separate early lung cancer from control samples poorly. The recommendation is to select controls and late lung cancer samples from the biobank for further confirmation of the biomarkers.</p>
|
5 |
Mining for Lung Cancer Biomarkers in Plasma Metabolomics Data / Sökande efter Biomarkörer för Lungcancer genom Analys av Metabolitdata
Johnsson, Anna January 2010
Lung cancer is the cancer form with the highest mortality worldwide, and survival is very low: only 15% of patients are alive five years after diagnosis. More research is needed to understand the biology of lung cancer and thus make it possible to discover the disease at an early stage. Early diagnosis leads to an increased chance of survival. In this thesis, 179 lung cancer and 116 control samples of blood serum were analyzed to identify metabolomic biomarkers. The control samples were derived from patients with benign lung diseases.
Data were obtained from GC/TOF-MS analysis and analyzed with the multivariate analysis methods PCA and OPLS/OPLS-DA. The thesis investigates how best to pre-treat and analyze the data in order to discover biomarkers. One part of the aim was to give directions for how to select samples from a biobank for further biological validation of suspected biomarkers. Models for different stages of lung cancer versus control samples were computed and validated. The most influential metabolites in the models were selected, and confounding with other clinical characteristics such as gender and hemoglobin levels was studied. Thirteen lung cancer biomarkers were identified and validated using raw data and new OPLS models based solely on the biomarkers.
In summary, the identified biomarkers separate control samples from late lung cancer fairly well, but separate early lung cancer from control samples poorly. The recommendation is to select controls and late lung cancer samples from the biobank for further confirmation of the biomarkers.
|
6 |
Lumped kinetic modelling and multivariate data analysis of propylene conversion over H-ZSM-5
Nie, Jinjun Unknown Date
No description available.
|
7 |
Symetrie náhodných vektorů / Symmetry of random vectors
Říha, Adam January 2021
In this thesis we introduce the spherical, central, angular, halfspace and regression symmetry of random vectors and their measures. First, we deal with their mutual relations and equivalent expressions. We also study the uniqueness of the center of the individual symmetries and other interesting properties. Then we define the halfspace, projection, spatial and regression multidimensional medians and show their properties. Finally, we look at the relationships between these medians and symmetric distributions.
|
8 |
A Comparison of Techniques Used In Discrimination and Classification
Hamilton, Owen Michael Grant 08 1900
<p> Four statistical techniques of discrimination are applied to a set of multivariate data. The techniques, proposed by R.A. Fisher [6], C.R. Rao [14], D.F. Andrews [1] and H. Chernoff [4], are reviewed, applied and criticized in an intercomparison of the four methods. Graphic illustrations are also utilized to aid in the classification of sampling units. </p> / Thesis / Master of Science (MSc)
|
9 |
Directed Evolution of Glutathione Transferases Guided by Multivariate Data Analysis
Kurtovic, Sanela January 2008
<p>Evolution of enzymes with novel functional properties has gained much attention in recent years. Naturally evolved enzymes are adapted to work in living cells under physiological conditions, circumstances that are not always available for industrial processes calling for novel and better catalysts. Furthermore, altering enzyme function also affords insight into how enzymes work and how natural evolution operates. </p><p>Previous investigations have explored catalytic properties in the directed evolution of mutant libraries with high sequence variation. Before this study was initiated, functional analysis of mutant libraries was, to a large extent, restricted to uni- or bivariate methods. Consequently, there was a need to apply multivariate data analysis (MVA) techniques in this context. Directed evolution was approached by DNA shuffling of glutathione transferases (GSTs) in this thesis. GSTs are multifarious enzymes that have detoxication of both exo- and endogenous compounds as their primary function. They catalyze the nucleophilic attack by the tripeptide glutathione on many different electrophilic substrates. </p><p>Several multivariate analysis tools, <i>e.g.</i> principal component (PC), hierarchical cluster, and K-means cluster analyses, were applied to large mutant libraries assayed with a battery of GST substrates. By this approach, evolvable units (quasi-species) fit for further evolution were identified. It was clear that different substrates undergoing different kinds of chemical transformation can group together in a multi-dimensional substrate-activity space, thus being responsible for a certain quasi-species cluster. Furthermore, the importance of the chemical environment, or substrate matrix, in enzyme evolution was recognized. Diverging substrate selectivity profiles among homologous enzymes acting on substrates performing the same kind of chemistry were identified by MVA. 
Important structure-function activity relationships with the prodrug azathioprine were elucidated by segment analysis of a shuffled GST mutant library. Together, these results illustrate important methods applied to molecular enzyme evolution.</p>
|
10 |
A consistent test of independence between random vectors
Boglioni Beaulieu, Guillaume 11 1900
Testing for independence between random vectors is an important question in statistics.
Because there are infinitely many ways in which a random quantity X can depend on another random quantity Y, it is not a trivial question. Classical tests such as Spearman [33], Wilks [40], Kendall [18] or Puri and Sen [24] have been found ineffective at detecting many forms of dependence. Recent significant results on the topic include Székely et al. [34], Gretton et al. [14] and Heller et al. [15]. However, most of the available tests can only detect dependence between two random quantities. Because pairwise independence does not guarantee mutual independence, techniques testing the hypothesis of mutual independence between any number of random quantities are required. In this research we propose a non-parametric and universally consistent test of independence, applicable to any number of random vectors of any size.
Precisely, we extend the procedure described in Heller et al. [15] in two ways. First, we propose to use the ranks of the observations instead of the observations themselves. Second, we extend their method to test for independence between any number of random vectors. As the distribution of our test statistic is not known, a permutation method is used to compute p-values. Simulations are then performed to obtain the power of the test. We compare the power of our new test to that of other tests, namely those in Genest and Rémillard [10], Gretton et al. [14], Székely et al. [34], Beran et al. [3] and Heller et al. [15]. Examples featuring random variables and random vectors are considered. For many of the examples investigated, we find that our new test has similar or better power than the other tests. In particular, when the random variables are Cauchy, our new test outperforms the others. In the case of strictly discrete random vectors, we present a proof that our test is universally consistent.
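The permutation approach to p-values mentioned in this abstract can be illustrated with a small sketch. It is not the thesis's test: the statistic below is a simple absolute rank covariance between two samples, swapped in as a stand-in for the rank-based extension of Heller et al. [15], and every function name is hypothetical. The ranking helper also assumes there are no ties.

```python
import random


def ranks(values):
    """Ranks 1..n of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def rank_statistic(x, y):
    """Stand-in statistic: absolute covariance of the ranks of x and y."""
    rx, ry = ranks(x), ranks(y)
    mean = (len(x) + 1) / 2
    return abs(sum((a - mean) * (b - mean) for a, b in zip(rx, ry)))


def permutation_pvalue(x, y, permutations=999, seed=0):
    """Share of shuffled samples whose statistic reaches the observed one.

    Shuffling y breaks any dependence on x, so the shuffled statistics
    approximate the null distribution when it is not known in closed form.
    """
    rng = random.Random(seed)
    observed = rank_statistic(x, y)
    shuffled = list(y)
    count = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if rank_statistic(x, shuffled) >= observed:
            count += 1
    # Adding 1 to both counts treats the observed sample as one permutation.
    return (count + 1) / (permutations + 1)
```

Working on ranks rather than raw observations, as the abstract proposes, makes the statistic insensitive to heavy tails, which is one plausible reason a rank-based test could do well on Cauchy data.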
|