  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
61

Approximation to K-Means-Type Clustering

Wei, Yu 05 1900
Clustering involves partitioning a given data set into several groups based on some similarity or dissimilarity measure. Cluster analysis has been widely used in information retrieval, text and web mining, pattern recognition, image segmentation and software reverse engineering.

K-means is the most intuitive and popular clustering algorithm and the workhorse of clustering. However, classical K-means suffers from several flaws. First, the algorithm is very sensitive to the initialization method and can easily be trapped at a local minimum of the objective used in the model (the sum of squared errors). Second, it has been proved that finding a global minimum of the sum of squared errors is NP-hard even when k = 2. In the present model for K-means clustering, all the variables are required to be discrete and the objective is nonlinear and nonconvex.

In the first part of the thesis, we consider how to derive an optimization model for the minimum sum of squared errors of a given data set based on continuous convex optimization. To this end, we first recast K-means clustering as a novel optimization model, 0-1 semidefinite programming, in which the eigenvalues of the matrix argument must be 0 or 1. This provides a unified framework for many other clustering approaches such as spectral clustering and normalized cut. Moreover, the new optimization model allows us to attack the original problem through relaxed linear and semidefinite programming.

We then consider how to obtain a feasible solution of the original clustering problem from an approximate solution of the relaxed problem. Using principal component analysis, we construct a rounding procedure to extract a feasible clustering and show that our algorithm provides a 2-approximation to the global solution of the original problem. The complexity of our rounding procedure is O(n^(k^2(k-1)/2)), which improves substantially on a similar rounding procedure in the literature with complexity O(n^(k^3/2)). In particular, when k = 2, our rounding procedure runs in O(n log n) time. To the best of our knowledge, this is the lowest complexity reported in the literature for finding a solution to K-means clustering with guaranteed quality.

In the second part of the thesis, we consider approximation methods for so-called balanced bi-clustering. Using a simple heuristic, we prove that we can slightly improve the constrained K-means for bi-clustering. For the special case where the size of each cluster is fixed, we develop a new algorithm, called Q-means, to find a 2-approximation solution to the balanced bi-clustering. We prove that Q-means has complexity O(n^2).

Numerical results based on our approaches are reported in the thesis as well. / Thesis / Master of Science (MSc)
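The sensitivity to initialization mentioned in this abstract can be illustrated with a minimal sketch of the classical Lloyd iteration. This is not the thesis's semidefinite-programming approach or its rounding procedure; the function names `sse` and `lloyd` and the four-point data set are hypothetical, chosen only so that the two outcomes can be checked by hand.

```python
def sse(points, centers, labels):
    """Sum of squared Euclidean distances from each point to its assigned center."""
    return sum(
        sum((p - c) ** 2 for p, c in zip(pt, centers[lab]))
        for pt, lab in zip(points, labels)
    )


def lloyd(points, centers, max_iter=100):
    """Classical K-means (Lloyd) iteration started from the given centers."""
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(len(centers)),
                key=lambda j: sum((p - c) ** 2 for p, c in zip(pt, centers[j])))
            for pt in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the iteration has converged
        labels = new_labels
        # Update step: each center moves to the mean of its assigned points.
        for j in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, labels, sse(points, centers, labels)


# Hypothetical toy data: the corners of a 4-by-1 rectangle, with k = 2.
pts = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
print(lloyd(pts, [(0.0, 0.0), (4.0, 1.0)])[2])  # 1.0  (left pair / right pair)
print(lloyd(pts, [(2.0, 0.0), (2.0, 1.0)])[2])  # 16.0 (bottom pair / top pair)
```

Both runs stop at a fixed point of the iteration, but the second start is trapped at a sum of squared errors sixteen times larger than the first, which is the kind of local minimum the abstract refers to.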
62

Investigating the Beverage Patterns of Children and Youth with Obesity at the Time of Enrollment into Canadian Pediatric Weight Management Programs / Beverage Intake of Children and Youth with Obesity

Bradbury, Kelly January 2019
Introduction: Beverages influence diet quality; however, beverage intake among youth with obesity is not well described in the literature. Dietary pattern analysis can identify how beverages cluster together and enable exploration of population characteristics. Objectives: 1) Assess how often children and youth with obesity fail to meet the thresholds of no sugar-sweetened beverages (SSB), <1 serving/week of SSB, and ≥2 servings/day of milk, and identify factors influencing the likelihood of failing to meet these cut-offs. 2) Derive patterns of beverage intake and examine related social and behavioural factors and health outcomes at entry into Canadian pediatric weight management programs. Methods: Beverage intake of youth (2–17 years) enrolled in the CANPWR study (n=1425) was reported at baseline visits from 2013 to 2017. Beverage thresholds identified weekly SSB consumers and approximated Canadian recommendations. The relationship of sociodemographic factors (income, guardian education, race, household status) and behaviours (eating habits, physical activity, screen time) to the likelihood of failing cut-offs was explored using multivariable logistic regression. Beverage patterns were derived using Principal Component Analysis. Related sociodemographic, behavioural and health outcomes (lipid profile, fasting glucose, HbA1c, liver enzymes) were evaluated with multiple linear regression. Results: Nearly 80% of youth consumed ≥1 serving/week of SSB. This was more common in males and in families with lower education, and was related to eating habits and higher screen time. Two-thirds failed to drink ≥2 servings of milk/day; they were more likely to be female and demonstrated favourable eating habits and lower screen time. Five beverage patterns were identified: 1) SSB, 2) 1% Milk, 3) 2% Milk, 4) Alternatives, 5) Sports Drinks/Flavoured Milks. Patterns were related to social and lifestyle determinants; the only related health outcome was HDL. Conclusion: Many children and youth with obesity consumed SSB weekly. Fewer drank milk twice daily. Beverage intake was predicted by sex, socioeconomic status and other behaviours; however, most beverage patterns were unrelated to health outcomes. / Thesis / Master of Science (MSc) / Beverage intake can influence diet and health outcomes in population-based studies. However, patterns of beverage consumption are not well described among youth with obesity. This study examined beverage intake and its relationships with sociodemographic information, behaviours and health outcomes among youth (2–17 years) at the time of entry into Canadian pediatric weight management programs (n=1425). In contrast to current recommendations, 80% of youth consumed ≥1 serving/week of sugar-sweetened beverages and 66% consumed <2 servings/day of milk. Additionally, five distinct patterns of beverage intake were identified using dietary pattern analysis. Social factors (age, sex, socioeconomic status) and behaviours (screen time, eating habits) were related to the risk of failing to meet recommendations and to beverage patterns. Identifying sociodemographic characteristics and behaviours of youth with obesity who fail to meet beverage intake thresholds and adhere to certain patterns of consumption may provide insight for clinicians to guide youth to improved health in weight management settings.
63

Macroeconomic Forecasting: Statistically Adequate, Temporal Principal Components

Dorazio, Brian Arthur 05 June 2023
The main goal of this dissertation is to expand upon the use of Principal Component Analysis (PCA) in macroeconomic forecasting, particularly in cases where traditional principal components fail to account for all of the systematic information making up common macroeconomic and financial indicators. At the outset, PCA is viewed as a statistical model derived from the reparameterization of the Multivariate Normal model in Spanos (1986). To motivate a PCA forecasting framework prioritizing sound model assumptions, it is demonstrated, through simulation experiments, that model mis-specification erodes the reliability of inferences. The Vector Autoregressive (VAR) model at the center of these simulations allows for the Markov (temporal) dependence inherent in macroeconomic data and serves as the basis for extending conventional PCA. Stemming from the relationship between PCA and the VAR model, an operational out-of-sample forecasting methodology is prescribed incorporating statistically adequate, temporal principal components, i.e. principal components which capture not only Markov dependence but also all of the other relevant information in the original series. The macroeconomic forecasts produced by applying this framework to several common macroeconomic indicators are shown to outperform standard benchmarks in terms of predictive accuracy over longer forecasting horizons. / Doctor of Philosophy / The landscape of macroeconomic forecasting and nowcasting has shifted drastically with the advent of big data. Armed with significant growth in computational power and data collection resources, economists have augmented their arsenal of statistical tools to include those which can produce reliable results in big data environments. At the forefront of such tools is Principal Component Analysis (PCA), a method which reduces the number of predictors to a few factors containing the majority of the variation in the original data series. This dissertation expands upon the use of PCA in the forecasting of key macroeconomic indicators, particularly in instances where traditional principal components fail to account for all of the systematic information comprising the data. Ultimately, a forecasting methodology is established which incorporates temporal principal components, ones capable of capturing both time dependence and the other relevant information in the original series. In the final analysis, the methodology is applied to several common macroeconomic and financial indicators. The forecasts produced using this framework are shown to outperform standard benchmarks in terms of predictive accuracy over longer forecasting horizons.
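As a rough illustration of how conventional PCA condenses several predictors into one factor, the sketch below computes the first principal component by power iteration on the sample covariance matrix. It covers only the classical, non-temporal case and is not the dissertation's methodology; the helper names `covariance_matrix` and `first_component`, the starting vector, and the three toy series driven by one common factor are all assumptions made for the example.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length observation rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centered


def first_component(cov, iters=500):
    """Leading eigenvector and eigenvalue of a symmetric PSD matrix.

    Uses plain power iteration from a uniform start vector (an assumption:
    the start must not be orthogonal to the leading eigenvector).
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # zero matrix: no direction carries any variance
        v = [x / norm for x in w]
    # Rayleigh quotient v' C v gives the variance along the component.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue


# Hypothetical data: three indicators that are exact multiples of one factor.
rows = [(f, 2.0 * f, -f) for f in (1.0, 2.0, 3.0, 4.0, 5.0)]
cov, _ = covariance_matrix(rows)
loadings, variance = first_component(cov)
total = sum(cov[i][i] for i in range(len(cov)))
print(loadings, variance / total)
```

In this toy case a single component carries essentially all of the variance, because the three series share one factor with loadings proportional to (1, 2, -1). The covariance matrix used here ignores the ordering of the observations, which is the limitation the abstract's temporal components are meant to address.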
64

A Statistical Examination of the Climatic Human Expert System, The Sunset Garden Zones for California

Logan, Ben 11 January 2008
Twentieth Century climatology was dominated by two great figures: Wladimir Köppen and C. Warren Thornthwaite. The first carefully developed climatic parameters to match the larger world vegetation communities. The second developed complex formulas of "Moisture Factors" that provided an efficient understanding of how evapotranspiration influences plant growth and health, both for native and non-native communities. In the latter half of the Twentieth Century, the Sunset Magazine Corporation developed a purely empirical set of Garden Zones, first for California, then for the thirteen states of the West, and now for the entire nation in the National Garden Maps. The Sunset Garden Zones are well recognized and respected in the Western States for illustrating the several factors of climate that distinguish zones. But the Sunset Garden Zones have never before been digitized and examined statistically to validate their demarcations. This thesis examines the digitized zones with reference to PRISM climate data. Variable coverages resembling those described by Sunset are extracted from the PRISM data. These variable coverages are collected for two buffered areas, one in northern California and one in southern California. The coverages are exported from ArcGIS 9.1 to SAS®, where they are first processed through a Principal Component Analysis, and the first five principal components are then entered into a Ward's Hierarchical Cluster Analysis. The resulting clusters are translated back into ArcGIS as a raster coverage in which the clusters represent climatic regions. This process is readily applicable to the examination of other regions of California. / Master of Science
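The clustering step described in this abstract can be sketched as follows. The thesis ran Ward's method in SAS on the first five principal components; the pure-Python function below is only a minimal stand-in that applies Ward's merging criterion to a handful of hypothetical two-dimensional component scores, and the name `ward_clusters` is an assumption.

```python
def ward_clusters(points, k):
    """Agglomerative clustering with Ward's criterion, stopped at k clusters.

    At each step the two clusters whose merger gives the smallest increase in
    the within-cluster sum of squares are joined. For clusters A and B that
    increase is |A||B| / (|A| + |B|) times the squared centroid distance.
    """
    # Each cluster is stored as (member points, centroid).
    clusters = [([tuple(p)], tuple(float(x) for x in p)) for p in points]

    def merge_cost(a, b):
        (ma, ca), (mb, cb) = a, b
        dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(ma) * len(mb) / (len(ma) + len(mb)) * dist2

    while len(clusters) > k:
        # Exhaustive search over all pairs; fine for a small illustration.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        members = clusters[i][0] + clusters[j][0]
        centroid = tuple(sum(x) / len(members) for x in zip(*members))
        clusters = [c for t, c in enumerate(clusters) if t not in (i, j)]
        clusters.append((members, centroid))
    return [sorted(m) for m, _ in clusters]


# Hypothetical principal-component scores for five raster cells.
scores = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (10.0, 0.0)]
for group in sorted(ward_clusters(scores, 3)):
    print(group)
```

With these scores the two tight pairs are merged first and the isolated cell remains a cluster of its own, which corresponds to three candidate regions in the abstract's terms.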
65

Chování tří populací myši domácí (Mus musculus sensu lato) v baterii pěti behaviorálních testů: vliv poddruhové příslušnosti a komensálního způsobu života / Behavioural patterns exhibited by three populations of house mouse (Mus musculus sensu lato) in a five-test battery: the effects of subspecies and commensal way of life

Voráčková, Petra January 2015
The term "personality" now appears more and more often, not only in psychological studies of humans but also in animal studies. The study of personality helps us to define behavioural characteristics that can vary with age, sex, species or environment. Behavioural experiments are used to detect these behavioural patterns and can divide animals into different groups. The subjects of our research were three populations of house mouse (Mus musculus sensu lato), which we tested in a series of experiments involving free exploration, forced exploration, the hole-board test, a test of vertical activity and the Elevated plus-maze. These experiments were intended to reveal whether the mice differ in their behaviour with respect to sex, commensalism or subspecies. We found that, with the exception of one test, differences within populations are very small, whereas differences between populations increase in relation to commensalism and subspecies. Keywords: Mus musculus, commensalism, open field test, Elevated plus-maze, Principal Component Analysis (PCA)
66

Detection And Classification Of Buried Radioactive Materials

Wei, Wei 09 December 2011
This dissertation develops new approaches for the detection and classification of buried radioactive materials. Different spectral transformation methods are proposed to effectively suppress noise and to better distinguish signal features in the transformed space. The contributions of this dissertation are detailed as follows. 1) Propose an unsupervised method for buried radioactive material detection. In the experiments, the original Reed-Xiaoli (RX) algorithm performs similarly to the gross count (GC) method; however, the constrained energy minimization (CEM) method performs better when using feature vectors selected from the RX output. Thus, an unsupervised method is developed by combining the RX and CEM methods, which can efficiently suppress the background noise when applied to the dimensionality-reduced data from principal component analysis (PCA). 2) Propose an approach for buried target detection and classification, which applies spectral transformation followed by noise-adjusted PCA (NAPCA). To meet the requirements of practical survey mapping, we focus on the case where sensor dwell time is very short. The results show that spectral transformation can alleviate the effects of noisy spectral variation and background clutter, while NAPCA, a better choice than PCA, can extract key features for the subsequent detection and classification. 3) Propose a particle swarm optimization (PSO)-based system to automatically determine the optimal partition for spectral transformation. Two PSOs are incorporated in the system, with the outer one responsible for selecting the optimal number of bins and the inner one for the optimal bin-widths. The experimental results demonstrate that using variable bin-widths is better than using a fixed bin-width, and that PSO can provide better results than the traditional Powell's method. 4) Develop parallel implementation schemes for the PSO-based spectral partition algorithm. Both cluster and graphics processing unit (GPU) implementations are designed. The computational burden of the serial version is greatly reduced. The experimental results also show that the GPU algorithm achieves a speedup similar to that of the cluster-based algorithm.
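For readers unfamiliar with particle swarm optimization, the following is a minimal sketch of a single, generic PSO loop on a toy objective. It is not the nested two-level system described in the abstract, and it does not model spectral bins; the function name `pso`, the inertia and acceleration constants, the box bounds and the quadratic objective are all assumptions chosen for illustration.

```python
import random


def pso(objective, bounds, n_particles=20, n_iter=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position of each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position of the swarm
    history = [gbest_val]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


# Toy objective with its minimum at (3, -1); a stand-in, not a detection score.
def toy(x):
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2


best, best_val, history = pso(toy, [(-10.0, 10.0), (-10.0, 10.0)])
print(best, best_val)
```

The recorded `history` of the swarm's best value never increases, since the global best is only replaced by a strictly better position. In a nested arrangement like the one the abstract describes, an outer search would presumably call an inner search of this kind once per candidate number of bins, but that arrangement is not shown here.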
67

A Principal Component Regression Analysis for Detection of the Onset of Nocturnal Hypoglycemia in Type 1 Diabetic Patients

Zuzarte, Ian Jeromino January 2008
No description available.
68

Feature Extraction using Dimensionality Reduction Techniques: Capturing the Human Perspective

Coleman, Ashley B. January 2015
No description available.
69

Utilização de análise de componentes principais em séries temporais / Use of principal component analysis in time series

Teixeira, Sérgio Coichev 12 April 2013
One of the main objectives of principal component analysis (PCA) is to reduce the number of observed variables to a smaller set of uncorrelated variables, called principal components, which help the researcher understand the variability and correlation structure of the observed data. The technique is very simple and widely used in studies across many areas. To construct the components, the linear relationship between the observed variables is measured through the covariance matrix or the correlation matrix. However, covariance and correlation matrices can fail to capture important information when the data are sequentially correlated in time (autocorrelated), wasting an important part of the data for the interpretation of the components. In this work, we study principal component techniques that make it possible to interpret or analyse the autocorrelation structure of the observed data. To this end, we explore principal component analysis in the frequency domain, which for autocorrelated data provides more specific and detailed results than classical principal component analysis. In the SSA (Singular Spectrum Analysis) and MSSA (Multichannel Singular Spectrum Analysis) methods, principal component analysis is based on the correlation in time and between the different observed variables. These techniques are widely used with atmospheric data to identify patterns such as trend and periodicity.
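The way SSA brings time dependence into a PCA-style analysis can be sketched through its embedding step: the series is unfolded into overlapping lagged windows, and a matrix of products between lags takes the place of the usual covariance matrix. The code below is a minimal single-channel illustration, not the thesis's implementation; the names `trajectory_matrix` and `lag_covariance`, the window length and the five-point series are assumptions, and the series is deliberately left uncentred.

```python
def trajectory_matrix(series, window):
    """Rows are consecutive lagged windows of the series (SSA embedding)."""
    n = len(series)
    if not 1 <= window <= n:
        raise ValueError("window must be between 1 and len(series)")
    return [list(series[i:i + window]) for i in range(n - window + 1)]


def lag_covariance(series, window):
    """window-by-window matrix X'X / K, where X is the trajectory matrix.

    Entry (i, j) averages products of values that lie |i - j| steps apart,
    so the matrix carries the autocorrelation information that an ordinary
    covariance matrix of the variables does not see.
    """
    X = trajectory_matrix(series, window)
    K = len(X)
    return [[sum(row[i] * row[j] for row in X) / K for j in range(window)]
            for i in range(window)]


# Hypothetical short series with a linear trend.
series = [1.0, 2.0, 3.0, 4.0, 5.0]
for row in trajectory_matrix(series, 3):
    print(row)
print(lag_covariance(series, 3))
```

An eigendecomposition of the lag matrix then plays the role that the decomposition of the covariance matrix plays in classical PCA. The multichannel variant stacks the lagged windows of several series side by side before the same step.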
