71 |
Sistemáticas de agrupamento de países com base em indicadores de desempenho / Countries clustering systematics based on performance indexes
Mello, Paula Lunardi de (January 2017)
The world economy underwent major transformations over the last century: periods of sustained growth followed by stagnation, governments alternating between market liberalization and commercial protectionism, and instability in markets, among others. As an aid to understanding economic and social problems systemically, the analysis of performance indicators yields relevant information about behavior patterns and trends, and helps guide policies and strategies for improving economic and social outcomes. Indicators describing the main economic dimensions of a country can serve as guiding principles in the design and monitoring of that country's development and growth policies. This dissertation uses World Bank data to apply and evaluate systematics for grouping countries with similar characteristics in terms of the indicators that describe them. To do so, it integrates clustering techniques (hierarchical and non-hierarchical), variable selection (through the "leave one variable out at a time" technique) and dimensionality reduction (through Principal Component Analysis) in order to form consistent clusters of countries. The quality of the generated clusters is evaluated by the Silhouette, Calinski-Harabasz and Davies-Bouldin indexes. The results were satisfactory regarding both the representativeness of the highlighted indicators and the quality of the resulting clustering.
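As a rough illustration of the pipeline this abstract describes, the Python sketch below standardizes an indicator matrix, reduces it with PCA, clusters it with k-means, and scores the result with the Silhouette index while dropping one variable at a time. The function names, the k-means variant with farthest-point initialisation, and the default parameters are assumptions made for illustration; the dissertation's actual indicators, algorithms and settings are not given in the abstract.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Standardize columns, then project onto the leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # assumes no constant column
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def silhouette(X, labels):
    """Mean silhouette width, (b - a) / max(a, b) averaged over all points."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    scores = []
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() == 1:
            scores.append(0.0)                 # convention for singleton clusters
            continue
        a = D[i, same].sum() / (same.sum() - 1)  # self-distance is zero
        b = min(D[i, labels == c].mean() for c in set(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

def leave_one_variable_out(X, k, n_components=2):
    """Silhouette obtained when each variable is dropped in turn."""
    scores = {}
    for j in range(X.shape[1]):
        reduced = pca_reduce(np.delete(X, j, axis=1), n_components)
        scores[j] = silhouette(reduced, kmeans(reduced, k))
    return scores
```

A variable whose removal raises the silhouette is a candidate for exclusion. The Calinski-Harabasz and Davies-Bouldin indexes could be compared in the same loop.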
|
72 |
Razão ω-6/ω-3 como indicador de qualidade da dieta brasileira e a relação com doenças crônicas não transmissíveis / ω-6/ω-3 ratio as a quality indicator of the Brazilian diet and its relation with chronic diseases
Mainardi, Giulia Marcelino (23 October 2014)
Introduction: The burden of chronic diseases is increasing rapidly worldwide. The ω-6/ω-3 fatty acid ratio is a qualitative indicator of diet, and a high ratio has been shown to be associated with chronic diseases in adulthood. In many countries modern dietary patterns have a high ω-6/ω-3 ratio; for Brazil this figure is unknown. Objective: To identify the dietary patterns of the Brazilian population aged 15 to 35 years and to investigate the association between these patterns and biological risk factors for chronic diseases. Methods: We used data from the individual food consumption survey (POF 7) of the 2008-2009 Pesquisa de Orçamentos Familiares (POF, Household Budget Survey). Dietary patterns were estimated by principal component analysis (PCA) with varimax rotation. Components with eigenvalues ≥ 1 were retained and characterized by the variables with loadings of at least |0.20|. The Kaiser-Meyer-Olkin (KMO) test was used to assess the adequacy of the data for PCA. Associations between dietary patterns (factor scores) and risk factors for chronic diseases, summarized as a dietary ω-6/ω-3 ratio above 10:1, were estimated by linear and logistic regression. Values with p < 0.05 were considered statistically significant. Analyses were performed in STATA 12. Results: In the sample of 12,527 individuals we identified 3 dietary patterns (P). P3, characterized by the consumption of mixed dishes, pizza/sandwiches, fruit smoothies/yogurts, sweets, assorted juices and soft drinks, reduced the dietary ω-6/ω-3 ratio. P1, a "pro-inflammatory" pattern characterized by processed meats, bakery products, dairy, oils and fats, increased the ratio and is more common among the lower-income population of both sexes. Fruit and vegetable consumption was low in all dietary patterns. Assuming greater adherence to patterns 2 and 3, the probability of following P1 would decrease by 5 percent in both sexes. The adequacy index of the PCA, estimated by the KMO coefficient, was 0.57. Conclusion: Dietary patterns characterized by the consumption of oils and fats, processed meats, dairy and bakery products contributed to raising the ω-6/ω-3 ratio of the Brazilian diet and, by extension, the risk of developing chronic non-communicable diseases. Complex dietary patterns comprising a wide range of foods proved more effective in reducing the ω-6/ω-3 ratio of the diet of Brazilian adults; this effect is attributed to the synergy among foods during digestion and absorption, since people consume a variety of foods rather than isolated items. Public food and nutrition policies should take the ω-6/ω-3 ratio into account as one of the markers of diet quality in the country.
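The component-retention rule in the Methods can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the study's STATA code: it applies an eigenvalue cutoff of 1 and a loading cutoff of 0.20 to the correlation matrix of a data matrix `X` (rows are individuals, columns are food groups), and it omits the varimax rotation and the KMO test. The function name and return values are hypothetical.

```python
import numpy as np

def kaiser_components(X, min_eigenvalue=1.0, min_loading=0.20):
    """Return all eigenvalues, the retained loadings, and a salience mask."""
    R = np.corrcoef(X, rowvar=False)            # correlation matrix of the variables
    eigvals, eigvecs = np.linalg.eigh(R)        # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]           # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals >= min_eigenvalue            # eigenvalue cutoff
    # Unrotated loadings: eigenvectors scaled by the square root of the eigenvalue.
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    salient = np.abs(loadings) >= min_loading   # variables that characterize a pattern
    return eigvals, loadings, salient
```

Because the eigenvalues of a correlation matrix sum to the number of variables, a cutoff of 1 keeps only components that explain at least as much variance as a single standardized variable.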
|
73 |
A Comparison of Data Transformations in Image Denoising
Michael, Simon (January 2018)
The study of signal processing has wide applications, such as hi-fi audio, television, voice recognition and many other areas. Signals are rarely observed without noise, which obstructs their analysis. Hence, it is of great interest to study the detection, approximation and removal of noise. In this thesis we compare two methods for image denoising, each based on a data transformation: one uses the Fourier Transform and the other the Singular Value Decomposition. The methods are compared on grayscale images with respect to the visual quality of the resulting image, the maximum attainable peak signal-to-noise ratio and the computational time. We find that the methods are fairly equal in visual quality. However, the method based on the Fourier transform scores higher in peak signal-to-noise ratio and demands considerably less computational time.
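To make the comparison concrete, here is a minimal Python sketch of the two families of methods and of the peak signal-to-noise ratio used to score them. It assumes a simple low-pass mask for the Fourier method and a truncated SVD for the other; the thesis may use different filters and thresholds, and the function names and the `keep_fraction` and `rank` parameters are illustrative only.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def denoise_fft(image, keep_fraction):
    """Low-pass filter: keep only a centered block of low frequencies."""
    F = np.fft.fftshift(np.fft.fft2(image))     # zero frequency moved to the center
    rows, cols = image.shape
    r, c = int(rows * keep_fraction / 2), int(cols * keep_fraction / 2)
    mask = np.zeros_like(F, dtype=bool)
    mask[rows // 2 - r: rows // 2 + r + 1, cols // 2 - c: cols // 2 + c + 1] = True
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

def denoise_svd(image, rank):
    """Keep only the `rank` largest singular values of the image matrix."""
    U, s, Vt = np.linalg.svd(image, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]
```

Sweeping `keep_fraction` and `rank` and recording the best `psnr` for each method gives the "maximum attainable" score the abstract refers to.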
|
74 |
A Vision-Based Approach For Unsupervised Modeling Of Signs Embedded In Continuous Sentences
Nayak, Sunita (07 July 2005)
The common practice in sign language recognition is to first construct individual sign models from manually isolated sign samples, in terms of discrete state transitions mostly represented by Hidden Markov Models, and then to use them to recognize signs in continuous sentences. In this thesis we use a continuous state space model, where the states are based purely on image-based features, without the use of special gloves. We also present an unsupervised approach to both extract and learn models for continuous basic units of signs, which we term signemes, from continuous sentences. Given a set of sentences with a common sign, we can automatically learn the model for the part of the sign, or signeme, that is least affected by coarticulation effects. We tested our idea on the publicly available Boston SignStream Dataset by building signeme models of 18 signs. We test the quality of the models by considering how well we can localize the sign in a new sentence. We also present the concept of smooth continuous curve-based models formed using functional splines and curve registration. We illustrate this idea using 16 signs.
|
75 |
Modelling Distance Functions Induced by Face Recognition Algorithms
Chaudhari, Soumee (09 November 2004)
Face recognition has in the past few years become a very active area of research in the fields of computer vision, image processing, and cognitive psychology. This has spawned various algorithms of different complexities. Principal component analysis (PCA) is a popular basis for face recognition and has often been used to benchmark other face recognition algorithms in identification and verification scenarios. In this thesis, however, we try to analyze different face recognition algorithms at a deeper level. The objective is to model the distances output by any face recognition algorithm as a function of the input images. We achieve this by creating an affine eigenspace from the PCA space such that it approximates the results of the face recognition algorithm under consideration as closely as possible.
Holistic template matching algorithms such as Linear Discriminant Analysis (LDA) and the Bayesian Intrapersonal/Extrapersonal Classifier (BIC), as well as local feature-based algorithms such as Elastic Bunch Graph Matching (EBGM) and a commercial face recognition algorithm, are selected for our experiments. We experiment on two data sets, the FERET data set and the Notre Dame data set. The FERET data set consists of images of subjects with variation in both time and expression. The Notre Dame data set consists of images of subjects with variation in time. We train our affine approximation algorithm on 25 subjects and test with 300 subjects from the FERET data set and 415 subjects from the Notre Dame data set. We also analyze the effect of the distance metric used by the face recognition algorithm on the accuracy of the approximation. We study the quality of the approximation in the context of recognition for the identification and verification scenarios, characterized by cumulative match score (CMC) curves and receiver operating characteristic (ROC) curves, respectively.
Our studies indicate that both the holistic template matching algorithms and the feature-based algorithms can be well approximated. We also find that the affine approximation training generalizes across covariates. For the data with time variation, the rank order of approximation performance is BIC, LDA, EBGM, and commercial. For the data with expression variation, the rank order is LDA, BIC, commercial, and EBGM. Experiments to approximate PCA with distance measures other than Euclidean also performed very well. PCA+Euclidean distance is best approximated, followed by PCA+MahL1, PCA+MahCosine, and PCA+Covariance.
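One way to picture the affine approximation is the Python sketch below. It is an assumption about how such a fit could be done, not the thesis's stated procedure: classical multidimensional scaling turns the target algorithm's distance matrix into coordinates, and a least-squares linear map from the PCA coordinates to those coordinates is then solved for. The translation part of an affine map is ignored because it does not change pairwise distances. All function names are hypothetical.

```python
import numpy as np

def pairwise_distances(P):
    """Euclidean distance between every pair of rows of P."""
    return np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)

def classical_mds(D, dim):
    """Coordinates whose Euclidean distances approximate the matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                   # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)
    top = np.argsort(eigvals)[::-1][:dim]         # largest eigenvalues first
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))

def fit_affine(X_pca, D_target):
    """Linear map A such that ||A (x_i - x_j)|| is close to D_target[i, j]."""
    Xc = X_pca - X_pca.mean(axis=0)               # centering removes the translation
    Y = classical_mds(D_target, dim=X_pca.shape[1])
    A_T, *_ = np.linalg.lstsq(Xc, Y, rcond=None)  # solves Xc @ A_T ~ Y
    return A_T.T
```

When the target distances really are Euclidean distances of a linearly transformed PCA space, this recovers them up to a rotation; for other algorithms the residual measures how far the algorithm is from that model.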
|
76 |
An Indepth Analysis of Face Recognition Algorithms using Affine Approximations
Reguna, Lakshmi (19 May 2003)
To foster the maturity of face recognition analysis as a science, a well-implemented baseline algorithm and good performance metrics are essential for benchmarking progress. In the past, face recognition algorithms based on Principal Components Analysis (PCA) have often been used as a baseline. The objective of this thesis is to develop a strategy to estimate the best affine transformation which, when applied to the eigenspace of the PCA face recognition algorithm, approximates the results of any given face recognition algorithm. The affine approximation strategy outputs an optimal affine transform that approximates the similarity matrix of the distances between a given set of faces generated by the given algorithm. It thereby helps in comparing how close a face recognition algorithm is to the PCA-based algorithm. This thesis shows how the affine approximation algorithm can be used as a valuable tool to evaluate face recognition algorithms at a deep level.
Two test algorithms were chosen to demonstrate the usefulness of the affine approximation strategy: the Linear Discriminant Analysis (LDA) based face recognition algorithm and the Bayesian interpersonal and intrapersonal classifier based face recognition algorithm. Our studies indicate that both algorithms can be approximated well. These conclusions were reached by analyzing the raw similarity scores and by studying the identification and verification performance of the algorithms. Two training scenarios were considered: in the first, the face recognition algorithm and the affine approximation algorithm were trained on the same data set; in the second, they were trained on different data sets. Gross error measures such as the average RMS error and the Stress-1 error were used to compare the raw similarity scores directly. The histogram of the difference between the similarity matrices also clearly showed that the error spread is small for the affine approximation algorithm. The performance of the algorithms in the identification and verification scenarios was characterized using traditional CMS and ROC curves. McNemar's test showed that the difference between the CMS and ROC curves generated by the test face recognition algorithms and by the affine approximation strategy is not statistically significant. In the first training scenario the difference was already not significant at rank 1, whereas in the second training scenario it ceased to be significant only at higher ranks. This difference in performance can be attributed to the different training sets used in the second training scenario.
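The two gross error measures named above can be written compactly. The Python sketch below compares a target distance matrix with its approximation over the upper triangle; the Stress-1 normalization shown (by the sum of squared approximating distances, following Kruskal's definition) is one common convention and is an assumption here, since the abstract does not state which form the thesis uses.

```python
import numpy as np

def rms_error(D_target, D_approx):
    """Root-mean-square difference over all distinct pairs (i < j)."""
    iu = np.triu_indices_from(D_target, k=1)
    return float(np.sqrt(np.mean((D_target[iu] - D_approx[iu]) ** 2)))

def stress1(D_target, D_approx):
    """Kruskal-style Stress-1: residual norm relative to the approximating distances."""
    iu = np.triu_indices_from(D_target, k=1)
    diff = D_target[iu] - D_approx[iu]
    return float(np.sqrt(np.sum(diff ** 2) / np.sum(D_approx[iu] ** 2)))
```

Both measures are zero for a perfect approximation. Stress-1 is scale-free, which makes it easier to compare across algorithms whose similarity scores live on different ranges.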
|
77 |
Dimensionality Reduction Using Factor Analysis
Khosla, Nitin (January 2006)
In many pattern recognition applications, a large number of features are extracted in order to ensure an accurate classification of unknown classes. One way to solve the problems of high dimensions is to first reduce the dimensionality of the data to a manageable size, keeping as much of the original information as possible, and then feed the reduced-dimensional data into a pattern recognition system. In this situation, the dimensionality reduction process becomes the pre-processing stage of the pattern recognition system. In addition, probability density estimation is simpler with fewer variables. Dimensionality reduction is useful in speech recognition, data compression, visualization and exploratory data analysis. Techniques that can be used for dimensionality reduction include Factor Analysis (FA), Principal Component Analysis (PCA), and Linear Discriminant Analysis (LDA). Factor Analysis can be considered an extension of Principal Component Analysis. The EM (expectation-maximization) algorithm is ideally suited to problems of this sort, in that it produces maximum-likelihood (ML) estimates of parameters when there is a many-to-one mapping from an underlying distribution to the distribution governing the observation, conditioned upon the observations. The maximization step then provides a new estimate of the parameters. This research work compares Factor Analysis (based on the Expectation-Maximization algorithm), Principal Component Analysis and Linear Discriminant Analysis for dimensionality reduction, and investigates Local Factor Analysis (EM algorithm based) and Local Principal Component Analysis using Vector Quantization.
|
78 |
Human Promoter Recognition Based on Principal Component Analysis
Li, Xiaomeng (January 2008)
Master of Engineering / This thesis presents an innovative human promoter recognition model, HPR-PCA. Principal component analysis (PCA) is applied to context feature selection on DNA sequences, and the prediction network is built with an artificial neural network (ANN). A thorough literature review of the relevant topics in the promoter prediction field is also provided. As the main technique of HPR-PCA, the application of PCA to feature selection is developed first. In order to find informative and discriminative features for effective classification, PCA is applied to the combined promoter and exon frequency matrices for different n-mers, and the principal components (PCs) of each matrix are generated to construct the new feature space. ANN-based classifiers are used to test the discriminability of each feature space. Finally, the 3- and 5-mer feature matrix is selected as the context feature in this model. Two proposed schemes of the HPR-PCA model are discussed, and the implementations of the sub-modules in each scheme are introduced. The context features selected by PCA are used to build three promoter and non-promoter classifiers. CpG-island modules are embedded into the models in different ways. In the comparison, Scheme I obtains better prediction results on two test sets, so it is adopted as the model for HPR-PCA for further evaluation. Three existing promoter prediction systems are used to compare to HPR-PCA on three test sets including the chromosome 22 sequence. The performance of HPR-PCA is outstanding compared to the other four systems.
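The n-mer frequency matrices mentioned above can be built from raw sequences with a few lines of Python. The sketch below is illustrative only: it assumes relative frequencies over the alphabet ACGT with overlapping windows, and the function name is hypothetical. The thesis's exact counting scheme and normalization are not described in the abstract.

```python
from collections import Counter
from itertools import product

def kmer_frequencies(sequence, k):
    """Relative frequency of every k-mer over ACGT, in a fixed lexicographic order."""
    sequence = sequence.upper()
    # Count overlapping windows of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)   # windows with other symbols (e.g. N) are ignored
    return [counts[m] / total if total else 0.0 for m in kmers]
```

Stacking one such row per promoter or exon sequence gives a matrix with 4^k columns (64 for 3-mers, 1024 for 5-mers), to which PCA can then be applied before training a classifier.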
|
79 |
Improved effort estimation of software projects based on metrics
Andersson, Veronika; Sjöstedt, Hanna (January 2005)
Saab Ericsson Space AB develops products for space for a predetermined price. Since the price is fixed, it is crucial to have a reliable prediction model to estimate the effort needed to develop the product. In general, software effort estimation is difficult, and at the software department this is a problem.
By analyzing metrics collected from former projects, different prediction models are developed to estimate the number of person-hours a software project will require. Models for predicting the effort before a project begins are developed first. Only a few variables are known at this stage of a project. The models developed are compared to the model currently used at the company. Linear regression models reduce the estimation error by nine percentage points, and nonlinear regression models improve the result even more. The model used today is also calibrated to improve its predictions. A principal component regression model is developed as well. A model to improve the estimate during an ongoing project is also developed. This is a new approach, and comparison with the first estimate is the only evaluation.
The result is an improved prediction model. Several models perform better than the one used today. In the discussion, positive and negative aspects of the models are debated, leading to the choice of a model recommended for future use.
|
80 |
Do Self-Sustainable MFI:s help alleviate relative poverty?
Stenbäcken, Rasmus (January 2006)
The subject of this paper is microfinance and the question: do self-sustainable MFIs alleviate poverty?
An MFI is a microfinance institution: a regular bank, or an NGO that has transformed into a licensed financial institution, focused on microenterprises. To answer the question, data was gathered in Ecuador. South America has a large number of self-sustainable MFIs, and Ecuador was selected as the country to be studied because it has an intermediate level of market penetration in the microfinance sector. Interviews were used to determine relative poverty before and after access to microcredit. The interview data was used to determine the impact of microcredit on different aspects of relative poverty using the Difference in Difference method.
Significant differences are found between old and new clients, as well as for the change over time. However, no significant results are found for the difference in change over time between clients and non-clients. The author argues that the insignificant result can be due to a too small sample size, to disturbances in the sample selection, or to this specific kind of institution having little or no effect on the economic development of its current clients.
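The Difference in Difference estimate referred to above reduces to simple arithmetic on four group means. The Python sketch below shows the calculation; the function name and the idea of passing raw outcome lists are assumptions for illustration, and no data from the study is reproduced.

```python
from statistics import mean

def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change among clients minus change among non-clients over the same period."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change
```

Subtracting the non-clients' change removes trends that affect both groups, so the remainder is attributed to access to microcredit, provided both groups would otherwise have followed parallel trends.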
|