Global ETD Search

61	EVALUATING THE PERFORMANCE AND WATER CHEMISTRY DYNAMICS OF PASSIVE SYSTEMS TREATING MUNICIPAL WASTEWATER AND LANDFILL LEACHATE Wallace, Jack 29 October 2013 This thesis consists of work conducted in two separate studies evaluating the performance of passive systems for treating wastewater effluents. The first study characterized three wastewater stabilization ponds (WSPs) providing secondary and tertiary treatment for municipal wastewater at a facility in Amherstview, Ontario, Canada. Since 2003, the WSPs have experienced excessive algae growth and high pH levels during the summer months. A full range of parameters (pH, chlorophyll-a (chl-a), dissolved oxygen (DO), temperature, alkalinity, oxidation-reduction potential (ORP), conductivity, nutrient species, and organic matter measures) was monitored for the system, and the chemical dynamics in the three WSPs were assessed through multivariate statistical analysis. Supplementary continuous monitoring of pH, chl-a, and DO was performed to identify time-series dependencies. The analyses showed strong correlations between chl-a and sunlight, temperature, organic matter, and nutrients, and strong time-dependent correlations between chl-a and DO and between chl-a and pH. Additionally, algae samples were collected and analyzed using metagenomics methods to determine the distribution and speciation of algae growth in the WSPs. A strong shift from the dominance of a major class of green algae, chlorophyceae, in the first WSP to the dominance of land plants, embryophyta (including aquatic macrophytes), in the third WSP was observed and corresponded to field observations during the study period. The second study evaluated the performance and chemical dynamics of a hybrid-passive system treating leachate from a municipal solid waste (MSW) landfill in North Bay, Ontario, Canada. Over a three-year period, a full range of parameters (pH, DO, temperature, alkalinity, ORP, conductivity, sulfate, chloride, phenols, solids fractions, nutrient species, organic matter measures, and metals) was monitored bi-weekly, and the dataset was analyzed with time series and multivariate statistical techniques. Regression analyses identified eight parameters that were most frequently retained for modelling the five criteria parameters (alkalinity, ammonia, chemical oxygen demand, iron, and heavy metals) at a statistically significant level (p < 0.05): conductivity, DO, nitrite, organic nitrogen, ORP, pH, sulfate, and total volatile solids. / Thesis (Master, Civil Engineering) -- Queen's University, 2013-10-27 05:29:20.564
62	Detecting and Correcting Batch Effects in High-Throughput Genomic Experiments Reese, Sarah 19 April 2013 Batch effects are due to probe-specific systematic variation between groups of samples (batches) resulting from experimental features that are not of biological interest. Principal components analysis (PCA) is commonly used as a visual tool to determine whether batch effects exist after applying a global normalization method. However, PCA yields linear combinations of the variables that contribute maximum variance and thus will not necessarily detect batch effects if they are not the largest source of variability in the data. We present an extension of principal components analysis to quantify the existence of batch effects, called guided PCA (gPCA). We describe a test statistic that uses gPCA to test whether a batch effect exists. We apply our proposed test statistic to simulated data and to two copy number variation case studies: the first consisted of 614 samples from a breast cancer family study using Illumina Human 660 bead-chip arrays, whereas the second consisted of 703 samples from a family blood pressure study that used the Affymetrix SNP Array 6.0. We demonstrate that our statistic has good statistical properties and is able to identify significant batch effects in both case studies. We further compare existing batch effect correction methods and apply gPCA to test their effectiveness. We conclude that our novel statistic, which uses guided principal components analysis to identify whether batch effects exist in high-throughput genomic data, is effective. Although our examples pertain to copy number data, gPCA is general and can be used on other data types as well. Biostatistics Batch Effects Principal Components Analysis Bioinformatics Biostatistics Physical Sciences and Mathematics Statistics and Probability
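The abstract for record 62 does not spell out the gPCA statistic, so the following is only a minimal sketch under stated assumptions: it assumes the statistic is the ratio of the variance of the first guided component (leading right singular vector of Y^T X, with Y the batch indicator matrix) to the variance of the first ordinary principal component, with a permutation p-value obtained by shuffling batch labels. The function names (`gpca_delta`, `permutation_p_value`) are hypothetical, and the power iteration is a stand-in for a proper SVD routine.

```python
import random


def center_columns(X):
    """Subtract each column's mean so every feature has mean zero."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]


def top_right_singular_vector(A, iters=200, seed=0):
    """Power iteration on A^T A; approximates the leading right singular vector."""
    rng = random.Random(seed)
    p = len(A[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    for _ in range(iters):
        Av = [sum(a * b for a, b in zip(row, v)) for row in A]
        w = [sum(A[i][j] * Av[i] for i in range(len(A))) for j in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            raise ValueError("matrix has no variation to decompose")
        v = [x / norm for x in w]
    return v


def projected_variance(X, v):
    """Sample variance of the scores X v."""
    scores = [sum(a * b for a, b in zip(row, v)) for row in X]
    mean = sum(scores) / len(scores)
    return sum((s - mean) ** 2 for s in scores) / (len(scores) - 1)


def gpca_delta(X, batch):
    """Assumed gPCA statistic: guided PC1 variance over unguided PC1 variance."""
    Xc = center_columns(X)
    labels = sorted(set(batch))
    # Y^T X for the batch indicator matrix Y: one row of column sums per batch.
    YtX = [
        [sum(row[j] for row, b in zip(Xc, batch) if b == lab)
         for j in range(len(Xc[0]))]
        for lab in labels
    ]
    v_guided = top_right_singular_vector(YtX)
    v_unguided = top_right_singular_vector(Xc)
    return projected_variance(Xc, v_guided) / projected_variance(Xc, v_unguided)


def permutation_p_value(X, batch, n_perm=200, seed=1):
    """Share of label permutations whose delta is at least the observed delta."""
    rng = random.Random(seed)
    observed = gpca_delta(X, batch)
    shuffled = list(batch)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if gpca_delta(X, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

A delta near 1 would suggest that the batch direction coincides with the largest source of variance; a small delta with a small p-value would be the case the abstract highlights, where ordinary PCA plots could miss the batch effect.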
63	A genomic association and prediction of principal components of growth traits and visual scores in Nellore cattle / Vargas, Giovana. January 2018 Advisor: Roberto Carvalheiro / Co-advisor: Danísio Prado Munari / Co-advisor: Haroldo Henrique de Rezende Neves / Abstract: Principal component analysis (PCA) is a multivariate statistical technique for evaluating relationships among different traits in order to eliminate the redundancy resulting from their correlations. In animal breeding, PCA has been used to explore possible biological interpretations associated with the principal components (PCs), which can lead to the characterization of distinct animal biotypes. The objectives of the present study were: i) to evaluate relationships among growth, visual score, and reproductive traits by performing a PCA; ii) to identify genomic regions associated with the PCs by performing a genome-wide association study (GWAS) on the main PCs; and iii) to evaluate the prediction ability of genomic breeding values (GEBVs) obtained for the PCs. Phenotypic data from 355,524 Nellore animals provided by the Alliance Nellore database were used in this investigation. A total of 3,382 Nellore animals were genotyped using the Illumina® BovineHD chip (HD, ~777,000 SNPs) and 137 animals were genotyped using the GeneSeek Genomic Profiler Bovine HD chip (GGP-HD, ~76,000 SNPs). The GGP-HD genotypes were imputed to the denser HD panel. After genomic data quality control, 471,880 SNPs from 3,519 animals were available. The PCA was applied to the additive genetic (co)variance matrix (AT) obtained using multi-trait analysis. For GWAS, SNP effects were estimated using the weighted single-step GBLUP and the BayesC methods. The genes identified within the top-10 ranking windows that explained the highest proportion of variance were used for further functional analyses. For the genomic prediction study, the GEBVs were predicted using three distinct response variables: EBV of the original traits, EBV of the PCs, and EBV of a selection index used by some Nellore cattle commercial breeding programs. The geno... (For the complete abstract, see the electronic access below) / Doctorate. Beef cattle. Single nucleotide polymorphism. Nellore (cattle). Genomics. Principal components analysis.
64	Mapeamento associativo para tolerância a altas temperaturas em germoplasma exótico de soja (Glycine max) / Association mapping to heat tolerance in exotic germplasm soybean (Glycine max) Sousa, Camila Campêlo de 13 November 2015 Soybean (Glycine max) is one of the most important crops in the world, being an excellent source of protein and oil. The species is also used by the biofuel industry. Recent climate change matters for agribusiness, and for soybean the situation is aggravated by the temperature and latitude conditions recommended for planting. Thus, to increase the productivity of the crop even in the face of global warming, it is essential to develop cultivars that are highly productive and tolerant to high temperatures. In this context, the general aim of this study was to select heat-tolerant soybean genotypes. A population composed of 80 soybean PIs and 15 experimental checks was evaluated under high-temperature conditions, with experiments conducted in the cities of Teresina-PI, Piracicaba-SP and Jaboticabal-SP in the 2013/2014 season. The genotypes were evaluated by univariate and multivariate analyses, and the genotypes most tolerant to high temperatures were selected by principal component analysis (PCA). In the univariate analyses of variance, all traits showed significant treatment effects by the F test. In the PCA of the experiment conducted in Teresina-PI, the variables that contributed most to the variability of the genotypes were: the date on which half of the plot reached the R5 stage, plant height at maturity, grain filling period and agronomic value. In Piracicaba-SP, the variables that contributed most to the variability were grain filling period, 100-seed weight and number of days to maturity. For the selection of the most heat-tolerant genotypes in Jaboticabal-SP, plant height and yield were the main variables considered. For the association mapping analysis, phenotyping was carried out in Teresina-PI under high-temperature conditions for four traits: plant height at maturity, agronomic value, 100-seed weight and yield. Genotyping was carried out using the Affymetrix chip. The linkage disequilibrium between pairs of markers was calculated by the coefficient of determination r2, and the association analysis between markers and the phenotype of interest was performed using the generalized linear model approach. A total of 16 significant marker-trait associations were detected for the four traits. Glycine max Association mapping Principal components analysis
65	Caracterização de germoplasma de pupunha (Bactris gasipaes Kunth) por descritores morfológicos [Characterization of peach palm (Bactris gasipaes Kunth) germplasm by morphological descriptors] / Iriarte Martel, Jorge Hugo. January 2002 Advisor: José Roberto Môro / Committee: João Carlos de Oliveira / Committee: Antonio Sérgio Ferraudo / Committee: José Antonio Alberto da Silva / Committee: Jair Costa Nachtigal / Abstract: The peach palm has very great economic and social potential. It was the most important palm in pre-Columbian America and, together with maize and cassava, formed the basis of the diet of indigenous peoples. The main products extracted are heart of palm and fruits for direct human consumption, animal feed, flour for human consumption and vegetable oil. The objectives of the present work were to use a recommended list of morphological descriptors to discriminate first between the Pará and Putumayo landraces and, after statistical validation of the descriptors, to verify the existence of the Solimões landrace, which has been denied until now. Univariate and multivariate statistical techniques were applied in an attempt to discriminate the landraces. Of the initial descriptors (42 according to the Portuguese abstract, forty-one according to the English one), 25 showed significant differences between the landraces and 15 were approximately normally distributed. The discriminant analysis showed that 15% of the Pará plants and 14% of the Putumayo plants were misclassified; in the principal component analysis, the percentages were 9% and 19%, respectively, for the two landraces. The Manacapuru population did not form a group in the first two cluster analyses, nor with principal components. Taken together, the three analyses discriminated the Pará, Putumayo and Solimões landraces, and the discriminant analysis with three landraces classified Manacapuru within the Putumayo landrace. The most important descriptors in the discrimination and classification of the landraces were: number of ears per raceme, rachis length, fruit weight, fruit peel thickness, ease of peeling the fruits, peel weight, fruit flavor, pulp thickness, morphological distance between fruits and seed weight. / Doctorate. Peach palm. Botany - Morphology. Plant germplasm - Resources. Botany - Classification. Principal components analysis. Peach.
66	Previsão da curva de juros com análise de componentes principais utilizando matriz de covariância de longo prazo / Forecast of the interest curve with principal components analysis using long-term covariance matrix Hugo Mamoru Aoki Hissanaga 25 August 2017 Although Principal Component Analysis (PCA) is one of the most common methods to estimate the structure of interest rate volatility, there are strong indications that it is not adequate for estimating interest rate factors in the presence of temporal dependence and measurement errors. To correct these problems, it is necessary to use the long-term covariance matrix, which extracts the correct covariance structure present in these processes. In this work, we show that out-of-sample forecasting of the interest rate curve with PCA based on the long-term covariance matrix (LRCM) seems to be more accurate than PCA based on the sample covariance matrix. Forecasting Interest curve Principal components analysis Robustness
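The abstract for record 66 does not say which long-term covariance estimator was used. As an illustration only, the sketch below assumes a Bartlett-kernel (Newey-West style) estimator, one common choice; the function name `long_run_covariance` and the `max_lag` parameter are hypothetical. Running PCA on the returned matrix instead of the sample covariance matrix would be the step the abstract describes.

```python
def long_run_covariance(series, max_lag):
    """Bartlett-kernel long-run covariance of a multivariate time series.

    series: list of observations in time order, each a list of p floats.
    max_lag: number of autocovariance lags to include
             (0 reduces to the sample covariance with 1/n scaling).
    """
    n = len(series)
    p = len(series[0])
    means = [sum(col) / n for col in zip(*series)]
    x = [[v - m for v, m in zip(row, means)] for row in series]

    def autocov(k):
        # Gamma_k[i][j] = (1/n) * sum over t of x_t[i] * x_{t-k}[j]
        return [
            [sum(x[t][i] * x[t - k][j] for t in range(k, n)) / n
             for j in range(p)]
            for i in range(p)
        ]

    lrcm = autocov(0)
    for k in range(1, max_lag + 1):
        # Bartlett weights decline linearly with the lag.
        weight = 1.0 - k / (max_lag + 1)
        gamma = autocov(k)
        for i in range(p):
            for j in range(p):
                lrcm[i][j] += weight * (gamma[i][j] + gamma[j][i])
    return lrcm
```

With positively autocorrelated yields, the long-run variances exceed the sample variances, which is why the leading components extracted from the two matrices can differ.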
67	Agrupamento de trabalhadores com perfis semelhantes de aprendizado utilizando técnicas multivariadas [Grouping workers with similar learning profiles using multivariate techniques] Azevedo, Bárbara Brzezinski January 2013 Manufacturing of customized products results in a wide variety of models, reduced batch sizes and frequent alternation of the tasks performed by workers. In this context, manual tasks are especially affected by workers' adaptation to new product models. This learning process takes place at different paces within a group of workers. This thesis aims at grouping workers with similar learning profiles in order to monitor the formation of bottlenecks in production lines caused by learning dissimilarities in manual processes. To that end, we present approaches for clustering workers based on parameters derived from Learning Curve (LC) modeling. These parameters, which characterize how workers adapt to tasks, are transformed through Principal Component Analysis (PCA), and the PCA scores are used as clustering variables. Next, other transformations of the parameters using Kernel functions are tested. The workers are clustered using the K-Means and Fuzzy C-Means methods, and the quality of the resulting clusters is measured by the Silhouette Index. Finally, we suggest a variable importance index based on parameters derived from PCA to select the most relevant variables for clustering. The proposed approaches are applied to a process in the footwear industry, yielding satisfactory results when compared with clustering performed without prior transformation of the data or without variable selection. Organizational learning Industrial cluster Learning curves Clustering Principal components analysis Data transformation Variable selection
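The abstract for record 67 states that cluster quality is measured by the Silhouette Index. The sketch below shows the standard definition of that index for a given labeling, assuming Euclidean distance and at least two clusters; the function name `silhouette_index` and the toy data are illustrative and are not taken from the thesis.

```python
import math


def silhouette_index(points, labels):
    """Mean silhouette width over all points, using Euclidean distance."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    def mean_distance(i, members):
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    widths = []
    for i, lab in enumerate(labels):
        if len(clusters[lab]) == 1:
            widths.append(0.0)  # convention: a singleton cluster scores zero
            continue
        # a: mean distance to the other members of the point's own cluster
        a = mean_distance(i, clusters[lab])
        # b: mean distance to the nearest other cluster
        b = min(mean_distance(i, members)
                for other, members in clusters.items() if other != lab)
        widths.append((b - a) / max(a, b))
    return sum(widths) / len(widths)
```

Values close to 1 indicate compact, well-separated groups, so candidate clusterings (for example with and without the PCA transformation) can be compared by this single number.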
68	Previsão da curva de juros com análise de componentes principais utilizando matriz de covariância de longo prazo / Forecast of the interest curve with principal components analysis using long-term covariance matrix Hissanaga, Hugo Mamoru Aoki 25 August 2017 Although Principal Component Analysis (PCA) is one of the most common methods to estimate the structure of interest rate volatility, there are strong indications that it is not adequate for estimating interest rate factors in the presence of temporal dependence and measurement errors. To correct these problems, it is necessary to use the long-term covariance matrix, which extracts the correct covariance structure present in these processes. In this work, we show that out-of-sample forecasting of the interest rate curve with PCA based on the long-term covariance matrix (LRCM) seems to be more accurate than PCA based on the sample covariance matrix. Forecasting Interest curve Principal components analysis Robustness
69	Multi-purpose multi-way data analysis Ebrahimi Mohammadi, Diako, Chemistry, Faculty of Science, UNSW January 2007 In this dissertation, the application of multi-way analysis is extended into new areas of environmental chemistry, microbiology, electrochemistry and organometallic chemistry. Additionally, new practical aspects of some of the multi-way analysis methods are discussed. Parallel Factor Analysis Two (PARAFAC2) is used to classify a wide range of weathered petroleum oils using GC-MS data. Various chemical and data analysis issues that exist in the current methods of oil spill analysis are discussed, and the proposed method is demonstrated to have the potential to be employed in identifying the source of oil spills. Two important practical aspects of PARAFAC2 are exploited to deal with chromatographic shifts and non-diagnostic peaks. GEneralized Multiplicative ANalysis Of VAriance (GEMANOVA) is applied to assess the bactericidal activity of new natural antibacterial extracts on three species of bacteria in different structure and oxidation forms and different concentrations. In this work, while the applicability of traditional ANOVA is restricted due to the high interaction amongst the factors, GEMANOVA is shown to return robust and easily interpretable models which conform to the actual structure of the data. Peptide-modified electrochemical sensors are used to determine three metal cations, Cu2+, Cd2+ and Pb2+, simultaneously. Two sets of experiments are performed: one using a four-electrode system, returning a three-way array of size (sample × current × electrode), and one using a single electrode, resulting in a two-way data set of size (sample × current). The former data are modeled by N-PLS and the latter by PLS. Despite the presence of highly overlapped voltammograms and several sources of non-linearity, N-PLS returns reasonable models while PLS fails. An intramolecular hydroamination reaction is catalyzed by several organometallic catalysts to identify the most effective catalysts. The reaction of the starting material in the presence of 72 different catalysts is monitored by UV-Vis at two time points, before and after heating the mixtures in an oven. PARAFAC is applied to the three-way data set of (sample × wavelength × time) to resolve the overlapped UV-Vis peaks and to identify the effective catalysts using the estimated relative concentration of product (loadings plot of the sample mode). Chemometrics. Multivariate analysis. Principal comparisons (Statistics). Chemistry -- Statistical methods. Principal components analysis.
70	A Spatio-Temporal Analysis of Dolphinfish; Coryphaena hippurus, Abundance in the Western Atlantic: Implications for Stock Assessment of a Data-Limited Pelagic Resource. Kleisner, Kristin Marie 26 July 2008 Dolphinfish (Coryphaena hippurus) is a pelagic species that is ecologically and commercially important in the western Atlantic region. This species has been linked to dominant oceanographic features such as sea surface temperature (SST) frontal regions. This work first explored the linkages between the catch rates of dolphinfish and the oceanography (satellite-derived SST, distance to front calculations, bottom depth and hook depth) using Principal Components Analysis (PCA). It was demonstrated that higher catch rates are found in relation to warmer SST and nearer to frontal regions. This environmental information was then included in standardizations of catch-per-unit-effort (CPUE) indices. It was found that including the satellite-derived SST and distance to front increases the confidence in the index. The second part of this work focused on addressing spatial variability in the catch rate data for a subsection of the sampling area: the Gulf of Mexico region. This study used geostatistical techniques to model and predict spatial abundances of two pelagic species with different habitat utilization patterns: dolphinfish (Coryphaena hippurus) and swordfish (Xiphias gladius). We partitioned catch rates into two components: the probability of encounter, and the abundance given a positive encounter. We obtained separate variograms and kriged predictions for each component and combined them to give a single density estimate with corresponding variance. By using this two-stage approach, we were able to detect patterns of spatial autocorrelation that had distinct differences between the two species, likely due to differences in vertical habitat utilization. The patchy distribution of many living resources necessitates a two-stage variogram modeling and prediction process in which the probability of encounter and the positive observations are modeled and predicted separately. Such a "geostatistical delta-lognormal" approach to modeling spatial autocorrelation has distinct advantages: it allows the probability of encounter and the abundance given an encounter to possess separate patterns of autocorrelation, and it supports modeling of severely non-normally distributed data that are plagued by zeros. Variogram Models Kriging Spatial Analysis Principal Components Analysis Stock Assessment Swordfish Xiphias Gladius
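The two-component idea in record 70 (probability of encounter times abundance given an encounter) can be illustrated without any spatial modeling. The sketch below is a non-spatial simplification and not the thesis's method: it omits the variograms and kriging entirely, assumes positive catch rates are lognormal, and uses a hypothetical function name, `delta_lognormal_mean`.

```python
import math
import statistics


def delta_lognormal_mean(catch_rates):
    """Combine encounter probability and positive-catch abundance.

    Returns (p_encounter, mean_positive, combined_density).
    """
    positives = [c for c in catch_rates if c > 0]
    if len(positives) < 2:
        raise ValueError("need at least two positive observations")
    # Stage 1: probability of a positive encounter.
    p_encounter = len(positives) / len(catch_rates)
    # Stage 2: lognormal mean of the positive observations only.
    logs = [math.log(c) for c in positives]
    mu = statistics.mean(logs)
    sigma2 = statistics.variance(logs)
    mean_positive = math.exp(mu + sigma2 / 2.0)
    # Combined estimate: zeros enter only through the encounter probability.
    return p_encounter, mean_positive, p_encounter * mean_positive
```

Keeping the zeros out of the lognormal stage is what makes the approach workable for data dominated by empty sets; in the spatial version described in the abstract, each stage would be predicted by kriging before the two are multiplied.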
