121 |
Comparação de métodos de estimação para problemas com colinearidade e/ou alta dimensionalidade (p > n)
Casagrande, Marcelo Henrique, 29 April 2016
Submitted by Bruna Rodrigues (bruna92rodrigues@yahoo.com.br) on 2016-10-06T11:48:12Z
No. of bitstreams: 1
DissMHC.pdf: 1077783 bytes, checksum: c81f777131e6de8fb219b8c34c4337df (MD5) / Approved for entry into archive by Marina Freitas (marinapf@ufscar.br) on 2016-10-20T13:58:41Z (GMT) / Made available in DSpace on 2016-10-20T13:58:52Z (GMT).
Previous issue date: 2016-04-29 / Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) / This work presents a comparative study of the predictive power of four regression methods suited to situations in which the data, arranged in the design matrix, suffer from serious multicollinearity and/or high dimensionality, where the number of covariates is greater than the number of observations.
The methods discussed in this study are principal component regression, partial least squares regression, ridge regression and LASSO.
The work includes simulations in which the predictive power of each technique is evaluated for different scenarios defined by the number of covariates, the sample size, and the number and magnitude of significant coefficients (effects). The simulations highlight the main differences between the methods and make it possible to create a guide that helps users choose a method based on any prior knowledge they may have.
An application to real (non-simulated) data is also presented. / Este trabalho apresenta um estudo comparativo do poder de predição de quatro métodos de regressão adequados para situações nas quais os dados, dispostos na matriz de planejamento, apresentam sérios problemas de multicolinearidade e/ou de alta dimensionalidade, em que o número de covariáveis é maior do que o número de observações.
No presente trabalho, os métodos abordados são: regressão por componentes principais, regressão por mínimos quadrados parciais, regressão ridge e LASSO.
O trabalho engloba simulações, em que o poder preditivo de cada uma das técnicas é avaliado para diferentes cenários definidos por número de covariáveis, tamanho de amostra e quantidade e intensidade de coeficientes (efeitos) significativos, destacando as principais diferenças entre os métodos e possibilitando a criação de um guia para que o usuário possa escolher qual metodologia usar com base em algum conhecimento prévio que o mesmo possa ter.
Uma aplicação em dados reais (não simulados) também é abordada.
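As a rough illustration of two of the four methods compared in this abstract, the sketch below fits ridge regression and the LASSO by coordinate descent on a toy design matrix with more covariates than observations (p > n). The data, the penalty values and all function names are assumptions made for this example; they are not taken from the dissertation, and the sketch makes no claim about which method predicts better.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t (the LASSO proximal step)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def coordinate_descent(X, y, l1=0.0, l2=0.0, n_iter=500):
    """Minimize (1/2n)*||y - Xb||^2 + l1*||b||_1 + (l2/2)*||b||^2.

    l1 > 0, l2 = 0 gives the LASSO; l1 = 0, l2 > 0 gives ridge regression.
    No intercept is fitted, so X and y are assumed to be centered.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - Xb for the starting point b = 0
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            denom = sum(c * c for c in col) / n + l2
            if denom == 0.0:
                continue  # all-zero column and no ridge penalty
            # Correlation of column j with the residual that ignores b[j].
            rho = sum(col[i] * (resid[i] + col[i] * b[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, l1) / denom
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= col[i] * delta
                b[j] = new_bj
    return b


def rss(X, y, b):
    """Residual sum of squares of the linear fit Xb."""
    return sum((yi - sum(x * bj for x, bj in zip(row, b))) ** 2
               for row, yi in zip(X, y))


# Made-up centered data: n = 4 observations, p = 6 covariates (p > n).
# The first two columns are nearly collinear.
X = [
    [-1.5, -1.4, 0.5, -0.5, 1.0, 0.0],
    [-0.5, -0.6, -0.5, 0.5, -1.0, 1.0],
    [0.5, 0.6, -0.5, -0.5, 1.0, -1.0],
    [1.5, 1.4, 0.5, 0.5, -1.0, 0.0],
]
y = [-3.0, -1.0, 1.0, 3.0]

ridge_coef = coordinate_descent(X, y, l2=0.1)
lasso_coef = coordinate_descent(X, y, l1=0.1)

# Smallest l1 penalty at which every LASSO coefficient is zero.
lam_max = max(abs(sum(X[i][j] * y[i] for i in range(len(X)))) / len(X)
              for j in range(len(X[0])))
null_coef = coordinate_descent(X, y, l1=1.01 * lam_max)
```

Ordinary least squares has no unique solution here because the matrix has more columns than rows; both penalties make the problem well posed. Comparing `ridge_coef` with `lasso_coef` is one way to see how the two penalties distribute weight across correlated covariates.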
|
122 |
ESTIMATIVA DA MASSA ESPECÍFICA EM ETANOL COMBUSTÍVEL POR MODELOS DE REDES NEURAIS ARTIFICIAIS E DE REGRESSÃO POR MÍNIMOS QUADRADOS PARCIAIS / ESTIMATION OF SPECIFIC MASS IN FUEL ETHANOL BY MODELS OF ARTIFICIAL NEURAL NETWORK AND OF PARTIAL LEAST SQUARES REGRESSION
Santos, Marcelo José Castro dos, 22 October 2013
Made available in DSpace on 2016-08-19T12:56:41Z (GMT). No. of bitstreams: 1
Dissertacao Marcelo Jose.pdf: 1590491 bytes, checksum: 7be3e83649dd910e0afe9a5a25de4e73 (MD5)
Previous issue date: 2013-10-22 / Ethanol has attracted growing interest in many countries, particularly in Brazil owing to the PROÁLCOOL program. Determining the properties of ethanol and other fuels experimentally by official methods is time-consuming and tedious, so estimating these properties with computational tools can be very useful. In the present work, partial least squares (PLS) regression and multilayer artificial neural networks (ANN) were used to estimate one of the most important properties of fuel ethanol, its density (specific mass), from official ethanol quality parameters obtained in analyses carried out at the LAPQAP/UFMA laboratory over 12 years (2002-2013). The data were first analyzed carefully to select the set of variables and samples that gave satisfactory performance for both models. The estimates from the two approaches were compared and validated. The predictive ability of the neural network was very good for the parameters studied and consistent with the precision of the experimental measurements. The low mean squared error, together with the randomness, zero mean and constant variance of the residuals, indicated that the models are suitable for estimating (predicting) the density of ethanol. The ANN model was adequate, with a normalized mean squared error (NMSE) of 0.0012, far lower than the 0.2221 obtained with the PLS model. This result is below the measurement uncertainty range of the equipment used to determine density experimentally, indicating that the model performs very well. / O etanol tem alcançado crescente interesse em muitos países, principalmente, no Brasil devido ao programa PROÁLCOOL. A determinação experimental das propriedades deste biocombustível e de outros combustíveis por meio de métodos oficiais é muito demorada, bem como é considerado um tedioso processo. 
A estimativa dessas propriedades com a ajuda de ferramentas computacionais pode ser de grande utilidade. No presente trabalho, os métodos de regressão por mínimos quadrados parciais (PLS) e redes neurais artificiais de múltiplas camadas (RNA) foram usados para estimar uma das mais importantes propriedades do etanol combustível, massa específica, utilizando parâmetros de qualidade oficiais de etanol, oriundos de análises realizadas no laboratório LAPQAP/UFMA, durante 12 anos (período: 2002-2013). Inicialmente, uma análise cuidadosa dos dados foi realizada a fim de selecionar um conjunto de variáveis e dados que melhor representasse um desempenho satisfatório dos dois modelos estudados. As estimativas de ambas as abordagens foram comparadas e validadas. A capacidade preditiva da rede neural obtida foi considerada muito boa para os parâmetros estudados, e compatível com a precisão das medidas experimentais. O baixo erro quadrático médio, a aleatoriedade, a média nula e a variância constante, obtida para os resíduos, evidenciaram a adequabilidade dos modelos usados, sugerindo a utilização destes modelos para estimar (predizer) a massa específica do etanol. Resultados indicaram que o modelo de RNA foi adequado, sendo o valor de NMSE (erro quadrático médio normalizado) de 0,0012, valor este, muito inferior ao modelo de PLS de 0,2221. Este resultado alcançado é inferior aos valores da faixa de incerteza de medição do equipamento responsável pelo ensaio experimental da massa específica, comprovando que o modelo utilizado possui desempenho considerado muito bom.
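The abstract compares the two models by their NMSE but does not state the formula used. The sketch below assumes one common definition, the sum of squared errors divided by the total sum of squares about the observed mean; the function name and the density values are invented for illustration and are not data from the dissertation.

```python
def nmse(observed, predicted):
    """Normalized mean squared error under an assumed definition:
    sum((o - p)^2) / sum((o - mean(o))^2).

    0 means a perfect fit; 1 means no better than predicting the mean.
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0.0:
        raise ValueError("observed values have zero variance")
    return sse / sst


# Hypothetical density readings (kg/m^3) and model estimates, made up for the example.
observed = [807.6, 808.1, 809.0, 810.2, 811.1]
predicted = [807.7, 808.0, 809.1, 810.1, 811.2]
score = nmse(observed, predicted)
```

Under this definition a value close to 0, such as the 0.0012 reported for the ANN, means the residual variation is a small fraction of the variation in the measured property.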
|
123 |
Investigation of multivariate prediction methods for the analysis of biomarker data
Hennerdal, Aron, January 2006
The paper describes predictive modelling of biomarker data from patients suffering from multiple sclerosis. Improvements to the multivariate analysis of the data are investigated, with the goal of increasing the ability to assign samples to the correct subgroups from the data alone. The effects of different scalings applied to the data beforehand are investigated, and combinations of multivariate modelling methods and variable selection methods are evaluated. Attempts are made to merge the predictive capabilities of the method combinations through voting procedures. Bagging, a technique for improving the results of PLS modelling, is evaluated. Of the multivariate analysis methods tried, partial least squares (PLS) and support vector machines (SVM) are found to be the best. It is concluded that scaling has little effect on prediction performance for most methods. The method combinations have interesting properties: the default variable selections of the multivariate methods are not always the best. Bagging improves performance, but at a high cost. No reasons are found for drastically changing the workflows of the biomarker data analysis, but slight improvements are possible. Further research is needed.
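The bagging and voting ideas mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the base learner is a one-dimensional nearest-centroid classifier standing in for PLS, the bootstrap is stratified by class so that every resample contains both groups, and the biomarker values and labels are invented. None of these choices are taken from the thesis.

```python
import random


def fit_centroids(samples):
    """samples: list of (value, label) pairs. Returns {label: mean value}."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_centroid(centroids, value):
    """Assign the label whose centroid is closest to value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


def stratified_bootstrap(samples, rng):
    """Resample with replacement inside each class (an assumption of this sketch)."""
    by_label = {}
    for sample in samples:
        by_label.setdefault(sample[1], []).append(sample)
    resample = []
    for group in by_label.values():
        resample.extend(rng.choice(group) for _ in group)
    return resample


def bagged_predict(samples, value, n_models=25, seed=0):
    """Fit one model per bootstrap resample and return the majority vote."""
    rng = random.Random(seed)
    votes = {}
    for _ in range(n_models):
        model = fit_centroids(stratified_bootstrap(samples, rng))
        label = predict_centroid(model, value)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


# Hypothetical single-biomarker training set with two subgroups, A and B.
train = [(0.1, "A"), (0.4, "A"), (0.7, "A"), (1.0, "A"),
         (10.0, "B"), (10.3, "B"), (10.6, "B"), (11.0, "B")]
label_low = bagged_predict(train, 0.5)
label_high = bagged_predict(train, 10.4)
```

The cost noted in the abstract is visible in the structure: `n_models` separate fits are needed for every bagged model, so training time grows roughly in proportion to the number of resamples.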
|
124 |
Multivariat dataanalys för att undersöka skillnader i undervisnings- och bedömningspraxis i kursen kemi 2
Larsson, Daniel, January 2018
Although formative assessment is advocated within the research community, there is currently very large variation in how formative assessment is implemented in schools and in the effects it has. Methods for mapping formative assessment practice are needed in order to distinguish "good" from "less good" formative assessment practice. The aim of this thesis was to map and distinguish the formative assessment practices of six upper secondary school classes that had completed the course Chemistry 2 (kemi 2), using a student questionnaire and multivariate projection methods such as PCA and PLS-DA. A secondary aim was to use the same tools to characterize and distinguish the frequencies of the different teaching activities carried out within the same course and classes. The study showed, graphically and illustratively, large variation in how formative assessment was experienced within the surveyed classes. PCA also proved to be an excellent tool for identifying student responses that fell outside the "normal" variation. A PLS-DA analysis revealed a difference in the frequencies of teaching activities between two municipal schools and one private school, although these results should be interpreted with some caution.
|
125 |
One Step Closer to Non-Invasive: Quantifying Coral Zooxanthellae Pigment Concentrations Using Bio-Optics
Hancock, Harmony Alise, 01 June 2012
Due to the invasive nature of quantification techniques, baseline pigment data for coral-dwelling zooxanthellae are not known. In an attempt to develop a model for non-invasive estimation of zooxanthellae pigment concentrations from corals, field samples were taken from Porites rus and P. lutea in Apra Harbor, Guam. In-situ reflectance spectra (R400-R800) from 22 coral colonies were collected. “Coral truthing” was accomplished by extracting corresponding tissue core samples. Subsequent analysis to quantify the concentrations of 6 zooxanthellae pigments (µg cm⁻²) was performed using HPLC. Trials of multiple linear regressions were attempted (EJ Hochberg) and found inappropriate, despite previous success. The multivariate calibration technique partial least squares regression (PLS-R) is an excellent tool in the case of co-linear variables. Thus, PLS-R was attempted for chlorophyll c2 and peridinin after demonstration of co-linearity. This may be an appropriate approach for development of bio-optical models to estimate zooxanthellae pigment concentrations. Further, the dinoflagellate diagnostic pigment peridinin may be of great value for reef-scale remote sensing of changes in coral status in the future.
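To make concrete why PLS-R copes with co-linear predictors where multiple linear regression does not, here is a minimal sketch of single-response PLS (PLS1) using the NIPALS-style deflation scheme. The function names and the two-column toy data, in which the second column is exactly twice the first, are assumptions for illustration; they are not the reflectance data or the software used in the thesis.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit PLS1 by successive deflation. Returns (x_mean, y_mean, components)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X-space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left in X that co-varies with y
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        q = sum(t[i] * f[i] for i in range(n)) / tt                    # y loading
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        # Deflate: remove what this component explains from X and y.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - t[i] * q for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict the response for one new sample x."""
    x_mean, y_mean, components = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += t * q
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat


# Toy data: the second predictor is exactly 2x the first (perfect co-linearity),
# so the normal equations of ordinary least squares are singular.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0], [5.0, 10.0]]
y = [4.0, 7.0, 10.0, 13.0, 16.0]  # y = 3 * x1 + 1
model = pls1_fit(X, y, n_components=1)
```

Because PLS projects the predictors onto a few latent components before regressing, it never has to invert the singular cross-product matrix, which is the property the abstract relies on when the reflectance bands are co-linear.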
|
126 |
Neuromuscular Strategies for Regulating Knee Joint Moments in Healthy and Injured Populations
Flaxman, Teresa, January 2017
Background: Joint stability has been experimentally and clinically linked to mechanisms of knee injury and joint degeneration. The only dynamic, and perhaps most important, regulators of knee joint stability are contributions from muscular contractions. In participants with unstable knees, such as anterior cruciate ligament (ACL) injured, a range of neuromuscular adaptations has been observed including quadriceps weakness and increased co-activation of adjacent musculature. This co-activation is seen as a compensation strategy to increase joint stability. In fact, despite increased co-activation, instability persists and it remains unknown whether observed adaptations are the result of injury induced quadriceps weakness or the mechanical instability itself. Furthermore, there exists conflicting evidence on how and which of the neuromuscular adaptations actually improve and/or reduce knee joint stability.
Purpose: The overall aim of this thesis is therefore to elucidate the role of injury and muscle weakness in muscular contributions to knee joint stability by addressing two main objectives: (1) to further our understanding of individual muscle contributions to internal knee joint moments; and (2) to investigate neuromuscular adaptations, and their effects on knee joint moments, caused by either ACL injury or experimental voluntary quadriceps inhibition (induced by pain).
Methods: The relationship between individual muscle activation and internal net joint moments was quantified using partial least squares regression models. To limit the biomechanical contributions to force production, surface electromyography (EMG) and kinetic data were recorded during a weight-bearing isometric force matching task.
A cross-sectional study design determined differences in individual EMG-moment relationships between ACL deficient and healthy controls (CON) groups. A crossover placebo controlled study design determined these differences in healthy participants with and without induced quadriceps muscle pain. Injections of hypertonic saline (5.8%) to the vastus medialis induced muscle pain. Isotonic saline (0.9%) acted as control. Effect of muscle pain on muscle synergies recruited for the force matching task, lunging and squatting tasks was also evaluated. Synergies were extracted using a concatenated non-negative matrix factorization framework.
Results/Discussion: In CON, significant relationships of the rectus femoris and tensor fascia latae to knee extension and hip flexion; hamstrings to hip extension and knee flexion; and gastrocnemius and hamstrings to knee rotation were identified. Vastii activation was independent of moment generation, suggesting mono-articular vastii activate to produce compressive forces, essentially bracing the knee, so that bi-articular muscles crossing the hip can generate moments for the purpose of sagittal plane movement. Hip ab/adductor muscles modulate frontal plane moments, while hamstrings and gastrocnemius support the knee against externally applied rotational moments.
Compared to CON, ACL had 1) stronger relationships between rectus femoris and knee extension, semitendinosus and knee flexion, and gastrocnemius and knee flexion moments; and 2) weaker relationships between biceps femoris and knee flexion, gastrocnemius and external knee rotation, and gluteus medius and hip abduction moments. Since the knee injury mechanism is associated with shallow knee flexion angles, valgus alignment and rotation, adaptations after ACL injury are suggested to improve sagittal plane stability, but reduce frontal and rotational plane stability. During muscle pain, EMG-moment relationships of 1) semitendinosus and knee flexor moments were stronger compared to no pain, while 2) rectus femoris and tensor fascia latae to knee extension moments and 3) semitendinosus and lateral gastrocnemius to knee internal rotation moments were reduced. Results support the theory that adaptations to quadriceps pain reduce knee extensor demand to protect the joint and prevent further pain; however, changes in non-painful muscles reduce rotational plane stability.
Individual muscle synergies were identified for each moment type: flexion and extension moments were respectively accompanied by dominant hamstring and quadriceps muscle synergies while co-activation was observed in muscle synergies associated with abduction and rotational moments. Effect of muscle pain was not evident on muscle synergies recruited for the force matching task. This may be due to low loading demands and/or a subject-specific redistribution of muscle activation. Similarly, muscle pain did not affect synergy composition in lunging and squatting tasks. Rather, activation of the extensor dominant muscle synergy and knee joint dynamics were reduced, supporting the notion that adaptive response to pain is to reduce the load and risk of further pain and/or injury.
Conclusion: This thesis evaluated the interrelationship between muscle activation and internal joint moments and the effect of ACL injury and muscle pain on this relationship. Findings indicate muscle activation is not always dependent on its anatomical orientation as previous works suggest, but rather on its role in maintaining knee joint stability especially in the frontal and transverse loading planes. In tasks that are dominated by sagittal plane loads, hamstring and quadriceps will differentially activate. However, when the knee is required to resist externally applied rotational and abduction loads, strategies of global co-activation were identified. Contributions from muscles crossing the knee for supporting against knee adduction loads were not apparent. Alternatively hip abductors were deemed more important regulators of knee abduction loads.
Both muscle pain and ACL groups demonstrated changes in muscle activation that reduced rotational stability. Since frontal plane EMG-moment changes were not present during muscle pain, reduced relationships between hip muscles and abduction moments may be chronic adaptations in the ACL group that facilitate instability. Findings provide valuable insight into the roles muscles play in maintaining knee joint stability. Rehabilitative/preventative exercise interventions should focus on neuromuscular training during tasks that elicit rotational and frontal loads (e.g. side cuts, pivoting maneuvers) as well as maintaining hamstring balance, hip abductor and plantarflexor muscle strength in populations with knee pathologies and quadriceps muscle weakness.
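The Methods paragraph of this abstract says muscle synergies were extracted with a concatenated non-negative matrix factorization (NMF) framework. The sketch below shows plain NMF with multiplicative updates on a tiny muscles-by-samples matrix. It assumes that "concatenated" means placing the recordings from several tasks side by side before factorizing, which is an interpretation and is not stated in the abstract; the EMG envelope values, the rank of 2 and the fixed starting values are also invented for illustration.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Frobenius norm of V - WH."""
    WH = matmul(W, H)
    return sum((v - a) ** 2 for rv, ra in zip(V, WH) for v, a in zip(rv, ra)) ** 0.5


def init_factors(n_rows, n_cols, rank):
    """Fixed positive starting values; real analyses usually use random restarts."""
    W = [[1.0 + 0.1 * ((i + 2 * k) % 3) for k in range(rank)] for i in range(n_rows)]
    H = [[1.0 + 0.1 * ((3 * k + j) % 5) for j in range(n_cols)] for k in range(rank)]
    return W, H


def nmf(V, rank, n_iter=500, eps=1e-12):
    """Factor non-negative V (muscles x samples) as W (synergies) times H (activations)."""
    W, H = init_factors(len(V), len(V[0]), rank)
    for _ in range(n_iter):
        WT = transpose(W)
        num, den = matmul(WT, V), matmul(matmul(WT, W), H)
        H = [[h * a / (d + eps) for h, a, d in zip(rh, ra, rd)]
             for rh, ra, rd in zip(H, num, den)]
        HT = transpose(H)
        num, den = matmul(V, HT), matmul(W, matmul(H, HT))
        W = [[w * a / (d + eps) for w, a, d in zip(rw, ra, rd)]
             for rw, ra, rd in zip(W, num, den)]
    return W, H


# Hypothetical normalized EMG envelopes: 4 muscles x 3 samples for each of two tasks.
task_a = [[0.8, 0.6, 0.1], [0.7, 0.5, 0.1], [0.1, 0.2, 0.9], [0.2, 0.1, 0.8]]
task_b = [[0.5, 0.1, 0.7], [0.4, 0.1, 0.6], [0.3, 0.8, 0.2], [0.2, 0.9, 0.1]]
V = [row_a + row_b for row_a, row_b in zip(task_a, task_b)]  # concatenate along time

W, H = nmf(V, rank=2)
```

Each column of `W` is a candidate synergy (a fixed pattern of relative muscle weights) and each row of `H` is its activation over the concatenated samples. Factorizing the tasks together forces them to share one set of synergies, which is one reason a concatenated design might be chosen.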
|
127 |
The application of multivariate statistical analysis and optimization to batch processes
Yan, Lipeng, January 2015
Multivariate statistical process control (MSPC) techniques play an important role in industrial batch process monitoring and control. This research illustrates the capabilities and limitations of existing MSPC technologies, with a particular focus on partial least squares (PLS). In modern industry, batch processes often operate over relatively large spaces, with many chemical and physical systems displaying nonlinear behaviour. A linear PLS model cannot adequately predict such nonlinear systems, and hence nonlinear extensions to PLS may be required. Nonlinear PLS models can be divided into Type I and Type II. In the Type I nonlinear PLS method, the observed variables are appended with nonlinear transformations. In contrast, the Type II nonlinear PLS method assumes a nonlinear relationship within the latent variable structure of the model. Type I and Type II nonlinear multi-way PLS (MPLS) models were applied to predict the endpoint value of the product in a benchmark simulation of a penicillin batch fermentation process. By analysing and comparing linear MPLS with Type I and Type II nonlinear MPLS models, the advantages and limitations of these methods were identified and summarized. Because of the limitations of Type I and Type II nonlinear PLS models, Neural Network PLS (NNPLS) was proposed in this study and applied to predict the final product quality in the batch process. The application of the NNPLS method is presented in comparison with the linear PLS method and with the Type I and Type II nonlinear PLS methods. Multi-way NNPLS was found to produce the most accurate results, with the added advantage that no a priori information regarding the order of the dynamics was required. The NNPLS model was also able to identify nonlinear system dynamics in the batch process. Finally, NNPLS was used to build a controller by combining the NNPLS method with an endpoint control algorithm. The proposed controller was able to keep the endpoint values of penicillin and biomass concentration at a set-point.
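Two data-preparation steps implied by this abstract can be sketched briefly: unfolding three-way batch data so that each batch becomes one row (the usual first step of multi-way PLS), and the Type I idea of appending nonlinear transformations of the observed variables. The function names, the choice of a squared term and the tiny data set are assumptions for illustration only; the thesis does not specify them here.

```python
def unfold_batchwise(batches):
    """batches[i][k][j] is batch i, time point k, variable j.

    Returns a 2-D matrix with one row per batch and one column
    per (time point, variable) pair.
    """
    return [[value for time_slice in batch for value in time_slice]
            for batch in batches]


def augment_type1(rows, transforms):
    """Append transformed copies of every column (Type I nonlinear PLS input)."""
    return [row + [f(v) for f in transforms for v in row] for row in rows]


# Hypothetical data: 2 batches, 3 time points, 2 measured variables.
batches = [
    [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]],
    [[1.5, 11.0], [2.5, 21.0], [3.5, 31.0]],
]
X_unfolded = unfold_batchwise(batches)                      # 2 rows x 6 columns
X_type1 = augment_type1(X_unfolded, [lambda v: v * v])      # 2 rows x 12 columns
```

A linear PLS model fitted to `X_type1` can then capture a quadratic dependence on the original variables, at the price of a wider and more collinear matrix. That growth in the number of columns is one plausible limitation of the Type I approach.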
|
128 |
Espectroscopia Raman e quimiometria como ferramentas no monitoramento on-line do processo fermentativo da glicose pela Saccharomyces cerevisiae / Raman spectroscopy and chemometrics for on-line monitoring of glucose fermentation by Saccharomyces cerevisiae
Ávila, Thiago Carvalho de, 1985-, 22 August 2018
Orientador: Ronei Jesus Poppi / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Química / Made available in DSpace on 2018-08-22T08:18:21Z (GMT). No. of bitstreams: 1
Avila_ThiagoCarvalhode_M.pdf: 7831860 bytes, checksum: 010f2295e00f097a9ecfaf3f498a7069 (MD5)
Previous issue date: 2013 / Resumo: Este trabalho visou o uso de Espectroscopia Raman e de Quimiometria para monitoramento e controle da fermentação de glicose por Saccharomyces cerevisiae. Na primeira etapa, foi utilizada calibração multivariada baseada no método dos Mínimos Quadrados Parciais (PLS) para quantificação de glicose, etanol, glicerol, ácido acético e células. Os modelos foram desenvolvidos baseados nos valores de concentração obtidos pelos métodos de referência, cromatografia líquida de alta eficiência (HPLC) e espectrofotometria UV/Vis. Tanto na etapa de calibração quanto na de validação, a otimização foi realizada com eliminação de amostras anômalas, baseada nos valores de leverage, resíduos e escores. Na segunda etapa, cartas de controle multivariadas foram usadas para identificação de falhas em bateladas durante o processo de fermentação. Foram construídos modelos MPCA (Análise de Componentes Principais Multimodo) a partir de bateladas NOC (Condições Normais de Operação). As cartas de controle multivariadas foram aplicadas em dois modos de desdobramento dos dados obtidos durante o monitoramento, um preservando a direção das bateladas e outro a direção do tempo. As falhas estudadas foram temperatura, mudança no substrato e contaminação do sistema. No modo de desdobramento por bateladas, a carta de controle Q foi eficiente para detecção das falhas estudadas, fato comprovado pela classificação correta de três bateladas NOC como dentro de controle. No entanto, a carta de controle T2 não foi capaz de identificar as falhas estudadas corretamente como fora de controle. O modo de desdobramento pelo tempo também apresentou classificações corretas das falhas estudadas / Abstract: This work aimed to use Raman spectroscopy and chemometrics to monitor and control the fermentation of glucose by Saccharomyces cerevisiae.
In the first stage, multivariate calibration based on partial least squares (PLS) was applied to quantify glucose, ethanol, glycerol, acetic acid and cells. The calibration models were developed against concentration values obtained by the reference methods, high-performance liquid chromatography (HPLC) and UV/Vis spectrophotometry. In both the calibration and validation steps, the models were optimized by eliminating outliers on the basis of leverage, residual and score values. In the second stage, multivariate control charts were used to identify faulty batches during the fermentation process. Multi-way principal component analysis (MPCA) models were built from batches run under normal operating conditions (NOC). The multivariate control charts were applied to two modes of unfolding the multi-way data obtained during monitoring, one preserving the batch direction and the other the time direction. The faults studied were temperature, changes in the substrate and contamination of the system. In the batch-wise unfolding mode, the Q chart was effective in detecting the faults studied, as confirmed by the correct classification of three NOC batches as in control. However, the T2 chart failed to correctly identify the faults studied as out of control. The time-wise unfolding mode also classified the faults studied correctly / Mestrado / Quimica Analitica / Mestre em Química
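The two control charts named in this abstract track different quantities, which the following sketch tries to make explicit. It computes Hotelling's T2 (distance inside the PCA model plane) and the Q statistic (squared residual outside the plane) for one sample, given a PCA model that is assumed to have been fitted already. The function name, the one-component model and the sample values are hypothetical and are not taken from the dissertation.

```python
def t2_and_q(x, mean, loadings, score_variances):
    """Return (T2, Q) for one sample.

    loadings: list of orthonormal loading vectors, one per retained component.
    score_variances: variance of the NOC scores on each component.
    """
    centered = [xi - mi for xi, mi in zip(x, mean)]
    scores = [sum(c * l for c, l in zip(centered, load)) for load in loadings]
    t2 = sum(t * t / var for t, var in zip(scores, score_variances))
    reconstructed = [sum(t * load[j] for t, load in zip(scores, loadings))
                     for j in range(len(x))]
    q = sum((c - r) ** 2 for c, r in zip(centered, reconstructed))
    return t2, q


# Hypothetical one-component model in a three-variable space.
mean = [0.0, 0.0, 0.0]
loadings = [[1.0, 0.0, 0.0]]
score_variances = [4.0]

inside_plane = t2_and_q([2.0, 0.0, 0.0], mean, loadings, score_variances)
outside_plane = t2_and_q([0.0, 3.0, 0.0], mean, loadings, score_variances)
```

The second sample deviates only in a direction the model does not describe, so its Q is large while its T2 is zero. A fault of that kind would appear on the Q chart and not on the T2 chart, which is one possible reading of the pattern reported in the abstract.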
|
129 |
Machine learning methods for seasonal allergic rhinitis studies
Feng, Zijie, January 2021
Seasonal allergic rhinitis (SAR) is an allergen-induced disease shaped by both environmental and genetic factors. Some researchers have studied SAR using traditional genetic methodologies. A more recent technique, single-cell RNA sequencing (scRNA-seq), generates high-dimensional data. We apply two machine learning (ML) algorithms, random forest (RF) and partial least squares discriminant analysis (PLS-DA), to cell source classification and gene selection, using SAR scRNA-seq time-series data from three allergic patients and four healthy controls, denoised by single-cell variational inference (scVI). We additionally propose a new fitting method that combines the bootstrap with cubic smoothing splines to fit the averaged gene expression per cell across populations. In summary, both RF and PLS-DA provide high classification accuracy, and RF is preferable given its stable performance and strong gene-selection ability. Our analysis identifies 10 genes with the discriminatory power to classify cells from allergic patients and healthy controls at any time point. Although no literature was found that shows a direct connection between these 10 genes and SAR, some studies indirectly support potential associations. This suggests that it may be possible to alert allergic patients before a disease outbreak on the basis of their genetic information. Our experimental results also indicate that ML algorithms may uncover relationships between genes and SAR that traditional techniques miss, which will need to be examined from a genetics perspective in future work.
|
130 |
Essays in agricultural business risk management
Liu, Xuan, 16 August 2021
Insurance has been considered as a useful tool for farmers to mitigate income volatility. However, there remain concerns that insurance may distort crop production decisions. Positive mathematical programming (PMP) models of farmers’ cropping decisions can be applied to study the effect of agricultural business risk management (BRM) policies on farmers’ decisions on land use and their incomes. Before being used to examine agricultural producer responses to policy changes under the expected utility framework, the models must first be calibrated to obtain the values of the risk aversion coefficient and the cost function parameters. In chapter 2, three calibration approaches are compared for disentangling the risk parameter from the parameters of the cost function. Then, in chapter 3, to investigate the impacts on production incentives of changes in Canada’s AgriStability program, farm management models are calibrated for farms with different cost structures for three different Alberta regions. Results indicate that farmers’ observed attitudes towards risk vary with cost structure. After joining the program, all farmers alter their land allocations to some extent. The introduction of a reference margin limit (RML) in the AgriStability program under Growing Forward 2 (2013-2018), which was retained in the replacement legislation until 2020, has the most negative impact on farmers with the lowest costs. The removal of RML significantly increases the benefits to low-cost farmers.
Traditional insurance products provide financial support to farmers. However, for fruit farmers, product quality can be greatly affected by weather conditions during fruit development and ripening, which may lead to quality downgrades and a significant loss in revenue with little impact on yields. Hence, chapters 4 and 5 investigate the conceptual feasibility of using weather-indexed insurance (WII) to hedge against non-catastrophic, but quality-impacting weather conditions to complement existing traditional insurance.
Prospect theory is applied to analyze a farmer’s demand for WII. The theoretical model demonstrates that an increase in the volatility of total revenue and the revenue proportion from blueberries increases the possibility of farmers’ participation in WII. On the other hand, the increase in the value loss aversion coefficient and WII’s basis risk leads to less demand for WII.
To design a WII product for blueberry growers to hedge against quality risk, a quality index must be constructed and the relationship between key weather conditions, such as cumulative maximum temperature and cumulative excess rainfall, and the quality index should be quantified. The results from a partial least squares structural equation modeling (PLS-SEM) show that the above goals are achievable. Further, rainfall and temperature can be modelled via a time-series model and statistical distributions, respectively, to provide reasonable estimates for calculating insurance premia. / Graduate / 2022-08-05
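To show how a weather index such as cumulative excess rainfall could feed an insurance payout, the sketch below uses a generic linear payout with a trigger and a cap. The daily threshold, trigger, tick size, limit and rainfall values are all hypothetical, and this payout structure is a common textbook form assumed for illustration, not the contract designed in the thesis.

```python
def cumulative_excess_rainfall(daily_rain_mm, threshold_mm):
    """Sum of daily rainfall above a threshold over the coverage period."""
    return sum(max(0.0, r - threshold_mm) for r in daily_rain_mm)


def index_payout(index_value, trigger, tick, limit):
    """Pay `tick` per index unit above `trigger`, capped at `limit`."""
    return min(limit, tick * max(0.0, index_value - trigger))


# Hypothetical ripening-period rainfall in mm per day.
rain = [0.0, 12.0, 30.0, 5.0, 22.0]
index_value = cumulative_excess_rainfall(rain, threshold_mm=10.0)
payout = index_payout(index_value, trigger=20.0, tick=50.0, limit=1000.0)
```

The payout depends only on the measured index and not on the grower's actual quality loss. The gap between the two is the basis risk that the abstract says reduces demand for WII.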
|