81 |
Estudo do erro de posicionamento do eixo X em função da temperatura de um centro de usinagem / Study of the X axis error positioning in the function of the machining tool temperature
Cláudia Hespanholo Nascimento 07 August 2015
Na atual indústria de manufatura, destacam-se as empresas que sejam capazes de atender a demanda de produção de forma rápida e com produtos de qualidade. Durante a fabricação existem diversas fontes de erro que interferem na exatidão do processo de usinagem. Deste modo, torna-se importante o conhecimento destes erros para que técnicas de correção possam ser implementadas ao controle numérico da MF (máquina-ferramenta) e assim, melhorar a exatidão do processo. Neste contexto, o objetivo principal do trabalho é desenvolver uma metodologia para corrigir os erros de posicionamento do eixo X levando em consideração a variação de temperatura medida experimentalmente em pontos específicos da MF. Primeiramente foi realizado um levantamento dos erros de posicionamento experimentais ao longo do eixo X da MF em três diferentes condições de trabalho e simultaneamente havia um sistema para medir a variação de temperatura. Os dados foram tratados e em seguida sintetizados utilizando a metodologia das matrizes homogêneas de transformação, onde foi possível armazenar todos os erros de posicionamento referentes à trajetória da mesa da MF ao longo do eixo X. Os elementos da matriz resultante são utilizados como dados de entrada para análise de regressão linear múltipla que através dos métodos dos mínimos quadrados, correlaciona as variáveis de temperatura e erros de posicionamento. Como resultado, as equações lineares obtidas no método de análise de regressão geram valores previstos para os erros de posicionamento que são utilizados para correção destes erros. Estas equações exigem baixo custo computacional e portanto, podem ser futuramente implementadas no controle numérico da MF para corrigir os erros de posicionamento devido às deformações térmicas. Os resultados finais mostraram que erros de 60 µm foram reduzidos à 10 µm. Constatou-se a importância da sintetização dos erros de posicionamento nas matrizes homogêneas de transformação para aplica-los ao método de regressão. 
/ In today's manufacturing industry, companies stand out if they're able to meet a high production demand efficiently and with quality products. During manufacturing there are several sources of error that can affect the accuracy of the machining process. Thus, it becomes important to better understand these errors to allow correction techniques to be implemented into the numerical control of the machine tool (MT) and thus improve process accuracy. In this context, the main goal of this work is to develop a method for correcting positioning errors along the X axis taking into consideration the variation in temperature, measured experimentally at specific points of the MT. First, we surveyed experimental positioning errors along the X axis of the MT under three different working conditions while simultaneously collecting temperature variation data. The data were treated and then synthesized using the methodology of homogeneous transformation matrices, which made it possible to store all positioning errors related to the trajectory of the MT table along the X axis. The elements of the resulting matrix are used as input data for a multiple linear regression analysis by the method of least squares, which correlates the temperature variables with the positioning errors. As a result, the linear equations obtained from the regression analysis generate predicted values for the positioning errors, which are used to correct these errors. These equations require little computer processing and can therefore later be implemented in the numerical control of the MT to correct positioning errors due to thermal deformation. The final results showed that 60 µm errors were reduced to 10 µm. The importance of synthesizing the positioning errors in homogeneous transformation matrices in order to apply them to the regression method was noted.
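The regression step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the author's implementation: the three temperature sensors, the coefficient values, and the function names `fit_thermal_error_model` and `predict_error` are hypothetical, and the data are synthetic.

```python
import numpy as np

def fit_thermal_error_model(temps, errors):
    """Least-squares fit of: error = b0 + b1*T1 + ... + bk*Tk."""
    design = np.column_stack([np.ones(len(temps)), temps])  # intercept column first
    coeffs, *_ = np.linalg.lstsq(design, errors, rcond=None)
    return coeffs

def predict_error(coeffs, temps):
    """Positioning error predicted from temperature readings."""
    design = np.column_stack([np.ones(len(temps)), temps])
    return design @ coeffs

# Synthetic data (assumption): 50 measurements from 3 temperature sensors.
rng = np.random.default_rng(0)
temps = rng.uniform(20.0, 35.0, size=(50, 3))        # degrees Celsius
true_coeffs = np.array([-40.0, 1.5, 0.8, -0.3])      # hypothetical values
errors = true_coeffs[0] + temps @ true_coeffs[1:] + rng.normal(0.0, 0.5, 50)  # µm

coeffs = fit_thermal_error_model(temps, errors)
# Compensation: subtract the predicted error from the measured one.
residual = errors - predict_error(coeffs, temps)
```

Evaluating the fitted linear equation is a single dot product per position, which is consistent with the abstract's remark that the equations are cheap enough to run in the numerical control.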
|
82 |
Estimação do período de carência de medicamento veterinário em produtos comestíveis (tecidos) de origem animal por modelos de regressão / Estimation of the withdrawal period for veterinary drugs in edible tissues of animal origin by regression models
Simone Cristina Rosa 12 August 2016
Resíduos de medicamento veterinário podem estar presentes em produtos comestíveis de origem animal, tais como carne, leite, ovos e mel. Para assegurar que a concentração de tais resíduos não excede um limite considerado seguro (Limite Máximo de Resíduo - LMR) deve ser estabelecido o período de carência, que é o tempo que deve ser respeitado para que um animal possa ser enviado para o abate após ter recebido um dado medicamento veterinário. A estimação do período de carência usualmente é feita pelo ajuste de um modelo de regressão linear simples, seguido pelo cálculo de um limite de tolerância. Para isso, os pressupostos de homocedasticidade e de normalidade dos erros do modelo devem ser atendidos. No entanto, violações desses pressupostos são frequentes nos estudos de depleção residual. No presente trabalho foram utilizados dois bancos de dados da quantificação de resíduo de medicamento veterinário em tecidos de bovinos e o período de carência foi estimado para fígado, gordura, músculo e rins. Os modelos de regressão foram ajustados para a média dos resultados de cada animal, para a média dos resultados de cada extração analítica e para os resultados obtidos para cada réplica, sendo que para esta última situação foi ajustado um modelo de regressão linear com efeitos mistos. O modelo linear ajustado para as médias obtidas para cada extração analítica apresentou maior precisão nas estimativas dos parâmetros do modelo e também menor período de carência. No entanto, para esse modelo também foram detectados mais pontos potencialmente influentes comparado aos demais modelos ajustados. Não foi possível calcular o limite de tolerância e, consequentemente, predizer o período de carência quando utilizado o modelo com efeitos mistos. Conclui-se que a o ajuste de outros modelos estatísticos mais robustos e flexíveis deve ser considerado para a estimação do período de carência de medicamento veterinário. 
/ Veterinary drug residues can be found in foodstuffs of animal origin such as meat, milk, eggs and honey. To ensure that the concentration of these residues does not exceed a safe limit (Maximum Residue Limit - MRL), a withdrawal period must be established: the time that must elapse after an animal has received a veterinary drug before it can be sent for slaughter. The withdrawal period is normally estimated by fitting a simple linear regression model and then calculating a tolerance limit. For this, the assumptions of homoscedasticity and normality of the errors must be met. However, violations of these assumptions are frequent in residue depletion studies. In the present study, two databases of veterinary drug residue quantification in bovine tissues were used, and the withdrawal period was estimated for liver, fat, muscle and kidneys. The regression models were fitted to the mean of the results for each animal, to the mean of the results for each analytical extraction, and to the results for each replicate; for the last case, a linear mixed-effects model was fitted. The linear model fitted to the means for each analytical extraction showed greater precision in the parameter estimates as well as a shorter withdrawal period. However, more potentially influential points were detected for this model than for the other fitted models. It was not possible to calculate the tolerance limit and, consequently, to predict the withdrawal period using the mixed-effects model. In conclusion, fitting other, more robust and flexible statistical models should be considered for estimating the withdrawal period of veterinary drugs.
|
83 |
Big Data : le nouvel enjeu de l'apprentissage à partir des données massives / Big Data : the new challenge Learning from data Massive
Adjout Rehab, Moufida 01 April 2016
Le croisement du phénomène de mondialisation et du développement continu des technologies de l’information a débouché sur une explosion des volumes de données disponibles. Ainsi, les capacités de production, de stockage et de traitement des données ont franchi un tel seuil qu’un nouveau terme a été mis en avant : Big Data. L’augmentation des quantités de données à considérer, nécessite la mise en oeuvre de nouveaux outils de traitement. En effet, les outils classiques d’apprentissage sont peu adaptés à ce changement de volumétrie tant au niveau de la complexité de calcul qu’à la durée nécessaire au traitement. Ce dernier, étant le plus souvent centralisé et séquentiel, ce qui rend les méthodes d’apprentissage dépendantes de la capacité de la machine utilisée. Par conséquent, les difficultés pour analyser un grand jeu de données sont multiples. Dans le cadre de cette thèse, nous nous sommes intéressés aux problèmes rencontrés par l’apprentissage supervisé sur de grands volumes de données. Pour faire face à ces nouveaux enjeux, de nouveaux processus et méthodes doivent être développés afin d’exploiter au mieux l’ensemble des données disponibles. L’objectif de cette thèse est d’explorer la piste qui consiste à concevoir une version scalable de ces méthodes classiques. Cette piste s’appuie sur la distribution des traitements et des données pour augmenter la capacité des approches sans nuire à leurs précisions. Notre contribution se compose de deux parties proposant chacune une nouvelle approche d’apprentissage pour le traitement massif de données. Ces deux contributions s’inscrivent dans le domaine de l’apprentissage prédictif supervisé à partir des données volumineuses telles que la Régression Linéaire Multiple et les méthodes d’ensemble comme le Bagging. La première contribution nommée MLR-MR, concerne le passage à l’échelle de la Régression Linéaire Multiple à travers une distribution du traitement sur un cluster de machines. 
Le but est d’optimiser le processus du traitement ainsi que la charge du calcul induite, sans changer évidemment le principe de calcul (factorisation QR) qui permet d’obtenir les mêmes coefficients issus de la méthode classique. La deuxième contribution proposée est appelée "Bagging MR_PR_D" (Bagging based Map Reduce with Distributed PRuning), elle implémente une approche scalable du Bagging, permettant un traitement distribué sur deux niveaux : l’apprentissage et l’élagage des modèles. Le but de cette dernière est de concevoir un algorithme performant et scalable sur toutes les phases de traitement (apprentissage et élagage) et garantir ainsi un large spectre d’applications. Ces deux approches ont été testées sur une variété de jeux de données associées à des problèmes de régression. Le nombre d’observations est de plusieurs millions. Nos résultats expérimentaux démontrent l’efficacité et la rapidité de nos approches basées sur la distribution de traitement dans le Cloud Computing.
/ In recent years we have witnessed a tremendous growth in the volume of data generated, partly due to the continuous development of information technologies. Managing these amounts of data requires fundamental changes in the architecture of data management systems in order to adapt to large and complex data. Single machines do not have the capacity required to process such massive data, which motivates the need for scalable solutions. This thesis focuses on building scalable data management systems for treating large amounts of data. Our objective is to study the scalability of supervised machine learning methods in large-scale scenarios. In fact, in most existing algorithms and data structures, there is a trade-off between efficiency, complexity, and scalability. To address these issues, we explore recent techniques for distributed learning in order to overcome the limitations of current learning algorithms. Our contribution consists of two new machine learning approaches for large-scale data. The first contribution tackles the problem of scalability of Multiple Linear Regression in distributed environments, which makes it possible to learn quickly from massive volumes of existing data using parallel computing and a divide-and-conquer approach that provides the same coefficients as the classic approach. The second contribution introduces a new scalable approach for ensembles of models which allows both learning and pruning to be deployed in a distributed environment. Both approaches have been evaluated on a variety of regression datasets ranging from some thousands to several millions of examples. The experimental results show that the proposed approaches are competitive in terms of predictive performance while significantly reducing the time of training and prediction.
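One common way to distribute a QR-based least-squares fit while recovering the same coefficients as the classic approach is to factorize each data partition separately and then combine the small triangular factors. The sketch below illustrates that general idea on a single machine. It is an assumption about how such a scheme can look, not the thesis's MLR-MR algorithm; the function names `local_r_factor` and `combine_and_solve` and the partitioning into eight blocks are hypothetical.

```python
import numpy as np

def local_r_factor(x_block, y_block):
    """'Map' step: QR-factorize one partition of [X | y] and keep only R."""
    augmented = np.column_stack([x_block, y_block])
    return np.linalg.qr(augmented, mode="r")

def combine_and_solve(r_factors, n_features):
    """'Reduce' step: stack the local R factors, factorize again, and solve."""
    stacked = np.vstack(r_factors)
    r = np.linalg.qr(stacked, mode="r")
    r_xx = r[:n_features, :n_features]   # triangular factor of X
    r_xy = r[:n_features, n_features]    # transformed right-hand side
    return np.linalg.solve(r_xx, r_xy)

# Synthetic data (assumption): 10,000 rows, intercept plus 4 predictors.
rng = np.random.default_rng(1)
x = np.column_stack([np.ones(10_000), rng.normal(size=(10_000, 4))])
y = x @ np.array([2.0, -1.0, 0.5, 3.0, 0.0]) + rng.normal(0.0, 0.1, 10_000)

# Split the rows into 8 partitions, as a cluster would.
blocks = zip(np.array_split(x, 8), np.array_split(y, 8))
r_factors = [local_r_factor(xb, yb) for xb, yb in blocks]
beta_distributed = combine_and_solve(r_factors, n_features=x.shape[1])

# Reference: the classic single-machine least-squares solution.
beta_classic, *_ = np.linalg.lstsq(x, y, rcond=None)
```

Each partition only ships a small triangular matrix (here 6 by 6) to the combining step, regardless of how many rows it holds, which is what makes the approach attractive for very tall datasets.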
|
84 |
Assignment of Estimated Average Annual Daily Traffic Volumes on All Roads in Florida
Pan, Tao 27 March 2008
The first part of this thesis compiles and compares current procedures and methodologies for estimating traffic volumes on roads where traffic counts are not readily available. In the second part, linear regression was applied as an average annual daily traffic (AADT) estimation process, based primarily on known or accepted AADT values on neighboring state and local roadways, population densities, and other socioeconomic data.
To develop the AADT prediction models, two different databases were created: a socioeconomic database and a roadway characteristics database. Ten years of socioeconomic data, from 1995 to 2005, were collected for each of the 67 counties in the state of Florida, and the socioeconomic database was created by manually entering data obtained from different sources. The roadway characteristics database was created by joining different GIS data layers to the Tele Atlas base map provided by the Florida Department of Transportation (FDOT).
A stepwise regression method was used to select the variables included in the final models. All selected independent variables in the models are statistically significant at the 90% confidence level. In total, six linear regression models were built. The adjusted R² values of the AADT prediction models vary from 0.166 to 0.418. Model validation results show that the MAPE values of the AADT prediction models vary from 31.99% to 159.49%. The model with the lowest MAPE value is the minor state/county highway model for rural areas. The model with the highest MAPE value is the local street model for large metropolitan areas. In general, the minor state/county highway models provide more reasonable AADT estimates than the local street model, as indicated by their lower MAPE values.
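The two summary measures quoted in this abstract, adjusted R² and MAPE (mean absolute percentage error), follow standard definitions and can be computed as in the sketch below. The traffic counts are made-up illustrative numbers, not data from the thesis, and the function names are hypothetical.

```python
import numpy as np

def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs((observed - predicted) / observed))

def adjusted_r2(observed, predicted, n_predictors):
    """R-squared penalized for the number of predictors in the model."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    n = observed.size
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)

# Illustrative AADT values (assumption): every prediction is off by 10%.
observed_aadt = [1000, 2000, 4000, 8000]
predicted_aadt = [1100, 1800, 4400, 7200]

error_pct = mape(observed_aadt, predicted_aadt)   # 10.0
fit_quality = adjusted_r2(observed_aadt, predicted_aadt, n_predictors=1)
```

Because MAPE divides by the observed value, low-volume roads with small counts can produce very large percentage errors, which may help explain why local street models can show MAPE values above 100%.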
|
85 |
Contact Center Employee Characteristics Associated with Customer Satisfaction
Pow, Lara 01 January 2017
The management of operations for a customer contact center (CCC) presents significant challenges. Management's direction is to reduce costs through operational efficiency metrics while providing maximum customer satisfaction levels to retain customers and increase profit margins. The purpose of this correlational study was to quantify the significance of various customer service representative (CSR) characteristics including internal service quality, employee satisfaction, and employee productivity, and then to determine their predictive ability on customer satisfaction, as outlined in the service-profit chain model. The research question addressed whether a linear relationship existed between CSR characteristics and the customers' satisfaction with the CSR by applying ordinary least squares regression using archival dyadic data. The data consisted of a random sample of 269 CSRs serving a large Canadian bank. Various subsets of data were analyzed via regression to help generate actionable insights. One particular model involving poor performing CSRs whose customer satisfaction was less than 75% top box proved to be statistically significant (p = .036, R-squared = .321) suggesting that poor performing CSRs contribute to a significant portion of poor customer service while high performing CSRs do not necessarily guarantee good customer service. A key variable used in this research was a CSR's level of education, which was not significant. Such a finding implies that for CCC support, a less-educated labor pool may be maintained, balancing societal benefits of employment for less-educated people at a reasonable service cost to a company. These findings relate to positive social change as hiring less-educated applicants could increase their social and economic status.
|
86 |
QUANTIFYING NON-RECURRENT DELAY USING PROBE-VEHICLE DATA
Brashear, Jacob Douglas Keaton 01 January 2018
Current practices based on estimated volume and basic queuing theory to calculate delay resulting from non-recurrent congestion do not account for the day-to-day fluctuations in traffic. In an attempt to address this issue, probe GPS data are used to develop impact zone boundaries and calculate Vehicle Hours of Delay (VHD) for incidents stored in the Traffic Response and Incident Management Assisting the River City (TRIMARC) incident log in Louisville, KY. Multiple linear regression along with stepwise selection is used to generate models for the maximum queue length, the average queue length, and VHD to explore the factors that explain the impact boundary and VHD. Models predicting queue length do not explain significant amounts of variance but can be useful in queue spillback studies. Models predicting VHD are only as effective as the data collected: models using cheaper-to-collect data sources explain less variance, while models using more detailed data explain more variance. Models for VHD can be useful in incident management after-action reviews and in predicting road user costs.
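Stepwise selection, as mentioned in this abstract, adds predictors to a regression model one at a time while they keep improving the fit. The sketch below shows a forward variant. The selection criterion (gain in adjusted R²), the `min_gain` threshold, and the synthetic data are all assumptions made for illustration; the thesis does not state its entry criterion here, and statistical packages often use p-values or information criteria instead.

```python
import numpy as np

def adjusted_r2(x, y):
    """Adjusted R-squared of an OLS fit of y on the columns of x plus an intercept."""
    n, p = x.shape
    design = np.column_stack([np.ones(n), x])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coeffs
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def forward_stepwise(x, y, min_gain=0.01):
    """Greedily add the predictor that most improves adjusted R-squared."""
    selected, remaining, best = [], list(range(x.shape[1])), 0.0
    while remaining:
        scores = {j: adjusted_r2(x[:, selected + [j]], y) for j in remaining}
        candidate = max(scores, key=scores.get)
        if scores[candidate] - best < min_gain:
            break  # no remaining predictor improves the fit enough
        selected.append(candidate)
        remaining.remove(candidate)
        best = scores[candidate]
    return selected

# Synthetic data (assumption): only columns 0 and 3 actually drive y.
rng = np.random.default_rng(2)
x = rng.normal(size=(200, 5))
y = 3.0 * x[:, 0] - 2.0 * x[:, 3] + rng.normal(0.0, 0.1, 200)

chosen = forward_stepwise(x, y)
```

On this synthetic example the procedure stops after picking the two informative columns, since the three noise columns cannot raise adjusted R² by the required margin.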
|
87 |
Queueing Variables and Leave-Without-Treatment Rates in the Emergency Room
Gibbs, Joy Jaylene 01 January 2018
Hospitals stand to lose millions of dollars in revenue due to patients who leave without treatment (LWT). Grounded in queueing theory, the purpose of this correlational study was to examine the relationship between daily arrivals, daily staffing, triage time, emergency severity index (ESI), rooming time, door-to-provider time (DTPT), and LWT rates. The target population comprised patients who visited a Connecticut emergency room between October 1, 2017, and May 31, 2018. Archival records (N = 154) were analyzed using multiple linear regression analysis. The results of the multiple linear regression were statistically significant, with F(9,144) = 2902.49, p < .001, and R² = 0.99, indicating 99% of the variation in LWT was accounted for by the predictor variables. ESI levels were the only variables making a significant contribution to the regression model. The implications for positive social change include the potential for patients to experience increased satisfaction due to the high quality of care and overall improvement in public health outcomes. Hospital leaders might use the information from this study to mitigate LWT rates and modify or manage staffing levels, time that patients must wait for triage, room placement, and DTPT to decrease the rate of LWT in the emergency room.
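The F statistic and R² reported in this abstract are linked by a standard identity for the overall F-test of an ordinary least squares model with an intercept, so one can be checked against the other. The short sketch below assumes that this is the test being reported; the function name is hypothetical.

```python
def r_squared_from_f(f_stat, df_model, df_resid):
    """Invert F = (R2 / df_model) / ((1 - R2) / df_resid) to recover R2."""
    return (f_stat * df_model) / (f_stat * df_model + df_resid)

n_records, n_predictors = 154, 9
df_resid = n_records - n_predictors - 1   # 144, matching F(9,144)

r2 = r_squared_from_f(2902.49, n_predictors, df_resid)
print(round(r2, 4))  # 0.9945
```

Under that assumption, the reported degrees of freedom and F value are mutually consistent with the reported R² of 0.99.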
|
88 |
Selecting the Best Linear Model From a Subset of All Possible Models for a Given Set of Predictors in a Multiple Linear Regression Analysis
Jensen, David L. 01 May 1972
Sixteen "model building" and "model selection" procedures commonly encountered in industry, all of which were initially alleged to be capable of identifying the best model from the collection of 2^k possible linear models corresponding to a given set of k predictors in a multiple linear regression analysis, were individually summarized and subsequently evaluated by considering their comparative advantages and limitations from both a theoretical and a practical standpoint. It was found that none of the proposed procedures were absolutely infallible and that several were actually unsuitable. However, it was also found that most of these techniques could still be profitably employed by the analyst, and specific directional guidelines were recommended for their implementation in a proper analysis. Furthermore, the specific role of the analyst in a multiple linear regression application was clearly defined in a practical sense.
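The count of 2^k candidate models comes from the fact that each of the k predictors is either included or excluded. A minimal sketch of enumerating and scoring every subset is given below. It illustrates the search space only and is not one of the sixteen procedures the thesis evaluates; the function names and synthetic data are hypothetical.

```python
from itertools import combinations

import numpy as np

def all_subsets(k):
    """Every subset of predictor indices 0..k-1, including the empty set."""
    return [s for size in range(k + 1) for s in combinations(range(k), size)]

def residual_sum_of_squares(x, y, subset):
    """RSS of an OLS fit using an intercept plus the chosen predictor columns."""
    design = np.column_stack([np.ones(len(y))] + [x[:, j] for j in subset])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coeffs
    return float(resid @ resid)

# Synthetic data (assumption): k = 4 predictors, so 2**4 = 16 candidate models.
rng = np.random.default_rng(3)
x = rng.normal(size=(100, 4))
y = 1.0 + 2.0 * x[:, 1] + rng.normal(0.0, 0.5, 100)

subsets = all_subsets(4)
rss = {s: residual_sum_of_squares(x, y, s) for s in subsets}
```

Raw RSS never increases when a predictor is added, so the full model always has the smallest RSS. Any practical selection procedure therefore has to trade fit against model size, and exhaustive search quickly becomes expensive because the number of models doubles with each extra predictor.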
|
89 |
Automated Localization and Segmentation of Pelvic Floor Structures on MRI to Predict Pelvic Organ Prolapse
Onal, Sinan 29 May 2014
Pelvic organ prolapse (POP) is a major health problem that affects women. POP is a herniation of the female pelvic floor organs (bladder, uterus, small bowel, and rectum) into the vagina. This condition can cause significant problems such as urinary and fecal incontinence, bothersome vaginal bulge, incomplete bowel and bladder emptying, and pain/discomfort. POP is normally diagnosed through clinical examination since there are few associated symptoms. However, clinical examination has been found to be inadequate and in disagreement with surgical findings. This makes POP a common but poorly understood condition. Dynamic magnetic resonance imaging (MRI) of the pelvic floor has become an increasingly popular tool to assess POP cases that may not be evident on clinical examination. Anatomical landmarks are manually identified on MRI along the midsagittal plane to determine reference lines and measurements for grading POP. However, the manual identification of these points, lines and measurements on MRI is a time-consuming and subjective procedure. This has restricted the correlation analysis of MRI measurements with clinical outcomes to improve the diagnosis of POP and predict the risk of development of this disorder.
The main goal of this research is to improve the diagnosis of pelvic organ prolapse through a model that automatically extracts image-based features from patient-specific MRI and fuses them with clinical outcomes. To extract image-based features, anatomical landmarks need to be identified on MRI through the localization and segmentation of pelvic bone structures. This is the main challenge of current algorithms, which tend to fail during bone localization and segmentation on MRI. The proposed research consists of three major objectives: (1) to automatically identify pelvic floor structures on MRI using a multivariate linear regression model with global information, (2) to identify image-based features using a hybrid technique based on texture-based block classification and K-means clustering analysis to improve the segmentation of bone structures on images with low contrast and image inhomogeneity, (3) to design, test and validate a prediction model using support vector machines with correlation-analysis-based feature selection to improve disease diagnosis.
The proposed model will enable faster and more consistent automated extraction of features from images with low contrast and high inhomogeneity. This is expected to allow studies on large databases to improve the correlation analysis between MRI features and clinical outcomes. The proposed research focuses on the pelvic region but the techniques are applicable to other anatomical regions that require automated localization and segmentation of multiple structures from images with high inhomogeneity, low contrast, and noise. This research can also be applicable to the automated extraction and analysis of image-based features for the diagnosis of other diseases where clinical examination is not adequate. The proposed model will set the foundation towards a computer-aided decision support system that will enable the fusion of image, clinical, and patient data to improve the diagnosis of POP through personalized assessment. Automating the process of pelvic floor measurements on radiologic studies will allow the use of imaging to predict the development of POP in predisposed patients, and possibly lead to preventive strategies.
|
90 |
Effect of advective pore water flow on degradation of organic matter in permeable sandy sediment : - A study of fresh- and brackish water
Hofman, Birgitta January 2005
The carbon metabolism in coastal sediments is of major importance for the global carbon cycle. Coastal sediments are also subjected to physical forcing that generates water fluxes above and through the sediments, but how this physical forcing affects the carbon metabolism is currently poorly known. In this study, the effect of advective pore water flow on degradation of organic matter in permeable sandy sediment was investigated in a laboratory study during wintertime. Sediments were collected from both brackish water (Askö) and a freshwater stream (Getå Stream). In two chamber experiments, with and without advective pore water flow, the degradation of organic matter was measured through carbon dioxide analysis of water and headspace. In Askö sediments, mineralization rates were 3.019 to 5.115 mmol C m⁻² d⁻¹ and 3.139 mmol C m⁻² d⁻¹ with and without advective pore water flow, respectively. These results correspond with results from earlier studies of carbon mineralization rates in sediment in the North Sea and the Baltic Sea. There were no significant differences between the two groups in the Askö sediment. In Getå Stream sediments, mineralization rates were 4.059 mmol C m⁻² d⁻¹ and 6.806 mmol C m⁻² d⁻¹ with and without advective flow, respectively. The mineralization rates for Getå Stream correspond with earlier studies of carbon mineralization rates in a stream in New Hampshire.
|