1 |
Regression Analysis of University Giving Data
Jin, Yi 02 January 2007
This project analyzed the giving data of Worcester Polytechnic Institute's alumni and other constituents (parents, friends, neighbors, etc.) from fiscal year 1983 to 2007 using a two-stage modeling approach. In the first stage, logistic regression was used to predict the likelihood of giving for each constituent; in the second stage, linear regression was used to predict the amount of contribution to be expected from each contributor. A Box-Cox transformation was performed in the linear regression phase to ensure that the assumptions underlying the model hold. Due to the nature of the data, multiple imputation was performed on the missing information to validate generalization of the models to a broader population. Concepts from the field of direct and database marketing, such as "score" and "lift", are also introduced in this report.
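As an illustration of the Box-Cox step mentioned above, the following is a minimal sketch of the transformation and its inverse. The function names, the gift amounts, and the choice of lambda are assumptions made for this example; the abstract does not report which lambda was used.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a strictly positive value y for a given lambda."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The lambda -> 0 limit of (y**lam - 1) / lam is the natural log.
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


# Hypothetical gift amounts in dollars (not taken from the WPI data).
gifts = [25.0, 50.0, 100.0, 1000.0]
lam = 0.5  # assumed value, for illustration only
transformed = [box_cox(g, lam) for g in gifts]
recovered = [inverse_box_cox(z, lam) for z in transformed]
print(transformed)
print(recovered)
```

In a second-stage model of this kind, the regression would be fitted to the transformed amounts and predictions mapped back with the inverse.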
|
2 |
Model-based calibration of a non-invasive blood glucose monitor
Shulga, Yelena A 11 January 2006
This project addressed the problem of improving a non-invasive blood glucose monitor being developed by the VivaScan Corporation. The company had made some progress in developing the device and approached WPI for statistical assistance in improving its model so as to predict glucose levels more accurately. This goal was achieved by finding and implementing the best regression model. The methods considered were ordinary least squares regression, partial least squares regression, robust regression, weighted least squares regression, local regression, and ridge regression. VivaScan calibration data for seven patients were analyzed. For each patient, individual regression models were built and compared on two factors that evaluate predictive ability. Partial least squares and ridge regression were found to be the two best of the methods considered, giving better glucose prediction. The additional problem of reducing the data to minimize data collection time was also considered.
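To make the difference between ordinary least squares and ridge regression concrete, here is a minimal sketch for the simplest possible case: one centered predictor and no intercept. The data values and the penalty are hypothetical and are not VivaScan measurements; the real calibration models would involve many predictors.

```python
def ridge_slope(x, y, lam):
    """Slope minimizing sum((y - b*x)**2) + lam * b**2 for one predictor.

    With lam = 0 this reduces to the ordinary least squares slope.
    """
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


# Hypothetical centered sensor signal and centered reference glucose values.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.0, 2.1, 3.9]

ols = ridge_slope(x, y, 0.0)     # no penalty
shrunk = ridge_slope(x, y, 5.0)  # assumed penalty, for illustration
print(ols, shrunk)
```

The penalty shrinks the slope toward zero, which trades a little bias for lower variance when calibration data are scarce or predictors are highly correlated.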
|
3 |
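Uso de técnicas de previsão de demanda como ferramenta de apoio à gestão de emergências hospitalares com alto grau de congestionamento [Use of demand forecasting techniques as a support tool for managing highly congested hospital emergency departments]
Calegari, Rafael January 2016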
Emergency departments (EDs) play a key role in the health system, serving as the gateway to hospitals and providing care for patients with injuries and serious illnesses. However, EDs worldwide suffer from increased demand and overcrowding. Multiple factors converge simultaneously to produce this overcrowding, but optimizing patient flow management can help reduce the problem. In this context, the length of stay of patients in the ED (LSED) is established in the literature as an indicator of patient flow quality. This thesis deals with demand forecasting and management in EDs with a high degree of congestion. The subject is covered in three scientific papers, all analyzing data from the ED of the Hospital de Clínicas de Porto Alegre (HCPA). The first paper applies four forecasting models to predict demand for service in the ED, evaluating the influence of climatic and calendar factors. The second paper uses partial least squares (PLS) regression to predict four indicators related to LSED for hospitals with a high degree of congestion. The model for the mean length of stay in the ED gave the best fit, with a mean absolute percent error (MAPE) of 5.68%. The third paper presents a simulation study to identify the internal hospital factors that influence LSED. The number of CT exams and the occupancy rate in the clinical and surgical wards were the most influential factors.
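The fit statistic quoted in this abstract, MAPE, can be computed as in the following minimal sketch. The observed and forecast values are hypothetical and are not HCPA data.

```python
def mape(actual, predicted):
    """Mean absolute percent error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical mean lengths of stay in minutes (observed vs. forecast).
observed = [200.0, 250.0, 400.0]
forecast = [210.0, 240.0, 380.0]
print(round(mape(observed, forecast), 2))
```

Because each error is divided by the observed value, MAPE is scale-free, which makes it convenient for comparing models of indicators measured in different units.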
|
6 |
Automatic age and gender classification using supervised appearance model
Bukar, Ali M., Ugail, Hassan, Connah, David 01 August 2016
Age and gender classification are two important problems that have recently gained popularity in the research community because of their wide range of applications. Research has shown that both age and gender information are encoded in face shape and texture; hence the active appearance model (AAM), a statistical model that captures shape and texture variations, has been one of the most widely used feature extraction techniques for these problems. However, AAM suffers from some drawbacks, especially when used for classification. This is primarily because principal component analysis (PCA), which is at the core of the model, works in an unsupervised manner: PCA dimensionality reduction does not take into account how the predictor variables relate to the response (class labels). Rather, it explores only the underlying structure of the predictor variables, so it is no surprise if PCA discards valuable parts of the data that represent discriminatory features. To this end, we propose a supervised appearance model (sAM) that improves on AAM by replacing PCA with partial least-squares regression. This feature extraction technique is then used for the problems of age and gender classification. Our experiments show that sAM has better predictive power than the conventional AAM.
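The contrast between unsupervised PCA and supervised partial least squares can be seen in a toy example. The sketch below computes only the first PLS weight vector, which for a single response is proportional to the cross-product of the centered predictors and the centered response. The data and function names are hypothetical, and this is not the sAM pipeline itself.

```python
import math


def center_columns(rows):
    """Subtract the column mean from every entry of a row-major matrix."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def pls_first_weight(rows, labels):
    """Unit vector proportional to Xc^T yc (first PLS weight, one response)."""
    y_mean = sum(labels) / len(labels)
    yc = [v - y_mean for v in labels]
    xc = center_columns(rows)
    raw = [sum(row[j] * yi for row, yi in zip(xc, yc))
           for j in range(len(xc[0]))]
    norm = math.sqrt(sum(v * v for v in raw))
    if norm == 0:
        raise ValueError("predictors are uncorrelated with the response")
    return [v / norm for v in raw]


# Feature 0 has large variance but carries no label information;
# feature 1 has small variance but separates the two classes.
X = [[-10.0, -1.0], [10.0, -1.0], [-10.0, 1.0], [10.0, 1.0]]
y = [-1.0, -1.0, 1.0, 1.0]

w = pls_first_weight(X, y)
print(w)  # the weight lies entirely on feature 1
```

In this toy data the two features are uncorrelated and feature 0 has the larger variance, so the first principal component would lie along feature 0, the direction that carries no class information. The PLS weight instead selects feature 1.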
|
7 |
Convnet features for age estimation
Bukar, Ali M., Ugail, Hassan 07 1900
Research in facial age estimation has been active for over a decade because of its numerous applications. Recently, convolutional neural networks (CNNs) have been used in an attempt to solve this age-old problem, and researchers have proposed various CNN architectures for the purpose. Unfortunately, most of the proposed techniques have been based on relatively ‘shallow’ networks. In this work, we leverage the capability of an off-the-shelf deep CNN model, namely the VGG-Face model, which has been trained on millions of face images. Interestingly, despite the simplicity of the approach, features extracted from the VGG-Face model, when reduced and fed into linear regressors, outperform most of the state-of-the-art CNNs, e.g., on both the FGNET-AD and Morph II benchmark databases. Furthermore, rather than using only the last fully connected (FC) layer of the trained model, we evaluate the activations from different layers of the architecture. In fact, our experiments show that generic features learnt from intermediate layer activations carry more ageing information than the FC layers.
|
8 |
Hyperspectral Remote Sensing of Temperate Pasture Quality
Thulin, Susanne Maria, smthulin@telia.com January 2009
This thesis describes the research undertaken for the degree of Doctor of Philosophy, testing the hypothesis that spectrometer data can be used to establish usable relationships for prediction of pasture quality attributes. The research data consisted of reflectance measurements of various temperate pasture types taken at four different times (years 2000 to 2002) by three hyperspectral sensors: the in situ ASD, the airborne HyMap and the satellite-borne Hyperion. Corresponding ground-based pasture samples were analysed for content of chlorophyll, water, crude protein, digestibility, lignin and cellulose at three study sites in rural Victoria, Australia. This context was used to evaluate the effects of sensor differences, data processing and enhancement, analytical methods and sample variability on the predictive capacity of the derived prediction models. Although hyperspectral data analysis is being applied in many areas, very few studies on temperate pastures have been conducted, and hardly any encompass the variability and heterogeneity of these southern Australian examples. The research into the relationship between the spectrometer data and pasture quality attribute assays was designed using knowledge gained from assessment of other hyperspectral remote sensing and near-infrared spectroscopy research, including bio-chemical and physical properties of pastures, as well as practical issues of the grazing industries and carbon cycling/modelling. Processing and enhancement of the spectral data followed methods used by other hyperspectral researchers, with modifications deemed essential to produce better relationships with pasture assay data. As many different methods are in use for the analysis of hyperspectral data, several alternative approaches were investigated and evaluated to determine reliability, robustness and suitability for retrieval of temperate pasture quality attributes.
The analyses employed included stepwise multiple linear regression (SMLR) and partial least squares regression (PLSR). The research showed that the spectral research data had a higher potential to be used for prediction of crude protein and digestibility than for the plant fibres lignin and cellulose. Spectral transformations such as continuum removal and derivatives enhanced the results. The performance of the models was also enhanced by a modified approach based on sample subsets identified by a matrix of subjective bio-physical and ancillary data parameters. PLSR prediction models developed on ASD in situ spectral data, HyMap airborne imagery and Hyperion imagery, together with the corresponding pasture assays, showed potential for predicting the two important pasture quality attributes crude protein and digestibility in hyperspectral imagery at a few quantised levels corresponding to levels currently used in commercial feed testing. It was concluded that imaging spectrometry has potential to offer synoptic, simultaneous and spatially continuous information valuable to feed-based enterprises in temperate Victoria. The thesis provides a significant contribution to the field of hyperspectral remote sensing and good guidance for future hyperspectral researchers embarking on similar tasks. As the research is based on temperate pastures in Victoria, Australia, which are dominated by northern hemisphere species, the findings should be applicable to analysis of temperate pastures elsewhere, for example in Western Australia, New Zealand, South Africa, North America, Europe and northern Asia (China).
|
9 |
A Study of Missing Data Imputation and Predictive Modeling of Strength Properties of Wood Composites
Zeng, Yan 01 August 2011
Problem: Real-time process and destructive test data were collected from a wood composite manufacturer in the U.S. to develop real-time predictive models of two key strength properties, Modulus of Rupture (MOR) and Internal Bond (IB), of a wood composite manufacturing process. Sensor malfunctions and data “send/retrieval” problems led to null fields in the company’s data warehouse, which resulted in information loss. Many manufacturers attempt to build accurate predictive models by excluding entire records with null fields or by using summary statistics such as the mean or median in place of the null field. However, predictive model errors in validation may be higher in the presence of information loss. In addition, the selection of predictive modeling methods poses another challenge to many wood composite manufacturers.
Approach: This thesis consists of two parts addressing the above issues: 1) how to improve data quality using missing data imputation; 2) which predictive modeling method is better in terms of prediction precision (measured by root mean square error, RMSE). The first part summarizes an application of missing data imputation methods in predictive modeling. After variable selection, two missing data imputation methods were selected from a comparison of six candidate methods. Predictive models of imputed data were developed using partial least squares regression (PLSR) and compared with models of non-imputed data using ten-fold cross-validation. Root mean square error of prediction (RMSEP) and normalized RMSEP (NRMSEP) were calculated. The second part presents a series of comparisons among four predictive modeling methods using imputed data without variable selection.
Results: The first part concludes that the expectation-maximization (EM) algorithm and multiple imputation (MI) using Markov Chain Monte Carlo (MCMC) simulation achieved more precise results. Predictive models based on imputed datasets generated more precise prediction results (average NRMSEP of 5.8% for the MOR model and 7.2% for the IB model) than models of non-imputed datasets (average NRMSEP of 6.3% for the MOR model and 8.1% for the IB model). The second part finds that Bayesian Additive Regression Trees (BART) produced more precise prediction results (average NRMSEP of 7.7% for the MOR model and 8.6% for the IB model) than the other three methods: PLSR, LASSO, and Adaptive LASSO.
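The error measures reported above can be sketched as follows. The abstract does not say how RMSEP was normalized, so dividing by the mean of the observed values is an assumption of this example (normalizing by the observed range is another common convention). The strength values are hypothetical.

```python
import math


def rmsep(actual, predicted):
    """Root mean square error of prediction on held-out observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    sq = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(sq / len(actual))


def nrmsep(actual, predicted):
    """RMSEP as a percentage of the mean observed value (assumed convention)."""
    mean_actual = sum(actual) / len(actual)
    if mean_actual == 0:
        raise ValueError("cannot normalize by a zero mean")
    return 100.0 * rmsep(actual, predicted) / mean_actual


# Hypothetical held-out strength measurements and model predictions.
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(rmsep(observed, predicted), nrmsep(observed, predicted))
```

Under ten-fold cross-validation, these functions would be applied to each held-out fold and the results averaged.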
|
10 |
The Influence of Corporate Real Estate Ownership on the Risk and Return of Stockholders
Chung, Po-Hsiang 15 July 2012
Companies hold real estate for many reasons, including business operations, production, sales, and the provision of services. Previous research shows that corporate real estate (CRE) is an important part of company assets and that it affects a company's stock returns and risk. The main objective of this study is to investigate the impact of changes in CRE on the stock returns and risk of companies in Taiwan. The study also analyzes how CRE affects different industries during each period of the business cycle, and offers some suggestions to stockholders and managers. Using data from the Taiwan stock market from 1992 through 2011, the relationship between CRE and stock returns and risk is analyzed with a two-stage least squares regression model.
The empirical results show that, on average, higher CRE appears to be associated with higher abnormal return performance and higher total risk. On the other hand, CRE shows a negative impact on business operations, such as lower adjusted return on assets and higher risk of bankruptcy. Furthermore, CRE is associated with higher abnormal return performance and higher firm value for companies with small asset size, a high P/E ratio, or a recent founding date. The results also indicate that the impact of CRE on a firm's stock price and risk depends on industry, business cycle period, and firm characteristics. CRE shows a negative impact in the Textile, Tourism, and Trading and Consumers' Goods industries. In the Food industry, higher CRE is associated with lower systematic risk and a positive impact on business operations.
|