471 |
An Exploratory Study of ESG and R&D / En utforskande studie i ESG och FoU. Magnúsdóttir, Karen Sif; Berndtson, Gustav. January 2021.
With the increased relevance of sustainability in society in recent years, initiatives have been made to create tools for reporting and measuring sustainability in companies. Recently the EU chose to favour the ESG system, which stands for Environmental, Social and Governance, to create more homogeneity in this sector. As this is a relatively recent field, some aspects of the metrics, as well as their interaction with other economic metrics, have not been thoroughly studied. Therefore, this study investigates the relationship between R&D investment and ESG metrics, to see if there is some correlation between the two. In this study, a linear regression model is fitted to data from listed companies in Sweden, Denmark and Norway to analyse the relationship. The results were inconclusive, but seem to indicate that there might be a relationship between the E variables and R&D intensity. / With the increased relevance of sustainability in society in recent years, more initiatives have been taken to create methods for reporting and measuring it. To create uniformity on the issue, the EU has recently taken the initiative to begin introducing one of these systems, ESG, which stands for Environmental, Social and Governance. As it is a relatively new framework, not all variables and their interactions with other, more established economic metrics have been fully mapped. This study therefore focuses on examining the relationship between ESG factors and investments in research and development. The study does this by building a linear regression model on data from listed companies in Sweden, Norway and Denmark. The study could not demonstrate a significant relationship, but the results partly point to a relationship between certain E variables and R&D intensity.
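To make the modelling setup concrete, the sketch below fits an ordinary least squares regression of R&D intensity on the three ESG pillar scores. It is only an illustration of the kind of model the abstract describes, not the authors' specification: the company data, variable names, and effect sizes are all invented.

```python
# Rough illustration only, not the authors' specification: OLS regression of
# R&D intensity on ESG pillar scores. Company data here are entirely synthetic.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 150  # synthetic stand-ins for listed companies in Sweden, Denmark and Norway
df = pd.DataFrame({
    "E_score": rng.uniform(0, 100, n),
    "S_score": rng.uniform(0, 100, n),
    "G_score": rng.uniform(0, 100, n),
})
# R&D intensity = R&D expenditure / revenue; a weak positive link to E is assumed here.
df["rd_intensity"] = 0.02 + 0.0003 * df["E_score"] + rng.normal(0, 0.02, n)

X = sm.add_constant(df[["E_score", "S_score", "G_score"]])
model = sm.OLS(df["rd_intensity"], X).fit()
print(model.summary())  # coefficients, p-values, R^2 for each ESG pillar
```

The summary output exposes the per-pillar coefficients and p-values, which is where a weak E-score relationship of the kind the abstract mentions would show up.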
472 |
Predicting the Effects of Sedative Infusion on Acute Traumatic Brain Injury Patients. McCullen, Jeffrey Reynolds. 09 April 2020.
Healthcare analytics has traditionally relied upon linear and logistic regression models to address clinical research questions mostly because they produce highly interpretable results [1, 2]. These results contain valuable statistics such as p-values, coefficients, and odds ratios that provide healthcare professionals with knowledge about the significance of each covariate and exposure for predicting the outcome of interest [1]. Thus, they are often favored over new deep learning models that are generally more accurate but less interpretable and scalable. However, the statistical power of linear and logistic regression is contingent upon satisfying modeling assumptions, which usually requires altering or transforming the data, thereby hindering interpretability. Thus, generalized additive models are useful for overcoming this limitation while still preserving interpretability and accuracy.
The major research question in this work involves investigating whether particular sedative agents (fentanyl, propofol, versed, ativan, and precedex) are associated with different discharge dispositions for patients with acute traumatic brain injury (TBI). To address this, we compare the effectiveness of various models (traditional linear regression (LR), generalized additive models (GAMs), and deep learning) in providing guidance for sedative choice. We evaluated the performance of each model using metrics for accuracy, interpretability, scalability, and generalizability. Our results show that the new deep learning models were the most accurate while the traditional LR and GAM models maintained better interpretability and scalability. The GAMs provided enhanced interpretability through pairwise interaction heat maps and generalized well to other domains and class distributions since they do not require satisfying the modeling assumptions used in LR. By evaluating the model results, we found that versed was associated with better discharge dispositions while ativan was associated with worse discharge dispositions. We also identified other significant covariates including age, the Northeast region, the Acute Physiology and Chronic Health Evaluation (APACHE) score, Glasgow Coma Scale (GCS), and ethanol level. The versatility of versed may account for its association with better discharge dispositions while ativan may have negative effects when used to facilitate intubation. Additionally, most of the significant covariates pertain to the clinical state of the patient (APACHE, GCS, etc.) whereas most non-significant covariates were demographic (gender, ethnicity, etc.). Though we found that deep learning slightly improved over LR and generalized additive models after fine-tuning the hyperparameters, the deep learning results were less interpretable and therefore not ideal for drawing the aforementioned clinical insights. However, deep learning may be preferable in cases with greater complexity and more data, particularly in situations where interpretability is not as critical. Further research is necessary to validate our findings, investigate alternative modeling approaches, and examine other outcomes and exposures of interest. / Master of Science / Patients with Traumatic Brain Injury (TBI) often require sedative agents to facilitate intubation and prevent further brain injury by reducing anxiety and decreasing level of consciousness. It is important for clinicians to choose the sedative that is most conducive to optimizing patient outcomes. Hence, the purpose of our research is to provide guidance to aid this decision. Additionally, we compare different modeling approaches to provide insights into their relative strengths and weaknesses.
To achieve this goal, we investigated whether the exposure of particular sedatives (fentanyl, propofol, versed, ativan, and precedex) was associated with different hospital discharge locations for patients with TBI. From best to worst, these discharge locations are home, rehabilitation, nursing home, remains hospitalized, and death. Our results show that versed was associated with better discharge locations and ativan was associated with worse discharge locations. The fact that versed is often used for alternative purposes may account for its association with better discharge locations. Further research is necessary to further investigate this and the possible negative effects of using ativan to facilitate intubation. We also found that other variables that influence discharge disposition are age, the Northeast region, and other variables pertaining to the clinical state of the patient (severity of illness metrics, etc.). By comparing the different modeling approaches, we found that the new deep learning methods were difficult to interpret but provided a slight improvement in performance after optimization. Traditional methods such as linear regression allowed us to interpret the model output and make the aforementioned clinical insights. However, generalized additive models (GAMs) are often more practical because they can better accommodate other class distributions and domains.
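As a rough sketch of the model comparison described above, the snippet below contrasts a plain logistic regression with a GAM-style model built from per-feature splines using scikit-learn. It is not the thesis pipeline: the covariate names (age, gcs, apache, ethanol), the binary "good disposition" outcome, and the data are hypothetical.

```python
# Hedged sketch, not the thesis pipeline: plain logistic regression vs. a
# GAM-style spline model on synthetic stand-ins for the study's covariates.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer, StandardScaler

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "age": rng.uniform(18, 90, n),
    "gcs": rng.integers(3, 16, n),    # Glasgow Coma Scale, 3-15
    "apache": rng.uniform(0, 70, n),  # APACHE severity score
    "ethanol": rng.exponential(20, n),
})
# Synthetic, nonlinear outcome so the spline model has something to find.
signal = -0.04 * (df["age"] - 55) ** 2 / 50 + 0.3 * df["gcs"] - 0.05 * df["apache"]
y = (signal + rng.normal(0, 1, n) > signal.median()).astype(int)

plain_lr = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
gam_style = make_pipeline(
    SplineTransformer(degree=3, n_knots=5),  # smooth basis expansion per covariate
    LogisticRegression(max_iter=1000),
)
for name, model in [("logistic regression", plain_lr), ("spline GAM-style", gam_style)]:
    auc = cross_val_score(model, df, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: cross-validated AUC = {auc:.3f}")
```

The point here is only the contrast between a strictly linear fit and a smooth additive one, not a claim about which wins on the study's data.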
473 |
An Empirical Predictive Model for Determining the Aqueous Solubility of BCS Class IV Drugs in Amorphous Solid Dispersions. Raparla, Sridivya. 01 January 2024.
Poor aqueous solubility persists as a significant challenge in the pharmaceutical industry. Ongoing research aims to enhance the solubility of drugs to deliver them more effectively. Amorphous solid dispersion (ASD) is a widely used solubility enhancement technique. The absence of a specific model to predict compound solubility from ASDs has resulted in a trial-and-error approach to studying solubility enhancement, making it a laborious and time-consuming process. Predictive models could streamline this process and accelerate the development of oral drugs with improved aqueous solubilities. This study aimed to develop a predictive model to estimate the solubility of a compound from the polymer matrices in ASDs. For this purpose, five BCS Class IV drugs (acetazolamide, chlorothiazide, furosemide, hydrochlorothiazide, sulfamethoxazole), four hydrophilic polymers (PVP, PVPVA, HPMC E5, Soluplus), and a surfactant (TPGS) were chosen as the models for drug, polymers, and surfactant, respectively. ASDs of model drugs were prepared using a hot-melt process. The prepared ASDs were characterized using DSC, FTIR, and XRD. The aqueous solubility of the model drugs was determined using the shake-flask method. Multiple linear regression was used to develop a predictive model to determine aqueous solubility using the molecular descriptors of the drug and polymer as predictor variables. The model was validated using Leave-One-Out Cross-Validation.
The ASDs’ drug components were identified as amorphous via DSC and XRD studies. There were no significant chemical interactions between the model drugs and the polymers based on FTIR studies. Compared with pure drugs, their ASDs showed a significant (p
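A minimal sketch of the modelling step described above, assuming a small table of drug/polymer molecular descriptors: multiple linear regression validated with Leave-One-Out Cross-Validation. The descriptor columns and solubility values below are placeholders, not the study's data.

```python
# Minimal sketch (assumed workflow, not the published model): multiple linear
# regression on molecular descriptors with Leave-One-Out Cross-Validation.
# Descriptor names (logp, tpsa, mw, polymer_hlb) and all values are hypothetical.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

# One row per drug-polymer ASD formulation; y is measured aqueous solubility.
X = np.array([
    [1.2, 98.0, 222.2, 14.0],
    [0.3, 118.4, 295.7, 14.0],
    [2.0, 131.0, 330.7, 16.5],
    [-0.1, 135.1, 297.7, 16.5],
    [0.9, 106.6, 253.3, 19.0],
    [1.5, 98.0, 222.2, 19.0],
])
y = np.array([0.98, 0.62, 0.15, 0.88, 0.45, 1.30])  # mg/mL, illustrative only

model = LinearRegression()
loo_pred = cross_val_predict(model, X, y, cv=LeaveOneOut())
press = np.sum((y - loo_pred) ** 2)            # predictive residual sum of squares
q2 = 1 - press / np.sum((y - y.mean()) ** 2)   # LOO cross-validated R^2 (Q^2)
print(f"Q^2 (LOOCV) = {q2:.3f}")

model.fit(X, y)
print("coefficients:", model.coef_, "intercept:", model.intercept_)
```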
474 |
HIGH-DIMENSIONAL INFERENCE OVER NETWORKS: STATISTICAL AND COMPUTATIONAL GUARANTEES. Yao Ji. 19 September 2024.
Distributed optimization problems defined over mesh networks are ubiquitous in signal processing, machine learning, and control. In contrast to centralized approaches where all information and computation resources are available at a centralized server, agents on a distributed system can only use locally available information. As a result, efforts have been put into the design of efficient distributed algorithms that take into account the communication constraints and make coordinated decisions in a fully distributed manner from a pure optimization perspective. Given the massive sample sizes and high dimensionality generated by distributed systems such as social media, sensor networks, and cloud-based databases, it is essential to understand the statistical and computational guarantees of distributed algorithms that solve such high-dimensional problems over a mesh network.

The goal of this thesis is a first attempt at studying the behavior of distributed methods in the high-dimensional regime. It consists of two parts: (I) distributed LASSO and (II) distributed stochastic sparse recovery.

In Part (I), we start by studying linear regression from data distributed over a network of agents (with no master node) by means of LASSO estimation in high dimension, which allows the ambient dimension to grow faster than the sample size. While there is a vast literature of distributed algorithms applicable to the problem, the statistical and computational guarantees of most of them remain unclear in high dimensions. This thesis provides a first statistical study of Distributed Gradient Descent (DGD) in the Adapt-Then-Combine (ATC) form. Our theory shows that, under standard notions of restricted strong convexity and smoothness of the loss functions (which hold with high probability for standard data generation models) and suitable conditions on the network connectivity and algorithm tuning, DGD-ATC converges globally at a linear rate to an estimate that is within the centralized statistical precision of the model. In the worst-case scenario, the total number of communications to statistical optimality grows logarithmically with the ambient dimension, which improves on the communication complexity of DGD in the Combine-Then-Adapt (CTA) form, which scales linearly with the dimension. This reveals that mixing gradient information among agents, as DGD-ATC does, is critical in high dimensions to obtain favorable rate scalings.

In Part (II), we focus on the problem of distributed stochastic sparse recovery through stochastic optimization. We develop and analyze stochastic optimization algorithms for problems over a network, modeled as an undirected graph (with no centralized node), where the expected loss is strongly convex with respect to the Euclidean norm and the optimum is sparse. Assuming agents only have access to unbiased estimates of the gradients of the underlying expected objective, and the stochastic gradients are sub-Gaussian, we use distributed stochastic dual averaging (DSDA) as a building block to develop a fully decentralized restarting procedure for the recovery of sparse solutions over a network. We show that, with high probability, the iterates generated by all agents converge linearly to an approximate solution, quickly eliminating the initial error, and then converge sublinearly to the exact sparse solution in the steady-state stages owing to observation noise. The algorithm asymptotically achieves the optimal convergence rate and the favorable dimension dependence enjoyed by a non-Euclidean centralized scheme. Further, we precisely identify its non-asymptotic convergence rate as a function of characteristics of the objective functions and the network, and we characterize the transient time needed for the algorithm to approach the optimal rate of convergence. We illustrate the performance of the algorithm on the classical problems of sparse linear regression, sparse logistic regression and low-rank matrix recovery. Numerical experiments demonstrate the tightness of the theoretical results.
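The following toy sketch mimics the Adapt-Then-Combine pattern for decentralized LASSO on a ring network: each agent takes a local gradient step, averages with its neighbours, and applies a soft-thresholding (l1 proximal) step. It is a simplified proximal variant written for illustration, not the DGD-ATC scheme analyzed in the thesis, and the network, step size, and data are made up.

```python
# Toy sketch of decentralized LASSO in the Adapt-Then-Combine pattern.
# Simplified proximal variant for illustration only, not the thesis algorithm.
import numpy as np

rng = np.random.default_rng(1)
n_agents, m_local, d, s = 5, 20, 100, 5            # 5 agents, 100-dim problem, 5-sparse truth
x_true = np.zeros(d)
x_true[:s] = rng.normal(size=s)
A = [rng.normal(size=(m_local, d)) / np.sqrt(m_local) for _ in range(n_agents)]
b = [Ai @ x_true + 0.01 * rng.normal(size=m_local) for Ai in A]

# Doubly stochastic mixing matrix for a ring network.
W = np.eye(n_agents) * 0.5
for i in range(n_agents):
    W[i, (i - 1) % n_agents] += 0.25
    W[i, (i + 1) % n_agents] += 0.25

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

lam, alpha = 0.05, 0.05                            # l1 weight and step size (made up)
X = np.zeros((n_agents, d))                        # one local iterate per agent
for _ in range(500):
    grads = np.stack([Ai.T @ (Ai @ xi - bi) for Ai, bi, xi in zip(A, b, X)])
    adapted = X - alpha * grads                    # local gradient step ("adapt")
    combined = W @ adapted                         # neighbour averaging ("combine")
    X = soft_threshold(combined, alpha * lam)      # l1 proximal step

consensus_err = np.max(np.linalg.norm(X - X.mean(axis=0), axis=1))
estimation_err = np.linalg.norm(X.mean(axis=0) - x_true)
print(f"consensus error {consensus_err:.3e}, estimation error {estimation_err:.3e}")
```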
475 |
Inference of Gene Regulatory Networks with integration of prior knowledge. Maresi, Emiliano. 17 June 2024.
Gene regulatory networks (GRNs) are crucial for understanding complex biological processes and disease mechanisms, particularly in cancer. However, GRN inference remains challenging due to the intricate nature of gene interactions and limitations of existing methods. Traditionally, prior knowledge in GRN inference simplifies the problem by reducing the search space, but its full potential is unrealized. This research aims to develop a method that uses prior knowledge to guide the GRN inference process, enhancing accuracy and biological plausibility of the resulting networks. We extended the Fused Sparse Structural Equation Models (FSSEM) framework to create the Fused Lasso Adaptive Prior (FLAP) method. FSSEM incorporates gene expression data and genetic variants in the form of expression quantitative trait loci (eQTLs) perturbations. FLAP enhances FSSEM by integrating prior knowledge of gene-gene interactions into the initial network estimate, guiding the selection of relevant gene interactions in the final inferred network. We evaluated FLAP using synthetic data to assess the impact of incorrect prior knowledge and real lung cancer data, using prior knowledge from various gene network databases (GIANT, TissueNexus, STRING, ENCODE, hTFtarget). Our findings demonstrate that integrating prior knowledge improves the accuracy of inferred networks, with FLAP showing tolerance for incorrect prior knowledge. Using real lung cancer data, functional enrichment analysis and literature validation confirmed the biological plausibility of the networks inferred by FLAP. Different sources of prior knowledge impacted the results, with GIANT providing the most biologically relevant networks, while other sources showed less consistent performance.

FLAP improves GRN inference by effectively integrating prior knowledge, demonstrating robustness against incorrect prior knowledge. The method’s application to lung cancer data indicates that high-quality prior knowledge sources enhance the biological relevance of inferred networks. Future research should focus on improving the quality and integration of prior knowledge, possibly by developing consensus methods that combine multiple sources. This approach has potential applications in cancer research and drug sensitivity studies, offering a more accurate understanding of gene regulatory mechanisms and potential therapeutic targets.
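FLAP itself extends fused sparse structural equation models, which is beyond a short sketch; the snippet below only illustrates the broader idea of letting prior knowledge reweight a sparse penalty, using a weighted lasso in which candidate regulators supported by a prior network are penalized less. All regulator names, weights, and data are invented.

```python
# Much simpler than FLAP/FSSEM: a weighted lasso where edges supported by prior
# knowledge receive a smaller l1 penalty. Everything here is a made-up example.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(7)
n_samples, n_regulators = 60, 8
X = rng.normal(size=(n_samples, n_regulators))        # expression of candidate regulators
true_coef = np.array([1.5, 0.0, -0.8, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ true_coef + 0.1 * rng.normal(size=n_samples)  # expression of the target gene

# Prior knowledge: regulators 0 and 2 have reported interactions with the target.
prior_support = np.array([1, 0, 1, 0, 0, 0, 0, 0])
penalty_weights = np.where(prior_support == 1, 0.3, 1.0)  # smaller = less shrinkage

# Per-feature penalties via rescaling: fit lasso on X / w, then divide coefficients by w.
X_scaled = X / penalty_weights
lasso = Lasso(alpha=0.05).fit(X_scaled, y)
coef = lasso.coef_ / penalty_weights
print("recovered edges:", np.nonzero(np.abs(coef) > 1e-6)[0])
print("coefficients:", np.round(coef, 2))
```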
476 |
A Study of Fuzzy Linear Regression / 模糊線性迴歸之研究. 趙家慶. Unknown Date.
Predictions about unknown phenomena made with traditional regression often cannot yield precise conclusions; even when the same conditions are reproduced in practice, it is difficult to obtain the same results. The concept of fuzzy numbers, applied to regression analysis, therefore describes the uncertainty of predicted outcomes more effectively. However, when fuzzy linear regression is handled with the least squares method, the emphasis tends to fall on the centers and spreads of the fuzzy intervals while the fuzziness of the data is neglected, which severely limits the role of the membership function. Within the doubly fuzzy linear regression model framework proposed by D'Urso and Gastaldi (2000), this thesis uses the distance between fuzzy numbers defined by Yang and Ko (1996) in the LR space to derive least squares estimates that reflect the membership function, and introduces concepts and techniques commonly used in traditional regression to detect outliers and influential observations, applying them to the detection of such points in fuzzy linear regression data.
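The estimator derived in the thesis is based on the Yang and Ko (1996) distance for LR fuzzy numbers, which is not reproduced here; the sketch below only shows the bare-bones idea of least squares fitting for symmetric triangular fuzzy responses, with separate linear fits for the centers and the spreads. All numbers are invented.

```python
# Bare-bones illustration, not the thesis's distance-based estimator: least squares
# on symmetric triangular fuzzy responses, fitting centers and spreads separately.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
centers = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2])   # fuzzy output centers (invented)
spreads = np.array([0.4, 0.5, 0.7, 0.8, 1.0, 1.1])    # fuzzy output spreads (invented)

X = np.column_stack([np.ones_like(x), x])             # design matrix [1, x]
beta_center, *_ = np.linalg.lstsq(X, centers, rcond=None)
beta_spread, *_ = np.linalg.lstsq(X, spreads, rcond=None)

def predict(x_new):
    """Return (center, spread) of the predicted triangular fuzzy number."""
    row = np.array([1.0, x_new])
    return row @ beta_center, max(row @ beta_spread, 0.0)  # keep spreads nonnegative

c, s = predict(3.5)
print(f"predicted fuzzy output at x=3.5: center {c:.2f}, spread {s:.2f}")
```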
477 |
Consumer liking and sensory attribute prediction for new product development support: applications and enhancements of belief rule-based methodology. Savan, Emanuel-Emil. January 2015.
Methodologies designed to support new product development are receiving increasing interest in recent literature. A significant percentage of new product failure is attributed to a mismatch between designed product features and consumer liking. A variety of methodologies have been proposed and tested for consumer liking or preference prediction, ranging from statistical methodologies e.g. multiple linear regression (MLR) to non-statistical approaches e.g. artificial neural networks (ANN), support vector machines (SVM), and belief rule-based (BRB) systems. BRB has been previously tested for consumer preference prediction and target setting in case studies from the beverages industry. Results have indicated a number of technical and conceptual advantages which BRB holds over the aforementioned alternative approaches. This thesis focuses on presenting further advantages and applications of the BRB methodology for consumer liking prediction. The features and advantages are selected in response to challenges raised by three addressed case studies. The first case study addresses a novel industry for BRB application: the fast moving consumer goods industry, the personal care sector. A series of challenges are tackled. Firstly, stepwise linear regression, principal component analysis and AutoEncoder are tested for predictors’ selection and data reduction. Secondly, an investigation is carried out to analyse the impact of employing complete distributions, instead of averages, for sensory attributes. Moreover, the effect of modelling instrumental measurement error is assessed. The second case study addresses a different product from the personal care sector. A bi-objective prescriptive approach for BRB model structure selection and validation is proposed and tested. Genetic Algorithms and Simulated Annealing are benchmarked against complete enumeration for searching the model structures. A novel criterion based on an adjusted Akaike Information Criterion is designed for identifying the optimal model structure from the Pareto frontier based on two objectives: model complexity and model fit. The third case study introduces yet another novel industry for BRB application: the pastry and confectionary specialties industry. A new prescriptive framework, for rule validation and random training set allocation, is designed and tested. In all case studies, the BRB methodology is compared with the most popular alternative approaches: MLR, ANN, and SVM. The results indicate that BRB outperforms these methodologies both conceptually and in terms of prediction accuracy.
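The BRB system itself is not sketched here; the snippet below only sets up the benchmark comparison the thesis keeps returning to, i.e., MLR, SVM, and ANN regressors scored by cross-validation on consumer liking. The sensory attributes, the 9-point liking scores, and the data are illustrative assumptions.

```python
# Benchmark models only (MLR, SVM, ANN), not the BRB system; data are made up.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(3)
n = 120
X = rng.uniform(0, 10, size=(n, 4))               # e.g. sweetness, thickness, fragrance, foam
liking = 2 + 0.6 * X[:, 0] - 0.05 * (X[:, 1] - 5) ** 2 + rng.normal(0, 0.5, n)
liking = np.clip(liking, 1, 9)                    # 9-point hedonic scale

models = {
    "MLR": LinearRegression(),
    "SVM": make_pipeline(StandardScaler(), SVR(C=10.0)),
    "ANN": make_pipeline(StandardScaler(), MLPRegressor(hidden_layer_sizes=(16,),
                                                        max_iter=5000, random_state=0)),
}
for name, model in models.items():
    rmse = -cross_val_score(model, X, liking, cv=5,
                            scoring="neg_root_mean_squared_error").mean()
    print(f"{name}: cross-validated RMSE = {rmse:.2f}")
```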
478 |
Vliv koeficientu redukce na zdroj ceny na výsledný index odlišnosti při komparativní metodě oceňování nemovitostí / The price source reducing coefficient impact on total index of dissimilarity by the real estate valuation comparative method. Cupal, Martin. Unknown Date.
True market prices of real estate, unlike bid prices, are often hard to obtain. Nevertheless, this information is necessary for many direct and indirect real estate market participants, especially for valuation purposes. Therefore the bid prices of specific properties are often used, but they are generally not equivalent to market prices, so it is necessary to find a way to convert bid prices to market prices. This dissertation presents one approach to this issue. The ratio of market price to bid price is estimated with a multi-dimensional linear regression model and non-linear estimates from simple regression. The multi-dimensional linear regression model estimates this ratio from other variables, such as supply duration, price levels by locality, and others. Non-linear estimates of the regression function were used to model the trend of bid and market prices as a function of the population size of various localities.
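A sketch of the two modelling ingredients mentioned above, under invented assumptions (the thesis's actual variables and data are not reproduced): a multiple linear regression for the market-to-bid price ratio, and a non-linear (here, power-law) fit of price against the population of the locality.

```python
# Illustrative only: (1) linear regression for the market/bid price ratio and
# (2) a non-linear price-vs-population trend. All data and variables are synthetic.
import numpy as np
from scipy.optimize import curve_fit
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(5)
n = 200
supply_days = rng.uniform(5, 240, n)          # how long the listing stayed on the market
population = rng.uniform(2e3, 1.3e6, n)       # population of the locality
ratio = 1.0 - 0.0004 * supply_days - 0.02 * np.log10(population) + rng.normal(0, 0.03, n)

X = np.column_stack([supply_days, np.log10(population)])
ols = LinearRegression().fit(X, ratio)
print("ratio model coefficients:", ols.coef_, "intercept:", ols.intercept_)

# Non-linear trend of price per m^2 against population (power law assumed here).
price_per_m2 = 900 * population ** 0.12 * np.exp(rng.normal(0, 0.1, n))

def power_law(p, a, b):
    return a * p ** b

(a, b), _ = curve_fit(power_law, population, price_per_m2, p0=(500, 0.1))
print(f"fitted trend: price ~ {a:.0f} * population^{b:.3f}")
```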
479 |
Ranging Error Correction in a Narrowband, Sub-GHz, RF Localization System / Felkorrigering av avståndsmätingar i ett narrowband, sub-GHz, RF-baserat positioneringssystem. Barrett, Silvia. January 2023.
Being able to keep track of one's assets is very useful, from avoiding losing one's keys or phone to being able to find the needed equipment in a busy hospital or on a construction site. The area of localization is actively evolving to find the best ways to accurately track objects and devices in an energy-efficient manner, at any range, and in any type of environment. This thesis focuses on the last aspect: maintaining accurate localization regardless of environment. For radio frequency based systems, challenging environments containing many obstacles, e.g., indoor or urban areas, have a detrimental effect on the measurements used for positioning, making them misleading. In this work, a method for correcting range measurements is proposed for a narrowband sub-GHz radio frequency based localization system that uses Received Signal Strength Indicator (RSSI) and Time-of-Flight (ToF) measurements for positioning. Three different machine learning models were implemented: a linear regressor, a least squares support vector machine regressor and a Gaussian process regressor. They were compared in their ability to predict the true range between devices based on raw range measurements. The corrected estimates achieved a 69.96 % increase in accuracy compared to uncorrected ToF estimates and an 88.74 % increase in accuracy compared to RSSI estimates. When the corrected range estimates were used for positioning with a trilateration algorithm using least squares estimation, a 67.84 % increase in accuracy was attained compared to positioning with uncorrected range estimates. This shows that this is an effective method of improving range estimates to facilitate more accurate positioning. / Being able to keep track of where one's assets are can be very useful, from avoiding losing one's keys or phone to being able to find the needed equipment in a bustling hospital or on a construction site. The field of localization is actively evolving to find the best methods and technologies for tracking physical objects with precision in an energy-efficient way, at any range, and in any environment. This work focuses on the last aspect: achieving precise positioning regardless of environment. For radio frequency based systems, challenging environments with many physical obstacles, such as indoor and urban areas, have a negative effect on the measurements used for positioning, making them misleading. This work proposes a method for correcting range measurements in a narrowband sub-GHz radio frequency based localization system that uses Received Signal Strength Indicator (RSSI) and Time-of-Flight (ToF) measurements for positioning. Three machine learning models were implemented: a linear regressor, a least squares support vector machine regressor, and a Gaussian process regressor. These were compared in their ability to predict the true distance between devices based on raw range measurements. The corrected range measurements achieved 69.96 % higher accuracy compared with uncorrected ToF estimates and 88.74 % higher accuracy compared with RSSI estimates. The range estimates were used for positioning with trilateration and the least squares method. The corrected estimates yielded 67.84 % more precise positioning compared with the uncorrected estimates. This shows that this is an effective method of improving range estimates, which in turn contributes to more accurate positioning.
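A hedged sketch of the pipeline described above, not the thesis implementation: a Gaussian process regressor learns a mapping from raw ToF ranges to true ranges, and the corrected ranges are fed to a least squares trilateration solver. The anchor layout, the bias/noise model, and all numbers are synthetic.

```python
# Hedged sketch: GP-based range correction followed by least squares trilateration.
# Anchor positions, the range bias model, and the data are all invented.
import numpy as np
from scipy.optimize import least_squares
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(11)

# Training data: raw ToF range vs. ground-truth range (bias grows with distance).
true_range = rng.uniform(1, 60, 400)
raw_range = true_range * 1.15 + 2.0 + rng.normal(0, 1.5, 400)   # synthetic NLOS-style bias
gp = GaussianProcessRegressor(kernel=RBF(10.0) + WhiteKernel(1.0), normalize_y=True)
gp.fit(raw_range.reshape(-1, 1), true_range)

# Positioning: three fixed anchors and one tag at an unknown position.
anchors = np.array([[0.0, 0.0], [50.0, 0.0], [25.0, 40.0]])
tag = np.array([18.0, 22.0])
dists = np.linalg.norm(anchors - tag, axis=1)
measured = dists * 1.15 + 2.0 + rng.normal(0, 1.5, 3)
corrected = gp.predict(measured.reshape(-1, 1))

def residuals(p, ranges):
    return np.linalg.norm(anchors - p, axis=1) - ranges

for label, ranges in [("raw", measured), ("corrected", corrected)]:
    est = least_squares(residuals, x0=np.array([25.0, 20.0]), args=(ranges,)).x
    print(f"{label:9s} position error: {np.linalg.norm(est - tag):.2f} m")
```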
480 |
Review of subnational credit rating methodologies and their applicability in South Africa / Erika Fourie. Fourie, Erika. January 2015.
The objectives of the research study are to review existing subnational credit rating methodologies and their applicability in the South African context, to develop the quantitative parts of credit rating methodologies for two provincial departments (Department of Health and Department of Education) that best predict future payment behaviour, to test the appropriateness of the proposed methodologies and to construct the datasets needed.

The literature study includes background information regarding the uniqueness of South Africa’s provinces and credit rating methodologies in general. This is followed by information on subnational credit rating methodologies, including a review of existing subnational credit rating methodologies and an assessment of the applicability of the information provided in the South African context. Lastly, the applicable laws and regulations within the South African regulatory framework are provided.

The knowledge gained from the literature study is applied to the data that have been collected to predict the two departments’ future payment behaviour. Linear regression modelling is used to identify the factors that best predict future payment behaviour and to assign weights to the identified factors in a scientific manner. The resulting payment behaviour models can be viewed as the quantitative part of the credit ratings. This is followed by a discussion on further investigations to improve the models.

The developed models (both the simple and the advanced models) are tested with regard to prediction accuracies using RAG (Red, Amber or Green) statuses. This is followed by recommendations regarding future model usage that conclude that the department-specific models outperform the generic models in terms of prediction accuracies. / PhD (Risk analysis), North-West University, Potchefstroom Campus, 2015
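As a sketch of how such a quantitative component could look, and under invented assumptions (the factor names, thresholds, and data below are not from the thesis), the snippet fits a linear regression to payment behaviour and buckets its predictions into RAG statuses for back-testing.

```python
# Sketch under invented assumptions: a linear regression scoring payment behaviour,
# with predictions bucketed into RAG (Red / Amber / Green) statuses for back-testing.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2015)
n = 300
# Hypothetical quarterly indicators for a provincial department.
budget_overrun = rng.uniform(0, 0.3, n)      # fraction of budget overspent
accruals_ratio = rng.uniform(0, 0.5, n)      # unpaid invoices vs. expenditure
cash_coverage = rng.uniform(0.2, 2.0, n)     # cash on hand vs. monthly spend
X = np.column_stack([budget_overrun, accruals_ratio, cash_coverage])

# Target: share of invoices paid within 30 days in the following quarter.
paid_on_time = np.clip(0.9 - 0.8 * budget_overrun - 0.5 * accruals_ratio
                       + 0.1 * cash_coverage + rng.normal(0, 0.05, n), 0, 1)

X_tr, X_te, y_tr, y_te = train_test_split(X, paid_on_time, random_state=0)
model = LinearRegression().fit(X_tr, y_tr)

def rag_status(score):
    """Bucket a predicted payment score into a RAG status (thresholds invented)."""
    return "Green" if score >= 0.8 else ("Amber" if score >= 0.6 else "Red")

pred = model.predict(X_te)
agreement = np.mean([rag_status(p) == rag_status(t) for p, t in zip(pred, y_te)])
print(f"RAG agreement between predicted and observed behaviour: {agreement:.0%}")
```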