  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
111

Toward a Theory of Auto-modeling

Yiran Jiang (16632711) 25 July 2023 (has links)
Statistical modeling aims at constructing a mathematical model for an existing data set. As a comprehensive concept, statistical modeling leads to a wide range of interesting problems. Modern parametric models, such as deepnets, have achieved remarkable success in quite a few application areas with massive data. Although powerful in practice, many fitted over-parameterized models potentially lose good statistical properties. For this reason, a new framework named the Auto-modeling (AM) framework is proposed. Philosophically, the mindset is to fit models to future observations rather than to the observed sample. Technically, after choosing an imputation model for generating future observations, we fit models to those observations by optimizing an approximation to the desired expected loss function, based on its sample counterpart and what we call an adaptive "duality function".

The first part of the dissertation (Chapters 2 to 7) focuses on the new philosophical perspective of the method, as well as the details of the main framework. Technical details, including essential theoretical properties of the method, are also investigated. We also demonstrate the superior performance of the proposed method via three applications: the many-normal-means problem, $n < p$ linear regression, and image classification.

The second part of the dissertation (Chapter 8) focuses on the application of the AM framework to the construction of linear regression models. Our primary objective is to shed light on the stability issue associated with commonly used data-driven model selection methods such as cross-validation (CV). Furthermore, we highlight the philosophical distinctions between CV and AM. Theoretical properties and numerical examples presented in the study demonstrate the potential and promise of AM-based linear model selection. Additionally, we have devised a conformal prediction method specifically tailored for quantifying the uncertainty of AM predictions in the context of linear regression.
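The conformal prediction step mentioned at the end can be illustrated with standard split conformal prediction, which wraps any point predictor in a finite-sample prediction interval. This is a generic sketch of the technique, not the dissertation's AM-specific construction; the toy predictor and calibration data are invented for illustration.

```python
import math

def split_conformal_interval(calib_pairs, predict, x_new, alpha=0.1):
    """Turn a point predictor into a ~(1 - alpha)-coverage interval
    using held-out calibration residuals (split conformal prediction)."""
    # Nonconformity scores: absolute residuals on the calibration set
    scores = sorted(abs(y - predict(x)) for x, y in calib_pairs)
    n = len(scores)
    # Finite-sample conformal quantile: ceil((n+1)(1-alpha))-th smallest score
    k = min(n, math.ceil((n + 1) * (1 - alpha)))
    q = scores[k - 1]
    y_hat = predict(x_new)
    return (y_hat - q, y_hat + q)

# Toy predictor: y ~ 2x (standing in for any fitted regression model)
predict = lambda x: 2.0 * x
residuals = [0.1, -0.2, 0.3, -0.1, 0.2, -0.3, 0.1, 0.0, -0.2, 0.25]
calib = [(i, 2.0 * i + r) for i, r in zip(range(10), residuals)]
lo, hi = split_conformal_interval(calib, predict, 5.0, alpha=0.2)  # interval around 10.0
```

The interval width is driven entirely by the calibration residuals, so the guarantee holds regardless of how the underlying model was fitted.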
112

Two papers on car fleet modeling

Habibi, Shiva January 2013 (has links)
<p>QC 20130524</p>
113

Neonatal Sepsis Detection Using Decision Tree Ensemble Methods: Random Forest and XGBoost

Al-Bardaji, Marwan, Danho, Nahir January 2022 (has links)
Neonatal sepsis is a potentially fatal medical condition due to an infection and is attributed to about 200 000 annual deaths globally. With healthcare systems facing constant challenges, there is potential for introducing machine learning models as a diagnostic tool that can be automated within existing workflows and would not entail more work for healthcare personnel. The Herlenius Research Team at Karolinska Institutet has collected neonatal sepsis data that has been used for the development of many machine learning models across several papers. However, none have studied decision tree ensemble methods. In this paper, random forest and XGBoost models are developed and evaluated in order to assess their feasibility for clinical practice. The data contained 24 features of vital parameters that are easily collected through a patient monitoring system. The validation and evaluation procedure needed special consideration because the data were grouped at the patient level and imbalanced. The proposed methods developed in this paper have the potential to be generalized to other similar applications. Finally, using the receiver-operating-characteristic area-under-curve (ROC AUC) measure, both models achieved around ROC AUC = 0.84. Such results suggest that the random forest and XGBoost models are potentially feasible for clinical practice. Another insight gained was that both models seemed to perform better when kept simpler, suggesting that future work could produce a more explainable model. / Bachelor's degree project in electrical engineering 2022, KTH, Stockholm
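The ROC AUC measure used to evaluate both models can be computed directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, counting ties as half. A minimal pure-Python sketch, with made-up labels and scores:

```python
def roc_auc(labels, scores):
    """ROC AUC via the Mann-Whitney U formulation: fraction of
    positive/negative pairs where the positive outranks the negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Ties contribute 0.5, matching the trapezoidal ROC curve area
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 0, 1, 0]          # 1 = septic, 0 = healthy (toy data)
scores = [0.8, 0.7, 0.6, 0.2]  # model risk scores
auc = roc_auc(labels, scores)  # one misranked pair out of four
```

Because it depends only on rankings, ROC AUC is insensitive to the classification threshold, which is one reason it suits imbalanced data like this.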
114

A Comparative study of data splitting algorithms for machine learning model selection

Birba, Delwende Eliane January 2020 (has links)
Data splitting is commonly used in machine learning to split data into a training, test, or validation set. This approach allows us to find the model hyper-parameters and also to estimate the generalization performance. In this research, we conducted a comparative analysis of different data partitioning algorithms on both real and simulated data. Our main objective was to address the question of how the choice of data splitting algorithm can improve the estimation of the generalization performance. The data splitting algorithms used in this study were variants of k-fold, Kennard-Stone (KS), SPXY (sample set partitioning based on joint x-y distance), and random sampling. Each algorithm divided the data into two subsets: training and validation. The training set was used to fit the model and the validation set for evaluation. We then analyzed the different data splitting algorithms based on the generalization performances estimated from the validation set and the external test set. From the results, we noted that an important determinant of good generalization is the size of the dataset. For all the sampling methods applied to small data sets, the gap between the performance estimated on the validation and test sets was significant. However, we noted that the gap shrank when there was more data for training or validation. Too much or too little data in the training set can also lead to bad model performance, so it is important to have a reasonable balance between the training and validation set sizes. In our study, KS and SPXY were the splitting algorithms with the poorest model performance estimation. Indeed, these methods select the most representative samples to train the model, and poorly representative samples are left for model performance estimation.
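The Kennard-Stone algorithm compared above selects a maximally spread training set: seed with the two farthest points, then repeatedly add the candidate whose minimum distance to the already-selected set is largest. A minimal sketch with Euclidean distance and made-up 2-D points (the thesis's implementation details may differ):

```python
def kennard_stone(points, k):
    """Greedily pick k maximally spread sample indices (Kennard-Stone)."""
    def d2(a, b):  # squared Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Seed: the two points farthest apart
    i0, j0 = max(((i, j) for i in range(len(points))
                  for j in range(i + 1, len(points))),
                 key=lambda ij: d2(points[ij[0]], points[ij[1]]))
    selected = [i0, j0]
    remaining = [i for i in range(len(points)) if i not in selected]
    while len(selected) < k and remaining:
        # Add the point with the largest min-distance to the selected set
        nxt = max(remaining,
                  key=lambda i: min(d2(points[i], points[s]) for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected

pts = [(0.0, 0.0), (10.0, 0.0), (5.0, 5.0), (1.0, 1.0), (9.0, 1.0)]
train_idx = kennard_stone(pts, 3)  # picks the three most spread-out points
```

The sketch makes the study's observation concrete: the selected indices cover the extremes of the data, so the leftover points used for performance estimation are the less representative ones.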
115

Artificial neural network modeling of flow stress response as a function of dislocation microstructures

AbuOmar, Osama Yousef 11 August 2007 (has links)
An artificial neural network (ANN) is used to model nonlinear, large deformation plastic behavior of a material. This ANN model establishes a relationship between flow stress and dislocation structure content. The density of geometrically necessary dislocations (GNDs) was calculated based on analysis of local lattice curvature evolution. The model includes essential statistical measures extracted from the distributions of dislocation microstructures, including substructure cell size, wall thickness, and GND density as the input variables to the ANN model. The model was able to successfully predict the flow stress of aluminum alloy 6022 as a function of its dislocation structure content. Furthermore, a sensitivity analysis was performed to identify the significance of individual dislocation parameters on the flow stress. The results show that an ANN model can be used to calibrate and predict inelastic material properties that are often cumbersome to model with rigorous dislocation-based plasticity models.
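The mapping the abstract describes, flow stress as a function of dislocation-structure inputs, amounts to a forward pass through a small feedforward network. A minimal sketch with one tanh hidden layer; the weights, inputs, and unit counts are illustrative only, not fitted values from the study:

```python
import math

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """One-hidden-layer network with tanh activation and a linear output:
    a toy stand-in for an ANN mapping dislocation features to flow stress."""
    hidden = [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(wo * h for wo, h in zip(w_out, hidden)) + b_out

# 3 hypothetical inputs (e.g. cell size, wall thickness, GND density), 2 hidden units
x = [0.5, 0.2, 0.8]
w_h = [[0.1, -0.3, 0.2], [0.4, 0.1, -0.2]]
b_h = [0.0, 0.1]
w_o = [1.5, -0.7]
stress = mlp_forward(x, w_h, b_h, w_o, b_out=10.0)
```

A sensitivity analysis like the one in the abstract can be run on exactly this function by perturbing each input in turn and measuring the change in the output.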
116

A Comparison of Various Interpolation Techniques for Modeling and Estimation of Radon Concentrations in Ohio

Gummadi, Jayaram January 2013 (has links)
No description available.
117

Viewership forecast on a Twitch broadcast : Using machine learning to predict viewers on sponsored Twitch streams

Malm, Jonas, Friberg, Martin January 2022 (has links)
Today, the video game industry is larger than the sports and film industries combined, and the largest streaming platform, Twitch, with an average of 2.8 million concurrent viewers, offers gaming and non-gaming brands the possibility to market their products. Estimating streamers' viewership is central to these marketing campaigns, but no large-scale studies have previously been conducted to predict viewership. This paper evaluates three different machine learning algorithms with regard to three error metrics, MAE, MAPE, and RMSE, and presents novel features for predicting viewership. Models are chosen through recursive feature elimination using k-fold cross-validation with respect to MAE and MAPE separately. The models are evaluated on an independent test set and show promising results, on par with manual expert predictions. None of the models can be said to be significantly better than another. XGBoost optimized for MAPE obtained the lowest MAE of 282.54 and the lowest MAPE of 41.36% on the test set, compared to expert predictions with an MAE of 288.06 and a MAPE of 83.05%. Furthermore, the study illustrates the importance of past viewership and streamer variety for predicting future viewership.
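The three error metrics used in the evaluation are straightforward to define. A sketch with made-up viewership numbers (note that MAPE assumes strictly positive true values, which holds for viewer counts):

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average size of the forecast miss, in viewers."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error: relative miss, in percent."""
    return 100.0 * sum(abs(t - p) / t for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large misses more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

actual = [1000.0, 500.0, 2000.0]    # hypothetical concurrent viewer counts
forecast = [900.0, 550.0, 2200.0]
```

MAPE weights errors on small streams as heavily as errors on large ones, which is why optimizing for MAE and MAPE separately, as the paper does, can yield different models.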
118

Context-Sensitive Code Completion : Improving Predictions with Genetic Algorithms

Ording, Marcus January 2016 (has links)
Within the area of context-sensitive code completion there is a need for accurate predictive models in order to provide useful code completion predictions. The traditional method for optimizing the performance of code completion systems is to empirically evaluate the effect of each system parameter individually and fine-tune the parameters. This thesis presents a genetic algorithm that can optimize the system parameters with a degree of freedom equal to the number of parameters to optimize. The study evaluates the effect of the optimized parameters on the prediction quality of the studied code completion system. Previous evaluation of the reference code completion system is also extended to include model size and inference speed. The results of the study show that the genetic algorithm is able to improve the prediction quality of the studied code completion system. Compared with the reference system, the enhanced system is able to recognize 1 in 10 additional previously unseen code patterns. This increase in prediction quality does not significantly impact system performance, as the inference speed remains below 1 ms for both systems.
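A genetic algorithm that optimizes a whole vector of system parameters jointly, rather than fine-tuning one parameter at a time, can be sketched as follows. This is a generic minimal GA (truncation selection, uniform crossover, Gaussian mutation) with a toy objective; the thesis's actual operators, encoding, and fitness function may differ.

```python
import random

def genetic_optimize(fitness, n_params, pop_size=20, generations=40,
                     mutation_scale=0.3, seed=0):
    """Minimal real-valued GA: keep the fitter half each generation
    (implicit elitism), breed children by uniform crossover, perturb
    them with Gaussian mutation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(n_params)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [ai if rng.random() < 0.5 else bi   # uniform crossover
                     for ai, bi in zip(a, b)]
            child = [c + rng.gauss(0, mutation_scale) for c in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Toy objective: the "best system parameters" are (0.3, -0.5)
target = (0.3, -0.5)
fit = lambda p: -sum((pi - ti) ** 2 for pi, ti in zip(p, target))
best = genetic_optimize(fit, n_params=2)
```

Because the unmutated parents survive each generation, the best solution found never regresses, and all parameters evolve together instead of being tuned in isolation.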
119

Comparison of Multiple Models for Diabetes Using Model Averaging

Al-Mashat, Alex January 2021 (has links)
Pharmacometrics is widely used in drug development. Models are developed to describe pharmacological measurements with data gathered from a clinical trial. The information can then be applied to, for instance, safely establish dose-response relationships of a substance. Glycated hemoglobin (HbA1c) is a common biomarker used by models within antihyperglycemic drug development, as it reflects the average plasma glucose level over the previous 8-12 weeks. There are five different nonlinear mixed-effects models that describe HbA1c formation. They use different biomarkers, such as mean plasma glucose (MPG), fasting plasma glucose (FPG), fasting plasma insulin (FPI), or a combination of those. The aim of this study was to compare their performance on a population and an individual level using model averaging (MA) and to explore whether reduced trial durations and different treatments could affect the outcome. Multiple weighting methods were applied to the MA workflow, such as the Akaike information criterion (AIC), cross-validation (CV), and a bootstrap model averaging method. Results show that, in general, models that use MPG to describe HbA1c formation on a population level could potentially outperform models using other biomarkers; however, the models showed similar performance on the individual level. Further studies on the relationship between biomarkers and model performance must be conducted, since they could lay the ground for better individual HbA1c predictions. This can then be applied in antihyperglycemic drug development and possibly reduce sample sizes in a clinical trial. With this project, we have illustrated how to perform MA on the aforementioned models using different biomarkers, as well as the difference between model weights on a population and individual level.
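One common way to form the AIC-based weights mentioned above is via Akaike weights: each model's weight is its relative likelihood, exp(-ΔAIC/2), normalized to sum to one. A sketch; the AIC values and predictions below are hypothetical, not results from the study:

```python
import math

def akaike_weights(aics):
    """Akaike weights: normalized relative likelihoods exp(-dAIC/2),
    usable directly as model-averaging weights."""
    best = min(aics)
    rel = [math.exp(-(a - best) / 2.0) for a in aics]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical AICs for five competing HbA1c models
weights = akaike_weights([210.3, 212.1, 215.8, 210.9, 230.4])
# Model-averaged prediction: weighted sum of the per-model predictions
preds = [7.1, 7.3, 6.9, 7.2, 7.5]
averaged = sum(w * p for w, p in zip(weights, preds))
```

Averaging over all candidates this way avoids committing to a single "best" model when several have nearly indistinguishable AICs, as in the 210.3 versus 210.9 pair here.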
120

Model selection

Hildebrand, Annelize 11 1900 (has links)
In developing an understanding of real-world problems, researchers develop mathematical and statistical models. Various model selection methods exist that can be used to obtain a mathematical model that best describes the real-world situation in one sense or another. These methods aim to assess the merits of competing models by concentrating on a particular criterion. Each selection method is associated with its own criterion and is named accordingly. The better-known ones include Akaike's Information Criterion, Mallows' Cp and cross-validation, to name a few. The value of the criterion is calculated for each model, and the model corresponding to the minimum value of the criterion is then selected as the "best" model. / Mathematical Sciences / M. Sc. (Statistics)
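The selection procedure described here, computing a criterion for each candidate and picking the minimum, can be sketched with the Gaussian form of Akaike's Information Criterion; the candidate models, residual sums of squares, and parameter counts are invented for illustration:

```python
import math

def aic(n, rss, k):
    """Gaussian AIC up to a constant: n*ln(RSS/n) + 2k,
    where k is the number of fitted parameters."""
    return n * math.log(rss / n) + 2 * k

def select_model(n, candidates):
    """Pick the (name, rss, k) candidate with the minimum criterion value."""
    return min(candidates, key=lambda c: aic(n, c[1], c[2]))

# Hypothetical fits: richer models reduce RSS but pay the 2k penalty
models = [("intercept-only", 120.0, 1),
          ("linear",          60.0, 2),
          ("quadratic",       58.0, 3)]
best = select_model(n=50, candidates=models)
```

Here the quadratic model barely improves the fit over the linear one, so its extra parameter costs more than it buys and the linear model wins, which is exactly the trade-off these criteria are designed to arbitrate.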
