Global ETD Search

331	Inclusive hyper- to dilute-concentrated suspended sediment transport study using modified rouse model: parametrized power-linear coupled approach using machine learning Kumar, S., Singh, H.P., Balaji, S., Hanmaiahgari, P.R., Pu, Jaan H. 31 July 2022 (has links) Yes / The transfer of suspended sediment can range widely from being diluted to being hyperconcentrated, depending on the local flow and ground conditions. Using the Rouse model and the Kundu and Ghoshal (2017) model, it is possible to look at the sediment distribution for a range of hyper-concentrated and diluted flows. According to the Kundu and Ghoshal model, the sediment flow follows a linear profile for the hyper-concentrated flow regime and a power law applies for the dilute concentrated flow regime. This paper describes these models and uses machine-learning techniques to examine how the Kundu and Ghoshal parameters (linear-law coefficients and power-law coefficients) depend on sediment flow parameters. The machine-learning models used are XGBoost Classifier, Linear Regressor (Ridge), Linear Regressor (Bayesian), K Nearest Neighbours, Decision Tree Regressor, and Support Vector Machines (Regressor). The models were implemented on Google Colab and applied to determine the relationship between every Kundu and Ghoshal parameter and each sediment flow parameter (mean concentration, Rouse number, and size parameter) for both a linear profile and a power-law profile. The models correctly calculated the suspended sediment profile for a range of flow conditions (0.268 mm ≤ d50 ≤ 2.29 mm, 0.00105 g/mm³ ≤ particle density ≤ 2.65 g/mm³, 0.197 mm/s ≤ v_s ≤ 96 mm/s, 7.16 mm/s ≤ u* ≤ 63.3 mm/s, 0.00042 ≤ c̄ ≤ 0.54), including a range of Rouse numbers (0.0076 ≤ P ≤ 23.5). The models showed particularly good accuracy for testing at low and extremely high concentrations for type I to III profiles. 
Rouse number Mean concentration Suspended sediment transport Sediment size parameter Parameterized power-linear model Machine learning Decision tree regressor Support vector machines
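For readers unfamiliar with the Rouse model named in entry 331, the following is a minimal sketch of the classical (textbook) Rouse concentration profile, not the modified power-linear model the paper develops. The function names, the flow depth `h`, the reference height `a`, the reference concentration `c_a`, and the von Kármán constant of 0.41 are assumptions made for illustration.

```python
KAPPA = 0.41  # von Karman constant (assumed value, not taken from the abstract)


def rouse_number(settling_velocity, shear_velocity, kappa=KAPPA):
    """Rouse number P = v_s / (kappa * u*), with both velocities in the same units."""
    return settling_velocity / (kappa * shear_velocity)


def rouse_profile(z, h, a, c_a, p):
    """Classical Rouse profile: C(z) = c_a * [((h - z) / z) * (a / (h - a))] ** P.

    z   : height above the bed where the concentration is wanted
    h   : flow depth
    a   : reference height at which the concentration equals c_a
    p   : Rouse number
    """
    if not (0 < a < h and 0 < z < h):
        raise ValueError("need 0 < a < h and 0 < z < h")
    return c_a * (((h - z) / z) * (a / (h - a))) ** p


# Example: smallest settling velocity and largest shear velocity quoted in the abstract (mm/s).
p_low = rouse_number(0.197, 63.3)
```

With those two values, `p_low` comes out at about 0.0076, which happens to agree with the lower Rouse-number bound quoted in the abstract, although the abstract does not say how its bounds were obtained.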
332	Assessing Machine Learning Algorithms to Develop Station-based Forecasting Models for Public Transport : Case Study of Bus Network in Stockholm Movaghar, Mahsa January 2022 (has links) Public transport is essential for both residents and city planners because of its environmentally and economically beneficial characteristics. During the past decade, climate change, coupled with fuel and energy crises, has attracted significant attention toward public transportation. Increasing demand for public transport on the one hand, and its complexity on the other, have made optimum network design quite challenging for city planners. Ridership is affected by numerous variables and features, such as space and time. These fluctuations, coupled with inherent uncertainties due to different travel behaviors, make this procedure challenging. Any mismatch between demand and supply can result in great user dissatisfaction and waste of energy on the horizon. In recent years, new technologies for recording and storing data and advances in data analysis techniques have significantly improved pattern finding and ridership prediction based on historical data. This study aims to develop forecasting models by regressing boardings on population, time of day, month, and station. Using the available boarding dataset for blue bus line number 4 in Stockholm, Sweden, seven different machine learning algorithms were assessed for prediction: Multiple Linear Regression, Decision Tree, Random Forest, Bayesian Ridge Regression, Neural Networks, Support Vector Machines, K-Nearest Neighbors. The models were trained and tested on the dataset from 2012 to 2019, before the start of the pandemic. KNN, with an average R-squared of 0.65 in 10-fold cross-validation, was accepted as the best model. This model is then used to predict reduced ridership during the pandemic in 2020 and 2021. 
The results showed a reduction of 48.93% in 2020 and 82.24% in 2021 for the studied bus line. Public transport ridership machine learning Multiple Linear Regression Decision Tree Random Forest Bayesian Ridge Regression Neural Networks Support Vector Machines K-Nearest Neighbors Engineering and Technology Teknik och teknologier
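The evaluation described in entry 332 (a K-Nearest Neighbors regressor scored by average R-squared over 10-fold cross-validation) can be sketched in plain Python as follows. This is an illustrative sketch only: the function names, the default of five neighbours, the Euclidean distance, and the shuffling seed are assumptions, and the thesis's actual features, preprocessing, and tooling are not given in the abstract.

```python
import random


def knn_predict(train_x, train_y, query, k):
    """Predict by averaging the targets of the k nearest training points."""
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def cross_val_r2(x, y, k_neighbors=5, n_folds=10, seed=0):
    """Average R-squared over n_folds shuffled, non-overlapping folds."""
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        train_x = [x[i] for i in train]
        train_y = [y[i] for i in train]
        preds = [knn_predict(train_x, train_y, x[i], k_neighbors) for i in fold]
        scores.append(r_squared([y[i] for i in fold], preds))
    return sum(scores) / len(scores)
```

Each fold is held out once while the model is fitted on the other nine, so every observation is scored by a model that never saw it; the reported figure is the mean of the ten fold scores.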
333	A Machine Learning Framework for the Classification of Natura 2000 Habitat Types at Large Spatial Scales Using MODIS Surface Reflectance Data Sittaro, Fabian, Hutengs, Christopher, Semella, Sebastian, Vohland, Michael 02 June 2023 (has links) Anthropogenic climate and land use change is causing rapid shifts in the distribution and composition of habitats with profound impacts on ecosystem biodiversity. The sustainable management of ecosystems requires monitoring programmes capable of detecting shifts in habitat distribution and composition at large spatial scales. Remote sensing observations facilitate such efforts as they enable cost-efficient modelling approaches that utilize publicly available datasets and can assess the status of habitats over extended periods of time. In this study, we introduce a modelling framework for habitat monitoring in Germany using readily available MODIS surface reflectance data. We developed supervised classification models that allocate (semi-)natural areas to one of 18 classes based on their similarity to Natura 2000 habitat types. Three machine learning classifiers, i.e., Support Vector Machines (SVM), Random Forests (RF), and C5.0, and an ensemble approach were employed to predict habitat type using spectral signatures from MODIS in the visible-to-near-infrared and short-wave infrared. The models were trained on homogenous Special Areas of Conservation that are predominantly covered by a single habitat type with reference data from 2013, 2014, and 2016 and tested against ground truth data from 2010 and 2019 for independent model validation. Individually, the SVM and RF methods achieved better overall classification accuracies (SVM: 0.72–0.93%, RF: 0.72–0.94%) than the C5.0 algorithm (0.66–0.93%), while the ensemble classifier developed from the individual models gave the best performance with overall accuracies of 94.23% for 2010 and 80.34% for 2019 and also allowed a robust detection of non-classifiable pixels. 
We detected strong variability in the cover of individual habitat types, which was reduced when the types were aggregated based on their similarity. Our methodology is capable of providing quantitative information on the spatial distribution of habitats, can differentiate between disturbance events and gradual shifts in ecosystem composition, and could successfully allocate natural areas to Natura 2000 habitat types. info:eu-repo/classification/ddc/620 ddc:620
334	Comparison of Recommendation Systems for Auto-scaling in the Cloud Environment Boyapati, Sai Nikhil January 2023 (has links) Background: Cloud computing’s rapid growth has highlighted the need for efficient resource allocation. While cloud platforms offer scalability and cost-effectiveness for a variety of applications, managing resources to match dynamic workloads remains a challenge. Auto-scaling, the dynamic allocation of resources in response to real-time demand and performance metrics, has emerged as a solution. Traditional rule-based methods struggle with the increasing complexity of cloud applications. Machine Learning models offer promising accuracy by learning from performance metrics and adapting resource allocations accordingly. Objectives: This thesis addresses auto-scaling recommendations in cloud environments, emphasizing the integration of Machine Learning models and significant application metrics. Its primary objectives are determining the critical metrics for accurate recommendations and evaluating the best recommendation techniques for auto-scaling. Methods: The study initially identifies the crucial metrics, such as CPU usage and memory consumption, that have a substantial impact on auto-scaling selections through thorough experimentation and analysis. Machine Learning (ML) techniques are selected based on a literature review, and then further evaluated through thorough experimentation and analysis. These findings establish a foundation for the subsequent evaluation of ML techniques for auto-scaling recommendations. Results: The performance of Random Forests (RF), K-Nearest Neighbors (KNN), and Support Vector Machines (SVM) is investigated in this research. The results show that RF has higher accuracy, precision, and recall, which is consistent with the significance of the metrics identified earlier. 
Conclusions: This thesis enhances the understanding of auto-scaling recommendations by combining the findings from metric importance and recommendation technique performance. The findings show the complex interactions between metrics and recommendation methods, establishing the way for the development of adaptive auto-scaling systems that improve resource efficiency and application functionality. Auto-Scaling Auto-Scaling Recommendations Cloud Environment K-Nearest Neighbors Machine Learning Recommendation Systems Random Forests Support Vector Machines Computer Sciences Datavetenskap (datalogi)
335	Supervised Learning for Prediction of Tumour Mutational Burden / Användning av statistisk inlärning för estimering av mutationsbörda Hargell, Joanna January 2021 (has links) Tumour Mutational Burden (TMB) is a promising biomarker to predict response to immunotherapy. In this thesis, statistical methods of supervised learning were used to predict TMB: GLM, Decision Trees and SVM. Predictions were based on data from targeted DNA sequencing, using variants found in the exonic, intronic, UTR and intergenic regions of the human DNA. This project was of an exploratory nature, performed in a pan-cancer setting. Both regression and classification were considered. The purpose was to investigate whether variants found in these regions of the DNA sequence are useful when predicting TMB. Poisson regression and Negative binomial regression were used within the framework of GLM. The results indicated deficiencies in the model assumptions and that the use of GLM for the application is questionable. The single regression tree did not yield satisfactory prediction accuracy. However, performance was improved by using variance-reducing methods such as bagging and random forests. The use of boosted regression trees did not yield any significant improvement in prediction accuracy. In the classification setting, binary as well as multiple classes were considered. The distinction between classes was based on thresholds commonly used in clinical care to determine eligibility for immunotherapy. SVM and classification trees yielded high prediction accuracy for the binary case: a misclassification rate of 0.0242 and 0 respectively for the independent test set. In the multiple classification setting, bagging and random forests were implemented, yet did not improve performance over the single classification tree. SVM produced a misclassification rate of 0.103, and the corresponding number for the single classification tree was 0.109. 
It was concluded that SVM and Decision trees are suitable methods for predicting TMB based on targeted gene panels. However, to obtain reliable predictions, there is a need to move from a pan-cancer setting to a diagnosis-based setting. Furthermore, parameters affecting TMB, like pre-analytical factors, need to be included in the statistical analysis. / This thesis examines three statistical learning methods, GLM, Decision Trees and SVM, with the aim of predicting tumour mutational burden (TMB) for cancer patients. The methods were applied to both regression and classification. Predictions were based on data from panel-based DNA sequencing containing variants from coding, intronic, UTR and intergenic regions of human DNA. The project aims to investigate whether variants from these regions of the DNA sequence can be useful for predicting a patient's mutational burden. Poisson regression and Negative Binomial regression were examined within GLM. The results indicated deficiencies in the models and that GLM is not suitable for this application. The regression trees did not give sufficiently accurate predictions, but implementing bagging and random forests improved model performance. Boosting did not improve the results. For classification, both binary and multiple classes were used. The boundaries between classes were based on established TMB thresholds used in clinical care for receiving immunotherapy. SVM and decision trees performed well for binary classification, with a misclassification error of 0.024 for SVM and 0 for decision trees. Bagging and random forests were implemented for the multi-class case within decision trees, but did not improve performance. For multiple classes, SVM gave a misclassification error of 0.103 and decision trees 0.109. Both SVM and decision trees proved to be suitable methods for predicting the value of TMB. 
However, for the predictions to be reliable, this type of analysis needs to be carried out for each individual cancer diagnosis. In addition, parameters from the bioinformatic process need to be included in the statistical analysis. Supervised Learning Tumour Mutational Burden Generalized Linear Models Decision trees Support Vector Machines statistik tillämpad matematik statistisk inlärning mutationsbörda Mathematics Matematik
336	Cost-Sensitive Learning-based Methods for Imbalanced Classification Problems with Applications Razzaghi, Talayeh 01 January 2014 (has links) Analysis and predictive modeling of massive datasets is an extremely significant problem that arises in many practical applications. The task of predictive modeling becomes even more challenging when data are imperfect or uncertain. The real data are frequently affected by outliers, uncertain labels, and uneven distribution of classes (imbalanced data). Such uncertainties create bias and make predictive modeling an even more difficult task. In the present work, we introduce a cost-sensitive learning method (CSL) to deal with the classification of imperfect data. Typically, most traditional approaches for classification demonstrate poor performance in an environment with imperfect data. We propose the use of CSL with Support Vector Machine, which is a well-known data mining algorithm. The results reveal that the proposed algorithm produces more accurate classifiers and is more robust with respect to imperfect data. Furthermore, we explore the best performance measures to tackle imperfect data along with addressing real problems in quality control and business analytics. Classification imbalanced data cost sensitive learning outliers weighted support vector machine relaxed support vector machines control chart pattern recognition Engineering Industrial Engineering
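The idea behind cost-sensitive learning with a support vector machine, as named in entry 336, can be illustrated with a class-weighted hinge loss in which errors on the rare class cost more. The sketch below is an assumption-laden illustration: the abstract does not state its cost scheme, so the inverse-frequency costs and both function names are hypothetical choices rather than the author's method.

```python
from collections import Counter


def balanced_class_costs(labels):
    """Assumed heuristic: cost per class = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_hinge_loss(scores, labels, costs):
    """Mean of cost[y] * max(0, 1 - y * f(x)) for labels y in {-1, +1}."""
    total = sum(costs[y] * max(0.0, 1.0 - y * s) for s, y in zip(scores, labels))
    return total / len(labels)


# Example: 2 positives against 8 negatives, so positives get four times the cost.
labels = [1, 1] + [-1] * 8
costs = balanced_class_costs(labels)  # {1: 2.5, -1: 0.625}
```

Under this weighting a classifier that ignores the minority class is penalised as heavily for its two missed positives as it would be for eight missed negatives, which is what discourages the trivial majority-class solution.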
337	Supervised Speech Separation And Processing Han, Kun January 2014 (has links) No description available. Computer Science Supervised learning Speech separation Speech processing Machine learning Deep Learning Pitch estimation Speech Dereverberation Deep neural networks Support vector machines
338	Diabetic Retinopathy Classification Using Gray Level Textural Contrast and Blood Vessel Edge Profile Map Gurudath, Nikita January 2014 (has links) No description available. Biomedical Engineering Electrical Engineering Engineering Medical Imaging Ophthalmology Diabetic retinopathy Fundus images Gaussian filtering Texture and fractal features Artificial Neural Network Support Vector Machines
339	Experiments with Support Vector Machines and Kernels Kohram, Mojtaba 21 October 2013 (has links) No description available. Computer Science Support Vector Machines SVM kernel RBF kernel Gaussian Radial Basis Function Spectral Information Divergence Spectral Angle Mapper RNA-protein interaction PSSM matrix
340	A Semi Supervised Support Vector Machine for a Recommender System : Applied to a real estate dataset Méndez, José January 2021 (has links) Recommender systems are widely used in e-commerce websites to improve the buying experience of the customer. In recent years, e-commerce has been quickly expanding and its growth has been accelerated during the COVID-19 pandemic, when customers and retailers were asked to keep their distance and observe lockdowns. Therefore, there is an increasing demand for items and for good recommendations that improve users' shopping experience. In this master’s thesis a recommender system for a real-estate website is built, based on Support Vector Machines (SVM). The main characteristic of the built model is that it is trained with a few labelled samples, the remaining samples being unlabelled, using a semi-supervised machine learning paradigm. The model is constructed step by step, from the simple SVM up to the semi-supervised Nested Cost-Sensitive Support Vector Machine (NCS-SVM). Then, we compare our model using four different kernel functions: Gaussian, second-degree polynomial, fourth-degree polynomial, and linear. We also compare a user with strict housing requirements against a user with vague requirements. We finish with a discussion focusing principally on parameter tuning, and briefly on the model's downsides and ethical considerations. SVM Support Vector Machines Semisupervised Learning Machine Learning Semi-supervised learning Computer Engineering Datorteknik Other Computer and Information Science Annan data- och informationsvetenskap
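The four kernel families compared in entry 340 (Gaussian, second-degree polynomial, fourth-degree polynomial, and linear) have standard textbook forms, sketched below. The hyperparameters `gamma` and `coef0`, their default values, and the function names are assumptions; the abstract does not give the thesis's parameterization.

```python
import math


def linear_kernel(x, y):
    """k(x, y) = <x, y>."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree, coef0=1.0):
    """k(x, y) = (<x, y> + coef0) ** degree; degree 2 and 4 are the cases in the entry."""
    return (linear_kernel(x, y) + coef0) ** degree


def gaussian_kernel(x, y, gamma=1.0):
    """k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)
```

The linear and polynomial kernels grow with the inner product of the inputs, whereas the Gaussian kernel depends only on their distance and is bounded by 1, which is one reason the choice of kernel can change which listings a model treats as similar.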

Search results