61.
Sales Forecasting by Assembly of Multiple Machine Learning Methods: A stacking approach to supervised machine learning. Falk, Anton; Holmgren, Daniel. January 2021.
Today, digitalization is a key factor for businesses seeking to enhance growth and gain insight into their operations. Digitalization plays a central role both in planning operations and in understanding customers, and companies are spending more and more resources in this field to gain critical insights and enhance growth. The fast-food industry is no exception: restaurants need to be highly flexible and agile in their work. This creates an immense demand for knowledge and insights to help restaurants plan their daily operations, and a great need for organizations to continuously adopt new technological solutions into their existing processes. Well-implemented machine learning solutions in combination with feature engineering are likely to bring value to these processes. Sales forecasting, the main field of study in this thesis, plays a vital role in planning a fast-food restaurant's operations, both for budgeting and for staffing. The term fast food speaks for itself, and with it comes a commitment to provide customers with high-quality food and rapid service. Understaffing risks compromising either the quality of the food or the service, while overstaffing leads to low overall productivity. Generating highly reliable sales forecasts is thus vital to maximizing profit and minimizing operational risk. SARIMA, XGBoost, and Random Forest were evaluated on training data consisting of sales numbers, business hours, and categorical variables describing date and month. These models served as base learners whose predictions on a held-out dataset were used as training data for a Support Vector Regression (SVR) meta-model. This stacking approach showed satisfactory results, with a significant gain in prediction accuracy over the existing solution for all investigated restaurants on a 6-week aggregated timeline.
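A minimal sketch of the stacking scheme this abstract describes, using scikit-learn and synthetic data as stand-ins (the feature set, model settings, and the single Random Forest base learner are illustrative assumptions; the thesis also stacks SARIMA and XGBoost):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Hypothetical features: business hours plus encoded date/month variables.
rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 6))
y = X @ rng.uniform(1.0, 5.0, size=6) + rng.normal(scale=0.5, size=500)

# Hold out half the data: base-learner predictions on it become the
# meta-learner's training set, so the SVR never sees in-sample fits.
X_base, X_meta, y_base, y_meta = train_test_split(X, y, test_size=0.5, random_state=0)

# One base learner shown; SARIMA and XGBoost would be appended here.
base_learners = [RandomForestRegressor(n_estimators=200, random_state=0)]
for model in base_learners:
    model.fit(X_base, y_base)

# Stack the base predictions column-wise as meta-features for the SVR.
Z = np.column_stack([m.predict(X_meta) for m in base_learners])
meta_model = SVR(kernel="rbf", C=10.0).fit(Z, y_meta)
```

The key design point in stacking is that the meta-learner trains only on base-learner predictions for data those learners never fitted, which keeps the SVR from simply learning to trust overfit base models.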
62.
Algorithmic Approaches to Output Prediction in a Virtual Power Plant. Rosing, Johannes; Ekhed, Oscar. January 2023.
Virtual Power Plants (VPPs) are an emerging form of technology that allows owners of electricity-producing appliances, such as electric vehicles, to partake in a pool of producers of sustainable energy. The Swedish electricity grid owner Svenska Kraftnät hosts a platform where VPPs act as intermediaries between energy-producing customers and third-party buyers. A requirement to participate in these transactions, however, is to post a bid specifying the amount of power that a VPP can produce during a given hour at least 48 hours into the future. This is where forecasting comes into the picture. This report compares the accuracy of eight different machine learning models tasked with forecasting power output from the same training data, drawn from an electric vehicle-based VPP. The study also examines which inferences about customer behavior can be drawn from the same data and gives strategic recommendations to VPPs based on its findings. Upon evaluating the results, it was found that deep learning models outperformed autoregressive models, which in turn outperformed Random Forest Regression and Support Vector Regression. As for customer behaviors found in the data, a small negative correlation between spot prices and delivered output was found, suggesting that customers limit their charging when spot prices are high. Further, more power is generally produced during nighttime and on weekends. The data also shows an autocorrelation with a lag of 24 hours, suggesting that charging behaviors on a given day influence charging behaviors the subsequent day.
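A small sketch of how the reported lag-24 autocorrelation could be checked, with a hypothetical hourly output series standing in for the VPP data:

```python
import numpy as np

def lag_autocorr(series: np.ndarray, lag: int) -> float:
    """Sample autocorrelation of a 1-D series at the given lag."""
    s = series - series.mean()
    return float(np.dot(s[:-lag], s[lag:]) / np.dot(s, s))

# Hypothetical hourly output with a daily charging pattern plus noise.
rng = np.random.default_rng(1)
hours = np.arange(24 * 60)
output = np.sin(2 * np.pi * hours / 24) + rng.normal(scale=0.3, size=hours.size)

print(f"lag-24 autocorrelation: {lag_autocorr(output, 24):.2f}")
```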
63.
Machine Learning with Conformal Prediction for Predictive Maintenance Tasks in Industry 4.0: A Data-Driven Approach. Liu, Shuzhou; Mulahuko, Mpova. January 2023.
This thesis was a cooperation with Knowit, Östrand & Hansen, and Orkla. It aimed to explore the application of Machine Learning and Deep Learning models with Conformal Prediction for a predictive maintenance situation at Orkla. Predictive maintenance is essential in numerous industrial manufacturing scenarios: it can help reduce machine downtime, improve equipment reliability, and save unnecessary costs. In this thesis, various Machine Learning and Deep Learning models, including Decision Tree, Random Forest, Support Vector Regression, Gradient Boosting, and Long Short-Term Memory (LSTM), are applied to a real-world predictive maintenance dataset. The Orkla dataset was originally planned to be used in this thesis project; however, due to challenges encountered and time limitations, a NASA C-MAPSS dataset with a similar data structure was chosen to study how Machine Learning models could be applied to predict remaining useful lifetime (RUL) in manufacturing. In addition, conformal prediction, a recently developed framework for measuring the prediction uncertainty of Machine Learning models, is integrated into the models for more reliable RUL prediction. The results show that both the Machine Learning and Deep Learning models with conformal prediction could predict RUL close to the true RUL, with LSTM outperforming the Machine Learning models. The conformal prediction intervals also provide informative and reliable information about the uncertainty of the predictions, which can help factory personnel take necessary maintenance actions in advance. Overall, this thesis demonstrates the effectiveness of utilizing Machine Learning and Deep Learning models with Conformal Prediction for predictive maintenance situations. Moreover, based on the modeling results on the NASA dataset, it discusses how these experiences could be transferred to Orkla data for RUL prediction in the future.
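A minimal sketch of split (inductive) conformal prediction, the simplest variant of the framework described above, on synthetic stand-in data (the model choice, miscoverage level, and features are illustrative assumptions, not the thesis's setup):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Hypothetical degradation features and RUL targets standing in for C-MAPSS.
rng = np.random.default_rng(0)
X = rng.uniform(size=(1000, 5))
y = 100.0 - 80.0 * X[:, 0] + rng.normal(scale=5.0, size=1000)

X_train, X_cal, y_train, y_cal = train_test_split(X, y, test_size=0.3, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# Split conformal: a finite-sample-adjusted quantile of the absolute
# calibration residuals becomes the half-width of every interval.
alpha = 0.1  # target 90% coverage
residuals = np.abs(y_cal - model.predict(X_cal))
n = len(residuals)
q = np.quantile(residuals, min(1.0, np.ceil((1 - alpha) * (n + 1)) / n))

pred = model.predict(X[:1])[0]
print(f"90% prediction interval for RUL: [{pred - q:.1f}, {pred + q:.1f}]")
```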
64.
Investigation of Machine Learning Regression Techniques to Predict Critical Heat Flux. Helmryd Grosfilley, Emil. January 2022.
A unifying model for Critical Heat Flux (CHF) prediction has been elusive for over 60 years. With the release of the data used to build the 2006 Groeneveld Lookup Table (LUT), by far the largest public CHF database available to date, data-driven predictions over a large variable space can be performed. The popularization of machine learning techniques for regression problems provides deeper and more advanced tools for analyzing the data. We compare three different machine learning algorithms for predicting the occurrence of CHF in vertical, uniformly heated round tubes. For each selected algorithm (ν-Support Vector Regression, Gaussian Process Regression, and Neural Network Regression), an optimized hyperparameter set is fitted. The best-performing algorithm is the neural network, which achieves a standard deviation of the predicted/measured factor three times lower than the LUT, while Gaussian Process Regression and ν-Support Vector Regression both achieve a standard deviation two times lower. All algorithms significantly outperform the LUT's prediction performance. The neural network model and training methodology are designed to prevent overfitting, which is confirmed by data analysis of the predictions. Additionally, a feasibility study of transfer learning and uncertainty quantification is performed to investigate potential future applications.
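A sketch of the ν-Support Vector Regression baseline together with the figure of merit used above, the standard deviation of the predicted/measured (P/M) factor, on synthetic stand-in data (the three input variables and model settings are assumptions for illustration):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import NuSVR

# Synthetic stand-ins for CHF inputs (e.g. pressure, mass flux, quality).
rng = np.random.default_rng(0)
X = rng.uniform(size=(800, 3))
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.1, size=800)

model = make_pipeline(StandardScaler(), NuSVR(nu=0.5, C=10.0))
model.fit(X[:600], y[:600])

# Figure of merit: standard deviation of the predicted/measured factor
# on held-out data (lower means tighter agreement with measurements).
pm_ratio = model.predict(X[600:]) / y[600:]
print(f"std of P/M factor: {pm_ratio.std():.3f}")
```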
65.
Plant yield prediction in indoor farming using machine learning. Ashok, Anjali; Adesoba, Mary. January 2023.
The agricultural industry has started to rely more on data-driven approaches to improve productivity and use its resources effectively. This thesis project, carried out in collaboration with Ljusgårda AB, explores plant yield prediction using machine learning models and hyperparameter tuning. The work is based on data gathered from the company, and yield prediction is carried out in two scenarios, each focusing on a different time frame of the growth stage. The first scenario predicts yield from day 8 to day 22 after transplant (DAT), while the second predicts yield from day 1 to day 22 DAT. Three machine learning algorithms were investigated: Support Vector Regression (SVR), Long Short-Term Memory (LSTM), and Artificial Neural Network (ANN). Model performance was evaluated using the metrics Mean Squared Error (MSE), Mean Absolute Error (MAE), and R-squared. The evaluation showed that ANN performed best on MSE and R-squared with dataset 1, while SVR performed best on MAE with dataset 2; thus, both ANN and SVR meet the objective of this thesis work. The hyperparameter tuning experiments on the three models further demonstrated the significance of hyperparameter tuning in improving the models and making them better suited to the available data.
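For reference, the three reported metrics can be computed as below; the yield values are hypothetical:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical true vs. predicted yields for one scenario.
y_true = np.array([120.0, 135.0, 150.0, 143.0, 128.0])
y_pred = np.array([118.5, 140.2, 146.9, 145.0, 131.3])

print(f"MSE: {mean_squared_error(y_true, y_pred):.2f}")
print(f"MAE: {mean_absolute_error(y_true, y_pred):.2f}")
print(f"R^2: {r2_score(y_true, y_pred):.3f}")
```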
66.
A Comparison of Various Interpolation Techniques for Modeling and Estimation of Radon Concentrations in Ohio. Gummadi, Jayaram. January 2013.
No description available.
67.
A Statistical Approach to Modeling Wheel-Rail Contact Dynamics. Hosseini, SayedMohammad. 12 January 2021.
The wheel-rail contact mechanics and dynamics that are of great importance to the railroad industry are evaluated by applying statistical methods to the large volume of data collected on the state-of-the-art VT-FRA roller rig. The intent is to use statistical principles to highlight the relative importance of various factors encountered in practice for longitudinal and lateral traction, and to develop parametric models that can be used to predict traction in conditions beyond those tested on the rig. The experiment-based models are intended as an alternative to the classical traction-creepage models that have been available for decades. Various experiments are conducted in different settings on the VT-FRA Roller Rig at the Center for Vehicle Systems and Safety at Virginia Tech to study the relationship between the traction forces and the wheel-rail contact variables. The experimental data is used to fit parametric and non-parametric statistical models that efficiently capture this relationship. The study starts with single regression models and investigates the main effects of wheel load, creepage, and the angle of attack on the longitudinal and lateral traction forces. The assumptions of the classical linear regression model are carefully assessed and, in the case of non-linearities, different transformations are applied to the explanatory variables to find the functional form that best captures the relationship between the response and the explanatory variables. The analysis is then extended to multiple-regression models in which interaction among the explanatory variables is evaluated using model selection approaches. The developed models are then compared with their non-parametric counterparts, such as support vector regression, in terms of "goodness of fit," out-of-sample performance, and the distribution of predictions.

Master of Science

The interaction between the wheel and rail plays an important role in the dynamic behavior of railway vehicles. The wheel-rail contact has been extensively studied through analytical models, and measuring the contact forces is among the most important outcomes of such models. However, these models typically fall short when it comes to addressing the practical problems at hand. With the development of a high-precision test rig, the VT-FRA Roller Rig at the Center for Vehicle Systems and Safety (CVeSS), there is an increased opportunity to tackle the same problems from an entirely different perspective, i.e. through statistical modeling of experimental data.
Various experiments are conducted in different settings that represent railroad operating conditions on the VT-FRA Roller Rig, in order to study the relationship between wheel-rail traction and the variables affecting such forces. The experimental data is used to develop parametric and non-parametric statistical models that efficiently capture this relationship. The study starts with single regression models and investigates the main effects of wheel load, creepage, and the angle of attack on the longitudinal and lateral traction forces. The analysis is then extended to multiple models, and the existence of interactions among the explanatory variables is examined using model selection approaches. The developed models are then compared with their non-parametric counterparts, such as support vector regression, in terms of "goodness of fit," out-of-sample performance, and the distribution of the predictions.
The study develops regression models that are able to accurately explain the relationship between traction forces, wheel load, creepage, and the angle of attack.
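A brief sketch of the parametric-versus-non-parametric comparison described above, scoring a linear model against support vector regression on out-of-sample performance via cross-validation (the traction function and inputs are synthetic stand-ins, not VT-FRA Roller Rig data):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-ins: (wheel load, creepage, angle of attack) -> traction.
rng = np.random.default_rng(0)
X = rng.uniform(size=(300, 3))
y = 2.0 * X[:, 0] * np.tanh(8.0 * X[:, 1]) + 0.3 * X[:, 2] + rng.normal(scale=0.05, size=300)

models = {
    "linear regression (parametric)": LinearRegression(),
    "SVR (non-parametric)": make_pipeline(StandardScaler(), SVR(C=10.0)),
}
for name, model in models.items():
    # Out-of-sample R^2 via cross-validation, one axis of the comparison.
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: R^2 = {scores.mean():.3f} +/- {scores.std():.3f}")
```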
68.
Estimating the load weight of freight trains using machine learning. Kongpachith, Erik. January 2023.
Accurate estimation of the load weight of freight trains is crucial for ensuring safe, efficient, and sustainable rail freight transport. Traditional methods for estimating load weight often suffer from limitations in accuracy and efficiency. In recent years, machine learning algorithms have gained significant attention and found use cases within the railway industry thanks to their strong predictive capabilities for classification and regression tasks. This study presents a proof of concept in the form of a comparative analysis of five machine learning regression algorithms, Polynomial Regression, K-Nearest Neighbors, Regression Trees, Random Forest Regression, and Support Vector Regression, for estimating the load weight of freight trains from simulation data. The study uses two comprehensive datasets derived from train simulations in GENSYS, a simulation software for modeling rail vehicles. The datasets encompass various driving-condition factors such as train speed, track conditions, and running gear configurations. The algorithms are trained and evaluated on these datasets, and their performance is assessed using the root mean squared error (RMSE) and R² metrics. The experiments show that all five algorithms deliver promising performance for estimating the load weight: Polynomial Regression achieves the best result on both datasets when many features are considered, while Random Forest Regression achieves the best result on both datasets when only a small number of features are considered. It is further suggested that the methodology of this study be examined on real-world data from operating freight trains to validate the proof of concept in a real-world setting.
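A minimal sketch of the polynomial-regression approach that performed best with many features, on synthetic stand-ins for the GENSYS simulation variables (the feature meanings, degree, and target are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-ins for simulation features (e.g. speed, track condition).
rng = np.random.default_rng(0)
X = rng.uniform(size=(400, 3))
load = 20.0 + 15.0 * X[:, 0] ** 2 + 5.0 * X[:, 1] * X[:, 2] + rng.normal(scale=0.5, size=400)

# Degree-2 polynomial regression: expand the features (squares and
# pairwise products), then fit an ordinary linear model on the expansion.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X[:300], load[:300])

pred = model.predict(X[300:])
rmse = mean_squared_error(load[300:], pred) ** 0.5
print(f"RMSE: {rmse:.2f}  R^2: {r2_score(load[300:], pred):.3f}")
```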
69.
Development of New Uniform Experimental Designs Adapted to High-Dimensional Numerical Simulation [Développement de nouveaux plans d'expériences uniformes adaptés à la simulation numérique en grande dimension]. Santiago, Jenny. 4 February 2013.
This thesis proposes a methodology for numerical simulation studies in high dimensions, comprising several steps: constructing a suitable experimental design, performing sensitivity analysis, and modeling with response surfaces. The experimental designs suited to numerical simulation are Space-Filling Designs, which aim to spread points uniformly across the space of input variables. We propose the WSP algorithm to construct such designs quickly, with good uniformity properties, even in high dimensions. The resulting design is versatile: it is used in every step of the study, from sensitivity analysis to response surface modeling. The sensitivity analysis is carried out with an innovative approach on the points of this design to detect the subset of truly influential input variables. Adapting the principle of the Morris method, this approach ranks the input variables according to their effects. The initial design is then folded over into the subspace of the most influential inputs, which first requires checking the uniformity of the point distribution in the reduced space, since folding can create clusters and/or gaps. After the design is repaired, by removing clusters and filling gaps, it is used for the final step: response surface modeling. We chose the Support Vector Regression approach because its construction is independent of dimension and fast to set up. Obtaining results comparable to the classical approach (Kriging), this technique appears promising for studying complex high-dimensional phenomena.
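A hand-rolled sketch of Morris-style elementary-effects screening, the principle the sensitivity-analysis step adapts (the test function, step size, and random one-at-a-time sampling are illustrative; the thesis's actual approach runs on WSP design points):

```python
import numpy as np

def elementary_effects(f, dim, n_trajectories=20, delta=0.1, seed=0):
    """Morris-style screening: mean absolute elementary effect per input."""
    rng = np.random.default_rng(seed)
    effects = np.zeros((n_trajectories, dim))
    for t in range(n_trajectories):
        x = rng.uniform(0.0, 1.0 - delta, size=dim)
        for i in range(dim):
            x_step = x.copy()
            x_step[i] += delta  # perturb one input at a time
            effects[t, i] = (f(x_step) - f(x)) / delta
    return np.abs(effects).mean(axis=0)  # mu*: ranks input influence

# Hypothetical model: x0 strongly influential, x1 mildly, x2 inert.
f = lambda x: 5.0 * x[0] + x[1] ** 2 + 0.0 * x[2]
print(elementary_effects(f, dim=3))
```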
70.
Data-Driven Predictions of Heating Energy Savings in Residential Buildings. Lindblom, Ellen; Almquist, Isabelle. January 2019.
Along with the increasing use of intermittent electricity sources such as wind and solar comes a growing demand for user flexibility. This has paved the way for a new market of services that provide electricity customers with energy-saving solutions, ranging from sophisticated control of the customers' home equipment to information on how to adjust their consumption behavior in order to save energy. This master thesis contributes to this field by investigating an additional incentive: predictions of future energy savings related to indoor temperature. Five different machine learning models were tuned and used to predict monthly heating energy consumption for a given set of homes. Model tuning and performance evaluation were performed using 10-fold cross-validation. The best-performing model was then used to predict how much heating energy each individual household could save by decreasing its indoor temperature by 1 °C during the heating season. The highest prediction accuracy (about 78%) is achieved with support vector regression (SVR), closely followed by neural networks (NN); the simpler regression models implemented are, however, not far behind. According to the SVR model, the average household is expected to lower its heating energy consumption by approximately 3% if the indoor temperature is decreased by 1 °C.
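A minimal sketch of tuning an SVR with 10-fold cross-validation, as the abstract describes, on synthetic stand-in data (the features, parameter grid, and target are illustrative assumptions):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Hypothetical features: (indoor temp, outdoor temp, floor area) -> monthly kWh.
rng = np.random.default_rng(0)
X = rng.uniform([18.0, -10.0, 80.0], [24.0, 10.0, 200.0], size=(400, 3))
y = (X[:, 0] - X[:, 1]) * X[:, 2] / 100.0 + rng.normal(scale=2.0, size=400)

# Tune the SVR with 10-fold cross-validation; grid values are illustrative.
pipe = make_pipeline(StandardScaler(), SVR())
grid = GridSearchCV(
    pipe,
    param_grid={"svr__C": [1, 10, 100], "svr__epsilon": [0.1, 1.0, 10.0]},
    cv=10,
    scoring="r2",
)
grid.fit(X, y)
print(grid.best_params_, f"CV R^2: {grid.best_score_:.3f}")
```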