31 |
A Software Product Line for Parameter Tuning
Pukhkaiev, Dmytro 09 August 2023
Optimization is omnipresent in our world. Its applications range from industrial cases, such as logistics, construction management, and production planning, to the private sphere, with problems such as selecting daycare or planning a vacation.
In this thesis, we concentrate on expensive black-box optimization (EBBO) problems, a subset of optimization problems (OPs) characterized by an objective function that is expensive to evaluate. Such OPs recur in various domains under different names: hyperparameter optimization in machine learning, performance configuration optimization or parameter tuning in search-based software engineering, simulation optimization in operations research, and meta-optimization or parameter tuning in the optimization domain itself.
The high diversity of domains introduces a plethora of solving approaches, which adhere to a similar structure and workflow but differ in details. The software frameworks stemming from different areas possess only partially intersecting manageability points, i.e., they lack manageability.
In this thesis, we argue that the lack of manageability in EBBO is a major problem, which leads to underachieving optimization quality. The goal of this thesis is to study the role of manageability in EBBO and to investigate whether improving the manageability of EBBO frameworks increases optimization quality.
To reach this goal, we turn to software product line engineering (SPLE), a methodology for developing highly manageable software systems. Based on the foundations of SPLE, we introduce a novel framework for EBBO called BRISE. It offers: 1) a loosely coupled software architecture, separating the concerns of the experiment designer and the developer of EBBO strategies; 2) full coverage of all EBBO problem types; and 3) a context-aware variability model, which captures the experiment-designer-defined OP with the content model, and the manageability points, including their variants and constraints, with the cardinality-based feature model.
The high manageability of the BRISE framework enables us: 1) to extend the framework with novel efficient strategies, such as adaptive repetition management; and 2) to introduce novel EBBO mechanisms, such as multi-objective compositional surrogate modeling, dynamic sampling, and hierarchical surrogate modeling.
The evaluation of the novel approaches on a set of case studies, including the WFG benchmark for multi-objective optimization, combined selection and parameter control of meta-heuristics, and energy optimization, demonstrated their superiority over state-of-the-art competitors. Thus, it supports the research hypothesis of this thesis:
Improving the manageability of an EBBO framework makes it possible to increase optimization quality.
|
32 |
A Comparison of AutoML Hyperparameter Optimization Tools for Tabular Data
Pokhrel, Prativa 02 May 2023
No description available.
|
33 |
Plant yield prediction in indoor farming using machine learning
Ashok, Anjali, Adesoba, Mary January 2023
The agricultural industry has started to rely more on data-driven approaches to improve productivity and use its resources effectively. This thesis project, carried out in collaboration with Ljusgårda AB, explores plant yield prediction using machine learning models and hyperparameter tuning. The work is based on data gathered from the company, and plant yield prediction is carried out in two scenarios, each focused on a different time frame of the growth stage. The first scenario predicts yield from day 8 to day 22 of DAT (Day After Transplant), while the second scenario predicts yield from day 1 to day 22 of DAT. Three machine learning algorithms were investigated: Support Vector Regression (SVR), Long Short-Term Memory (LSTM), and Artificial Neural Network (ANN). The models' performance was evaluated using three metrics: Mean Square Error (MSE), Mean Absolute Error (MAE), and R-squared. The evaluation results showed that ANN performed best on MSE and R-squared with dataset 1, while SVR performed best on MAE with dataset 2. Thus, both ANN and SVR meet the objective of this thesis work. The hyperparameter tuning experiment on the three models further demonstrated the significance of hyperparameter tuning in improving the models and making them more suitable to the available data.
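As an illustration of the three evaluation metrics named in this abstract, the following is a minimal sketch in plain Python. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the thesis, which does not state how its metrics were computed.

```python
from statistics import mean


def regression_metrics(y_true, y_pred):
    """Return (MSE, MAE, R-squared) for paired observed and predicted values.

    Hypothetical helper for illustration only.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = mean(e * e for e in errors)    # mean of squared errors
    mae = mean(abs(e) for e in errors)   # mean of absolute errors

    y_bar = mean(y_true)
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)    # total sum of squares
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all observations are equal")
    r2 = 1.0 - ss_res / ss_tot

    return mse, mae, r2
```

MSE penalizes large errors more heavily than MAE, which is one reason two models can rank differently under the two metrics, as the abstract reports for ANN and SVR.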
|
34 |
Data-Driven Traffic Forecasting for Completed Vehicle Simulation: A Case Study with Volvo Test Trucks
Shahrokhi, Samaneh January 2023
This thesis offers a thorough investigation into the application of machine learning algorithms for predicting the presence of vehicles in a traffic setting. The research primarily focuses on enhancing vehicle simulation by employing data-driven traffic prediction methods. The study approaches the problem as a binary classification task. Various supervised learning algorithms, including Random Forest (RF), Gradient Boosting (GB), Support Vector Machine (SVM), and Logistic Regression (LogReg), were evaluated and tested. The thesis encompasses six distinct implementations, each involving a different combination of algorithms, feature engineering, hyperparameter tuning, and data splitting. The performance of each model was assessed using metrics such as accuracy, precision, recall, and F1-score, and visualizations such as ROC-AUC curves were used to gain insights into their discrimination capabilities. While the RF model achieved the highest accuracy at 97%, the AUC score of Combination 2 (RF+GB) suggests that this ensemble model could strike a better balance between high accuracy (86%) and effective class separation (99%). Ultimately, the study identifies an ensemble model as the preferred choice, leading to significant improvements in prediction accuracy. The research also treats the problem as a time-series prediction task, exploring the use of Long Short-Term Memory (LSTM) and Auto-Regressive Integrated Moving Average (Auto-ARIMA) models. However, we found that this approach was impractical due to the dataset's discrete and non-sequential nature. This research contributes to the advancement of vehicle simulation and traffic forecasting, demonstrating the potential of machine learning in addressing complex real-world scenarios.
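The classification metrics listed in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `classification_metrics`, the label encoding (1 for "vehicle present"), and the convention of returning 0.0 when a ratio is undefined are all hypothetical and do not come from the thesis.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels.

    Hypothetical helper for illustration only.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Assumed convention: report 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Accuracy alone can hide poor performance on the minority class, which is why it is commonly read together with precision, recall, and a threshold-independent measure such as AUC.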
|
35 |
Maximizing the performance of point cloud 4D panoptic segmentation using AutoML technique / Maximera prestandan för punktmoln 4D panoptisk segmentering med hjälp av AutoML-teknik
Ma, Teng January 2022
Environment perception is crucial to autonomous driving. Panoptic segmentation and object tracking are two challenging tasks, and the combination of both, namely 4D panoptic segmentation, has recently drawn researchers' attention. In this work, we implement 4D panoptic LiDAR segmentation (4D-PLS) on Volvo datasets and provide a pipeline of data preparation, model building, and model optimization. The main contributions of this work include: (1) building the Volvo datasets; (2) adopting a 4D-PLS model improved by Hyperparameter Optimization (HPO). We annotate point cloud data collected from Volvo CE and take a supervised learning approach, employing a Deep Neural Network (DNN) to extract features from point cloud data. On the basis of the 4D-PLS model, we employ Bayesian Optimization to find the best hyperparameters for our data and improve the model performance within a small training budget. / Environment perception is crucial for autonomous driving. Panoptic segmentation and object tracking are two challenging tasks, and the combination of both, namely 4D panoptic segmentation, has recently attracted researchers' attention. In this work, we implement 4D-PLS on Volvo's datasets and provide a pipeline of data preparation, model building, and model optimization. The main contributions of this work include: (1) building the Volvo datasets; (2) adopting a 4D-PLS model improved by HPO. We annotate point cloud data collected from Volvo CE and use supervised learning, employing a DNN to extract features from the point cloud data. On the basis of the 4D-PLS model, we use Bayesian Optimization to find the best hyperparameters for our data and improve the model's performance within a small training budget.
|
36 |
Neonatal Sepsis Detection Using Decision Tree Ensemble Methods: Random Forest and XGBoost
Al-Bardaji, Marwan, Danho, Nahir January 2022
Neonatal sepsis is a potentially fatal medical condition caused by infection, and about 200,000 deaths annually are attributed to it globally. With healthcare systems facing constant challenges, there is potential for introducing machine learning models as a diagnostic tool that can be automated within existing workflows and would not entail more work for healthcare personnel. The Herlenius Research Team at Karolinska Institutet has collected neonatal sepsis data that has been used for the development of many machine learning models across several papers. However, none have tried to study decision tree ensemble methods. In this paper, random forest and XGBoost models are developed and evaluated in order to assess their feasibility for clinical practice. The data contained 24 features of vital parameters that are easily collected through a patient monitoring system. The validation and evaluation procedure needed special consideration because the data was grouped at the patient level and was imbalanced. The methods developed in this paper have the potential to be generalized to other similar applications. Finally, measured by receiver-operating-characteristic area-under-curve (ROC AUC), both models achieved around ROC AUC = 0.84. Such results suggest that the random forest and XGBoost models are potentially feasible for clinical practice. Another insight gained was that both methods seemed to perform better with simpler models, suggesting that future work could create a more explainable model. / Neonatal sepsis is a potentially fatal medical condition resulting from an infection and is reported to cause 200,000 deaths annually worldwide. With healthcare systems constantly facing challenges, there is potential for machine learning models as diagnostic tools automated within existing workflows without entailing more work for healthcare staff.
The Herlenius research team at Karolinska Institutet has collected neonatal sepsis data that has been used to develop many machine learning models across several studies. However, none has tried to investigate decision tree ensemble methods. The purpose of this study is to develop and evaluate random forest and XGBoost models to assess their potential in clinical practice. The data contains 24 features of vital parameters that are easily collected through patient monitoring systems. The validation and evaluation procedure required special consideration, given that the data was grouped at the patient level and was imbalanced. The proposed method has the potential to be generalized to other similar applications. Finally, using the receiver-operating-characteristic area-under-curve (ROC AUC) measure, we could show that both models achieved a result of ROC AUC = 0.84. Such results suggest that both the random forest and XGBoost models could potentially be used in clinical practice. Another insight was that both models seemed to perform better with simpler models, which suggests that future work could be to create a more explainable machine learning model. / Kandidatexjobb i elektroteknik 2022, KTH, Stockholm
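The abstract notes that validation needed special care because the data is grouped by patient. One common way to respect such grouping is to split by group, so that no patient contributes samples to both the training and the test side of a fold. The sketch below illustrates that idea only; the function name `grouped_folds` and the round-robin assignment are assumptions, the thesis does not specify its procedure, and this sketch does not address class imbalance (which would additionally call for stratification).

```python
def grouped_folds(groups, n_folds):
    """Yield (train_indices, test_indices) with every group kept on one side.

    `groups` holds one group label (e.g. a patient ID) per sample.
    Hypothetical helper for illustration only.
    """
    unique = sorted(set(groups))
    if n_folds < 2 or n_folds > len(unique):
        raise ValueError("n_folds must be between 2 and the number of groups")

    # Assign each group to a fold in round-robin order.
    fold_of = {g: i % n_folds for i, g in enumerate(unique)}

    for k in range(n_folds):
        test = [i for i, g in enumerate(groups) if fold_of[g] == k]
        train = [i for i, g in enumerate(groups) if fold_of[g] != k]
        yield train, test
```

A usage example with hypothetical patient IDs:

```python
patient_ids = ["p1", "p1", "p2", "p3", "p3", "p3"]
for train_idx, test_idx in grouped_folds(patient_ids, n_folds=3):
    # Samples from the same patient never appear on both sides.
    print(train_idx, test_idx)
```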
|
37 |
Convergent and Efficient Methods to Optimize Deep Learning
Mashayekhi, Mehdi 29 September 2022
No description available.
|
38 |
Classifying True and Fake Telecommunication Signals With Deep Learning
Myrberger, Axel, Von Essen, Benjamin January 2020
This project aimed to classify artificially generated (fake) and authentic (true) telecommunication signals based on their frequency response, using methods from deep learning. Another goal was to accomplish this with the lowest possible data dimensionality. The datasets used consisted of equal amounts of measured frequency responses, provided by Ericsson, and generated frequency responses, produced by a WINNER II implementation in Matlab. It was determined that a normalized version of the absolute value of the complex frequency response was enough information for a feedforward network to classify the signals adequately. To improve the accuracy of the network, we performed a hyperparameter search, which allowed us to reach an accuracy of 90 percent on our test dataset. The results show that it is possible for neural networks to differentiate between true and fake telecommunication signals based on their frequency response, even if it is hard for a human to tell the difference. / The goal of this project was to classify artificially generated (fake) and real (true) telecommunication signals from the signals' frequency responses using deep learning methods. Another goal of the project was to classify the signals with the smallest possible number of data dimensions. The dataset used consisted half of measured data provided by Ericsson and half of data generated by a WINNER II model implemented in Matlab. One conclusion that could be drawn is that a normalized version of the magnitude of the complex frequency response contained enough information to train a feedforward network to achieve high classification accuracy. To further increase the reliability of the network, a hyperparameter search was performed, which increased the reliability to 90 percent on the test datasets.
The results show that it is possible for neural networks to distinguish between true and fake telecommunication signals based on their frequency response, even though it is hard for humans to tell the signals apart. / Kandidatexjobb i elektroteknik 2020, KTH, Stockholm
|
39 |
Differential neural architecture search for tabular data: Efficient neural network design for tabular datasets
Medhage, Marcus January 2024
Artificial neural networks are some of the most powerful machine learning models and have gained interest in the telecommunications domain as well as other fields and applications due to their strong performance and flexibility. Creating these models typically requires manually choosing their architecture along with other hyperparameters that are crucial for their performance. Neural Architecture Search (NAS) seeks to automate architecture choice and has gained increasing interest in recent years. In this thesis, we propose a new NAS method based on differentiable architecture search (DARTS) to find architectures of fully connected feedforward networks on tabular datasets. We train a gating mechanism on a validation dataset and compare four candidate gate functions as a tool to determine the number of hidden units per hidden layer in our neural networks for different tasks. Our findings show that our new method can reliably find architectures that are more compact than, and outperform, manually chosen architectures. Interestingly, we also found that extracting weights learned during the search process could generate models that achieve significantly higher and more stable performance than identical architectures retrained from scratch. Our method achieved performance equal to that of another NAS method, while requiring only half an hour of training compared to 280 hours. The trained models also demonstrated competitive performance when benchmarked against other state-of-the-art machine learning models. The primary benefit of our method stems from the extraction and fine-tuning of certain weights. Our results indicate that improvements from extracted weights could relate to the lottery ticket hypothesis of neural networks, which invites further study for a fuller understanding.
|
40 |
Transformer-Based Networks for Fault Detection and Diagnostics of Rotating Machinery
Wong, Jonathan January 2024
Machine health and condition monitoring are billion-dollar concerns for industry. Quality control and continuous improvement are some of the most important factors for manufacturers to consider in order to maintain a successful business. When work floor interruptions occur, engineers frequently employ "Band-Aid" fixes due to resource, timing, or technical constraints without solving for the root cause. Thus, quick, reliable, and accurate fault detection and diagnosis methods are needed.
Within complex rotating machinery, a fundamental component that accounts for large amounts of downtime and failure is a very basic yet crucial element, the rolling-element bearing. A worn-out bearing contributes to some of the most drastic failures in any mechanical system, next to electrical failures associated with stator windings. The cyclical motion provides a way for measurements to be taken via vibration sensors and analyzed through signal processing techniques. Methods will be discussed to transform these acquired signals into usable input data for neural network training in order to classify the type of fault that is present within the system.
With the widespread utilization and adoption of neural networks, we turn our attention to the growing field of sequence-to-sequence deep learning architectures. Language-based models have since been adapted to a multitude of tasks outside of text translation and word prediction. We now see powerful Transformers being used to accomplish generative modeling, computer vision, and anomaly detection, spanning all industries.
This research aims to determine the efficacy of the Transformer neural network for use in the detection and classification of faults within 3-phase induction motors for the automotive industry. We require a quick turnaround, which often leads to small datasets, so methods such as data augmentation will be employed to improve training on our time-series signals. / Thesis / Master of Applied Science (MASc)
|