21

SYSTEMATICALLY LEARNING OF INTERNAL RIBOSOME ENTRY SITE AND PREDICTION BY MACHINE LEARNING

Junhui Wang (5930375) 15 May 2019 (has links)
Internal ribosome entry sites (IRES) are segments of mRNA found in untranslated regions that can recruit the ribosome and initiate translation independently of the more widely used 5' cap-dependent translation initiation mechanism. IRES play an important role in conditions where 5' cap-dependent translation initiation has been blocked or repressed. They have been found to play important roles in viral infection, cellular apoptosis, and response to other external stimuli. It has been suggested that about 10% of mRNAs, both viral and cellular, can utilize IRES. However, due to the limitations of the IRES bicistronic assay, which is the gold standard for identifying IRES, relatively few IRES have been definitively described and functionally validated compared to the potential overall population. Viral and cellular IRES may be mechanistically different, but this is difficult to analyze because the mechanistic differences are still not clearly defined. Identifying additional IRES is an important step towards better understanding IRES mechanisms. Development of a new bioinformatics tool that can accurately predict IRES from sequence would be a significant step forward in identifying IRES-based regulation and in elucidating IRES mechanism. This dissertation systematically studies the features that can distinguish IRES from non-IRES sequences. Sequence features such as k-mer words, and structural features such as the predicted MFE of folding, Q_MFE, and sequence/structure triplets, are evaluated as possible discriminative features. These potential features are incorporated into an IRES classifier based on XGBoost, a machine learning model, to classify novel sequences as belonging to the IRES or non-IRES group. The XGBoost model performs better than previous predictors, with higher accuracy and lower computational time. The number of features in the model has been greatly reduced, compared to previous predictors, by adding global k-mer and structural features.
The trained XGBoost model has been implemented as the first high-throughput bioinformatics tool for IRES prediction, IRESpy. This website provides a public tool for all IRES researchers and can be used in other genomics applications such as gene annotation and analysis of differential gene expression.
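The k-mer word features mentioned in the abstract above can be illustrated with a minimal sketch. This is not the dissertation's code: the function name `kmer_features`, the choice of k = 2, the RNA alphabet, and the toy sequence are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_features(seq: str, k: int = 2, alphabet: str = "ACGU") -> dict[str, float]:
    """Relative frequency of every possible k-mer over `alphabet` in `seq`.

    Hypothetical helper for illustration; not taken from IRESpy.
    """
    # Normalise to upper-case RNA so DNA input (with T) is also accepted.
    seq = seq.upper().replace("T", "U")
    # Slide a window of width k across the sequence.
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(windows)
    total = max(len(windows), 1)  # avoid division by zero on short input
    # Emit a fixed-length feature vector: one entry per possible k-mer.
    return {
        "".join(p): counts["".join(p)] / total
        for p in product(alphabet, repeat=k)
    }


# Toy sequence (made up), giving 4**2 = 16 dinucleotide features.
features = kmer_features("AUGGCUAUGG", k=2)
```

A fixed-length vector of this kind is the sort of input a gradient-boosted tree classifier can consume; the structural features named in the abstract would need a separate RNA folding tool and are not sketched here.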
22

Predicting the Movement Direction of OMXS30 Stock Index Using XGBoost and Sentiment Analysis

Elena, Podasca January 2021 (has links)
Background. Stock market prediction is an active yet challenging research area. Both academia and practitioners have put considerable effort into producing accurate stock market prediction models in an attempt to maximize investment objectives. Tree-based ensemble machine learning methods such as XGBoost have proven successful in practice. At the same time, there is a growing trend to incorporate multiple data sources in prediction models, such as historical prices and text, in order to achieve superior forecasting performance. However, most applications and research have so far focused on the American or Asian stock markets, while the Swedish stock market has not been studied extensively from the perspective of hybrid models using both price-derived and text-derived features.
Objectives. The purpose of this thesis is to investigate whether augmenting a numerical dataset based on historical prices with sentiment features extracted from financial news improves classification performance when predicting the daily price trend of the Swedish stock market index, OMXS30.
Methods. A dataset of 3,517 samples from 2006 to 2020 was collected from two sources, historical prices and financial news. XGBoost was used as the classifier, and four different metrics were employed to compare model performance on three complementary datasets: the dataset that contains only the sentiment feature, the dataset with only price-derived features, and the price-derived dataset augmented with the sentiment feature extracted from financial news.
Results. XGBoost performs well in classifying the daily trend of OMXS30 given historical price features, achieving an accuracy of 73% on the test set. A small improvement across all metrics is recorded on the test set when augmenting the numerical dataset with sentiment features extracted from financial news.
Conclusions. XGBoost is a powerful ensemble method for stock market prediction, as reflected in a satisfactory classification performance on the daily movement direction of OMXS30. However, augmenting the numerical input set with sentiment features extracted from text did not have a strong impact on classification performance in this case, as the improvements across all employed metrics were small.
23

Machine Learning for Outcome Prediction of High-Risk Trauma Patients in the Emergency Department

Cardosi, Joshua David January 2021 (has links)
No description available.
24

CAN STATISTICAL MODELS BEAT BENCHMARK PREDICTIONS BASED ON RANKINGS IN TENNIS?

Svensson, William January 2021 (has links)
The aim of this thesis is to beat a benchmark prediction accuracy of 64.58 percent based on player rankings on the ATP tour in tennis. In the benchmark, the player with the better rank in a match is predicted to win. Three statistical models are used: logistic regression, random forest and XGBoost. The data cover the years 2000-2010 and comprise over 60,000 observations with 49 variables each. After the data were prepared, new variables were created, and the differences between the two players were taken, all three statistical models outperformed the benchmark prediction. All three models had an accuracy of around 66 percent, with logistic regression performing best at 66.45 percent. The most important variables overall for the models are the win rate on different surfaces, the total win rate and rank.
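The ranking benchmark described in the abstract above can be sketched as follows. This is an illustrative sketch only: the function name `rank_benchmark_accuracy`, the tuple layout, the tie handling, and the toy matches are assumptions, not the thesis's data or code.

```python
def rank_benchmark_accuracy(matches):
    """Accuracy of always predicting the better-ranked player as the winner.

    `matches` is an iterable of (rank_a, rank_b, winner) tuples, where
    `winner` is "a" or "b" and a numerically lower rank is better.
    Matches between equally ranked players are skipped (an assumption).
    """
    correct = 0
    total = 0
    for rank_a, rank_b, winner in matches:
        if rank_a == rank_b:
            continue  # the benchmark has no prediction for a tie
        predicted = "a" if rank_a < rank_b else "b"
        correct += predicted == winner
        total += 1
    return correct / total if total else 0.0


# Made-up toy matches: the better-ranked player wins three of the four.
toy_matches = [(1, 20, "a"), (5, 3, "a"), (10, 50, "a"), (8, 2, "b")]
accuracy = rank_benchmark_accuracy(toy_matches)
```

Any fitted model would then be judged by whether its accuracy on the same matches exceeds this baseline figure.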
25

ASSESSING PREDICTION CONDITIONS AND SEQUENTIAL CLASSIFICATION IN ICU SEPSIS PREDICTION

Lind, Petter January 2023 (has links)
Patients admitted to intensive care units (ICUs) often have a higher risk of sepsis due to weakened immune systems. Early sepsis diagnosis is crucial for timely treatment, emphasizing the need to improve the predictive capabilities of sepsis prediction models. Although machine learning models have demonstrated success in predicting sepsis onset, there is limited work on how model assessment is affected by sequential prediction rather than evaluating on one prediction per patient. This thesis assesses the effectiveness of the evaluation procedures employed by such models and explores different prediction conditions to enhance sepsis prediction. Data were collected from the MIMIC-IV data set and include variables commonly used in real ICU settings relevant to sepsis diagnosis. Random onset matching is used to select time points for patients with and without sepsis, and the data are analyzed using XGBoost. Evaluation metrics are calculated once per patient and compared to sequential measurements for all patients from 40 hours before sepsis up until sepsis onset. Results show that a model trained on data close to sepsis onset has strong predictive performance up to 25 hours before sepsis onset. In addition, different restrictive conditions on predictions are considered and evaluated. As the test set is limited, it is important that the results are validated further, as they could provide insights regarding interpretation in the practical implementation of similar prediction models that support healthcare professionals through timely interventions.
26

Predicting and classifying atrial fibrillation from ECG recordings using machine learning

Bogstedt, Carl January 2023 (has links)
Atrial fibrillation is one of the most common types of heart arrhythmias, which can cause irregular, weak and fast atrial contractions up to 600 beats per minute. Atrial fibrillation has increased prevalence with age and is associated with increased risks of ischemia, as blood clots can form due to the weak contractions. During prolonged periods of atrial fibrillation, the atria can undergo a process called atrial remodelling. This causes electrophysiological and structural changes to the atria such as increased atrial size and changes to calcium ion densities. These changes themselves promote the initiation and propagation of atrial fibrillation, which makes early detection crucial. Fortunately, atrial fibrillation can be detected on an electrocardiogram. Electrocardiograms measure the electrical activity of the heart during its cardiac cycle. This includes the initiation of the action potential, the depolarization of the atria and ventricles and their repolarization. On the electrocardiogram recording, these are seen as peaks and valleys, where each peak and valley can be traced back to one of these events. This means that during atrial fibrillation, the weak, irregular and fast atrial contractions can all be detected and measured. The aim of this project was to develop a machine learning model that could predict onset of atrial fibrillation, and that could classify ongoing atrial fibrillation. This was achieved by training one multiclass classification machine learning model using XGBoost, and three binary classification machine learning models using ROSETTA, on electrocardiogram recordings of people with and without atrial fibrillation. XGBoost is a tree boosting system which uses tree-like structures to classify data, while ROSETTA is a rule-based classification model which creates rules in an IF and THEN format to make decisions.
The recordings were labelled according to three different classes: no atrial fibrillation, atrial fibrillation or preceding atrial fibrillation. The XGBoost model had a prediction accuracy of 99.3%, outperforming the three ROSETTA models and other atrial fibrillation classification and prediction models found. The ROSETTA models had high accuracies on the learning set; however, the predictions were subpar, indicating faulty settings for this type of data. The results in this project indicate that the models created can be used to accurately classify and predict onset of and ongoing atrial fibrillation, serving as a tool for early detection and verification of diagnosis.
27

Recommendation System for Insurance Policies : An Investigation of Unsupervised and Supervised Learning Techniques

Palmgren, Andreas January 2023 (has links)
Recommendation systems have significantly influenced user experiences across various industries, yet their application in the insurance sector remains relatively unexplored. This thesis focuses on developing a car insurance recommendation system that implements a 'consumers like you' feature. The study initially employs a clustering-based recommendation system due to missing labels in an offline environment. However, challenges emerge, such as determining the optimal number of clusters and managing complex data. Additionally, the inability to effectively update based on feedback and lower predictive performance compared to supervised methods necessitated exploring supervised alternatives. In response, this thesis proposes a methodology where the unsupervised approach simulates consumer behavior in an offline environment. Supervised alternatives are pre-trained on the clustering-based system to replicate it and can then be fine-tuned based on live traffic. Three supervised alternatives (KNN, XGBoost, and a neural network) are developed and compared. Given the supervised recommendation system's adaptability based on feedback, supervised methods can provide more accurate, personalized recommendations in the insurance domain. The XGBoost and neural network-based recommendation systems were able to replicate the unsupervised approach, and their expressive power makes them valid candidate models to further evaluate on live traffic. The thesis concludes with the potential to both improve and adapt these recommendation systems to other insurance types, marking a significant step toward more personalized, user-friendly insurance services.
28

Improving House Price Prediction Models: Exploring the Impact of Macroeconomic Features

Holmqvist, Martin, Hansson, Max January 2023 (has links)
This thesis investigates whether house price prediction models perform better when macroeconomic features are added to a data set with only house-specific features. Previous research has shown that tree-based models perform well when predicting house prices, especially the algorithms random forest and XGBoost. It is common to rely entirely on house-specific features when training these models. However, studies show that macroeconomic variables such as interest rate, inflation, and GDP affect house prices. Therefore it makes sense to include them in these models and study whether they outperform the more traditional models with only house-specific features. The thesis also investigates which algorithm, random forest or XGBoost, is better at predicting house prices. The results show that the mean absolute error is lower for the XGBoost and random forest models trained on data with macroeconomic features. Furthermore, XGBoost outperformed random forest regardless of the set of features. In conclusion, the suggestion is to include macroeconomic features and use the XGBoost algorithm when predicting house prices.
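The comparison metric named in the abstract above, mean absolute error (MAE), can be sketched in a few lines. The prices and predictions below are made-up numbers for illustration only, and the variable names are assumptions; none of them come from the thesis.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up sale prices and two hypothetical sets of model predictions.
actual_prices = [200.0, 350.0, 500.0]
house_only_predictions = [220.0, 320.0, 540.0]   # house-specific features only
with_macro_predictions = [210.0, 340.0, 520.0]   # plus macroeconomic features

mae_house_only = mean_absolute_error(actual_prices, house_only_predictions)
mae_with_macro = mean_absolute_error(actual_prices, with_macro_predictions)
```

A lower MAE for the second set of predictions is the pattern the abstract reports; in this toy example it holds only because the numbers were chosen that way.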
29

Application of Machine Learning to Financial Trading

Horemuz, Michal January 2018 (has links)
Machine learning methods have become powerful tools used in multiple industries. They have been successfully applied to problems such as image recognition, speech recognition and machine translation, among others. In this report, we investigated several machine learning methods for forecasting five different bond indexes. We have implemented and analyzed Feedforward Neural Nets, LSTMs, Q-Networks and Gradient Boosted Trees, and compared them to the Buy&Hold strategy. We performed manual feature extraction based on some popular features used in the industry. The features were extracted from several financial instruments and were used as predictor variables. The results showed that XGBoost and Feedforward Neural Networks were consistently able to beat the Buy&Hold strategy for three of five bond indexes.
30

Automation of price prediction using machine learning in a large furniture company

Ghorbanali, Mojtaba January 2022 (has links)
The accurate prediction of product prices can be highly beneficial for procurers, both business-wise and production-wise. Many companies today, in various fields of operation and of various sizes, have access to vast amounts of data from which valuable information can be extracted. In this master thesis, several large databases of products in different categories have been analyzed. Because of confidentiality, the database labels used in this thesis have been replaced with general titles, and the real titles are not mentioned. The company is also not referred to by name, but all work was carried out on the real data set of products. As a real-world data set, the data was messy and full of nulls and missing values, so data wrangling took additional time. The approaches used for the model were regression methods and gradient boosting models. The main purpose of this master thesis was to build price prediction models based on the features of each item to assist with the initial positioning of the product and its initial price. The best result achieved during this master thesis came from an XGBoost machine learning model with about 96% accuracy, which can help the producer accelerate their pricing strategies.
