  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
181

Deep learning methods for speaker separation in reverberant conditions

Delfarah, Masood 16 October 2019 (has links)
No description available.
182

A Systematic Methodology for Developing Robust Prognostic Models Suitable for Large-Scale Deployment

Li, Pin 15 October 2020 (has links)
No description available.
183

Beating the odds : Machine Learning for football match prediction

Christoffersson, Emil January 2023 (has links)
This study aimed to compare the accuracy of machine learning models with the probabilities generated by sports betting companies in predicting the outcome of football matches. The study also investigated the impact of different feature combinations on the performance of machine learning models for predicting football match outcomes. The study used data from various sources of the Swedish football league between the seasons 2018-2022. The comparison between the model’s predictions and the probabilities generated by sports betting companies showed that the model’s predictions were more accurate. Support Vector Machines (SVM) performed the best with an accuracy of 52.4 percent compared to the betting companies at 40.4 percent. The results also showed that different feature combinations can have a significant impact on the performance of machine learning models for predicting football match outcomes but the importance of features varied depending on the selection method used. The study used four different feature selection approaches: filter methods, Lasso, Ridge, and PCA, to identify the most important features for prediction. Overall, the results of this study suggest that machine learning models can outperform sports betting companies in predicting football match outcomes and that the choice of feature combination can have a significant impact on model performance. Further research is needed to explore these findings in more detail and to investigate the usefulness of different feature selection techniques at different points in the season.
184

Spectral Band Selection for Ensemble Classification of Hyperspectral Images with Applications to Agriculture and Food Safety

Samiappan, Sathishkumar 15 August 2014 (has links)
In this dissertation, an ensemble non-uniform spectral feature selection and a kernel density decision fusion framework are proposed for the classification of hyperspectral data using a support vector machine classifier. Hyperspectral data has a large number of bands, and they are always highly correlated. To utilize its complete potential, a feature selection step is necessary. In an ensemble situation, there are mainly two challenges: (1) Creating a diverse set of classifiers in order to achieve a higher classification accuracy when compared to a single classifier. This can be achieved either by having different classifiers or by having different subsets of features for each classifier in the ensemble. (2) Designing a robust decision fusion stage to fully utilize the decisions produced by individual classifiers. This dissertation tests the efficacy of the proposed approach to classify hyperspectral data from different applications. Since these datasets have a small number of training samples with a larger number of highly correlated features, conventional feature selection approaches such as random feature selection cannot utilize the variability in the correlation level between bands to achieve diverse subsets for classification. In contrast, the approach proposed in this dissertation utilizes the variability in the correlation between bands by dividing the spectrum into groups and selecting bands from each group according to its size. The intelligent decision fusion proposed in this approach uses the probability density of training classes to produce a final class label. The experimental results demonstrate the validity of the proposed framework, which results in improvements in the overall, user, and producer accuracies compared to other state-of-the-art techniques. The experiments also demonstrate the ability of the proposed approach to produce more diverse feature selection than conventional approaches.
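The grouping idea in this abstract (divide the spectrum into groups, then select bands from each group according to its size) can be illustrated with a small sketch. This is not the dissertation's algorithm: the adjacent-band Pearson correlation rule, the `threshold` value, the proportional `fraction`, and the function names `group_bands` and `select_bands` are all assumptions made for illustration, and the pixel values are made up.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def group_bands(spectra, threshold=0.9):
    """Start a new group wherever adjacent-band correlation falls below threshold (assumed rule)."""
    n_bands = len(spectra[0])
    columns = [[pixel[b] for pixel in spectra] for b in range(n_bands)]
    groups = [[0]]
    for b in range(1, n_bands):
        if pearson(columns[b - 1], columns[b]) >= threshold:
            groups[-1].append(b)
        else:
            groups.append([b])
    return groups

def select_bands(groups, fraction=0.5, seed=0):
    """Randomly pick a share of bands from each group, proportional to group size (assumed rule)."""
    rng = random.Random(seed)
    chosen = []
    for group in groups:
        k = max(1, math.ceil(fraction * len(group)))
        chosen.extend(sorted(rng.sample(group, k)))
    return chosen

# Made-up spectra: 4 pixels x 5 bands. Bands 0-2 move together, as do bands 3-4.
spectra = [
    [1.0, 1.1, 0.9, 5.0, 5.2],
    [2.0, 2.1, 2.2, 1.0, 1.1],
    [3.0, 2.9, 3.1, 4.0, 3.9],
    [4.0, 4.2, 4.0, 2.0, 2.1],
]
groups = group_bands(spectra)
chosen = select_bands(groups, fraction=0.5, seed=0)
```

Calling `select_bands` with a different `seed` for each ensemble member is one simple way to give each classifier a different band subset while still covering every correlation group.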
185

Evaluating machine learning strategies for classification of large-scale Kubernetes cluster logs

Sarika, Pawan January 2022 (has links)
Kubernetes is a free, open-source container orchestration system for deploying and managing Docker containers that host microservices. Its cluster logs are extremely helpful in determining the root cause of a failure. However, as systems become more complex, locating failures becomes more difficult and time-consuming. This study aims to identify the classification algorithms that accurately classify the given log data and, at the same time, require fewer computational resources. Because the data is quite large, we begin with expert-based feature selection to reduce the data size. Following that, TF-IDF feature extraction is performed, and finally, we compare five classification algorithms, SVM, KNN, random forest, gradient boosting and MLP using several metrics. The results show that Random forest produces good accuracy while requiring fewer computational resources compared to other algorithms.
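As a rough illustration of the TF-IDF feature extraction step mentioned in this abstract, the sketch below computes term weights for a few log lines using only the standard library. The whitespace tokenizer, the unsmoothed `log(N / df)` weighting, the helper name `tfidf`, and the log lines themselves are assumptions for illustration; the thesis may use a different variant or a library implementation.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            # Term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Made-up log lines, for illustration only.
logs = [
    "pod failed to start",
    "pod started",
    "node failed health check",
]
weights = tfidf(logs)
```

Terms that appear in many log lines (here `pod` and `failed`) receive lower weights than terms that are specific to one line, which is what makes the resulting vectors useful as classifier input.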
186

Computational Approaches to Construct and Assess Knowledge Maps for Student Learning

Wang, Bao 19 July 2022 (has links)
No description available.
187

Enhancing Telecom Churn Prediction: Adaboost with Oversampling and Recursive Feature Elimination Approach

Tran, Long Dinh 01 June 2023 (has links) (PDF)
Churn prediction is a critical task for businesses to retain their valuable customers. This paper presents a comprehensive study of churn prediction in the telecom sector using 15 approaches, including popular algorithms such as Logistic Regression, Support Vector Machine, Decision Tree, Random Forest, and AdaBoost. The study is segmented into three sets of experiments, each focusing on a different approach to building the churn prediction model. The model is constructed using the original training set in the first set of experiments. The second set involves oversampling the training set to address the issue of imbalanced data. Lastly, the third set combines oversampling with recursive feature selection to enhance the model's performance further. The results demonstrate that the Adaptive Boost classifier, implemented with oversampling and recursive feature selection, outperforms the other 14 techniques. It achieves the highest rank in all three evaluation metrics: recall (0.841), f1-score (0.655), and roc_auc (0.793), further indicating that the proposed approach effectively predicts churn and provides valuable insights into customer behavior.
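The oversampling step described in this abstract can be sketched as plain random oversampling: minority-class rows are duplicated until every class matches the largest one. The abstract does not say which oversampling technique was used, so this is an assumed stand-in; the function name `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are the same size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels

# Toy training set (made-up values): 1 = churned, 0 = stayed.
X_train = [[1.0], [2.0], [3.0], [4.0], [10.0]]
y_train = [0, 0, 0, 0, 1]
X_bal, y_bal = random_oversample(X_train, y_train)
class_counts = Counter(y_bal)
```

As in the abstract, only the training set is resampled; the evaluation data should keep its original class balance so that recall, f1-score, and ROC AUC reflect realistic conditions.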
188

MOLECULAR PROFILING IN BREAST CANCER AND TOXICOGENOMICS

Liu, Jiangang 23 August 2011 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / This dissertation presents a body of research that attempts to tackle the ‘overfitting’ problem for gene signature and biomarker development in two different aspects (mechanistically and computationally). To achieve a deeper understanding of cancer molecular mechanisms, this study presents new approaches to derive gene signatures for various biological phenotypes, including breast cancer, in the context of well-defined and mechanistically associated biological pathways. We found that the pattern of gene expression in the cell cycle pathway can indeed serve as a powerful biomarker for breast cancer prognosis. We further built a predictive model for prognosis based on the cell cycle gene signature, and found our model to be more accurate than the Amsterdam 70-gene signature when tested with multiple gene expression datasets generated from several patient populations. Aside from demonstrating the effectiveness of dimensionality reduction, phenotypic dissection, and prognostic or diagnostic prediction, this approach also provides an alternative to the current methodology of identifying gene expression markers that link to biological mechanism. This dissertation also presents the development of a novel feature selection algorithm called Predictive Power Estimate Analysis (PPEA) to computationally tackle overfitting. The algorithm iteratively applies a two-way bootstrapping procedure to estimate the predictive power of each individual gene, and makes it possible to construct a predictive model from a much smaller set of genes with the highest predictive power. Using DrugMatrix™ rat liver data, we identified genomic biomarkers of hepatic specific injury for inflammation, cell death, and bile duct hyperplasia. We demonstrated that the signature genes were mechanistically related to the phenotype the signature intended to predict (e.g. 17 out of the top 20 genes for inflammation selected by PPEA were members of the NF-kB pathway, which is a key pre-inflammatory pathway for a xenobiotic response). The top 4 gene signature for BDH has been further validated by QPCR in a toxicology lab. This is important because our results suggest that the PPEA model not only largely deters the over-fitting problem, but also has the capability to elucidate mechanism(s) of drug action and / or of toxicity.
189

Feature Selection for High-risk Pattern Discovery in Medical Data

Li, Hua January 2012 (has links)
No description available.
190

Unsupervised Dimension Reduction Techniques for Lung Cancer Diagnosis Based on Radiomics

Kireta, Janet, Zahed, Mostafa, Dr. 25 April 2023 (has links)
One of the most pressing global health concerns is the impact of cancer, which remains a leading cause of death worldwide. The timeliness of detection and diagnosis is critical to maximizing the chances of successful treatment. Radiomics is an emerging approach to medical imaging analysis that refers to the high-throughput extraction of a large number of image features. Radiomics generally refers to the use of CT, PET, MRI or Ultrasound imaging as input data, extracting expressive features from massive image-based data, and then using machine learning or statistical models for quantitative analysis and prediction of disease. Feature reduction is critical in Radiomics, as a large number of quantitative features can have redundant characteristics that are not necessarily important in the analysis process. Due to the immense number of features obtained from radiological images, the main objective of our research is the application of machine learning techniques to reduce the number of dimensions, thereby rendering the data more manageable. Radiomics involves several steps, including imaging, segmentation, feature extraction, and analysis. Extracted features can be categorized as descriptions of tumor gray histograms, shape, texture features, and the tumor location and surrounding tissue. For this research, a large-scale CT dataset for lung cancer diagnosis (Lung-PET-CT-Dx), which was collected by scholars from Medical University in Harbin in China, is used to illustrate the dimension reduction techniques, which are a main part of the radiomics process, via R, SAS and Python. The proposed reduction and analysis techniques in our research will entail Principal Component Analysis, Clustering analysis (Hierarchical Clustering and K-means), and Manifold-based algorithms (Isometric Feature Mapping (ISOMAP)).
