Global ETD Search

111	Vehicular Traffic Flow Prediction Model Using Machine Learning-Based Model Wang, Jiahao 14 June 2021 (has links) Intelligent Transportation Systems (ITS) have attracted an increasing amount of attention in recent years. Thanks to the rapid development of vehicular computing hardware, vehicular sensors and citywide infrastructure, many impressive applications have been proposed under the topic of ITS, such as Vehicular Cloud (VC) and intelligent traffic controls. These applications can bring us a safer, more efficient, and more enjoyable transportation environment. However, these applications need an accurate and efficient traffic flow prediction system, which lets them deal with possible road situations in advance. To achieve better traffic flow prediction performance, many prediction methods have been proposed, such as mathematical modeling methods, parametric methods, and non-parametric methods. How to implement an efficient, robust and accurate vehicular traffic prediction system remains a hot topic. With the help of Machine Learning-based (ML) methods, especially Deep Learning-based (DL) methods, the accuracy of prediction models has increased. However, many open challenges remain in the real-world implementation of ML-based vehicular traffic prediction models. First, training a DL model takes far longer than training parametric models such as ARIMA and SARIMA. Second, it is still an open question how to capture the special relationship between road detectors, which is affected by geographic correlation as well as by changes over time. Last but not least, it is important to implement the prediction system in the real world while making use of the advanced technology applied in ITS to improve the prediction system itself. 
In our work, we focus on improving the features of the prediction model, which can be helpful for implementing the model in the real world. First, we introduce an optimization strategy for the training process of ML-based models in order to reduce its time cost. Second, we provide a new hybrid deep learning model that combines GCN with the deep aggregation structure (i.e., the sequence-to-sequence structure) of the GRU. Meanwhile, to solve the real-world prediction problem, i.e., the online prediction task, we provide a new online prediction strategy using refinement learning. To further improve the model's accuracy and efficiency when applied to ITS, we provide a parallel training strategy that exploits the benefits of the vehicular cloud structure. ITS Machine Learning
112	Machine learning for corporate failure prediction : an empirical study of South African companies Kornik, Saul January 2004 (has links) Includes bibliographical references (leaves 255-266). / The research objective of this study was to construct an empirical model for the prediction of corporate failure in South Africa through the application of machine learning techniques using information generally available to investors. The study began with a thorough review of the corporate failure literature, breaking the process of prediction model construction into the following steps: * Defining corporate failure * Sample selection * Feature selection * Data pre-processing * Feature Subset Selection * Classifier construction * Model evaluation These steps were applied to the construction of a model, using a sample of failed companies that were listed on the JSE Securities Exchange between 1 January 1996 and 30 June 2003. A paired sample of non-failed companies was selected. Pairing was performed on the basis of year of failure, industry and asset size (total assets per the company financial statements excluding intangible assets). A minimum of two years and a maximum of three years of financial data were collated for each company. Such data was mainly sourced from BFA McGregor RAID Station, although the BFA McGregor Handbook and JSE Handbook were also consulted for certain data items. A total of 75 financial and non-financial ratios were calculated for each year of data collected for every company in the final sample. Two databases of ratios were created - one for all companies with at least two years of data and another for those companies with three years of data. Missing and undefined data items were rectified before all the ratios were normalised. The set of normalised values was then imported into MatLab Version 6 and input into a Population-Based Incremental Learning (PBIL) algorithm. 
PBIL was then used to identify those subsets of features that best separated the failed and non-failed data clusters for a one, two and three year forward forecast period. Thornton's Separability Index (SI) was used to evaluate the degree of separation achieved by each feature subset. Machine Learning Financial Prediction
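The feature-subset search described in entry 112 can be illustrated with a minimal sketch. It assumes the usual definition of Thornton's Separability Index (the fraction of points whose nearest neighbour carries the same class label) and a basic form of PBIL (a probability vector nudged toward the best bit mask of each generation). The function names, population size, learning rate and clipping bounds are illustrative assumptions, not values taken from the thesis, which used MatLab rather than Python.

```python
import random

def separability_index(rows, labels):
    """Thornton's SI: share of points whose nearest neighbour has the same label."""
    same = 0
    for i, a in enumerate(rows):
        best_j, best_d = None, float("inf")
        for j, b in enumerate(rows):
            if i == j:
                continue  # a point is not its own neighbour
            d = sum((u - v) ** 2 for u, v in zip(a, b))  # squared Euclidean distance
            if d < best_d:
                best_j, best_d = j, d
        same += labels[best_j] == labels[i]
    return same / len(rows)

def pbil_select(rows, labels, generations=30, pop_size=20, lr=0.1, seed=0):
    """Search for a feature mask with high SI (all hyperparameters are assumed)."""
    rng = random.Random(seed)
    prob = [0.5] * len(rows[0])  # probability of including each feature
    best_mask, best_score = None, -1.0
    for _ in range(generations):
        gen_mask, gen_score = None, -1.0
        for _ in range(pop_size):
            mask = [rng.random() < p for p in prob]
            if not any(mask):
                continue  # skip empty feature subsets
            subset = [[x for x, keep in zip(r, mask) if keep] for r in rows]
            score = separability_index(subset, labels)
            if score > gen_score:
                gen_mask, gen_score = mask, score
        if gen_mask is None:
            continue
        if gen_score > best_score:
            best_mask, best_score = gen_mask, gen_score
        # Shift the probability vector toward this generation's best mask,
        # clipped so that no feature becomes permanently fixed.
        prob = [min(0.95, max(0.05, (1 - lr) * p + lr * bit))
                for p, bit in zip(prob, gen_mask)]
    return best_mask, best_score
```

Calling `pbil_select(rows, labels)` on a list of normalised ratio vectors and failed/non-failed labels returns a boolean mask over the ratios together with the SI that mask achieved.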
113	Biomedical Semantic Embeddings: Using Hybrid Sentences to Construct Biomedical Word Embeddings and its Applications Shaik, Arshad 12 1900 (has links) Word embeddings are a useful method that has shown enormous success in various NLP tasks, not only in the open domain but also in the biomedical domain. The biomedical domain provides various domain-specific resources and tools that can be exploited to improve the performance of these word embeddings. However, most of the research related to word embeddings in the biomedical domain focuses on analysis of model architecture, hyper-parameters and input text. In this paper, we use SemMedDB to design new sentences called 'Semantic Sentences'. Then we use these sentences in addition to biomedical text as inputs to the word embedding model. This approach aims at introducing biomedical semantic types, defined by UMLS, into the vector space of word embeddings. The semantically rich word embeddings presented here rival state-of-the-art biomedical word embeddings in both semantic similarity and relatedness metrics up to 11%. We also demonstrate how these semantic types in word embeddings can be utilized. machine learning word embeddings
114	Computational modeling of learning in complex problem solving tasks Dandurand, Frédéric. January 2007 (has links) No description available. Problem solving. Machine learning.
115	Model Averaging: Methods and Applications Simardone, Camille January 2021 (has links) This thesis focuses on a leading approach for handling model uncertainty: model averaging. I examine the performance of model averaging compared to conventional econometric methods and to more recent machine learning algorithms, and demonstrate how model averaging can be applied to empirical problems in economics. It comprises three chapters. Chapter 1 evaluates the performance of frequentist model averaging (FMA) relative to individual models, model selection, and three popular machine learning algorithms – bagging, boosting, and the post-lasso – in terms of their mean squared error (MSE). I find that model averaging performs well compared to these other methods in Monte Carlo simulations in the presence of model uncertainty. Additionally, using the National Longitudinal Survey, I use each method to estimate returns to education to demonstrate how easily model averaging can be adopted by empirical economists, with a novel emphasis on the set of candidate models that are averaged. This chapter makes three contributions: focusing on FMA rather than the more popular Bayesian model averaging; examining FMA compared to machine learning algorithms; and providing an illustrative application of FMA to empirical labour economics. Chapter 2 expands on Chapter 1 by investigating different approaches for constructing a set of candidate models to be used in model averaging – an important, yet often overlooked step. Ideally, the candidate model set should balance model complexity, breadth, and computational efficiency. Three promising approaches – model screening, recursive partitioning-based algorithms, and methods that average over nonparametric models – are discussed and their relative performance in terms of MSE is assessed via simulations. 
Additionally, certain heuristics necessary for empirical researchers to employ the recommended approach for constructing the candidate model set in their own work are described in detail. Chapter 3 applies the methods discussed in depth in earlier chapters to currently timely microdata. I use model selection, model averaging, and the lasso along with data from the Canadian Labour Force Survey to determine which method is best suited for assessing the impacts of the COVID-19 pandemic on the employment of parents with young children in Canada. I compare each model and method using classification metrics, including correct classification rates and receiver operating characteristic curves. I find that the models selected by model selection and model averaging and the lasso model perform better in terms of classification compared to the simpler parametric model specifications that have recently appeared in the literature, which suggests that empirical researchers should consider statistical methods for the choice of model rather than relying on ad hoc selection. Additionally, I estimate the marginal effect of sex on the probability of being employed and find that the results differ in magnitude across models in an economically important way, as these results could affect policies for post-pandemic recovery. / Thesis / Doctor of Philosophy (PhD) / This thesis focuses on model averaging, a leading approach for handling model uncertainty, which is the likelihood that one’s econometric model is incorrectly specified. I examine the performance of model averaging compared to conventional econometric methods and to more recent machine learning algorithms in simulations and applied settings, and show how easily model averaging can be applied to empirical problems in economics. This thesis makes a number of contributions to the literature. First, I focus on frequentist model averaging instead of Bayesian model averaging, which has been studied more extensively. 
Second, I use model averaging in empirical problems, such as estimating the returns to education and using model averaging with COVID-19 data. Third, I compare model averaging to machine learning, which is becoming more widely used in economics. Finally, I focus attention on different approaches for constructing the set of candidate models for model averaging, an important yet often overlooked step. Econometrics Machine Learning
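As a rough illustration of what averaging over a candidate model set means in entry 115, the sketch below weights each candidate by a smoothed-AIC rule, one common frequentist weighting scheme. The abstract does not say which weights the thesis uses, so the weighting rule, the function names and the numbers in the usage note are assumptions made purely for illustration.

```python
import math

def smoothed_aic_weights(aic_values):
    """Weight each candidate model by exp(-AIC/2), normalised to sum to 1."""
    best = min(aic_values)  # subtract the minimum for numerical stability
    raw = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(raw)
    return [r / total for r in raw]

def average_prediction(predictions, weights):
    """Combine per-model predictions for one observation into a weighted average."""
    return sum(w * p for w, p in zip(weights, predictions))
```

For instance, `average_prediction([1.0, 3.0], smoothed_aic_weights([100.0, 100.0]))` weights two equally well-fitting models equally, whereas a model with a lower AIC would receive a larger share of the weight.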
116	Predicting survival status of lung cancer patients using machine learning Mohan, Aishwarya January 2021 (has links) The 5-year survival rate of patients with metastasized non-small cell lung cancer (NSCLC) who received chemotherapy was less than 5% (Kathryn C. Arbour, 2019). Our ability to provide the survival status of a patient, i.e. alive or dead, at any time in the future is important from at least two standpoints: a) from a clinical standpoint, it enables clinicians to provide optimal delivery of healthcare, and b) from a personal standpoint, it provides the patient’s family with opportunities to plan their life ahead and potentially cope with the emotional aspect of loss of life. / Thesis / Master of Applied Science (MASc) Lung cancer, machine learning
117	ATTENTIVE MULTI-BRANCH ENCODER-DECODER NETWORK FOR ADHERENT OBSTRUCTION REMOVAL Cao, Yuanming January 2023 (has links) With the rapid development of image hardware, outdoor computer vision systems, for instance, surveillance cameras, have been extensively utilized for various applications. These systems are typically equipped with a protective glass layer installed in front of the camera. However, during inclement weather conditions, images captured through such glass often suffer from obstructions adhering to its surface, such as raindrops or dust particles. Consequently, this leads to a degradation in image quality, which significantly affects the performance of the system. Existing obstruction removal algorithms attempt to resolve these issues using deep learning techniques with synthetic data, which may not achieve a good visual result for complex real-world situations. To solve this, some studies employ real-world data. However, they tend to focus on a single type of obstruction, such as raindrops. This thesis addresses the more challenging task of restoring images taken through glass surfaces, which are impacted by various adherent obstructions such as dirt, raindrops, muddy raindrops, and other small foreign particles commonly found in real-life scenarios, including stone fragments and leaf particles. This work introduces an encoder-decoder network that incorporates auxiliary learning and an attention mechanism. During the testing phase, the auxiliary branch updates the shared internal hyperparameters of the model, enabling it to restore images affected not only by obstruction categories known from the training dataset but also by unseen ones. To better accommodate real-world situations, this work presents a dataset comprising real-world adherent obstruction pairs, which covers a large variety of common obstructions along with their corresponding clean ground truth images. 
Experimental results indicate that the proposed technique outperforms many existing methods in both quantitative and qualitative assessments. / Thesis / Master of Applied Science (MASc) Image restoration Machine Learning
118	Advances to Convolutional Neural Network Architectures for Prediction and Classification with Applications in the First Dimensional Space Kim, Hae Jin 08 1900 (has links) In the vast field of signal processing, machine learning is rapidly expanding its domain into all realms. As a constituent of this expansion, this thesis presents contributive work on advancements in machine learning algorithms by building on the shoulders of giants. The first chapter of this thesis contains enhancements to a CNN (convolutional neural network) for better classification of heartbeat arrhythmia. The network goes through a two-stage development, the first stage being augmentations to the network and the second being the implementation of dropout. Chapter 2 involves the combination of CNN and LSTM (long short-term memory) networks for the task of short-term energy use data regression. Exploiting the benefits of two of the most powerful neural networks, a unique, novel neural network is created to effectually predict future energy use. The final section concludes this work with directions for future work. Machine Learning CNN LSTM
119	Materials Design with Machine Learning Benlolo, Ian 27 October 2023 (has links) In the quest to advance materials design, this thesis integrates Machine Learning (ML) techniques with Density Functional Theory (DFT) data. A novel representation called splashdown is formulated to capture long-range interactions, an aspect often neglected by material representations. A project known as ORGANIZER leads to the creation of a pivotal database, culminating in the discovery of a new organic solid-state lasing molecule that doubled the state-of-the-art emission gain cross-section. Concurrently, a Monte Carlo-based optimizer, aMC, is tested, demonstrating superior performance to gradient-based methods without the need for expensive gradient computation. Enhanced Graph Neural Networks (GNNs) predict High Entropy Alloy (HEA) catalysts for the oxygen reduction reaction, halving the necessary DFT computations and unveiling a new HEA catalyst with a 0.27V overpotential. The splashdown representation compares to state-of-the-art ones like MBTR and SOAP in predicting long-range interactions. Collectively, these efforts highlight the transformative potential of ML and some adjacent fields in materials science. Machine Learning Materials Design
120	Machine-Learning-Assisted Test Generation to Characterize Failures for Cyber-Physical Systems Chandar, Abhishek 17 July 2023 (has links) With the advancements in Internet of Things (IoT) and innovations in the networking domain, Cyber-Physical Systems (CPS) are rapidly being adopted in various domains, from autonomous vehicles to manufacturing systems, to improve the efficiency of the overall development of complex physical systems. CPS models allow an easy and cost-effective approach to altering the architecture of the system so that it yields optimal performance. This is especially crucial in the early stages of development of a physical system. Developing effective testing strategies to test CPS models is necessary to ensure that there are no defects during the execution of the system. Typically, a set of requirements is defined from domain expertise to assert the system's behavior on different possible inputs. To test CPS effectively, their performance must be observed on a large and varied set of test inputs. But real-world CPS models are compute-intensive (i.e., executing the CPS for a given test input takes a significant amount of time). Therefore, it is almost impossible to execute CPS models over a large number of test inputs. This leads to sub-optimal fixes based on the identified defects, which may lead to costly issues at later stages of development. In this thesis, we aim to improve the efficiency of existing search-based software testing approaches to test compute-intensive CPS by combining them with ML. We call this ML-assisted test generation. In this work, we investigate two alternative ML-assisted test generation techniques: (1) surrogate-assisted and (2) ML-guided test generation, to efficiently test a given CPS model. Both surrogate-assisted and ML-guided test generation can generate many test inputs. Therefore, we propose to build failure models that generate explainable rules on failure-inducing test inputs of the CPS model. 
Surrogate-assisted test generation involves using ML as a replacement for the CPS under test so that the fitness values of some test inputs are predicted rather than obtained by executing the CPS. A large number of test inputs are generated by combining cheap surrogate predictions and compute-intensive execution of the CPS model to find the labels of the test inputs. Specifically, we propose a new surrogate-assisted test generation technique that leverages multiple surrogate models simultaneously and dynamically selects the prediction from the most accurate label. Alternatively, ML-guided test generation aims to estimate the boundary regions that separate test inputs that pass the requirements from test inputs that fail the requirements, and subsequently guides the sampling of test inputs from these boundary regions. Further, the test data generated by the ML-assisted test generation techniques are used to infer two alternative failure models, namely the Decision Rule Model (DRM) and the Decision Tree Model (DTM), that characterize the failure circumstances of the CPS model. We conduct an empirical evaluation of the accuracy of failure models inferred from test data generated by both ML-assisted test generation techniques. Using a total of 15 different functional requirements from 5 Simulink-based benchmark CPS, we observed that the proposed dynamic surrogate-assisted test generation technique generates failure models with an average accuracy of 83% for DRM and 90% for DTM. Compared to the random search baseline, the dynamic surrogate-assisted technique improves the average accuracy of DRM by 16.9% and of DTM by 7.1%. Software Verification Machine Learning
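The idea of mixing cheap surrogate predictions with expensive executions in entry 120 can be sketched as follows. This is only one possible reading of the abstract: the rule for choosing the surrogate (lowest accumulated absolute error), the warm-up phase, the `margin` around the pass/fail `threshold`, and the convention that a fitness at or above the threshold means "pass" are all assumptions, and `simulate` is a hypothetical stand-in for running the real CPS model.

```python
def label_inputs(inputs, simulate, surrogates, threshold=0.0, margin=0.5, warmup=3):
    """Label test inputs pass/fail, calling the expensive `simulate` only when needed."""
    errors = [0.0] * len(surrogates)  # accumulated absolute error per surrogate
    labels, simulated = [], 0
    for k, x in enumerate(inputs):
        # Dynamically pick the surrogate that has been most accurate so far.
        best = min(range(len(surrogates)), key=lambda i: errors[i])
        guess = surrogates[best](x)
        if k >= warmup and abs(guess - threshold) > margin:
            fitness = guess  # prediction is far from the boundary: trust it
        else:
            fitness = simulate(x)  # expensive execution of the system under test
            simulated += 1
            for i, s in enumerate(surrogates):
                errors[i] += abs(s(x) - fitness)  # update every surrogate's track record
        labels.append(fitness >= threshold)  # assumed convention: True means "pass"
    return labels, simulated
```

With `simulate = lambda x: x - 5.0`, one accurate and one inaccurate surrogate, and inputs 0 to 10, only the warm-up inputs and the input on the pass/fail boundary are executed; the rest are labelled from the accurate surrogate's predictions.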

Search results