Global ETD Search

111	Vehicular Traffic Flow Prediction Model Using Machine Learning-Based Model Wang, Jiahao 14 June 2021 (has links) Intelligent Transportation Systems (ITS) have attracted an increasing amount of attention in recent years. Thanks to the fast development of vehicular computing hardware, vehicular sensors and citywide infrastructures, many impressive applications have been proposed under the topic of ITS, such as Vehicular Cloud (VC), intelligent traffic controls, etc. These applications can bring us a safer, more efficient, and also more enjoyable transportation environment. However, these applications need an accurate and efficient traffic flow prediction system, which lets them deal with possible road situations in advance. To achieve better traffic flow prediction performance, many prediction methods have been proposed, such as mathematical modeling methods, parametric methods, and non-parametric methods. How to implement an efficient, robust and accurate vehicular traffic prediction system remains an active research topic. With the help of Machine Learning-based (ML) methods, especially Deep Learning-based (DL) methods, the accuracy of the prediction model is increased. However, we also noticed that there are still many open challenges in the real-world implementation of ML-based vehicular traffic prediction models. First, the time consumption for DL model training is relatively large compared to parametric models, such as ARIMA, SARIMA, etc. Second, how to capture the spatial relationship between road detectors, which is affected by geographic correlation as well as by changes over time, is still an open question in road traffic prediction. Last but not least, it is important for us to implement the prediction system in the real world; meanwhile, we should find a way to make use of the advanced technology applied in ITS to improve the prediction system itself. 
In our work, we focus on improving the features of the prediction model, which can be helpful for implementing the model in the real world. First, we introduce an optimization strategy for the training process of ML-based models, in order to reduce the time cost of this process. Second, we provide a new hybrid deep learning model by using GCN and the deep aggregation structure (i.e., the sequence to sequence structure) of the GRU. Meanwhile, in order to solve the real-world prediction problem, i.e., the online prediction task, we provide a new online prediction strategy by using refinement learning. In order to further improve the model's accuracy and efficiency when applied to ITS, we provide a parallel training strategy by using the benefits of the vehicular cloud structure. ITS Machine Learning
112	Machine learning for corporate failure prediction : an empirical study of South African companies Kornik, Saul January 2004 (has links) Includes bibliographical references (leaves 255-266). / The research objective of this study was to construct an empirical model for the prediction of corporate failure in South Africa through the application of machine learning techniques using information generally available to investors. The study began with a thorough review of the corporate failure literature, breaking the process of prediction model construction into the following steps: (1) defining corporate failure; (2) sample selection; (3) feature selection; (4) data pre-processing; (5) feature subset selection; (6) classifier construction; (7) model evaluation. These steps were applied to the construction of a model, using a sample of failed companies that were listed on the JSE Securities Exchange between 1 January 1996 and 30 June 2003. A paired sample of non-failed companies was selected. Pairing was performed on the basis of year of failure, industry and asset size (total assets per the company financial statements excluding intangible assets). A minimum of two years and a maximum of three years of financial data were collated for each company. Such data was mainly sourced from BFA McGregor RAID Station, although the BFA McGregor Handbook and JSE Handbook were also consulted for certain data items. A total of 75 financial and non-financial ratios were calculated for each year of data collected for every company in the final sample. Two databases of ratios were created - one for all companies with at least two years of data and another for those companies with three years of data. Missing and undefined data items were rectified before all the ratios were normalised. The set of normalised values was then imported into MatLab Version 6 and input into a Population-Based Incremental Learning (PBIL) algorithm. 
PBIL was then used to identify those subsets of features that best separated the failed and non-failed data clusters for a one, two and three year forward forecast period. Thornton's Separability Index (SI) was used to evaluate the degree of separation achieved by each feature subset. Machine Learning Financial Prediction
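The separability measure mentioned in record 112 can be illustrated with a short sketch. It assumes the usual definition of Thornton's Separability Index: the fraction of data points whose nearest neighbour (here by Euclidean distance) carries the same class label. The function name, the toy points and the labels are hypothetical and are not taken from the thesis.

```python
import math


def separability_index(points, labels):
    """Fraction of points whose nearest neighbour shares their class label."""
    if len(points) != len(labels) or len(points) < 2:
        raise ValueError("need at least two labelled points")
    matches = 0
    for i, point in enumerate(points):
        # Index of the closest other point (ties resolve to the lowest index).
        nearest = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(point, points[j]),
        )
        if labels[nearest] == labels[i]:
            matches += 1
    return matches / len(points)


# Hypothetical two-ratio feature subset with two well-separated clusters.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
labels = ["failed", "failed", "failed", "non-failed", "non-failed", "non-failed"]
print(separability_index(points, labels))
```

A feature subset search such as PBIL could use a score of this kind as its fitness value: an index near 1 means the two classes form distinct clusters in the chosen feature space, while an index near 0.5 for two balanced classes means they are thoroughly mixed.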
113	Computational modeling of learning in complex problem solving tasks Dandurand, Frédéric. January 2007 (has links) No description available. Problem solving. Machine learning.
114	Model Averaging: Methods and Applications Simardone, Camille January 2021 (has links) This thesis focuses on a leading approach for handling model uncertainty: model averaging. I examine the performance of model averaging compared to conventional econometric methods and to more recent machine learning algorithms, and demonstrate how model averaging can be applied to empirical problems in economics. It comprises three chapters. Chapter 1 evaluates the performance of frequentist model averaging (FMA) relative to individual models, model selection, and three popular machine learning algorithms – bagging, boosting, and the post-lasso – in terms of their mean squared error (MSE). I find that model averaging performs well compared to these other methods in Monte Carlo simulations in the presence of model uncertainty. Additionally, using the National Longitudinal Survey, I use each method to estimate returns to education to demonstrate how easily model averaging can be adopted by empirical economists, with a novel emphasis on the set of candidate models that are averaged. This chapter makes three contributions: focusing on FMA rather than the more popular Bayesian model averaging; examining FMA compared to machine learning algorithms; and providing an illustrative application of FMA to empirical labour economics. Chapter 2 expands on Chapter 1 by investigating different approaches for constructing a set of candidate models to be used in model averaging – an important, yet often overlooked step. Ideally, the candidate model set should balance model complexity, breadth, and computational efficiency. Three promising approaches – model screening, recursive partitioning-based algorithms, and methods that average over nonparametric models – are discussed and their relative performance in terms of MSE is assessed via simulations. 
Additionally, certain heuristics necessary for empirical researchers to employ the recommended approach for constructing the candidate model set in their own work are described in detail. Chapter 3 applies the methods discussed in depth in earlier chapters to currently timely microdata. I use model selection, model averaging, and the lasso along with data from the Canadian Labour Force Survey to determine which method is best suited for assessing the impacts of the COVID-19 pandemic on the employment of parents with young children in Canada. I compare each model and method using classification metrics, including correct classification rates and receiver operating characteristic curves. I find that the models selected by model selection and model averaging and the lasso model perform better in terms of classification compared to the simpler parametric model specifications that have recently appeared in the literature, which suggests that empirical researchers should consider statistical methods for the choice of model rather than relying on ad hoc selection. Additionally, I estimate the marginal effect of sex on the probability of being employed and find that the results differ in magnitude across models in an economically important way, as these results could affect policies for post-pandemic recovery. / Thesis / Doctor of Philosophy (PhD) / This thesis focuses on model averaging, a leading approach for handling model uncertainty, which is the likelihood that one’s econometric model is incorrectly specified. I examine the performance of model averaging compared to conventional econometric methods and to more recent machine learning algorithms in simulations and applied settings, and show how easily model averaging can be applied to empirical problems in economics. This thesis makes a number of contributions to the literature. First, I focus on frequentist model averaging instead of Bayesian model averaging, which has been studied more extensively. 
Second, I use model averaging in empirical problems, such as estimating the returns to education and using model averaging with COVID-19 data. Third, I compare model averaging to machine learning, which is becoming more widely used in economics. Finally, I focus attention on different approaches for constructing the set of candidate models for model averaging, an important yet often overlooked step. Econometrics Machine Learning
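As a rough illustration of the idea behind record 114, the sketch below averages predictions from several candidate models. The abstract does not say which weighting scheme the thesis uses; the smoothed-AIC weights shown here are one common frequentist choice and are an assumption, as are the function names and the numbers.

```python
import math


def smoothed_aic_weights(aic_values):
    """Weight each candidate model by exp(-AIC/2), normalised to sum to 1."""
    best = min(aic_values)
    # Subtracting the best AIC keeps the exponentials numerically stable.
    raw = [math.exp(-(aic - best) / 2) for aic in aic_values]
    total = sum(raw)
    return [value / total for value in raw]


def average_prediction(predictions, weights):
    """Combine per-model predictions into one weighted prediction."""
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical AIC values and point predictions from three candidate models.
weights = smoothed_aic_weights([100.0, 102.0, 110.0])
print(weights)
print(average_prediction([0.08, 0.10, 0.15], weights))
```

Model selection corresponds to putting all the weight on the single best model, whereas averaging lets several plausible specifications contribute to the estimate.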
115	Predicting survival status of lung cancer patients using machine learning Mohan, Aishwarya January 2021 (has links) The 5-year survival rate of patients with metastasized non-small cell lung cancer (NSCLC) who received chemotherapy was less than 5% (Kathryn C. Arbour, 2019). Our ability to provide the survival status of a patient, i.e., alive or dead at any time in the future, is important from at least two standpoints: a) from a clinical standpoint, it enables clinicians to provide optimal delivery of healthcare, and b) from a personal standpoint, it provides the patient’s family with opportunities to plan their life ahead and potentially cope with the emotional aspect of loss of life. / Thesis / Master of Applied Science (MASc) Lung cancer, machine learning
116	ATTENTIVE MULTI-BRANCH ENCODER-DECODER NETWORK FOR ADHERENT OBSTRUCTION REMOVAL Cao, Yuanming January 2023 (has links) With the rapid development of image hardware, outdoor computer vision systems, for instance, surveillance cameras, have been extensively utilized for various applications. These systems are typically equipped with a protective glass layer installed in front of the camera. However, during inclement weather conditions, images captured through such glass often suffer from obstructions adhering to its surface, such as raindrops or dust particles. Consequently, this leads to a degradation in image quality, which significantly affects the performance of the system. Existing obstruction removal algorithms attempt to resolve these issues using deep learning techniques with synthetic data, which may not achieve a good visual result for complex real-world situations. To solve this, some studies employ real-world data. However, they tend to focus on a single type of obstruction, such as raindrops. This thesis addresses the more challenging task of restoring images taken through glass surfaces, which are impacted by various adherent obstructions such as dirt, raindrops, muddy raindrops, and other small foreign particles commonly found in real-life scenarios, including stone fragments and leaf particles. This work introduces an encoder-decoder network that incorporates auxiliary learning and an attention mechanism. During the testing phase, the auxiliary branch updates the shared internal hyperparameters of the model, enabling it to restore images affected not only by known categories of obstructions from the training dataset but also by unseen ones. To better accommodate real-world situations, this work presents a dataset comprising real-world adherent obstruction pairs, which covers a large variety of common obstructions along with their corresponding clean ground truth images. 
Experimental results indicate that the proposed technique outperforms many existing methods in both quantitative and qualitative assessments. / Thesis / Master of Applied Science (MASc) Image restoration Machine Learning
117	Materials Design with Machine Learning Benlolo, Ian 27 October 2023 (has links) In the quest to advance materials design, this thesis integrates Machine Learning (ML) techniques with Density Functional Theory (DFT) data. A novel representation called splashdown is formulated to capture long-range interactions, an aspect often neglected by material representations. A project known as ORGANIZER leads to the creation of a pivotal database, culminating in the discovery of a new organic solid-state lasing molecule that doubled the state-of-the-art emission gain cross-section. Concurrently, a Monte Carlo-based optimizer, aMC, is tested, demonstrating superior performance to gradient-based methods without the need for expensive gradient computation. Enhanced Graph Neural Networks (GNNs) predict High Entropy Alloy (HEA) catalysts for the oxygen reduction reaction, halving necessary DFT computations and unveiling a new HEA catalyst with a 0.27V overpotential. The splashdown representation compares to state-of-the-art ones like MBTR and SOAP in predicting long-range interactions. Collectively, these efforts highlight the transformative potential of ML and some adjacent fields in materials science. Machine Learning Materials Design
118	Machine-Learning-Assisted Test Generation to Characterize Failures for Cyber-Physical Systems Chandar, Abhishek 17 July 2023 (has links) With the advancements in Internet of Things (IoT) and innovations in the networking domain, Cyber-Physical Systems (CPS) are rapidly adopted in various domains from autonomous vehicles to manufacturing systems to improve the efficiency of the overall development of complex physical systems. CPS models allow an easy and cost-effective approach to alter the architecture of the system that yields optimal performance. This is especially crucial in the early stages of development of a physical system. Developing effective testing strategies to test CPS models is necessary to ensure that there are no defects during the execution of the system. Typically, a set of requirements are defined from the domain expertise to assert the system's behavior on different possible inputs. To effectively test CPS, a large number of test inputs is required to observe their performance on a variety of test inputs. But real-world CPS models are compute-intensive (i.e. takes a significant amount of time to execute the CPS for a given test input). Therefore, it is almost impossible to execute CPS models over a large number of test inputs. This leads to sub-optimal fixes based on the identified defects which may lead to costly issues at later stages of development. In this thesis, we aim to improve the efficiency of existing search-based software testing approaches to test compute-intensive CPS by combining them with ML. We call these ML-assisted test generation. In this work, we investigate two alternate ML-assisted test generation techniques: (1) surrogate-assisted and (2) ML-guided test generation, to efficiently test a given CPS model. Both the surrogate-assisted and ML-guided test generation can generate many test inputs. Therefore, we propose to build failure models that generate explainable rules on failure-inducing test inputs of the CPS model. 
Surrogate-assisted test generation involves using ML as a replacement for the CPS under test so that the fitness values of some test inputs are predicted rather than computed by executing the CPS. A large number of test inputs are generated by combining cheap surrogate predictions and compute-intensive execution of the CPS model to find the labels of the test inputs. Specifically, we propose a new surrogate-assisted test generation technique that leverages multiple surrogate models simultaneously and dynamically selects the prediction from the most accurate label. Alternatively, ML-guided test generation aims to estimate the boundary regions that separate test inputs that pass the requirements and test inputs that fail the requirements and subsequently guide the sampling of test inputs from these boundary regions. Further, the test data generated by the ML-assisted test generation techniques are used to infer two alternative failure models, namely the Decision Rule Model (DRM) and the Decision Tree Model (DTM), that characterize the failure circumstances of the CPS model. We conduct an empirical evaluation of the accuracy of failure models inferred from test data generated by both ML-assisted test generation techniques. Using a total of 15 different functional requirements from 5 Simulink-based benchmark CPS, we observed that the proposed dynamic surrogate-assisted test generation technique generates failure models with an average accuracy of 83% for DRM and 90% for DTM. Compared to the random search baseline, the dynamic surrogate-assisted technique improves the average accuracy of DRM by 16.9% and the average accuracy of DTM by 7.1%. Software Verification Machine Learning
119	Characterization of the promotion, adverse events, and regulation related to synthetic nicotine products on social media: a multiplatform content analysis using topic modeling Shah, Neal 08 March 2024 (has links) Objective: Social media has been implicated as a leading driver of the youth vaping epidemic in the United States. Despite the recent proliferation of synthetic nicotine products in the marketplace, there is limited understanding of the promotion, health risks, and regulatory policy associated with these products. We aim to identify and characterize posts on Instagram and Twitter related to the promotion of synthetic nicotine products, self-reporting of adverse events following synthetic nicotine product use, and also identify discussion topics related to the regulation, health policy, and education about synthetic nicotine. Methods: We conducted a hashtag and keyword search on Instagram and Twitter, respectively, to collect posts related to synthetic nicotine products. We then analyzed this data utilizing multilanguage BERT analysis, a pre-trained supervised topic modeling algorithm, to sort the dataset into clusters grouped based on textual similarity. After this, we manually annotated the most representative posts corresponding to each topic cluster using a codebook associated with characteristics of interest and categorized clusters by their coherence to themes of promotion, adverse events, and regulation. Results: A total of 14,651 Instagram posts and 24,081 Twitter posts were collected from the keyword search. After an intermediary data cleaning phase to remove posts which could not be recognized by topic modeling, the multilanguage BERT topic model thematically clustered 49.4% (n=6034) of Instagram posts and 46.0% (n=3200) of tweets. After manual content analysis, we detected 52.9% (n=3193) of Instagram posts and 30.4% (n=972) of tweets that were in clusters thematically related to our study aims. 
The most representative theme on Instagram was synthetic nicotine electronic nicotine delivery systems (ENDS) promotion, with 91.6% (n=2924) of posts belonging to that thematic cluster, followed by regulation and policy related posts (8.4%, n=268). In comparison, the most representative theme on Twitter related to self-reporting of adverse events (50.2%, n=488), followed by promotion (39.8%, n=387), and regulation and policy (10.0%, n=97). Manual annotation of the most representative posts within these clusters showed that Instagram clusters had a higher level of coherence with their respective themes than Twitter clusters. A qualitative sub-analysis found that most of the synthetic nicotine ENDS promotion activity was more specifically related to the selling of synthetic nicotine ENDS products by different vendors. Conclusion: Despite platform prohibitions against the marketing and sale of various tobacco products, there remains significant user-generated content related to synthetic nicotine products on Instagram and Twitter, with most posts related to the promotion and sales of synthetic nicotine ENDS products. Most of the synthetic nicotine ENDS content in our dataset on Instagram is closely related to the theme of ENDS product promotion, with little discussion of regulation and adverse events. On Twitter, synthetic nicotine ENDS content is more heterogeneous, with significant discussion of adverse events following synthetic nicotine ENDS use along with similar promotion of synthetic nicotine ENDS products. Further research is needed to better understand the acute health risks unique to synthetic nicotine products and whether or not these public health challenges are exacerbated due to unregulated and illegal promotion and sale of these products via social media. / 2025-03-08T00:00:00Z Public health Machine learning
120	Machine Learning for Classification of Pediatric Concussion Recovery Stages Anderson, Lauren January 2021 (has links) Mild traumatic brain injury (mTBI), or concussion, results from sudden acceleration or deceleration of the brain and subsequent complex tissue propagation of shock waves that disrupt structure and function. Concussions can cause many symptoms including headache, dizziness, and difficulty concentrating. These can be detrimental to children, affecting their participation in school, sport, and social activities. Therefore, return to school (RTS) and return to activity (RTA) protocols have been developed to help safely return children to these activities without risking further injury. The goal of this study was to develop machine learning (ML) algorithms to predict RTA and RTS stages, that can easily be incorporated into a smartphone application (APP). Ideally this would assist children in tracking and determining their RTA and RTS progression leading them to a safe and timely return. Support vector machine classifier (SVC) and random forest (RF) algorithms were developed to predict RTA/RTS stages. Both were modeled on previously acquired data, and on newly acquired data, and results were compared. Models were trained and tested using accelerometry and symptom data from pediatric concussion patients. A sliding window technique and feature extraction were performed on raw acceleration data to extract suitable features, which were combined with yes/no symptom recordings as ML inputs. The dataset consisted of 67 participants aged 10 to 18, 42 female and 25 male, with a total of 844408 samples. The best results for RTS prediction showed average accuracy of 83% for RF and 66% for SVC. For RTA predictions, the best results had average accuracy of 60% for RF and 58% for SVC. For new data, RTS predictions showed an accuracy of 45% for RF and 41% for SVC. RTA predictions had an accuracy of 35% for RF and 30% for SVC. 
RF models had superior performance on all data. These results show that predicting RTA/RTS is possible with ML. However, improvements to these models can be made by training on more data prior to APP implementation. More data is needed, as recruitment during this study was limited due to Covid-19 restrictions. / Thesis / Master of Applied Science (MASc) / Concussions are recorded in approximately 300,000 athletes annually and are estimated to affect up to 3.8 million individuals per year in the United States alone. Understanding when it's safe to return to normal routine after an injury is important but challenging. Therefore, a series of stages have been developed to lead children through a safe and timely return to sport and activity after concussion. The goal of this study was to develop machine learning (ML) algorithms which predict these return stages using symptom recordings and gross body movement data. Algorithms could be incorporated into a smartphone application (APP) to provide accessible return guidelines for children with concussions. Algorithms were created and model performance was tested using symptom and body movement data collected from children after a concussive injury. The results of this study show that it is possible to predict return to school and return to activity stages with ML, and with improvements, can be used to facilitate return from injury. Machine Learning Concussion Pediatric
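The sliding-window step described in record 120 can be sketched roughly as follows. The abstract does not give the window length, the step size or the extracted features, so the values and the magnitude statistics used here are assumptions, and all names are hypothetical. Each output row pairs the window features with yes/no symptom flags, in the shape a classifier such as a random forest would accept.

```python
import math
from statistics import mean, pstdev


def sliding_windows(samples, size, step):
    """Yield consecutive windows of `size` samples, advancing by `step`."""
    for start in range(0, len(samples) - size + 1, step):
        yield samples[start:start + size]


def window_features(window):
    """Summarise one window of (x, y, z) readings by acceleration magnitude."""
    magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in window]
    return [mean(magnitudes), pstdev(magnitudes), max(magnitudes)]


def build_inputs(samples, symptoms, size=4, step=2):
    """Combine per-window features with yes/no symptom flags (1 or 0)."""
    return [
        window_features(window) + list(symptoms)
        for window in sliding_windows(samples, size, step)
    ]


# Hypothetical recording: six identical readings and two symptom flags.
samples = [(3.0, 4.0, 0.0)] * 6
rows = build_inputs(samples, symptoms=[1, 0])
print(rows)
```

Overlapping windows (a step smaller than the window size) produce more training rows from the same recording, at the cost of correlated rows, which matters when splitting data into training and test sets.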
