291

Forecasting Success in the National Hockey League Using In-Game Statistics and Textual Data

Weissbock, Joshua January 2014
In this thesis, we look at a number of methods to forecast success (winners and losers), both of single games and of playoff series (best-of-seven games), in the sport of ice hockey, more specifically within the National Hockey League (NHL). Our findings indicate that there exists a theoretical upper bound, which seems to hold true for all sports, that makes prediction difficult. In the first part of this thesis, we look at predicting the outcome of individual games to learn which of the two teams will win or lose. We use a number of traditional statistics (published on the league's website and used by the media) and performance metrics (used by Internet hockey analysts; they are shown to have a much higher correlation with success over the long term). Despite the demonstrated long-term success of performance metrics, it was the traditional statistics that had the most value for automatic game prediction, allowing our model to achieve 59.8% accuracy. We found it interesting that, regardless of which features we used in our model, we were not able to increase the accuracy much beyond 60%. We compared the observed win% of teams in the NHL to many simulated leagues and found that there appears to be a theoretical upper bound of approximately 62% for single-game prediction in the NHL. If one game is difficult to predict, with a maximum accuracy of 62%, then predicting a longer series of games should be easier. We looked at predicting the winner of the best-of-seven series between two teams using over 30 features, both traditional and advanced statistics, and found that we were able to increase our prediction accuracy to almost 75%. We then re-explored predicting single games with the use of pre-game textual reports written by hockey experts from http://www.NHL.com, using Bag-of-Words features and sentiment analysis. We combined these features with the numerical data in a multi-layer meta-classifier and were able to increase the accuracy close to the upper bound.
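A minimal sketch (in Python) of the simulated-league argument behind the ~62% ceiling: give every team a latent strength, let game outcomes mix skill with per-game luck, and measure how often even an oracle that always picks the stronger team is right. The team count and noise levels here are illustrative assumptions, not the thesis's actual simulation settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def upper_bound_accuracy(n_teams=30, n_games=20000, skill_sd=1.0, luck_sd=3.0):
    """Estimate the best achievable single-game accuracy when outcomes
    mix latent team skill with random per-game 'luck' (toy parameters)."""
    strength = rng.normal(0.0, skill_sd, n_teams)        # latent team strengths
    home, away = rng.integers(0, n_teams, (2, n_games))  # random matchups
    # Outcome: skill difference plus per-game noise decides the winner.
    margin = (strength[home] - strength[away]) + rng.normal(0.0, luck_sd, n_games)
    home_wins = margin > 0
    # The ideal predictor always picks the stronger team.
    predict_home = strength[home] > strength[away]
    return np.mean(predict_home == home_wins)

print(f"simulated upper bound: {upper_bound_accuracy():.3f}")
```

Raising luck_sd relative to skill_sd lowers this ceiling; matching the simulated spread of win% to the observed NHL spread is what locates it near 62%.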
292

Robot simulation studies

Rowat, Peter Forbes January 1972
The history of the robot as a concept and as a fact is indicated, and the current linguistic approach to robotology discussed. The problem of designing a robot-controller is approached by taking a simplified, computer-simulated model of a robot in an environment, and writing programs to enable the robot to move around its environment in a reasonably intelligent manner. The problems of concept representation and the creation and execution of plans are dealt with in this simple system, and the problem of exploration is encountered but not satisfactorily dealt with. The robot's environment consists of a rectangular grid in which squares are labelled as belonging to the boundary, to fixed or movable objects, or to holes, while the robot itself occupies a single square, can sense the labels of the eight surrounding squares, can turn, and can pick up, move, and drop movable objects. The boundary of a typical environment is thus a rectanguloid polygon, which can be compared to the floor-plan of a one-level house. After an initial exploration, the basic representation of the environment is a sequence of edge-lengths and turns, called the ring-representation. An algorithm is described which produces the set of maximal subrectangles of the environment (i.e. rooms, passages, doorways) from the ring representation. To make plans for moving within the environment, the robot first views the maximal subrectangles as the vertices of a graph, wherein two vertices are connected by an edge if and only if the corresponding maximal subrectangles overlap, and then uses a path-finding algorithm to find a path between two vertices of the graph. This path constitutes a "plan of action". Whenever an isolated object or hole is found, its ring-representation is generated and its set of maximal subrectangles produced. Thus the shapes of objects and holes within the environment can be compared in various ways. In particular, an algorithm is described which compares the shape of a movable object with that of a hole to ascertain whether the movable object could fit inside the hole, without "physically" moving the object. ROSS, an interactive computer program which simulates the robot-environment model, is described. A command language allows the user to specify tasks for the robot at various conceptual levels. Several problems are listed concerning the ways in which a robot might explore, represent, and make plans about its environment, most of which are amenable to direct attack in this simplified model. Finally, theoretical questions concerning two-dimensional rectanguloid shapes are raised. / Science, Faculty of / Computer Science, Department of / Graduate
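A sketch of the plan-making step just described, assuming the maximal subrectangles have already been extracted from the ring representation: each rectangle becomes a vertex, overlapping rectangles share an edge, and a breadth-first search through the graph yields the "plan of action". The (x1, y1, x2, y2) rectangle format and the toy floor plan are assumptions for illustration.

```python
from collections import deque

# Each maximal subrectangle as (x1, y1, x2, y2), inclusive corners (assumed format).
def overlaps(a, b):
    """True if two axis-aligned rectangles share at least one square."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def plan(rects, start, goal):
    """BFS over the overlap graph: returns a sequence of rectangle indices
    (a 'plan of action') from rectangle `start` to rectangle `goal`."""
    n = len(rects)
    adj = [[j for j in range(n) if j != i and overlaps(rects[i], rects[j])]
           for i in range(n)]
    parent = {start: None}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if v == goal:
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return path[::-1]
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                queue.append(w)
    return None  # no route between the two rectangles

# Two 'rooms' joined by a 'doorway' rectangle (toy floor plan).
rooms = [(0, 0, 4, 4), (4, 2, 6, 2), (6, 0, 10, 4)]
print(plan(rooms, 0, 2))  # -> [0, 1, 2]: room, doorway, room
```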
293

Machine recognition of independent and contextually constrained contour-traced handprinted characters

Toussaint, Godfried T. January 1969
A contour-tracing technique originally devised by Clemens and Mason was modified and used with several different classifiers of varying complexity to recognize upper-case handprinted alphabetic characters. An analysis and comparison of the various classifiers, with the modifications introduced to handle variable-length feature vectors, is presented. On independent characters, one easily realized suboptimum parametric classifier yielded recognition accuracies which compare favourably with other published results. Additional simple tests on commonly confused characters improved results significantly, as did the use of contextual constraints. In addition, the above classifier uses much less storage capacity than a non-parametric optimum Bayes classifier and performs significantly better than the optimum classifier when training and testing data are limited. The optimum decision on a string of m contextually constrained characters, each having a variable-length feature vector, is derived. A computationally efficient algorithm, based on this equation, was developed and tested with monogram, bigram and trigram contextual constraints of English text. A marked improvement in recognition accuracy was noted over the case when contextual constraints were not used, and a trade-off was observed not only between the order of contextual information used and the number of measurements taken, but also between the order of context and the value of a parameter ds which indicates the complexity of the classification algorithm. / Applied Science, Faculty of / Electrical and Computer Engineering, Department of / Graduate
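Read with modern terminology, the optimum decision on a string of m contextually constrained characters is a Viterbi-style dynamic program combining per-character classifier likelihoods with n-gram statistics. A sketch of the bigram case under that reading (the thesis derives the exact equation; this only illustrates the structure, with made-up probabilities):

```python
import numpy as np

def decode_with_bigrams(char_loglik, log_bigram, log_prior):
    """Viterbi decoding: char_loglik[t, c] = log P(features_t | class c),
    log_bigram[a, b] = log P(b | a) from English text, log_prior[c] = log P(c).
    Returns the jointly most probable character string (as class indices)."""
    m, k = char_loglik.shape
    score = log_prior + char_loglik[0]      # best log-score ending in each class
    back = np.zeros((m, k), dtype=int)      # backpointers
    for t in range(1, m):
        cand = score[:, None] + log_bigram  # rows: previous class, cols: next class
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(k)] + char_loglik[t]
    # Trace the optimal path backwards.
    path = [int(np.argmax(score))]
    for t in range(m - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Tiny demo: 3 classes, uninformative evidence, bigrams favouring 0->1->2->0.
k = 3
lik = np.log(np.full((4, k), 1.0 / k))
big = np.log(np.array([[.1, .8, .1], [.1, .1, .8], [.8, .1, .1]]))
print(decode_with_bigrams(lik, big, np.log(np.ones(k) / k)))  # -> [0, 1, 2, 0]
```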
294

Design and Control of a Brushless Doubly-Fed Machine for Aviation Propulsion

Peng, Peng January 2020
No description available.
295

GAINING INSIGHTS INTO TOURMALINE-BEARING LOCALITIES WITH MACHINE LEARNING ALGORITHMS

Williams, Jason Ryan 01 September 2021
Machine learning algorithms can be used to analyze large datasets and to identify relationships and patterns that might otherwise be missed by more traditional scientific and statistical approaches. The aim of this study is to evaluate the ability of machine learning algorithms to classify mineral systems and provide insights into the geological processes operating on Earth. This study examines the potential of machine learning algorithms as interpretive tools for the identification of geological processes, and additional approaches are implemented to predict how geological processes may have evolved at tourmaline-bearing localities in the United States. Tourmaline mineral occurrence data for localities in the United States were retrieved from mineral databases, and exploratory machine learning algorithms, such as market basket analysis and hierarchical clustering, were used to identify geological and geochemical processes. Common geological processes operating in sedimentary, igneous, metamorphic, and hydrothermal systems were all identified based on the presence of diagnostic mineral assemblages such as actinolite-wollastonite-dravite in metamorphic rocks or microcline-schorl-beryl in igneous deposits. Several different iterations of supervised machine learning algorithms were used, with models incorporating different combinations of mineral occurrence data, environmental data, and geological process labels, in order to learn how to predict the geologic evolution of tourmaline-bearing localities. A test dataset was generated by randomly selecting different locations within the United States, and mineralogy was assigned to each site using interpolation methods. Decision tree and random forest algorithms were then both used to classify the randomly generated test dataset. Cross-validation approaches show that the decision trees likely performed better when classifying the test dataset. The results discussed throughout this study highlight how machine learning algorithms can be very effective and accurate supplementary tools when characterizing tourmaline-bearing deposits. The models discussed in this paper were able to classify different geological processes with roughly 90% accuracy, and they were able to predict how geological processes evolved at different tourmaline-bearing localities with an estimated 70% accuracy. The most accurate classification of tourmaline-bearing localities occurred when analyzing deposits that were subjected to higher temperatures and pressures, which in turn generate more distinct mineralogies that allow machine learning algorithms to identify patterns with greater confidence. The analysis of tourmaline localities associated with low-temperature hydrothermal and sedimentary environments results in much more error-prone classifiers, which can be attributed to a lack of tourmaline-bearing sedimentary deposits in mineral databases and to the fact that sedimentary deposits can record processes from multiple geologic environments that may or may not be related. The strengths and limitations of the trained models are detailed throughout this paper.
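A toy sketch of the supervised step described above: each locality represented as a binary vector of co-occurring minerals, classified by decision tree and random forest under cross-validation. The synthetic data, labels, and the "diagnostic assemblage" signal are all illustrative assumptions, not the actual database.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)

# Each locality: a binary presence/absence vector over a mineral vocabulary.
n_localities, n_minerals = 300, 40
X = rng.integers(0, 2, (n_localities, n_minerals))
# Make the process label weakly depend on a 3-mineral 'diagnostic assemblage'.
y = (X[:, :3].sum(axis=1) + rng.integers(0, 2, n_localities) >= 2).astype(int)

for name, clf in [("decision tree", DecisionTreeClassifier(max_depth=5)),
                  ("random forest", RandomForestClassifier(n_estimators=200))]:
    scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validation
    print(f"{name}: {scores.mean():.2f} +/- {scores.std():.2f}")
```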
296

Performance Enhancement Schemes and Effective Incentives for Federated Learning

Wang, Yuwei 16 November 2021
The advent of artificial intelligence applications demands massive amounts of data to supplement the training of machine learning models. Traditional machine learning schemes require central processing of large volumes of data that may contain sensitive patterns such as user location, personal information, or transaction history. Federated Learning (FL) has been proposed to complement the traditional centralized methods: multiple local models are trained and aggregated on a centralized cloud server. However, the performance of FL needs to be further improved, since its accuracy is not on par with traditional centralized machine learning approaches. Furthermore, due to the possibility of privacy information leakage, not enough clients are willing to participate in the FL training process. Common practice for the uploaded local models is an evenly weighted aggregation, assuming that each node of the network contributes to advancing the global model equally, which is unfair to the owners of higher-contribution models. This thesis focuses on three aspects of improving a whole federated learning pipeline: client selection, reputation-enabled weight aggregation, and the incentive mechanism. For client selection, a reputation score consisting of evaluation metrics is introduced to eliminate poorly performing model contributions. This scheme enhances the original implementation by up to 10% for non-IID datasets. We also reduce the training time of the selection scheme by roughly 27.7% compared to the baseline implementation. Then, a reputation-enabled weighted aggregation of the local models for distributed learning is proposed: the contribution of a local model and its aggregation weight are evaluated and determined by its reputation score, formulated as above. A numerical comparison of the proposed methodology, which assigns different aggregation weights based on the accuracy of each model, to a baseline that utilizes standard average aggregation weights shows an accuracy improvement of 17.175% over the standard baseline for not independent and identically distributed (non-IID) scenarios in an FL network of 100 participants. Last but not least, for the incentive mechanism, we can reward participants based on data quality, data quantity, reputation, and resource allocation. In this thesis, we adopt a reputation-aware reverse auction that was earlier proposed to recruit dependable participants for mobile crowdsensing campaigns, and modify that incentive to adapt it to an FL setting where user utility is defined as a function of the payment assigned by the central server and the user's service cost, such as battery and processor usage. Through numerical results, we show that: 1) the proposed incentive can improve user utilities when compared to the baseline approaches; 2) platform utility can be maintained at a value close to that under the baselines; 3) the overall test accuracy of the aggregated global model can even slightly improve.
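A minimal sketch of the reputation-enabled weighted aggregation idea: in place of FedAvg's even weighting, each client's parameters are scaled by a normalized reputation score. The scores and toy "models" below are placeholders; the thesis builds reputation from evaluation metrics of the uploaded models.

```python
import numpy as np

def reputation_weighted_aggregate(local_weights, reputation):
    """Aggregate local model parameters into a global model, weighting each
    client by its normalized reputation score instead of a plain average."""
    rep = np.asarray(reputation, dtype=float)
    rep = rep / rep.sum()  # normalize so the result is a convex combination
    # local_weights: list of per-client parameter arrays of identical shape.
    return sum(r * w for r, w in zip(rep, local_weights))

# Three clients with toy 1-D 'models'; high-reputation clients dominate.
clients = [np.array([1.0, 0.0]), np.array([0.8, 0.2]), np.array([0.0, 1.0])]
reputations = [0.9, 0.7, 0.1]  # e.g., derived from validation accuracy
print(reputation_weighted_aggregate(clients, reputations))
```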
297

APIC: A method for automated pattern identification and classification

Goss, Ryan Gavin January 2017
Machine Learning (ML) is a transformative technology at the forefront of many modern research endeavours. The technology is generating a tremendous amount of attention from researchers and practitioners, providing new approaches to solving complex classification and regression tasks. While concepts such as Deep Learning have existed for many years, the computational power for realising the utility of these algorithms in real-world applications has only recently become available. This dissertation investigated the efficacy of a novel, general method for deploying ML in a variety of complex tasks, where best-feature selection, data-set labelling, model definition and training processes were determined automatically. Models were developed in an iterative fashion and evaluated using both training and validation data sets. The proposed method was evaluated using three distinct case studies describing complex classification tasks that often require significant input from human experts. The results achieved demonstrate that the proposed method compares favourably with, and often outperforms, less general, comparable methods designed specifically for each task. Feature selection, data-set annotation, model design and training processes were optimised by the method, producing less complex, comparably accurate classifiers with lower dependency on computational power and human expert intervention. In Chapter 4, the proposed method demonstrated improved efficacy over comparable systems, automatically identifying and classifying complex application protocols traversing IP networks. In Chapter 5, the proposed method was able to discriminate between normal and anomalous traffic, maintaining accuracy in excess of 99% while reducing false alarms to a mere 0.08%. Finally, in Chapter 6, the proposed method discovered more optimal classifiers than those implemented by comparable methods, with classification scores rivalling those achieved by state-of-the-art systems. The findings of this research concluded that developing a fully automated, general method, exhibiting efficacy in a wide variety of complex classification tasks with minimal expert intervention, was possible. The method and various artefacts produced in each case study of this dissertation are thus significant contributions to the field of ML.
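A bare-bones sketch of the kind of automated search loop the abstract describes: iterate over feature subsets and candidate models, keep whatever validates best. The candidate models, scoring scheme, and synthetic data are assumptions for illustration, not APIC's actual search space.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=500, n_features=30, n_informative=8,
                           random_state=0)

# Iterate over feature-subset sizes and candidate models, keeping the
# pipeline that validates best -- a toy version of automated model definition.
best = (None, 0.0)
for k in (5, 10, 20):
    for model in (LogisticRegression(max_iter=1000),
                  RandomForestClassifier(n_estimators=100, random_state=0)):
        pipe = Pipeline([("select", SelectKBest(f_classif, k=k)),
                         ("clf", model)])
        score = cross_val_score(pipe, X, y, cv=5).mean()
        if score > best[1]:
            best = (pipe, score)

print(f"selected pipeline scores {best[1]:.3f}")
```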
298

Feature matching and learning for controlling multiple identical agents with global inputs

Negishi, Tomoya 24 May 2022
Simple identical agent systems are becoming more common in nanotechnology, biology, and chemistry. Since, in these domains, each agent can be realized only through necessarily simple mechanisms, the major challenge of these systems is how to control the agents using limited control input, such as broadcast control. Inspired by previous work, in which identical agents were controlled via global inputs using a single fixed obstacle, we propose a new pipeline that uses tree search and matching methods to identify target and agent pairs to move, and their orders. In this work, we compare several matching methods, from hand-crafted template matching to learned feature-descriptor matching, and discuss their validity in the pathfinding problem. We also employ the Monte Carlo Tree Search algorithm in order to enhance the efficiency of the tree search. In experiments, we execute the proposed pipeline on shape-formation tasks. We compare the total number of control steps and the computation time between the different matching methods, as well as against previous work and human solutions. The results show that all our methods significantly reduce the total number of input steps compared to the previous work. In particular, the combination of learned feature matching and the Monte Carlo Tree Search algorithm outperforms all other methods.
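One ingredient of such a pipeline, sketched in isolation: deciding which agent should fill which target cell. The thesis scores candidate pairs with template or learned-feature matching inside a tree search; plain grid distance and a Hungarian solver stand in for that here, purely for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Toy stand-in for the pairing step: which agent should fill which target cell?
agents = np.array([[0, 0], [5, 5], [9, 2]])    # agent grid positions (assumed)
targets = np.array([[1, 1], [4, 6], [8, 3]])   # target cells of the shape

cost = np.abs(agents[:, None, :] - targets[None, :, :]).sum(axis=2)  # L1 distances
rows, cols = linear_sum_assignment(cost)       # minimal total-cost pairing
for a, t in zip(rows, cols):
    print(f"agent {a} at {agents[a].tolist()} -> target {t} at {targets[t].tolist()}")
```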
299

Vehicular Traffic Flow Prediction Model Using Machine Learning-Based Model

Wang, Jiahao 14 June 2021
Intelligent Transportation Systems (ITS) have attracted an increasing amount of attention in recent years. Thanks to the fast development of vehicular computing hardware, vehicular sensors, and citywide infrastructures, many impressive applications have been proposed under the topic of ITS, such as Vehicular Cloud (VC), intelligent traffic control, etc. These applications can bring us a safer, more efficient, and more enjoyable transportation environment. However, an accurate and efficient traffic flow prediction system is needed to achieve these applications, as it gives applications under ITS the opportunity to deal with possible road situations in advance. To achieve better traffic flow prediction performance, many prediction methods have been proposed, such as mathematical modeling methods, parametric methods, and non-parametric methods. How to implement an efficient, robust, and accurate vehicular traffic prediction system remains a hot topic. With the help of Machine Learning-based (ML) methods, especially Deep Learning-based (DL) methods, the accuracy of prediction models has increased. However, we also noticed that many open challenges remain in the real-world implementation of ML-based vehicular traffic prediction models. First, the time consumption for DL model training is relatively large compared to parametric models, such as ARIMA and SARIMA. Second, it remains an open question how to capture the spatial relationship between road detectors, which is affected by geographic correlation as well as change over time. Last but not least, it is important to implement the prediction system in the real world and, meanwhile, to find a way to use the advanced technology applied in ITS to improve the prediction system itself. In our work, we focus on improving the features of the prediction model that help with real-world deployment. First, we introduce an optimization strategy for the training process of ML-based models in order to reduce its time cost. Second, we provide a new hybrid deep learning model that combines a GCN with the deep aggregation structure (i.e., the sequence-to-sequence structure) of the GRU. Meanwhile, to solve the real-world prediction problem, i.e., the online prediction task, we provide a new online prediction strategy based on refinement learning. To further improve the model's accuracy and efficiency when applied to ITS, we provide a parallel training strategy that exploits the vehicular cloud structure.
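A condensed sketch of the GRU sequence-to-sequence piece of such a hybrid model; the GCN spatial block and the refinement-learning loop are omitted, and all layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class Seq2SeqTraffic(nn.Module):
    """GRU encoder-decoder: reads `t_in` past flow readings per detector and
    rolls out `t_out` future steps. In the full hybrid model, GCN-processed
    spatial features would be the input; raw flows stand in for them here."""
    def __init__(self, n_detectors, hidden=64, t_out=6):
        super().__init__()
        self.t_out = t_out
        self.encoder = nn.GRU(n_detectors, hidden, batch_first=True)
        self.decoder = nn.GRUCell(n_detectors, hidden)
        self.head = nn.Linear(hidden, n_detectors)

    def forward(self, x):                 # x: (batch, t_in, n_detectors)
        _, h = self.encoder(x)            # summarize the observed window
        h, step = h[0], x[:, -1]          # start decoding from the last reading
        outputs = []
        for _ in range(self.t_out):       # autoregressive rollout
            h = self.decoder(step, h)
            step = self.head(h)
            outputs.append(step)
        return torch.stack(outputs, 1)    # (batch, t_out, n_detectors)

model = Seq2SeqTraffic(n_detectors=20)
past = torch.randn(8, 12, 20)             # 8 samples, 12 past steps, 20 detectors
print(model(past).shape)                  # torch.Size([8, 6, 20])
```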
300

Machine learning for corporate failure prediction : an empirical study of South African companies

Kornik, Saul January 2004
Includes bibliographical references (leaves 255-266). / The research objective of this study was to construct an empirical model for the prediction of corporate failure in South Africa through the application of machine learning techniques using information generally available to investors. The study began with a thorough review of the corporate failure literature, breaking the process of prediction model construction into the following steps:

* Defining corporate failure
* Sample selection
* Feature selection
* Data pre-processing
* Feature subset selection
* Classifier construction
* Model evaluation

These steps were applied to the construction of a model, using a sample of failed companies that were listed on the JSE Securities Exchange between 1 January 1996 and 30 June 2003. A paired sample of non-failed companies was selected. Pairing was performed on the basis of year of failure, industry and asset size (total assets per the company financial statements, excluding intangible assets). A minimum of two years and a maximum of three years of financial data were collated for each company. Such data was mainly sourced from BFA McGregor RAID Station, although the BFA McGregor Handbook and JSE Handbook were also consulted for certain data items. A total of 75 financial and non-financial ratios were calculated for each year of data collected for every company in the final sample. Two databases of ratios were created - one for all companies with at least two years of data and another for those companies with three years of data. Missing and undefined data items were rectified before all the ratios were normalised. The set of normalised values was then imported into MatLab Version 6 and input into a Population-Based Incremental Learning (PBIL) algorithm. PBIL was then used to identify those subsets of features that best separated the failed and non-failed data clusters for one-, two- and three-year forward forecast periods. Thornton's Separability Index (SI) was used to evaluate the degree of separation achieved by each feature subset.
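A compact sketch of PBIL applied to feature subset selection, with a stand-in fitness in place of Thornton's Separability Index. The hyperparameters and the toy fitness are assumptions for illustration, not the study's settings.

```python
import numpy as np

rng = np.random.default_rng(1)

def pbil_feature_selection(fitness, n_features, pop=50, iters=100,
                           lr=0.1, keep=0.6):
    """Population-Based Incremental Learning over binary feature masks.
    `fitness(mask)` should reward class separability (e.g., Thornton's SI)."""
    p = np.full(n_features, 0.5)           # inclusion probability per feature
    for _ in range(iters):
        masks = rng.random((pop, n_features)) < p   # sample a population of masks
        scores = np.array([fitness(m) for m in masks])
        best = masks[np.argmax(scores)]
        p = (1 - lr) * p + lr * best       # shift probabilities toward the best mask
        p = p.clip(0.05, 0.95)             # keep some exploration
    return p > keep

# Toy fitness: pretend only the first five of 20 ratios carry signal.
informative = np.arange(20) < 5
mask = pbil_feature_selection(
    lambda m: (m & informative).sum() - 0.2 * (m & ~informative).sum(), 20)
print(np.flatnonzero(mask))                # mostly indices 0..4
```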
