11

Machine learning-based human observer analysis of video sequences

Al-Raisi, Seema F. A. R. January 2017 (has links)
The research contributes to the field of video analysis by proposing novel approaches to automatically generating human observer performance patterns that can be used to advance modern video analytic and forensic algorithms. Eye-tracker and eye-movement analysis technology is employed in medical research, psychology, cognitive science and advertising, and the eye-movement data collected from the eye tracker can be analyzed using machine and statistical learning approaches. The study therefore attempts to understand the visual attention patterns of people observing captured CCTV footage. It examines whether the eye gaze of observers, which determines their behaviour, depends on the instructions they are given or on the knowledge they acquire during the surveillance task, and whether observer attention to human objects differs across the areas of the tracked person's body. It also asks whether pattern analysis and machine learning can effectively replace the current conceptual and statistical approaches to analyzing eye-tracking data captured within a CCTV surveillance task. A pilot study, taking around 30 minutes per participant, involved observing 13 different pre-recorded CCTV clips of public space. Participants were given a clear written description of the targets to find in each video. The study included a total of 24 participants with varying levels of experience in analyzing CCTV video. A Tobii eye-tracking system recorded the participants' eye movements, and the captured data were analyzed with statistical tools (SPSS) and machine learning algorithms (WEKA).
The research concluded that differences in behavioural patterns exist which can be used to classify the study's participants, provided appropriate machine learning algorithms are employed. Prior research on video analytics was limited to a few projects in which the observed human was treated as a single object, so a detailed analysis of observer attention based on human body-part articulation had not been investigated. All previous attempts at analyzing human observer visual attention in CCTV video analytics and forensics used either conceptual or statistical approaches, which are limited in making predictions and detecting hidden patterns. A novel approach to articulating the human objects to be identified and tracked in a visual surveillance task led to constrained results, which demanded the use of advanced machine learning algorithms for classifying participants. The research conducted within this thesis also encountered several practical data collection and analysis challenges during formal CCTV-operator surveillance tasks; these made it difficult to obtain appropriate cooperation from expert CCTV operators for data collection. Had expert rather than novice operators been employed, a more discriminative and accurate classification might have been achieved. Machine learning approaches such as ensemble learning and tree-based algorithms can be applied where a more detailed analysis of human behaviour is needed. Traditional machine learning approaches are being challenged by recent advances in convolutional neural networks and deep learning, so future research could replace the traditional approaches employed in this study with convolutional neural networks.
The current research was limited to 13 different videos with different descriptions given to participants for identifying and tracking different individuals. The research can be expanded to include more complicated demands with regard to changes in the analysis process.
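The classification idea above can be sketched with a toy example: summarize each participant's gaze record as a small feature vector and assign unseen observers to the closest class. The features, numbers and nearest-centroid rule below are illustrative assumptions, not the thesis's actual WEKA pipeline.

```python
# Hypothetical sketch: classify observers (expert vs. novice) from simple
# eye-movement summary features using a nearest-centroid rule.
# Feature vectors: (mean fixation duration in ms, target fixations per minute).
# All numbers are illustrative, not taken from the thesis.

def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    n = len(rows)
    return tuple(sum(r[i] for r in rows) / n for i in range(len(rows[0])))

def classify(x, centroids):
    """Assign x to the class of the nearest centroid (squared Euclidean)."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda label: dist2(x, centroids[label]))

# Illustrative training data: experts fixate targets more often, more briefly.
experts = [(220, 14), (240, 16), (210, 15)]
novices = [(350, 6), (330, 8), (360, 5)]
centroids = {"expert": centroid(experts), "novice": centroid(novices)}

print(classify((230, 15), centroids))  # an expert-like pattern -> "expert"
print(classify((340, 7), centroids))   # a novice-like pattern -> "novice"
```

A real pipeline would of course extract many more gaze features and use the stronger classifiers the thesis evaluates; the sketch only shows the feature-vector-to-label step.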
12

Dimensionality reduction and representation for nearest neighbour learning

Payne, Terry R. January 1999 (has links)
An increasing number of intelligent information agents employ Nearest Neighbour learning algorithms to provide personalised assistance to the user. This assistance may be in the form of recognising or locating documents that the user might find relevant or interesting. To achieve this, documents must be mapped into a representation that can be presented to the learning algorithm. Simple heuristic techniques are generally used to identify relevant terms from the documents. These terms are then used to construct large, sparse training vectors. The work presented here investigates an alternative representation based on sets of terms, called set-valued attributes, and proposes a new family of Nearest Neighbour learning algorithms that utilise this set-based representation. The importance of discarding irrelevant terms from the documents is then addressed, and this is generalised to examine the behaviour of the Nearest Neighbour learning algorithm with high dimensional data sets containing such values. A variety of selection techniques used by other machine learning and information retrieval systems are presented, and empirically evaluated within the context of a Nearest Neighbour framework. The thesis concludes with a discussion of ways in which attribute selection and dimensionality reduction techniques may be used to improve the selection of relevant attributes, and thus increase the reliability and predictive accuracy of the Nearest Neighbour learning algorithm.
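The set-valued representation can be illustrated with a minimal sketch: each document is a set of terms, and a k-NN rule ranks neighbours by Jaccard similarity between sets rather than by distance between large sparse term vectors. The data and the choice of Jaccard similarity are illustrative assumptions.

```python
# Minimal sketch of a set-valued Nearest Neighbour classifier:
# documents are term sets, similarity is Jaccard overlap.

def jaccard(a, b):
    """Jaccard similarity between two term sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def knn_predict(query, training, k=3):
    """Majority label among the k training sets most similar to the query."""
    ranked = sorted(training, key=lambda tl: jaccard(query, tl[0]), reverse=True)
    top = [label for _, label in ranked[:k]]
    return max(set(top), key=top.count)

# Illustrative labelled documents as (term set, label) pairs.
training = [
    ({"learning", "neural", "network"}, "ml"),
    ({"kernel", "svm", "margin"}, "ml"),
    ({"gradient", "descent", "learning"}, "ml"),
    ({"parliament", "vote", "election"}, "politics"),
    ({"election", "campaign", "policy"}, "politics"),
]

print(knn_predict({"learning", "svm", "network"}, training))  # -> ml
```

The sketch skips the attribute-selection step the thesis studies; discarding uninformative terms before forming the sets is exactly what the dimensionality-reduction chapters address.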
13

Learning algorithms and statistical software, with applications to bioinformatics / Algorithmes d'apprentissage et logiciels pour la statistique, avec applications à la bioinformatique

Hocking, Toby Dylan 20 November 2012 (has links)
L'apprentissage statistique est le domaine des mathématiques qui aborde le développement des algorithmes d'analyse de données. Cette thèse est divisée en deux parties : l'introduction de modèles mathématiques et l'implémentation d'outils logiciels. Dans la première partie, je présente de nouveaux algorithmes pour la segmentation et pour le partitionnement de données (clustering). Le partitionnement de données et la segmentation sont des méthodes d'analyse qui cherchent des structures dans les données. Je présente les contributions suivantes, en soulignant les applications à la bioinformatique. Dans la deuxième partie, je présente mes contributions au logiciel libre pour la statistique, qui est utilisé pour l'analyse quotidienne du statisticien. / Statistical machine learning is a branch of mathematics concerned with developing algorithms for data analysis. This thesis presents new mathematical models and statistical software, and is organized into two parts. In the first part, I present several new algorithms for clustering and segmentation. Clustering and segmentation are a class of techniques that attempt to find structures in data. I discuss the following contributions, with a focus on applications to cancer data from bioinformatics. In the second part, I focus on statistical software contributions which are practical for use in everyday data analysis.
14

On the simulation and design of manycore CMPs

Thompson, Christopher Callum January 2015 (has links)
The progression of Moore’s Law has resulted in both embedded and performance computing systems which use an ever-increasing number of processing cores integrated in a single chip. Commercial systems are now available which provide hundreds of cores, and academics have proposed architectures for up to 1024 cores. Embedded multicores are increasingly popular as it is easier to guarantee hard real-time constraints using individual cores dedicated to tasks than with traditional time-multiplexed processing. However, finding the optimal hardware configuration to meet these requirements at minimum cost requires extensive trial-and-error exploration of the design space. This thesis tackles the problems encountered in the design of these large-scale multicore systems by first addressing the problem of fast, detailed micro-architectural simulation. Initially addressing embedded systems, this work exploits the lack of hardware cache-coherence support in many deeply embedded systems to increase the available parallelism in the simulation. Then, partitioning the NoC and using packet counting and cycle skipping reduce the amount of computation required to accurately model the NoC interconnect. In combination, this enables simulation speeds significantly higher than the state of the art, while maintaining a lower error, when compared to real hardware, than any similar simulator. Simulation speeds reach up to 370 MIPS (million (target) instructions per second), or 110 MHz, which is better than typical FPGA prototypes and approaches final ASIC production speeds. This is achieved while maintaining an error of only 2.1%, significantly lower than that of other similar simulators.
The thesis continues by scaling the simulator past large embedded systems up to 64-1024 core processors, adding support for coherent architectures using the same packet counting techniques along with low overhead context switching to enable the simulation of such large systems with stricter synchronisation requirements. The new interconnect model was partitioned to enable parallel simulation to further improve simulation speeds in a manner which did not sacrifice any accuracy. These innovations were leveraged to investigate significant novel energy saving optimisations to the coherency protocol, processor ISA, and processor micro-architecture. By introducing a new instruction, with the name wait-on-address, the energy spent during spin-wait style synchronisation events can be significantly reduced. This functions by putting the core into a low-power idle state while the cache line of the indicated address is monitored for coherency action. Upon an update or invalidation (or traditional timer or external interrupts) the core will resume execution, but the active energy of running the core pipeline and repeatedly accessing the data and instruction caches is effectively reduced to static idle power. The thesis also shows that existing combined software-hardware schemes to track data regions which do not require coherency can adequately address the directory-associativity problem, and introduces a new coherency sharer encoding which reduces the energy consumed by sharer invalidations when sharers are grouped closely together, such as would be the case with a system running many tasks with a small degree of parallelism in each. The research concludes by using the extremely fast simulation speeds developed to produce a large set of training data, collecting various runtime and energy statistics for a wide range of embedded applications on a huge diverse range of potential MPSoC designs. 
This data was used to train a series of machine learning based models which were then evaluated on their capacity to predict performance characteristics of unseen workload combinations across the explored MPSoC design space, using only two sample simulations, with promising results from some of the machine learning techniques. The models were then used to produce a ranking of predicted performance across the design space; on average, Random Forest was able to predict the best design within 89% of the runtime performance of the actual best tested design, and better than 93% of the alternative design space. When predicting for a weighted metric of energy, delay and area, Random Forest on average produced results within 93% of the optimum. In summary, this thesis improves upon the state of the art for cycle-accurate multicore simulation, introduces novel energy-saving changes to the ISA and microarchitecture of future multicore processors, and demonstrates the viability of machine learning techniques to significantly accelerate the design space exploration required to bring a new manycore design to market.
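The wait-on-address behaviour described above can be mimicked in software: rather than spinning on a shared flag, the waiter blocks until the watched location is written and it is notified, re-checking the value on wakeup. This Python threading sketch is only an analogy for the semantics, not the hardware mechanism or its energy model.

```python
# Software analogy for wait-on-address: the waiter "core" sleeps instead of
# spinning, and the writer's store to the watched location wakes it, much as
# a coherency invalidation would wake the idled core in the thesis's design.

import threading

flag = 0
cond = threading.Condition()

def waiter(results):
    with cond:
        # wait-on-address: block until the watched location changes;
        # the loop re-checks on wakeup, as hardware must for spurious wakes.
        while flag == 0:
            cond.wait()
        results.append(flag)

def writer():
    global flag
    with cond:
        flag = 42          # the "store" to the watched address
        cond.notify_all()  # the "invalidation" that resumes the waiting core

results = []
t = threading.Thread(target=waiter, args=(results,))
t.start()
writer()
t.join()
print(results)  # [42]
```

The point of the hardware instruction is that the blocked state costs only static idle power, whereas a software spin loop keeps the pipeline and caches active.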
15

A Novel Hybrid Learning Algorithm For Artificial Neural Networks

Ghosh, Ranadhir, n/a January 2003 (has links)
The last few decades have witnessed the use of artificial neural networks (ANN) in many real-world applications, where they have offered an attractive paradigm for a broad range of adaptive complex systems. In recent years ANN have enjoyed a great deal of success and have proven useful in a wide variety of pattern recognition and feature extraction tasks; examples include optical character recognition, speech recognition and adaptive control, to name a few. To keep pace with the huge demand in diversified application areas, researchers have proposed many different kinds of ANN architectures and learning types to meet varying needs. A novel hybrid learning approach for the training of a feed-forward ANN is proposed in this thesis. The approach combines evolutionary algorithms with matrix solution methods such as singular value decomposition and Gram-Schmidt orthogonalisation to achieve optimum weights for the hidden and output layers. The proposed hybrid method applies an evolutionary algorithm in the first layer and the least squares method (LS) in the second layer of the ANN. The methodology also finds the optimum number of hidden neurons using a hierarchical combination structure for weights and architecture. A learning algorithm has many facets that can make it suitable for a particular application area; there are often trade-offs between classification accuracy and time complexity, and the problem of memory complexity remains. This research explores all these facets of the proposed new algorithm in terms of classification accuracy, convergence properties, generalization ability, and time and memory complexity.
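The two-stage split described above, evolutionary search over the hidden layer combined with closed-form least squares for the output layer, can be sketched on a toy regression task. To keep the least squares step scalar, the sketch pools the hidden outputs through a single output weight; the network size, mutation scheme and data are illustrative assumptions, not the thesis's method.

```python
# Toy hybrid training sketch: a (1+1)-style evolutionary loop mutates the
# hidden-layer weights, and for each candidate the output weight is solved
# in closed form by least squares. Fits sin(x) with a few tanh units.

import math, random

random.seed(0)
xs = [i / 10 for i in range(-20, 21)]
ys = [math.sin(x) for x in xs]            # target function to fit

def hidden(x, w):
    """Hidden-layer outputs: tanh units with (weight, bias) pairs in w."""
    return [math.tanh(a * x + b) for a, b in w]

def fit_output(w):
    """Closed-form least squares for a single pooled output weight:
       minimise sum_i (c*s_i - y_i)^2  =>  c = sum(s*y) / sum(s*s)."""
    s = [sum(hidden(x, w)) for x in xs]
    denom = sum(v * v for v in s) or 1e-12
    c = sum(v * y for v, y in zip(s, ys)) / denom
    err = sum((c * v - y) ** 2 for v, y in zip(s, ys))
    return c, err

def evolve(units=4, gens=200, sigma=0.3):
    best = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(units)]
    _, best_err = fit_output(best)
    init_err = best_err
    for _ in range(gens):
        cand = [(a + random.gauss(0, sigma), b + random.gauss(0, sigma))
                for a, b in best]          # mutate hidden weights
        _, err = fit_output(cand)
        if err < best_err:                 # keep only improving candidates
            best, best_err = cand, err
    return init_err, best_err

init_err, final_err = evolve()
print(final_err <= init_err)  # True: selection never accepts a worse candidate
```

The thesis solves a full output-weight vector (e.g. via SVD or Gram-Schmidt) rather than a single pooled weight; the sketch only shows why splitting the layers lets each half of the problem use the solver best suited to it.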
16

Machine Learning in Logistics: Machine Learning Algorithms : Data Preprocessing and Machine Learning Algorithms

Andersson, Viktor January 2017 (has links)
Data Ductus is a Swedish IT consultancy whose customer base ranges from small startups to large corporations. The company has grown steadily since the 80s and has established offices in both Sweden and the US. With the help of machine learning, this project presents a possible solution to the errors caused by the human factor in the logistics business. A way of preprocessing data before applying it to a machine learning algorithm, as well as a couple of algorithms to use, is presented. / Data Ductus är ett svenskt IT-konsultbolag vars kundbas sträcker sig från små startups till stora redan etablerade företag. Företaget har stadigt växt sedan 80-talet och har etablerat kontor både i Sverige och i USA. Med hjälp av maskininlärning kommer detta projekt att presentera en möjlig lösning på de fel som kan uppstå inom logistikverksamheten, orsakade av den mänskliga faktorn. Ett sätt att förbehandla data innan den tillämpas på en maskininlärningsalgoritm, liksom ett par algoritmer för användning, kommer att presenteras.
17

A Machine Learning Approach for Tracking the Torque Losses in Internal Gear Pump - AC Motor Units

Ali, Emad, Weber, Jürgen, Wahler, Matthias 27 April 2016 (has links) (PDF)
This paper deals with the application of speed-variable pumps in industrial hydraulic systems. The benefit of the natural feedback of the load torque is investigated for condition monitoring, as the development of losses can be taken as evidence of faults. A new approach is proposed to improve fault detection by tracking these changes via machine learning techniques. The presented algorithm is a kind of adaptive modeling of the torque balance over a range of steady operation in fault-free behavior. The aim is to form a numerical reference of acceptable accuracy for the particular unit in use, taking into consideration manufacturing tolerances and other differences in operating conditions. The learned model gives a baseline for identifying major possible abnormalities and provides a foundation for fault isolation by continuously estimating and analyzing the deviations.
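The baseline idea can be sketched as follows: fit a reference model of torque versus operating conditions from fault-free data, then flag measurements whose residual exceeds a threshold. The linear torque-pressure model, the 3-sigma rule and all numbers are illustrative assumptions, not the paper's algorithm.

```python
# Hedged sketch of a learned torque baseline for condition monitoring:
# least-squares line torque ~ a*pressure + b fitted on fault-free data,
# with deviations beyond 3 residual standard deviations flagged as faults.

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Made-up fault-free training data: pressure (bar) vs. measured torque (Nm).
pressure = [50, 80, 110, 140, 170, 200]
torque =   [16.1, 25.9, 35.2, 45.1, 54.8, 64.2]

a, b = fit_line(pressure, torque)
residuals = [y - (a * x + b) for x, y in zip(pressure, torque)]
sigma = (sum(r * r for r in residuals) / len(residuals)) ** 0.5

def is_fault(p, t, k=3.0):
    """Flag a measurement whose residual exceeds k standard deviations."""
    return abs(t - (a * p + b)) > k * sigma

print(is_fault(120, 38.6))  # close to the baseline -> False
print(is_fault(120, 45.0))  # torque far above the baseline -> True
```

The paper's model is adaptive and covers a range of steady operating points; the sketch shows only the core pattern of learning a fault-free reference and monitoring deviations from it.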
18

Advanced Text Analytics and Machine Learning Approach for Document Classification

Anne, Chaitanya 19 May 2017 (has links)
Text classification is used in information extraction and retrieval from a given text, and has been considered an important step in managing the vast and expanding number of records available in digital form. This thesis addresses the problem of classifying patent documents into fifteen different categories or classes, where some classes overlap with others for practical reasons. For the development of the classification model using machine learning techniques, useful features have been extracted from the given documents; these features are used to classify patent documents as well as to generate useful tag-words. The overall objective of this work is to systematize NASA’s patent management by developing a set of automated tools that can assist NASA in managing and marketing its portfolio of intellectual properties (IP), and enable easier discovery of relevant IP by users. We have identified an array of applicable methods: k-Nearest Neighbors (kNN), two variations of the Support Vector Machine (SVM) algorithm, and two tree-based classification algorithms, Random Forest and J48. The major research steps in this work consist of filtering techniques for variable selection, information gain and feature correlation analysis, and the training and testing of potential models using effective classifiers. Further, the obstacles associated with imbalanced data were mitigated by adding synthetic data wherever appropriate, which resulted in a superior SVM-based classification model.
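One of the filtering steps mentioned above, information gain, can be sketched for a binary term feature and a class label; a real system would rank every term in the vocabulary this way before training the classifiers. The tiny document collection below is an illustrative assumption.

```python
# Sketch of information-gain feature ranking for text classification:
# IG(term) = H(labels) - sum_v P(term = v) * H(labels | term = v).

import math

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    out = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        out -= p * math.log2(p)
    return out

def information_gain(feature, labels):
    """Reduction in label entropy from knowing the feature's value."""
    n = len(labels)
    cond = 0.0
    for v in set(feature):
        subset = [l for f, l in zip(feature, labels) if f == v]
        cond += len(subset) / n * entropy(subset)
    return entropy(labels) - cond

# Four documents: does the term occur (1) or not (0), plus the patent class.
labels      = ["propulsion", "propulsion", "materials", "materials"]
term_engine = [1, 1, 0, 0]   # perfectly separates the classes
term_the    = [1, 1, 1, 1]   # occurs everywhere, carries no information

print(information_gain(term_engine, labels))  # 1.0
print(information_gain(term_the, labels))     # 0.0
```

Terms with near-zero gain, like stop words, are dropped, shrinking the feature space the SVM and tree-based classifiers must handle.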
19

Early Stratification of Gestational Diabetes Mellitus (GDM) by building and evaluating machine learning models

Sharma, Vibhor January 2020 (has links)
Gestational Diabetes Mellitus (GDM), a condition involving abnormal levels of glucose in the blood plasma, has seen a rapid surge amongst gestating mothers of different regions and ethnicities around the world. The current method of screening and diagnosing GDM is restricted to the Oral Glucose Tolerance Test (OGTT). With the advent of machine learning algorithms, healthcare has seen a surge of machine learning methods for disease diagnosis that are increasingly employed in clinical settings. Yet in the area of GDM there has been no widespread use of these algorithms to generate multi-parametric diagnostic models that aid clinicians in diagnosing the condition. In the literature there is an evident scarcity of machine learning applications for GDM diagnosis, limited to the proposed use of some very simple algorithms such as logistic regression. We have therefore attempted to address this research gap by employing a wide array of machine learning algorithms, known to be effective for binary classification, for early GDM classification amongst gestating mothers. This can aid clinicians in the early diagnosis of GDM and offers a chance to mitigate the adverse outcomes related to GDM among gestating mothers and their progeny. We set up an empirical study of the performance of different machine learning algorithms used specifically for the task of GDM classification. These algorithms were trained on a set of predictor variables chosen by experts, and the results were compared with existing machine learning methods for GDM classification in the literature using a set of performance metrics. Our model could not outperform the already proposed machine learning models for GDM classification.
We attribute this to our chosen set of predictor variables and to the under-reporting of various performance metrics, such as precision, in the existing literature, which prevents an informed comparison. / Graviditetsdiabetes Mellitus (GDM), ett tillstånd som involverar onormala nivåer av glukos i blodplasma, har ökat snabbt bland gravida mammor i olika regioner och etniciteter runt om i världen. Den nuvarande metoden för screening och diagnos av GDM är begränsad till oralt glukostoleranstest (OGTT). Med tillkomsten av maskininlärningsalgoritmer har hälso- och sjukvården sett en ökning av maskininlärningsmetoder för sjukdomsdiagnos som alltmer används i kliniska sammanhang. Inom GDM-området har dessa algoritmer ändå inte fått stor spridning för att generera multiparametriska diagnostiska modeller som hjälper klinikerna med diagnosen. I litteraturen finns en uppenbar brist på tillämpning av maskininlärningsalgoritmer för GDM-diagnos; den har begränsats till föreslagen användning av några mycket enkla algoritmer som logistisk regression. Därför har vi försökt ta itu med detta forskningsgap genom att använda ett brett spektrum av maskininlärningsalgoritmer, kända för att vara effektiva för binär klassificering, för tidig GDM-klassificering bland gravida mammor. Detta kan hjälpa klinikerna med tidig diagnos av GDM och ger möjlighet att mildra de negativa utfall som är relaterade till GDM bland gravida mammor och deras avkommor. Vi genomförde en empirisk studie av prestandan hos olika maskininlärningsalgoritmer som används specifikt för GDM-klassificering. Algoritmerna tränades på en uppsättning prediktorvariabler valda av experter, och resultaten jämfördes med befintliga maskininlärningsmetoder i litteraturen för GDM-klassificering utifrån en uppsättning prestandamått. Vår modell kunde inte överträffa de redan föreslagna maskininlärningsmodellerna för GDM-klassificering. Vi tillskriver detta den valda uppsättningen prediktorvariabler samt underrapporteringen av olika prestandamått, som precision, i befintlig litteratur, vilket leder till brist på informerad jämförelse.
20

Transfer learning for classification of spatially varying data

Jun, Goo 13 December 2010 (has links)
Many real-world datasets have spatial components that provide valuable information about characteristics of the data. In this dissertation, a novel framework for adaptive models that exploit spatial information in data is proposed. The proposed framework is mainly based on the development and application of Gaussian processes. First, a supervised learning method is proposed for the classification of hyperspectral data with spatially adaptive model parameters. The proposed algorithm models the spatially varying mean of each spectral band of a given class using a Gaussian process regression model. For a given location, the predictive distribution of a given class is modeled by a multivariate Gaussian distribution with spatially adjusted parameters obtained from the proposed algorithm. The Gaussian process model is generally regarded as a good tool for interpolation, but not for extrapolation; moreover, the uncertainty of the predictive distribution increases as the distance from the training instances increases. To overcome this problem, a semi-supervised learning algorithm is presented for the classification of hyperspectral data with spatially adaptive model parameters. This algorithm fits the test data with a spatially adaptive mixture-of-Gaussians model, where the spatially varying parameters of each component are obtained by Gaussian process regressions with soft memberships using the mixture-of-Gaussian-processes model. The proposed semi-supervised algorithm assumes a transductive setting, where the unlabeled data are considered to be similar to the training data. This is not true in general, however, since one may not know how many classes may exist in the unexplored regions. A spatially adaptive nonparametric Bayesian framework is therefore proposed by applying spatially adaptive mechanisms to the mixture model with infinitely many components.
In this method, each component in the mixture has spatially adapted parameters estimated by Gaussian process regressions, and spatial correlations between indicator variables are also considered. In addition to land cover and land use classification applications based on hyperspectral imagery, the Gaussian process-based spatio-temporal model is also applied to predict ground-based aerosol optical depth measurements from satellite multispectral images, and to select the most informative ground-based sites by active learning. In this application, heterogeneous features with spatial and temporal information are incorporated by employing a set of covariance functions, and it is shown that the spatio-temporal information exploited in this manner substantially improves the regression model. The conventional meaning of spatial information usually refers to actual spatio-temporal locations in the physical world. In the final chapter of this dissertation, the meaning of spatial information is generalized to a parametrized low-dimensional representation of data in feature space, and a corresponding spatial modeling technique is exploited to develop a nearest-manifold classification algorithm.
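The core tool of the dissertation, Gaussian process regression, can be sketched in one dimension: an RBF kernel yields a predictive mean that interpolates the training data and a predictive variance that grows with distance from it, which is exactly the extrapolation issue discussed above. The kernel, its hyperparameters and the data are illustrative assumptions.

```python
# 1-D Gaussian process regression sketch with an RBF kernel, in pure Python
# (a tiny Gaussian elimination stands in for a linear algebra library).

import math

def rbf(x1, x2, ell=1.0):
    """Squared-exponential (RBF) covariance between two inputs."""
    return math.exp(-((x1 - x2) ** 2) / (2 * ell ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def gp_predict(xs, ys, xstar, noise=1e-6):
    """Predictive mean and variance at xstar given training pairs (xs, ys)."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve(K, ys)                      # K^{-1} y
    ks = [rbf(x, xstar) for x in xs]
    mean = sum(a * k for a, k in zip(alpha, ks))
    v = solve(K, ks)                          # K^{-1} k*
    var = rbf(xstar, xstar) - sum(k * vi for k, vi in zip(ks, v))
    return mean, var

xs, ys = [0.0, 1.0, 2.0], [0.0, 1.0, 0.0]
m0, v0 = gp_predict(xs, ys, 1.0)   # at a training point
m1, v1 = gp_predict(xs, ys, 5.0)   # far from the training data
print(round(m0, 3))                # ~1.0: mean interpolates the training value
print(v1 > v0)                     # True: predictive uncertainty grows with distance
```

The dissertation's spatially adaptive models run regressions like this over geographic coordinates per spectral band and per mixture component; the sketch shows only the single-output building block and the distance-dependent uncertainty that motivates the semi-supervised extension.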
