  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
51

A scalable evolutionary learning classifier system for knowledge discovery in stream data mining

Dam, Hai Huong, Information Technology & Electrical Engineering, Australian Defence Force Academy, UNSW January 2008
Data mining (DM) is the process of finding patterns and relationships in databases. Breakthroughs in computer technology have triggered massive growth in the data collected and maintained by organisations. In many applications, these data arrive continuously in large volumes as a sequence of instances known as a data stream. Mining these data is known as stream data mining. Because of the large amount of data arriving in a data stream, each record is normally expected to be processed only once. Moreover, this process can be carried out at different sites in the organisation simultaneously, making the problem distributed in nature. Distributed stream data mining poses many challenges to the data mining community, including scalability and coping with changes in the underlying concept over time. In this thesis, the author hypothesizes that learning classifier systems (LCSs), a class of classification algorithms, have the potential to work efficiently in distributed stream data mining. LCSs are incremental learners and, being evolutionary, are inherently adaptive. However, they suffer from two main drawbacks that hinder their use as fast data mining algorithms. First, they require a large population size, which slows down the processing of arriving instances. Second, they require a large number of parameter settings, some of which are very sensitive to the nature of the learning problem. As a result, it is difficult to choose the right setup for totally unknown problems. The aim of this thesis is to attack these two problems in LCSs, with a specific focus on UCS, a supervised evolutionary learning classifier system. UCS is chosen because it has been tested extensively on classification tasks and is the supervised version of XCS, a state-of-the-art LCS. The thesis first introduces the architectural design for a distributed stream data mining system.
The problems that UCS faces in a distributed data stream task are confirmed through a large number of experiments with UCS and the proposed architectural design. To overcome the problem of large population sizes, the idea of using a neural network to represent the action in UCS is proposed. This new system, called NLCS, was validated experimentally using a small fixed population size and showed a large reduction in the population size needed to learn the underlying concept in the data. An adaptive version of NLCS, called ANCS, is then introduced; it dynamically controls the population size of NLCS. A comprehensive analysis of the behaviour of ANCS revealed interesting patterns in the behaviour of the parameters, which motivated an ensemble version of the algorithm with 9 nodes, each using a different parameter setting. Together they cover all patterns of behaviour observed in the system. A voting gate combines the outputs of the ensemble. The resulting ensemble does not require any parameter setting and showed better performance on all datasets tested. The thesis concludes by testing the ANCS system in the architectural design for distributed environments proposed earlier. The contributions of the thesis are: (1) reducing the UCS population size by an order of magnitude using a neural representation; (2) introducing a mechanism for adapting the population size; (3) proposing an ensemble method that does not require parameter setting; and primarily (4) showing that the proposed LCS can work efficiently for distributed stream data mining tasks.
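The ensemble idea in this abstract (nine nodes with different parameter settings, combined by a voting gate, each stream record seen once) can be illustrated with a minimal Python sketch. It is not the thesis's ANCS implementation: the threshold "nodes", their settings, and the names `majority_vote`, `make_threshold_node`, and `classify_stream` are hypothetical stand-ins, and the gate is assumed here to be a simple majority vote.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of nodes."""
    return Counter(predictions).most_common(1)[0][0]


def make_threshold_node(threshold):
    """Hypothetical stand-in for one trained node: predicts 1 if x >= threshold."""
    return lambda x: 1 if x >= threshold else 0


# Nine nodes, each with a different (made-up) parameter setting.
nodes = [make_threshold_node(0.1 * k) for k in range(1, 10)]


def classify_stream(stream):
    """Classify each arriving instance exactly once, in arrival order."""
    for x in stream:
        yield majority_vote([node(x) for node in nodes])


labels = list(classify_stream([0.05, 0.45, 0.55, 0.95]))
```

With an odd number of nodes and two classes, the vote can never tie, which is one plausible reason to use nine nodes rather than an even count.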
53

Modélisation et implémentation de parallélisme implicite pour les simulations scientifiques basées sur des maillages / Model and implementation of implicit parallelism for mesh-based scientific simulations

Coullon, Hélène 29 September 2014
Parallel scientific computing is an expanding field of computer science that speeds up long computations and makes it possible to tackle larger or more precise problems. Scientific computations can therefore go further, yielding more accurate results and covering larger physical domains than before. In the particular case of scientific numerical simulation, solving partial differential equations (PDEs) is especially demanding and a natural candidate for parallel computation. On the one hand, powerful parallel machines and clusters are increasingly accessible to scientists; on the other hand, parallel programming remains hard to democratize, and most scientists are not able to use these machines. As a result, high-level programming models, frameworks, libraries, and languages have been proposed to hide the technical details of parallel programming. However, in this "implicit parallelism" field, it is difficult to find the right level of abstraction for scientists while keeping the programming effort low. This thesis proposes a model for writing implicit parallelism solutions for numerical simulations such as mesh-based PDE computations. This model is called "Structured Implicit Parallelism for scientific SIMulations" (SIPSim) and proposes an approach at the crossroads of several types of abstraction, attempting to retain the advantages of each. A first implementation of this model is proposed as a C++ library called SkelGIS, for two-dimensional Cartesian meshes. A second implementation of the model, an extension of SkelGIS, proposes an implicit parallelism solution for simulations on networks (which allow simulations involving multiple physical phenomena), and is studied in detail.
The performance of both implementations is evaluated on real and complex application cases and demonstrates that the SIPSim model can be implemented efficiently.
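To make the notion of implicit parallelism on a two-dimensional Cartesian mesh more concrete, the following Python sketch separates the sequential per-cell code a scientist would write from the routine that distributes the work. It does not reproduce the SkelGIS API, which is a C++ library not shown in this abstract: `apply_stencil`, `average_of_neighbours`, and the row-wise thread pool are assumptions chosen for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def apply_stencil(mesh, kernel, workers=2):
    """Apply kernel(mesh, i, j) to every interior cell; boundary cells are copied.

    Rows are handed to a thread pool, so the caller never writes parallel code.
    """
    rows, cols = len(mesh), len(mesh[0])

    def compute_row(i):
        if i == 0 or i == rows - 1:
            return list(mesh[i])
        interior = [kernel(mesh, i, j) for j in range(1, cols - 1)]
        return [mesh[i][0]] + interior + [mesh[i][cols - 1]]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves row order; workers only read the old mesh.
        return list(pool.map(compute_row, range(rows)))


def average_of_neighbours(mesh, i, j):
    """Sequential user code: one Jacobi step for the Laplace equation."""
    return 0.25 * (mesh[i - 1][j] + mesh[i + 1][j] + mesh[i][j - 1] + mesh[i][j + 1])


mesh = [
    [0.0, 0.0, 0.0, 0.0],
    [4.0, 0.0, 0.0, 4.0],
    [4.0, 0.0, 0.0, 4.0],
    [0.0, 0.0, 0.0, 0.0],
]
updated = apply_stencil(mesh, average_of_neighbours)
```

The threads here only illustrate the separation of concerns; in CPython they would not speed up this pure-Python kernel, and a real library would distribute the mesh across processes or machines.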
54

Métodos e softwares para análise da produção científica e detecção de frentes emergentes de pesquisa / Methods and software for scientific production analysis and detection of emerging research trends

REIS JUNIOR, JOSE S.B. 21 December 2016
Progress in earlier projects highlighted the need to address the problem of software for detecting emerging research and development trends from databases of scientific publications. A lack of efficient computational applications dedicated to this purpose became evident, even though such tools are very useful for better planning of research and development programs in institutions. A review of the currently available software was therefore carried out in order to delineate clearly the opportunity to develop new tools. As a result, an application called Citesnake was implemented, designed specifically to support the detection and study of emerging trends through the analysis of networks of various types extracted from scientific databases. Using this robust and effective computational tool, analyses of emerging research and development fronts were conducted in the area of Generation IV Nuclear Energy Systems, in order to identify, among the reactor types selected as the most promising by the GIF (Generation IV International Forum), those that have developed the most over the last ten years and that currently appear the most capable of fulfilling the promises made about their innovative concepts. / Dissertation (Master's in Nuclear Technology) / IPEN/D / Instituto de Pesquisas Energéticas e Nucleares - IPEN-CNEN/SP
55

Studies on two specific inverse problems from imaging and finance

Rückert, Nadja 20 July 2012
This thesis deals with regularization parameter selection methods in the context of Tikhonov-type regularization with Poisson distributed data, in particular the reconstruction of images, as well as with the identification of the volatility surface from observed option prices. In Part I we examine the choice of the regularization parameter when reconstructing an image, which is disturbed by Poisson noise, with Tikhonov-type regularization. This type of regularization is a generalization of the classical Tikhonov regularization in the Banach space setting and often called variational regularization. After a general consideration of Tikhonov-type regularization for data corrupted by Poisson noise, we examine the methods for choosing the regularization parameter numerically on the basis of two test images and real PET data. In Part II we consider the estimation of the volatility function from observed call option prices with the explicit formula which has been derived by Dupire using the Black-Scholes partial differential equation. The option prices are only available as discrete noisy observations so that the main difficulty is the ill-posedness of the numerical differentiation. Finite difference schemes, as regularization by discretization of the inverse and ill-posed problem, do not overcome these difficulties when they are used to evaluate the partial derivatives. Therefore we construct an alternative algorithm based on the weak formulation of the dual Black-Scholes partial differential equation and evaluate the performance of the finite difference schemes and the new algorithm for synthetic and real option prices.
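The ill-posedness of numerical differentiation mentioned in this abstract can be seen in a small Python experiment. This is a sketch under stated assumptions, not the thesis's setup: `sin(x)` stands in for the price curve, the noise is a deterministic worst-case perturbation of size `delta` with alternating sign, and `second_difference` and `max_error` are illustrative names. The second difference is used because a second derivative with respect to the strike appears in Dupire's formula.

```python
import math


def second_difference(values, h):
    """Central second difference on a uniform grid (interior points only)."""
    return [
        (values[i + 1] - 2.0 * values[i] + values[i - 1]) / (h * h)
        for i in range(1, len(values) - 1)
    ]


def max_error(h, delta):
    """Largest error of the second difference of sin(x) on [0, 1]
    when sample i is perturbed by delta * (-1)**i."""
    n = int(round(1.0 / h)) + 1
    xs = [i * h for i in range(n)]
    noisy = [math.sin(x) + delta * (-1) ** i for i, x in enumerate(xs)]
    approx = second_difference(noisy, h)
    exact = [-math.sin(x) for x in xs[1:-1]]
    return max(abs(a - e) for a, e in zip(approx, exact))


# Same noise level, three grid spacings.
errors = {h: max_error(h, delta=1e-6) for h in (0.1, 0.01, 0.001)}
```

With this perturbation the noise contributes an error of about `4 * delta / h**2`, so refining the grid makes the result worse rather than better once the noise term dominates the truncation error. This is the sense in which discretization alone does not regularize the problem.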
56

Studies on two specific inverse problems from imaging and finance

Rückert, Nadja 16 July 2012
This thesis deals with regularization parameter selection methods in the context of Tikhonov-type regularization with Poisson distributed data, in particular the reconstruction of images, as well as with the identification of the volatility surface from observed option prices. In Part I we examine the choice of the regularization parameter when reconstructing an image, which is disturbed by Poisson noise, with Tikhonov-type regularization. This type of regularization is a generalization of the classical Tikhonov regularization in the Banach space setting and often called variational regularization. After a general consideration of Tikhonov-type regularization for data corrupted by Poisson noise, we examine the methods for choosing the regularization parameter numerically on the basis of two test images and real PET data. In Part II we consider the estimation of the volatility function from observed call option prices with the explicit formula which has been derived by Dupire using the Black-Scholes partial differential equation. The option prices are only available as discrete noisy observations so that the main difficulty is the ill-posedness of the numerical differentiation. Finite difference schemes, as regularization by discretization of the inverse and ill-posed problem, do not overcome these difficulties when they are used to evaluate the partial derivatives. Therefore we construct an alternative algorithm based on the weak formulation of the dual Black-Scholes partial differential equation and evaluate the performance of the finite difference schemes and the new algorithm for synthetic and real option prices.
