71

Analyse théorique et numérique de dynamiques non-réversibles en physique statistique computationnelle / Theoretical and numerical analysis of non-reversible dynamics in computational statistical physics

Roussel, Julien 27 November 2018 (has links)
This thesis deals with four topics related to non-reversible dynamics. Each is the subject of a chapter that can be read independently. The first chapter is a general introduction presenting the main questions and some major results of computational statistical physics. The second chapter concerns the numerical solution of hypoelliptic partial differential equations, i.e. equations involving a differential operator that is invertible but not coercive. We prove the consistency of the Galerkin method as well as convergence rates for the error. The analysis is also carried out for a saddle-point formulation, which turns out to be the most appropriate in the cases of interest to us. We show that our assumptions are satisfied in a simple case and numerically check our theoretical predictions on this example. In the third chapter we propose a general strategy for constructing control variates for nonequilibrium dynamics. In particular, this method reduces the variance of estimators of transport coefficients computed as ergodic averages. The variance reduction is quantified in a perturbative regime. The control variate relies on the solution of a partial differential equation; in the case of the Langevin equation this equation is hypoelliptic, which motivates the previous chapter. The proposed method is tested numerically on three examples. The fourth chapter is connected to the third since it uses the same control-variate idea. The aim is to estimate the mobility of a particle in the underdamped regime, where the dynamics is close to Hamiltonian. This work was done in collaboration with G. Pavliotis during a stay at Imperial College London. The last chapter deals with piecewise deterministic Markov processes, which allow sampling of high-dimensional measures. We prove exponential convergence to equilibrium for several dynamics of this type within a general framework covering the Zig-Zag process (ZZP), the Bouncy Particle Sampler (BPS) and Randomized Hybrid Monte Carlo (RHMC). The bounds we obtain on the convergence rate are explicit with respect to the parameters of the problem; in particular, this makes it possible to control the size of confidence intervals for empirical averages when the dimension of the underlying phase space is large. This work was done in collaboration with C. Andrieu, A. Durmus and N. Nüsken.
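The control variates above are built from the solution of a PDE associated with the dynamics; the minimal sketch below only illustrates the underlying principle on a one-dimensional overdamped Langevin trajectory, using the standard fact that the generator applied to any smooth test function has zero mean under the stationary measure. The potential, observable and test function are illustrative assumptions, not the thesis's construction.

```python
import numpy as np

# Minimal sketch, assuming a 1D overdamped Langevin dynamics discretized by
# Euler-Maruyama. Since E_pi[L psi] = 0 under the stationary (Gibbs) measure
# for any smooth test function psi, (L psi)(X_t) is a zero-mean control variate
# for the ergodic average of an observable phi. dV, phi, psi are illustrative.
rng = np.random.default_rng(0)
beta, dt, n_steps = 1.0, 1e-3, 200_000

dV  = lambda x: x + 0.4 * x**3                    # derivative of V(x) = x^2/2 + 0.1 x^4
phi = lambda x: x**2                              # observable whose mean is sought
# Test function psi(x) = x^2 and its image under the generator
# (L psi)(x) = -V'(x) psi'(x) + beta^{-1} psi''(x).
Lpsi = lambda x: -dV(x) * 2.0 * x + 2.0 / beta

x = 0.0
traj_phi = np.empty(n_steps)
traj_cv  = np.empty(n_steps)
for t in range(n_steps):
    x += -dV(x) * dt + np.sqrt(2.0 * dt / beta) * rng.standard_normal()
    traj_phi[t] = phi(x)
    traj_cv[t]  = Lpsi(x)

# Linear coefficient chosen to minimize the variance of the corrected samples.
c = np.cov(traj_phi, traj_cv)[0, 1] / traj_cv.var()
print("plain ergodic average    :", traj_phi.mean())
print("control-variate estimate :", (traj_phi - c * traj_cv).mean())
```

In practice, error bars for such ergodic averages must account for the autocorrelation of the trajectory; the sketch only compares raw sample means.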
72

[en] PROBABILISTIC LOAD FLOW VIA MONTE CARLO SIMULATION AND CROSS-ENTROPY METHOD / [pt] FLUXO DE POTÊNCIA PROBABILÍSTICO VIA SIMULAÇÃO MONTE CARLO E MÉTODO DA ENTROPIA CRUZADA

ANDRE MILHORANCE DE CASTRO 12 February 2019 (has links)
[en] In planning and operating electric energy systems, it is necessary to perform many evaluations with the power flow algorithm in order to obtain and monitor the operating point of the network under study. In its deterministic use, generation values and load levels per bus must be specified, along with a specific configuration of the power network. There is, however, an obvious limitation in running a deterministic power flow tool: it gives no perception of the impact produced by uncertainties in the input variables used by the algorithm. The probabilistic power flow (PLF) algorithm aims to overcome the limitations imposed by the deterministic tool by allowing input uncertainties to be taken into account. Greater sensitivity is obtained in the evaluation of results, as possible operating regions are examined more clearly. Consequently, the risk of the system operating outside its nominal conditions can be estimated. This dissertation proposes a methodology based on Monte Carlo simulation (MCS) using importance sampling techniques via the cross-entropy method. Risk indices for selected events (e.g., overloads on transmission equipment) are evaluated while keeping the accuracy and flexibility afforded by conventional MCS, but at a much lower computational cost. Unlike the analytical techniques designed to solve the PLF, which primarily aim at building probability density curves for the output variables (flows, etc.) and whose accuracy must always be benchmarked against MCS, the proposed method evaluates only the tail areas of these densities, obtaining more accurate results in the regions of interest from the operational-risk point of view. The proposed method is applied to the IEEE 14-bus, IEEE RTS and IEEE 118-bus systems, and the results are discussed in detail. In all cases there are clear gains in computational performance, with no loss of accuracy, when compared to conventional MCS. Possible applications of the method and future developments are also part of the dissertation.
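As a hedged illustration of the cross-entropy importance-sampling machinery described above, the toy sketch below estimates a rare exceedance probability for a linear "performance function" standing in for a monitored line flow; the Gaussian input model, the threshold and the elite fraction are illustrative assumptions, not the dissertation's network data.

```python
import numpy as np

# Toy cross-entropy (CE) adaptive importance sampling for a rare event
# P(S(X) >= gamma), with X ~ N(0, I_d). The linear score S stands in for a
# monitored line flow; d, gamma and the elite fraction rho are illustrative.
rng = np.random.default_rng(1)
d, gamma, rho, n_ce = 10, 18.0, 0.1, 2_000
S = lambda x: x.sum(axis=1)

# CE iterations: shift the sampling mean mu toward the rare-event region.
mu = np.zeros(d)
for _ in range(20):
    x = rng.standard_normal((n_ce, d)) + mu
    s = S(x)
    gamma_t = min(gamma, np.quantile(s, 1.0 - rho))      # intermediate level
    elite = x[s >= gamma_t]
    w = np.exp(-(elite @ mu) + 0.5 * mu @ mu)            # likelihood ratio vs. N(0, I)
    mu = (w[:, None] * elite).sum(axis=0) / w.sum()      # CE update of the mean
    if gamma_t >= gamma:
        break

# Final importance-sampling estimate under the tilted density N(mu, I).
x = rng.standard_normal((20_000, d)) + mu
w = np.exp(-(x @ mu) + 0.5 * mu @ mu)
p_hat = np.mean((S(x) >= gamma) * w)
print(f"CE/IS estimate of the tail probability: {p_hat:.3e}")
# Reference: S(X) ~ N(0, d), so the exact value is 1 - Phi(gamma / sqrt(d)), about 6e-9.
```

A crude Monte Carlo run of the same size would almost surely return zero hits for an event this rare, which is the gap the cross-entropy step is meant to close.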
73

FW-CADIS variance reduction in MAVRIC shielding analysis of the VHTR

Flaspoehler, Timothy Michael 27 September 2012 (has links)
In the following work, the MAVRIC sequence of the Scale6.1 code package was tested for its efficacy in calculating a wide range of shielding parameters for HTGRs. One of the NGNP designs that has gained broad international support is the VHTR. The development of the Scale6.1 code package at ORNL has been directed primarily at supporting the current United States reactor fleet, which is based on LWR technology. Since plans have been made to build a prototype VHTR, it is important to verify that the MAVRIC sequence can adequately meet the simulation needs of a different reactor technology. This was accomplished by creating a detailed model of the VHTR power plant; identifying important, relevant radiation indicators; and implementing methods in MAVRIC to simulate those indicators in the VHTR model. The graphite moderator used in the design shapes a different flux spectrum than that of water-moderated reactors. This different flux spectrum could lead to new considerations when quantifying shielding characteristics, and possibly to a different gamma-ray spectrum escaping the core and surrounding components. One key portion of this study was obtaining personnel dose rates, from both neutron and gamma sources, in accessible areas within the power plant. Additionally, building on professional and regulatory standards, a surveillance capsule monitoring program was designed to mimic those used in the nuclear industry. The VHTR's high operating temperatures are intended to supply heat for industrial purposes and not just for power production. Since tritium, a heavier radioactive isotope of hydrogen, is produced in the reactor, it is important to know the distribution of tritium production and its subsequent diffusion from the core to secondary systems, in order to prevent contamination outside of the nuclear island. Accurately modeling these indicators with MAVRIC is the main goal; however, it is almost equally important that simulations be carried out in a timely manner. MAVRIC uses the discrete ordinates method to solve the fixed-source transport equation for both neutrons and gamma rays on a coarse geometric representation of the detailed model. This deterministic forward solution is used to solve an adjoint equation with the adjoint source specified by the user. The adjoint solution is then used to create an importance map that weights particles in a stochastic Monte Carlo simulation. The goal of this hybrid methodology is to retain accuracy and high precision while decreasing overall simulation times by orders of magnitude. The MAVRIC sequence provides a platform for quickly altering inputs, so that vastly different shielding studies can be simulated using one model with minimal effort by the user. Each shielding study required unique strategies for different regions of the VHTR plant. MAVRIC proved to be effective for each case.
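The hybrid idea, a deterministic adjoint solution turned into an importance map that biases and weights the Monte Carlo particles, can be caricatured with a toy source-biasing example. The sketch below is not MAVRIC/Scale input; the source, the "adjoint" importance and the transport stand-in are invented placeholders that only show how biased sampling plus compensating weights keeps the tally unbiased while lowering its variance.

```python
import numpy as np

# Toy CADIS-style source biasing, with invented placeholders (not Scale/MAVRIC
# data): cells are sampled proportionally to source * adjoint importance, and
# each particle carries the compensating weight analog_pdf / biased_pdf, so the
# detector-response estimate stays unbiased while its variance drops.
rng = np.random.default_rng(2)
n_cells = 50
source  = np.exp(-np.linspace(0.0, 5.0, n_cells))      # source strength per cell
adjoint = np.exp(-np.linspace(5.0, 0.0, n_cells))      # crude importance toward the detector

def tally(cell):
    # Stand-in for transporting one particle from `cell` to the detector.
    return np.exp(-0.15 * (n_cells - 1 - cell)) * rng.exponential()

p_analog = source / source.sum()
p_biased = source * adjoint / (source * adjoint).sum()

def run(p, n=20_000):
    cells = rng.choice(n_cells, size=n, p=p)
    weights = p_analog[cells] / p[cells]
    contrib = np.array([tally(c) for c in cells]) * weights
    return contrib.mean(), contrib.std(ddof=1) / np.sqrt(n)

for name, p in (("analog sampling", p_analog), ("CADIS-style biasing", p_biased)):
    est, err = run(p)
    print(f"{name:19s}: {est:.4e} +/- {err:.1e}")
```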
74

Méthode de simulation avec les variables antithétiques / Simulation method with antithetic variates

Gatarayiha, Jean Philippe 06 1900 (has links)
In this master's thesis, we consider a Monte Carlo simulation method based on antithetic variates for estimating the integral of a function f(x) over the interval (0,1], where f may be monotone, non-monotone, or otherwise difficult to simulate. The main idea of the proposed method is to subdivide the interval (0,1] into m sections, each of which is subdivided into l subintervals. The technique proceeds in several stages, and the variance decreases at each stage: the variance obtained at the k-th stage is smaller than that found at the (k-1)-th stage. This also reduces the estimation error, since the estimator of the integral of f(x) over [0,1] is unbiased. The objective is to find m, the optimal number of sections, that yields this variance reduction. / The files accompanying this document were produced with the LaTeX software, and the simulations were carried out in Splus(R).
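A minimal sketch of the antithetic-variates idea described above, with a simple monotone integrand; the combination with m equal sections loosely follows the construction in the abstract, while f, m and the sample sizes are illustrative choices.

```python
import numpy as np

# Estimate I = int_0^1 f(x) dx with plain MC, antithetic pairs (U, 1-U),
# and one antithetic pair per section after splitting (0,1] into m sections.
rng = np.random.default_rng(3)
f = lambda x: np.exp(x)                       # true value: e - 1

def plain_mc(n):
    return f(rng.random(n)).mean()

def antithetic(n):
    u = rng.random(n // 2)
    return (0.5 * (f(u) + f(1.0 - u))).mean()

def stratified_antithetic(m):
    # Antithetic pair ((j+u)/m, (j+1-u)/m) inside each section j = 0..m-1.
    j = np.arange(m)
    u = rng.random(m)
    return (0.5 * (f((j + u) / m) + f((j + 1.0 - u) / m))).mean()

print("true value            :", np.e - 1.0)
print("plain MC, n = 1000    :", plain_mc(1000))
print("antithetic, n = 1000  :", antithetic(1000))
reps = np.array([stratified_antithetic(m=50) for _ in range(200)])
print(f"sectioned antithetic  : mean {reps.mean():.6f}, std {reps.std(ddof=1):.2e}")
```

For a monotone f, the pair f(U) and f(1-U) is negatively correlated, which is what makes the antithetic estimator less variable than plain Monte Carlo at the same cost.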
75

Simulation de centres de contacts / Simulation of contact centres

Buist, Éric January 2009 (has links)
Thesis digitized by the Division de la gestion de documents et des archives, Université de Montréal
76

偏常態因子信用組合下之效率估計值模擬 / Efficient Simulation in Credit Portfolio with Skew Normal Factor

林永忠, Lin, Yung Chung Unknown Date (has links)
Under a factor model, computation of the loss density function relies on estimates of mixtures of joint default and joint survival probabilities. Monte Carlo simulation is among the most widely used computational tools for such estimation. Nevertheless, plain Monte Carlo simulation is inefficient, in particular because of the rare-event nature of the problem and the complex dependence between the defaults of multiple obligors, so a method to increase the efficiency of the estimation is necessary. Importance sampling (IS) is an attractive way to address this problem: by changing the sampling measure, IS makes the estimator more efficient, especially for complicated models. We therefore consider IS for estimating the tail probability of a skew-normal copula model. This thesis consists of two parts. First, we apply the exponential twist, a standard and effective IS technique, to the conditional probabilities and to the factors. However, this procedure does not always guarantee sufficient variance reduction, which indicates that the choice of the IS factor density needs further consideration. A variety of approaches to this problem has been proposed in the literature (Capriotti 2008; Glasserman et al. 1999; Glasserman and Li 2005). The better choices of IS density can be roughly classified into two strategies. The first depends on choosing an optimal shift, determined using different approximation methods; this strategy appears in Glasserman et al. (1999) and Glasserman and Li (2005). The second strategy, as in Capriotti (2008), considers a family of factor probability densities depending on a set of real parameters; by formulating a nonlinear optimization problem, an IS density that is not limited to a shift is then determined. The search for the optimal parameters, however, incurs another efficiency problem: to keep the method efficient, particular care must be taken to estimate the parameters robustly in a preliminary Monte Carlo simulation, which makes the method more complicated. In this thesis, we describe an alternative strategy that is straightforward and flexible enough to be applied in a Monte Carlo setting. Our algorithm is limited neither to the determination of an optimal drift, as in the Gaussian copula model, nor to the estimation of the parameters of the factor density. Exploiting a concept developed for basket default swap valuation in Chiang, Yueh, and Hsie (2007), we provide a reasonable guess of the optimal sampling density and establish a way, different from stochastic approximation, to speed up the simulation. Finally, we provide theoretical support for the single-factor model and take the approach a step further to the multifactor case, obtaining a rough but fast approximation that is executed entirely with Monte Carlo in general situations. We support our approach with several portfolio examples; numerical results show that the algorithm is far more efficient than plain Monte Carlo simulation.
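The exponential twisting of the conditional default probabilities mentioned in the first part can be sketched on a plain Gaussian one-factor copula with a homogeneous portfolio, used here as a hedged stand-in for the skew-normal factor model; the portfolio size, default probability, correlation and loss threshold below are illustrative.

```python
import numpy as np
from scipy.stats import norm

# Exponential twisting of the conditional default probabilities in a Gaussian
# one-factor copula with a homogeneous portfolio (illustrative stand-in for the
# skew-normal factor model; n, p_bar, rho and the loss threshold x are made up).
rng = np.random.default_rng(4)
n, p_bar, rho, x = 100, 0.01, 0.2, 25

def cond_pd(z):
    # Default probability of each obligor conditional on the common factor z.
    return norm.cdf((norm.ppf(p_bar) - np.sqrt(rho) * z) / np.sqrt(1.0 - rho))

def estimate(n_samples, twist):
    vals = np.empty(n_samples)
    for k in range(n_samples):
        z = rng.standard_normal()
        p = cond_pd(z)
        q = max(p, x / n) if twist else p       # twist so that E[L | z] reaches x
        L = rng.binomial(n, q)                  # number of defaults under the twisted law
        # Exact likelihood ratio of the original vs. twisted conditional law of L.
        lr = (p / q) ** L * ((1.0 - p) / (1.0 - q)) ** (n - L)
        vals[k] = (L >= x) * lr
    return vals.mean(), vals.std(ddof=1) / np.sqrt(n_samples)

for label, tw in (("plain Monte Carlo", False), ("exponential twisting", True)):
    est, err = estimate(20_000, tw)
    print(f"{label:20s}: {est:.3e} +/- {err:.1e}")
```

Note that the common factor itself is left untwisted in this sketch, which mirrors the observation above that twisting the conditional law alone may not yield enough variance reduction and motivates the careful choice of an IS factor density.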
77

Méthodes de Monte Carlo stratifiées pour l'intégration numérique et la simulation numériques / Stratified Monte Carlo methods for numerical integration and simulation

Fakhereddine, Rana 26 September 2013 (has links)
Monte Carlo (MC) methods are numerical methods that use random numbers to solve, on computers, problems from applied science and engineering. A quantity is estimated by repeated evaluations using N values, and the error of the method is approximated through the variance of the estimator. In the present work, we analyze variance reduction methods and test their efficiency for numerical integration and for solving differential or integral equations. First, we present stratified MC methods and the Latin Hypercube Sampling (LHS) technique. Among stratification strategies, we focus on the simple approach (MCS): the unit hypercube I^s := [0,1)^s is divided into N subcubes having the same measure, and one random point is chosen in each subcube. We analyze the variance of these methods for the problem of numerical quadrature; the case of evaluating the measure of a subset of I^s is examined in particular detail. The variance of the MCS method can be bounded by O(1/N^(1+1/s)). Numerical experiments in dimensions 2, 3 and 4 show that the upper bounds are tight. We next propose a hybrid method between MCS and LHS that has the properties of both approaches, with one random point in each subcube and such that the projections of the points on each coordinate axis are also evenly distributed: one projection in each of the N subintervals that uniformly divide the unit interval I := [0,1). We call this technique Sudoku Sampling (SS). Conducting the same analysis as before, we show that the variance of the SS method is also bounded by O(1/N^(1+1/s)); the order of the bound is validated by numerical experiments in dimensions 2, 3 and 4. We then present an approach to the random walk method using the variance reduction techniques analyzed previously. We propose an algorithm for solving the diffusion equation with a constant or spatially varying diffusion coefficient. Particles are sampled from the initial distribution and undergo a Gaussian move in each time step. The particles are reordered according to their positions in every step, and the random numbers that give the displacements are replaced by the stratified points used above. The improvement brought by this technique is evaluated in numerical experiments. An analogous approach is finally used for numerically solving the coagulation equation, which models the evolution of the sizes of particles that may agglomerate. The particles are first sampled from the initial size distribution. A time step is fixed and, in every step and for each particle, a coalescence partner is chosen and a random number decides whether coalescence occurs. If the particles are ordered by increasing size in every time step and the random numbers are replaced by stratified points, a variance reduction is observed compared with the usual MC algorithm.
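A minimal sketch, under illustrative assumptions, of the estimators compared in the first part: crude Monte Carlo, the simple stratification MCS (one point per equal-measure subcube) and Latin Hypercube Sampling, applied to the measure of a subset of the unit square. The disc used as the test subset and the grid resolution are arbitrary choices; the Sudoku-stratified hybrid itself is not reproduced.

```python
import numpy as np

# Crude MC vs. simple stratification (MCS) vs. Latin Hypercube Sampling for
# estimating the measure of a disc in the unit square. N = k**s points each.
rng = np.random.default_rng(5)
s, k = 2, 32
N = k ** s
inside = lambda pts: ((pts - 0.5) ** 2).sum(axis=1) <= 0.25 ** 2

def crude():
    return inside(rng.random((N, s))).mean()

def mcs():
    # One uniform point in each of the N = k^s subcubes.
    grid = np.stack(np.meshgrid(*[np.arange(k)] * s, indexing="ij"), axis=-1).reshape(-1, s)
    return inside((grid + rng.random((N, s))) / k).mean()

def lhs():
    # One point per slab along every coordinate axis (N slabs per axis).
    pts = np.empty((N, s))
    for j in range(s):
        pts[:, j] = (rng.permutation(N) + rng.random(N)) / N
    return inside(pts).mean()

true = np.pi * 0.25 ** 2
for name, fn in (("crude MC", crude), ("stratified (MCS)", mcs), ("LHS", lhs)):
    errs = np.array([fn() - true for _ in range(50)])
    print(f"{name:16s}: RMS error {np.sqrt((errs ** 2).mean()):.2e}")
```

For an indicator function like this one, only the subcubes crossed by the boundary contribute to the variance, which is what drives the O(1/N^(1+1/s)) rate quoted above.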
78

Dataset selection for aggregate model implementation in predictive data mining

Lutu, P.E.N. (Patricia Elizabeth Nalwoga) 15 November 2010 (has links)
Data mining has become a commonly used method for the analysis of organisational data, for purposes of summarizing data in useful ways and identifying non-trivial patterns and relationships in the data. Given the large volumes of data that are collected by business, government, non-government and scientific research organisations, a major challenge for data mining researchers and practitioners is how to select relevant data for analysis in sufficient quantities in order to meet the objectives of a data mining task. This thesis addresses the problem of dataset selection for predictive data mining. Dataset selection was studied in the context of aggregate modeling for classification. The central argument of this thesis is that, for predictive data mining, it is possible to systematically select many dataset samples and employ different approaches (different from current practice) to feature selection, training dataset selection, and model construction. When a large amount of the information in a large dataset is utilised in the modeling process, the resulting models will have a high level of predictive performance and should be more reliable. Aggregate classification models, also known as ensemble classifiers, have been shown to provide a high level of predictive accuracy on small datasets. Such models are known to achieve a reduction in the bias and variance components of the prediction error of a model. The research for this thesis was aimed at the design of aggregate models and the selection of training datasets from large amounts of available data. The objectives of the model design and dataset selection were to reduce the bias and variance components of the prediction error of the aggregate models. Design science research was adopted as the research paradigm. Large datasets obtained from the UCI KDD Archive were used in the experiments. Two classification algorithms were used: See5 for classification tree modeling, and K-Nearest Neighbour. The two methods of aggregate modeling that were studied are One-Vs-All (OVA) and positive-Vs-negative (pVn) modeling. While OVA is an existing method that has been used for small datasets, pVn is a new method of aggregate modeling proposed in this thesis. Methods for feature selection and for training dataset selection from large datasets, for OVA and pVn aggregate modeling, were studied. The feature selection experiments revealed that the use of many samples, robust measures of correlation, and validation procedures results in the reliable selection of relevant features for classification. A new algorithm for feature subset search, based on the decision rule-based approach to heuristic search, was designed, and its performance was compared to two existing algorithms for feature subset search. The experimental results revealed that the new algorithm makes better decisions for feature subset search. The information provided by a confusion matrix was used as the basis for the design of the OVA and pVn base models, which are then combined into one aggregate model. A new construct called a confusion graph was used in conjunction with new algorithms for the design of pVn base models. A new algorithm for combining base model predictions and resolving conflicting predictions was designed and implemented. Experiments to study the performance of the OVA and pVn aggregate models revealed that the aggregate models provide a high level of predictive accuracy compared to single models.
Finally, theoretical models to depict the relationships between the factors that influence feature selection and training dataset selection for aggregate models are proposed, based on the experimental results. / Thesis (PhD)--University of Pretoria, 2010. / Computer Science / unrestricted
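A hedged sketch of One-Vs-All aggregation with k-Nearest-Neighbour base models, one of the two classifiers used in the thesis above. The synthetic dataset, the value of k and the combination rule (argmax of the positive-class probabilities) are illustrative choices; the thesis's pVn design built from a confusion graph is not reproduced here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# One binary kNN base model per class ("class c vs. the rest"), combined by
# taking the class whose base model is most confident.
X, y = make_classification(n_samples=3000, n_features=20, n_informative=8,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

classes = np.unique(y_tr)
base_models = {}
for c in classes:
    m = KNeighborsClassifier(n_neighbors=15)
    m.fit(X_tr, (y_tr == c).astype(int))          # binary target: class c vs. rest
    base_models[c] = m

# Aggregate prediction from the base models' positive-class probabilities.
scores = np.column_stack([base_models[c].predict_proba(X_te)[:, 1] for c in classes])
y_pred = classes[scores.argmax(axis=1)]
print("OVA aggregate accuracy:", round((y_pred == y_te).mean(), 3))

# Single multiclass kNN model for comparison.
single = KNeighborsClassifier(n_neighbors=15).fit(X_tr, y_tr)
print("single-model accuracy :", round(single.score(X_te, y_te), 3))
```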
79

Mathematical modelling and numerical simulation in materials science / Modélisation mathématique et simulation numérique en science des matériaux

Boyaval, Sébastien 16 December 2009 (has links)
In a first part, we study numerical schemes using the finite element method to discretize the Oldroyd-B system of equations, which models a viscoelastic fluid under no-flow boundary conditions in a two- or three-dimensional bounded domain. The goal is to obtain schemes that are stable in the sense that they dissipate a free energy, thereby mimicking thermodynamic dissipation properties similar to those identified for smooth solutions of the continuous model. This study adds to numerous previous works on the instabilities observed in the numerical simulation of viscoelastic fluids (in particular those known as High Weissenberg Number Problems). To our knowledge, this is the first study that rigorously considers numerical stability in the sense of energy dissipation for Galerkin discretizations. In a second part, we adapt and use ideas of a numerical method initially developed in the works of Y. Maday, A. T. Patera et al., the reduced-basis method, in order to efficiently simulate some multiscale models. The principle is to numerically approximate each element of a parametrized family of complicated objects in a Hilbert space by the closest linear combination within the best linear subspace spanned by a few well-chosen elements of the same parametrized family. We apply this principle to numerical problems linked to: the numerical homogenization of second-order scalar elliptic equations with two-scale oscillating diffusion coefficients; the propagation of uncertainty (computation of the mean and the variance) in an elliptic problem with stochastic coefficients (a bounded random field in a boundary condition of the third type); and the Monte Carlo computation of the expectations of numerous parametrized random variables, in particular functionals of parametrized Itô stochastic processes close to those encountered in micro-macro models of polymeric fluids, with a control variate to reduce the variance. In each application, the goal of the reduced-basis approach is to speed up the computations without any loss of precision.
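The reduced-basis principle stated above can be sketched with a greedy snapshot selection on a toy parametrized family; the one-parameter profile below is an illustrative stand-in, not one of the thesis's multiscale models.

```python
import numpy as np

# Greedy reduced basis: pick a few "snapshot" parameters, then approximate every
# other member of the parametrized family by orthogonal projection onto the span
# of those snapshots. The family u(mu) is an illustrative toy.
grid = np.linspace(0.0, 1.0, 200)
u = lambda mu: np.exp(-mu * grid) * np.sin(np.pi * grid)     # parametrized "solution"
train_params = np.linspace(0.1, 10.0, 100)
snapshots = np.array([u(m) for m in train_params])

def project(basis, v):
    # Orthogonal projection of v onto the span of the orthonormal basis rows.
    return basis.T @ (basis @ v)

basis = np.empty((0, grid.size))
for _ in range(6):                                           # build a 6-dimensional reduced space
    residuals = snapshots - (project(basis, snapshots.T).T if len(basis) else 0.0)
    worst = np.linalg.norm(residuals, axis=1).argmax()       # greedy: worst-approximated parameter
    new = residuals[worst]
    basis = np.vstack([basis, new / np.linalg.norm(new)])    # Gram-Schmidt step

# Accuracy of the reduced basis on unseen parameter values.
test_params = np.random.default_rng(6).uniform(0.1, 10.0, 20)
errs = [np.linalg.norm(u(m) - project(basis, u(m))) / np.linalg.norm(u(m)) for m in test_params]
print(f"reduced basis of size {len(basis)}: max relative error {max(errs):.2e}")
```

The same offline/online split underlies the applications listed above: the expensive snapshots are computed once, and every new parameter value is then handled in the small reduced space.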
80

[pt] APLICAÇÕES DO MÉTODO DA ENTROPIA CRUZADA EM ESTIMAÇÃO DE RISCO E OTIMIZAÇÃO DE CONTRATO DE MONTANTE DE USO DO SISTEMA DE TRANSMISSÃO / [en] CROSS-ENTROPY METHOD APPLICATIONS TO RISK ESTIMATE AND OPTIMIZATION OF AMOUNT OF TRANSMISSION SYSTEM USAGE

23 November 2021 (has links)
[en] Local power distribution companies are not self-sufficient in electricity to serve their customers and need to import additional supply from the interconnected bulk power system. In Brazil, they annually carry out the process of contracting the amount of transmission system usage (ATSU) for the next four years. This process is a real example of a task that involves decisions under uncertainty with a high impact on the productivity of the distribution companies and on the electricity sector in general. The task becomes even more complex in the face of the increasing variability associated with renewable generation and changing consumer profiles. The ATSU is a random variable, and being able to understand its variability is crucial for better decision making. Probabilistic power flow is a technique that maps the uncertainties of nodal injections and network configuration onto the transmission equipment and, consequently, onto the power imported at each connection point with the bulk power system. The main objective of this thesis is to develop methodologies based on probabilistic power flow via Monte Carlo simulation, combined with cross-entropy techniques, to assess the risks involved in the optimal contracting of the ATSU. The proposed approaches allow commercial software to be used for the power flow algorithm, which is relevant for large practical systems. A practical computational tool that serves the engineers of electric distribution companies is thus presented. Results with academic and real systems show that the proposals fulfill the stated objectives, with benefits in reducing both the total costs of the contract optimization process and the computational times involved in the risk assessments.
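A hedged sketch of the probabilistic-power-flow idea behind such ATSU risk estimates: uncertain nodal injections are sampled and mapped to the power imported at a connection point, here through a made-up linear sensitivity vector standing in for a full power-flow solver; all figures are illustrative.

```python
import numpy as np

# Sample uncertain loads and intermittent renewable output, map them to the
# import at one connection point, and read off the exceedance risk for a
# candidate contracted amount. Sensitivities and magnitudes are invented.
rng = np.random.default_rng(7)
n_buses, n_samples = 30, 100_000

mean_load = rng.uniform(5.0, 20.0, n_buses)                 # MW per bus
std_load  = 0.15 * mean_load
renew_cap = rng.uniform(0.0, 10.0, n_buses) * (rng.random(n_buses) < 0.3)
sens      = rng.uniform(0.5, 1.0, n_buses)                  # import sensitivity per bus

load = rng.normal(mean_load, std_load, (n_samples, n_buses))
gen  = renew_cap * rng.random((n_samples, n_buses))         # intermittent renewable output
imports = (load - gen) @ sens                               # MW imported at the connection point

contract = 1.05 * imports.mean()                            # a candidate contracted amount
print(f"mean import                 : {imports.mean():7.1f} MW")
print(f"95% quantile of the import  : {np.quantile(imports, 0.95):7.1f} MW")
print(f"P(import > contracted value): {np.mean(imports > contract):.3f}")
```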
