11 |
Phenomenological structure for large deviation principle in time-series statistics / 時系列統計における大偏差原理の現象論的構造
Nemoto, Takahiro, 23 March 2015 (has links)
Kyoto University / 0048 / New-system doctoral program / Doctor of Science / Degree No. Kō 18783 / Rigaku-Hakushi No. 4041 / 新制||理||1582 (University Library) / 31734 / Division of Physics and Astrophysics, Graduate School of Science, Kyoto University / (Chief examiner) Professor Shin-ichi Sasa, Associate Professor Shigeru Shinomoto, Associate Professor Shinji Takesue / Qualified under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Science / Kyoto University / DFAM
12 |
Computer simulation of the homogeneous nucleation of ice
Reinhardt, Aleks, January 2013 (has links)
In this work, we wish to determine the free energy landscape and the nucleation rate associated with the process of homogeneous ice nucleation. To do this, we simulate the homogeneous nucleation of ice with the mW monatomic model of water and with all-atom models of water using primarily the umbrella sampling rare event method. We find that the use of the mW model of water, which has simpler dynamics compared to all-atom models of water, but is nevertheless surprisingly good at reproducing experimental data, results in very reasonable agreement with classical nucleation theory, in contrast to some previous simulations of homogeneous ice nucleation. We suggest that previous simulations did not observe the lowest free energy pathway in order parameter space because of their use of global order parameters, leading to a deviation from classical nucleation theory predictions. Whilst monatomic water can nucleate reasonably quickly, all-atom models of water are considerably more difficult to simulate, primarily because of their slow dynamics of ice growth and the fact that standard order parameters do not work well in driving nucleation when such models are being used. In this thesis, we describe a local, rotationally invariant order parameter that is capable of growing ice homogeneously in a biassed simulation without the unnatural effects introduced by global order parameters, and without leading to non-physical chain-like growth of 'ice' clusters that results from a naïve implementation of the standard Steinhardt-Ten Wolde order parameter. We have successfully used this order parameter to force the growth of ice clusters in simulations of all-atom models of water. However, although ice growth can be achieved, equilibrating simulations with all-atom models of water is extremely difficult. We describe several approaches to speeding up the equilibration in all-atom models of water to enable the computation of free energy profiles for homogeneous ice nucleation.
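The local, rotationally invariant order parameter developed here builds on the Steinhardt-ten Wolde bond order parameters. As a rough illustration of that ingredient only (not the thesis's exact definition), the following is a minimal numpy/scipy sketch of the per-particle Steinhardt q_l; the cutoff radius, the choice l = 6, and the all-pairs neighbour search are placeholder assumptions.

```python
import numpy as np
from scipy.special import sph_harm

def steinhardt_q(positions, l=6, r_cut=1.5):
    """Per-particle Steinhardt q_l from all neighbours within r_cut (no periodic boundaries)."""
    n = len(positions)
    q = np.zeros(n)
    for i in range(n):
        d = positions - positions[i]
        r = np.linalg.norm(d, axis=1)
        mask = (r > 1e-9) & (r < r_cut)
        if not mask.any():
            continue
        dx, dy, dz = d[mask].T
        rr = r[mask]
        polar = np.arccos(np.clip(dz / rr, -1.0, 1.0))        # polar angle of each bond
        azim = np.mod(np.arctan2(dy, dx), 2.0 * np.pi)        # azimuthal angle of each bond
        # q_lm(i): average of the spherical harmonics over the bonds of particle i
        qlm = np.array([sph_harm(m, l, azim, polar).mean() for m in range(-l, l + 1)])
        q[i] = np.sqrt(4.0 * np.pi / (2 * l + 1) * np.sum(np.abs(qlm) ** 2))
    return q

# Example on random coordinates; in practice the positions come from a simulation snapshot.
pos = np.random.default_rng(0).uniform(0.0, 10.0, size=(200, 3))
print(steinhardt_q(pos).round(3))
```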
13 |
Advanced Monte Carlo Methods with Applications in Finance
Joshua Chi Chun Chan, Unknown Date (has links)
The main objective of this thesis is to develop novel Monte Carlo techniques with emphasis on various applications in finance and economics, particularly in the fields of risk management and asset returns modeling. New stochastic algorithms are developed for rare-event probability estimation, combinatorial optimization, parameter estimation and model selection. The contributions of this thesis are fourfold.

Firstly, we study an NP-hard combinatorial optimization problem, the Winner Determination Problem (WDP) in combinatorial auctions, where buyers can bid on bundles of items rather than bidding on them sequentially. We present two randomized algorithms, namely, the cross-entropy (CE) method and the ADAptive Multilevel splitting (ADAM) algorithm, to solve two versions of the WDP. Although an efficient deterministic algorithm has been developed for one version of the WDP, it is not applicable to the other version considered. In addition, the proposed algorithms are straightforward and easy to program, and do not require specialized software.

Secondly, two major applications of conditional Monte Carlo for estimating rare-event probabilities are presented: a complex bridge network reliability model and several generalizations of the widely popular normal copula model used in managing portfolio credit risk. We show how certain efficient conditional Monte Carlo estimators developed for simple settings can be extended to handle complex models involving hundreds or thousands of random variables. In particular, by utilizing an asymptotic description of how the rare event occurs, we derive algorithms that are not only easy to implement, but also compare favorably to existing estimators.

Thirdly, we make a contribution on the methodological front by proposing an improvement of the standard CE method for estimation. The improved method is relevant, as recent research has shown that in some high-dimensional settings the likelihood ratio degeneracy problem becomes severe and the importance sampling estimator obtained from the CE algorithm becomes unreliable. In contrast, the performance of the improved variant does not deteriorate as the dimension of the problem increases. Its utility is demonstrated via a high-dimensional estimation problem in risk management, namely, a recently proposed t-copula model for credit risk. We show that even in this high-dimensional model that involves hundreds of random variables, the proposed method performs remarkably well, and compares favorably to existing importance sampling estimators. Furthermore, the improved CE algorithm is then applied to estimating the marginal likelihood, a quantity that is fundamental in Bayesian model comparison and Bayesian model averaging. We present two empirical examples to demonstrate the proposed approach. The first example involves women's labor market participation, where we compare three different binary response models in order to find the one that best fits the data. The second example utilizes two vector autoregressive (VAR) models to analyze the interdependence and structural stability of four U.S. macroeconomic time series: GDP growth, unemployment rate, interest rate, and inflation.

Lastly, we contribute to the growing literature on asset returns modeling by proposing several novel models that explicitly take into account various recent findings in the empirical finance literature. Specifically, two classes of stylized facts are particularly important. The first set is concerned with the marginal distributions of asset returns.
One prominent feature of asset returns is that the tails of their distributions are heavier than those of the normal---large returns (in absolute value) occur much more frequently than one might expect from a normally distributed random variable. Another robust empirical feature of asset returns is skewness, where the tails of the distributions are not symmetric---losses are observed more frequently than large gains. The second set of stylized facts is concerned with the dependence structure among asset returns. Recent empirical studies have cast doubts on the adequacy of the linear dependence structure implied by the multivariate normal specification. For example, data from various asset markets, including equities, currencies and commodities markets, indicate the presence of extreme co-movement in asset returns, and this observation is again incompatible with the usual assumption that asset returns are jointly normally distributed. In light of the aforementioned empirical findings, we consider various novel models that generalize the usual normal specification. We develop efficient Markov chain Monte Carlo (MCMC) algorithms to estimate the proposed models. Moreover, since the number of plausible models is large, we perform a formal Bayesian model comparison to determine the model that best fits the data. In this way, we can directly compare the two approaches of modeling asset returns: copula models and the joint modeling of returns.
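The cross-entropy method that recurs throughout this abstract can be illustrated with a minimal sketch (not the thesis's implementation, nor the improved variant it proposes) that estimates P(X_1 + ... + X_d >= gamma) for independent standard normal inputs; the target, sample sizes, and the normal importance sampling family are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def ce_rare_event(gamma=25.0, d=10, n=10_000, rho=0.1, iters=20):
    """Cross-entropy importance sampling sketch for l = P(X_1 + ... + X_d >= gamma), X_i ~ N(0, 1)."""
    mu = np.zeros(d)                                   # mean of the normal importance density
    for _ in range(iters):
        x = rng.normal(mu, 1.0, size=(n, d))           # sample from N(mu, I)
        s = x.sum(axis=1)
        level = min(gamma, np.quantile(s, 1.0 - rho))  # adaptive intermediate level
        elite = x[s >= level]
        # likelihood ratio N(0, I) / N(mu, I) evaluated at the elite samples
        w = np.exp(-elite @ mu + 0.5 * mu @ mu)
        mu = (w[:, None] * elite).sum(axis=0) / w.sum()  # CE update of the tilting parameter
        if level >= gamma:
            break
    x = rng.normal(mu, 1.0, size=(n, d))               # final importance sampling run
    s = x.sum(axis=1)
    w = np.exp(-x @ mu + 0.5 * mu @ mu)
    return np.mean(w * (s >= gamma))

# The exact answer is P(N(0, d) >= gamma), roughly 1e-15 here, far beyond crude Monte Carlo.
print(ce_rare_event())
```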
14 |
The Generalized Splitting method for Combinatorial Counting and Static Rare-Event Probability Estimation
Zdravko Botev, Unknown Date (has links)
This thesis is divided into two parts. In the first part we describe a new Monte Carlo algorithm for the consistent and unbiased estimation of multidimensional integrals and the efficient sampling from multidimensional densities. The algorithm is inspired by the classical splitting method and can be applied to general static simulation models. We provide examples from rare-event probability estimation, counting, optimization, and sampling, demonstrating that the proposed method can outperform existing Markov chain sampling methods in terms of convergence speed and accuracy. In the second part we present a new adaptive kernel density estimator based on linear diffusion processes. The proposed estimator builds on existing ideas for adaptive smoothing by incorporating information from a pilot density estimate. In addition, we propose a new plug-in bandwidth selection method that is free from the arbitrary normal reference rules used by existing methods. We present simulation examples in which the proposed approach outperforms existing methods in terms of accuracy and reliability.
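As a generic illustration of the splitting idea for static rare-event problems (a simplified adaptive variant with a mild level-selection bias, not the Generalized Splitting algorithm of the thesis), the following sketch estimates P(X_1 + ... + X_d >= gamma) for standard normal inputs by introducing intermediate levels and moving the conditioned population with Metropolis steps; the score function, levels, and move parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def adaptive_static_splitting(d=10, gamma=25.0, n=2000, rho=0.2, mcmc_steps=10, step=0.5):
    """Multilevel splitting sketch for P(X_1 + ... + X_d >= gamma), X ~ N(0, I_d)."""
    score = lambda z: z.sum(axis=1)
    x = rng.normal(size=(n, d))                    # initial i.i.d. population
    s = score(x)
    p_hat, level = 1.0, -np.inf
    while level < gamma:
        level = min(gamma, np.quantile(s, 1.0 - rho))
        keep = s >= level
        p_hat *= keep.mean()                       # conditional level-crossing probability
        # resample the survivors and decorrelate them with Metropolis moves
        idx = rng.integers(0, keep.sum(), size=n)
        x, s = x[keep][idx].copy(), s[keep][idx].copy()
        for _ in range(mcmc_steps):
            prop = x + step * rng.normal(size=x.shape)
            s_prop = score(prop)
            # target: N(0, I_d) restricted to the current level set {score >= level}
            log_acc = 0.5 * (np.sum(x**2, axis=1) - np.sum(prop**2, axis=1))
            acc = (np.log(rng.random(n)) < log_acc) & (s_prop >= level)
            x[acc], s[acc] = prop[acc], s_prop[acc]
    return p_hat

print(adaptive_static_splitting())
```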
15 |
Rare Events Predictions with Time Series Data / Prediktion av sällsynta händelser med tidsseriedata
Eriksson, Jonas, Kuusela, Tuomas, January 2024 (has links)
This study aims to develop models for predicting rare events, specifically elevated intracranial pressure (ICP) in patients with traumatic brain injury (TBI). Using time-series data of ICP, we created and evaluated several machine learning models, including K-Nearest Neighbors, Random Forest, and logistic regression, in order to predict ICP levels exceeding 20 mmHg, a critical threshold for medical intervention. The time-series data was segmented and transformed into a tabular format, with feature engineering applied to extract meaningful statistical characteristics. We framed the problem as a binary classification task, focusing on whether ICP levels exceeded the 20 mmHg threshold. We identified the optimal model by comparing the predictive performance of the algorithms. All models demonstrated good performance for predictions up to 30 minutes in advance, after which a significant decline in performance was observed. Within this timeframe, the models achieved Matthews Correlation Coefficient (MCC) scores ranging between 0.876 and 0.980, with the Random Forest models showing the highest performance. In contrast, logistic regression displayed a notable deviation at the 40-minute mark, recording an MCC score of 0.752. These results highlight the potential to provide reliable, real-time predictions of dangerous ICP levels up to 30 minutes in advance, which is crucial for timely and effective medical interventions.
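A minimal sketch of the windowed-feature, binary-classification setup described above, run on a synthetic stand-in signal (the real ICP recordings, feature set, and model tuning are not reproduced); the mean-reverting toy series, window lengths, and features are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import matthews_corrcoef
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)

# Synthetic stand-in for a per-minute ICP recording (mmHg): mean-reverting around 12 with noise.
n = 20_000
icp = np.empty(n)
icp[0] = 12.0
for t in range(1, n):
    icp[t] = icp[t - 1] + 0.05 * (12.0 - icp[t - 1]) + rng.normal(0.0, 0.8)

window, horizon = 60, 30          # 60-minute feature window, 30-minute prediction horizon
X, y = [], []
for t in range(window, n - horizon):
    seg = icp[t - window:t]
    X.append([seg.mean(), seg.std(), seg[-1], seg[-1] - seg[0]])   # simple tabular features
    y.append(int(icp[t:t + horizon].max() > 20.0))                 # did ICP exceed 20 mmHg?
X, y = np.array(X), np.array(y)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, shuffle=False)
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced", random_state=0)
clf.fit(X_tr, y_tr)
print("MCC:", matthews_corrcoef(y_te, clf.predict(X_te)))
```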
16 |
Analyse de sensibilité fiabiliste avec prise en compte d'incertitudes sur le modèle probabiliste - Application aux systèmes aérospatiaux / Reliability-oriented sensitivity analysis under probabilistic model uncertainty – Application to aerospace systems
Chabridon, Vincent, 26 November 2018 (has links)
Les systèmes aérospatiaux sont des systèmes complexes dont la fiabilité doit être garantie dès la phase de conception au regard des coûts liés aux dégâts gravissimes qu’engendrerait la moindre défaillance. En outre, la prise en compte des incertitudes influant sur le comportement (incertitudes dites « aléatoires » car liées à la variabilité naturelle de certains phénomènes) et la modélisation de ces systèmes (incertitudes dites « épistémiques » car liées au manque de connaissance et aux choix de modélisation) permet d’estimer la fiabilité de tels systèmes et demeure un enjeu crucial en ingénierie. Ainsi, la quantification des incertitudes et sa méthodologie associée consiste, dans un premier temps, à modéliser puis propager ces incertitudes à travers le modèle numérique considéré comme une « boîte-noire ». Dès lors, le but est d’estimer une quantité d’intérêt fiabiliste telle qu’une probabilité de défaillance. Pour les systèmes hautement fiables, la probabilité de défaillance recherchée est très faible, et peut être très coûteuse à estimer. D’autre part, une analyse de sensibilité de la quantité d’intérêt vis-à-vis des incertitudes en entrée peut être réalisée afin de mieux identifier et hiérarchiser l’influence des différentes sources d’incertitudes. Ainsi, la modélisation probabiliste des variables d’entrée (incertitude épistémique) peut jouer un rôle prépondérant dans la valeur de la probabilité obtenue. Une analyse plus profonde de l’impact de ce type d’incertitude doit être menée afin de donner une plus grande confiance dans la fiabilité estimée. Cette thèse traite de la prise en compte de la méconnaissance du modèle probabiliste des entrées stochastiques du modèle. Dans un cadre probabiliste, un « double niveau » d’incertitudes (aléatoires/épistémiques) doit être modélisé puis propagé à travers l’ensemble des étapes de la méthodologie de quantification des incertitudes. Dans cette thèse, le traitement des incertitudes est effectué dans un cadre bayésien où la méconnaissance sur les paramètres de distribution des variables d‘entrée est caractérisée par une densité a priori. Dans un premier temps, après propagation du double niveau d’incertitudes, la probabilité de défaillance prédictive est utilisée comme mesure de substitution à la probabilité de défaillance classique. Dans un deuxième temps, une analyse de sensibilité locale à base de score functions de cette probabilité de défaillance prédictive vis-à-vis des hyper-paramètres de loi de probabilité des variables d’entrée est proposée. Enfin, une analyse de sensibilité globale à base d’indices de Sobol appliqués à la variable binaire qu’est l’indicatrice de défaillance est réalisée. L’ensemble des méthodes proposées dans cette thèse est appliqué à un cas industriel de retombée d’un étage de lanceur. / Aerospace systems are complex engineering systems for which reliability has to be guaranteed at an early design phase, especially regarding the potential tremendous damage and costs that could be induced by any failure. Moreover, the management of various sources of uncertainties, either impacting the behavior of systems (“aleatory” uncertainty due to natural variability of physical phenomena) and/or their modeling and simulation (“epistemic” uncertainty due to lack of knowledge and modeling choices) is a cornerstone for reliability assessment of those systems. Thus, uncertainty quantification and its underlying methodology consists in several phases. 
Firstly, one needs to model and propagate uncertainties through the computer model, which is treated as a “black-box”. Secondly, a relevant quantity of interest with respect to the goal of the study, e.g., a failure probability here, has to be estimated. For highly safe systems, the failure probability being sought is very low and may be costly to estimate. Thirdly, a sensitivity analysis of the quantity of interest can be set up in order to better identify and rank the influential sources of input uncertainty. The probabilistic modeling of the input variables (epistemic uncertainty) might therefore strongly influence the value of the failure probability estimate obtained during the reliability analysis, and a deeper investigation of the robustness of this estimate with respect to such uncertainty has to be conducted. This thesis addresses the problem of taking the probabilistic modeling uncertainty of the stochastic inputs into account. Within the probabilistic framework, a “bi-level” input uncertainty has to be modeled and propagated throughout the different steps of the uncertainty quantification methodology. In this thesis, the uncertainties are modeled within a Bayesian framework in which the lack of knowledge about the distribution parameters is characterized by the choice of a prior probability density function. In a first phase, after propagation of the bi-level input uncertainty, the predictive failure probability is estimated and used as the reliability measure instead of the standard failure probability. In a second phase, a local reliability-oriented sensitivity analysis based on score functions is carried out to study the impact of the hyper-parameterization of the prior on the predictive failure probability estimate. Finally, a global reliability-oriented sensitivity analysis based on Sobol indices applied to the failure indicator function, adapted to the bi-level input uncertainty, is proposed. All the proposed methodologies are tested and challenged on a representative industrial aerospace test case simulating the fallout of an expendable space launcher.
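As a rough illustration of the bi-level (aleatory/epistemic) propagation and of the predictive failure probability, the following sketch uses a hypothetical one-dimensional limit state with a normal aleatory input whose mean carries a prior; the limit state, prior, and sample sizes are illustrative assumptions, not the launcher fallout model.

```python
import numpy as np

rng = np.random.default_rng(4)

def predictive_failure_probability(n_theta=200, n_x=20_000):
    """Bi-level Monte Carlo sketch of the predictive failure probability.

    Aleatory level: input X ~ N(mu, 1), failure when the limit state g(X) = 3 - X <= 0.
    Epistemic level: the distribution parameter mu is uncertain, with prior mu ~ N(0, 0.5^2).
    The predictive failure probability averages the conditional failure probability over the prior.
    """
    p_cond = np.empty(n_theta)
    for k in range(n_theta):
        mu = rng.normal(0.0, 0.5)                 # epistemic draw of the distribution parameter
        x = rng.normal(mu, 1.0, size=n_x)         # aleatory draws given that parameter
        p_cond[k] = np.mean(3.0 - x <= 0.0)       # conditional failure probability estimate
    # For truly rare failures the inner crude Monte Carlo estimator would be replaced
    # by a dedicated rare-event technique (importance sampling, subset simulation, ...).
    return p_cond.mean()

print(predictive_failure_probability())
```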
17 |
電路設計中電流值之罕見事件的統計估計探討 / A study of statistical method on estimating rare event in IC Current
彭亞凌, Peng, Ya Ling, Unknown Date (has links)
距離期望值4至6倍標準差以外的罕見機率電流值,是當前積體電路設計品質的關鍵之一,但隨著精確度的標準提升,實務上以蒙地卡羅方法模擬電路資料,因曠日廢時愈發不可行,而過去透過參數模型外插估計或迴歸分析方法,也因變數蒐集不易、操作電壓減小使得電流值尾端估計產生偏差,上述原因使得尾端電流值估計困難。因此本文引進統計方法改善罕見機率電流值的估計:先以Box-Cox轉換觀察值為近似常態,改善尾端分配值的估計,再以加權迴歸方法估計罕見電流值,其中迴歸解釋變數為Log或Z分數轉換的經驗累積機率,而加權方法採用Down-weight加重極值樣本資訊的重要性,此外,本研究也考慮能蒐集完整變數的情況,改以電路資料作為解釋變數進行加權迴歸。另一方面,本研究也採用極值理論作為估計方法。
本文先以電腦模擬評估各方法的優劣,假設母體分配為常態、T分配、Gamma分配,以均方誤差作為衡量指標,模擬結果驗證了加權迴歸方法的可行性。而後參考模擬結果決定篩選樣本方式進行實證研究,資料來源為新竹某科技公司,實證結果顯示加權迴歸配合Box-Cox轉換能以十萬筆樣本數,準確估計左、右尾機率10^(-4) 、10^(-5)、10^(-6)、10^(-7)極端電流值。其中右尾部分的加權迴歸解釋變數採用對數轉換,而左尾部分的加權迴歸解釋變數採用Z分數轉換,估計結果較為準確,又若能蒐集電路資訊作為解釋變數,在左尾部份可以有最準確的估計結果;而篩選樣本尾端1%和整筆資料的方式對於不同方法的估計準確度各有利弊,皆可考慮。另外,1%門檻值比例的極值理論能穩定且中等程度的估計不同電壓下的電流值,且有短程估計最準的趨勢。 / To obtain the tail distribution of current beyond 4 to 6 sigma is nowadays a key issue in integrated circuit (IC) design and computer simulation is a popular tool to estimate the tail values. Since creating rare events via simulation is time-consuming, often the linear extrapolation methods (such as regression analysis) are applied to enhance efficiency. However, it is shown from past work that the tail values is likely to behave differently if the operating voltage is getting lower. In this study, a statistical method is introduced to deal with the lower voltage case. The data are evaluated via the Box-Cox (or power) transformation and see if they need to be transformed into normally distributed data, following by weighted regression to extrapolate the tail values. In specific, the independent variable is the empirical CDF with logarithm or z-score transformation, and the weight is down-weight in order to emphasize the information of extreme values observations. In addition to regression analysis, Extreme Value Theory (EVT) is also adopted in the research.
Computer simulation and data sets from a well-known IC manufacturer in Hsinchu are used to evaluate the proposed method with respect to mean squared error. In the simulations, the data are assumed to be generated from a normal, Student t, or Gamma distribution. For the empirical data, there are 10^8 observations, and tail values with probabilities 10^(-4), 10^(-5), 10^(-6), 10^(-7) are set as the study goal given that only 10^5 observations are available. Compared to the traditional methods and EVT, the proposed method has the best performance in estimating the tail probabilities. If the IC current is generated from a regression equation and the independent variables are available, weighted regression achieves the best estimates for the left-tail rare events. EVT can also produce accurate estimates provided that the tail probabilities to be estimated and the number of observations available are on a similar scale, e.g., probabilities of 10^(-5) to 10^(-7) versus 10^5 observations.
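A minimal sketch of the peaks-over-threshold idea used for the EVT comparison (fit a generalized Pareto distribution to exceedances over the empirical 1% threshold and extrapolate the tail), run on synthetic heavy-tailed data; the stand-in data, threshold, and target level are illustrative assumptions, not the proprietary current measurements.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(5)

# Synthetic heavy-tailed stand-in for simulated leakage currents.
current = rng.standard_t(df=5, size=100_000)

# Peaks-over-threshold: keep exceedances over the empirical 99% quantile (the "1% threshold").
u = np.quantile(current, 0.99)
exceed = current[current > u] - u
xi, _, beta = genpareto.fit(exceed, floc=0.0)      # fit a generalized Pareto distribution

def right_tail_prob(x):
    """Estimate P(current > x) for x above the threshold u via the fitted GPD."""
    p_u = np.mean(current > u)                     # empirical exceedance probability
    return p_u * genpareto.sf(x - u, xi, loc=0.0, scale=beta)

# Extrapolate to a level lying beyond the range covered by the 10^5 samples.
print(right_tail_prob(15.0))
```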
18 |
Essays in theory of the firm and individual decision making experiments
Ursino, Giovanni, 28 October 2009 (has links)
Esta tesis se compone de dos partes separadas y sin relación entre ellas. El primer capítulo, coautorado con el Profesor Greg Barron, es un experimento en toma de decisiones individuales. Este capítulo se construye a partir de una literatura creciente, que enfatiza el siguiente punto: cuando aprendemos las probabilidades y los resultados de una lotería a través de la experiencia en vez de la descripción visual del problema -un prospecto- entonces tomamos decisiones como si estuviéramos devaluando eventos poco probables. Esto contrasta con el fenómeno bien conocido de que las probabilidades pequeñas suelen sobrevaluarse cuando se toman decisiones a partir de prospectos. Nuestro trabajo contribuye a la literatura dando fuerza al punto mencionado frente a algunas críticas. En particular, nosotros encontramos que la devaluación sobrevive la eliminación de un problema de muestreo que afectaba trabajos anteriores y está correcto en el nuestro. Encontramos también que hay devaluación de probabilidades pequeñas en toma de decisiones al mismo tiempo que sobrevaluación en juicio sobre las mismas probabilidades. Este último resultado no puede ser explicado. El segundo capítulo introduce una nueva teoría de integración vertical a partir del hecho de que aumentar el poder contractual de una empresa es citado muy a menudo como una razón para integrarse verticalmente con los proveedores. En mi modelo las empresas se integran para ganar poder contractual hacia proveedores no integrados en la cadena productiva. El coste de la integración es una pérdida de flexibilidad a la hora de escoger los proveedores más apropiados para un particular producto final. Muestro como las empresas que tienen inversiones más específicas en el proceso productivo tienen un mayor incentivo a integrarse. La teoría presentada permite explicar numerosos hechos estilizados como el efecto del desarrollo financiero sobre la estructura vertical de las empresas, la evolución que se observa de inversión extranjera directa a outsourcing en el comercio internacional, la conexión entre ciclo de vida del producto y la estructura vertical, etc. / This thesis is composed of two separate, unrelated chapters. Chapter I, coauthored with Greg Barron, is an experiment in individual decision making. It builds on a small and growing literature which makes the following point: whenever we learn the odds and outcomes of a binary choice problem through experience rather than from a visual description -a prospect- then we take decisions as if we were underweighting rare events. This is in contrast to the well known phenomenon of overweighting rare events in prospect based decisions. Our work contributes to the literature by strengthening this finding in the face of earlier criticism. In particular we find that the underweighting is robust to the elimination of sampling bias which affected previous studies and is absent from ours. We also find that underweighting in choice happens at the same time as overweighting in probability judgment. This remains unexplained. Chapter II introduces a new theory of vertical integration building on the fact that improving a company's bargaining position is often cited as a chief motivation to vertically integrate with suppliers. In my model firms integrate to gain bargaining power against other suppliers in the production process. The cost of integration is a loss of flexibility in choosing the most suitable suppliers for a particular final product.
I show that the firms that make the most specific investments in the production process have the greatest incentive to integrate. The theory provides novel insights into numerous stylized facts such as the effect of financial development on the vertical structure of firms, the observed shift from FDI to outsourcing in international trade, the connection between product cycle and vertical structure, etc.
19 |
Contributions aux méthodes de branchement multi-niveaux pour les évènements rares, et applications au trafic aérien / Contributions to multilevel splitting for rare events, and applications to air traffic
Jacquemart, Damien, 08 December 2014 (has links)
La thèse porte sur la conception et l'analyse mathématique de méthodes de Monte Carlo fiables et précises pour l'estimation de la (très petite) probabilité qu'un processus de Markov atteigne une région critique de l'espace d'état avant un instant final déterministe. L'idée sous-jacente aux méthodes de branchement multi-niveaux étudiées ici est de mettre en place une suite emboitée de régions intermédiaires de plus en plus critiques, de telle sorte qu'atteindre une région intermédiaire donnée sachant que la région intermédiaire précédente a déjà été atteinte, n'est pas si rare. En pratique, les trajectoires sont propagées, sélectionnées et répliquées dès que la région intermédiaire suivante est atteinte, et il est facile d'estimer avec précision la probabilité de transition entre deux régions intermédiaires successives. Le biais dû à la discrétisation temporelle des trajectoires du processus de Markov est corrigé en utilisant des régions intermédiaires perturbées, comme proposé par Gobet et Menozzi. Une version adaptative consiste à définir automatiquement les régions intermédiaires, à l’aide de quantiles empiriques. Néanmoins, une fois que le seuil a été fixé, il est souvent difficile voire impossible de se rappeler où (dans quel état) et quand (à quel instant) les trajectoires ont dépassé ce seuil pour la première fois, le cas échéant. La contribution de la thèse consiste à utiliser une première population de trajectoires pilotes pour définir le prochain seuil, à utiliser une deuxième population de trajectoires pour estimer la probabilité de dépassement du seuil ainsi fixé, et à itérer ces deux étapes (définition du prochain seuil, et évaluation de la probabilité de transition) jusqu'à ce que la région critique soit finalement atteinte. La convergence de cet algorithme adaptatif à deux étapes est analysée dans le cadre asymptotique d'un grand nombre de trajectoires. Idéalement, les régions intermédiaires doivent êtres définies en terme des variables spatiale et temporelle conjointement (par exemple, comme l'ensemble des états et des temps pour lesquels une fonction scalaire de l’état dépasse un niveau intermédiaire dépendant du temps). Le point de vue alternatif proposé dans la thèse est de conserver des régions intermédiaires simples, définies en terme de la variable spatiale seulement, et de faire en sorte que les trajectoires qui dépassent un seuil précocement sont davantage répliquées que les trajectoires qui dépassent ce même seuil plus tardivement. L'algorithme résultant combine les points de vue de l'échantillonnage pondéré et du branchement multi-niveaux. Sa performance est évaluée dans le cadre asymptotique d'un grand nombre de trajectoires, et en particulier un théorème central limite est obtenu pour l'erreur d'approximation relative. / The thesis deals with the design and mathematical analysis of reliable and accurate Monte Carlo methods in order to estimate the (very small) probability that a Markov process reaches a critical region of the state space before a deterministic final time. The underlying idea behind the multilevel splitting methods studied here is to design an embedded sequence of intermediate more and more critical regions, in such a way that reaching an intermediate region, given that the previous intermediate region has already been reached, is not so rare. 
In practice, trajectories are propagated, selected and replicated as soon as the next intermediate region is reached, and it is easy to accurately estimate the transition probability between two successive intermediate regions. The bias due to the time discretization of the Markov process trajectories is corrected using perturbed intermediate regions, as proposed by Gobet and Menozzi. An adaptive version would consist in designing the intermediate regions automatically, using empirical quantiles. However, it is often difficult if not impossible to remember where (in which state) and when (at which time instant) each successful trajectory reached the empirically defined intermediate region. The contribution of the thesis consists in using a first population of pilot trajectories to define the next threshold, using a second population of trajectories to estimate the probability of exceeding this empirically defined threshold, and iterating these two steps (definition of the next threshold, and evaluation of the transition probability) until the critical region is reached. The convergence of this adaptive two-step algorithm is studied in the asymptotic framework of a large number of trajectories. Ideally, the intermediate regions should be defined in terms of the spatial and temporal variables jointly (for example, as the set of states and times for which a scalar function of the state exceeds a time-dependent threshold). The alternative point of view proposed in the thesis is to keep the intermediate regions as simple as possible, defined in terms of the spatial variable only, and to make sure that trajectories that exceed a threshold at an early time instant are replicated more than trajectories that exceed the same threshold at a later time instant. The resulting algorithm combines importance sampling and multilevel splitting. Its performance is evaluated in the asymptotic framework of a large number of trajectories, and in particular a central limit theorem is obtained for the relative approximation error.
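As a rough sketch of the two-step adaptive procedure described above (pilot trajectories fix the next threshold, a second population estimates the conditional crossing probability), the following toy example estimates the probability that a discretized Ornstein-Uhlenbeck process exceeds a critical level before the final time; the process, level, and population sizes are illustrative assumptions, and the sketch omits both the discretization-bias correction and the early-crossing replication weighting developed in the thesis.

```python
import numpy as np

rng = np.random.default_rng(6)

T, dt, x0 = 1.0, 0.01, 0.0
n_steps = int(T / dt)

def run(x, k):
    """Simulate the toy process dX = -0.5 X dt + dW from state x at time-step k up to the final step."""
    path = np.empty(n_steps - k + 1)
    path[0] = x
    for i in range(1, len(path)):
        x = x - 0.5 * x * dt + np.sqrt(dt) * rng.normal()
        path[i] = x
    return path

def adaptive_two_step_splitting(B=3.0, n_pilot=200, n_est=1000, rho=0.25):
    """Estimate P(max_t X_t >= B before T): a pilot population fixes the next threshold,
    a second population estimates the conditional crossing probability, and the
    crossing states/times are reused as restart points."""
    entries = [(x0, 0)]                      # (state, time-step) pairs to restart from
    p_hat, level = 1.0, -np.inf
    while level < B:
        # 1) pilot trajectories define the next intermediate threshold
        pilot_max = [run(*entries[rng.integers(len(entries))]).max() for _ in range(n_pilot)]
        level = min(B, np.quantile(pilot_max, 1.0 - rho))
        # 2) a second population estimates P(reach the new level | reached the previous one)
        new_entries, hits = [], 0
        for _ in range(n_est):
            x, k = entries[rng.integers(len(entries))]
            path = run(x, k)
            j = int(np.argmax(path >= level))
            if path[j] >= level:             # the trajectory crossed the new level
                hits += 1
                new_entries.append((path[j], k + j))
        if hits == 0:
            return 0.0
        p_hat *= hits / n_est
        entries = new_entries
    return p_hat

print(adaptive_two_step_splitting())
```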
20 |
Rare event simulation for statistical model checking / Simulation d'événements rares pour le model checking statistique
Jegourel, Cyrille, 19 November 2014 (has links)
Dans cette thèse, nous considérons deux problèmes auxquels le model checking statistique doit faire face. Le premier concerne les systèmes hétérogènes qui introduisent complexité et non-déterminisme dans l'analyse. Le second problème est celui des propriétés rares, difficiles à observer et donc à quantifier. Pour le premier point, nous présentons des contributions originales pour le formalisme des systèmes composites dans le langage BIP. Nous en proposons une extension stochastique, SBIP, qui permet le recours à l'abstraction stochastique de composants et d'éliminer le non-déterminisme. Ce double effet a pour avantage de réduire la taille du système initial en le remplaçant par un système dont la sémantique est purement stochastique sur lequel les algorithmes de model checking statistique sont définis. La deuxième partie de cette thèse est consacrée à la vérification de propriétés rares. Nous avons proposé le recours à un algorithme original d'échantillonnage préférentiel pour les modèles dont le comportement est décrit à travers un ensemble de commandes. Nous avons également introduit les méthodes multi-niveaux pour la vérification de propriétés rares et nous avons justifié et mis en place l'utilisation d'un algorithme multi-niveau optimal. Ces deux méthodes poursuivent le même objectif de réduire la variance de l'estimateur et le nombre de simulations. Néanmoins, elles sont fondamentalement différentes, la première attaquant le problème au travers du modèle et la seconde au travers des propriétés. / In this thesis, we consider two problems that statistical model checking must cope with. The first problem concerns heterogeneous systems, which naturally introduce complexity and non-determinism into the analysis. The second problem concerns rare properties, which are difficult to observe and hence to quantify. Regarding the first point, we present original contributions to the formalism of composite systems in the BIP language. We propose SBIP, a stochastic extension, and define its semantics. SBIP allows the stochastic abstraction of components and eliminates the non-determinism. This double effect has the advantage of reducing the size of the initial system by replacing it with a system whose semantics is purely stochastic, a necessary requirement for standard statistical model checking algorithms to be applicable. The second part of this thesis is devoted to the verification of rare properties in statistical model checking. We present a state-of-the-art importance sampling algorithm for models described by a set of guarded commands. Lastly, we motivate the use of importance splitting for statistical model checking and set up an optimal splitting algorithm. Both methods pursue the common goal of reducing the variance of the estimator and the number of simulations. Nevertheless, they are fundamentally different, the first tackling the problem through the model and the second through the properties.
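To see why rare properties defeat crude Monte Carlo in statistical model checking, and hence why importance sampling and importance splitting are needed, a minimal sketch of the standard Okamoto (Chernoff-Hoeffding) bound on the required number of simulation runs is given below; the probability level and error targets are illustrative.

```python
import math

# Okamoto (Chernoff-Hoeffding) bound used in statistical model checking:
# N >= ln(2 / delta) / (2 * eps^2) crude Monte Carlo runs guarantee
# P(|p_hat - p| > eps) <= delta.
def okamoto_bound(eps, delta=0.01):
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps**2))

# For a property of probability around 1e-9, an absolute precision of 1e-10
# (10% relative error) needs an astronomical number of runs:
print(okamoto_bound(eps=1e-10))   # roughly 2.6e20 simulations
```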