561 |
Evaluation of HYDRA - A risk model for hydropower plants / Analys av HYDRA - En riskmodell för vattenkraftverk
Almgren, Lars January 2016 (has links)
Vattenfall Hydro AB operates more than 50 large-scale power plants, containing over 130 power-generating units. Planning the renewal of these units is important to minimize the risk of major breakdowns that cause long downtime. Because all power plants are different, Vattenfall Hydro AB began using a self-developed risk model in 2003 to improve comparisons between power plants. Since then the model has been used without major improvements or validation. The purpose of this study is to evaluate and analyse how well the risk model has performed and is performing. The thesis is divided into five subsections, in which analyses are made of the input to the model, the adverse events used in the model, the probabilities used in the model, the risk forecasts from the model, and finally trends for the periods during which the model has been used. Different statistical methods are applied in each subsection. The analyses make clear that the low number of adverse events in power plants makes statistical evaluation of Vattenfall Hydro AB's risk model imprecise. Based on the results of this thesis, the conclusion is that if the risk model is to be used in the future, it needs further improvement to generate more accurate results.
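The imprecision caused by rare adverse events can be illustrated with a small simulation (not from the thesis; the fleet size and the breakdown probability below are made-up stand-ins): even over a multi-year window, estimates of a rare per-unit failure probability scatter widely.

```python
import random

random.seed(1)
TRUE_P = 0.02          # hypothetical yearly breakdown probability per unit
UNITS, YEARS = 130, 5  # roughly the fleet size above, over a short window

# Repeat the observation period many times and record how the estimated
# probability scatters when events are this rare.
estimates = []
for _ in range(10_000):
    events = sum(random.random() < TRUE_P for _ in range(UNITS * YEARS))
    estimates.append(events / (UNITS * YEARS))

estimates.sort()
lo, hi = estimates[250], estimates[9750]  # empirical 95% band
print(f"true p = {TRUE_P}, 95% of estimates fall in [{lo:.4f}, {hi:.4f}]")
```

The band is wide relative to the true value, which is the core obstacle the thesis identifies for statistical validation of the risk model.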
|
562 |
Methods of high-dimensional statistical analysis for the prediction and monitoring of engine oil quality / Metoder för statistisk analys av högdimensionella data för prediktion och diagnostisering av kvalitet av motorolja
Berntsson, Fredrik January 2016 (has links)
Engine oils fill important functions in the operation of modern internal combustion engines. Many essential functions are provided by compounds that are either sacrificial or susceptible to degradation. The engine oil will eventually fail to provide these functions, possibly with irreparable damage as a result. To decide how often the oil should be changed, there are several laboratory tests for monitoring oil condition, e.g. FTIR (oxidation, nitration, soot, water), viscosity, TAN (acidity), TBN (alkalinity), ICP (elemental analysis) and GC (fuel dilution). These oil tests are, however, often labor-intensive and costly, and it would be desirable to supplement or replace some of them with simpler and faster methods. One way is to utilise the whole spectrum of the FTIR measurements already performed. FTIR is traditionally used to monitor chemical properties at specific wavelengths, but it also provides information, albeit in a more multivariate form, relevant for viscosity, TAN and TBN. In order to make use of the whole FTIR spectrum, methods capable of handling high-dimensional data have to be used. Partial Least Squares Regression (PLSR) is used to predict the relevant chemical properties. This survey also considers feature selection methods based on the second-order statistic Higher Criticism as well as hierarchical clustering. The feature selection methods are used to ease further research on how infrared data may be put to use as a tool for more automated oil analyses. Results show that PLSR may be utilised to provide reliable estimates of the mentioned chemical quantities. In addition, the mentioned feature selection methods may be applied without losing prediction power. The feature selection methods considered may also aid analysis of the engine oil itself and future work on how to utilise infrared properties in the analysis of engine oil in other situations.
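A minimal sketch of the PLSR idea on spectrum-like data (a NIPALS PLS1 implementation and synthetic "spectra"; this is not the thesis implementation, and the dimensions and factor structure below are assumptions for illustration):

```python
import numpy as np

def pls1_fit(X, y, n_comp):
    """Minimal NIPALS PLS1: regress a scalar response on high-dimensional X."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, Q = [], [], []
    for _ in range(n_comp):
        w = Xr.T @ yr                      # weight vector from covariance with y
        w /= np.linalg.norm(w)
        t = Xr @ w                         # score
        p = Xr.T @ t / (t @ t)             # X loading
        q = (yr @ t) / (t @ t)             # y loading
        Xr = Xr - np.outer(t, p)           # deflate
        yr = yr - q * t
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W), np.array(P), np.array(Q)
    B = W.T @ np.linalg.solve(P @ W.T, Q)  # coefficients in original X space
    return B, x_mean, y_mean

# Synthetic "FTIR spectra": 400 wavelengths driven by 3 latent chemical factors.
rng = np.random.default_rng(0)
n, p = 80, 400
latent = rng.normal(size=(n, 3))
peaks = rng.normal(size=(3, p))
X = latent @ peaks + 0.05 * rng.normal(size=(n, p))
y = latent @ np.array([1.0, -0.5, 0.3])    # a TAN-like scalar response

B, xm, ym = pls1_fit(X[:60], y[:60], n_comp=3)
pred = (X[60:] - xm) @ B + ym
r2 = 1 - np.sum((y[60:] - pred) ** 2) / np.sum((y[60:] - y[60:].mean()) ** 2)
print("held-out R^2:", round(float(r2), 3))
```

With far more variables (400) than samples (80), ordinary least squares is ill-posed, while the few PLS components suffice, which is why PLSR suits full-spectrum FTIR data.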
|
563 |
Necessary Optimality Conditions for Two Stochastic Control Problems
Andersson, Daniel January 2008 (has links)
This thesis consists of two papers concerning necessary conditions in stochastic control problems. In the first paper, we study the problem of controlling a linear stochastic differential equation (SDE) where the coefficients are random and not necessarily bounded. We consider relaxed control processes, i.e. the control is defined as a process taking values in the space of probability measures on the control set. The main motivation is a bond portfolio optimization problem. The relaxed control processes are then interpreted as the portfolio weights corresponding to different maturity times of the bonds. We establish existence of an optimal control and necessary conditions for optimality in the form of a maximum principle, extended to include the family of relaxed controls. In the second paper we consider the so-called singular control problem where the control consists of two components, one absolutely continuous and one singular. The absolutely continuous part of the control is allowed to enter both the drift and diffusion coefficient. The absolutely continuous part is relaxed in the classical way, i.e. the generator of the corresponding martingale problem is integrated with respect to a probability measure, guaranteeing the existence of an optimal control. This is shown to correspond to an SDE driven by a continuous orthogonal martingale measure. A maximum principle which describes necessary conditions for optimal relaxed singular control is derived. / QC 20101102
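The notion of a relaxed control described above can be sketched in formulas. The notation below is assumed for illustration, not taken from the papers, and only the drift is relaxed here; as the abstract notes, relaxing the diffusion coefficient (second paper) instead leads to an SDE driven by an orthogonal martingale measure.

```latex
% q_t is a process taking values in the probability measures on the
% control set U; a strict control u_t corresponds to q_t = \delta_{u_t}.
dX_t = \left( \int_U b(t, X_t, u)\, q_t(\mathrm{d}u) \right) dt
     + \sigma(t, X_t)\, dW_t,
\qquad
J(q) = \mathbb{E}\left[ \int_0^T \!\!\int_U f(t, X_t, u)\, q_t(\mathrm{d}u)\, dt
     + g(X_T) \right].
```

Relaxation convexifies the set of admissible controls, which is what makes the existence of an optimal control provable.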
|
564 |
Generative models of limit order books
Hultin, Hanna January 2021 (has links)
In this thesis generative models in machine learning are developed with the overall aim of improving methods for algorithmic trading on high-frequency electronic exchanges based on limit order books. The thesis consists of two papers. In the first paper a new generative model for the dynamic evolution of a limit order book, based on recurrent neural networks, is developed. The model captures the full dynamics of the limit order book by decomposing the probability of each transition of the limit order book into a product of conditional probabilities of order type, price level, order size, and time delay. Each such conditional probability is modeled by a recurrent neural network. In addition, several evaluation metrics for generative models related to order execution are introduced. The generative model is successfully trained to fit both synthetic data generated by a Markov model and real data from the Nasdaq Stockholm exchange. The second paper explores reinforcement learning methods to find optimal policies for trade execution in Markovian models. A number of different approaches are implemented and compared, including a baseline time-weighted average price (TWAP) strategy, tabular Q-learning, and deep Q-learning based on predefined features as well as on the entire limit order book as input. The results indicate that it is preferable to use deep Q-learning with the entire limit order book as input to design efficient execution policies. To improve the understanding of the decisions taken by the agent, the learned action-value function of the deep Q-learning agent with predefined features is visualized as a function of selected features.
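The factorisation of a transition probability into conditionals can be sketched with toy categorical tables (the tables below are made-up stand-ins for the RNN outputs in the thesis, and the time-delay factor is omitted for brevity):

```python
import math

# One order-book event factorises as
#   p(type, level, size) = p(type) * p(level | type) * p(size | type, level)
p_type  = {"limit": 0.5, "cancel": 0.3, "market": 0.2}
p_level = {"limit":  {1: 0.6, 2: 0.4},
           "cancel": {1: 0.7, 2: 0.3},
           "market": {1: 1.0}}
p_size  = {("limit", 1): {100: 0.8, 500: 0.2}}  # partial: only the pair queried below

def log_prob(event):
    """Sum the conditional log-probabilities of one event."""
    t, lvl, size = event
    return (math.log(p_type[t])
            + math.log(p_level[t][lvl])
            + math.log(p_size[(t, lvl)][size]))

lp = log_prob(("limit", 1, 100))
print(math.exp(lp))   # 0.5 * 0.6 * 0.8 = 0.24
```

The thesis replaces each table with a recurrent network conditioned on the order-book history, but the chain-rule structure of the likelihood is the same.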
|
565 |
On Aggregation and Dynamics of Opinions in Complex Networks
Abrahamsson, Olle January 2024 (has links)
This thesis studies two problems defined on complex networks, of which the first explores a conceivable extension of structural balance theory and the other concerns convergence issues in opinion dynamics. In the first half of the thesis we discuss possible definitions of structural balance conditions in a network with preference orderings as node attributes. The main result is that for the case with three alternatives (A, B, C) we reduce the (3!)³ = 216 possible configurations of triangles to 10 equivalence classes, and use these as measures of balance of a triangle towards possible extensions of structural balance theory. Moreover, we derive a general formula for the number of equivalence classes for preferences on n alternatives. Finally, we analyze a real-world data set and compare its empirical distribution of triangle equivalence classes to a null hypothesis in which preferences are randomly assigned to the nodes. The second half of the thesis concerns an opinion dynamics model in which each agent takes a random Bernoulli-distributed action whose probability is updated at each discrete time step, and we prove that this model converges almost surely to consensus. We also provide a detailed critique of a claimed proof of this result in the literature. We generalize the result by proving that the assumption of irreducibility in the original model is not necessary. Furthermore, we prove as a corollary of the generalized result that the almost sure convergence to consensus also holds in the presence of a fully stubborn agent which never changes its opinion. In addition, we show that the model, in both the original and the generalized case, converges to consensus also in the r-th moment.
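The 216-to-10 reduction can be re-derived by brute force, assuming equivalence means relabelling the alternatives and/or permuting the three nodes (the thesis's exact equivalence relation is not spelled out in the abstract, so this is an independent sketch):

```python
from itertools import permutations, product

# Each of the 3 nodes carries one of the 3! = 6 orderings of {A, B, C},
# giving 6^3 = 216 triangle configurations.
orderings  = list(permutations("ABC"))
alt_perms  = [dict(zip("ABC", p)) for p in permutations("ABC")]
node_perms = list(permutations(range(3)))

def act(cfg, pi, rho):
    # permute the nodes by rho, then relabel the alternatives by pi
    return tuple(tuple(pi[a] for a in cfg[rho[i]]) for i in range(3))

seen, classes = set(), []
for cfg in product(orderings, repeat=3):
    if cfg in seen:
        continue
    orbit = {act(cfg, pi, rho) for pi in alt_perms for rho in node_perms}
    seen |= orbit
    classes.append(orbit)

print(len(classes), sum(len(c) for c in classes))  # expect: 10 216
```

Under this group action of size 6 × 6 = 36, the orbits indeed partition the 216 configurations into 10 classes, matching the count stated above.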
|
566 |
An Analysis of Asynchronous Data
Chatall, Kim, Johansson, Niklas January 2013 (has links)
Risk analysis and financial decision making require true and appropriate estimates of correlations today and of how they are expected to evolve in the future. If a portfolio consists of assets traded in markets with different trading hours, the correlation can be underestimated. This is due to asynchronous data: there exists an asynchronicity within the time series of the assets in the portfolio. The purpose of this paper is twofold. First, we suggest a modification of the synchronization model of Burns, Engle and Mezrich (1998) which replaces the first-order vector moving average with a first-order vector autoregressive process. Second, we study the time-varying dynamics and forecast the conditional variance-covariance and correlation matrices through a DCC model. The performance of the DCC model is compared to the industry-standard RiskMetrics Exponentially Weighted Moving Average (EWMA) model. The analysis shows that the covariance estimates of the DCC model are slightly lower than those of the RiskMetrics EWMA model. Our conclusion is that the DCC model is simple and powerful and therefore a promising tool. It provides good insight into how correlations are likely to evolve over a short-run time horizon.
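The RiskMetrics EWMA benchmark mentioned above has a one-line recursion; a minimal sketch (the warm-start choice and the decay 0.94 are conventional assumptions, not taken from the paper):

```python
import numpy as np

def ewma_cov(returns, lam=0.94):
    """RiskMetrics-style EWMA covariance: S_t = lam*S_{t-1} + (1-lam)*r_t r_t'."""
    T, n = returns.shape
    S = np.cov(returns[:20].T)            # warm-start on the first observations
    for t in range(20, T):
        r = returns[t]
        S = lam * S + (1 - lam) * np.outer(r, r)
    return S

# Two synthetic synchronous return series with true correlation 0.6.
rng = np.random.default_rng(0)
true_cov = np.array([[1.0, 0.6], [0.6, 1.0]])
r = rng.multivariate_normal([0.0, 0.0], true_cov, size=2000)
S = ewma_cov(r)
corr = S[0, 1] / np.sqrt(S[0, 0] * S[1, 1])
print("EWMA correlation estimate:", round(float(corr), 2))
```

With lam = 0.94 the effective memory is only about 1/(1 - 0.94) ≈ 17 observations, so the estimate tracks recent data closely but is noisy, which is part of what the DCC comparison in the paper addresses.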
|
567 |
Generalised linear models with clustered data
Holmberg, Henrik January 2010 (has links)
In situations where a large data set is partitioned into many relatively small clusters, and where the members within a cluster share some common unmeasured characteristics, the number of parameters requiring estimation tends to increase with sample size if a fixed effects model is applied. This causes the assumptions underlying asymptotic results to be violated. The first paper in this thesis considers two possible solutions to this problem, a random intercepts model and a fixed effects model, where asymptotics are replaced by a simple form of bootstrapping. A profiling approach is introduced in the fixed effects case, which makes it computationally efficient even with a huge number of clusters. The grouping effect is mainly seen as a nuisance in this paper. In the second paper the effect of misspecifying the distribution of the random effects in a generalised linear mixed model for binary data is studied. One problem with mixed effects models is that the distributional assumptions about the random effects are not easily checked from real data. Models with Gaussian, logistic and Cauchy distributional assumptions are used for parameter estimation on data simulated using the same three distributions. The effect of these assumptions on parameter estimation is presented. Two criteria for model selection are investigated, the Akaike information criterion and a criterion based on a chi-square statistic. The estimators of the fixed effects parameters are quite robust against misspecification of the random effects distribution, at least for the distributions used in this paper. Even when the true random effects distribution is Cauchy, models assuming a Gaussian or a logistic distribution regularly produce estimates with less bias.
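The data-generating setup described above, binary outcomes with a shared unobserved cluster effect, can be sketched as follows (the cluster counts and parameter values are illustrative assumptions, not the thesis's simulation design):

```python
import math
import random

random.seed(0)

def simulate(n_clusters=500, cluster_size=4, beta0=-1.0, beta1=0.8, sd_u=1.5):
    """Random-intercept logistic data: logit p = beta0 + beta1*x + u_cluster."""
    data = []
    for c in range(n_clusters):
        u = random.gauss(0, sd_u)          # shared, unobserved cluster effect
        for _ in range(cluster_size):
            x = random.gauss(0, 1)
            p = 1 / (1 + math.exp(-(beta0 + beta1 * x + u)))
            data.append((c, x, int(random.random() < p)))
    return data

data = simulate()
# A fixed-effects treatment needs one intercept per cluster: here 500
# nuisance parameters for only 2000 observations, the regime in which
# standard asymptotics break down.
print(len(data), len({c for c, _, _ in data}))
```

Because the number of intercepts grows with the number of clusters, this is exactly the incidental-parameter situation that motivates the random-intercepts and bootstrap alternatives in the first paper.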
|
568 |
Anomaly Detection in Signaling Data Streams : A Time-Series Approach
Thorén, Sofia, Sörberg, Richard January 2017 (has links)
This work has aimed to develop a method that can be used to detect anomalies in signaling data streams at a telecommunication company. This has been done by modeling and evaluating three prediction models and two detection methods. The prediction models that have been implemented are Autoregressive Integrated Moving Average (ARIMA), Holt-Winters and a benchmark model. Furthermore, two detection methods have been tested: Method 1 (M1), which is based on a robust evaluation of previous prediction errors, and Method 2 (M2), which is based on the standard deviation of previous data. From the evaluation of the work, we could conclude that the best-performing combination of prediction and detection methods was achieved with a modified benchmark model and M1 detection.
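A sketch of detection in the spirit of M1, flagging points whose prediction error is large relative to a robust scale (median/MAD) of recent errors; the exact thresholding and window used in the thesis are assumptions here:

```python
import random
import statistics

def detect_m1(observed, predicted, window=50, k=5.0):
    """Flag observations whose error exceeds k robust-scale units of the
    recent error history (a sketch, not the thesis's exact M1 rule)."""
    errors = [o - p for o, p in zip(observed, predicted)]
    flags = []
    for t, e in enumerate(errors):
        past = errors[max(0, t - window):t]
        if len(past) < 10:                 # not enough history yet
            flags.append(False)
            continue
        med = statistics.median(past)
        mad = statistics.median(abs(x - med) for x in past)
        flags.append(abs(e - med) > k * (mad + 1e-9))
    return flags

# Synthetic signaling stream: near-perfect predictions plus one injected spike.
random.seed(2)
obs = [10 + random.gauss(0, 0.5) for _ in range(100)]
obs[70] += 15
pred = [10.0] * 100
flags = detect_m1(obs, pred)
print("flagged indices:", [i for i, f in enumerate(flags) if f])
```

Using the median and MAD rather than mean and standard deviation keeps the threshold itself from being inflated by past anomalies, which is the "robust evaluation of previous prediction errors" that distinguishes M1 from M2.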
|
569 |
Segmentation of motor units in ultrasound image sequences
Rohlén, Robin January 2016 (has links)
The archetypal modern comic book superhero, Superman, has two superpowers of interest: the ability to see into objects and the ability to see distant objects. Humans now possess these powers as well, thanks to medical ultrasound imaging and sound navigation. Ultrasound, a type of sound we cannot hear, has enabled us to see a world otherwise invisible to us. Ultrasound medical imaging can be used to visualize and quantify anatomical and functional aspects of internal tissues and organs of the human body. Skeletal muscle tissue is functionally composed of so-called motor units, the smallest voluntarily activatable units, which are of primary interest in this study. The major complexity in the segmentation of motor units in skeletal muscle tissue in ultrasound image sequences is overlapping objects. We propose a framework and evaluate its performance on simulated synthetic data. We have found that it is possible to segment motor units under an isometric contraction using high-end ultrasound scanners, and we have proposed a framework which is robust when simulating up to 10 components exposed to 20 dB Gaussian white noise. The framework is not satisfactorily robust when exposed to a significant amount of noise. In order to segment a large number of components, decomposition is inevitable, and together with the development of a smoothing step, the framework can be further improved.
|
570 |
CAPTURING BLACK FRIDAY’S EFFECT : A NEW METHOD FOR SEASONAL ADJUSTMENT OF THE RETAIL SALES INDEX
Heed, Ingrid January 2024 (has links)
No description available.
|