11 |
Asymptotics of beta-Hermite Ensembles
Berglund, Filip, January 2020
In this bachelor's thesis we present results on several eigenvalue statistics of the beta-Hermite ensembles. We begin with the classical cases beta = 1, 2, 4, that is, the Gaussian orthogonal ensemble (real symmetric matrices), the Gaussian unitary ensemble (complex Hermitian matrices) and the Gaussian symplectic ensemble (quaternionic self-dual matrices), respectively. We then turn to the less explored general beta-Hermite ensembles (real symmetric tridiagonal matrices). Specifically, we study the empirical distribution function and two different scalings of the largest eigenvalue. The results we present are the convergence of the empirical distribution function to the semicircle law, the convergence of the scaled largest eigenvalue to the Tracy-Widom distributions and, under a different scaling, the convergence of the largest eigenvalue to 1. We also illustrate these results with simulations. For the Gaussian unitary ensemble, we present an expression for its level density. To aid in understanding the Gaussian symplectic ensemble, we present properties of the eigenvalues of quaternionic matrices. Finally, we prove a theorem about the symmetry of the order statistics of the eigenvalues of the beta-Hermite ensembles.
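The kind of simulation this abstract mentions can be sketched as follows. This is a minimal illustration and not the thesis's own code: it assumes the standard tridiagonal construction of the beta-Hermite ensemble (standard normal diagonal, chi-distributed off-diagonal entries divided by sqrt(2)) and a normalization by sqrt(2 * beta * n), chosen here so that the limiting semicircle law lives on [-1, 1] and the largest eigenvalue tends to 1. The function names are hypothetical.

```python
import numpy as np


def sample_scaled_eigenvalues(n, beta, rng):
    """Eigenvalues of one n x n tridiagonal beta-Hermite matrix, divided by sqrt(2*beta*n)."""
    diagonal = rng.normal(0.0, 1.0, size=n)
    # Off-diagonal entries: chi-distributed with (n-1)*beta, ..., 2*beta, beta
    # degrees of freedom, divided by sqrt(2).
    degrees = beta * np.arange(n - 1, 0, -1)
    off_diagonal = np.sqrt(rng.chisquare(degrees) / 2.0)
    matrix = np.diag(diagonal) + np.diag(off_diagonal, 1) + np.diag(off_diagonal, -1)
    # Assumed normalization: puts the limiting spectrum on [-1, 1].
    return np.linalg.eigvalsh(matrix) / np.sqrt(2.0 * beta * n)


def semicircle_cdf(x):
    """Distribution function of the semicircle law supported on [-1, 1]."""
    x = np.clip(x, -1.0, 1.0)
    return 0.5 + (x * np.sqrt(1.0 - x * x) + np.arcsin(x)) / np.pi


def distance_to_semicircle(eigenvalues):
    """Largest gap between the empirical distribution function and the semicircle law."""
    n = len(eigenvalues)
    target = semicircle_cdf(np.sort(eigenvalues))
    upper = np.arange(1, n + 1) / n - target
    lower = target - np.arange(0, n) / n
    return max(upper.max(), lower.max())


rng = np.random.default_rng(0)
for beta in (1.0, 2.0, 4.0, 2.5):
    eigs = sample_scaled_eigenvalues(400, beta, rng)
    # Prints beta, the distance to the semicircle law, and the scaled largest eigenvalue.
    print(beta, round(distance_to_semicircle(eigs), 3), round(eigs.max(), 3))
```

Because the tridiagonal model is defined for any beta > 0, the same function covers the classical cases and a non-classical value such as 2.5. For growing n, the printed distance should shrink and the scaled largest eigenvalue should approach 1.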
|
12 |
Market Surveillance Using Empirical Quantile Model and Machine Learning / Marknadsövervakning med hjälp av empirisk kvantilmodell och maskininlärning
Landberg, Daniel, January 2022
In recent years, financial trading has become more accessible to the public. This has led to more market participants and more trades taking place each day. The increased activity also implies an increasing number of abusive trades. Market surveillance systems are developed and used to detect such trades. In this thesis, two methods for detecting abusive trades in high-dimensional data were tested. One was based on empirical quantiles, and the other on an unsupervised machine learning technique called isolation forest. The empirical quantile method applies empirical quantiles to dimensionally reduced data to decide whether a datapoint is an outlier. Principal Component Analysis (PCA) is used to reduce the dimensionality of the data and to handle the correlation between features. Isolation forest is a machine learning method that detects outliers by sorting each datapoint into a tree structure. A datapoint that ends up close to the root is more likely to be an outlier. Isolation forest has been shown to detect outliers in high-dimensional datasets successfully, but has not previously been tested for market surveillance. The performance of both methods was measured by recall and run-time. The conclusion was that the empirical quantile method did not detect outliers accurately when all dimensions of the data were used. The method most likely suffered from the curse of dimensionality and could not handle high-dimensional data. However, its performance improved when the dimensionality was reduced. Isolation forest performed better than the empirical quantile method and detected 99% of all outliers by classifying 226 datapoints as outliers in a dataset of 1882 datapoints containing 184 true outliers.
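A rough sketch of the empirical quantile idea on PCA-reduced data is given below. The abstract does not state the exact decision rule, so everything here is an assumption made for illustration: the quantile bounds are learned from a reference sample, a point is flagged when any of its principal-component scores falls outside those bounds, and the data are synthetic. The isolation forest method is not shown.

```python
import numpy as np


def fit_quantile_detector(reference, n_components, q):
    """Learn PCA directions and per-component empirical quantile bounds."""
    mean = reference.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    components = vt[:n_components]
    scores = (reference - mean) @ components.T
    lower, upper = np.quantile(scores, [q, 1.0 - q], axis=0)
    return mean, components, lower, upper


def flag_outliers(data, detector):
    """Flag a row when any of its PCA scores falls outside the learned bounds."""
    mean, components, lower, upper = detector
    scores = (data - mean) @ components.T
    return np.any((scores < lower) | (scores > upper), axis=1)


def recall(flags, truth):
    """Share of true outliers that were flagged."""
    return np.sum(flags & truth) / np.sum(truth)


# Synthetic stand-in data (hypothetical): 5 correlated features driven by 2 latent factors.
rng = np.random.default_rng(0)
loadings = np.array([[1.0, 0.8, 0.6, 0.0, -0.5],
                     [0.0, 0.5, -0.7, 1.0, 0.6]])


def make_points(latent):
    return latent @ loadings + 0.1 * rng.normal(size=(len(latent), 5))


reference = make_points(rng.normal(size=(2000, 2)))
directions = rng.normal(size=(10, 2))
# Planted outliers: latent factors pushed 8 standard deviations from the centre.
outlier_latent = 8.0 * directions / np.linalg.norm(directions, axis=1, keepdims=True)
data = make_points(np.vstack([rng.normal(size=(200, 2)), outlier_latent]))
truth = np.arange(len(data)) >= 200

detector = fit_quantile_detector(reference, n_components=2, q=0.005)
flags = flag_outliers(data, detector)
print("recall:", recall(flags, truth), "flagged:", int(flags.sum()))
```

Projecting onto a few principal components before taking quantiles keeps the number of per-dimension tests small, which is one plausible reading of why the abstract reports better performance after dimensionality reduction.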
|
13 |
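Conformity of astronomical quantities to the Newcomb-Benford law according to the Kolmogorov-Smirnov measure / Conformidade à lei de Newcomb-Benford de grandezas astronômicas segundo a medida de Kolmogorov-Smirnov
ALENCASTRO JUNIOR, José Vianney Mendonça de, 09 September 2016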
The Newcomb-Benford law, also known as the law of the most significant digit, was first described by the astronomer and mathematician Simon Newcomb, and was given a statistical basis only 57 years later by the physicist Frank Benford. The law governs naturally random quantities and has been used in many fields to select and validate various kinds of data. The first goal of this work is to propose a substitute for the chi-square method, which is currently the method commonly used in the literature to verify conformity to the Newcomb-Benford law. This is necessary because, for data sets with a large number of samples, the chi-square method tends to suffer from a statistical problem known as excess of power, which produces false-negative results. We therefore propose replacing the chi-square method with the Kolmogorov-Smirnov method based on the Empirical Distribution Function (EDF) for the analysis of global conformity. This method is more robust, since it does not suffer from excess of power, and it is also more faithful to the formal definition of Benford's law, since it works with the mantissas rather than with isolated digits. We also propose to investigate a confidence interval for the Kolmogorov-Smirnov method, based on a chi-square that uses bootstrapping and therefore does not suffer from excess of power. Two recently published papers analysed exoplanet data and declared some quantities to conform to Benford's law. On that basis, their authors suggest that knowledge of this conformity could be used to analyse the list of candidate objects, which may help to identify new exoplanets in that list in the future. Another goal of this work was therefore to explore several astronomical databases and catalogs in search of quantities whose conformity to the significant-digit law is not yet known, in order to propose practical applications for the astronomical sciences.
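The mantissa-based check described in this abstract can be illustrated with a short sketch. It is not the dissertation's implementation. It assumes the usual formal statement of Benford's law, namely that the fractional part of log10|x| is uniformly distributed on [0, 1), and computes the Kolmogorov-Smirnov distance between the empirical distribution function of those fractional parts and the uniform distribution. The function names and both test sequences are chosen here for illustration, and no confidence interval or bootstrap step is included.

```python
import math


def log_mantissas(values):
    """Sorted fractional parts of log10|x|; uniform on [0, 1) under Benford's law."""
    return sorted(math.log10(abs(v)) % 1.0 for v in values if v != 0)


def ks_distance_to_benford(values):
    """Kolmogorov-Smirnov distance between the EDF of the log-mantissas and Uniform(0, 1)."""
    u = log_mantissas(values)
    n = len(u)
    # The EDF jumps from i/n to (i+1)/n at the i-th sorted point (0-based),
    # so the largest gap to the line y = x occurs at one of these jumps.
    return max(
        max((i + 1) / n - x, x - i / n)
        for i, x in enumerate(u)
    )


powers_of_two = [2 ** k for k in range(1, 1001)]  # a classic Benford-conforming sequence
leading_ones = list(range(100, 200))              # every value starts with the digit 1

print(ks_distance_to_benford(powers_of_two))  # small distance: close to Benford
print(ks_distance_to_benford(leading_ones))   # large distance: far from Benford
```

Working with the whole log-mantissa rather than with first-digit counts is what lets a single statistic reflect all significant digits at once.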
|