761 |
Quantitative Methods of Statistical Arbitrage. Boming Ning (18414465), 22 April 2024 (has links)
<p dir="ltr">Statistical arbitrage is a prevalent trading strategy which takes advantage of mean reverse property of spreads constructed from pairs or portfolios of assets. Utilizing statistical models and algorithms, statistical arbitrage exploits and capitalizes on the pricing inefficiencies between securities or within asset portfolios. </p><p dir="ltr">In chapter 2, We propose a framework for constructing diversified portfolios with multiple pairs trading strategies. In our approach, several pairs of co-moving assets are traded simultaneously, and capital is dynamically allocated among different pairs based on the statistical characteristics of the historical spreads. This allows us to further consider various portfolio designs and rebalancing strategies. Working with empirical data, our experiments suggest the significant benefits of diversification within our proposed framework.</p><p dir="ltr">In chapter 3, we explore an optimal timing strategy for the trading of price spreads exhibiting mean-reverting characteristics. A sequential optimal stopping framework is formulated to analyze the optimal timings for both entering and subsequently liquidating positions, all while considering the impact of transaction costs. Then we leverages a refined signature optimal stopping method to resolve this sequential optimal stopping problem, thereby unveiling the precise entry and exit timings that maximize gains. Our framework operates without any predefined assumptions regarding the dynamics of the underlying mean-reverting spreads, offering adaptability to diverse scenarios. Numerical results are provided to demonstrate its superior performance when comparing with conventional mean reversion trading rules.</p><p dir="ltr">In chapter 4, we introduce an innovative model-free and reinforcement learning based framework for statistical arbitrage. For the construction of mean reversion spreads, we establish an empirical reversion time metric and optimize asset coefficients by minimizing this empirical mean reversion time. In the trading phase, we employ a reinforcement learning framework to identify the optimal mean reversion strategy. Diverging from traditional mean reversion strategies that primarily focus on price deviations from a long-term mean, our methodology creatively constructs the state space to encapsulate the recent trends in price movements. Additionally, the reward function is carefully tailored to reflect the unique characteristics of mean reversion trading.</p>
|
762 |
The theory and application of transformation in statistics. Kanjo, Anis Ismail, January 1962 (has links)
This paper is a review of the major literature dealing with transformations of random variates which achieve variance stabilization and approximate normalization. The subject can be said to have been initiated by a genetical paper of R. A. Fisher (1922) which uses the angular transformation Φ = 2 arcsin√p to deal with the analysis of proportions p with E(p) = P. Here it turns out that Var Φ is almost independent of P and so stabilizes the variance. Some fourteen years later Bartlett introduced the so-called square-root transformation, which achieves variance stabilization for variates following a Poisson distribution. These two transformations and their ramifications in theory and application are fully discussed, along with refinements introduced by later writers, notably Curtiss (1943) and Anscombe (1948).
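A quick numerical illustration of both transformations (a sketch; sample sizes and parameter grids are arbitrary): the variance of √X is roughly 1/4 regardless of the Poisson mean, and the variance of Φ = 2 arcsin√p is roughly 1/n regardless of P.

```python
import numpy as np

rng = np.random.default_rng(1)

# Bartlett's square-root transform: for Poisson X, Var(sqrt(X)) ≈ 1/4,
# nearly independent of the mean lambda.
for lam in (2, 8, 32, 128):
    x = rng.poisson(lam, 100_000)
    print(f"lambda={lam:4d}  Var(X)={x.var():8.2f}  Var(sqrt X)={np.sqrt(x).var():.3f}")

# Fisher's angular transform: for a binomial proportion p with E(p) = P,
# Var(2 arcsin sqrt(p)) ≈ 1/n, nearly independent of P.
n = 50
for P in (0.1, 0.3, 0.5, 0.8):
    p = rng.binomial(n, P, 100_000) / n
    phi = 2 * np.arcsin(np.sqrt(p))
    print(f"P={P}  Var(phi)={phi.var():.5f}  vs 1/n={1/n:.5f}")
```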
Another important transformation discussed is one which takes observations on to a logarithmic scale; here there are uses in analysis-of-variance situations and in theoretical problems in the field of estimation. In the case of the latter, the work of D. J. Finney (1941) is considered in some detail. The asymptotic normality of the transformation is also considered.
Transformations primarily designed to bring about ultimate normality in distribution are also included. In particular, there is reference to work on the chi-square probability integral (Fisher; Wilson and Hilferty (1931)) and the logarithmic transformation of a correlation coefficient (Fisher (1921)).
Other miscellaneous topics referred to include:
i. the probability integral transformation (probits), with applications in bioassay;
ii. applications of transformation theory to set up approximate confidence intervals for distribution parameters (Blom (1954));
iii. transformations in connection with the interpretation of so-called 'ranked' data. / M.S.
|
763 |
A course in statistical engineering. Campean, Felician; Grove, Daniel M.; Henshall, Edwin, January 2005 (has links)
A course in statistical engineering has recently been added to the Ford Motor Company's Technical Education Program. The aim was to produce materials suitable for use by Ford but which could also be promoted by the UK's Royal Statistical Society within the university sector. The course is built around a sequence of realistic tasks dictated by the flow of the product creation process. Its structure and content are thus driven by engineering need rather than statistical method, promoting constructivist learning. Before describing the course content we review the changing role of the engineer and comment on the relationships between Systems Engineering, Design for Six Sigma and Statistical Engineering. We give details of a case study which plays a crucial role in the course. We focus on some important features of the development process and conclude with a discussion of the approach we have taken and possible future developments.
|
764 |
Statistical Foundations of Operator Learning. Nelsen, Nicholas Hao, January 2024 (has links) (PDF)
This thesis studies operator learning from a statistical perspective. Operator learning uses observed data to estimate mappings between infinite-dimensional spaces. It does so at the conceptually continuum level, leading to discretization-independent machine learning methods when implemented in practice. Although this framework shows promise for physical model acceleration and discovery, the mathematical theory of operator learning lags behind its empirical success. Motivated by scientific computing and inverse problems where the available data are often scarce, this thesis develops scalable algorithms for operator learning and theoretical insights into their data efficiency.
The thesis begins by introducing a convergent operator learning algorithm that is implementable on a computer with controlled complexity. The method is based on linear combinations of function-valued random features, enjoys efficient training via convex optimization, and accurately approximates nonlinear solution operators of parametric partial differential equations. A statistical analysis derives state-of-the-art error bounds for the method and establishes its robustness to errors stemming from noisy observations and model misspecification. Next, the thesis tackles fundamental statistical questions about how problem structure, data quality, and prior information influence learning accuracy. Specializing to a linear setting, a sharp Bayesian nonparametric analysis shows that continuum linear operators, such as the integration or differentiation of spatially varying functions, are provably learnable from noisy input-output pairs. The theory reveals that smoothing operators are easier to learn than unbounded ones and that training with rough or high-frequency input data improves sample complexity. When only specific linear functionals of the operator's output are the primary quantities of interest, the final part of the thesis proves that the smoothness of the functionals determines whether learning directly from these finite-dimensional observations carries a statistical advantage over plug-in estimators based on learning the entire operator. To validate the findings beyond linear problems, the thesis develops practical deep operator learning architectures for nonlinear mappings that send functions to vectors, or vice versa, and shows their corresponding universal approximation properties. Altogether, this thesis advances the reliability and efficiency of operator learning for continuum problems in the physical and data sciences.
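To make the random feature idea concrete, here is a minimal sketch of function-valued random features trained by ridge regression (a convex problem), learning the antiderivative operator on a grid. The feature map, the target operator, and every parameter below are assumptions chosen for illustration, not the thesis's construction.

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_train, n_feat = 64, 200, 300
x = np.linspace(0.0, 1.0, n)

def sample_input():
    # Random smooth input: a few Fourier modes with decaying coefficients.
    k = np.arange(1, 6)[:, None]
    a = rng.normal(0, 1 / k)
    return (a * np.sin(np.pi * k * x)).sum(axis=0)

def apply_operator(u):
    # Illustrative target operator: antiderivative u -> ∫_0^x u(t) dt.
    return np.cumsum(u) / n

# Function-valued random features (an assumed form):
#   phi_j(u)(x) = tanh(<w_j, u>/n + b_j) * sin(pi * k_j * x + c_j)
W = rng.normal(0, 1, (n_feat, n))
b = rng.uniform(-1, 1, n_feat)
k = rng.uniform(0.5, 8.0, n_feat)
c = rng.uniform(0, 2 * np.pi, n_feat)
basis = np.sin(np.pi * k[:, None] * x + c[:, None])    # (n_feat, n) output profiles

def features(u):
    return np.tanh(W @ u / n + b)[:, None] * basis     # (n_feat, n)

# Convex training step: ridge regression on stacked grid values.
U = [sample_input() for _ in range(n_train)]
A = np.concatenate([features(u).T for u in U])         # (n_train*n, n_feat)
y = np.concatenate([apply_operator(u) for u in U])
alpha = np.linalg.solve(A.T @ A + 1e-6 * np.eye(n_feat), A.T @ y)

u_new = sample_input()
truth, pred = apply_operator(u_new), alpha @ features(u_new)
print("relative L2 error:", np.linalg.norm(pred - truth) / np.linalg.norm(truth))
```

Only the coefficients alpha are trained; the random features themselves are fixed at initialization, which is what keeps the optimization convex.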
|
765 |
A statistical theory of the epilepsies. Thomas, Kuryan, January 1988 (has links)
A new physical and mathematical model for the epilepsies is proposed, based on the theory of bond percolation on finite lattices. Within this model, the onset of seizures in the brain is identified with the appearance of spanning clusters of neurons engaged in the spurious and uncontrollable electrical activity characteristic of seizures. It is proposed that the fraction of excitatory to inhibitory synapses can be identified with a bond probability, and that the bond probability is a randomly varying quantity displaying Gaussian statistics. The consequences of the proposed model for the treatment of the epilepsies are explored.
The nature of the data on the epilepsies which can be acquired in a clinical setting is described. It is shown that such data can be analyzed to provide preliminary support for the bond percolation hypothesis, and to quantify the efficacy of anti-epileptic drugs in a treatment program. The results of a battery of statistical tests on seizure distributions are discussed.
The physical theory of the electroencephalogram (EEG) is described, and extant models of the electrical activity measured by the EEG are discussed, with an emphasis on their physical behavior. A proposal is made to explain the difference between the power spectra of electrical activity measured with cranial probes and with the EEG. Statistical tests on the characteristic EEG manifestations of epileptic activity are conducted, and their results described.
Computer simulations of a correlated bond percolating system are constructed. It is shown that the statistical properties of the results of such a simulation are strongly suggestive of the statistical properties of clinical data.
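A minimal sketch of this kind of simulation, assuming a square lattice with union-find cluster detection and a Gaussian-distributed bond probability (the critical bond probability on the infinite square lattice is 1/2); the lattice size, mean, and spread below are illustrative, not the thesis's values.

```python
import numpy as np

rng = np.random.default_rng(3)

def spans(L, p):
    """Open each bond of an L x L square lattice with probability p and
    test for a top-to-bottom spanning cluster via union-find."""
    parent = np.arange(L * L)

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]      # path halving
            i = parent[i]
        return i

    for r in range(L):
        for col in range(L):
            i = r * L + col
            if col + 1 < L and rng.random() < p:   # horizontal bond
                parent[find(i)] = find(i + 1)
            if r + 1 < L and rng.random() < p:     # vertical bond
                parent[find(i)] = find(i + L)

    top = {find(i) for i in range(L)}
    bottom = {find((L - 1) * L + i) for i in range(L)}
    return bool(top & bottom)

# Bond probability fluctuating as a Gaussian around a sub-critical mean.
L, trials = 32, 200
hits = sum(spans(L, float(np.clip(rng.normal(0.45, 0.05), 0, 1)))
           for _ in range(trials))
print(f"fraction of trials with a spanning ('seizure') cluster: {hits / trials:.2f}")
```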
The study finds no contradictions between the predictions of the bond percolation model and the observed properties of the available data. Suggestions are made for further research and for techniques based on the proposed model which may be used for tuning the effects of anti-epileptic drugs. / Ph. D.
|
766 |
The Statistical Bootstrap Model. Hamer, Christopher John, January 1972 (has links) (PDF)
A review is presented of the statistical bootstrap model of Hagedorn and Frautschi. This model is an attempt to apply the methods of statistical mechanics in high-energy physics, while treating all hadron states (stable or unstable) on an equal footing. A statistical calculation of the resonance spectrum on this basis leads to an exponentially rising level density ρ(m) ~ c m^-3 e^(β₀m) at high masses.
In the present work, explicit formulae are given for the asymptotic dependence of the level density on quantum numbers, in various cases. Hamer and Frautschi's model for a realistic hadron spectrum is described.
A statistical model for hadron reactions is then put forward, analogous to the Bohr compound nucleus model in nuclear physics, which makes use of this level density. Some general features of resonance decay are predicted. The model is applied to the process of N̄N annihilation at rest with overall success, and explains the high final state pion multiplicity, together with the low individual branching ratios into two-body final states, which are characteristic of the process. For more general reactions, the model needs modification to take account of correlation effects. Nevertheless it is capable of explaining the phenomenon of limited transverse momenta, and the exponential decrease in the production frequency of heavy particles with their mass, as shown by Hagedorn. Frautschi's results on "Ericson fluctuations" in hadron physics are outlined briefly. The value of β₀ required in all these applications is consistently around [120 MeV]^-1, corresponding to a "resonance volume" whose radius is very close to ƛ_π. The construction of a "multiperipheral cluster model" for high-energy collisions is advocated.
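The limiting-temperature mechanism behind this level density can be checked numerically: the Boltzmann-weighted mass integral ∫ ρ(m) e^(-m/T) dm converges only while 1/T > β₀. A sketch, with an assumed lower cutoff and the constant c dropped:

```python
import numpy as np
from scipy.integrate import quad

beta0 = 1.0 / 120.0   # MeV^-1, the value of beta_0 quoted in the abstract
m_min = 500.0         # assumed cutoff (MeV) above which the asymptotic form is used

def tail_integral(T):
    """Boltzmann-weighted tail of the mass spectrum, ∫ rho(m) e^{-m/T} dm,
    with rho(m) = m^-3 exp(beta0 * m)."""
    integrand = lambda m: m**-3 * np.exp((beta0 - 1.0 / T) * m)
    value, _ = quad(integrand, m_min, np.inf)
    return value

for T in (100.0, 110.0, 118.0):
    print(f"T = {T:5.1f} MeV   tail integral = {tail_integral(T):.3e}")

# As T -> 1/beta0 = 120 MeV the exponential damping vanishes; beyond it the
# integral diverges, which is the Hagedorn limiting-temperature phenomenon.
```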
|
767 |
A practical introduction to medical statistics. Scally, Andy J., 16 October 2013 (has links)
Medical statistics is a vast and ever-growing field of academic endeavour, with direct application to developing the robustness of the evidence base in all areas of medicine. Although the complexity of available statistical techniques has continued to increase, fuelled by the rapid data processing capabilities of even desktop/laptop computers, medical practitioners can go a long way towards creating, critically evaluating and assimilating this evidence with an understanding of just a few key statistical concepts. While the concepts of statistics and ethics are not common bedfellows, it should be emphasised that a statistically flawed study is also an unethical study.[1] This review will outline some of these key concepts and explain how to interpret the output of some commonly used statistical analyses. Examples will be confined to two-group tests on independent samples, using both a continuous and a dichotomous/binary outcome measure.
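As a companion to the review's scope, the following sketch runs the two kinds of two-group test it describes: a Welch t-test on a continuous outcome and chi-squared/Fisher tests on a 2x2 table. The data and group sizes are simulated for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Continuous outcome, two independent groups: Welch's t-test.
treated = rng.normal(5.2, 1.0, 40)   # simulated outcome, treatment group
control = rng.normal(4.7, 1.1, 42)   # simulated outcome, control group
t, p = stats.ttest_ind(treated, control, equal_var=False)
print(f"Welch t = {t:.2f}, p = {p:.4f}")

# Dichotomous outcome: 2x2 table of events vs non-events by group.
table = np.array([[12, 28],          # treated: events, non-events
                  [22, 20]])         # control: events, non-events
chi2, p, dof, _ = stats.chi2_contingency(table)
print(f"chi-squared({dof}) = {chi2:.2f}, p = {p:.4f}")
print(f"Fisher's exact p = {stats.fisher_exact(table)[1]:.4f}")
```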
|
768 |
Statistics of Tensor Forms. Son, Jaesung, January 2025 (has links)
Statistics of tensor forms appear in various contexts and provide a useful way to model dependence, which naturally arises in network data. In this thesis, we study the statistics of two different tensor forms.
In the first part of the thesis, we derive central limit theorem results which exhibit a fourth moment phenomenon: convergence of the fourth moment to 3 implies convergence of the statistic to a normal distribution. We also establish the converse, which yields an if-and-only-if condition for asymptotic normality. The setting and results apply directly to the monochromatic subgraph count in the graph coloring problem.
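A small Monte Carlo sketch of the graph-coloring application, assuming an Erdős-Rényi base graph: under a uniform c-coloring the monochromatic-edge indicators are pairwise independent, so the count has mean m/c and variance m(1/c)(1 - 1/c) exactly, and the standardized count can be checked against the fourth-moment criterion.

```python
import numpy as np

rng = np.random.default_rng(5)

# Fix an Erdos-Renyi graph G(n, p), then repeatedly color its vertices
# uniformly with c colors and count monochromatic edges.
n, p, c, reps = 60, 0.1, 3, 20_000
iu, iv = np.triu_indices(n, 1)
keep = rng.random(iu.size) < p
u, v = iu[keep], iv[keep]
m = u.size

counts = np.empty(reps)
for r in range(reps):
    col = rng.integers(0, c, n)
    counts[r] = (col[u] == col[v]).sum()

Z = (counts - m / c) / np.sqrt(m * (1 / c) * (1 - 1 / c))
print(f"edges m = {m}, fourth moment estimate = {(Z**4).mean():.3f} (near 3 suggests normality)")
```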
The second part of the thesis compares the relative efficiency of the maximum likelihood estimator (MLE) and the maximum pseudolikelihood estimator (MPLE) for particular p-tensor Ising models. Specifically, we show that in the graph case, i.e., when p = 2, the MLE and MPLE are equally Bahadur efficient. For higher-order tensors, i.e., p ≥ 3, we show a two-layer phase transition in which the MPLE is less Bahadur efficient than the MLE in certain regimes of the parameter space, depending also on the magnitudes of the null and alternative parameters.
|
769 |
Atitude e motivação em relação ao desempenho acadêmico de alunos do curso de graduação em administração em disciplinas de estatística / Attitude and motivation in relation to the academic performance of undergraduate students in management courses in statistics. Viana, Gustavo Salomão, 31 October 2012 (has links)
In a knowledge-driven society, it becomes important to analyze the large amount of information stored in databases in order to transform it into usable knowledge, for both commercial and scientific purposes. Management emerges as a field in which a great variety of statistical applications is possible, matching the decision-making competencies and skills expected of the manager. In this sense, Statistics becomes an important tool in finance, marketing, production and human resources. A highly relevant issue, however, lies in the statistical training of management professionals, considering the difficulties involved in teaching this content in undergraduate courses. Observing the problems involved in teaching Statistics in undergraduate Management programs, and taking into account the existence of instruments for measuring attitude toward Statistics as well as academic motivation, the research opportunity arose to investigate how attitude toward Statistics and academic motivation interact with students' academic performance in Statistics courses. To achieve this objective, a quantitative study was conducted by administering the Survey of Attitudes Toward Statistics (SATS) and the Academic Motivation Scale (Échelle de Motivation en Éducation, EMA) to 278 students from two public Management schools. In building models relating academic motivation and attitude toward Statistics (independent variables) to performance (dependent variable), the best model for course grade showed low explanatory power (adjusted R² = 7.3%), with only Affect, Extrinsic Motivation - introjection, Extrinsic Motivation - external regulation and Intrinsic Motivation - to experience stimulation emerging as significant predictors. The model with self-perceived performance as the dependent variable, however, showed considerable explanatory power (adjusted R² = 45.5%), with only Affect, Cognitive Competence and Extrinsic Motivation - introjection as significant predictors. Cluster analysis showed that one of the three groups formed had higher and less dispersed values for both course grade and self-perceived performance. Given these results, it was possible to conclude that, in general, the group of students with the greatest interest in Finance had the highest scores both in attitude toward Statistics and in academic motivation.
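To make the reported quantities concrete, below is a sketch of the regression step on simulated data: performance is regressed on attitude and motivation subscale scores by least squares, and the adjusted R² quoted in the abstract is computed. All data and effect sizes here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)

# Simulated stand-in for the study design: n = 278 students, predictors are
# attitude/motivation subscale scores, outcome is self-perceived performance.
n, k = 278, 4
X = rng.normal(0, 1, (n, k))                 # e.g. Affect, Cognitive competence, ...
y = X @ np.array([0.5, 0.4, 0.2, 0.1]) + rng.normal(0, 1, n)

Xd = np.column_stack([np.ones(n), X])        # add intercept
coef, *_ = np.linalg.lstsq(Xd, y, rcond=None)
resid = y - Xd @ coef
r2 = 1 - resid.var() / y.var()
r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(f"R2 = {r2:.3f}, adjusted R2 = {r2_adj:.3f}")
```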
|
770 |
Análise do sinal de variabilidade da frequência cardíaca através de estatística não extensiva: taxa de q-entropia multiescala / Heart rate variability analysis through nonextensive statistics: multiscale q-entropy rate. Silva, Luiz Eduardo Virgilio da, 28 February 2013 (has links)
The human body is a complex system composed of several interdependent subsystems that interact with one another on multiple scales. It is known that this physiological complexity tends to decrease in the presence of disease and with advancing age, reducing an individual's adaptive capacity. In the cardiovascular system, one way to assess regulatory dynamics is through the analysis of heart rate variability (HRV). Classical methods of HRV analysis are based on linear models, as in spectral analysis. However, since the physiological mechanisms of cardiac regulation exhibit nonlinear characteristics, analyses based on such models may be limited. In recent years, many nonlinear methods have been proposed. Yet none is known to be fully consistent with the concept of physiological complexity, under which both periodic and random regimes are characterized as loss of complexity. Based on this concept, this thesis proposes new nonlinear methods for the analysis of HRV series. The methods generalize existing entropy measures using Tsallis nonadditive statistical mechanics and the surrogate data technique. A method called qSDiff was defined, which computes the difference between the entropy of a signal and the average entropy of its surrogate series. The entropy measure used is a generalization of sample entropy (SampEn) under the nonadditive paradigm. Three attributes were extracted from the qSDiff series and evaluated as possible indices of physiological complexity. Multiscale entropy (MSE) was also generalized following the nonadditive paradigm, and the same attributes were computed at multiple scales. The methods were applied to real HRV series from humans and rats, as well as to a set of simulated signals consisting of noises and maps, the latter in chaotic and periodic regimes. The qSDiffmax attribute proved consistent at small scales, while the qmax and qzero attributes were consistent at larger scales, separating and ranking the groups in terms of physiological complexity. A possible relationship between these q-attributes and the presence of chaos was also observed, which requires further study. The results also suggest that, in heart failure, degradation occurs mainly in small-scale, short-period mechanisms, whereas in atrial fibrillation the impairment extends to larger scales. The proposed entropy-based measures are able to extract important information from HRV series and are more consistent with the concept of physiological complexity than classical SampEn. The results reinforce the hypothesis that complexity is revealed at multiple scales of a signal. We believe the proposed methods can contribute substantially to the analysis of HRV and of other biomedical signals.
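A minimal sketch of the two building blocks described, sample entropy with a Tsallis q-logarithm in place of the natural log, evaluated on coarse-grained (multiscale) versions of the signal. The exact nonadditive generalization and the qSDiff surrogate-data step used in the thesis may differ from this simplification.

```python
import numpy as np

def sampen_q(x, m=2, r=0.2, q=1.0):
    """Sample entropy with the Tsallis q-logarithm replacing ln
    (q -> 1 recovers classical SampEn). Simplified sketch."""
    x = np.asarray(x, float)
    tol = r * x.std(ddof=1)
    W = np.lib.stride_tricks.sliding_window_view(x, m + 1)   # N - m templates
    A = B = 0
    for i in range(len(W) - 1):
        dm = np.abs(W[i + 1:, :m] - W[i, :m]).max(axis=1)    # Chebyshev distance
        match_m = dm < tol
        B += match_m.sum()                                   # length-m matches
        A += (match_m & (np.abs(W[i + 1:, m] - W[i, m]) < tol)).sum()
    ratio = A / B
    # -ln_q(A/B), with ln_q(x) = (x^(1-q) - 1) / (1 - q)
    return -np.log(ratio) if q == 1.0 else -(ratio**(1 - q) - 1) / (1 - q)

def multiscale_q(x, scales=(1, 2, 4, 8), **kw):
    """Coarse-grain by non-overlapping averages, then apply sampen_q (MSE-style)."""
    out = []
    for s in scales:
        y = x[: len(x) // s * s].reshape(-1, s).mean(axis=1)
        out.append(sampen_q(y, **kw))
    return out

rng = np.random.default_rng(7)
print(multiscale_q(rng.normal(size=4000), q=0.75))
```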
|