1 |
Implications of probabilistic data modeling for rule mining
Hahsler, Michael; Hornik, Kurt; Reutterer, Thomas (January 2005)
Mining association rules is an important technique for discovering meaningful patterns in transaction databases. In the current literature, the properties of algorithms to mine associations are discussed in great detail. In this paper we investigate properties of transaction data sets from a probabilistic point of view. We present a simple probabilistic framework for transaction data and its implementation using the R statistical computing environment. The framework can be used to simulate transaction data when no associations are present. We use such data to explore how well confidence and lift, two popular interest measures used for rule mining, filter noise. Based on the framework we develop the measure hyperlift, and we compare this new measure to lift using simulated data and a real-world grocery database. / Series: Research Report Series / Department of Statistics and Mathematics
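As an illustration of the idea in this abstract, the following minimal Python sketch simulates transactions in which items occur independently (so no true associations exist) and computes the standard definitions of support, confidence, and lift. It is not the authors' R implementation: the function names, the item probabilities, and the independent-Bernoulli simulation are assumptions made for this example, and hyperlift is not reproduced because the abstract does not define it.

```python
import random

def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(transactions, lhs, rhs):
    """Standard confidence of the rule lhs -> rhs: supp(lhs ∪ rhs) / supp(lhs)."""
    s_lhs = support(transactions, lhs)
    if s_lhs == 0:
        return float("nan")
    return support(transactions, set(lhs) | set(rhs)) / s_lhs

def lift(transactions, lhs, rhs):
    """Standard lift of the rule lhs -> rhs: confidence / supp(rhs)."""
    s_rhs = support(transactions, rhs)
    if s_rhs == 0:
        return float("nan")
    return confidence(transactions, lhs, rhs) / s_rhs

def simulate_independent(n, item_probs, seed=0):
    """Hypothetical simulator: each item enters each transaction independently."""
    rng = random.Random(seed)
    return [
        frozenset(item for item, p in item_probs.items() if rng.random() < p)
        for _ in range(n)
    ]

if __name__ == "__main__":
    # Assumed item probabilities; with independent items, lift should be near 1.
    data = simulate_independent(20000, {"a": 0.3, "b": 0.2, "c": 0.05})
    print("confidence a->b:", confidence(data, {"a"}, {"b"}))
    print("lift a->b:", lift(data, {"a"}, {"b"}))
```

On data generated this way, any rule whose lift deviates strongly from 1 is noise rather than a real association, which is the kind of behaviour the abstract proposes to study.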
|
2 |
A model-based frequency constraint for mining associations from transaction data
Hahsler, Michael (January 2004)
In this paper we develop an alternative to minimum support which utilizes knowledge of the process which generates transaction data and allows for highly skewed frequency distributions. We apply a simple stochastic model (the NB model), which is known for its usefulness to describe item occurrences in transaction data, to develop a frequency constraint. This model-based frequency constraint is used together with a precision threshold to find individual support thresholds for groups of associations. We develop the notion of NB-frequent itemsets and present two mining algorithms which find all NB-frequent itemsets in a database. In experiments with publicly available transaction databases we show that the new constraint can provide significant improvements over a single minimum support threshold and that the precision threshold is easier to use. (author's abstract) / Series: Working Papers on Information Systems, Information Business and Operations
|
3 |
Online Monitoring Systems of Market Reaction to Realized Return Volatility
Liu, Chi-chin (23 July 2008)
Volatility is an important measure of stock market performance. Competing securities market makers keep abreast of the pace of volatility change by adjusting the bid-ask spreads and bid/ask quotes properly and efficiently. For intradaily high frequency transaction data, the observed volatility of stock returns can be decomposed into the sum of two components: the realized volatility and the volatility due to microstructure noise. The quote adjustments of the market makers comprise part of the microstructure noise. In this study, we define the ratio of the realized integrated volatility to the observed squared returns as the proportion of realized integrated volatility (PIV). Time series models with generalized error distributed innovations are fitted to the PIV data based on 70-minute returns of NYSE tick-to-tick transaction data. Both retrospective and dynamic online control charts of the PIV data are established based on the fitted time series models. The McNemar test indicates that the dynamic online control charts have the same power to detect out-of-control events as the retrospective control charts. The Wilcoxon signed-rank test is adopted to test the differences between the changes of the market maker volatility and the realized volatility for in-control and out-of-control periods, respectively. The results reveal that points above the upper control limit correspond to situations in which the market makers cannot keep up with the realized integrated volatility, whereas points below the lower control limit indicate excessive reaction by the market makers.
|
4 |
Visualization of an Individual Carbon Footprint Mitigation Plan Using Transaction Data
Brånemark, Beatrice (January 2021)
Achieving the objectives of the Paris Agreement requires action on several fronts. This study looked into how to visualize a mitigation plan for individuals' lifestyle-related carbon footprints. The research consisted of designing and implementing a prototype containing a visualization of a mitigation plan in the Swedish mobile app DO, a newer type of carbon calculator that uses transaction data to estimate its users' carbon footprints. A user study was then conducted with app users to evaluate the visualization. The findings indicated that proper handling of data is important for what a mitigation plan communicates to the user, but that receiving guidance on how to proceed was greatly appreciated regardless. For future research, the visualization of a mitigation plan on small screens could be developed further with the prototype developed for this study as a starting point. It was suggested that such research could revolve around interaction improvements, evaluation with more frequent users, and observing whether a mitigation plan could affect behavior change.
|
5 |
Information visualization as an interactive business intelligence tool for improved management and self-assessment of financial brokers in private banking
Tasola Kullander, Petter (January 2019)
With an increase in storage capacity, many organizations strive to collect as much data as possible. The data can then be aggregated and visualized in order to provide a strategic advantage, commonly referred to as Business Intelligence (BI). Several sectors, among them the financial sector, are not yet familiar with the full potential of BI visualizations. This report, an experimental design-oriented research study, set out to explore the affordances and challenges of designing an information visualization to improve both management and self-assessment of financial brokers in private banking at a large-scale Swedish bank. To explore this area, a prototype was iteratively developed based on information gathered from semi-structured interviews and evaluations with two private banking managers and a panel of UX professionals at the bank. The final prototype was then evaluated by the two managers and five of their financial brokers through a combination of a task analysis and semi-structured interviews. The results showed that the proposed visualization improved several aspects of financial management, including business overview and workload balancing. However, the proposed tool was not deemed useful for self-assessment, as financial brokers' performance depends so heavily on the current state of the market.
|
6 |
Analysis of Taiwan Stock Exchange high frequency transaction data
Hao Hsu, Chia- (06 July 2012)
Taiwan Security Market is a typical order-driven market. The electronic trading system of Taiwan Security Market, launched in 1998, significantly reduces the trade matching time (the current matching time is around 20 seconds) and promptly provides updated online trading information to traders. In this study, we establish an online transaction simulation system which can be applied to predict trade prices and study market efficiency. Models are established for the times and volumes of the newly added bid/ask orders on the match list. The exponentially weighted moving average (EWMA) method is adopted to update the model parameters. Match prices are predicted dynamically based on the EWMA-updated models. Further, high frequency bid/ask order data are used to find the supply and demand curves as well as the equilibrium prices. Differences between the transaction prices and the equilibrium prices are used to investigate the efficiency of Taiwan Security Market. Finally, EWMA and CUSUM control charts are used to monitor the market efficiency. In the empirical study, we analyze the intra-daily high frequency match data (April 2005) of Uni-president Enterprises Corporation and Formosa Plastics Corporation.
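The EWMA updating mentioned in this abstract can be illustrated with the textbook recursion z_t = λ·x_t + (1 − λ)·z_{t−1}. The sketch below is a generic Python illustration, not the thesis's model: the function names, the smoothing weight `lam`, and the starting value are assumptions, and the abstract does not say which parameters are updated or with what weight.

```python
def ewma_update(previous, observation, lam=0.2):
    """One EWMA step: weight lam on the new observation, 1 - lam on the past."""
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must lie in (0, 1]")
    return lam * observation + (1.0 - lam) * previous

def ewma_path(observations, start, lam=0.2):
    """Apply the EWMA recursion to a sequence, returning every smoothed value."""
    path = []
    z = start
    for x in observations:
        z = ewma_update(z, x, lam)
        path.append(z)
    return path

if __name__ == "__main__":
    # Hypothetical order volumes; lam=0.5 is chosen only to keep the arithmetic simple.
    print(ewma_path([1.0, 2.0, 3.0], start=1.0, lam=0.5))  # [1.0, 1.5, 2.25]
```

A larger `lam` makes the estimate react faster to new orders at the cost of more noise, which is the usual trade-off when tuning an EWMA-based monitor.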
|
7 |
Pattern Matching for Financial Time Series Data
Liu, Ching-An (29 July 2008)
In security markets, stock price movements are closely linked to market information. For example, the subprime mortgage crisis triggered a global financial crisis in 2007, and drops occurred in virtually every stock market in the world. After the Federal Reserve took several steps to address the crisis, the stock markets gradually stabilized. Traders' reactions to arriving information result in different patterns of stock price movements. Pattern matching is therefore an important subject in future movement prediction, rule discovery and computer-aided diagnosis. In this research, we propose a pattern matching procedure to capture similar stock price movements of two listed companies during one day. First, a longest common subsequence search algorithm is introduced to sieve out the time intervals where the two listed companies have the same integrated volatility levels and price rise/drop trends. Next, we transform the raw price data in the matching time periods to Bollinger Band Percent data, then use the power spectrum to extract low frequency components. Adjusted Pearson chi-squared tests are performed to analyze the similarity of the price movement patterns in these periods. We first study the procedure by simulation, then apply it to an empirical analysis of high frequency transaction data of the NYSE.
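The first step described above relies on the longest common subsequence (LCS). A minimal Python sketch of the standard dynamic-programming LCS follows. How the thesis encodes each time interval is not stated in the abstract, so the (volatility level, trend) tuples used here are a hypothetical encoding chosen only for illustration.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of sequences a and b (classic DP)."""
    m, n = len(a), len(b)
    # dp[i][j] = LCS length of a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    # Walk back through the table to recover one optimal subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]

if __name__ == "__main__":
    # Hypothetical per-interval labels: (volatility level, price trend).
    stock_x = [("high", "up"), ("low", "up"), ("low", "down")]
    stock_y = [("low", "up"), ("high", "up"), ("low", "down")]
    print(longest_common_subsequence(stock_x, stock_y))
```

The table costs O(mn) time and memory, which is unproblematic for the number of intervals in a single trading day.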
|
8 |
Essays on Bayesian analysis of state space models with financial applications
Gingras, Samuel (05 1900)
This thesis is organized in three chapters which develop posterior simulation methods for Bayesian inference in state space models and econometric models for the analysis of financial data.
In Chapter 1, we consider the problem of posterior simulation in state space models with non-linear non-Gaussian observables and univariate Gaussian states. We propose a new Markov Chain Monte Carlo (MCMC) method that updates the parameter vector of the state dynamics and the state sequence together as a single block. The MCMC proposal is drawn in two steps: the marginal proposal distribution for the parameter vector is constructed using an approximation of the gradient and Hessian of its log posterior density, with the state vector integrated out. The conditional proposal distribution for the state sequence given the proposal of the parameter vector is the one described in McCausland (2012). Computation of the approximate gradient and Hessian combines computational by-products of the state draw with a modest amount of additional computation. We compare the numerical efficiency of our posterior simulation with that of the Ancillarity-Sufficiency Interweaving Strategy (ASIS) described in Kastner & Frühwirth-Schnatter (2014), using the Gaussian stochastic volatility model and the panel of 23 daily exchange rates from that paper. For computing the posterior mean of the volatility persistence parameter, our numerical efficiency is 6-27 times higher; for the volatility of volatility parameter, 18-53 times higher. We analyse transaction counts in a second example using dynamic Poisson and Gamma-Poisson models. Despite non-Gaussianity of the count data, we obtain high numerical efficiency that is not much lower than that reported in McCausland (2012) for a sampler that involves pre-computing the shape of a static posterior distribution of parameters.
In Chapter 2, we propose a new stochastic conditional duration (SCD) model for the analysis of high-frequency financial transaction data. We identify undesirable features of existing parametric conditional duration densities and propose a new family of flexible conditional densities capable of matching a wide variety of distributions with moderately varying hazard functions. Guided by theoretical considerations from queuing theory, we introduce nonparametric deviations around a central exponential distribution, which we argue is a sound first-order model for financial durations, using a Bernstein density. The resulting density is not only flexible, in the sense that it can approximate any continuous density on [0,∞) arbitrarily closely, provided it consists of a large enough number of terms, but also amenable to shrinkage towards the exponential distribution. Thanks to highly efficient draws of state variables, the numerical efficiency of our posterior simulation compares very favourably with those obtained in previous studies. We illustrate our methods using quotation data on equities traded on the Toronto Stock Exchange. We find that models with our proposed conditional density having fewer than four terms provide the best fit. The smooth variation found in the hazard functions, together with the possibility of their being non-monotonic, would have been impossible to capture using commonly used parametric specifications.
In Chapter 3, we introduce a new stochastic duration model for transaction times in asset markets. We argue that widely accepted rules for aggregating seemingly related trades mislead inference pertaining to durations between unrelated trades: while any two trades executed in the same second are probably related, it is extremely unlikely that all such pairs of trades are, in a typical sample. By placing uncertainty about which trades are related within our model, we improve inference for the distribution of duration between unrelated trades, especially near zero. We propose a discrete model for censored transaction times allowing for zero-inflation resulting from clusters of related trades. The discrete distribution of durations between unrelated trades arises from a flexible density amenable to shrinkage towards an exponential distribution. In an empirical example, we find that the underlying conditional hazard function for (uncensored) durations between unrelated trades varies much less than what most studies find; a discrete distribution for unrelated trades based on an exponential distribution provides a better fit for all three series analyzed. We claim that this is because we avoid statistical artifacts that arise from deterministic trade-aggregation rules and unsuitable parametric distributions.
|
9 |
Financial Models of Interaction Based on Marked Point Processes and Gaussian Fields / Modellierung von Interaktionseffekten in Finanzdaten mittels Markierter Punktprozesse und Gaußscher Zufallsfelder
Malinowski, Alexander (18 December 2012)
No description available.
|