Doctor of Philosophy / The motivation for the research presented in this thesis stems from the recent availability of high-frequency limit order book data, the relative scarcity of studies employing such data, the economic significance of transaction cost management, and the perceived potential of data mining for uncovering patterns and relationships not identified by the traditional top-down modelling approach. We analyse and build computational models for order submissions on the Australian Stock Exchange, an order-driven market with a public electronic limit order book. The focus of the thesis is on the trade implementation problem faced by a trader who wants to transact a buy or sell order of a certain size. We use two approaches to build our models: top-down and bottom-up. The traditional, top-down approach is applied to develop an optimal order submission plan for an order which is too large to be traded immediately without a prohibitive price impact. We present an optimisation framework and some solutions for non-stationary and non-linear price impact and price impact risk. We find that our proposed transaction cost model produces fairly good forecasts of the variance of the execution shortfall. The second approach, bottom-up or data mining, is employed for trade sign inference, where trade sign is defined as the side which initiates both a trade and the market order that triggered the trade. We are interested in an endogenous component of the order flow, as evidenced by the predictable relationship between trade sign and the variables used to infer it. We want to discover the rules which govern the trade sign, and to establish a connection between them and two empirically observed regularities in market order submissions: competition for order execution and transaction cost minimisation. To achieve these aims we first use exploratory analysis of trade and limit order book data.
In particular, we conduct unsupervised clustering with the self-organising map technique. The visualisation of the transformed data reveals that buyer-initiated and seller-initiated trades form two distinct clusters. We then propose a local non-parametric trade sign inference model based on the k-nearest-neighbour classifier. The best k-nearest-neighbour classifier we constructed requires only three predictor variables and achieves an average out-of-sample accuracy of 71.40% (SD = 4.01%) across all of the tested stocks. The best set of predictor variables found for the non-parametric model is subsequently used to develop a piecewise linear trade sign model. That model proves superior to the k-nearest-neighbour classifier, achieving an average out-of-sample classification accuracy of 74.38% (SD = 4.25%). This result is statistically significant after adjusting for multiple comparisons. The overall classification performance of the piecewise linear model indicates a strong dependence between trade sign and the three predictor variables, and provides evidence for the endogenous component in the order flow. Moreover, the rules for trade sign classification derived from the structure of the piecewise linear model reflect the two regularities observed in market order submissions, competition for order execution and transaction cost minimisation, and offer new insights into the relationship between them. The results confirm the applicability and relevance of data mining for the analysis and modelling of stock market order submissions.
Identifier | oai:union.ndltd.org:ADTP/216019 |
Date | January 2006 |
Creators | Blazejewski, Adam |
Publisher | Engineering, School of Electrical and Information Engineering |
Source Sets | Australasian Digital Theses Program |
Language | en_AU |
Detected Language | English |
Rights | The author retains copyright of this thesis., http://www.library.usyd.edu.au/copyright.html |