An online adaptive learning algorithm for optimal trade execution in high-frequency markets

A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy
in the Faculty of Science, School of Computer Science and Applied Mathematics
University of the Witwatersrand. October 2016.

Automated algorithmic trade execution is a central problem in modern financial markets; however, finding and navigating optimal trajectories in this system is a non-trivial
task. Many authors have developed exact analytical solutions by making simplifying
assumptions regarding governing dynamics. For practical feasibility and robustness, however,
a more dynamic approach is needed to capture the spatial and temporal system
complexity and adapt as intraday regimes change.
This thesis aims to consolidate four key ideas: 1) the financial market as a complex
adaptive system, where purposeful agents with varying system visibility collectively and
simultaneously create and perceive their environment as they interact with it; 2) spin
glass models as a tractable formalism to model phenomena in this complex system; 3) the
multivariate Hawkes process as a candidate governing process for limit order book events;
and 4) reinforcement learning as a framework for online, adaptive learning. Combined
with the data and computational challenges of developing an efficient, machine-scale
trading algorithm, we present a feasible scheme which systematically encodes these ideas.
We first determine the efficacy of the proposed learning framework, under the conjecture
of approximate Markovian dynamics in the equity market. We find that a simple lookup
table Q-learning algorithm, with discrete state attributes and discrete actions, is able
to improve post-trade implementation shortfall by adapting a typical static arrival-price
volume trajectory with respect to prevailing market microstructure features streaming
from the limit order book.
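
As an illustration of the lookup-table approach described above, the following is a minimal sketch of tabular Q-learning with discrete states and actions. It is not the thesis implementation: the state tuple (time bucket, inventory bucket, spread bucket), the action set (multipliers applied to a baseline volume trajectory), the reward (negative shortfall) and all parameter values are assumptions chosen for the example.

```python
import random
from collections import defaultdict


class TabularQ:
    """Lookup-table Q-learning over discrete states and actions (illustrative)."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
        self.actions = list(actions)   # hypothetical: volume multipliers
        self.alpha = alpha             # learning rate
        self.gamma = gamma             # discount factor
        self.epsilon = epsilon         # exploration probability
        self.rng = random.Random(seed)
        self.q = defaultdict(float)    # (state, action) -> estimated value

    def best_action(self, state):
        # Greedy action; ties resolve to the first action in the list.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def choose(self, state):
        # Epsilon-greedy exploration.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best_action(state)

    def update(self, state, action, reward, next_state, terminal=False):
        # Standard one-step Q-learning update.
        target = reward
        if not terminal:
            target += self.gamma * max(self.q[(next_state, a)] for a in self.actions)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


# Hypothetical usage: state = (time bucket, inventory bucket, spread bucket).
agent = TabularQ(actions=(0.5, 1.0, 1.5))
state, next_state = (0, 3, 1), (1, 2, 1)
action = agent.choose(state)
agent.update(state, action, reward=-0.8, next_state=next_state)
```

Here the reward is taken to be the negative of a per-step shortfall, so that maximising value corresponds to reducing implementation shortfall; other reward definitions are equally possible.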
To enumerate a scale-specific state space whilst avoiding the curse of dimensionality, we
propose a novel approach to detect the intraday temporal financial market state at each
decision point in the Q-learning algorithm, inspired by the complex adaptive system
paradigm. A physical analogy to the ferromagnetic Potts model at thermal equilibrium
is used to develop a high-speed maximum likelihood clustering algorithm, appropriate
for measuring critical or near-critical temporal states in the financial system. State
features are studied to extract time-scale-specific state signature vectors, which serve as
low-dimensional state descriptors and enable online state detection.
To assess the impact of agent interactions on the system, a multivariate Hawkes process is
used to measure the resiliency of the limit order book with respect to liquidity-demand
events of varying size. By studying the branching ratios associated with key quote
replenishment intensities following trades, we ensure that the limit order book is expected
to be resilient with respect to the maximum permissible trade executed by the agent.
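
To make the notion of a branching ratio concrete, the sketch below assumes exponential Hawkes kernels of the form alpha_ij * exp(-beta_ij * t), for which the branching ratio from event type j to event type i is alpha_ij / beta_ij (the integral of the kernel). The kernel form, the two event types and all numerical values are assumptions for illustration only; they are not estimates from the thesis. The stability check is a sufficient condition, not a necessary one: a maximum column sum below 1 bounds the spectral radius of the branching matrix below 1.

```python
def branching_matrix(alpha, beta):
    """Return n[i][j] = alpha[i][j] / beta[i][j].

    n[i][j] is the expected number of type-i events directly triggered
    by a single type-j event, assuming kernel alpha * exp(-beta * t).
    """
    return [[a / b for a, b in zip(a_row, b_row)]
            for a_row, b_row in zip(alpha, beta)]


def is_subcritical_sufficient(n):
    """Sufficient check for stationarity: max column sum of n below 1.

    The column sum for type j is the expected total number of direct
    offspring of one type-j event; it upper-bounds the spectral radius.
    """
    n_cols = len(n[0])
    return max(sum(row[j] for row in n) for j in range(n_cols)) < 1.0


# Hypothetical event types: 0 = trade (liquidity demand), 1 = quote replenishment.
alpha = [[0.3, 0.0],
         [0.6, 0.2]]
beta = [[1.0, 1.0],
        [1.5, 1.0]]

n = branching_matrix(alpha, beta)
replenishment_per_trade = n[1][0]   # quotes directly triggered per trade
stable = is_subcritical_sufficient(n)
```

In this toy setting, `n[1][0]` plays the role of a quote replenishment branching ratio following a trade; a fitted model would supply the actual `alpha` and `beta` values.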
Finally, we present a feasible scheme for unsupervised state discovery, state detection
and online learning for high-frequency quantitative trading agents faced with a multi-featured,
asynchronous market data feed. We provide a technique for enumerating the
state space at the scale at which the agent interacts with the system, incorporating the
effects of a live trading agent on limit order book dynamics into the market data feed,
and hence the perceived state evolution.

Identifier: oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:wits/oai:wiredspace.wits.ac.za:10539/21710
Date: January 2016
Creators: Hendricks, Dieter
Source Sets: South African National ETD Portal
Language: English
Detected Language: English
Type: Thesis
Format: Online resource (189 leaves), application/pdf