351

Algorithmic Trading: Hidden Markov Models on Foreign Exchange Data

Idvall, Patrik, Jonsson, Conny January 2008 (has links)
In this master's thesis, hidden Markov models (HMMs) are evaluated as a tool for forecasting movements in a currency cross. With an ever increasing electronic market, making way for more automated trading, or so-called algorithmic trading, there is a constant need for new trading strategies trying to find alpha, the excess return, in the market. HMMs are based on the well-known theory of Markov chains, but with the states assumed hidden, governing some observable output. HMMs have mainly been used for speech recognition and communication systems, but have lately also been applied to financial time series with encouraging results. Both discrete and continuous versions of the model will be tested, as well as single- and multivariate input data. In addition to the basic framework, two extensions are implemented in the belief that they will further improve the prediction capabilities of the HMM. The first is a Gaussian mixture model (GMM), where each state is assigned a set of single Gaussians that are weighted together to replicate the density function of the stochastic process. This opens up for modeling the non-normal distributions that foreign exchange data are often assumed to follow. The second is an exponentially weighted expectation maximization (EWEM) algorithm, which takes time attenuation into consideration when re-estimating the parameters of the model. This keeps old trends in mind while giving more recent patterns greater attention. Empirical results show that the HMM using continuous emission probabilities can, for some model settings, generate acceptable returns with Sharpe ratios well over one, whilst the discrete version in general performs poorly. The GMM therefore seems to be a much-needed complement to the HMM. The EWEM, however, does not improve results as one might have expected. Our general impression is that the predictor using HMMs that we have developed and tested is too unstable to be adopted as a trading tool on foreign exchange data, with too many factors influencing the results. More research and development is called for.
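A minimal sketch of the continuous-emission HMM idea is given below, assuming the Python hmmlearn library and a synthetic return series; the GMM emissions and the EWEM re-estimation described in the abstract are not reproduced.

```python
# Sketch only: Gaussian-emission HMM fitted to stand-in FX returns with hmmlearn.
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.006, size=(1000, 1))    # stand-in for daily FX log returns

model = GaussianHMM(n_components=3, covariance_type="full", n_iter=200, random_state=0)
model.fit(returns)                                   # Baum-Welch (EM) parameter estimation

states = model.predict(returns)                      # Viterbi decoding of hidden regimes
last_state = states[-1]
next_state_probs = model.transmat_[last_state]       # one-step-ahead state distribution
expected_return = float(next_state_probs @ model.means_.ravel())
signal = "long" if expected_return > 0 else "short"
print(expected_return, signal)
```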
352

K-way Partitioning Of Signed Bipartite Graphs

Omeroglu, Nurettin Burak 01 September 2012 (has links) (PDF)
Clustering is the process in which data are differentiated and classified according to some criteria. As a result of the partitioning process, data are grouped into clusters for a specific purpose. In a social network, clustering of people is one of the most popular problems; therefore, we mainly concentrated on finding an efficient algorithm for this problem. In our study, the data are made up of two types of entities (e.g., people and groups vs. political issues and religious beliefs) and, distinct from most previous works, signed weighted bipartite graphs are used to model the relations among them. For the partitioning criterion, we use the strength of the opinions between the entities. Our main intention is to partition the data into k clusters so that entities within a cluster have strong relationships. One such example from the political domain is the opinion of people on issues. Using the signed weights on the edges, these bipartite graphs can be partitioned into two or more clusters. In the political domain, a cluster represents a strong relationship between a group of people and a group of issues. After partitioning, each cluster in the result set contains like-minded people and the issues they advocate. Our work introduces a general mechanism for k-way partitioning of signed bipartite graphs. One of the great advantages of our approach is that it does not require any preliminary information about the structure of the input dataset. The idea has been illustrated on real and randomly generated data, and promising results have been shown.
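As an illustration only, the sketch below partitions a synthetic signed people-by-issues matrix using a generic spectral embedding plus k-means; this is not the partitioning mechanism proposed in the thesis, and the data and k are assumptions.

```python
# Spectral-style k-way partition of a signed bipartite adjacency (illustrative only).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
A = rng.choice([-2, -1, 0, 1, 2], size=(30, 12))     # signed weights: people x issues

# Build the full signed bipartite adjacency and embed with its leading eigenvectors.
n_people, n_issues = A.shape
W = np.zeros((n_people + n_issues, n_people + n_issues))
W[:n_people, n_people:] = A
W[n_people:, :n_people] = A.T

k = 3
vals, vecs = np.linalg.eigh(W)                       # W is symmetric, so eigh applies
embedding = vecs[:, np.argsort(vals)[::-1][:k]]      # top-k eigenvectors as coordinates

labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embedding)
people_clusters, issue_clusters = labels[:n_people], labels[n_people:]
print(people_clusters, issue_clusters)
```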
353

Duality-based adaptive finite element methods with application to time-dependent problems

Johansson, August January 2010 (has links)
To simulate real world problems modeled by differential equations, it is often not sufficient to consider and tackle a single equation. Rather, complex phenomena are modeled by several partial differential equations that are coupled to each other. For example, a heartbeat involves electrical activity, the mechanics of the movement of the walls and valves, as well as blood flow - a true multiphysics problem. There may also be ordinary differential equations modeling the reactions on a cellular level, and these may act on a much finer scale in both space and time. Determining efficient and accurate simulation tools for such multiscale multiphysics problems is a challenge. The five scientific papers constituting this thesis investigate and present solutions to issues regarding accurate and efficient simulation using adaptive finite element methods. These include handling local accuracy through submodeling, analyzing error propagation in time-dependent multiphysics problems, developing efficient algorithms for adaptivity in time and space, and deriving error analysis for coupled PDE-ODE systems. In all these examples, the error is analyzed and controlled using the framework of dual-weighted residuals, and the spatial meshes are handled using octree-based data structures. However, few realistic geometries fit such grids, and to address this issue a discontinuous Galerkin Nitsche method is presented and analyzed.
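The dual-weighted residual principle can be illustrated in its simplest algebraic form, sketched below as a stand-in for the finite element setting (the problem data are synthetic assumptions): the error in a goal functional of a linear system is recovered from an adjoint solve weighted by the residual.

```python
# Dual-weighted residual idea for Au = b with goal J(u) = c^T u:
# J(u) - J(u_h) = z^T (b - A u_h), where A^T z = c is the dual (adjoint) problem.
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(50, 50)) + 50 * np.eye(50)      # well-conditioned test matrix
b = rng.normal(size=50)
c = rng.normal(size=50)                              # goal functional J(u) = c @ u

u = np.linalg.solve(A, b)                            # "exact" solution
u_h = u + 1e-3 * rng.normal(size=50)                 # perturbation plays the role of u_h

z = np.linalg.solve(A.T, c)                          # dual (adjoint) solution
estimate = z @ (b - A @ u_h)                         # dual-weighted residual
true_error = c @ u - c @ u_h
print(estimate, true_error)                          # agree up to round-off here
```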
354

Visualizing and modeling partial incomplete ranking data

Sun, Mingxuan 23 August 2012 (has links)
Analyzing ranking data is an essential component in a wide range of important applications including web search and recommendation systems. Rankings are difficult to visualize or model due to the computational difficulties associated with the large number of items. On the other hand, partial or incomplete rankings induce further difficulties, since approaches that adapt well to typical types of rankings cannot be applied generally to all types. While analyzing ranking data has a long history in statistics, the construction of an efficient framework to analyze incomplete ranking data (with or without ties) is currently an open problem. This thesis addresses the problem of scalability in visualizing and modeling partial incomplete rankings. In particular, we propose a distance measure for top-k rankings with the following three properties: (1) it is a metric, (2) it emphasizes top ranks, and (3) it is computationally efficient. Given the distance measure, the data can be projected into a low-dimensional continuous vector space via multi-dimensional scaling (MDS) for easy visualization. We further propose a non-parametric model for estimating distributions of partial incomplete rankings. For the non-parametric estimator, we use a triangular kernel that is a direct analogue of the Euclidean triangular kernel. The computations for large n are simplified using combinatorial properties and generating functions associated with symmetric groups. We show that our estimator is computationally efficient for rankings of arbitrary incompleteness and tie structure. Moreover, we propose an efficient learning algorithm to construct a preference elicitation system from partial incomplete rankings, which can be used to solve the cold-start problem in ranking recommendations. The proposed approaches are examined in experiments with real search engine and movie recommendation data.
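A rough sketch of the visualization step is given below, using a simple placement-based top-k distance and scikit-learn's MDS; this is not the metric or the kernel estimator developed in the thesis, the rankings are synthetic, and ties and arbitrary incompleteness are not handled.

```python
# Footrule-style top-k distance followed by metric MDS into 2-D (illustrative only).
import numpy as np
from sklearn.manifold import MDS

def topk_distance(r1, r2, k):
    """Items absent from a top-k list are placed at rank k (0-indexed)."""
    items = set(r1) | set(r2)
    pos1 = {item: i for i, item in enumerate(r1)}
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum(abs(pos1.get(it, k) - pos2.get(it, k)) for it in items)

rng = np.random.default_rng(3)
k, n_items, n_rankings = 5, 20, 40
rankings = [list(rng.permutation(n_items)[:k]) for _ in range(n_rankings)]

D = np.array([[topk_distance(a, b, k) for b in rankings] for a in rankings])
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(D)        # 2-D layout for plotting
print(coords[:3])
```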
355

Contributions to the Multivariate Analysis of Marine Environmental Monitoring

Graffelman, Jan 12 September 2000 (has links)
The thesis starts from the view that statistics begins with data, and opens by introducing the data sets studied: marine benthic species counts and chemical measurements made at a set of sites in the Norwegian Ekofisk oil field, with replicates, repeated annually. An introductory chapter details the sampling procedure and shows with reliability calculations that the (transformed) chemical variables have excellent reliability, whereas the biological variables have poor reliability, except for a small subset of abundant species. Transformed chemical variables are shown to be approximately normal. Bootstrap methods are used to assess whether the biological variables follow a Poisson distribution, and lead to the conclusion that the Poisson distribution must be rejected, except for rare species. A separate chapter details more work on the distribution of the species variables: truncated and zero-inflated Poisson distributions as well as Poisson mixtures are used in order to account for sparseness and overdispersion. Species are thought to respond to environmental variables, and regressions of the abundance of a few selected species onto chemical variables are reported. For rare species, logistic regression and Poisson regression are the tools considered, though there are problems of overdispersion. For abundant species, random coefficient models are needed in order to cope with intraclass correlation. The environmental variables, mainly heavy metals, are highly correlated, leading to multicollinearity problems. The next chapters use a multivariate approach, where all species data are treated simultaneously. The theory of correspondence analysis is reviewed, and some theoretical results on this method are reported (bounds for singular values, centring matrices). An applied chapter discusses the correspondence analysis of the species data in detail, detects outliers, addresses stability issues, and considers different ways of stacking data matrices to obtain an integrated analysis of several years of data and to decompose variation into within-sites and between-sites components. More than 40 % of the total inertia is due to variation within stations. Principal component analysis is used to analyse the set of chemical variables. Attempts are made to integrate the analysis of the biological and chemical variables. A detailed theoretical development shows how continuous variables can be mapped in an optimal manner as supplementary vectors into a correspondence analysis biplot. Geometrical properties are worked out in detail, and measures for the quality of the display are given, whereas artificial data and data from the monitoring survey are used to illustrate the theory developed. The theory of the display of supplementary variables in biplots is also worked out in detail for principal component analysis, with attention to the different types of scaling and the optimality of displayed correlations. A theoretical chapter follows that gives an in-depth treatment of canonical correspondence analysis (linearly constrained correspondence analysis, CCA for short), detailing many mathematical properties and aspects of this multivariate method, such as geometrical properties, biplots, use of generalized inverses, relationships with other methods, etc. Some applications of CCA to the survey data are dealt with in a separate chapter, with their interpretation and an indication of the quality of the display of the different matrices involved in the analysis.
Weighted principal component analysis of weighted averages is proposed as an alternative to CCA. This leads to a better display of the weighted averages of the species and, in the cases studied so far, also to biplots with a higher amount of explained variance for the environmental data. The thesis closes with a bibliography and outlines some suggestions for further research, such as the generalization of canonical correlation analysis for working with singular covariance matrices, the use of partial least squares methods to account for the excess of predictors, and data fusion problems to estimate missing biological data.
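For orientation only, a bare-bones correspondence analysis of a synthetic sites-by-species count table is sketched below via the SVD of standardized residuals; the supplementary-variable biplots, stability analyses and CCA developed in the thesis are not reproduced.

```python
# Classical correspondence analysis: SVD of standardized residuals of a count table.
import numpy as np

rng = np.random.default_rng(4)
N = rng.poisson(3.0, size=(15, 25)).astype(float)    # sites x species counts (synthetic)

P = N / N.sum()
r = P.sum(axis=1)                                    # row (site) masses
c = P.sum(axis=0)                                    # column (species) masses

S = np.diag(r**-0.5) @ (P - np.outer(r, c)) @ np.diag(c**-0.5)
U, sv, Vt = np.linalg.svd(S, full_matrices=False)

row_coords = np.diag(r**-0.5) @ U[:, :2] * sv[:2]    # principal coordinates, 2 axes
col_coords = np.diag(c**-0.5) @ Vt.T[:, :2] * sv[:2]
total_inertia = (sv**2).sum()
print(total_inertia, row_coords[:3])
```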
356

Design of a ROM-Less Direct Digital Frequency Synthesizer in 65 nm CMOS Technology

Ebrahimi Mehr, Golnaz January 2013 (has links)
A 4-bit, ROM-less direct digital frequency synthesizer (DDFS) is designed in 65 nm CMOS technology. Interleaving with the return-to-zero (RTZ) technique is used to increase the output bandwidth and the synthesized frequencies. The performance of the designed synthesizer is evaluated using the Cadence Virtuoso design tool. With a 3.2 GHz sampling frequency, the DDFS achieves a spurious-free dynamic range (SFDR) of 60 dB to 58 dB for synthesized frequencies between 200 MHz and 1.6 GHz. With a 6.4 GHz sampling frequency, the synthesizer achieves an SFDR of 46 dB to 40 dB for synthesized frequencies between 400 MHz and 3.2 GHz. The power consumption is 80 mW for the designed mixed-signal blocks.
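The basic DDFS frequency relationship can be sketched with back-of-the-envelope arithmetic, shown below under the assumption of a 32-bit phase accumulator; this is only an idealized model of the tuning-word arithmetic, not the 65 nm mixed-signal design or its RTZ interleaving.

```python
# DDFS tuning arithmetic: f_out = FTW / 2**ACC_BITS * fs (idealized sketch).
FS = 3.2e9                       # sampling frequency from the abstract (Hz)
ACC_BITS = 32                    # assumed phase accumulator width
F_OUT = 200e6                    # desired output frequency (Hz)

ftw = round(F_OUT / FS * 2**ACC_BITS)        # frequency tuning word
f_actual = ftw / 2**ACC_BITS * FS            # frequency actually synthesized
resolution = FS / 2**ACC_BITS                # tuning resolution (Hz)
nyquist_limit = FS / 2                       # highest synthesizable frequency
print(ftw, f_actual, resolution, nyquist_limit)
```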
357

Modified Genetic Algorithms for the Single Machine Scheduling Problem

Yang, Chih-Wei 11 August 2011 (has links)
In this paper we propose an improved algorithm to search for optimal solutions to the single machine total weighted tardiness scheduling problem. We propose combining the longest common sequence with the random key method. Numerical simulation shows that the proposed scheme can improve the search efficiency of the genetic algorithm on this problem in some cases.
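The random-key encoding can be sketched as follows on a synthetic instance; plain uniform crossover and truncation selection are used here for brevity, not the longest common sequence operator proposed in the paper.

```python
# Random-key GA for single machine total weighted tardiness (illustrative sketch).
import numpy as np

rng = np.random.default_rng(5)
n_jobs = 12
p = rng.integers(1, 10, n_jobs)                  # processing times
w = rng.integers(1, 5, n_jobs)                   # weights
d = rng.integers(5, 40, n_jobs)                  # due dates

def twt(keys):
    """Decode random keys by sorting, then compute total weighted tardiness."""
    order = np.argsort(keys)
    completion = np.cumsum(p[order])
    return float(np.sum(w[order] * np.maximum(completion - d[order], 0)))

pop = rng.random((60, n_jobs))
for _ in range(200):
    fitness = np.array([twt(ind) for ind in pop])
    parents = pop[np.argsort(fitness)[:30]]      # truncation selection
    children = []
    for _ in range(30):
        a, b = parents[rng.integers(30)], parents[rng.integers(30)]
        mask = rng.random(n_jobs) < 0.5          # uniform crossover on the keys
        child = np.where(mask, a, b)
        child[rng.integers(n_jobs)] = rng.random()   # point mutation
        children.append(child)
    pop = np.vstack([parents, children])

best = pop[np.argmin([twt(ind) for ind in pop])]
print("best schedule:", np.argsort(best), "TWT:", twt(best))
```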
358

The key factor of gold price and gold price forecasting: Is the gold price rise to 2000 USD per ounce a bubble?

Kuo, Yi-Wei 24 June 2012 (has links)
The gold price hit a record high of more than 1900 USD in 2011, so how to forecast the gold price, and whether the factors influencing it change over time, have become more interesting issues. The paper begins by trying to find a reasonable gold price, then cuts the study period from 1972 to 2011 into 7 stages and examines the influence factors of the gold price in each stage. Finally, this research uses the recent influence factors to build a forecasting model and tests its performance. The empirical result has three parts. First, from the viewpoint of purchasing power as of December 31, 1971, the gold price at the end of 2011 is too high. Secondly, the influence factors of the gold price change over time; they usually shift with important world economic events. Thirdly, the forecasting model performs well in both in-sample and out-of-sample backtesting, but if the influence factors change, its out-of-sample performance deteriorates.
359

Trading Strategy Mining with Gene Expression Programming

Huang, Chang-Hao 12 September 2012 (has links)
In this thesis, we apply gene expression programming (GEP) to train profitable trading strategies. We propose a model which utilizes several historical periods that are highly related to the current template period, with the best trading strategies of those historical periods generating the trading signals. To keep our model stable, we propose a trading decision mechanism based on a simple majority vote. The Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) is selected as our investment target and the trading period runs from 2000/9/14 to 2012/1/17, approximately twelve years. In our experiments, the lengths of the training period are 60, 90, 120, 180, and 270 trading days, respectively. We observe that models with higher voting thresholds usually make profitable trading decisions. The best cumulative return of 236.25% and the best annualized cumulative return of 10.63% occur when the 180-day training model is paired with an available threshold of 0.21 and a voting threshold of 0.88; these are higher than the cumulative return of 0.96% and the annualized cumulative return of 0.08% of the buy-and-hold strategy.
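The majority-vote decision mechanism can be sketched as below; the evolved GEP strategies are stubbed out as simple signal functions, and the threshold values are illustrative, not the thesis's calibrated settings.

```python
# Majority-vote trading decision over signals from several period-best strategies.
from typing import Callable, List

def vote_decision(signals: List[int], voting_threshold: float) -> int:
    """Each strategy emits +1 (buy), -1 (sell) or 0 (hold); act only if the
    winning side's share of non-zero votes exceeds the voting threshold."""
    active = [s for s in signals if s != 0]
    if not active:
        return 0
    buys = sum(1 for s in active if s > 0)
    share = max(buys, len(active) - buys) / len(active)
    if share < voting_threshold:
        return 0
    return 1 if buys > len(active) - buys else -1

# Hypothetical strategies, standing in for the best strategies of related periods.
strategies: List[Callable[[float], int]] = [
    lambda price: 1 if price > 7000 else -1,
    lambda price: 1 if price > 7200 else 0,
    lambda price: -1,
]
print(vote_decision([s(7300.0) for s in strategies], voting_threshold=0.66))
```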
360

D-optimal designs for weighted polynomial regression - a functional-algebraic approach

Chang, Sen-Fang 20 June 2004 (has links)
This paper is concerned with the problem of computing the approximate D-optimal design for polynomial regression with weight function w(x) > 0 on the design interval I = [m_0 - a, m_0 + a]. It is shown that if w'(x)/w(x) is a rational function on I and a is close to zero, then the problem of constructing D-optimal designs can be transformed into a differential equation problem, leading to a certain matrix involving a finite number of auxiliary unknown constants, which can be approximated by a Taylor expansion. We provide a recursive algorithm to compute the Taylor expansions of these constants. Moreover, the D-optimal interior support points are the zeros of a polynomial whose coefficients can be computed from a linear system.
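As a purely numerical counterpart to the functional-algebraic approach of the paper, the standard multiplicative algorithm for approximate D-optimal designs can be run on a grid, as sketched below; the weight function, degree and interval are illustrative assumptions.

```python
# Multiplicative algorithm for an approximate D-optimal design on a grid.
import numpy as np

deg = 3
m0, a = 0.0, 1.0
x = np.linspace(m0 - a, m0 + a, 401)             # candidate design points
w = np.exp(-x**2)                                # example weight function w(x) > 0
F = np.vander(x, deg + 1, increasing=True)       # polynomial regressors f(x)
lam = np.full(len(x), 1.0 / len(x))              # start from the uniform design

p = deg + 1
for _ in range(2000):
    M = (F * (lam * w)[:, None]).T @ F           # information matrix M(xi)
    Minv = np.linalg.inv(M)
    d = w * np.einsum("ij,jk,ik->i", F, Minv, F) # variance function d(x, xi)
    lam = lam * d / p                            # multiplicative update
    lam /= lam.sum()

support = x[lam > 1e-3]
print("approximate support points:", np.round(support, 3))
```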
