Global ETD Search

1	Může modelová kombinace řídit prognózu volatility? / Can Model Combination Improve Volatility Forecasting? Tyuleubekov, Sabyrzhan January 2019 A wide range of forecasting methods is available today, and forecasters face several challenges when selecting an optimal method for volatility forecasting. To make use of this wide selection of forecasts, this thesis tests multiple forecast combination methods. Although there is a plethora of forecast combination literature, the combination of traditional methods with machine learning methods is relatively rare. We implement the following combination techniques: (1) simple mean forecast combination, (2) OLS combination, (3) ARIMA on the OLS combined fit, (4) NNAR on the OLS combined fit and (5) KNN regression on the OLS combined fit. To the best of our knowledge, the latter two combination techniques have not yet been studied in the academic literature. Additionally, this thesis should help a forecaster with three choices that complicate forecasting: (1) the choice of volatility proxy, (2) the choice of forecast accuracy measure and (3) the choice of training sample length. We found that squared and absolute return volatility proxies are much less efficient than the Parkinson and Garman-Klass volatility proxies. Likewise, we show that the forecast accuracy measure (RMSE, MAE or MAPE) influences the ranking of optimal forecasts. Finally, we found that though forecast quality does not depend on training sample length, we see that forecast...
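The range-based proxies and the first two combination techniques named in the abstract of entry 1 can be illustrated with a minimal sketch. It uses the textbook Parkinson and Garman-Klass variance formulas and an ordinary least-squares fit via NumPy; the function names and the array layout (one column per individual forecast) are assumptions made for illustration and are not taken from the thesis.

```python
import numpy as np

def parkinson_var(high, low):
    """Parkinson range-based variance proxy: ln(H/L)^2 / (4 ln 2)."""
    return np.log(high / low) ** 2 / (4.0 * np.log(2.0))

def garman_klass_var(open_, high, low, close):
    """Garman-Klass proxy: 0.5 ln(H/L)^2 - (2 ln 2 - 1) ln(C/O)^2."""
    return (0.5 * np.log(high / low) ** 2
            - (2.0 * np.log(2.0) - 1.0) * np.log(close / open_) ** 2)

def mean_combination(forecasts):
    """Simple mean combination: average the individual forecasts row by row.

    forecasts has shape (n_periods, n_models); this layout is an assumption.
    """
    return forecasts.mean(axis=1)

def ols_combination(train_forecasts, train_target, new_forecasts):
    """OLS combination: regress the realised proxy on the individual
    forecasts (with an intercept) and apply the fitted weights to new forecasts.
    """
    X = np.column_stack([np.ones(len(train_forecasts)), train_forecasts])
    beta, *_ = np.linalg.lstsq(X, train_target, rcond=None)
    X_new = np.column_stack([np.ones(len(new_forecasts)), new_forecasts])
    return X_new @ beta
```

The remaining techniques in the list (ARIMA, NNAR and KNN regression on the OLS combined fit) would take the output of `ols_combination` as their input series; they are not sketched here because the abstract gives no further detail on how they are configured.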
2	Přesnost satelitních modelů v zátěžových testech bank / Satellite Model Accuracy in Bank Stress Testing Hamáček, Filip January 2019 (Abstract dated January 4, 2019.) This thesis deals with credit risk satellite models in the Czech Republic. A satellite model is a tool for predicting a financial variable from macroeconomic variables and is useful for stress testing the resilience of the banking sector. The aim of this thesis is to test the accuracy of prediction models for Probability of Default in three different loan segments: Corporate, Housing and Consumer. The model currently used by the Czech National Bank (CNB) has been largely unchanged since 2012 and its predictions can be improved. This thesis tests the accuracy of the original CNB model by developing new models using modern techniques, mainly model combination methods: Bayesian Model Averaging (currently used by the European Central Bank) and Frequentist Model Averaging. The last approach used is neural networks.
3	Mitigating predictive uncertainty in hydroclimatic forecasts: impact of uncertain inputs and model structural form Chowdhury, Shahadat Hossain, Civil & Environmental Engineering, Faculty of Engineering, UNSW January 2009 Hydrologic and climate models predict variables through a simplification of the underlying complex natural processes. Model development involves minimising predictive uncertainty. Predictive uncertainty arises from three broad sources: measurement error in observed responses, uncertainty of input variables and model structural error. This thesis introduces ways to improve the predictive accuracy of hydroclimatic models by considering input and structural uncertainties. The specific methods developed to reduce the uncertainty caused by erroneous inputs and model structural errors are outlined below. The uncertainty in hydrological model inputs, if ignored, introduces systematic biases in the estimated parameters. This thesis presents a method, known as simulation extrapolation (SIMEX), to ascertain the extent of parameter bias. SIMEX starts by generating a series of alternate inputs by artificially adding white noise in increasing multiples of the known input error variance. The resulting alternate parameter sets allow formulation of an empirical relationship between their values and the level of noise present. SIMEX is based on the theory that the trend in alternate parameters can be extrapolated back to the notional error-free zone. The case study relates to erroneous sea surface temperature anomaly (SSTA) records used as input variables of a linear model to predict the Southern Oscillation Index (SOI). SIMEX achieves a reduction in residual errors from the SOI prediction. In addition, a hydrologic application of SIMEX is demonstrated by a synthetic simulation within a three-parameter conceptual rainfall-runoff model.
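The SIMEX procedure described above can be sketched for the simplest case, a straight-line regression with one noisy input. This is an illustrative toy, not the thesis's implementation: the function name, the grid of noise multiples, the number of replicates and the quadratic extrapolation curve are all assumptions chosen for the example.

```python
import numpy as np

def simex_slope(x_obs, y, error_var, lambdas=(0.0, 0.5, 1.0, 1.5, 2.0),
                n_reps=200, seed=0):
    """Return (SIMEX-corrected slope, naive slope) for y regressed on x_obs.

    error_var is the known variance of the measurement error in x_obs.
    """
    rng = np.random.default_rng(seed)
    mean_slopes = []
    for lam in lambdas:
        slopes = []
        for _ in range(n_reps):
            # Simulation step: add extra white noise with variance lam * error_var.
            noise = rng.normal(0.0, np.sqrt(lam * error_var), size=x_obs.shape)
            slope, _intercept = np.polyfit(x_obs + noise, y, 1)
            slopes.append(slope)
        mean_slopes.append(np.mean(slopes))
    # Extrapolation step: fit an empirical trend of the slope against the
    # noise multiple (a quadratic is assumed here) and evaluate it at
    # lambda = -1, the notional error-free level.
    coeffs = np.polyfit(lambdas, mean_slopes, 2)
    corrected = np.polyval(coeffs, -1.0)
    naive = np.polyfit(x_obs, y, 1)[0]
    return corrected, naive
```

With a noisy input the naive slope is biased towards zero, and the extrapolated value moves back towards the slope that an error-free input would give. How close it gets depends on how well the assumed extrapolation curve matches the true trend.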
This thesis next advocates reducing the structural uncertainty of any single model by combining multiple alternative model responses. Current approaches for combining hydroclimatic forecasts are generally limited to using combination weights that remain static over time. This research develops a methodology for combining forecasts from multiple models in a dynamic setting as an improvement over static weight combination. The model responses are mixed on a pairwise basis using mixing weights that vary in time, reflecting the persistence of individual model skills. The concept is referred to here as the pairwise dynamic weight combination. Two approaches for forecasting the dynamic weights are developed. The first uses a mixture of two basis distributions: a three-category ordered logistic regression model and a generalised linear autoregressive model. The second uses a modified nearest neighbour approach to forecast the future weights. These alternatives are first used to combine a univariate response forecast, the NINO3.4 SSTA index. This is followed by a similar combination for the entire global gridded SSTA forecast field. Results from these applications show significant improvements due to the dynamic model combination approach. The last application demonstrating the dynamic combination logic uses the dynamically combined multivariate SSTA forecast field as the basis for developing multi-site flow forecasts in the Namoi River catchment in eastern Australia. To further reduce structural uncertainty in the flow forecasts, three forecast models are formulated and the dynamic combination approach is applied again. The study demonstrates that the improved SSTA forecast (due to dynamic combination) in turn improves all three flow forecasts, while the dynamic combination of the three flow forecasts results in further improvements.
Keywords: Dynamic weight; Uncertainty; Model combination; Input error; SIMEX; Flow forecast; Sea surface temperature; NINO3.4; Mixture regression
4	A Hidden Markov Model-Based Approach for Emotional Speech Synthesis Yang, Chih-Yung 30 August 2010 In this thesis, we describe two approaches to automatically synthesize the emotional speech of a target speaker based on the hidden Markov model for his/her neutral speech. In the interpolation-based method, the basic idea is model interpolation between the neutral model of the target speaker and an emotional model selected from a candidate pool. Both the selection of the interpolation model and the computation of the interpolation weight are based on a model-distance measure, for which we propose a monophone-based Mahalanobis distance (MBMD). In the parallel model combination (PMC) based method, the basic idea is to model the mismatch between the neutral model and the emotional model. We train a linear regression model to describe this mismatch and then combine the target speaker's neutral model with the linear regression model. We evaluate our approach on synthesized emotional speech expressing anger, happiness and sadness with several subjective tests. Experimental results show that the implemented system is able to synthesize speech with the emotional expressiveness of the target speaker. Keywords: speech synthesis; HMM; emotional expressiveness; model combination; linear regression; model interpolation; Mahalanobis distance
5	A Query Dependent Ranking Approach for Information Retrieval Lee, Lian-Wang 28 August 2009 Ranking model construction is an important topic in information retrieval. Recently, many approaches based on the idea of "learning to rank" have been proposed for this task, and most of them attempt to score all documents of different queries by resorting to a single function. In this thesis, we propose a novel framework of query-dependent ranking. A simple similarity measure is used to calculate similarities between queries. An individual ranking model is constructed for each training query with its corresponding documents. When a new query is issued, the documents retrieved for it are ranked according to the scores determined by a ranking model that is combined from the models of similar training queries. A mechanism for determining the combining weights is also provided. Experimental results show that this query-dependent ranking approach is more effective than other approaches. Keywords: information retrieval; ranking model; model combination; query similarity; learning to rank; query-dependent ranking
7	Automatic Generation of Music for Inducing Emotive and Physiological Responses Monteith, Kristine Perry 13 August 2012 Music and emotion are two realms traditionally considered to be unique to human intelligence. This dissertation focuses on furthering artificial intelligence research, specifically in the area of computational creativity, by investigating methods of composing music that elicits desired emotional and physiological responses. It includes the following: an algorithm for generating original musical selections that effectively elicit targeted emotional and physiological responses; a description of some of the musical features that contribute to the conveyance of a given emotion or the elicitation of a given physiological response; and an account of how this algorithm can be used effectively in two different situations, the generation of soundtracks for fairy tales and the generation of melodic accompaniments for lyrics. This dissertation also presents research on more general machine learning topics. These include a method of combining output from base classifiers in an ensemble that improves accuracy over a number of different baseline strategies, a description of some of the problems inherent in the Bayesian model averaging strategy, and a novel algorithm for improving it. Keywords: automatic music generation; computational creativity; ensembles; Bayesian model combination; Computer Sciences
8	Predicting toxicity caused by high-dose-rate brachytherapy boost for prostate cancer Estefan, Dalia January 2019 Introduction: Treating localized prostate cancer with combination radiotherapy consisting of external beam radiotherapy (EBRT) and high-dose-rate brachytherapy (HDR-BT) has been proven to result in better disease outcome than EBRT only. There is, however, a decreasing trend in utilization of combination therapy, partially due to concerns for elevated toxicity risks. Aim: To determine which parameters correlate to acute and late (≤ 6 months) urinary toxicity (AUT and LUT) and acute and late rectal toxicity (ART and LRT), and thereafter create predictive models for rectal toxicity. Methods: Data on toxicity rates and 32 patient, tumor and treatment parameters were collected from 359 patients treated between 2008 and 2018 with EBRT (42 Gy in 14 fractions) and HDR-BT (14.5 Gy in 1 fraction) for localized prostate cancer at Örebro University Hospital. Bivariate analyses were conducted on all parameters and the outcome variables AUT, LUT, ART and LRT grade ≥ 1, graded according to the RTOG criteria. Parameters correlating to ART and LRT in this and previous studies were included in multivariate logistic regression analyses for creation of predictive models. Results: Most toxicities, 86%, were of grade 0 or 1, only 9% of patients had grade 2 – 3 toxicity. Only 2 – 4 parameters correlated to the respective toxicities in bivariate analyses. Logistic regressions generated no significant predictors of ART or LRT. Therefore, no predictive models were obtained. Conclusion: None of the included parameters have enough discriminative abilities regarding rectal toxicity. Predictive models can most probably be obtained by including other parameters and more patients. Keywords: Medical and Health Sciences; Medicin och hälsovetenskap
9	Using Pareto points for model identification in predictive toxicology Palczewska, Anna Maria, Neagu, Daniel, Ridley, Mick J. January 2013 Predictive toxicology is concerned with the development of models that are able to predict the toxicity of chemicals. A reliable prediction of the toxic effects of chemicals in living systems is highly desirable in cosmetics, drug design or food protection to speed up the process of chemical compound discovery while reducing the need for lab tests. There is an extensive literature on best practice in model generation and data integration, but the management and automated identification of relevant models from available collections of models is still an open problem. Currently, the decision on which model should be used for a new chemical compound is left to users. This paper intends to initiate the discussion on automated model identification. We present an algorithm, based on Pareto optimality, which mines model collections and identifies a model that offers a reliable prediction for a new chemical compound. The performance of this new approach is verified for two endpoints: IGC50 and LogP. The results show great potential for automated model identification methods in predictive toxicology.
10	Reducing development costs of large vocabulary speech recognition systems / Réduction des coûts de développement de systèmes de reconnaissance de la parole à grand vocabulaire Fraga Da Silva, Thiago 29 September 2014 Over the past decades, significant advances have been made in large vocabulary speech recognition. One of the challenges in the field is reducing the development costs required to build a new system or to adapt an existing system to a new task, language or dialect. State-of-the-art speech recognition systems are based on the principles of statistical learning, using the information provided by two stochastic models, an acoustic model (AM) and a language model (LM). The standard methods used to build these models rest on two basic assumptions: the training data sets are sufficiently large, and the training data match the target task well. It is well known that a large part of the development cost is due to preparing corpora that satisfy these two conditions, the main source of cost being the manual transcription of audio data. Moreover, for some applications, notably the recognition of so-called "low-resourced" languages and dialects, data collection is in itself a difficult undertaking. This thesis aims to examine and propose methods that reduce the need for manual transcription of audio data for a given task. Two lines of research were pursued. First, so-called "unsupervised" training methods are explored. What they have in common is the use of audio transcriptions obtained automatically with an existing recognition system.
Unsupervised methods are explored for building three of the main components of recognition systems. First, a new method for unsupervised AM training is proposed: using several decoding hypotheses (instead of only the best one) leads to substantial performance gains over the standard approach. The unsupervised approach is also extended to estimating the parameters of the neural network (NN) used to extract acoustic features. This approach makes it possible to build acoustic models in a fully unsupervised way and gives results that are competitive with NNs estimated in a supervised manner. Finally, unsupervised methods are explored for estimating standard back-off LMs and neural LMs. It is shown that unsupervised LM training yields additive (albeit small) performance gains on top of those obtained by unsupervised AM training. Second, this thesis proposes model interpolation as a fast and flexible alternative for building AMs for a target task. Models obtained by interpolation outperform the baseline models, notably those estimated on pooled samples or those adapted to the target task. Model interpolation is shown to be particularly useful for recognizing low-resourced dialects. When the amount of acoustic training data for the target dialect is small (2 to 3 hours) or even zero, model interpolation leads to considerable performance gains over standard methods. / One of the outstanding challenges in large vocabulary automatic speech recognition (ASR) is the reduction of development costs required to build a new recognition system or adapt an existing one to a new task, language or dialect.
State-of-the-art ASR systems are based on the principles of the statistical learning paradigm, using information provided by two stochastic models, an acoustic model (AM) and a language model (LM). The standard methods used to estimate the parameters of such models are founded on two main assumptions: the training data sets are large enough, and the training data match the target task well. It is well known that a large part of system development costs is due to the construction of corpora that fulfill these requirements. In particular, manually transcribing the audio data is the most expensive and time-consuming endeavor. For some applications, such as the recognition of low-resourced languages or dialects, finding and collecting data is also a hard (and expensive) task. As a means to lower the cost of ASR system development, this thesis proposes and studies methods that aim to alleviate the need for manually transcribing audio data for a given target task. Two axes of research are explored. First, unsupervised training methods are explored in order to build three of the main components of ASR systems: the acoustic model, the multi-layer perceptron (MLP) used to extract acoustic features, and the language model. The unsupervised training methods aim to estimate the model parameters using a large amount of automatically (and inaccurately) transcribed audio data, obtained with an existing recognition system. A novel method for unsupervised AM training that copes well with the automatic audio transcripts is proposed: the use of multiple recognition hypotheses (rather than only the best one) leads to consistent gains in performance over the standard approach. Unsupervised MLP training is proposed as an alternative for building efficient acoustic models in a fully unsupervised way. Compared to cross-lingual MLPs trained in a supervised manner, the unsupervised MLP leads to competitive performance levels even if trained on only about half the amount of data.
Unsupervised LM training approaches are proposed to estimate standard back-off n-gram and neural network language models. It is shown that unsupervised LM training leads to additive gains in performance on top of unsupervised AM training. Second, this thesis proposes the use of model interpolation as a rapid and flexible way to build task-specific acoustic models. In reported experiments, models obtained via interpolation outperform the baseline pooled models and equivalent maximum a posteriori (MAP) adapted models. Interpolation proves to be especially useful for low-resourced dialect ASR. When only a few (2 to 3 hours) or no acoustic data truly matching the target dialect are available for AM training, model interpolation leads to substantial performance gains compared to the standard training methods. Keywords: Reconnaissance de la parole; Apprentissage non-supervisée; Adaptation de modèles; Développement de systèmes; Réduction des coûts; Combinaison de modèles / Speech recognition; Unsupervised training; Model adaptation; System development; Cost reduction; Model combination
