  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Kombinování diskrétních pravděpodobnostních rozdělení pomocí křížové entropie pro distribuované rozhodování / Cross-entropy based combination of discrete probability distributions for distributed decision making

Sečkárová, Vladimíra January 2015
Dissertation abstract Title: Cross-entropy based combination of discrete probability distributions for distributed decision making Author: Vladimíra Sečkárová Author's email: seckarov@karlin.mff.cuni.cz Department: Department of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, Charles University in Prague Supervisor: Ing. Miroslav Kárný, DrSc., The Institute of Information Theory and Automation of the Czech Academy of Sciences Supervisor's email: school@utia.cas.cz Abstract: In this work we propose a systematic way to combine discrete probability distributions based on decision-making theory and information theory, namely the cross-entropy (also known as the Kullback-Leibler (KL) divergence). The optimal combination is a probability mass function minimizing the conditional expected KL divergence. The expectation is taken with respect to a probability density function that also minimizes the KL divergence under problem-reflecting constraints. Although the combination is derived for the case when the sources provide probabilistic information on a common support, it can be applied to other types of given information by the proposed transformation and/or extension. The discussion regarding proposed combining and sequential processing of available data, duplicate data, influence...
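The KL divergence named in this abstract can be illustrated with a minimal sketch. It only computes D(p || q) for two probability mass functions on a common support; it is not the combination rule proposed in the dissertation, and the function name and example values are assumptions chosen for illustration.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for two discrete distributions
    given as lists of probabilities on the same support (natural logarithm)."""
    if len(p) != len(q):
        raise ValueError("p and q must be defined on the same support")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # the term 0 * log(0 / q) is taken to be 0
        if qi == 0.0:
            return math.inf  # p puts mass where q puts none
        total += pi * math.log(pi / qi)
    return total


# Hypothetical example: two sources reporting distributions over three outcomes.
source_a = [0.7, 0.2, 0.1]
source_b = [0.5, 0.3, 0.2]
print(kl_divergence(source_a, source_b))
print(kl_divergence(source_b, source_a))  # generally differs: KL is not symmetric
```

The divergence is zero only when the two distributions coincide, which is why it can serve as a loss when searching for a combined distribution.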
22

Optimization in an Error Backpropagation Neural Network Environment with a Performance Test on a Pattern Classification Problem

Fischer, Manfred M., Staufer-Steinnocher, Petra 03 1900
Various techniques of optimizing the multiple class cross-entropy error function to train single hidden layer neural network classifiers with softmax output transfer functions are investigated on a real-world multispectral pixel-by-pixel classification problem that is of fundamental importance in remote sensing. These techniques include epoch-based and batch versions of backpropagation of gradient descent, PR-conjugate gradient and BFGS quasi-Newton errors. The method of choice depends upon the nature of the learning task and whether one wants to optimize learning for speed or generalization performance. It was found that, comparatively considered, gradient descent error backpropagation provided the best and most stable out-of-sample performance results across batch and epoch-based modes of operation. If the goal is to maximize learning speed and a sacrifice in generalization is acceptable, then PR-conjugate gradient error backpropagation tends to be superior. If the training set is very large, stochastic epoch-based versions of local optimizers should be chosen, utilizing a larger rather than a smaller epoch size to avoid unacceptable instabilities in the generalization results. (authors' abstract) / Series: Discussion Papers of the Institute for Economic Geography and GIScience
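The multiple-class cross-entropy error with softmax outputs mentioned in this abstract can be sketched as follows. This is a generic textbook formulation for a single training example, not the authors' code; the function names and the example values are assumptions.

```python
import math


def softmax(logits):
    """Convert raw output activations into class probabilities."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_error(logits, target_index):
    """Cross-entropy error for one example whose true class is target_index."""
    return -math.log(softmax(logits)[target_index])


def error_gradient(logits, target_index):
    """Gradient of the error with respect to the logits: softmax minus one-hot.
    This is the quantity that backpropagation passes to the hidden layer."""
    probs = softmax(logits)
    return [p - (1.0 if k == target_index else 0.0) for k, p in enumerate(probs)]


# Hypothetical three-class example.
logits = [2.0, 0.5, -1.0]
print(cross_entropy_error(logits, target_index=0))
print(error_gradient(logits, target_index=0))
```

Gradient descent, conjugate gradient and quasi-Newton optimizers differ only in how they use this gradient to update the weights.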
23

Techniques d'optimisation déterministe et stochastique pour la résolution de problèmes difficiles en cryptologie / Deterministic and stochastic optimization techniques for hard problems in cryptology

Bouallagui, Sarra 05 July 2010
Cette thèse s'articule autour des fonctions booléennes liées à la cryptographie et la cryptanalyse de certains schémas d'identification. Les fonctions booléennes possèdent des propriétés algébriques fréquemment utilisées en cryptographie pour constituer des S-Boxes (tables de substitution). Nous nous intéressons, en particulier, à la construction de deux types de fonctions : les fonctions courbes et les fonctions équilibrées de haut degré de non-linéarité. Concernant la cryptanalyse, nous nous focalisons sur les techniques d'identification basées sur les problèmes de perceptron et de perceptron permuté. Nous réalisons une nouvelle attaque sur le schéma afin de décider de sa faisabilité. Nous développons ici des nouvelles méthodes combinant l'approche déterministe DCA (Difference of Convex functions Algorithm) et heuristique (recuit simulé, entropie croisée, algorithmes génétiques...). Cette approche hybride, utilisée dans toute cette thèse, est motivée par les résultats intéressants de la programmation DC. / In cryptography, especially in block cipher design, Boolean functions are the basic elements. A cryptographic function should have high nonlinearity, since it can otherwise be attacked by linear methods. There are three goals for the research presented in this thesis: (1) finding a new construction algorithm, based on a deterministic model, for the most highly nonlinear Boolean functions in even dimension, that is, bent functions; (2) finding highly nonlinear Boolean functions; (3) cryptanalysing an identification scheme based on the perceptron problem. Heuristic optimisation algorithms (genetic algorithms and simulated annealing) and a deterministic one based on DC programming (DCA) were used together.
24

Développement d’outils pronostiques dynamiques dans le cancer de la prostate localisé traité par radiothérapie / Development of dynamic prognostic tools in localized prostate cancer treated by radiation therapy

Sene, Mbery 13 December 2013
La prédiction d'un événement clinique à l'aide d'outils pronostiques est une question centrale en oncologie. L'émergence des biomarqueurs mesurés au cours du temps permet de proposer des outils incorporant les données répétées de ces biomarqueurs pour mieux guider le clinicien dans la prise en charge des patients. L'objectif de ce travail est de développer et valider des outils pronostiques dynamiques de rechute de cancer de la prostate, chez des patients traités initialement par radiothérapie externe, en prenant en compte les données répétées du PSA, l'antigène spécifique de la prostate, en plus des facteurs pronostiques standard. Ces outils sont dynamiques car ils peuvent être mis à jour à chaque nouvelle mesure disponible du biomarqueur. Ils sont construits à partir de modèles conjoints pour données longitudinales et de temps d'événement. Le principe de la modélisation conjointe est de décrire l'évolution du biomarqueur à travers un modèle linéaire mixte, décrire le risque d'événement à travers un modèle de survie et lier ces deux processus à travers une structure latente. Deux approches existent, les modèles conjoints à effets aléatoires partagés et les modèles conjoints à classes latentes. Dans un premier travail, nous avons tout d'abord comparé, en terme de qualité d'ajustement et de pouvoir prédictif, des modèles conjoints à effets aléatoires partagés différant par leur forme de dépendance entre le PSA et le risque de rechute clinique. Puis nous avons évalué et comparé ces deux approches de modélisation conjointe. Dans un deuxième travail, nous avons proposé un outil pronostique dynamique différentiel permettant d'évaluer le risque de rechute clinique suivant l'initiation ou non d'un second traitement (un traitement hormonal) au cours du suivi. Dans ces travaux, la validation de l'outil pronostique a été basée sur deux mesures de pouvoir prédictif: le score de Brier et l'entropie croisée pronostique. 
Dans un troisième travail, nous avons enfin décrit la dynamique des PSA après un second traitement de type hormonal chez des patients traités initialement par une radiothérapie seule. / The prediction of a clinical event with prognostic tools is a central issue in oncology. The emergence of biomarkers measured over time makes it possible to build tools that incorporate repeated measurements of these biomarkers to better guide the clinician in the management of patients. The objective of this work is to develop and validate dynamic prognostic tools for the recurrence of prostate cancer in patients initially treated by external beam radiation therapy, taking into account repeated measurements of PSA, the prostate-specific antigen, in addition to standard prognostic factors. These tools are dynamic because they can be updated at each new available measurement of the biomarker. They are built from joint models for longitudinal and time-to-event data. The principle of joint modelling is to describe the evolution of the biomarker through a linear mixed model, describe the risk of the event through a survival model, and link these two processes through a latent structure. Two approaches exist: shared random-effect models and joint latent class models. In a first study, we first compared, in terms of goodness of fit and predictive accuracy, shared random-effect models differing in the form of dependency between the PSA and the risk of clinical recurrence. We then evaluated and compared the two joint modelling approaches. In a second study, we proposed a differential dynamic prognostic tool to evaluate the risk of clinical recurrence according to whether or not a second treatment (a hormonal treatment) is initiated during follow-up. In these studies, validation of the prognostic tool was based on two measures of predictive accuracy: the Brier score and the prognostic cross-entropy.
In a third study, we described the PSA dynamics after a second (hormonal) treatment in patients initially treated by radiation therapy alone.
25

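La programmation DC et la méthode Cross-Entropy pour certaines classes de problèmes en finance, affectation et recherche d'informations : codes et simulations numériques / DC programming and the Cross-Entropy method for certain classes of problems in finance, assignment and information retrieval: codes and numerical simulations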

Nguyen, Duc Manh 24 February 2012
The main objective of this thesis is to develop deterministic and heuristic approaches for solving certain classes of optimization problems in finance, assignment and information retrieval. These are large-scale nonconvex optimization problems. Our approaches are based on DC programming and DCA and on the Cross-Entropy (CE) method. Using formulation and reformulation techniques, we give DC formulations of the problems considered so that they can be solved with DCA. In addition, depending on the structure of the feasible sets of the problems considered, we design appropriate families of distributions so that the Cross-Entropy method can be applied efficiently. All the proposed methods were implemented in MATLAB and C/C++ to confirm their practical aspects and to enrich our research activity.
26

Advanced Monte Carlo Methods with Applications in Finance

Joshua Chi Chun Chan Unknown Date
The main objective of this thesis is to develop novel Monte Carlo techniques with emphasis on various applications in finance and economics, particularly in the fields of risk management and asset returns modeling. New stochastic algorithms are developed for rare-event probability estimation, combinatorial optimization, parameter estimation and model selection. The contributions of this thesis are fourfold. Firstly, we study an NP-hard combinatorial optimization problem, the Winner Determination Problem (WDP) in combinatorial auctions, where buyers can bid on bundles of items rather than bidding on them sequentially. We present two randomized algorithms, namely, the cross-entropy (CE) method and the ADAptive Multilevel splitting (ADAM) algorithm, to solve two versions of the WDP. Although an efficient deterministic algorithm has been developed for one version of the WDP, it is not applicable to the other version considered. In addition, the proposed algorithms are straightforward and easy to program, and do not require specialized software. Secondly, two major applications of conditional Monte Carlo for estimating rare-event probabilities are presented: a complex bridge network reliability model and several generalizations of the widely popular normal copula model used in managing portfolio credit risk. We show how certain efficient conditional Monte Carlo estimators developed for simple settings can be extended to handle complex models involving hundreds or thousands of random variables. In particular, by utilizing an asymptotic description of how the rare event occurs, we derive algorithms that are not only easy to implement, but also compare favorably to existing estimators. Thirdly, we make a contribution on the methodological front by proposing an improvement of the standard CE method for estimation.
The improved method is relevant, as recent research has shown that in some high-dimensional settings the likelihood ratio degeneracy problem becomes severe and the importance sampling estimator obtained from the CE algorithm becomes unreliable. In contrast, the performance of the improved variant does not deteriorate as the dimension of the problem increases. Its utility is demonstrated via a high-dimensional estimation problem in risk management, namely, a recently proposed t-copula model for credit risk. We show that even in this high-dimensional model that involves hundreds of random variables, the proposed method performs remarkably well, and compares favorably to existing importance sampling estimators. Furthermore, the improved CE algorithm is then applied to estimating the marginal likelihood, a quantity that is fundamental in Bayesian model comparison and Bayesian model averaging. We present two empirical examples to demonstrate the proposed approach. The first example involves women's labor market participation and we compare three different binary response models in order to find the one that best fits the data. The second example utilizes two vector autoregressive (VAR) models to analyze the interdependence and structural stability of four U.S. macroeconomic time series: GDP growth, unemployment rate, interest rate, and inflation. Lastly, we contribute to the growing literature of asset returns modeling by proposing several novel models that explicitly take into account various recent findings in the empirical finance literature. Specifically, two classes of stylized facts are particularly important. The first set is concerned with the marginal distributions of asset returns. One prominent feature of asset returns is that the tails of their distributions are heavier than those of the normal---large returns (in absolute value) occur much more frequently than one might expect from a normally distributed random variable.
Another robust empirical feature of asset returns is skewness, where the tails of the distributions are not symmetric---losses are observed more frequently than large gains. The second set of stylized facts is concerned with the dependence structure among asset returns. Recent empirical studies have cast doubts on the adequacy of the linear dependence structure implied by the multivariate normal specification. For example, data from various asset markets, including equities, currencies and commodities markets, indicate the presence of extreme co-movement in asset returns, and this observation is again incompatible with the usual assumption that asset returns are jointly normally distributed. In light of the aforementioned empirical findings, we consider various novel models that generalize the usual normal specification. We develop efficient Markov chain Monte Carlo (MCMC) algorithms to estimate the proposed models. Moreover, since the number of plausible models is large, we perform a formal Bayesian model comparison to determine the model that best fits the data. In this way, we can directly compare the two approaches of modeling asset returns: copula models and the joint modeling of returns.
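The cross-entropy (CE) method referred to in this abstract can be sketched in its simplest optimization form: sample candidates, keep an elite subset, refit the sampling distribution, repeat. The sketch below is an assumption-laden illustration for a one-dimensional continuous objective with a Gaussian sampling family; it is not the thesis's algorithm for the Winner Determination Problem or for rare-event estimation, and all names and parameter values are hypothetical.

```python
import random
import statistics


def cross_entropy_minimize(objective, mean, std, n_samples=100, n_elite=10,
                           n_iters=30, seed=0):
    """Minimize a one-dimensional objective with a basic cross-entropy method."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    for _ in range(n_iters):
        # 1. Sample candidates from the current Gaussian sampling distribution.
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        # 2. Keep the elite samples, i.e. those with the lowest objective values.
        elite = sorted(samples, key=objective)[:n_elite]
        # 3. Refit the Gaussian to the elite set (maximum-likelihood update).
        mean = sum(elite) / len(elite)
        std = statistics.pstdev(elite) + 1e-12  # small floor avoids std == 0
    return mean


# Hypothetical objective with its minimum at x = 3.
best = cross_entropy_minimize(lambda x: (x - 3.0) ** 2, mean=0.0, std=5.0)
print(best)  # expected to be close to 3.0
```

For combinatorial problems the Gaussian would be replaced by a distribution over discrete structures, for example independent Bernoulli variables, with the same sample, select and refit loop.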
27

Multi-objective optimisation using the cross-entropy method in CO gas management at a South African ilmenite smelter

Stadler, Johan George 12 1900
Thesis (MScEng)--Stellenbosch University, 2012. / ENGLISH ABSTRACT: In a minerals processing environment, stable production processes, cost minimisation and energy efficiency are key to operational excellence, safety and profitability. At an ilmenite smelter, typically found in the heavy minerals industry, it is no different. Management of an ilmenite smelting process is a complex, multi-variable challenge with high costs and safety risks at stake. A by-product of ilmenite smelting is superheated carbon monoxide (CO) gas, or furnace off-gas. This gas is inflammable and extremely poisonous to humans. At the same time the gas is a potential energy source for various on-site heating applications. Re-using furnace off-gas can increase the energy efficiency of the energy intensive smelting process and can save on the cost of procuring other gas for heating purposes. In this research project, the management of CO gas from the Tronox KZN Sands ilmenite smelter in South Africa was studied with the aim of optimising the current utilisation of the gas. In the absence of any buffer capacity in the form of a pressure vessel, the stability of the available CO gas is directly dependent on the stability of the furnaces. The CO gas has been identified as a partial replacement for methane gas which is currently purchased for drying and heating of feed material and pre-heating of certain smelter equipment. With no buffer capacity between the furnaces and the gas consuming plants, a dynamic prioritisation approach had to be found if the CO was to replace the methane. The dynamics of this supply-demand problem, which has been termed the “CO gas problem”, needed to be studied. A discrete-event simulation model was developed to match the variable supply of CO gas to the variable demand for gas over time – the demand being a function of the availability of the plants requesting the gas, and the feed rates and types of feed material processed at those plants. 
The problem was formulated as a multi-objective optimisation problem with the two main, conflicting objectives, identified as: 1) the average production time lost per plant per day due to CO-methane switchovers; and 2) the average monthly saving on methane gas costs due to lower consumption thereof. A metaheuristic, namely multi-objective optimisation using the cross-entropy method, or MOO CEM, was applied as optimisation algorithm to solve the CO gas problem. The performance of the MOO CEM algorithm was compared with that of a recognised benchmark algorithm for multi-objective optimisation, the NSGA II, when both were applied to the CO gas problem. The background of multi-objective optimisation, metaheuristics and the usage of furnace off-gas, particularly CO gas, were investigated in the literature review. The simulation model was then developed and the optimisation algorithm applied. The research aimed to comment on the merit of the MOO CEM algorithm for solving the dynamic, stochastic CO gas problem and on the algorithm’s performance compared to the benchmark algorithm. The results served as a basis for recommendations to Tronox KZN Sands in order to implement a project to optimise usage and management of the CO gas. / AFRIKAANSE OPSOMMING: In mineraalprosessering is stabiele produksieprosesse, kostebeperking en energie-effektiwiteit sleuteldrywers tot bedryfsprestasie, veiligheid en wins. ‘n Ilmenietsmelter, tipies aangetref in swaarmineraleprosessering, is geen uitsondering nie. Die bestuur van ‘n ilmenietsmelter is ‘n komplekse, multi-doelwit uitdaging waar hoë kostes en veiligheidsrisiko’s ter sprake is. ‘n Neweproduk van die ilmenietsmeltproses is superverhitte koolstofmonoksiedgas (CO gas). Hierdie gas is ontvlambaar en uiters giftig vir die mens. Terselfdertyd kan hierdie gas benut word as energiebron vir allerlei verhittingstoepassings. 
Die herbenutting van CO gas vanaf die smelter kan die energie-effektiwiteit van die energie-intensiewe smeltproses verhoog en kan verder kostes bespaar op die aankoop van ‘n ander gas vir verhittingsdoeleindes. In hierdie navorsingsprojek is die bestuur van die CO gasstroom wat deur die ilmenietsmelter van Tronox KZN Sands in Suid-Afrika geproduseer word, ondersoek met die doel om die huidige benuttingsvlak daarvan te verbeter. Weens die afwesigheid van enige bufferkapasiteit in die vorm van ‘n drukbestande tenk, is die stabiliteit van CO gas beskikbaar vir hergebruik direk afhanklik van die stabiliteit van die twee hoogoonde wat die gas produseer. Die CO gas kan gedeeltelik metaangas, wat tans aangekoop word vir die droog en verhitting van voermateriaal en vir die voorverhitting van sekere smeltertoerusting, vervang. Met geen bufferkapasiteit tussen die hoogoonde en die aanlegte waar die gas verbruik word nie, was die ondersoek van ‘n dinamiese prioritiseringsbenadering nodig om te kon vasstel of die CO die metaangas kon vervang. Die dinamika van hierdie vraag-aanbod probleem, getiteld die “CO gasprobleem”, moes bestudeer word. ‘n Diskrete-element simulasiemodel is ontwikkel as probleemoplossingshulpmiddel om die vraag-aanbodproses te modelleer en die prioritiseringsbenadering te ondersoek. Die doel van die model was om oor tyd die veranderlike hoeveelhede van geproduseerde CO teenoor die veranderlike gasaanvraag te vergelyk. Die vlak van gasaanvraag is afhanklik van die beskikbaarheidsvlak van die aanlegte waar die gas verbruik word, sowel as die voertempo’s en tipes voermateriaal in laasgenoemde aanlegte. Die probleem is geformuleer as ‘n multi-doelwit optimeringsprobleem met twee hoof, teenstrydige doelwitte: 1) die gemiddelde verlies aan produksietyd per aanleg per dag weens oorgeskakelings tussen CO en metaangas; 2) die gemiddelde maandelikse besparing op metaangaskoste weens laer verbruik van dié gas. 
‘n Metaheuristiek, genaamd MOO CEM (multi-objective optimisation using the cross-entropy method), is ingespan as optimeringsalgoritme om die CO gasprobleem op te los. Die prestasie van die MOO CEM algoritme is vergelyk met dié van ‘n algemeen aanvaarde riglynalgoritme, die NSGA II, met beide toepas op die CO gasprobleem. Die agtergrond van multi-doelwit optimering, metaheuristieke en die benutting van hoogoond af-gas, spesifiek CO gas, is ondersoek in die literatuurstudie. Die simulasiemodel is daarna ontwikkel en die optimeringsalgoritme is toegepas.
28

Estratégias numéricas e de otimização para inferência da dinâmica de redes bioquímicas / Numerical and optimization strategies for inferring the dynamics of biochemical networks

Ladeira, Carlos Roberto Lima 28 February 2014
Estimar parâmetros de modelos dinâmicos de sistemas biológicos usando séries temporais é cada vez mais importante, pois uma quantidade imensa de dados experimentais está sendo mensurados pela biologia molecular moderna. Uma abordagem de resolução baseada em problemas inversos pode ser utilizada na solução deste tipo de problema. A escolha do modelo matemático é uma tarefa importante, pois vários modelos podem ser utilizados, apresentando níveis diversos de precisão em suas representações. A Teoria dos Sistemas Bioquímicos (TSB) faz uso de equações diferenciais ordinárias e expansões de séries de potências para representar processos bioquímicos. O Sistema S é um dos modelos usados pela TSB que permite a transformação do sistema original de equações diferenciais em um sistema algébrico desacoplado, facilitando a solução do problema inverso. Essa transformação pode comprometer a qualidade da resposta se o valor das derivadas nos pontos das séries temporais não for obtidos com precisão. Para estimar as derivadas pretende-se explorar o método do passo complexo, que apresenta vantagens em relação ao método das diferenças finitas, mais conhecido e utilizado. A partir daí pode então ser realizada a busca pelas variáveis que definirão as equações do sistema.
O método da Regressão Alternada é um dos mais rápidos para esse tipo de problema, mas a escolha inicial dos parâmetros possui influência em seu resultado, que pode até mesmo não ser encontrado. Pretende-se avaliar o método da Entropia Cruzada, que possui a vantagem de realizar buscas globais e talvez por esse motivo a escolha dos parâmetros inicias não cause tanta influência nos resultados. Além disso, será avaliado um método híbrido que fará uso das principais vantagens do método da Regressão Alternada e do Entropia Cruzada para resolver o problema. Experimentos numéricos sistematizados serão realizados tanto para a etapa de estimativa das derivadas quanto para a etapa de otimização para obtenção dos parâmetros das equações do sistema. / Estimating the parameters of dynamic models of biological systems from time series is becoming increasingly important because a huge amount of experimental data is being measured by modern molecular biology. An approach based on inverse problems can be used to solve this type of problem. The choice of the mathematical model is an important task, since many models can be used, with varying levels of accuracy in their representations. Biochemical Systems Theory (BST) makes use of ordinary differential equations and power series expansions to represent biochemical processes. The S-system is one of the models used by BST; it allows the original system of differential equations to be transformed into a decoupled system of algebraic equations, which makes the inverse problem easier to solve. This transformation can compromise the quality of the solution if the values of the derivatives at the time-series points are not obtained accurately. To estimate the derivatives, we intend to explore the complex-step method, which has advantages over the better known and more widely used finite difference method. The search for the variables that define the equations of the system can then be performed.
The Alternating Regression method is one of the fastest for this type of problem, but the initial choice of parameters influences its result, and a solution may not even be found. We intend to evaluate the Cross-Entropy method, which has the advantage of performing global searches, so that the choice of initial parameters may have less influence on the results. In addition, a hybrid method that makes use of the main advantages of Alternating Regression and Cross-Entropy will be assessed. Systematic numerical experiments will be conducted both for the derivative estimation step and for the optimization step that estimates the parameters of the equations of the system.
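The advantage of the complex-step method over finite differences, as mentioned in this abstract, can be seen in a small sketch. It assumes the function is available analytically and accepts complex arguments, which is not the same setting as estimating derivatives from measured time series; the test function and step sizes are hypothetical choices.

```python
def complex_step_derivative(f, x, h=1e-20):
    """Approximate f'(x) as Im(f(x + i*h)) / h.
    No subtraction of nearly equal numbers occurs, so h can be tiny."""
    return f(complex(x, h)).imag / h


def forward_difference(f, x, h=1e-8):
    """Classical forward finite difference, limited by cancellation error."""
    return (f(x + h) - f(x)) / h


def f(z):
    # Hypothetical test function z^3 + 2z; works for real and complex input.
    return z * z * z + 2 * z


exact = 3 * 1.5 ** 2 + 2  # analytic derivative at x = 1.5
print(abs(complex_step_derivative(f, 1.5) - exact))
print(abs(forward_difference(f, 1.5) - exact))
```

The complex-step estimate is accurate to roughly machine precision, while the finite difference error depends on a trade-off between truncation and round-off in the choice of h.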
29

Child Marriage, Human Development and Welfare : Using Public Spending, Taxation and Conditional Cash Transfers as Policy Instruments

Sayeed, Yeasmin January 2016
The theme of this thesis is to analyze the impact of policy interventions such as financing human development (HD), tax reform and conditional cash transfer programmes, under the framework of growth and sustainable development. These policy instruments are evaluated through the application of both partial and general equilibrium models, and the last paper concentrates on developing regional social accounting matrices (SAMs) as a core database for spatial general equilibrium modelling. Essay 1: Trade-offs in Achieving Human Development Goals for Bangladesh investigates the benefits and costs associated with alternative investment financing options for achieving HD goals by applying the MAMS (Maquette for Millennium Development Goals Studies) model. We find that full achievement of these goals would have led to a GDP loss that would have been significantly larger in the domestic borrowing scenario than in the tax scenario. The tax-financing alternative is thus the better option for financing large development programs. In terms of public spending composition, we find that, under some circumstances, a trade-off arises between overall Millennium Development Goal (MDG) progress and poverty reduction. Essay 2: Welfare impact of broadening VAT by exempting Small-Scale food markets: The case of Bangladesh analyses the welfare impacts of different VAT reforms. A general and uniform VAT on all commodities is preferred as it is more efficient and less administratively costly. However, due to equity concerns, food is normally exempted from VAT. On the other hand, exemptions on food mean that an implicit subsidy is provided to high-income households. Hence, we analyze a broad-based VAT regime with a high threshold that excludes small-scale operators (where the low-income households buy their products most, including food) and the simulation result shows that welfare improves for the low-income households. 
Essay 3: Effect of Girls’ Secondary School Stipend on Completed Schooling and Age at Marriage: Evidence from Bangladesh estimates the effect of a conditional cash transfer programme on education and age at marriage. We apply both difference in differences (DiD) and regression discontinuity methods to evaluate the impact of the policy instrument. Our estimation results show that the girls in the treatment group who were exposed to the programme had a higher average number of completed years of schooling and also delayed their first marriage compared to the girls in the control group. We also show that the DiD approach might produce a biased result as it does not consider the convergence effect. Essay 4: Estimation of Multiregional Social Accounting Matrices using Transport Data proposes a methodology for estimating multiregional SAMs from a national SAM by applying the cross-entropy method. The methodology makes possible the construction of regional SAMs that are consistent with official regional accounts and minimize deviations from transport data.
30

Word Classes in Language Modelling

Erikson, Emrik, Åström, Marcus January 2024
This thesis concerns itself with word classes and their application to language modelling. Considering a purely statistical Markov model trained on sequences of word classes in the Swedish language, different problems in language engineering are examined. Problems considered are part-of-speech tagging, evaluating text modifiers such as translators with the help of probability measurements and matrix norms, and lastly detecting different types of text using the Fourier transform of cross entropy sequences of word classes. The results show that the word class language model is quite weak by itself but that it is able to improve part-of-speech tagging for 1 and 2 letter models. There are indications that a stronger word class model could aid 3-letter and potentially even stronger models. For evaluating modifiers the model is often able to distinguish between shuffled and sometimes translated text as well as to assign a score as to how much a text has been modified. Future work on this should however take better care to ensure large enough test data. The results from the Fourier approach indicate that a Fourier analysis of the cross entropy sequence between word classes may allow the model to distinguish between A.I. generated text as well as translated text from human written text. Future work on machine learning word class models could be carried out to get further insights into the role of word class models in modern applications. The results could also give interesting insights in linguistic research regarding word classes.
