Return to search

A Markovian approach to distributional semantics

This thesis, which is organized in two independent parts, presents work on distributional semantics and on variable selection. In the first part, we introduce a new method for learning good word representations using large quantities of unlabeled sentences. The method is based on a probabilistic model of sentence, using a hidden Markov model and a syntactic dependency tree. The latent variables, which correspond to the nodes of the dependency tree, aim at capturing the meanings of the words. We develop an efficient algorithm to perform inference and learning in those models, based on online EM and approximate message passing. We then evaluate our models on intrinsic tasks such as predicting human similarity judgements or word categorization, and on two extrinsic tasks: named entity recognition and supersense tagging. In the second part, we introduce, in the context of linear models, a new penalty function to perform variable selection in the case of highly correlated predictors. This penalty, called the trace Lasso, uses the trace norm of the selected predictors, which is a convex surrogate of their rank, as the criterion of model complexity. The trace Lasso interpolates between the $\ell_1$-norm and $\ell_2$-norm. In particular, it is equal to the $\ell_1$-norm if all predictors are orthogonal and to the $\ell_2$-norm if all predictors are equal. We propose two algorithms to compute the solution of least-squares regression regularized by the trace Lasso, and perform experiments on synthetic datasets to illustrate the behavior of the trace Lasso.

Identiferoai:union.ndltd.org:CCSD/oai:tel.archives-ouvertes.fr:tel-00940575
Date20 January 2014
CreatorsGrave, Edouard
PublisherUniversité Pierre et Marie Curie - Paris VI
Source SetsCCSD theses-EN-ligne, France
LanguageEnglish
Detected LanguageEnglish
TypePhD thesis

Page generated in 0.0075 seconds