Language modeling with structured penalties

Modeling natural language is among the fundamental challenges of artificial intelligence and the design of interactive machines, with applications spanning dialogue systems, text generation, and machine translation. We propose a discriminatively trained log-linear model to learn the distribution of words following a given context. Due to data sparsity, the model must be appropriately regularized with a penalty term. We design a penalty that encodes the structure of the feature space, avoiding overfitting and improving generalization while capturing long-range dependencies. Properties of specific structured penalties can further be exploited to reduce the number of parameters required to encode the model. The outcome is an efficient model that captures long-range dependencies in language without a significant increase in time or space requirements.

In a log-linear model, both training and testing become increasingly expensive as the number of classes grows. The number of classes in a language model is the size of the vocabulary, which is typically very large. A common trick is to cluster the classes and apply the model in two steps: the first step picks the most probable cluster, and the second picks the most probable word from the chosen cluster. This idea generalizes to a hierarchy of greater depth with multiple levels of clustering. However, the performance of the resulting hierarchical classifier depends on how well the clustering suits the problem. We study different strategies for building the hierarchy of categories from their observations.
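As a rough sketch of what such a model computes (the thesis's actual feature set and vocabulary are not given here, so the toy vocabulary and suffix-indicator features below are assumptions), a log-linear next-word model scores each candidate word by summing the weights of the features active in the context and normalizing with a softmax:

```python
import numpy as np

# Minimal sketch of a log-linear next-word model. The vocabulary and
# the suffix (n-gram) indicator features are illustrative assumptions,
# not the thesis's construction.

vocab = ["the", "cat", "sat", "on", "mat"]
V = len(vocab)

def suffix_features(context, max_order=3):
    """Indicator features for the last 1..max_order words of the context."""
    feats = []
    for n in range(1, max_order + 1):
        if len(context) >= n:
            feats.append(("suffix", tuple(context[-n:])))
    return feats

feat_index = {}  # feature -> integer index, built on the fly
def fidx(f):
    return feat_index.setdefault(f, len(feat_index))

def scores(W, context):
    """Unnormalized log-probability of each word: sum of active feature weights."""
    active = [fidx(f) for f in suffix_features(context)]
    return np.array([sum(W.get((j, w), 0.0) for j in active) for w in range(V)])

def next_word_distribution(W, context):
    s = scores(W, context)
    e = np.exp(s - s.max())   # softmax with the usual max-shift for stability
    return e / e.sum()

W = {}  # sparse weights, (feature_index, word_index) -> value
p = next_word_distribution(W, ["the", "cat"])
print(dict(zip(vocab, p.round(3))))  # uniform before any training
```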
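The abstract does not specify the penalty itself. One common way to encode feature-space structure is a group penalty applied through a proximal-gradient update, sketched below; the grouping of parameters (e.g., by suffix-trie subtree) is an assumption. Whole groups whose joint norm falls below the threshold are zeroed, which is the mechanism by which structured sparsity reduces the number of parameters to store:

```python
import numpy as np

# Hedged sketch of one proximal-gradient step with a group-l2 penalty.
# The groups stand in for structure over the feature space; the thesis's
# exact penalty and grouping are not given in this abstract.

def prox_group_l2(w, groups, lam):
    """Block soft-thresholding: each group is shrunk jointly; groups whose
    norm is at most lam are set to zero outright."""
    out = w.copy()
    for g in groups:
        norm = np.linalg.norm(w[g])
        out[g] = 0.0 if norm <= lam else (1 - lam / norm) * w[g]
    return out

def proximal_step(w, grad, step, groups, lam):
    """Gradient step on the data term, then proximal step on the penalty."""
    return prox_group_l2(w - step * grad, groups, step * lam)

rng = np.random.default_rng(0)
w = rng.normal(size=6)
groups = [np.array([0, 1, 2]), np.array([3, 4, 5])]  # e.g., two trie subtrees
w_new = proximal_step(w, grad=np.zeros_like(w), step=0.1, groups=groups, lam=5.0)
print(w_new)  # small-norm groups are zeroed; the rest are shrunk jointly
```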
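The two-step trick itself can be sketched as follows. The random cluster assignment and weight matrices are placeholders, since choosing the hierarchy well is precisely the question studied; the point is the cost, one K-way softmax plus one softmax over a single cluster's members instead of a full V-way softmax:

```python
import numpy as np

# Sketch of two-step prediction:
#   p(w | h) = p(cluster(w) | h) * p(w | cluster(w), h)
# Cluster assignments and weights below are arbitrary placeholders.

rng = np.random.default_rng(1)
V, K, D = 10000, 100, 50                      # vocab size, clusters, context dim

cluster_of = rng.integers(K, size=V)          # word -> cluster (toy assignment)
members = [np.flatnonzero(cluster_of == k) for k in range(K)]

Wc = rng.normal(size=(K, D)) * 0.01           # cluster-level weights
Ww = rng.normal(size=(V, D)) * 0.01           # word-level weights

def softmax(s):
    e = np.exp(s - s.max())
    return e / e.sum()

def two_step_prob(h, w):
    """p(w | h) via one K-way softmax and one softmax within w's cluster."""
    k = cluster_of[w]
    p_cluster = softmax(Wc @ h)[k]
    idx = members[k]                          # sorted member word ids
    p_word = softmax(Ww[idx] @ h)[np.searchsorted(idx, w)]
    return p_cluster * p_word

h = rng.normal(size=D)                        # a context representation
print(two_step_prob(h, w=42))
```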

Identifier: oai:union.ndltd.org:CCSD/oai:tel.archives-ouvertes.fr:tel-01001634
Date: 11 February 2014
Creators: Nelakanti, Anil Kumar
Publisher: Université Pierre et Marie Curie - Paris VI
Source Sets: CCSD theses-EN-ligne, France
Language: English
Detected Language: English
Type: PhD thesis
