1. A multi-objective programming perspective to statistical learning problems. Yaman, Sibel. 17 November 2008.
It has been increasingly recognized that realistic problems often involve a tradeoff among many conflicting objectives. Traditional methods aim to satisfy multiple objectives by combining them into a global cost function, which in most cases obscures the underlying tradeoffs between the conflicting objectives. This raises the issue of how the different objectives should be combined to yield a final solution. Moreover, such approaches guarantee only that the chosen overall objective function is optimized over the training samples; there is no guarantee on performance in terms of the individual objectives, since they are never considered on an individual basis.
Motivated by these shortcomings of traditional methods, the objective of this dissertation is to investigate theory, algorithms, and applications for problems with competing objectives, and to understand the behavior of the proposed algorithms in light of some applications. We develop a multi-objective programming (MOP) framework for finding compromise solutions that are satisfactory for each of multiple competing performance criteria. The fundamental idea of our formulation, which we refer to as iterative constrained optimization (ICO), revolves around improving one objective while allowing the rest to degrade. This is achieved by optimizing individual objectives with proper constraints on the remaining competing objectives. The constraint bounds are adjusted based on the objective values obtained in the most recent iteration. An aggregated utility function is used to evaluate the acceptability of local changes in the competing criteria, i.e., changes from one iteration to the next.
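A minimal sketch of such an ICO loop, assuming a cyclic choice of which objective to improve, a fixed degradation slack on the rest, and an off-the-shelf constrained solver (all of these, and the names `ico_step`/`ico`, are assumptions for illustration; the dissertation's actual formulation may differ):

```python
import numpy as np
from scipy.optimize import minimize

def ico_step(objectives, x0, recent_vals, focus, slack=0.05):
    """One ICO iteration: minimize objectives[focus] while constraining each
    remaining objective to stay within a slack of its most recent value,
    i.e. the rest may degrade, but only by a bounded amount."""
    cons = [
        {"type": "ineq",
         "fun": lambda x, j=j: recent_vals[j] + slack - objectives[j](x)}
        for j in range(len(objectives)) if j != focus
    ]
    return minimize(objectives[focus], x0, method="SLSQP", constraints=cons).x

def ico(objectives, x, utility, n_iters=50):
    """Cycle through the objectives, re-bounding the constraints from the most
    recent iterate; accept a step only if the aggregated utility improves."""
    vals = [f(x) for f in objectives]
    for it in range(n_iters):
        x_new = ico_step(objectives, x, vals, focus=it % len(objectives))
        new_vals = [f(x_new) for f in objectives]
        if utility(new_vals) < utility(vals):  # lower aggregated cost is better
            x, vals = x_new, new_vals
    return x, vals
```

Here `utility` stands in for the aggregated utility function mentioned above; it could be, for instance, a weighted sum or the maximum of the individual objective values.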
Conflicting objectives arise in different contexts in many problems of speech and
language technologies. In this dissertation, we consider two applications. The first
application is language model (LM) adaptation, where a general LM is adapted to a
specific application domain so that the adapted LM is as close as possible to both the
general model and the application domain data. Language modeling and adaptation are used in many speech and language processing applications, such as speech recognition, machine translation, part-of-speech tagging, parsing, and information retrieval.
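As a toy illustration of the two competing criteria, assuming a simple unigram LM over a fixed vocabulary (the dissertation treats richer models, and the function names here are hypothetical), the two objectives below could be handed to an MOP/ICO solver as separate criteria rather than merged into a single weighted cost:

```python
import numpy as np

def kl_to_general(p_adapted, p_general, eps=1e-12):
    """Objective 1 -- closeness to the general LM: KL(p_adapted || p_general)."""
    p, q = p_adapted + eps, p_general + eps
    return float(np.sum(p * np.log(p / q)))

def domain_cross_entropy(p_adapted, domain_counts, eps=1e-12):
    """Objective 2 -- fit to in-domain data: per-token cross-entropy of the
    domain counts under the adapted distribution."""
    n = domain_counts.sum()
    return float(-np.dot(domain_counts, np.log(p_adapted + eps)) / n)
```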
The second application is automatic language identification (LID), where the standard detection performance measures, the false-rejection (miss) and false-acceptance (false-alarm) rates for a number of languages, are to be simultaneously minimized. LID systems might serve as a pre-processing stage for understanding systems and for human listeners, and find applications in, for example, a hotel lobby or an international airport, where one might speak to a multilingual voice-controlled travel information retrieval system.
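A rough sketch of the per-language error rates in question, assuming score-based detectors with per-language thresholds (the names and setup are illustrative, not the dissertation's); with K languages this yields 2K competing objectives:

```python
import numpy as np

def miss_fa_rates(scores, labels, lang, threshold):
    """Per-language detection errors: miss (false-rejection) rate on target
    trials, false-alarm (false-acceptance) rate on non-target trials.
    Assumes both target and non-target trials are present for `lang`."""
    target = labels == lang
    miss = float(np.mean(scores[target] < threshold))
    fa = float(np.mean(scores[~target] >= threshold))
    return miss, fa
```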
This dissertation is expected to provide new insights and techniques for achieving significant performance improvements over existing approaches in terms of the individual competing objectives. At the same time, the designer gains better control over what is achieved in terms of the individual objectives. Although many MOP approaches developed so far are formal and extensible to a large number of competing objectives, their capabilities are typically examined with only two or three objectives, mainly because practical problems become significantly harder to manage as the number of objectives grows. We, however, illustrate the proposed framework with a larger number of objectives.

2. Recurrent Neural Networks with Elastic Time Context in Language Modeling. Beneš, Karel. January 2016.
This report describes experimental work on statistical language modeling with recurrent neural networks (RNNs). A thorough survey of previously published work is presented, followed by a description of the algorithms for training the respective models. Most of the described techniques were implemented in a custom tool built on the Theano library. An extensive set of experiments was carried out with the Simple Recurrent Network (SRN) model, revealing some of its previously unpublished properties. In static evaluation, the results achieved were roughly 2.7% relative worse than the best published results; with dynamic evaluation, however, a relative improvement of 1% was achieved. Experiments were also conducted with the Structurally Constrained Recurrent Network model, but it could not be trained to the expected performance. Finally, an extension of the SRN was proposed, named the Randomly Sparse Recurrent Neural Network (RS-RNN). Experiments confirmed that the RS-RNN fits its training corpus better, and that combining several RS-RNN models yields a 30% larger improvement than combining the same number of SRNs.
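A rough NumPy sketch of the Simple Recurrent Network forward step referred to above (the thesis's own tool was built on Theano; the names and shapes here are assumptions for illustration):

```python
import numpy as np

def srn_step(x_t, h_prev, W_in, W_rec, W_out):
    """One SRN time step: update the hidden state from the current input and
    the previous state, then emit a next-word probability distribution."""
    h = np.tanh(W_in @ x_t + W_rec @ h_prev)  # recurrent hidden update
    logits = W_out @ h
    e = np.exp(logits - logits.max())          # numerically stable softmax
    return h, e / e.sum()
```

In static evaluation the trained weights stay fixed over the test data; dynamic evaluation additionally keeps updating them as the test text is processed, which is the setting in which the 1% relative gain above was obtained.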