Spelling suggestions: "subject:"fouille dde données none supervisé"" "subject:"fouille dde données noun supervisé""
1 |
Inférence de la grammaire structurelle d’une émission TV récurrente à partir du contenu / Content-based inference of structural grammar for recurrent TV programs from a collection of episodesQu, Bingqing 03 December 2015 (has links)
Dans cette thèse, on aborde le problème de structuration des programmes télévisés de manière non supervisée à partir du point de vue de l'inférence grammaticale, focalisant sur la découverte de la structure des programmes récurrents à partir une collection homogène. On vise à découvrir les éléments structuraux qui sont pertinents à la structure du programme, et à l’inférence grammaticale de la structure des programmes. Des expérimentations montrent que l'inférence grammaticale permet de utiliser minimum des connaissances de domaine a priori pour atteindre la découverte de la structure des programmes. / TV program structuring raises as a major theme in last decade for the task of high quality indexing. In this thesis, we address the problem of unsupervised TV program structuring from the point of view of grammatical inference, i.e., discovering a common structural model shared by a collection of episodes of a recurrent program. Using grammatical inference makes it possible to rely on only minimal domain knowledge. In particular, we assume no prior knowledge on the structural elements that might be present in a recurrent program and very limited knowledge on the program type, e.g., to name structural elements, apart from the recurrence. With this assumption, we propose an unsupervised framework operating in two stages. The first stage aims at determining the structural elements that are relevant to the structure of a program. We address this issue making use of the property of element repetitiveness in recurrent programs, leveraging temporal density analysis to filter out irrelevant events and determine valid elements. Having discovered structural elements, the second stage is to infer a grammar of the program. We explore two inference techniques based either on multiple sequence alignment or on uniform resampling. A model of the structure is derived from the grammars and used to predict the structure of new episodes. Evaluations are performed on a selection of four different types of recurrent programs. Focusing on structural element determination, we analyze the effect on the number of determined structural elements, fixing the threshold applied on the density function as well as the size of collection of episodes. For structural grammar inference, we discuss the quality of the grammars obtained and show that they accurately reflect the structure of the program. We also demonstrate that the models obtained by grammatical inference can accurately predict the structure of unseen episodes, conducting a quantitative and comparative evaluation of the two methods by segmenting the new episodes into their structural components. Finally, considering the limitations of our work, we discuss a number of open issues in structure discovery and propose three new research directions to address in future work.
|
Page generated in 0.0977 seconds