1 |
Modeling and recognizing interactions between people, objects and scenes / Modélisation et reconnaissance des actions humaines dans les images
Delaitre, Vincent 07 April 2015
In this thesis, we model interactions between people, objects and scenes, and show that combining these three cues improves both action classification and scene understanding.
In the first part, we exploit scene and object context to improve action classification in still images. We explore alternative bag-of-features models and propose a method that takes advantage of the scene context. We then propose a new model that exploits object context for action classification, based on pairs of body-part and object detectors. We evaluate our methods on our newly collected still-image dataset as well as on three other action classification datasets, and show performance close to the state of the art.
In the second part, we address the reverse problem and use the contextual information provided by people to help object localization and scene understanding. We collect a new dataset of time-lapse videos of people interacting with indoor scenes. We develop an approach that describes image regions by the distribution of poses of the people co-located with them, and use this pose-based representation to improve object localization. We further demonstrate that people cues can improve several steps of existing pipelines for indoor scene understanding. Finally, we extend the annotation of our time-lapse dataset to 3D and show how to infer object labels for occupied 3D volumes of a scene.
To summarize, the contributions of this thesis are the following: (i) we design action classification models for still images that take advantage of the scene and object context, and we gather a new dataset to evaluate their performance; (ii) we develop a new model that improves object localization by observing people interacting with an indoor scene, and we test it on a new dataset centered on person, object and scene interactions; (iii) we propose the first method to evaluate the volumes occupied by different object classes in a room, which allows us to analyze the current 3D scene understanding pipeline and identify its main sources of error.
|
2 |
Equivalência de estímulos e generalização recombinativa no seguimento de instruções com pseudofrases (verbo-objeto) / Stimulus equivalence and recombinative generalization of instruction-following with pseudo-phrases (action-object)
Postalli, Lidia Maria Marson 08 July 2011
Previous issue date: 2011-07-08 / Universidade Federal de Minas Gerais
An important issue in the field of verbal behavior is how a person comes to understand and behave according to verbal commands or instructions. The stimulus equivalence paradigm, as a model of symbolic behavior, may explain the origins of the comprehension of instructions. Following new instructions, in turn, can result from the recombination of subunits of previously learned instructions. This work reports three studies that investigated questions related to instructional control.
In the first two studies, the general objective was to establish pseudo-phrases (action-object) as members of equivalence classes with actions, objects and abstract pictures, and to verify whether, when employed with an instructional function, the pseudo-phrases and the abstract pictures would control the participants' responding. Additionally, the studies asked whether participants would follow new (recombined) instructions. The third study investigated whether overlapping elements of pseudo-phrases during the teaching phase would favor generalized instruction-following.
In Studies 1 and 2, twelve of the thirteen participants learned the auditory-visual conditional discriminations between spoken pseudo-phrases and videotaped actions, and between the same sentences and abstract pictures. Probes for class formation showed that the same twelve children comprehended the sentences, relating, through equivalence, the pseudo-phrases, the actions and the abstract pictures. Similar performances were observed with the pictures, suggesting that they had been comprehended and could work as substitutes for (or equivalents of) oral instructions. However, none of the children followed new (recombined) instructions, although all of them responded under partial control of what had previously been taught (the object or the action).
In Study 3, four participants learned auditory-visual conditional discriminations (Condition 1) between spoken pseudo-phrases and videotapes (each showing an action-object pair) and followed oral instructions in the tests of instructional control, but only one participant followed recombined sentences. Four other participants learned to follow the experimental instructions by performing the action on the object in the simultaneous presence of the auditory stimulus and the corresponding videotape (Condition 2), but did not show recombinative generalization. Seven of the eight participants followed new instructions in the pre-tests of new training matrices in which elements of the previously learned sentences overlapped (that is, their responding was under the control of elements of the compound).
As a whole, the results constitute a systematic replication of previous findings indicating that class formation can promote the comprehension of sentences and facilitate instruction-following when the sentences are used with an instructional function. Regarding the development of stimulus control by subunits of these complex stimuli, the evidence was weak; when recombination did occur, however, it was clearly related to systematic training with overlapping elements in different sentences, which suggests that this procedure is an effective teaching condition.
|