Multimedia Content Analysis for Event Detection

The wide diffusion of multimedia content of different types and formats has created a need for effective methods to handle such a huge amount of information efficiently, opening interesting research challenges in the media community. In particular, the definition of suitable content understanding methodologies is attracting the effort of a large number of researchers worldwide, who have proposed various tools for automatic content organization, retrieval, search, annotation, and summarization. In this thesis, we will focus on an important concept: the inherent link between "media" and the "events" that such media depict. We will present two methodologies related to this problem, and in particular to the automatic discovery of event semantics from media content. The two methodologies address this general problem at two different levels of abstraction. In the first approach we will be concerned with the detection of the activities and behaviors of people in a video sequence (i.e., what a person is doing and how), while in the second we will face the more general problem of understanding a class of events from a set of visual media (i.e., the situation and context). Both problems will be addressed while avoiding strong a priori assumptions, i.e., taking into account the largely unstructured and variable nature of events.
As for the first methodology, we will discuss events related to the behavior of a person living in a home environment. The automatic understanding of human activity is still an open problem in the scientific community, although several solutions have been proposed so far, and it may provide important breakthroughs in many application domains such as context-aware computing, area monitoring and surveillance, assistive technologies for the elderly or disabled, and more. An innovative approach is presented in this thesis, providing (i) a compact representation of human activities, and (ii) an effective tool to reliably measure the similarity between activity instances. In particular, the activity pattern is modeled with a signature obtained through a symbolic abstraction of its spatio-temporal trace, allowing the application of high-level reasoning through context-free grammars for activity classification.
As far as the second methodology is concerned, we will address the problem of identifying an event from a single image. If event discovery from media is already a complex problem, detection from a single still picture is still considered out of reach for current methodologies, as demonstrated by recent results of international benchmarks in the field. In this work we will focus on a solution that may open new perspectives in this area by providing better knowledge of the link between visual perception and event semantics. In fact, what we propose is a framework that identifies the image details that allow human beings to identify an event from a single image that depicts it. These details are called "event saliency" and are detected by exploiting the power of human computation through a gamification procedure. The resulting event saliency is a map of event-related image areas containing sufficient evidence of the underlying event, which could be used to learn the visual essence of the event itself and so enable improved automatic discovery techniques.
Both methodologies will be demonstrated through extensive tests using publicly available datasets, as well as additional data created ad hoc for the specific problems under analysis.

Identifier: oai:union.ndltd.org:unitn.it/oai:iris.unitn.it:11572/368623
Date: January 2015
Creators: Rosani, Andrea
Contributors: Rosani, Andrea; De Natale, Francesco
Publisher: Università degli studi di Trento, place:TRENTO
Source Sets: Università di Trento
Language: English
Detected Language: English
Type: info:eu-repo/semantics/doctoralThesis
Rights: info:eu-repo/semantics/openAccess
Relation: firstpage:1, lastpage:90, numberofpages:90