Learning to Map the Visual and Auditory World

The appearance of the world varies dramatically not only from place to place but also from hour to hour and month to month. Billions of images that capture this complex relationship are uploaded to social-media websites every day and are often associated with precise time and location metadata. This rich source of data can improve our understanding of the globe. In this work, we propose a general framework that uses these publicly available images to construct dense maps of ground-level attributes from overhead imagery. In particular, we use well-defined probabilistic models and a weakly supervised, multi-task training strategy to estimate the expected visual and auditory ground-level attributes: the types of scenes, objects, and sounds a person can experience at a location. Through a large-scale evaluation on real data, we show that our learned models can be used for applications including mapping, image localization, image retrieval, and metadata verification.

Artificial Intelligence and Robotics

Computer Sciences

Databases and Information Systems

Software Engineering

Theory and Algorithms

Identifier	oai:union.ndltd.org:uky.edu/oai:uknowledge.uky.edu:cs_etds-1093
Date	01 January 2019
Creators	Salem, Tawfiq
Publisher	UKnowledge
Source Sets	University of Kentucky
Detected Language	English
Type	text
Format	application/pdf
Source	Theses and Dissertations--Computer Science
