Procedural Pre-Training for Visual Recognition

Anderson, Connor S. (18 June 2024)
Deep learning models can perform many tasks very capably, provided they are trained correctly. Usually, this requires a large amount of data. Pre-training refers to the process of creating a strong initial model by first training it on a large-scale dataset. Such a model can then be adapted to many different tasks while requiring only a comparatively small amount of task-specific training data. Pre-training is the standard approach in most computer vision scenarios, but it is not without drawbacks. Aside from the cost and effort involved in collecting large pre-training datasets, such data may also contain unwanted biases, violations of privacy, inappropriate content, or copyrighted material used without permission. These issues can raise concerns about the ethical use of models trained on the data. This dissertation presents a different approach to pre-training visual models: using abstract, procedurally generated data. Such data is free from the concerns around human bias, privacy, and intellectual property. It also has the potential to scale more easily and to provide precisely controllable sources of supervision that are difficult or impossible to extract from data collected in the wild from sources like the internet. The obvious disadvantage of such data is that it does not model real-world semantics, and thus introduces a large domain gap. Surprisingly, however, such pre-training can lead to performance not far below that of models trained in the conventional way. This is shown for different visual recognition tasks, models, and procedural data-generation processes.
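To make the procedural pre-training workflow concrete, the sketch below shows the general idea under simple assumptions: a toy generator that produces abstract images whose labels come directly from the generation parameters (so no human annotation is needed), a small CNN pre-trained on that data, and a swapped classification head ready for fine-tuning on a downstream task. The generator, architecture, and hyperparameters here are illustrative placeholders, not the specific procedural generation processes or models studied in the dissertation.

```python
# Minimal sketch of procedural pre-training followed by task adaptation.
# All names (procedural_batch, SmallCNN, pretrain) are hypothetical; the
# dissertation's actual generators and architectures are not reproduced here.
import torch
import torch.nn as nn
import torch.nn.functional as F


def procedural_batch(batch_size=32, image_size=64, num_classes=10):
    """Generate abstract sinusoidal-pattern images; the label is simply
    which frequency was used, so supervision is fully controllable."""
    labels = torch.randint(0, num_classes, (batch_size,))
    xs = torch.linspace(0, 1, image_size)
    grid_y, grid_x = torch.meshgrid(xs, xs, indexing="ij")
    freqs = (labels.float() + 1).view(-1, 1, 1) * 3.14159
    phases = torch.rand(batch_size, 1, 1) * 6.28318
    pattern = torch.sin(freqs * grid_x + phases) * torch.cos(freqs * grid_y)
    images = pattern.unsqueeze(1).repeat(1, 3, 1, 1)  # pseudo 3-channel images
    return images, labels


class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, num_classes)

    def forward(self, x):
        return self.head(self.backbone(x))


def pretrain(model, steps=100):
    """Supervised pre-training on procedurally generated batches."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        images, labels = procedural_batch()
        loss = F.cross_entropy(model(images), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()


if __name__ == "__main__":
    model = SmallCNN(num_classes=10)
    pretrain(model)                  # pre-train on abstract procedural data
    model.head = nn.Linear(64, 5)    # replace head for a downstream task
    # ...then fine-tune on a comparatively small task-specific dataset.
```

The key point the sketch illustrates is that the labels are a by-product of the generation process itself, which is what lets this kind of pre-training scale without collected or annotated real-world data.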
