
Task Distillation: Transforming Reinforcement Learning into Supervised Learning

Recent work in dataset distillation focuses on distilling supervised classification datasets into smaller, synthetic supervised datasets in order to reduce per-model training costs, provide interpretability, and anonymize data. Distillation and its benefits can be extended to a wider array of tasks. We propose a generalization of dataset distillation, which we call task distillation. Using techniques similar to those used in dataset distillation, any learning task can be distilled into a compressed synthetic task. Task distillation allows for transmodal distillations, in which a task of one modality is distilled into a synthetic task of another modality, so that a more complex learning task, such as a reinforcement learning environment, can be reduced to a simpler learning task, such as supervised classification. To advance task distillation beyond supervised-to-supervised distillation, we explore distilling reinforcement learning environments into supervised learning datasets. We propose a new distillation algorithm that allows PPO to be used to distill a reinforcement learning environment. We demonstrate k-shot learning on distilled cart-pole to show the effectiveness of our distillation algorithm and to explore distillation generalization. We distill multi-dimensional cart-pole environments to their minimum-sized distillations and show that this matches the theoretical minimum number of data instances required to teach each task. We show how a distilled task can serve as an interpretability artifact, as it compactly represents everything needed to learn the task. We demonstrate the feasibility of distillation in more complex Atari environments by fully distilling Centipede, showing that distillation is cheaper than training directly on Centipede once more than 9 models are trained. Finally, we provide a method to "partially" distill more complex environments, demonstrate it on Ms. Pac-Man, Pong, and Space Invaders, and show how it scales distillation difficulty relative to the full distillation of Centipede.
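To make the idea of distilling a reinforcement learning environment into a supervised dataset concrete, the following is a minimal illustrative sketch, not the thesis's actual algorithm: it shows only the evaluation side of such a distillation, where a small, hand-specified synthetic state-to-action dataset (an assumption for illustration; a real distillation would learn these instances, e.g., with a PPO-based outer loop) is used to train a fresh policy in a few supervised steps, which is then scored on the real CartPole-v1 task.

```python
# Illustrative sketch only -- not the method described in the thesis.
# Assumes gymnasium and torch are installed; the distilled instances are hand-picked here.

import gymnasium as gym
import torch
import torch.nn as nn

# Hypothetical "distilled" dataset: a few synthetic states and the actions they teach.
distilled_states = torch.tensor([
    [0.0,  0.0,  0.10,  0.5],   # pole tilting right -> push right
    [0.0,  0.0, -0.10, -0.5],   # pole tilting left  -> push left
    [0.0,  0.5,  0.05,  0.0],   # cart drifting right, slight tilt -> push right
    [0.0, -0.5, -0.05,  0.0],   # cart drifting left, slight tilt  -> push left
])
distilled_actions = torch.tensor([1, 0, 1, 0])

def train_on_distilled(steps=200, lr=0.1):
    """Few-shot supervised training of a fresh linear policy on the distilled data."""
    policy = nn.Linear(4, 2)
    opt = torch.optim.SGD(policy.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(policy(distilled_states), distilled_actions)
        loss.backward()
        opt.step()
    return policy

def evaluate(policy, episodes=10):
    """Average return of the distilled-trained policy on the real environment."""
    env = gym.make("CartPole-v1")
    total = 0.0
    for _ in range(episodes):
        obs, _ = env.reset()
        done = False
        while not done:
            action = policy(torch.tensor(obs).float()).argmax().item()
            obs, reward, terminated, truncated, _ = env.step(action)
            total += reward
            done = terminated or truncated
    env.close()
    return total / episodes

if __name__ == "__main__":
    policy = train_on_distilled()
    print("mean return on CartPole-v1:", evaluate(policy))
```

In an actual distillation, the synthetic states and labels above would themselves be optimized so that policies trained on them perform well in the environment; this sketch only illustrates why a compact supervised dataset can stand in for the original task.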

Identifier oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-11160
Date 12 October 2023
Creators Wilhelm, Connor
Publisher BYU ScholarsArchive
Source Sets Brigham Young University
Detected Language English
Type text
Format application/pdf
Source Theses and Dissertations
Rights https://lib.byu.edu/about/copyright/
