• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Shortcut Transformers and the Learnability of Automata

Martens, Willeke January 2023 (has links)
Transformers have emerged as a powerful architecture for various tasks in natural language processing, computer vision, and multi-modal domains. Despite their success, understanding the computational capabilities and limitations of transformers remains a challenge. This work focuses on relating transformers to deterministic finite automata (DFAs) and empirically investigates the architecture's ability to simulate DFAs of varying complexities. We empirically explore the simulation of DFAs by transformers, specifically focusing on the solvable A4-DFA and the non-solvable A5-DFA. We conduct experiments to evaluate the in-distribution and out-of-distribution accuracy of sub-linear depth transformers with positive results on both accounts. Additionally, we examine the impact of widening the transformer to find even shallower transformers for the  A4-DFA. While no significant improvements are observed compared to the sub-linear depth transformers, further exploration of hyperparameters is needed for more reliable results.

Page generated in 0.0816 seconds