Global ETD Search

1	Evaluating and Fine-Tuning a Few-Shot Model for Transcription of Historical Ciphers Eliasson, Ingrid January 2023 (has links) Thousands of historical ciphers, encrypted manuscripts, are stored in archives across Europe. Historical cryptology is the research field concerned with studying these manuscripts - combining the interest of humanistic fields with methods of cryptography and computational linguistics. Before a cipher can be decrypted by automatic means, it must first be transcribed into machine-readable digital text. Image processing techniques and Deep Learning have enabled transcription of handwritten text to be performed automatically, but the task faces challenges when ciphers constitute the target data. The main reason is a lack of labeled data, caused by the heterogeneity of handwriting and the tendency of ciphers to employ unique symbol sets. Few-Shot Learning is a machine learning framework which reduces the need for labeled data, using pretrained models in combination with support sets containing a few labeled examples from the target data set. This project is concerned with evaluating a Few-Shot model on the task of transcription of historical ciphers. The model is tested on pages from three in-domain ciphers which vary in handwriting style and symbol sets. The project also investigates the use of further fine-tuning the model by training it on a limited amount of labeled symbol examples from the respective target ciphers. We find that the performance of the model is dependant on the handwriting style of the target document, and that certain model parameters should be explored individually for each data set. We further show that fine-tuning the model is indeed efficient, lowering the Symbol Error Rate (SER) at best 27.6 percentage points. historical ciphers few-shot learning automatic transcription
2	The Influence of Language Models on Decryption of German Historical Ciphers Sikora, Justyna January 2022 (has links) This thesis assesses the influence of language models on decryption of historical German ciphers. Previous research on language identification and cleartext detection indicates that it is beneficial to use historical language models (LM) while dealing with historical ciphers as they can outperform models trained on present-day data. To date, no systematic investigation has considered the impact of choosing different LMs for the decryption of ciphers. Therefore, we conducted a series of experiments with the aim of exploring this assumption. Using historical data from the HistCorp collection and Project Gutenberg, we have created 3-gram, 4-gram and 5-gram models, as well as constructed substitution ciphers for testing of the models. The results show that in most cases language models trained on historical data perform better than the larger modern models, while the most consistent results for the tested ciphers gave the 4-gram models. historical cryptology historical language models historical ciphers

Search results

Evaluating and Fine-Tuning a Few-Shot Model for Transcription of Historical Ciphers

The Influence of Language Models on Decryption of German Historical Ciphers