1

Cleartext detection and language identification in ciphers

Gambardella, Maria-Elena January 2021 (has links)
In historical cryptology, cleartext represents text written in a known language in a cipher (a hand-written manuscript aiming at hiding the content of a message). Cleartext can give us a historical interpretation and contextualisation of the manuscript and could help researchers in cryptanalysis, but to this day there is still no research on how to automatically detect cleartext and identify its language. In this paper, we investigate to what extent we can automatically distinguish cleartext from ciphertext in transcribed historical ciphers and to what extent we are able to identify its language. We took a rule-based approach and ran 7 different models using historical language models on ciphertexts provided by the DECRYPT-Project. Our results show that using unigrams and bigrams on a word level combined with 3-grams, 4-grams and 5-grams on a character level is the best approach to tackle cleartext detection.
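The abstract describes a character-level n-gram approach but gives no code. The following is a minimal sketch, not the author's implementation, of how scoring a segment against a character n-gram model trained on a (here toy) corpus of the target language could separate cleartext-like text from ciphertext-like character sequences. All function names, the add-alpha smoothing, and the example strings are illustrative assumptions.

```python
import math
from collections import Counter

def char_ngrams(text, n):
    """Overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def train_char_lm(corpus, n):
    """Count character n-grams in a (here toy) training corpus."""
    counts = Counter(char_ngrams(corpus, n))
    return counts, sum(counts.values())

def avg_logprob(segment, counts, total, n, alpha=0.1):
    """Average add-alpha-smoothed log-probability of a segment's n-grams.
    Segments resembling the training language (cleartext) score higher than
    ciphertext-like character sequences."""
    grams = char_ngrams(segment, n)
    if not grams:
        return float("-inf")
    vocab = len(counts) + 1  # crude open-vocabulary estimate
    return sum(
        math.log((counts[g] + alpha) / (total + alpha * vocab)) for g in grams
    ) / len(grams)

# Hypothetical usage: the training sentence and both test segments are invented.
if __name__ == "__main__":
    training_text = "cleartext represents text written in a known language in a cipher"
    counts, total = train_char_lm(training_text, 3)
    for seg in ["text written in a known language", "xqzv krt plmw zzq vxko"]:
        print(f"{seg!r}: {avg_logprob(seg, counts, total, 3):.2f}")
    # The in-language segment scores noticeably higher (less negative) than the
    # ciphertext-like one; a threshold, or a comparison across per-language models,
    # would turn these scores into a cleartext/ciphertext or language decision.
```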
2

The Influence of Language Models on Decryption of German Historical Ciphers

Sikora, Justyna January 2022 (has links)
This thesis assesses the influence of language models on the decryption of historical German ciphers. Previous research on language identification and cleartext detection indicates that it is beneficial to use historical language models (LMs) when dealing with historical ciphers, as they can outperform models trained on present-day data. To date, no systematic investigation has considered the impact of choosing different LMs for the decryption of ciphers. Therefore, we conducted a series of experiments with the aim of exploring this assumption. Using historical data from the HistCorp collection and Project Gutenberg, we created 3-gram, 4-gram and 5-gram models and constructed substitution ciphers for testing them. The results show that in most cases language models trained on historical data perform better than the larger modern models, while the 4-gram models gave the most consistent results for the tested ciphers.
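As an illustration of the test setup described above (again a hedged sketch, not the thesis code), the snippet below constructs a simple monoalphabetic substitution cipher of the kind used for the experiments. The key-generation seed, the function names, and the German example sentence are assumptions; how candidate decryptions would be ranked by an n-gram LM is noted in the closing comment.

```python
import random
import string

def make_substitution_key(seed=42):
    """A random monoalphabetic substitution key (hypothetical test cipher)."""
    rng = random.Random(seed)
    shuffled = list(string.ascii_lowercase)
    rng.shuffle(shuffled)
    return dict(zip(string.ascii_lowercase, shuffled))

def apply_key(text, key):
    """Substitute letters according to the key; other characters pass through."""
    return "".join(key.get(c, c) for c in text)

def invert(key):
    """Invert the key so the same routine decrypts."""
    return {v: k for k, v in key.items()}

if __name__ == "__main__":
    key = make_substitution_key()
    plaintext = "die sprache des textes bestimmt das modell"  # invented German sample
    ciphertext = apply_key(plaintext, key)
    print("ciphertext:", ciphertext)
    print("recovered: ", apply_key(ciphertext, invert(key)))
    # In a decryption experiment, candidate keys proposed by a search procedure
    # would be ranked by the character n-gram score of their decryptions under a
    # historical or a modern LM, which is the comparison the thesis investigates.
```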
