Smoothening of Software documentation : comparing a self-made sequence to sequence model to a pre-trained model GPT-2 / Utjämning av mjukvarudokumentation

This thesis was done in collaboration with Ericsson AB. Its goal was to investigate whether a machine learning model can transfer the style of a text into another, arbitrary style determined by the training data. The intended application was to make Ericsson's technical documentation read as if it had been written in one cohesive style, giving a better reading experience. Two approaches were tested: the first was to implement an encoder-decoder model from scratch, and the second was to fine-tune the pre-trained GPT-2 model, created by a team at OpenAI, on the specific task. Both models were trained on sentences extracted from documentation provided by Ericsson. The models were evaluated using training loss, test sentences, and BLEU scores, and the results were compared with each other and with other state-of-the-art models. The models did not succeed in transforming text into a general technical documentation style, but the work gave a good understanding of what would need to be adjusted to improve the results. / This thesis was presented online on June 22, 2021, via Microsoft Teams.

http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-178186

Text style transfer

authorship style transfer

documentation style

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-178186
Date	January 2021
Creators	Tao, Joakim; Thimrén, David
Publisher	Linköpings universitet, Institutionen för datavetenskap
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
