1 |
Finding structure in passwords : Using transformer models for password segmentation
Eneberg, Lina, January 2024
Passwords are a fixture of everyday life. On average, a person has 80 accounts, each of which is supposed to have a different password. Remembering all these passwords is difficult, which leads people to reuse passwords, or reuse them with slight modifications, across many accounts. Studies on memory show that information relating to something personal is more easily remembered, which is likely why many people use passwords relating to themselves, relatives, lovers, friends, or pets.

Hackers most often use brute-force or dictionary attacks to crack a password. These techniques can be quite time-consuming, so using machine learning could be a faster and easier approach. Segmenting someone's previous passwords into meaningful units often reveals personal information about the creator and can thus serve as a basis for password guessing. This report evaluates different sizes of the GPT-SW3 model, which uses a transformer architecture, on password segmentation. The purpose is to find out whether GPT-SW3 is suitable as a password segmenter and, by extension, whether it can be used for password guessing.

As training data, a list of passwords collected from a security breach on the platform RockYou was used. The passwords were segmented by the author to provide the model with correct answers to learn from. The evaluation metric, Exact Match, checks whether the model's prediction is identical to the author's segmentation. Training GPT-SW3 produced no positive results, most likely because of technical limitations. As the results are insufficient, future studies are required to prove or disprove the assumptions this thesis is based on.
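The Exact Match metric described in the abstract is simple to state precisely: a prediction scores only if it reproduces the reference segmentation in full. The sketch below is a minimal illustration, assuming segmentations are represented as lists of substrings; the function names and the example password are illustrative assumptions, not the thesis's actual code or data.

```python
# Minimal sketch of the Exact Match metric: a prediction counts only if it
# reproduces the reference segmentation exactly. The list-of-substrings
# representation, function names, and example password are illustrative
# assumptions, not the thesis's actual code or data.

def exact_match(predicted: list[str], reference: list[str]) -> bool:
    """True only if every segment boundary agrees with the reference."""
    return predicted == reference

def exact_match_score(predictions: list[list[str]],
                      references: list[list[str]]) -> float:
    """Fraction of passwords whose predicted segmentation is fully correct."""
    assert len(predictions) == len(references)
    hits = sum(exact_match(p, r) for p, r in zip(predictions, references))
    return hits / len(predictions)

# Example: a password segmented into meaningful units, compared against
# a hand-labelled reference segmentation.
pred = ["i", "love", "you", "123"]
gold = ["i", "love", "you", "123"]
print(exact_match(pred, gold))  # True
```

Because the metric is all-or-nothing, a prediction that gets every boundary right except one scores zero, which partly explains why insufficient training shows up as a complete absence of positive results.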
2 |
Automatic generation of definitions : Exploring if GPT is useful for defining words
Eriksson, Fanny, January 2023
When reading a text, it is common to get stuck on unfamiliar words that are difficult to understand in the local context. In these cases, we use dictionaries or similar online resources to find the general meaning of the word. However, maintaining a handwritten dictionary is highly resource-demanding as the language is constantly developing, and using generative language models to produce definitions could therefore be a more efficient option. To explore this possibility, this thesis uses an online survey to examine whether GPT could be useful for defining words. It also investigates how well the Swedish language model GPT-SW3 (3.5 b) defines words compared to the model text-davinci-003, and how prompts should be formatted when defining words with these models. The results indicate that text-davinci-003 generates high-quality definitions, and according to a Student's t-test, its definitions received significantly higher ratings from participants than definitions taken from Svensk ordbok (SO). Furthermore, GPT-SW3 (3.5 b) received the lowest ratings, indicating that more investment is needed to keep up with the big models developed by OpenAI. Regarding prompt formatting, the most appropriate prompt format for defining words is highly dependent on the model: text-davinci-003 performed well with zero-shot prompts, while GPT-SW3 (3.5 b) required a few-shot setting. Considering both the high quality of the definitions generated by text-davinci-003 and the practical advantages of generating definitions automatically, GPT could be a useful method for defining words.
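The contrast between the zero-shot setting that suited text-davinci-003 and the few-shot setting GPT-SW3 (3.5 b) required can be illustrated with a small sketch. The prompt wording, the Swedish example words, and their definitions below are assumptions for illustration; the thesis's actual prompts are not reproduced here.

```python
# Hypothetical prompt formats contrasting zero-shot and few-shot settings.
# The wording, the Swedish example words, and their definitions are
# illustrative assumptions; the thesis's actual prompts are not shown here.

def zero_shot_prompt(word: str) -> str:
    # A bare instruction with no examples; per the abstract, this style
    # worked well for text-davinci-003.
    return f"Definiera ordet '{word}' i en mening:"

def few_shot_prompt(word: str) -> str:
    # A few word-definition pairs shown before the target word, so the model
    # can infer the task from the pattern; per the abstract, GPT-SW3 (3.5 b)
    # needed this setting.
    examples = [
        ("hund", "ett tamdjur som härstammar från vargen"),
        ("bok", "ett antal sammanhäftade blad med text"),
    ]
    shots = "\n".join(f"Ord: {w}\nDefinition: {d}" for w, d in examples)
    return f"{shots}\nOrd: {word}\nDefinition:"

print(zero_shot_prompt("katt"))
print(few_shot_prompt("katt"))
```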