Spelling suggestions: "subject:"finetune"" "subject:"vfine_tune""
1 |
Development of a Semantic Search Tool for Swedish Legal Judgements Based on Fine-Tuning Large Language ModelsMikkelsen Toth, Sebastian January 2024 (has links)
Large language models (LLMs) are very large deep learning models which are retrained on a huge amount of data. Among the LLMs are sentence bidirectional encoder representations from transformers (SBERT) where advanced training methods such as transformer-based denoising autoEncoder (TSDAE), generative query network (GenQ) and an adaption of generative pseudo labelling (GPL) can be applied. This thesis project aims to develop a semantic search tool for Swedish legal judgments in order to overcome the limitations of traditional keyword searches in legal document retrieval. For this aim, a model adept at understanding the semantic nuances of legal language has been developed by leveraging natural language processing (NLP) and fine- tuning LLMs like SBERT, using advanced training methods such as TSDAE, GenQ, and an adaption of GPL. To generate labeled data out of unlabelled data, a GPT3.5 model was used after it was fine-tuned. The generation of labeled data with the use of a generative model was crucial for this project to train the SBERT efficiently. The search tool has been evaluated. The evaluation demonstrates that the search tool can accurately retrieve relevant documents based on semantic queries and simnifically improve the efficiency and accuracy of legal research. GenQ has been shown to be the most efficient training method for this use case.
|
2 |
Fine-tuning a BERT-based NER Model for Positive Energy DistrictsOrtega, Karen, Sun, Fei January 2023 (has links)
This research presents an innovative approach to extracting information from Positive Energy Districts (PEDs), urban areas generating surplus energy. PEDs are integral to the European Commission's SET Plan, tackling housing challenges arising from population growth. The study refines BERT to categorize PED-related entities, producing a cutting-edge NER model and an integrated pipeline of diverse NER tools and data sources. The model achieves an accuracy of 0.81 and an F1 Score of 0.55 with notably high confidence scores through pipeline evaluations, confirming its practical applicability. While the F1 score falls short of expectations, this pioneering exploration in PED information extraction sets the stage for future refinements and studies, promising enhanced methodologies and impactful outcomes in this dynamic field. This research advances NER processes for Positive Energy Districts, supporting their development and implementation.
|
3 |
Stora språkmodeller för bedömning av applikationsrecensioner : Implementering och undersökning av stora språkmodeller för att sammanfatta, extrahera och analysera nyckelinformation från användarrecensioner / Large Language Models for application review data : Implementation survey of Large Language Models (LLM) to summarize, extract, and analyze key information from user reviewsvon Reybekiel, Algot, Wennström, Emil January 2024 (has links)
Manuell granskning av användarrecensioner för att extrahera relevant informationkan vara en tidskrävande process. Denna rapport har undersökt om stora språkmodeller kan användas för att sammanfatta, extrahera och analysera nyckelinformation från recensioner, samt hur en sådan applikation kan konstrueras. Det visade sig att olika modeller presterade olika bra beroende på mätvärden ochviktning mellan recall och precision. Vidare visade det sig att fine-tuning av språkmodeller som Llama 3 förbättrade prestationen vid klassifikation av användbara recensioner och ledde, enligt vissa mätvärden, till högre prestation än större språkmodeller som Chat-Bison. För engelskt översatta recensioner hade Llama 3:8b:Instruct, Chat-Bison samt den fine-tunade versionen av Llama 3:8b ett F4-makro-score på 0.89, 0.90 och 0.91 respektive. Ytterligare ett resultat är att de större modellerna Chat-Bison, Text-Bison och Gemini, presterade bättre i fallet för generering av sammanfattande texter, än de mindre modeller som testades vid inmatning av flertalet recensioner åt gången. Generellt sett presterade språkmodellerna också bättre om recensioner först översattes till engelska innan bearbetning, snarare än då recensionerna var skrivna i originalspråk där de majoriteten av recensionerna var skrivna på svenska. En annan lärdom från förbearbetning av recensioner är att antal anrop till dessa språkmodeller kan minimeras genom att filtrera utifrån ordlängd och betyg. Utöver språkmodeller visade resultaten att användningen av vektordatabaser och embeddings kan ge en större överblick över användbara recensioner genom vektordatabasers inbyggda förmåga att hitta semantiska likheter och samla liknande recensioner i kluster. / Manually reviewing user reviews to extract relevant information can be a time consuming process. This report investigates if large language models can be used to summarize, extract, and analyze key information from reviews, and how such anapplication can be constructed. It was discovered that different models exhibit varying degrees of performance depending on the metrics and the weighting between recall and precision. Furthermore, fine-tuning of language models such as Llama 3 was found to improve performance in classifying useful reviews and, according to some metrics, led to higher performance than larger language models like Chat-bison. Specifically, for English translated reviews, Llama 3:8b:Instruct, Chat-bison, and Llama 3:8b fine-tuned had an F4 macro score 0.89, 0.90, 0.91 respectively. A further finding is that the larger models, Chat-Bison, Text-Bison, and Gemini performed better than the smaller models that was tested, when inputting multiple reviews at a time in the case of summary text generation. In general, language models performed better if reviews were first translated into English before processing rather than when reviews were written in the original language where most reviews were written in Swedish. Additionally, another insight from the pre-processing phase, is that the number of API-calls to these language models can be minimized by filtering based on word length and rating. In addition to findings related to language models, the results also demonstrated that the use of vector databases and embeddings can provide a greater overview of reviews by leveraging the databases’ built-in ability to identify semantic similarities and cluster similar reviews together.
|
Page generated in 0.0386 seconds