• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

KERMIT: Knowledge Extractive and Reasoning Model usIng Transformers

Hameed, Abed Alkarim, Mäntyniemi, Kevin January 2024 (has links)
In the rapidly advancing field of artificial intelligence, Large Language Models (LLMs) like GPT-3, GPT-4, and Gemini have revolutionized sectors by automating complex tasks. Despite their advancements, LLMs and more noticeably smaller language models (SLMs) still face challenges, such as generating unfounded content "hallucinations." This project aims to enhance SLMs for broader accessibility without extensive computational infrastructure. By supervised fine-tuning of smaller models with new datasets, SQUAD-ei and SQUAD-GPT, the resulting model, KERMIT-7B, achieved superior performance in TYDIQA-GoldP, demonstrating improved information extraction while retaining generative quality. / Inom det snabbt växande området artificiell intelligens har stora språkmodeller (LLM) som GPT-3, GPT-4 och Gemini revolutionerat sektorer genom att automatisera komplexa uppgifter. Trots sina framsteg stårdessa modeller, framför allt mindre språkmodeller (SLMs) fortfarande inför utmaningar, till exempel attgenerera ogrundat innehåll "hallucinationer". Denna studie syftar till att förbättra SLMs för bredare till-gänglighet utan krävande infrastruktur. Genom supervised fine-tuning av mindre modeller med nya data-set, SQUAD-ei och SQUAD-GPT, uppnådde den resulterande modellen, KERMIT-7B, överlägsen pre-standa i TYDIQA-GoldP, vilket visar förbättrad informationsutvinning samtidigt som den generativa kva-liteten bibehålls.

Page generated in 0.0166 seconds