Abstract

Current large language models (LLMs) have demonstrated abilities that, just a few short years ago, would have seemed impossible, e.g., question answering. While LLMs like OpenAI’s GPT can do impressive, unanticipated things, maximizing their value requires training them on, or giving them access to, additional, often proprietary, data. I compare two popular methods for integrating such data into LLMs for the task of question answering: fine-tuning and context injection (a specific application of retrieval-augmented generation, RAG). A suite of semantic measurements is evaluated for comparing the answers generated by the two methods. Using the best-performing measurement, Ada 002 embeddings with cosine similarity, I show that context injection, which relies on vector embeddings and semantic search, generates answers that are semantically closer to the desired answers and free of hallucinations or confabulations. I also provide qualitative and stylistic observations from the experiments that further distinguish the two methods.
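The abstract's answer-comparison measurement, Ada 002 embeddings scored with cosine similarity, can be illustrated with a minimal sketch. The snippet below is an assumption about how such a comparison might be implemented (it is not the author's code): it embeds a generated answer and a desired answer with OpenAI's text-embedding-ada-002 model and reports their cosine similarity, where values closer to 1.0 indicate greater semantic closeness.

```python
# Hypothetical sketch: scoring semantic similarity between two answers
# with Ada 002 embeddings and cosine similarity (not the thesis code).
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def embed(text: str) -> np.ndarray:
    """Return the Ada 002 embedding vector for a piece of text."""
    response = client.embeddings.create(
        model="text-embedding-ada-002",
        input=text,
    )
    return np.array(response.data[0].embedding)


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Example usage: compare a generated answer against the desired answer.
generated = "The capital of France is Paris."
desired = "Paris is the capital city of France."
score = cosine_similarity(embed(generated), embed(desired))
print(f"Semantic similarity: {score:.3f}")
```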
