DISTRIBUTIONAL CORPUS ANALYSIS OF KOREAN NEOLOGISMS USING ARTIFICIAL INTELLIGENCE

Kim, Wonbin

Abstract

Distributional corpus analysis (DCA) is an approach that reveals lexical relations by applying computational techniques from natural language processing to large-scale corpora. Its advantage is that it processes and analyzes lexical relations in a quantitative, consistent, and objective way. Although the DCA approach allows analysts to process large-scale linguistic data efficiently, few studies within corpus linguistics have used it to investigate language phenomena. This study therefore aims to bridge the gap between the DCA approach and corpus linguistics by designing and describing DCA from the perspective of corpus linguistics. Specifically, it uses the DCA approach to analyze the distributional behaviors of three Korean neologisms, leyal, lwuce, and kay-, and to track their semantic change. For this analysis, Korean Twitter data spanning about ten years is collected and three state-of-the-art techniques are employed: word2vec and cosine similarity for leyal, Latent Dirichlet Allocation for lwuce, and long short-term memory for kay-. For kay-, the connotational and attitudinal meaning is investigated. The results from DCA show that (i) between the two meanings of leyal, ‘really’ has always been more dominant than ‘Real Madrid’, (ii) between the new and existing meanings of lwuce, the existing meaning has always been more dominant, and the use of the new meaning decreased most significantly in 2015, and (iii) the semantic prosody of kay- has shifted from negative toward positive. This study makes several “first attempts”. First, it is the first study to use artificial intelligence and Korean social media data to analyze the distributional behaviors of Korean neologisms and track their semantic change over time. Second, it is the first study to present DCA from the perspective of corpus linguistics. Third, it is the first to establish specific methods for validating the DCA approach using collocation analysis from corpus linguistics. With these “first attempts”, the study can encourage interdisciplinary research between corpus linguistics and artificial intelligence and serve as a foundation on which further DCA studies in corpus linguistics can build.
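As a rough illustration of the cosine-similarity step mentioned for leyal, the sketch below compares a target vector against two sense anchors and reports the closer one. It is not the dissertation's procedure, which the abstract does not describe in detail: the function names, the anchor labels, and the three-dimensional vectors are all hypothetical, and real word2vec embeddings would be learned from the corpus and have far more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def dominant_sense(target, anchors):
    """Return the anchor label most similar to target, plus all scores."""
    scores = {label: cosine_similarity(target, vec)
              for label, vec in anchors.items()}
    return max(scores, key=scores.get), scores


# Made-up toy vectors (assumption): stand-ins for learned embeddings.
toy_target = [0.9, 0.1, 0.2]           # hypothetical vector for "leyal"
toy_anchors = {
    "really": [1.0, 0.0, 0.1],         # hypothetical anchor for one sense
    "Real Madrid": [0.1, 1.0, 0.3],    # hypothetical anchor for the other
}

label, scores = dominant_sense(toy_target, toy_anchors)
print(label, scores)
```

Repeating such a comparison on embeddings trained separately for each time slice of a corpus is one plausible way to follow which sense dominates over time, though the actual design used in the study may differ.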

Details

Record ID

3407

Record Created

2024-12-05

Title

DISTRIBUTIONAL CORPUS ANALYSIS OF KOREAN NEOLOGISMS USING ARTIFICIAL INTELLIGENCE

Author

Kim, Wonbin

Contributor

Kretzschmar, William A. (Advisor)
Harklau, Linda (Committee Member)
Mellom, Paula J. (Committee Member)

College or School

Franklin College of Arts and Sciences

Department

Linguistics

Content Type

Dissertation

Pagination

179

File Format

pdf

Language

English

Degree Type

Doctor of Philosophy (PHD)

Name of Granting Institution

University of Georgia

Year Degree Granted

2022-12

Keywords

artificial intelligence; collocation analysis; distributional corpus analysis; distributional frequency profiles; Korean neologisms; semantic change

System Control Number

9949515726902959
