Analysis of language variation and word segmentation for a corpus of Vietnamese blogs: a sociolinguistic approach

Mello, Heather Lee

2013

Abstract

This dissertation examines units of meaning, word segmentation, and language variation in a corpus of Vietnamese-language blogs collected from publicly accessible internet sources originating in Viet Nam, the US, and Australia. Research applying corpus linguistics techniques to the Vietnamese language has begun to proliferate in Western sources over the past decade; however, studies using language-in-use data remain rare. The corpus was analyzed as a whole and by subcorpus (blogs and comments; Viet Nam, US, and Australia), using the Vietnamese syllable, or tiếng, as the basic unit of meaning, with subsequent iterations over sequences of one through five tiếng. While the results support previous research asserting that the syllable is the basic distributional element in Vietnamese discourse, they do not support claims that Vietnamese is a monosyllabic language. Tiếng collocate and colligate meaningfully and regularly throughout the corpora in clusters larger than one syllable, indicating that syllable combinations, the union of tiếng (Nguyen, 1984), are also primary distributional patterns in Vietnamese. Varieties of Vietnamese by country are similar across a range of distributional analyses, including A-curve (frequency of frequencies), structural, content, and units-of-meaning analyses. Variation by country is largely limited to collocational and colligational content and topical patterns.
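
The abstract's method of counting sequences of one through five syllables, then taking a frequency of frequencies, can be illustrated with a minimal Python sketch. It is not the dissertation's own code. It assumes that syllables can be recovered by splitting on whitespace (standard Vietnamese orthography separates syllables with spaces), and the function names and sample sentence are hypothetical.

```python
from collections import Counter


def syllable_ngrams(text, n):
    """Return all n-grams over whitespace-delimited syllables.

    Assumption: each whitespace-separated token is one syllable.
    """
    syllables = text.lower().split()
    return [tuple(syllables[i:i + n]) for i in range(len(syllables) - n + 1)]


def ngram_counts(text, max_n=5):
    """Count syllable n-grams for every n from 1 to max_n."""
    return {n: Counter(syllable_ngrams(text, n)) for n in range(1, max_n + 1)}


def frequency_of_frequencies(counts):
    """Map each observed frequency to the number of n-gram types with it."""
    return Counter(counts.values())


# Hypothetical sample text, not drawn from the dissertation's corpus.
sample = "tôi học tiếng việt tôi học tiếng anh"
counts = ngram_counts(sample)
unigram_profile = frequency_of_frequencies(counts[1])
```

Plotting each frequency against the number of types that share it, per subcorpus, would give curves that can be compared across countries in the way the abstract describes.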

Details

Record ID

18767

Record Created

2024-12-05

Title

Analysis of language variation and word segmentation for a corpus of Vietnamese blogs: a sociolinguistic approach

Author

Mello, Heather Lee

Contributor

Kretzschmar, William (Advisor)
Benedek, Dezso (Committee Member)
Howe, Lewis (Committee Member)

College or School

Linguistics

Date

2013

Publisher

University of Georgia

Content Type

Dissertation

Language

English

Dissertation/Thesis Note

Doctoral

Degree Type

Doctor of Philosophy (PHD)

Name of Granting Institution

University of Georgia, Summer 2013

Year Degree Granted

2013

Keywords

Vietnamese Language; Corpus Linguistics; Sociolinguistics; Word Segmentation; Unit of Meaning; A-Curve; Blogs; Internet; Diaspora; Language Variety; Tiếng

System Control Number

9949333204402959
