Abstract

In recent years, machine learning-based computational methods have been applied to automatically discern between the speech of a special population with language impairment, people with dementia, and that of healthy controls. The more successful of these methods employ a paired language model approach, in which the diagnosis is based on the perplexities of two models: one trained on speech samples of people with dementia and the other on healthy control samples. This work applies that approach to another special population with language impairment, people with post-stroke aphasia, and asks (1) whether this approach still works, as measured by improvement over a baseline classifier that simply returns the majority class from the training data, and (2) whether input from a single testing dimension, language production, is enough to yield a significant improvement, given that a clinical diagnosis is based on assessment along three dimensions: production, comprehension, and repetition. Next, this work probes these language models to find out what drives the difference in perplexity, the metric that underlies the classification decision. A word-level analysis of the difference in surprisals between the language models revealed that (1) for Broca's aphasia, the models were most sensitive to closed-class lexical elements; (2) for Wernicke's aphasia, the models appeared to be sensitive to the main elements expected in a typical healthy response (e.g., the word 'bread' in a task that asks how to make a PB&J sandwich); (3) for anomic aphasia, the models were found to be sensitive to filled pauses ('um' and 'uh') produced during the discourse; and (4) for conduction aphasia, the models were sensitive to phonemic paraphasias and the main elements of a response.
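The paired-perplexity decision rule and the word-level surprisal probe described above can be illustrated with a minimal sketch. This is not the paper's implementation: the model family is not specified here, so the sketch assumes simple add-one smoothed bigram models, and the `impaired_transcripts` / `control_transcripts` corpora are hypothetical placeholders for real transcript data.

```python
import math
from collections import Counter

class BigramLM:
    """Add-one smoothed bigram LM; a toy stand-in for the paired models."""

    def __init__(self, sentences):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for tokens in sentences:
            padded = ["<s>"] + tokens
            self.unigrams.update(padded)
            self.bigrams.update(zip(padded, padded[1:]))
        self.vocab = len(self.unigrams)

    def surprisal(self, prev, word):
        """Per-word surprisal in bits: -log2 P(word | prev)."""
        p = (self.bigrams[(prev, word)] + 1) / (self.unigrams[prev] + self.vocab)
        return -math.log2(p)

    def perplexity(self, tokens):
        """2 ** (mean per-word surprisal) over the sample."""
        padded = ["<s>"] + tokens
        bits = [self.surprisal(p, w) for p, w in zip(padded, padded[1:])]
        return 2 ** (sum(bits) / len(bits))

def classify(sample, impaired_lm, control_lm):
    """Paired-perplexity rule: assign the label of the model under which
    the sample is less surprising (lower perplexity)."""
    if impaired_lm.perplexity(sample) < control_lm.perplexity(sample):
        return "impaired"
    return "control"

def surprisal_gaps(sample, impaired_lm, control_lm):
    """Word-level probe: per-word surprisal difference (control minus
    impaired); large positive gaps mark the words that push the
    decision toward the impaired model."""
    padded = ["<s>"] + sample
    return [(w, control_lm.surprisal(p, w) - impaired_lm.surprisal(p, w))
            for p, w in zip(padded, padded[1:])]

# Hypothetical toy corpora standing in for real transcript data.
impaired_transcripts = [
    "um you take uh the the bread".split(),
    "uh put um jelly on it".split(),
]
control_transcripts = [
    "you take two slices of bread".split(),
    "spread peanut butter and jelly on the bread".split(),
]

impaired_lm = BigramLM(impaired_transcripts)
control_lm = BigramLM(control_transcripts)

sample = "um you take uh the bread".split()
print(classify(sample, impaired_lm, control_lm))
for word, gap in surprisal_gaps(sample, impaired_lm, control_lm):
    print(f"{word}\t{gap:+.2f} bits")
```

Aggregating the per-word gaps by lexical category (closed-class words, filled pauses, key content words) is one plausible way to arrive at the kind of per-syndrome sensitivity findings the abstract reports, though the paper's exact analysis procedure is not given here.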
