Go to main content
Formats
Format
BibTeX
MARCXML
TextMARC
MARC
DataCite
DublinCore
EndNote
NLM
RefWorks
RIS

Files

Abstract

Each year, publicly incorporated companies are required to file a Form 10-K with the United States Securities and Exchange Commission. These documents contain an enormous amount of natural language data and may offer insight into financial performance prediction. This thesis attempts to analyze two dimensions of language held within this data: sentiment and linguistic hedging. An experiment was conducted with 325 human annotators to manually score a subset of the sentiment words contained in a corpus of 106 10-K filings, and an inference engine identified instances of hedges having governance over these words in a dependency tree. Finally, this work proposes an algorithm for the automatic classification of sentences in the financial domain as speculative or non-speculative using the previously defined hedge cues.

Details

PDF

Statistics

from
to
Export
Download Full History