Abstract
Most approaches to statistical stylometry have concentrated on lexical features, such as relative word frequencies or type-token ratios. Syntactic features have been largely ignored. This work attempts to fill that gap by introducing a technique for authorship attribution based on dependency grammar. Syntactic features are extracted from texts using a common dependency parser, and those features are used to train a classifier to identify texts by author. While the method described does not outperform existing methods on most tasks, it does demonstrate that purely syntactic features carry information that could be useful for stylometric analysis.
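
To make the pipeline described above concrete, the following is a minimal sketch in Python, not the method used in this work. Everything in it is an assumption made for illustration: the parses are hand-written stand-ins for real parser output, the feature set (relative frequencies of dependency labels) is one simple choice among many, and the nearest-profile cosine comparison stands in for whatever classifier is actually trained. The names `label_profile`, `cosine`, and `attribute` are hypothetical.

```python
import math
from collections import Counter

# ASSUMPTION: hand-written parses standing in for real dependency-parser output.
# Each sentence is a list of (token, head_index, dependency_label) triples;
# head_index -1 marks the root. This sketch uses only the labels.
AUTHOR_A = [[("The", 2, "det"), ("old", 2, "amod"), ("dog", 3, "nsubj"),
             ("chased", -1, "root"), ("a", 6, "det"), ("red", 6, "amod"),
             ("ball", 3, "dobj")]]
AUTHOR_B = [[("When", 2, "mark"), ("she", 2, "nsubj"), ("left", 4, "advcl"),
             ("he", 4, "nsubj"), ("slept", -1, "root")]]
UNKNOWN = [[("A", 2, "det"), ("small", 2, "amod"), ("cat", 3, "nsubj"),
            ("saw", -1, "root"), ("the", 5, "det"), ("bird", 3, "dobj")]]


def label_profile(sentences):
    """Relative frequency of each dependency label; no word forms are used."""
    counts = Counter(label for sent in sentences for _tok, _head, label in sent)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse feature dictionaries."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def attribute(unknown_profile, known_profiles):
    """Return the author whose profile is most similar to the unknown text."""
    return max(known_profiles,
               key=lambda author: cosine(unknown_profile, known_profiles[author]))


profiles = {"author_a": label_profile(AUTHOR_A),
            "author_b": label_profile(AUTHOR_B)}
print(attribute(label_profile(UNKNOWN), profiles))  # prints "author_a"
```

The point of the sketch is only that the feature vectors contain no vocabulary at all: the unknown sentence shares no content words with either known text, yet its distribution of dependency labels is closer to the first author's. In practice the triples would come from a dependency parser and the comparison would be replaced by a trained classifier.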