Abstract
Wikipedia's collaborative model is simple and open. This openness challenges its trustworthiness, because it leaves articles exposed to vandalism. Several vandalism detection techniques exist, but none of them focuses on detecting elusive vandalism. This type of vandalism lacks the usual characteristics of vandalism and is therefore difficult to detect. We propose multi-context-aware detection techniques for determining whether an elusive edit is vandalism. The main idea of these techniques is to check whether an edit lies within the context of the other words in a particular Wikipedia article. For our experiments, we use a PAN corpus, which is a large collection of Wikipedia edits. We then perform feature extraction and train a classifier on the extracted features using WEKA. We evaluate our methods using the F1-measure. The results show that the context-aware techniques are effective, producing very few false positives and false negatives.
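For readers unfamiliar with the metric, the following is a minimal sketch of how the F1-measure is conventionally computed from confusion counts for the vandalism class. The function name and the example counts are illustrative assumptions, not values or code from this work; the evaluation described above was carried out with WEKA.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1-measure for the positive (vandalism) class.

    tp: vandalism edits correctly flagged
    fp: regular edits wrongly flagged as vandalism
    fn: vandalism edits that were missed
    The name and signature are hypothetical, chosen for illustration.
    """
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined),
        # so F1 is reported as 0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only, not results from the paper.
example = f1_score(tp=8, fp=2, fn=2)
```

Because F1 is the harmonic mean of precision and recall, it rises only when both false positives and false negatives are kept low, which is why it suits a claim about reducing both kinds of error.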