The Illiterate Editor: Metadata-driven Revert Detection in Wikipedia

This presentation is part of the WikiSym + OpenSym 2013 program.

Jeffrey Segall, Rachel Greenstadt

As the community depends more heavily on Wikipedia as a source of reliable information, the ability to quickly detect and remove detrimental information becomes increasingly important. The longer incorrect or malicious information lingers in a source perceived as reputable, the more likely that information will be accepted as correct and the greater the loss to the source's reputation. We present The Illiterate Editor (IllEdit), a content-agnostic, metadata-driven classification approach to Wikipedia revert detection. Our primary contribution is a metadata-based feature set for detecting edit quality, which is then fed into a Support Vector Machine for edit classification. By analyzing edit histories, the IllEdit system builds a profile of user behavior, estimates expertise and spheres of knowledge, and determines whether a given edit is likely to be eventually reverted. The success of the system in revert detection (0.844 F-measure), together with a feature set disjoint from those of existing content-analyzing vandalism detection systems, shows promise for using IllEdit alongside such systems to increase the reliability of community information.
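
As a rough illustration of what a content-agnostic feature vector and the reported evaluation metric might look like, the following is a minimal sketch and not the authors' implementation. The `Edit` record, its fields, and the five features are hypothetical and chosen only for illustration; the abstract does not list IllEdit's actual features. The sketch also assumes that "F-measure" means the balanced F1 score, and it omits the Support Vector Machine step.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Edit:
    # Hypothetical metadata record (an assumption); no article text is stored.
    user: str
    page: str
    timestamp: float      # seconds since epoch
    size_delta: int       # bytes added (+) or removed (-)
    comment_length: int   # characters in the edit summary
    reverted: bool        # label: was this edit later reverted?


def metadata_features(edit: Edit, history: Sequence[Edit]) -> List[float]:
    """Build a feature vector from the same user's edits made before `edit`."""
    prior = [e for e in history
             if e.user == edit.user and e.timestamp < edit.timestamp]
    prior_on_page = [e for e in prior if e.page == edit.page]
    revert_rate = sum(e.reverted for e in prior) / len(prior) if prior else 0.0
    return [
        float(len(prior)),           # overall activity of the user
        float(len(prior_on_page)),   # familiarity with this page
        revert_rate,                 # share of earlier edits that were reverted
        float(edit.size_delta),
        float(edit.comment_length),
    ]


def f_measure(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Vectors of this kind could then be passed to any SVM implementation. One simplification to note: the sketch uses the `reverted` label of every earlier edit, whereas a deployed system would only know about reverts that had already happened when the new edit was made.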

A PDF file will be made available on August 5, 2013, through the WikiSym + OpenSym 2013 conference proceedings.

