Category Archives: Wikipedia Track

Measuring the Quality of Edits to Wikipedia

Title: Measuring the Quality of Edits to Wikipedia

Authors: Susan Biancani

Abstract: Wikipedia is unique among reference works both in its scale and in the openness of its editing interface. The question of how it can achieve and maintain high-quality encyclopedic articles is an area of active research. In order to address this question, researchers need to build consensus around a sensible metric to assess the quality of contributions to articles. This measure must not only reflect an intuitive concept of “quality,” but must also be scalable and run efficiently. Building on prior work in this area, this paper uses human raters through Amazon Mechanical Turk to validate an efficient, automated quality metric.

This contribution to OpenSym 2014 will be made available as part of the OpenSym 2014 proceedings on or after August 27, 2014.

Reliability of User-Generated Data: the Case of Biographical Data in Wikipedia

Title: Reliability of User-Generated Data: the Case of Biographical Data in Wikipedia

Authors: Robert Viseur

Abstract: Wikipedia is a collaborative multilingual encyclopedia launched in 2001. We already conducted a first research on the extraction of biographical data about personalities from Belgium in order to build a large database with biographical data. However, the question of the reliability of the data arises. In particular, in the case of Wikipedia, the data are generated by users and could be subject to errors. In consequence, we wanted to answer to the following question: are the data introduced in Wikipedia articles reliable? Our research is organized in three sections. The first section provides a brief state of the art about the reliability of the user-generated data. A second section presents the methodology of our research. A third section will present the results. The error rates that were measured for the birthdate is low (0.75%), although it is higher than the 0.21% score that we observed for the baseline (reference sources). In a fourth section, the results are discussed.

This contribution to OpenSym 2014 will be made available as part of the OpenSym 2014 proceedings on or after August 27, 2014.

Bots vs. Wikipedians, Anons vs. Logged-Ins (Redux): A Global Study of Edit Activity on Wikipedia and Wikidata

Title: Bots vs. Wikipedians, Anons vs. Logged-Ins (Redux): A Global Study of Edit Activity on Wikipedia and Wikidata

Authors: Thomas Steiner

Abstract: Wikipedia is a global crowdsourced encyclopedia that at time of writing is available in 287 languages. Wikidata is a likewise global crowdsourced knowledge base that provides shared facts to be used by Wikipedias. In the context of this research, we have developed an application and an underlying Application Programming Interface (API) capable of monitoring realtime edit activity of all language versions of Wikipedia and Wikidata. This application allows us to easily analyze edits in order to answer questions such as “Bots vs. Wikipedians, who edits more?”, “Which is the most anonymously edited Wikipedia?”, or “Who are the bots and what do they edit?”. To the best of our knowledge, this is the first time such an analysis was done for Wikidata and for really all Wikipedias—large and small. According to our results, all Wikipedias and Wikidata together are edited by about 50% bots and by about 23% anonymous users. Wikidata alone accounts for about 48% of the totally observed edits. If we do not consider Wikidata, i.e., if we only look at all Wikipedias, about 15% of all edits are made by bots and 26% of all edits are made by anonymous users. Overall, we found a stabilizing number of 274 active bots during our observation period. Our application is available publicly online at the URL http://wikipedia-edits.herokuapp.com/, its code has been open-sourced under the Apache 2.0 license.

This contribution to OpenSym 2014 will be made available as part of the OpenSym 2014 proceedings on or after August 27, 2014.

Information Evolution in Wikipedia

Title: Information Evolution in Wikipedia

Authors: Ujwal Gadiraju, Mihai Georgescu, Marco Fisichella, Andrea Ceroni, Kaweh Djafari Naini

Abstract: The Web of data is constantly evolving based on the dynamics of its content. Current Web search engine technologies consider static collections and do not factor in explicitly or implicitly available temporal information, that can be leveraged to gain insights into the dynamics of the data. In this paper, we hypothesize that by employing the temporal aspect as the primary means for capturing the evolution of entities, it is possible to provide entity-based accessibility to Web archives. We empirically show that the edit activity on Wikipedia can be exploited to provide evidence of the evolution of Wikipedia pages over time, both in terms of their content and in terms of their temporally defined relationships, classified in literature as events. Finally, we present results from our extensive analysis of a dataset consisting of 31, 998 Wikipedia pages describing politicians, and observations from in-depth case studies. Our findings reflect the usefulness of leveraging temporal information in order to study the evolution of entities and breed promising grounds for further research.

This contribution to OpenSym 2014 will be made available as part of the OpenSym 2014 proceedings on or after August 27, 2014.

Rhizome and Wikipedia: A Humanities Based Approach Towards a Structural Explanation of the Namespace

Title: Rhizome and Wikipedia: A Humanities Based Approach Towards a Structural Explanation of the Namespace

Authors: Stephan Ligl

Abstract: In this paper, I describe the similarities between the rhizome according to Deleuze and Guattari with their six principles and the wikipedia’s main namespace on the one hand and the principles of a botanical rhizome and wikipedia’s main namespace on the other hand and try to compare them.

This contribution to OpenSym 2014 will be made available as part of the OpenSym 2014 proceedings on or after August 27, 2014.