Category Archives: OpenSym 2014

Measuring the Quality of Edits to Wikipedia

Title: Measuring the Quality of Edits to Wikipedia

Authors: Susan Biancani

Abstract: Wikipedia is unique among reference works both in its scale and in the openness of its editing interface. The question of how it can achieve and maintain high-quality encyclopedic articles is an area of active research. In order to address this question, researchers need to build consensus around a sensible metric to assess the quality of contributions to articles. This measure must not only reflect an intuitive concept of “quality,” but must also be scalable and run efficiently. Building on prior work in this area, this paper uses human raters through Amazon Mechanical Turk to validate an efficient, automated quality metric.

This contribution to OpenSym 2014 will be made available as part of the OpenSym 2014 proceedings on or after August 27, 2014.

Older Adults and Free/Open Source Software: A Diary Study of First-Time Contributors

Title: Older Adults and Free/Open Source Software: A Diary Study of First-Time Contributors

Authors: Jennifer Davidson (Oregon State University), Umme Ayda Mannan (Oregon State University), Rithika Naik (Oregon State University), Ishneet Dua (Oregon State University), Carlos Jensen (Oregon State University)

Abstract: The global population is aging rapidly, and older adults are becoming increasingly technically savvy. This paper explores ways to engage these individuals to contribute to free/open source software (FOSS) projects. We conducted a pilot diary study to explore motivations, barriers, and the contribution processes of first-time contributors in a real-time, qualitative manner. In addition, we measured their self-efficacy before and after their participation. We found that participants were driven by intrinsic motivations, altruism, and internal values, a finding that differs from previous work with older adults and with the general FOSS population. We also found that self-efficacy did not change significantly, even when participants encountered significant barriers or setbacks. The top three barriers were lack of communication, installation issues, and documentation issues. We found that asking for and receiving help, and avoiding difficult development environments, were more likely to lead to success. To verify these results, we encourage a future large-scale diary study that involves multiple demographics. Given our pilot study, we recommend that future outreach efforts involving older adults focus on how to effectively communicate and build community amongst older contributors.

This contribution to OpenSym 2014 will be made available as part of the OpenSym 2014 proceedings on or after August 27, 2014.

Reliability of User-Generated Data: the Case of Biographical Data in Wikipedia

Title: Reliability of User-Generated Data: the Case of Biographical Data in Wikipedia

Authors: Robert Viseur

Abstract: Wikipedia is a collaborative multilingual encyclopedia launched in 2001. We previously conducted an initial study on extracting biographical data about personalities from Belgium in order to build a large biographical database. However, the question of the reliability of the data arises. In particular, in the case of Wikipedia, the data are generated by users and could be subject to errors. Consequently, we wanted to answer the following question: are the data introduced in Wikipedia articles reliable? Our research is organized in four sections. The first section provides a brief state of the art on the reliability of user-generated data. The second section presents the methodology of our research. The third section presents the results: the error rate measured for birthdates is low (0.75%), although it is higher than the 0.21% that we observed for the baseline (reference sources). In the fourth section, the results are discussed.

This contribution to OpenSym 2014 will be made available as part of the OpenSym 2014 proceedings on or after August 27, 2014.

Bots vs. Wikipedians, Anons vs. Logged-Ins (Redux): A Global Study of Edit Activity on Wikipedia and Wikidata

Title: Bots vs. Wikipedians, Anons vs. Logged-Ins (Redux): A Global Study of Edit Activity on Wikipedia and Wikidata

Authors: Thomas Steiner

Abstract: Wikipedia is a global crowdsourced encyclopedia that, at the time of writing, is available in 287 languages. Wikidata is a likewise global crowdsourced knowledge base that provides shared facts to be used by Wikipedias. In the context of this research, we have developed an application and an underlying Application Programming Interface (API) capable of monitoring realtime edit activity of all language versions of Wikipedia and Wikidata. This application allows us to easily analyze edits in order to answer questions such as “Bots vs. Wikipedians, who edits more?”, “Which is the most anonymously edited Wikipedia?”, or “Who are the bots and what do they edit?”. To the best of our knowledge, this is the first time such an analysis has been done for Wikidata and for truly all Wikipedias, large and small. According to our results, about 50% of the edits to all Wikipedias and Wikidata together are made by bots and about 23% by anonymous users. Wikidata alone accounts for about 48% of all observed edits. If we do not consider Wikidata, i.e., if we only look at all Wikipedias, about 15% of all edits are made by bots and 26% of all edits are made by anonymous users. Overall, we found a stabilizing number of 274 active bots during our observation period. Our application is available publicly online at the URL http://wikipedia-edits.herokuapp.com/; its code has been open-sourced under the Apache 2.0 license.

This contribution to OpenSym 2014 will be made available as part of the OpenSym 2014 proceedings on or after August 27, 2014.
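
The kind of tally behind figures such as “about 50% bots” in the abstract above can be illustrated with a minimal sketch. This is not the author's application or API: the record layout, the `bot` and `anon` field names, and the sample edits are all hypothetical and serve only to show how per-class edit shares might be computed from a list of observed edits.

```python
from collections import Counter


def classify_editor(edit):
    """Return 'bot', 'anonymous', or 'logged-in' for one edit record.

    The boolean keys 'bot' and 'anon' are hypothetical field names,
    not taken from the application described in the abstract.
    """
    if edit.get("bot"):
        return "bot"
    if edit.get("anon"):
        return "anonymous"
    return "logged-in"


def edit_shares(edits):
    """Return each editor class's share of the edits as a fraction of 1."""
    counts = Counter(classify_editor(e) for e in edits)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: n / total for kind, n in counts.items()}


# Made-up sample edits, for illustration only.
sample = [
    {"project": "wikidata", "bot": True, "anon": False},
    {"project": "enwiki", "bot": False, "anon": True},
    {"project": "enwiki", "bot": False, "anon": False},
    {"project": "dewiki", "bot": True, "anon": False},
]
shares = edit_shares(sample)
```

Filtering the input list by project before calling `edit_shares` would give the Wikipedia-only view that the abstract contrasts with the combined Wikipedia and Wikidata view.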

Wikidata: How We Brought Structured Data to Wikipedia

OpenSym 2014 is proud to announce one of the conference’s invited talks!

Title: Wikidata: How We Brought Structured Data to Wikipedia

Speakers: Daniel Kinzler and Lydia Pintscher of Wikimedia e.V.

Abstract: Over the last two years we have been developing Wikidata and building up a community around it. Wikidata is Wikimedia’s central repository for structured data. This is the place where data, like the number of inhabitants of a country, is stored and made accessible to humans and computers alike. The data is used across all 287 language editions of Wikipedia and its sister projects as well as in projects outside of Wikimedia. In this talk we will take a look at how we developed Wikidata, what great tools are being built on top of it, and what is in store for the future.

Biographies: Daniel Kinzler is the lead developer of the Wikidata project at Wikimedia Germany. He has been active on Wikipedia since 2004 and has contributed to MediaWiki since 2005. He has a diploma in Informatics with a thesis about data extraction from Wikipedia.

Lydia Pintscher studied computer science at the Karlsruhe Institute of Technology and is the product manager of Wikidata at Wikimedia Germany. She has been with the Wikidata project since its beginning and is passionate about all things Free Culture. In her other life she is a board member of KDE e.V. and editor of Open Advice.

Information Evolution in Wikipedia

Title: Information Evolution in Wikipedia

Authors: Ujwal Gadiraju, Mihai Georgescu, Marco Fisichella, Andrea Ceroni, Kaweh Djafari Naini

Abstract: The Web of data is constantly evolving based on the dynamics of its content. Current Web search engine technologies consider static collections and do not factor in explicitly or implicitly available temporal information that can be leveraged to gain insights into the dynamics of the data. In this paper, we hypothesize that by employing the temporal aspect as the primary means for capturing the evolution of entities, it is possible to provide entity-based accessibility to Web archives. We empirically show that the edit activity on Wikipedia can be exploited to provide evidence of the evolution of Wikipedia pages over time, both in terms of their content and in terms of their temporally defined relationships, classified in the literature as events. Finally, we present results from our extensive analysis of a dataset consisting of 31,998 Wikipedia pages describing politicians, and observations from in-depth case studies. Our findings reflect the usefulness of leveraging temporal information in order to study the evolution of entities and provide promising ground for further research.

This contribution to OpenSym 2014 will be made available as part of the OpenSym 2014 proceedings on or after August 27, 2014.

On Influences Between Software Standards and Their Implementations in Open Source Projects: Experiences from RDFa and Its Implementation in Drupal

Title: On Influences Between Software Standards and Their Implementations in Open Source Projects: Experiences from RDFa and Its Implementation in Drupal

Authors: Björn Lundell (University of Skövde), Jonas Gamalielsson (University of Skövde), Alexander Grahn (University of Skövde), Jonas Feist (RedBridge AB), Tomas Gustavsson (PrimeKey Solutions AB), Henrik Strindberg (Findwise AB)

Abstract: It is widely acknowledged that standards implemented in open source software can reduce the risk of lock-in, improve interoperability, and promote competition on the market. However, there is limited knowledge concerning the relationship between standards and their implementations in open source software. This paper reports on an investigation of influences between software standards and open source software implementations of software standards. The study focuses on the RDFa standard and its implementation in the Drupal project. Specifically, issues in the W3C issue trackers for RDFa and the Drupal issue tracker for RDFa have been analysed. Findings show that there is clear evidence of reciprocal action between RDFa and its implementation in Drupal. The study contributes novel insights concerning effective processes for development and long-term maintenance of software standards and their implementations in open source projects.

This contribution to OpenSym 2014 will be made available as part of the OpenSym 2014 proceedings on or after August 27, 2014.

Rhizome and Wikipedia: A Humanities Based Approach Towards a Structural Explanation of the Namespace

Title: Rhizome and Wikipedia: A Humanities Based Approach Towards a Structural Explanation of the Namespace

Authors: Stephan Ligl

Abstract: In this paper, I describe the similarities between Wikipedia’s main namespace and, on the one hand, the rhizome according to Deleuze and Guattari with their six principles and, on the other hand, the principles of a botanical rhizome, and I try to compare them.

This contribution to OpenSym 2014 will be made available as part of the OpenSym 2014 proceedings on or after August 27, 2014.

Not Only for Ideation, But Also for Signaling: Incorporating User-Profile-Webpages into Virtual Ideas Communities

Title: Not Only for Ideation, But Also for Signaling: Incorporating User-Profile-Webpages into Virtual Ideas Communities

Authors: Ulrich Bretschneider, Philipp Ebel, Shkodran Zogaj, Jan Marco Leimeister

Abstract: This research-in-progress paper describes the case of SAPiens, which is a Virtual Ideas Community (VIC). Typically, SAPiens – and VICs in general – focuses solely on supporting the ideation interactions among members. There is evidence from a survey that SAPiens members are also interested in actively signaling competences, experiences and skills to third parties. However, SAPiens does not offer IT functionalities that would allow for such signaling. Against this backdrop, we propose to enrich SAPiens through User Profile Webpages allowing SAPiens members to construct a public profile within the community and thereby to signal individual capabilities, skills and experiences. The aim of this action design research is to design such an IT artifact by building on signaling theory. After this initial design, our research constitutes a circular process of constant refinement as well as piloting and evaluation of the IT artifact in the real-world setting of the SAPiens VIC.

This contribution to OpenSym 2014 will be made available as part of the OpenSym 2014 proceedings on or after August 27, 2014.

From Mashup Applications to Open Data Ecosystem

Title: From Mashup Applications to Open Data Ecosystem

Authors: Timo Aaltonen (Tampere University of Technology), Tommi Mikkonen (Tampere University of Technology), Heikki Peltola (Tampere University of Technology), Arto Salminen (Tampere University of Technology)

Abstract: Web-based software is available all over the world instantly after the online release. Applications can be used and updated without the need to install anything, with natural support for collaboration, which allows users to interact and share the same applications over the Web. In addition, numerous web services allowing users to upload, download, store and modify private and public resources have emerged. However, as the number of web services and devices used to consume as well as generate data has exploded, it is difficult to access and manage relevant data. In this paper, we start from the principles of mashups, relate their use to the concepts of software ecosystems, and finally extend the discussion to open data generated by users themselves. As a technical contribution, we also introduce our proof-of-concept implementation of a mashup system built on wellness data, and discuss the main lessons we have learned in the process.

This contribution to OpenSym 2014 will be made available as part of the OpenSym 2014 proceedings on or after August 27, 2014.