History: XML dumps and research needs for WMF projects

*Session time: Thursday, July 8, 2010, 9 AM
*Facilitator: Ariel Glenn
*Participants: Kevin Crowston, Victor Grishchenko, Daniel Kinzler, Roan Kattouw, Andreea Gorbatai

Discussion topics:
*Proposals for new information in the XML dumps, and for ways to segment the dumps into smaller chunks or produce samples
*Types of usage statistics people want to see produced, including navigation path statistics
*Proposal to collect and provide search terms from Lucene and Google searches, and to track search successes and failures
*Shared researcher collaboration and computing platform for exchanging dumps, samples, tools, and research results, and for providing disk space and computing power

There is a wiki page at http://www.mediawiki.org/wiki/Research_Data_Proposals which contains the list of proposals; please add items there.