As part of the 2011 Summer of Research, the Wikimedia Foundation's Community Department has announced an experiment to investigate potential improvements to the new editors' experience of their first contact with patrollers, using the Huggle anti-vandalism tool. The Summer of Research is a three-month intensive project to study aspects of participation in Wikipedia that may have a significant effect on editor retention. It brings together a group of researchers, mostly PhD candidates, who have experience in both computer science and the social sciences, to give us a more well-rounded understanding of participation in the projects. (See also earlier Signpost coverage: "Wikimedia Summer of Research: Three topics covered so far", "WMF Community Department announces 'Summer of Research' participants")
The Signpost interviewed researchers R Stuart Geiger (who uses the Staeiou account for non-research editing), Aaron Halfaker, and Wikimedia Foundation Fellow Steven Walling to find out more. Steven has been a volunteer editor on the English Wikipedia since 2006, and before taking up the Foundation Fellowship was a professional writer and blogger, mostly for technology publications and companies. Stuart has been a Wikipedia editor since late 2004, and has been studying the project as an academic since his undergraduate senior thesis in 2006. Since then, he's been gathering from a number of fields the conceptual, theoretical, and methodological tools necessary to study something as complex as Wikipedia. "At present, I'm a doctoral student from the School of Information at the University of California, Berkeley, and I have a keen interest in both the digital humanities and social statistics movements." Aaron is a computer science graduate student from the University of Minnesota. He's been an editor since 2008 and has published academic research on Wikipedia since WikiSym 2009. He specializes in statistical data mining and designs user-scripts for Wikipedia to understand/improve editor interactions.
How did the Summer of Research project come about, and what questions will it investigate? According to Steven, the experiment aims to test "warning templates that are explicitly more personalized and set out to teach new editors more directly, rather than simply pointing them to policy and asking them not to do something". Steven says he personally got involved because, as a Fellow at the Foundation, research has been part of his job. "I currently share the responsibility for leading the project team with Diederik van Liere and Maryana Pinchuk. Diederik has experience with the technical side of this project, Maryana is a qualitative researcher with an academic background, and I lend community experience to round out the leadership team. We built an enormous, multi-part question list publicly on Meta. But it turns out that was just a beginning guide. We've been structuring the summer as a series of weekly sprints, and to get a feel for the research topics that have been and are currently being explored, I'd check out the public list on our Meta page. Because the team has a wide variety of skills, we've looked at many different aspects of Wikipedia as a community so far."
Aaron said they decided to experiment with Huggle's standardised warning system because the project goal is to understand the decline in new editors, so it seemed logical to focus on new editors' experience in the community. "Team-member Dr. Melanie Kill suspected that welcome messages might have an effect on how new editors perceive the community. So because Hugglers send out the most messages to new editors, we wanted to see if we could improve conversion (from damage) and other retention rates by just changing the wording of the message."
We wondered how the Foundation's sometimes lofty strategic goals, like "Support the recruitment and acculturation of newer contributors", are translated into practical initiatives such as this. Steven points to the Board resolution on Openness and the Foundation's Annual Plan for 2011–12. "Recruiting and retaining editors for Wikipedia is now one of our top priorities, and Zack Exley, our Chief Community Officer, designed the summer to really dig deeper into the exact areas of English Wikipedia and other projects that have the largest effect on new editors, and whether those editors stick around. The Editor trends study gave us a high-level understanding of the trends in participation, but it didn't tell us with certainty what internal community factors most have an impact. We need to have more data we're confident in if we're going to make good decisions, thus the Huggle experiment, which is clarifying that automated editing tools have a huge impact on new editors. The project was in the sweet spot of being able to gather a statistically significant sample quickly and with minimal impact on the normal functioning of the community."
Stuart's background seems ideally matched to an experiment that seeks to understand social phenomena using technical methodologies. "I'm an adherent of the sociotechnical systems approach, which thinks in terms of how social and technical phenomena are inherently intertwined, especially when we study processes in communities as technologically mediated as Wikipedia. Our motto, 'the free encyclopedia that anyone can edit' speaks to this principle that the Wikipedia community can't be fully understood without taking into account the code on which it runs – and vice versa. Huggle is a great example of this: scripts, tools, and bots like Huggle, Twinkle, and User:ClueBot have become the predominant way in which new users are introduced into Wikipedia. In fact, here's a statistic that is hot off the research press: almost 75% of newbies have their first talk page message sent to them from one of those semi- or full-automated software systems."
How were the parameters of the experiment decided on – for example, the number of warnings delivered, the proportion of changed warnings? Aaron says they settled on three variables for testing in the experiment: personalized, teaching-oriented, and image. "Dr. Kill, a professor of rhetoric, produced personalized and teaching-oriented versions of the default warning template for Huggle; Stuart and Aaron then expanded these templates with image/no-image versions and prepared a random template generator. Our requirement for the number of experimental welcomes/warnings is based on a bit of statistical algebra that allows us to predict how many observations we'll need to find statistically significant differences between the variables."
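For readers curious what such a setup might look like, here is a minimal Python sketch of the two pieces Aaron describes: a random template generator over the three on/off variables, and a sample-size estimate. It is an illustration under stated assumptions, not the researchers' actual code. The template names, the function names, and the choice of a standard two-proportion z-test formula are all hypothetical; the article does not say which statistical method or naming scheme the team used.

```python
import itertools
import math
import random
from statistics import NormalDist

# The three on/off variables named in the article; the labels are illustrative.
FACTORS = ("personalized", "teaching", "image")

# All 2 x 2 x 2 = 8 combinations of the three variables.
CONDITIONS = [dict(zip(FACTORS, combo))
              for combo in itertools.product((False, True), repeat=len(FACTORS))]


def template_name(condition):
    """Build a hypothetical template name, e.g. 'huggle-warn-1-personalized-image'."""
    active = [factor for factor in FACTORS if condition[factor]]
    return "-".join(["huggle-warn-1"] + (active or ["default"]))


def pick_template(rng=random):
    """Choose one of the eight conditions uniformly at random."""
    return template_name(rng.choice(CONDITIONS))


def sample_size_per_group(p1, p2, alpha=0.05, power=0.8):
    """Observations needed per group to tell rate p1 from rate p2.

    Uses the textbook formula for a two-sided, two-proportion z-test.
    This is an assumed method, chosen only for illustration.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)
```

Calling `pick_template()` each time a first-level warning is issued would spread the eight variants evenly across new editors, and `sample_size_per_group(0.10, 0.15)` would estimate how many warnings each variant needs before a shift from a (made-up) 10% to 15% retention rate becomes detectable. The smaller the expected difference, the more observations are required.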
The Huggle experiment is not the first to investigate the interactions of patrollers and new page creators. In the 2009 community-led Newbie treatment at Criteria for speedy deletion experiment (Signpost coverage), experienced editors (one Signpost interviewer included) posed as inexperienced article creators to look into how new contributors are treated in the patrolling process. The experiment attracted significant controversy due to ethical concerns surrounding the lack of informed consent of the participants. Steven says that before the experiment they posted a public notice at the Village Pump. "We also spoke directly with the main Huggle developers over email, IRC, and on-wiki (Addshore, Gurch and other volunteer developers deserve a lot of credit here; we couldn't have done this without their help and consent beforehand). I should probably point out that we felt pretty confident about this experiment because Stuart is a prolific Huggler himself. Even if we had no volunteer editing experience as a team, I think the key difference between this and the treatment experiment you referred to is that we've been transparent about our actions before going forward with it."
Aaron points out that Huggle users come across hundreds of potential editors every day, and a surprising proportion of these editors are testing whether they can, in fact, edit Wikipedia by damaging an article. "We suspect that the reaction these potential editors receive affects whether they'll register an account and try contributing productively. We hypothesized that the tone of the welcome/warning message could be an important factor in this decision. We have Hugglers testing a few variations of the 1st-level warning message to find out if we're right."
So can we expect more experiments like this in the future? Steven says he could probably do an entire Signpost report just on this topic. "But let me give it a quick shot by saying that the recently released Annual Plan (see link above) will give you a very good idea of what direction we're focusing on, as well as activity on mediawiki.org, the tech portion of blog.wikimedia.org, and the impending software deployments page. We try to make sure to push a message locally here when new experiments with features or anything else are happening, but those three places are where to look if you're interested in these topics in the future."
Discuss this story
"Fewer than half of the newbies investigated received a response from a real person during their first 30 days". I think we really dropped the ball here. Interaction is a major way to recruit newbies and hopefully turn them into "regulars". OhanaUnitedTalk page 05:18, 2 August 2011 (UTC)[reply]
Perhaps two critical concerns will govern the efficiency with which the problem can be addressed: (i) how long into a newbie's edit-history the patterns become clear, and (ii) the extent to which they can be identified by a bot (including whether a bot could do the initial "easy" filtering and pass a minority on to human eyes for higher-level sorting to identify the promising newbie-pluses for human interaction – a three-tiered filtering, as it were). Of particular interest might be the grey area of newbies – not those who will clearly stay and those who clearly won't (or who we clearly do or don't want to stay), but those where final stage, human interaction, has a reasonable likelihood of making the difference, of bringing them over the line. Finding the best bot/human mechanism for rationing the supply of "newbie mentors" to this prioritised editorial demographic, IMO, is the challenge. After that, a future project could work on developing guidelines for the best ways in which to interact with newbie-pluses. Tony (talk) 02:41, 3 August 2011 (UTC)