Checking that everything's all right
This week, the Signpost decided to have a look around WikiProject Check Wikipedia, a maintenance project concerned less with articles' content than with the tiny errors scattered within them. Their front page gives a list of things they mainly focus on:
“Checkwiki helps clean up syntax and other errors in the source code of the Wikipedia by finding problems such as:
Eliminate errors in wiki syntax, such as a missing close tags or brackets;
Check for accessibility issues, such as small print or heading problems;
Correct, delete or move code that does not follow conventions, such as the position of references with respect to punctuation;”
It's one of those essential but hidden projects, home to a number of WP:WikiGnomes who quietly come around with their tools and fix those small mistakes. It might not get much attention, but this week we thought we'd feature the project here so you can take a look at its work and, who knows, even give its members a hand. So we spoke to Bgwhite and Magioladitis to get their inside perspective on WikiProject Check Wikipedia.
What is Check Wikipedia?
Bgwhite: Check Wikipedia finds common syntactical errors. Errors range from missing closing tags and missing or extra brackets to accessibility issues and MOS issues. Errors are found by scanning the twice-monthly dumps of 47 different Wikipedias and the monthly dump of English Wikipedia. Daily scans are also performed on recently changed articles for five different Wikipedias, including English Wikipedia. A whitelist is used so false positives do not show up as errors. Each Wikipedia has its own configuration file. With the configuration file, errors can be turned off or on, errors can be assigned different priority values and each Wikipedia can have its own whitelist.
There is a web site that shows the errors and where in the article the error is found. One can manually edit the article or use the autoFormatter script to fix the error. One can also load up the errors inside the tools AWB and WPCleaner.
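To make the roles of the configuration file and the whitelist concrete, here is a minimal sketch in Python. It is not Checkwiki's actual code or file format: the names `ErrorSetting`, `WikiConfig` and `filter_reports` are hypothetical, and it assumes that the whitelist is kept per error and that a lower priority number means the error is listed first.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorSetting:
    """Hypothetical per-error settings for one Wikipedia."""
    enabled: bool = True          # errors can be turned off or on
    priority: int = 1             # assumption: lower number = listed first
    whitelist: set = field(default_factory=set)  # titles known to be false positives


@dataclass
class WikiConfig:
    """Hypothetical stand-in for one Wikipedia's configuration file."""
    settings: dict = field(default_factory=dict)  # error id -> ErrorSetting

    def should_report(self, error_id, title):
        setting = self.settings.get(error_id)
        # Unknown or disabled errors are never reported.
        if setting is None or not setting.enabled:
            return False
        # Whitelisted pages are treated as false positives and skipped.
        return title not in setting.whitelist


def filter_reports(config, detections):
    """Keep reportable (error_id, title) pairs, ordered by priority, then title."""
    kept = [(e, t) for e, t in detections if config.should_report(e, t)]
    return sorted(kept, key=lambda d: (config.settings[d[0]].priority, d[1]))
```

In this sketch, a page whitelisted for one error can still be reported for another, and disabling an error in one Wikipedia's configuration has no effect on any other Wikipedia.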
Tell us about the history of WikiProject Check Wikipedia and how it came to be.
Bgwhite: Stefan Kühn and a few others started the project on dewiki around September 2008. It was inspired by WikiProject Wiki Syntax, which stopped in 2006. The project was expanded to other Wikipedias and received its last major update in 2010. Until 2013, the project stagnated and checks against the dumps were made only infrequently. Because of that, I started running the programs on my home computer in 2012, and I also started making fixes. Check Wikipedia was ported over to WMF Labs in late summer of 2013. Since then, several detected errors have been removed, new ones added, a whitelist function added and a ton of fixes or tweaks done.
How much of the time you spend on Wikipedia is used "checking" Wikipedia for errors? Is it time-consuming?
Bgwhite: I usually spend 3-4 hours a day fixing new errors found that day. Any "free" time is spent cleaning out a backlog in an error or doing some coding.
Magioladitis: I spend about the same amount of time. And there is extra time of course for fixing bugs in the automated tools, the scripts, etc. If we distribute the work better in the future and more editors get involved, I hope I'll need to spend less time on fixing and more time on improving the automated tool.
What motivated you to join WikiProject Check Wikipedia? Do you find yourself to be a bit of a "grammar nazi" or perfectionist, or maybe you feel it is the best way you can contribute if writing content isn't your forte?
Bgwhite: My background is in math and computer science. Anyone who has ever read a talk message of mine knows that grammar and spelling are something I rarely use. I started out writing articles. I shifted to editing new biography articles, mostly correcting mistakes. Slowly over time I went from correcting new articles to correcting mistakes via Checkwiki.
Magioladitis: I started my wikilife by fixing red links in 2005-2006 as a member of Wikipedia:WikiProject Red Link Recovery. Then I shifted to fixing Wikipedia:Disambiguation pages with links and later to adding {{blp}} banners on talk pages of biography articles. I am one of the developers of WP:AutoWikiBrowser, which fixes dozens of common syntax errors. When I discovered WikiProject Check Wikipedia I realised it was the perfect ground for me to test AWB's code and improve it. I am not a native English speaker and sometimes I do not feel comfortable writing large portions of text. I enjoy small edits and I certainly like to imagine Wikipedia as a huge encyclopaedia in which all articles follow a certain style. This is how printed books are and this is how I would like Wikipedia to be.
Is most of your scrutinizing of Wikipedia done manually or with a semi-automated script, tool or bot? Do you find these to be accurate and useful, or do they sometimes make mistakes?
Bgwhite: It is done by manual editing and by bot. During the past year, the AWB and WPCleaner developers have worked with us to find and fix more errors. Both these tools can correct many errors in bot mode. TMg has been updating his wonderful script, autoFormatter. Both AWB and WPCleaner are integrated with Checkwiki. WPCleaner and Checkwiki both use the same configuration files. These three tools are an integral part of Checkwiki. A listing is available that shows each error and what each tool can fix automatically or find.
What is the most common problem you have to fix?
Bgwhite: There is really no common problem. What I find interesting is what makes up some of the errors. For example, Error #8 is headlines that don't end with an equals sign. 80% of these errors are a result of vandalism. Error #58 is headlines that are all CAPS. 80% of these errors are found on India-related articles. Upward of 25% of the broken bracket errors are related to vandalism.
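The two headline errors mentioned above can be illustrated with a short Python sketch. This is a simplified illustration, not Checkwiki's actual rules: the function name `check_headings` is hypothetical, and it assumes a headline is any line that begins with an equals sign, as in "== History ==".

```python
def check_headings(wikitext):
    """Return (error_id, line_number, line) for two simplified headline checks."""
    findings = []
    for line_no, line in enumerate(wikitext.splitlines(), start=1):
        stripped = line.rstrip()
        # Assumption: only lines beginning with "=" are treated as headlines.
        if not stripped.startswith("="):
            continue
        if not stripped.endswith("="):
            # Error #8: the headline does not end with an equals sign.
            findings.append((8, line_no, stripped))
            continue
        title = stripped.strip("=").strip()
        # Error #58: every cased letter in the headline is upper case.
        if title.isupper():
            findings.append((58, line_no, stripped))
    return findings
```

A legitimate acronym headline such as "== NASA ==" would also be flagged by this simplified check, which is the kind of false positive a whitelist is meant to absorb.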
What is your inter-WikiProject relationship with the other Active Wiki Fixup Projects, and do the projects work collaboratively to fix a common problem? How many of your participants are also members of another cleanup WikiProject?
How can the average Wikipedian come to help you out; what is a good way for them to start assisting?
Magioladitis: We certainly need more Wikipedians to get involved, regardless of whether they use automated tools or not. We provide daily lists of pages with errors and everyone is invited to contribute. Pages with ISBN errors have a huge backlog and can't be done via automated tools. Moreover, we certainly prefer manual editing to bot editing, since many errors involve checking the edit history. We need editors to have a closer look at pages and check whether the error reported is the result of vandalism or not.
Is there anything else you would like to add to the interview?
Magioladitis: Checking Wikipedia for syntax fixes is a great task for many editors. It involves reading many different articles, counter-vandalism (counter-vandals are the superheroes of Wikipedia), improving the rendering time of pages and much more. It's the perfect task for WikiGnomes. It can be beneficial for Visual Editor and automated tools. It's a nice way to get more involved in Wikipedia. Try it!
Next week, this report will chat to the editors of Scotland just before their vote for or against political independence. In the meantime, check out past reports in the archive.