The Swedish Wikipedia's prolific Lsjbot, which has created a significant proportion of the site's 1.7 million articles and has nearly single-handedly pushed it to being the fourth-largest Wikipedia, was covered in the Wall Street Journal this week.
In its front-page article, the US newspaper reported that the bot has created 2.7 million articles, a figure that apparently includes its output on the Waray-Waray and Cebuano Wikipedias (where Lsjbot is also active), and that "on a good day", it creates 10,000 articles.
The Wall Street Journal's article comes as the Cebuano Wikipedia has become the twelfth Wikipedia to cross the million-article mark, almost entirely on the strength of these formulaic articles. Of the twelve, five (over 40%) have received significant help from automated article creation scripts: Swedish, Waray-Waray, Cebuano, Vietnamese, and Dutch. Vietnamese has the highest depth of these five, at 18; Swedish follows with 11, and the others are all under ten. By comparison, the German Wikipedia has a depth of 90.
Bot-created articles have proved controversial among Wikimedians. By way of comment, German Wikipedian Achim Raschka pointed the Signpost to an entry Denis Diderot wrote for the Encyclopédie, titled "Aguaxima". Diderot lamented that all they knew about the Aguaxima was that it was a plant in Brazil, yet he still had to describe it: "If all the same I mention this plant here, along with several others that are described just as poorly, then it is out of consideration for certain readers who prefer to find nothing in a dictionary article or even to find something stupid than to find no article at all."
Disagreement with these edits even led to a proposal last year that would have banned the overuse of bot-created articles on Wikimedia projects.
Still, these are not the first Wikipedias to use bots to augment human article creators: in 2007, Volapük and Lombard were expanded by over 100,000 bot articles each, and Tagalog saw a similar rise. Lombard editors later placed a moratorium on new automated articles and deleted most of the existing ones; the Lombard Wikipedia currently has around 31,000 articles. Volapük is hovering around 120,000, and the Tagalog Wikipedia has close to 63,000.
Waray-Waray, Cebuano, and Tagalog are three of the largest languages of the Philippines. Volapük is a 19th-century constructed language from Germany, and Lombard is a Romance language from northern Italy. Vietnamese is spoken primarily in Vietnam, while Dutch is spoken in the Netherlands, Belgium, and Suriname.