Community-elected Wikimedia board member Kat Walsh is a copyright lawyer and free-culture advocate. This month she joined the San Francisco–based non-profit Creative Commons (CC) as an attorney. CC is devoted to expanding the range of creative works available for others to build on legally and to share, and has released several copyright licenses free of charge to the public.
Creative Commons (CC) is currently working on version 4.0 of its suite of copyright licenses, which include the CC-BY-SA and CC-BY licenses used by the Wikimedia projects. Wikimedia adopted BY-SA-3.0 in 2009, and we hope that the 4.0 version will be superior for all license users, including Wikimedia. But to meet its goals, CC needs your input into the revision process.
Background and goals of the 4.0 process
The CC wiki lists five ambitious goals for the revision:
Internationalization, by further adapting the core suite of international licenses to operate globally, ensuring they are robust, enforceable and easily adopted worldwide;
Interoperability, by maximizing interoperability between CC licenses and other licenses to reduce friction within the commons, promote standards, and stem license proliferation;
Longevity, by anticipating new and changing adoption opportunities and legal challenges, which will allow the new suite of licenses to endure for the foreseeable future;
Focus on data, public-sector information (PSI), science, and education, by identifying and addressing impediments to the adoption of CC by governments and other institutions in these and other critical arenas; and
Support for existing adoption models and frameworks, by accommodating the needs of our existing community of adopters leveraging pre-4.0 licenses, including governments and other important constituencies.
Wikipedia was launched in January 2001, almost two years before CC published its first licenses. All Wikipedias were initially licensed under the GFDL, a Free Software Foundation (FSF) license intended for software documentation; the main advantage was its "copyleft" terms, which allow any user to reuse and remix GFDL works as long as the result is shared under the same license.
But before Wikipedia, GFDL had not been widely used for cultural works outside the realm of free software, and some of its requirements weren't well-suited for the uses people were making of freely licensed content. Other licenses existed, but were incompatible with the GFDL and with each other.
Meanwhile, CC quickly rose to prominence, gaining wide adoption among communities of creators, including other wiki projects such as Wikitravel and WikiEducator. Many Wikipedia users were already choosing to dual-license their contributions under both GFDL and one or more of the CC licenses (Wikinews was already using the non-copyleft CC-BY license). Wikimedia worked with CC and the FSF to bring the two licenses into closer harmony, ultimately leading to the release of GFDL version 1.3, which allowed collaborative works licensed under it to be relicensed under CC-BY-SA. Wikimedia held a successful community referendum on adopting 1.3, and began dual licensing with the CC-BY-SA-3.0 in June 2009.
CC published the 3.0 license suite in early 2007. Over the past five years, those licenses have been widely used for works that are free to share without all of the restrictions of standard copyright. They've been adopted by cultural institutions, national and local governments, media-hosting websites, educational projects, and popular artists. Wikimedia is one of the largest and most prominent users, and its community's goal of free and open sharing of knowledge is closely aligned with CC's, so the needs of the Wikimedia communities are an important consideration for CC.
In the past several years, use by the Wikimedia communities and others has revealed opportunities for improvement. For example, the specific requirements for attribution have proved difficult to follow, even for the most diligent, good-faith reusers. Many users have been concerned that the licenses don't adequately address database rights, moral rights, and copyright-like rights, to ensure they create the right expectations for both licensors and reusers. And while CC licenses have been officially "ported" to many jurisdictions to make them more closely aligned with local laws, the international (formerly "unported") license is in wide use globally; to make it as good a legal tool as possible for a worldwide community of users, it needs revision to better address the legal requirements of all national jurisdictions.
All of this is done keeping in mind the need to be responsible stewards of the licenses, and that the new versions need to continue to uphold the expectations of those using them to extend the commons. CC has been actively consulting with organizations such as Wikimedia, the Free Software Foundation, and the Open Knowledge Foundation to ensure that changes to the licenses don't inadvertently harm the freedoms those licenses are intended to protect in the first place.
CC general counsel Diane Peters explained the goals in more detail in her blog post following last year's CC Global Summit.
Your help is needed
To achieve these ends, the CC community is currently discussing several open questions on its mailing lists (community and licenses) and wiki. Many members of the Wikimedia communities have already contributed to those discussions, including individual volunteers and Wikimedians who are part of CC's international affiliate teams. The first public draft is now open for comment and discussion. Throughout the drafting process, CC will make more focused calls for input, asking specific questions. (The most recent call was five open questions on attribution here.)
Wikimedia has already been involved in the drafting process. I attended the CC Global Summit last September on behalf of Wikimedia and began talking to the CC legal team about the variety of issues Wikimedia faces with licenses. Wikimedia's Legal and Community Advocacy team (especially legal counsel Michelle Paulson) has been giving input on the process since the announcement in September.
But for the licenses to be suitable for diverse uses, it takes more than just a few heads coming together. Copyright mavens outside the US are especially needed to look at jurisdiction-specific issues to ensure the licenses are valid worldwide.
Many of the open questions depend on knowledge of a wide range of community practices. Do you work with print reusers, GLAMs or other national institutions, or mirrors and forks of Wikimedia content? Do you handle photo submission requests, or use freely-licensed photos in MediaWiki skins? Every volunteer has a particular area of expertise that is difficult for others to know about without your help. Where do you see the greatest opportunities for improvement in the licenses, to best encourage sharing and reuse?
Even if you're not a licensing expert, you can help by sharing the calls for comment with parts of the community who would be interested and haven't seen it yet, and by translating the calls for information and posting them on your language's community forums.
4.0 process timeline
According to the draft timeline, the second draft will be published next month, with another comment period before the third draft in September; by that stage, the process should be nearly complete. Final comments will be taken after the third draft, and if all goes as scheduled, the final draft of the licenses will be published sometime around December 2012. (The earlier that proposed changes are discussed, the more likely it is that they can be addressed and potentially included!)
After the final revision is published, Wikimedia will begin a process of deciding whether to adopt the later version of the CC-BY-SA license as the primary license for its projects. With board, staff, and community input from the earliest stages, we hope this will be a smooth process, and that potential problems will be raised and discussed well before the final draft is published.
New connections
By taking a legal counsel job with CC, joining its small legal team, I'm thrilled to have the chance to work on these issues full-time. The most frequent question put to me about the job is "will you have to leave Wikimedia?" I'm happy to say that the answer is no. Instead, I'm looking forward to using my knowledge of Wikimedia and its legal and strategic challenges to help CC achieve its goals of creating infrastructure for sharing knowledge and culture.
One challenge I'll have is being clear who I'm speaking for when talking about licensing. (Here, I have my Wikimedia hat on!) I'll also recuse myself from board decisions involving CC and CC licensing. But in practical terms, I'm hoping to face very few actual conflicts: one of the most rewarding things about being part of Wikimedia is that I think that Wikimedia's goals really do serve the public interest, and I think the same of CC. This licensing process is intended to be the last revision for a long while; what is at stake is powerful long-term effects on the ability to share and reuse material in the commons all over the world.
The Commons Picture of the Year Committee has just announced the winner of the Sixth Annual Wikimedia Commons POTY Contest: Lake Bondhus Norway 2862, shot by German Wikipedian Heinrich Pniok using a Canon EOS 5D Mark II with 24 mm focal length, then digitally retouched. Known as User:Alchemist-hp on WMF projects, Heinrich is a familiar participant at the featured-picture processes on Commons and the English and German Wikipedias, and has gifted to us an array of fine pictures of the chemical elements, inorganic compounds, minerals, insects and other animals, plants, landscapes, and places.
Heinrich told the Signpost he made the picture from three single images with different exposures to produce a more realistic dynamic. "The eyes can see better than the best camera," he says, "but not if I can use good software to achieve a similar dynamic view. I tested a lot of different software to be able to produce pictures like Lake Bondhus. Photomatix Pro is my favoured tool for making HDR/tone mapping, or put simply, images in which you blend different exposures." Ironically, Heinrich's capturing of how the unusual scene appeared to his eyes – by the use of varying focus throughout the image and by digital retouching – led to a few opposes among many positive reviewers at the Commons featured-picture nomination page. Reviewer George Chernilevsky commented that the effect is "mystical", to which Heinrich replied that the place itself was mystical (not just his image of it).
Heinrich told us that on 23 July last year he and his wife went on "a two and a half hour walking tour along a road about 50 km southeast of Bergen, Norway's second-largest city. We had mixed weather that day, both sunny and rainy. When we arrived at this place we were very happy and surprised to find such a beautiful scene: dreamlike and mystical, with a fantastic light." (Zoomable Google Map.)
With 143 votes, Lake Bondhus was the stand-out over editors' second and third choices, with 118 and 57 votes respectively. Why was it so popular? One Commons editor gave this explanation: "Take a look at the composition: the glacier angle reversing into the angle of the boat; the seemingly random scattering of the rocks in the water, counterposed with the rocky foreground; the rather elegant line of posts; the binary reflection of clouds, rocks, and mountains, and the variety of textures. The most striking aspect is the serenity of the boat and the water versus the ragged clouds that seem to impinge on the scene." Lake Bondhus is a featured picture on the German Wikipedia, and appeared on the main page of Commons on 15 May.
The people's second choice was a self-portrait by NASA flight engineer Tracy Caldwell Dyson in the Cupola module of the International Space Station during Expedition 24, taken 11 September 2010 using a NIKON D2X with 16 mm focal length, from a distance of just over a metre. The image won high praise from reviewers at the English Wikipedia's featured-picture nomination page, despite a few queries about EV (encyclopedic value). The photographer has completed three spacewalks, is a private pilot and a former track-and-field athlete, and surprisingly, is lead vocalist for the all-astronaut band Max Q.
The third choice was an image by francophone Belgian Wikimedian Luc Viatour, whose photography comprises a stunning variety of subjects, from the astronomical to landscapes, wildlife, and buildings – amply demonstrated in Luc's gallery of his Commons featured pictures. Cueva de los Verdes (Spanish for the Verdes' cave, named after the former owners, the Verdes family) is a lava tube and tourist attraction in the Canary Islands, off the west coast of Africa. The cave was created around 3,000 years ago by lava flows from the nearby volcano Monte Corona, flowing across the Malpaís de la Corona toward the sea. When the lava drained away, the solidified upper part remained to form the roof of the caves, which extend for 7.5 kilometres. In earlier centuries, islanders hid in this cave to protect themselves from pirates and slave raiders. Luc's images have been finalists in the competition for five years in a row. He told the Signpost he took the photo during his vacation on the island in 2011. "I used a Nikon D3s, 14–24 mm 2,8, tripod. The water you see was fresh, and with the artificial lighting gave a beautiful reflection of the cave ceiling."
Wikimania 2012
Wikimania, the annual international Wikimedia community conference, will be held in Washington DC on 12–14 July. This will be the first time since Boston in 2006 that the conference has been held in the US.
The pre-program will start with Wikimania Takes Manhattan, 6–9 July, and will come to a local peak in the next wave of New York's Wiknic event, the Wiki World's Fair on 7 July on Governors Island in New York Harbor.
On 10–11 July, MediaWiki hackers, Toolserver users, gadget developers and others will meet for the annual Wikimania Hackathon and will revisit issues they looked at during their Berlin meeting earlier this month. Alongside the technology event, the Ada Initiative will host a camp to promote women’s participation in open technology, and the Wikimedia chapters will meet to finally work out the basics of their new umbrella organization, the Wikimedia Chapters Association. On the eve of the main event, Google will host a reception of its own.
During the four-day conference the schedule will cover a wide range of issues, sorted in thematic categories; these will include chapters, education, GLAM, and technology and infrastructure. In addition to the main schedule, the National Archives and other local institutions will offer tours, and Wikimedians will meet with library representatives to work on collaborative outreach projects (Wiki loves libraries).
On 15 July, an unconference will take place and the WMF US education program working group will look at how to reform collaborative projects with US universities. Online registration is open until 23:59 EDT, 4 July; on-site registration will be available.
Ten proposals have recently been published to overhaul the RfA process, most of them focusing on procedural rather than technical remedies. Under consideration are expert committees, empowered to select administrators in place of the current polling method, and remodelling the RfA process by adding additional stages or dividing the process into two stages.
On 24 June, Jc37 proposed a technical solution: the creation of a new user group. He pointed out that a new set of user rights to support content-related admin activities would reduce backlogs in areas such as AfD and CfD. Tools related to the management of user behaviour, like blocking and protecting, would not be part of this package.
Excising these tools, so the argument goes, could turn down the volume in RfAs so that candidates are assessed on the merits of their "understanding of how to determine consensus in discussions, various content-related policies and guidelines, and also on the trust requisite with only the particular tools they would be receiving". The proposal prompted wide-ranging discussions, and commands some support.
The ongoing debates are listed here for users interested in contributing.
Brief notes
Offline outreach in French-speaking Africa: Wikimedia France has announced a new project, called Afripedia, to expand offline outreach in the French language in Africa. The chapter will cooperate with the Agence universitaire de la Francophonie, Kiwix, and the Institut français to implement offline versions of Wikipedia at 20 points in 15 West African countries by the northern autumn of 2012.
Wiktionary app for Android: The WMF has published a mobile app for Wiktionary on Android. As the foundation reports in its blog, Wiktionary is available in more than 150 language versions.
Article feedback tool: The fifth version of the article feedback tool will be expanded from 0.6% to 10.0% of articles by 3 July. Central notice information for the release will soon be available.
Social media outreach in India: In the WMF's blog, the foundation's Indian pilot program has reported on its work in Facebook and other social media to reach out to new editors for the Indian language projects.
Wales slams content industry's role in copyright-related extradition
Jimmy Wales has called on the United Kingdom's Home Secretary, Theresa May, to stop the extradition of Richard O'Dwyer to the United States for his alleged breach of American copyright law.
O'Dwyer is being charged by the American federal government with criminal copyright infringement related to his former websites TVShack.net and TVShack.cc. The prosecutors allege that he was "involved in the illegal distribution of copyrighted movies and television programs over the Internet". As O'Dwyer resides in the United Kingdom, the United States' Justice Department asked for his extradition in May 2011 under the UK's Extradition Act 2003. The case resides in murky legal ground, however; O'Dwyer's defense team argues that American laws should not apply to a website hosted in the UK. They also argued that his TVShack websites "simply provided a link" to the content, rather than actually hosting and curating the offending material—essentially, they believe that the site functioned as an online service provider as envisioned under the American 1998 Digital Millennium Copyright Act.
Describing O'Dwyer as a "clean-cut, geeky kid" and "precisely the kind of person one can imagine launching the next big thing on the internet", Wales sees O'Dwyer's fight against extradition as another battle between the large television/film industry (Wales' "content industry") and the wider public. Previous battles included the popular movement against two proposed American laws, the PROTECT IP Act and the Stop Online Piracy Act, also known as PIPA and SOPA, respectively. Actions taken to protest the bills included the blackout of several major websites, including Wikipedia, on 18 January 2012 (see previous Signpost coverage: 16 January, 23 January). Wales called O'Dwyer the "human face" of this war, and warned that "if he's extradited and convicted, he will bear the human cost." (more information in the Guardian; Wales' change.org petition)
Is Wikipedia politically biased?
On 18 June, the Washington Post reported on a study by Northwestern University's Shane Greenstein and the University of Southern California's Feng Zhu, "Collective Intelligence and Neutral Point of View: The Case of Wikipedia", which examined the viability of Linus' Law ("Given enough eyeballs, all bugs are shallow") through the case study of Wikipedia articles on American federal politics. They chose the topic because it would be an area "where Linus' Law would face challenges due to the presence of controversial topics and lack of verified and/or lack of objective information."
The Post claimed that the results showed that "only a handful of [Wikipedia articles] were politically neutral," though the study's authors were positive in their belief that "Wikipedia's entries lack much slant and contain less bias than observed earlier." The pair came to this conclusion by analyzing a decade's worth of Wikipedia articles on American politics. The study noted that while a large number of users sought to remove bias from the articles, most articles receive little attention from most users and, more often than not, they retain their political bias, which will often be that of the original contributor. (See also the review of an earlier version of the paper in the Signpost's "Recent research" section: "Given enough eyeballs, do articles become neutral?") Whatever the reason, if these accusations are true then Wikipedia is breaking its own commitment to a neutral point-of-view.
The pair determined the political slant of articles using an index that measures how often each of one thousand phrases is used. These were taken from all of the remarks made by both Democrats and Republicans, the two main American political parties, in 2005. Essentially, the index uses the logic that an article written from a Democrat's point of view will include phrases like 'civil rights' and 'trade deficit' more often, as opposed to an article with a Republican bias, which would have 'economic growth' and 'illegal immigration'. However, the Post notes that "the vocabulary of partisans has doubtlessly shifted somewhat since 2005."
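The logic of such a phrase-frequency index can be illustrated with a minimal sketch. This is not the authors' actual index: the two phrase lists below contain only the four examples quoted above (the study used about one thousand), and the function names and the scoring formula are assumptions chosen for illustration.

```python
import re

# Illustrative phrase lists only: the four examples quoted in the article.
# The real index used about one thousand phrases taken from remarks made by
# Democrats and Republicans in 2005; those phrases are not reproduced here.
DEMOCRAT_PHRASES = ["civil rights", "trade deficit"]
REPUBLICAN_PHRASES = ["economic growth", "illegal immigration"]


def count_phrase(text, phrase):
    """Count case-insensitive occurrences of a phrase on word boundaries."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def slant_score(text):
    """Hypothetical score in [-1, 1]: negative leans Democrat, positive leans
    Republican. Returns None if the text contains none of the phrases."""
    dem = sum(count_phrase(text, p) for p in DEMOCRAT_PHRASES)
    rep = sum(count_phrase(text, p) for p in REPUBLICAN_PHRASES)
    total = dem + rep
    if total == 0:
        return None
    return (rep - dem) / total
```

A score near zero under this sketch only means that phrases from both lists appear about equally often; as the Post's caveat suggests, a phrase list fixed in 2005 may misjudge text written in a later vocabulary.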
It is not just recently that accusations have been made of Wikipedia being politically biased. Early versions of Wikipedia were seen as very liberal, while in 2006, the American PBS (Public Broadcasting Service) ran an article stating that, according to conservative blogger Robert Cox of the National Debate, Wikipedia had 'a liberal bias in many hot-button topic entries'. Jimmy Wales replied that this was thanks to Wikipedia's global community, and this tendency was natural when the "international community of English speakers is slightly more liberal than the U.S. population." When asked if he felt this affected the site's goal, he said that "the idea that neutrality can only be achieved if we have some exact demographic matchup to United States of America is preposterous" and that Wikipedia should have a view that would be interpreted as neutral worldwide, not just in the US. (see previous Signpost coverage; more information from PBS)
It should be noted that many of these posts originate from American sources regarding articles on American politics—yet the US political system is much more conservative than that of other English-speaking countries. For example, the national health service supported by all major parties in countries such as the UK and Canada has faced vociferous opposition in the US. Therefore, what may seem neutral in some countries could seem left-wing in the US.
In brief
Anita Sarkeesian's Wikipedia biography vandalized: Several news sites reported on the repeated vandalism of Anita Sarkeesian's Wikipedia article after she launched a Kickstarter campaign to raise money for producing videos analyzing how women are portrayed in video games. The vandalism was part of a broader campaign to attack Sarkeesian due to her criticism of the video game industry. (Wired; Jezebel op-ed)
Edit war patterns, deleters vs. the 1%, never used cleanup tags, authorship inequality, higher quality from central users, and mapping the wikimediasphere
"Dynamics of Conflicts in Wikipedia"[1] develops an interesting "measure of controversiality", something that might be of interest to editors at large if it were a more widely popularized and dynamically updated statistic. The paper analyzes patterns of edit warring over Wikipedia articles. The authors conclude that edit warriors are usually willing to reach consensus, and that the rare cases of never-ending warring are those that continually attract new editors who have not yet joined the consensus.
The authors' decision to exclude from the study articles with under 100 edits because they are "evidently conflict-free" is questionable. Articles with fewer than 100 edits have been subject to clear, if not overly long, edit warring. A recent example is Concerns and controversies related to UEFA Euro 2012. It is also unfortunate that "memory effects" – a term mentioned only in the abstract and lead, and which the authors suggest is significant in understanding the conflict dynamic – is not explained in the article. The term "memory", by itself, appears four times in the body, but is not operationalized anywhere.
In a recent blog post by Wibidata, an analytics startup based in San Francisco, the authors set out to shed light on the often-quoted claim that most of Wikipedia was written by a small number of editors, noting other editorial patterns along the way.[2] Using the entire revision history of English Wikipedia (they wanted to show that their platform can scale), the authors looked at the distribution of edits across editor cohorts, grouped by number of total edits. They found that from a pure count perspective, the most active 1% of editors had contributed over 50% of the total edits. (see original plot here)
In response to the suggestion that the strongly skewed distribution of edits might just be due to a core set of editors who primarily make only minor formatting modifications, they looked at the net number of characters contributed by each editor. Grouping editors by total number of edits as before, they showed an even more strongly skewed distribution, with the top 1% contributing well over 100% of the total number of characters on Wikipedia (i.e. an amount of text that is larger than the current Wikipedia) and the bottom 95% of editors deleting more on average than they contributed (original plot). Next, the authors separated logged-in users from non-logged-in "users" (identified only by IP addresses) and recomputed the distribution of net character contributions. By edit-count cohort, logged-in users tended to contribute significantly more than their anonymous counterparts, and non-logged-in users tended to delete significantly more (original plot).
In summary, low-activity and new editors, along with anonymous users, tend to delete more than they contribute; this reinforces the notion that Wikipedia is largely the product of a small number of core editors.
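A share above 100% may look odd at first, but it follows directly from net contributions being allowed to go negative. The sketch below shows the arithmetic on made-up numbers; the function name, the input format, and the sample data are all assumptions for illustration and do not come from the Wibidata analysis.

```python
def top_cohort_shares(editors, fraction=0.01):
    """Compute the share of edits and of net characters contributed by the
    most active `fraction` of editors, ranked by edit count.

    `editors` is a list of (edit_count, net_chars) pairs, one per editor;
    net_chars is negative for an editor who deleted more than they added.
    """
    ranked = sorted(editors, key=lambda e: e[0], reverse=True)
    k = max(1, int(len(ranked) * fraction))
    top = ranked[:k]
    total_edits = sum(e[0] for e in ranked)
    total_chars = sum(e[1] for e in ranked)
    if total_edits <= 0 or total_chars <= 0:
        raise ValueError("totals must be positive to compute shares")
    edit_share = sum(e[0] for e in top) / total_edits
    char_share = sum(e[1] for e in top) / total_chars
    return edit_share, char_share


# Made-up data: one heavy contributor and 99 light editors who each
# removed slightly more text than they added.
sample = [(500, 12000)] + [(5, -20)] * 99
edit_share, char_share = top_cohort_shares(sample)
```

With this sample, the single most active editor accounts for just over half of the edits, while their share of net characters exceeds 1.0 because the other editors' negative totals shrink the denominator.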
Evaluating and predicting interlingual links in Wikipedia
In a paper published in the proceedings of *SEM, a computational semantics conference, researchers from the University of North Texas and Ohio University looked into the nature of interlingual links on Wikipedia, both reviewing the quality of existing links and exploring possibilities for automatic link discovery.[3] The researchers took the directed graph of interlingual links on Wikipedia and used the lens of set-theoretic operations to structure an evaluation of existing links and to build a system for automatic link creation. For example, they suggest that the properties of symmetry and transitivity should hold for the relation of interlingual linking. This means that if there is an interlingual link from language A to B, there should also be a link from B to A, and if there is a link from language A to B, and language B to C, then there should be a link from language A to C. (This assumption is routinely made by the many existing Interwiki bots.) They further refine the notion of transitivity, by grouping article pairs by the number of transitive 'hops' required to connect a candidate article pair.
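The symmetry and transitivity properties described above can be sketched as simple set operations over a link graph. This is only an illustration of the idea, not the researchers' system: the representation of an article as a (language, title) pair, the function names, and the restriction to a single transitive hop are assumptions.

```python
def missing_reverse_links(links):
    """Links implied by symmetry: for every A -> B, the absent B -> A."""
    return {(b, a) for (a, b) in links if (b, a) not in links}


def implied_transitive_links(links):
    """Links implied by one transitive hop: A -> B and B -> C suggest A -> C.

    Pairs within the same language are skipped, since they would not be
    interlingual links.
    """
    by_source = {}
    for a, b in links:
        by_source.setdefault(a, set()).add(b)
    implied = set()
    for a, b in links:
        for c in by_source.get(b, ()):
            if c[0] != a[0] and (a, c) not in links:
                implied.add((a, c))
    return implied


# Each article is a (language, title) pair; each link is (source, target).
en, de, fr = ("en", "Dog"), ("de", "Hund"), ("fr", "Chien")
links = {(en, de), (de, fr)}
reverse_candidates = missing_reverse_links(links)
transitive_candidates = implied_transitive_links(links)
```

The candidate sets produced this way are only suggestions; deciding which implied links are actually good translations is a separate evaluation step.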
Their methodology revolves around the creation of a sizeable annotated gold data set. Using these labels, they first evaluated the quality of existing links, finding that between one third and one half fail their criteria for legitimate translations. They then evaluated the quality of various implied links. For example, reverse links where they do not already exist satisfy their criteria for faithful translation only 68% of the time.
The gold data set was used to train a boosted decision-tree classifier for selecting good candidate pairs of articles. They used various network topology features to encode the information in interlingual links for a given topic and found that they can significantly beat the baseline, which uses only the presence of direct links (73.97% compared with 69.35% accuracy).
"Wikipedia Academy" preview
Various conference papers and posters from the upcoming "Wikipedia Academy" (hosted by the German Wikimedia chapter from June 29 to July 1 in Berlin) are already available online. A brief overview of those which are presenting new research about Wikipedia:
{{Citation needed}} more effective than {{unreferenced}}: "On the Evolution of Quality Flaws and the Effectiveness of Cleanup Tags in the English Wikipedia"[4] shows "that inline tags are more effective than tag boxes" in tagging article flaws so that they get remedied. The researchers also "reveal five cleanup tags that have not been used at all, and 15 cleanup tags that have been used less than once per year", recommending their deletion, and "ten cleanup tags that have been used, but the tagged flaws have never been fixed." Similar to a paper reviewed in the April issue of this report ("One in four of articles tagged as flawed, most often for verifiability issues"), they find that "the majority (71.62%) of the tagged articles have been tagged with a flaw that belongs to the flaw type Verifiability".
A paper titled "The Power of Wikipedia: Legitimacy and Territorial Control"[5] is "based on the experience of the projects WikiAfrica (2006-2012) and Share Your Knowledge (2011-2012)", and looks at various aspects of Wikipedia, Wikimedia chapters and the Foundation through the lens of "anthropological, african and post-colonial studies."
"Individual and Cultural Memories on Wikipedia and Wikia, Comparative Analysis"[6] looks at the coverage of the late British DJ John Peel on Wikipedia and Wikia, respectively, as well as the Wikipedia article about the 1980s.
An "Extended Abstract", "Latent Barriers in Wiki-based Collaborative Writing"[7] compares the collaborative process of "25 special-purpose wikis" (most of them hosted by Wikia) with that of the German Wikipedia. One observation of the work in progress is a "strong divide between extracts of Wikipedia (even if being reduced to single articles and their one-link neighborhoods) on the one hand and special purpose wikis on the other."
Two Brazilian authors will examine "the climate change controversy through 15 articles of Portuguese Wikipedia".[8] The paper contains various quantitative results about the edit history of these articles, some of them unsurprising ("A very strong positive correlation (0.994) was found between the number of edits and the number of editors of an article"). Using the framework of actor–network theory, the authors conclude that "the collaborative encyclopedia is enrolled as an ally for the mainstream science and becomes one of its spokespersons."
Historical infobox data: An article by four authors from Google Switzerland and the Spanish National University of Distance Education (UNED) observes[9] that "much research has been devoted to automatically building lexical resources, taxonomies, parallel corpora and structured knowledge from [Wikipedia]", often using the structured data present in infoboxes (which they say are present in "roughly half" of English Wikipedia articles). However, this research has so far used only snapshots representing the state of articles at a particular point in time, whereas their project embarked to extract "a wealth of historical information about the last decade ... encoded in its revision history." The resulting 5.5GB dataset, called "Wikipedia Historical Attributes Data (WHAD)", will be made freely available for download.
Better authorship detection, and measuring inequality: Two researchers from the University of Karlsruhe will present an algorithm[10] to detect which user wrote which part of a Wikipedia article. Similar to a new revert-detection algorithm presented in a recent paper co-authored by one of the present authors (see last month's issue: "New algorithm provides better revert detection"), one crucial part of the algorithm is to split the article's wikitext into paragraphs, analyzing them separately under the assumption "that most edits (if they are not vandalistic) change only a very minor part of an article’s content". Another part is calculating the cosine similarity of sentences that are not exactly identical. In the authors' own test, the new algorithm performed significantly better than the widely used WikiTrust/WikiPraise tool. Having determined the list of authors for an article revision and the size of each author's contribution, they then define a gini coefficient "as an inequality measure of authorship" (roughly, an article written by a single author will have coefficient 1, while one with equal contributions by a multitude of editors will have coefficient 0). They implement a tool called "WIKIGINI" to plot this coefficient over an article's history, and show a few examples to demonstrate that it "may help to spot crucial events in the past evolution of an article". The paper starts out from the assumption "that the concentration of words to just a few authors can be an indicator for a lack of quality and/or neutrality in an article", but it does not (yet) contain a systematic attempt to correlate the gini coefficient and existing measures of article quality.
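The two building blocks mentioned above, cosine similarity between sentences and a Gini coefficient over per-author contribution sizes, can be sketched as follows. This is a minimal illustration under stated assumptions, not the WIKIGINI implementation: the bag-of-words tokenization, the function names, and the use of the standard (unnormalized) Gini formula are choices made here, since the paper's exact definitions are not given in this report.

```python
import math
from collections import Counter


def cosine_similarity(sentence_a, sentence_b):
    """Cosine similarity of two sentences as bags of lowercased words."""
    a = Counter(sentence_a.lower().split())
    b = Counter(sentence_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def gini(contributions):
    """Standard Gini coefficient of non-negative per-author contribution sizes.

    Returns 0.0 for perfectly equal contributions. When one of n authors
    wrote everything, the result is (n - 1) / n, which approaches 1 as n grows.
    """
    xs = sorted(contributions)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n
```

Note that under this standard formula a list containing a single author yields 0 rather than 1, so the rough characterization quoted above presumably relies on a different normalization or on counting all editors of the article, including those whose text did not survive.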
Troll research compared: A paper by a German Wikipedian titled "Here be Trolls: Motives, mechanisms and mythology of othering in the German Wikipedia community"[11] examines four academic texts about online trolls (only one of them in the context of Wikipedia), which "were compared regarding their scope, their theoretical approach, their methods and their findings concerning trolls and trolling".
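The two measures named in the authorship-detection summary above (cosine similarity between near-identical sentences, and a Gini coefficient over per-author contribution sizes) can be illustrated with a minimal Python sketch. This is not the Karlsruhe authors' implementation: the function names, the whitespace tokenization and the word-count vectors are assumptions made for illustration. The sole-author case is also an assumption: it is set to 1 to follow the rough description in the summary, whereas the standard Gini formula would give 0 for a single value.

```python
import math
from collections import Counter


def cosine_similarity(sentence_a, sentence_b):
    """Cosine similarity of two sentences, using word-count vectors.

    Tokenization by lowercasing and splitting on whitespace is an
    assumption for illustration, not the paper's preprocessing.
    """
    a = Counter(sentence_a.lower().split())
    b = Counter(sentence_b.lower().split())
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def gini(contributions):
    """Gini coefficient of per-author contribution sizes (e.g. word counts).

    0 means all authors contributed equally; values near 1 mean the text
    is concentrated in very few authors.
    """
    sizes = sorted(contributions)
    n = len(sizes)
    total = sum(sizes)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero contribution")
    if n == 1:
        # Assumption: a sole author counts as maximal concentration,
        # following the article's rough description.
        return 1.0
    # Standard rank-based formula over the ascending-sorted sizes.
    weighted = sum(rank * size for rank, size in enumerate(sizes, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n
```

Calling `gini` on the contribution sizes of each successive revision would give the kind of time series that a tool like WIKIGINI plots. With the standard formula, one author holding all the text among n listed authors yields (n-1)/n, which approaches 1 as n grows.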
Posters
"Self-organization and emergence in peer production: editing 'Biographies of living persons' in Portuguese Wikipedia"[12]
"Biographical articles on Serbian Wikipedia and application of the extraction information on them"[13]
"Wikipedia article namespace – user interface now and a rhizomatic alternative"[14]
"Extensive Survey to Readers and Writers of Catalan Wikipedia: Use, Promotion, Perception and Motivation"[15]
Researcher Felipe Ortega blogged[16] about a new parser for Wikipedia dumps, to be integrated into "WikiDAT (Wikipedia Data Analysis Toolkit) ... a new integrated framework to facilitate the analysis of Wikipedia data using Python, MySQL and R. Following the pragmatic paradigm 'avoid reinventing the wheel', WikiDAT integrates some of the most efficient approaches for Wikipedia data analysis found in libre software code up to now", which will be featured in a workshop at the conference.
Special issue of "Digithum" on Wikipedia research
The open-access journal "Digithum" (subtitled "The Humanities in the Digital Era") has published a special issue containing five papers about Wikipedia from various disciplines, with a multilingual emphasis (including research about non-English Wikipedias, and Catalan and Spanish versions of the papers alongside the English versions):
Are articles about companies too negative?: A paper titled "Wikipedia’s Role in Reputation Management: An Analysis of the Best and Worst Companies in the United States"[17] looked at the English Wikipedia articles about the ten companies with the best and worst reputations according to the "Harris Reputation Quotient", a 2010 online survey about "perceptions for 60 of the most visible companies in America". Those 20 articles were coded, sentence by sentence, as positive, negative or neutral, and according to other "reputation attributes". Among the findings was that "the companies with the worst reputations had more negative content; they had, in fact, almost double the amount of negative content, although only slightly less positive content. Both types of companies had more negative than positive content. This indicates that even if a company is considered to have a good reputation, it is still very vulnerable to having its dirty laundry aired on Wikipedia." Another observation was that "emotional appeal is an attribute where both types of companies lacked content. It was rare for companies to have content about trust or feeling good, which only existed for the best companies" (an interesting question may be whether this is related to Wikipedia guidelines such as WP:PEACOCK). The paper appears at a time when many PR industry professionals in the US and UK argue that Wikipedia should allow them more control over the articles about their clients, and ends by highlighting the "importance of public relations professionals monitoring and requesting updates to Wikipedia articles about their companies". This conclusion resembles that of another recent study by one of the authors (DiStaso), which likewise concerned company articles and reached a somewhat controversial conclusion about their accuracy (see the April issue of this research report: Wikipedia in the eyes of PR professionals).
WordNets from Wikipedia: The second paper[18] describes "the state of the art in the use of Wikipedia for natural language processing tasks", including the researchers' own application of Wikipedia to build WordNet databases in Catalan and Spanish.
The Wikimedia movement as "wikimediasphere": The article "Panorama of the wikimediasphere"[19] gives an overview of the Wikimedia movement, proposing the term "wikimediasphere" to describe it, and explaining "the role of the communities of editors of each project and their autonomy with respect to each other and to the Wikimedia Foundation", which is seen as "the principal supplier of the technological infrastructure and also the principal instrument for obtaining economic and organisational resources". Its vision statement is presented as a summary of the aim that is "the ideological glue that binds all the players involved". The section about "the social and institutional dimension" of the sphere briefly covers the Foundation's governance and funding models, Wikimedia chapters and other recognized supporting organizations, and the various wikis and other online platforms that structure "the organisational activity": The Foundation wiki, Meta-wiki, Strategy wiki, Outreach wiki, the Wikimedia blog and the blogs of community members aggregated on Planet Wikimedia, mailing lists etc. Authored by a Wikimedian who is a member of both the Spanish chapter and the Catalan "Friends of Wikipedia" association, the paper is remarkably well-informed and up-to-date, e.g. incorporating the Board resolution on "Recognized Models of Affiliations" from the beginning of April, and various other recent events such as the English Wikipedia's SOPA/PIPA blackout. The abstract uses the term "WikiProjects" in a different sense from that common among English-speaking Wikimedians, possibly a translation error.
Truth and NPOV: The fourth article[20] by Nathaniel Tkacz (one of the organizers of the "Critical Point of View"/CPOV initiative that organized three conferences about Wikipedia in 2010, see Signpost interview) sets out to "show that Wikipedia has in fact two distinct relations to truth: one which is well known and forms the basis of existing popular and scholarly commentaries, and another which refers to equally well-known aspects of Wikipedia, but has not been understood in terms of truth. I demonstrate Wikipedia's dual relation to truth through a close analysis of the Neutral Point of View core content policy (and one of the project's 'Five Pillars')."
Wiki Loves Monuments: A paper titled "Wiki Loves Monuments 2011: the experience in Spain and reflections regarding the diffusion of cultural heritage",[21] written by five Spanish Wikimedians, gives a concise overview of the photo contest as it played out in Spain last year.
Briefly
Who was notable in London in the 1960s?: A Master's thesis in Computer Science[22] describes "A tool for extracting and indexing spatio-temporal information from biographical articles in Wikipedia". The tool, named "Kivrin" after a time-travelling character from a science fiction novel, is available online, and grew out of an earlier, simpler one that searches for articles about plants and animals living at a particular geographical place ("Flora & Fauna Finder"). The author remarks that "the data is skewed, like Wikipedia itself, towards the U.S. and Western Europe and relatively recent history". A search for the 1960s in London brings up several Beatles-related biographies near the top. While the tool does seem to cover languages other than English (e.g. text from the Hungarian entry on Gottlob Frege appears in the search results for Jena, the German town), searches for Hungarian or other non-English place names (e.g. Moszkva and Москва, the Hungarian and Russian names of Moscow) yielded no results. Disambiguation is attempted by way of geocodes but is far from robust: the search results for Halle, Saxony-Anhalt actually contain multiple entries referring to Halle, North Rhine-Westphalia.
How did people in Europe feel in the 1940s?: As described in a post in the New York Times' "Bits" blog,[23] Kalev Leetaru from the University of Illinois conducted a sentiment analysis of statements on Wikipedia connected to a particular space and time, and made the result into a video: "The Sentiment of the World Throughout History Through Wikipedia".
"Central" users produce higher quality: A preprint by two Dublin-based researchers attempts "Assessing the Quality of Wikipedia Pages Using Edit Longevity and Contributor Centrality".[25] The former uses the assumption that contributions which survive many subsequent edits tend to have a higher quality, and "measures the quality of an article by aggregating the edit longevity of all its author contributions". The second approach considers either the coauthorship network (the bipartite graph of users and the articles they have edited, used in many recent papers to grasp Wikipedia's collaboration processes) or the user talk page (UTP) network, where two Wikipedians are connected if one has edited the other's talk page. It is assumed that a user's "centrality" in one of these networks is a measure for the "contributor authoritativeness". These quality measures are then evaluated on 9290 history-related Wikipedia articles against the manual quality rating from WikiProject History. "The results suggest that it is useful to take into account the contributor authoritativeness (i.e., the centrality metrics of the contributors in the Wikipedia networks) when assessing the information quality of Wikipedia content. The implication for this is that articles with significant contributions from authoritative contributors are likely to be of high quality, and that high-quality articles generally involve more communication and interaction between contributors."
Familiarity breeds trust: A bachelor thesis[26] at Twente University had 40 college students assess the trustworthiness of articles from the English Wikipedia, after a search for a piece of information in the article that was either present at the top or near the bottom of the article. The hypothesis that the longer search in the second case might affect the trustworthiness rating was rejected by the results, but it was found (consistent with other research) that "Trust was higher in articles with a familiar topic, rather than with unfamiliar topics".
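The contributor-centrality idea from the Dublin preprint summarized above can be sketched as follows, using the user talk page (UTP) network in which two Wikipedians are linked if one has edited the other's talk page. This is only an illustration under stated assumptions: degree centrality stands in for whatever centrality metrics the preprint actually evaluates, the contribution-weighted average in `article_score` is a hypothetical aggregation rather than the authors' formula, and all names and the input format are invented for the example.

```python
from collections import defaultdict


def degree_centrality(talk_page_edits):
    """Degree centrality of each user in the UTP network.

    talk_page_edits: iterable of (editor, talk_page_owner) pairs.
    Degree centrality is an assumed stand-in for the preprint's metrics.
    """
    neighbours = defaultdict(set)
    for editor, owner in talk_page_edits:
        if editor == owner:
            continue  # editing one's own talk page creates no link
        neighbours[editor].add(owner)
        neighbours[owner].add(editor)
    n = len(neighbours)
    if n < 2:
        return {user: 0.0 for user in neighbours}
    # Normalize by the maximum possible number of neighbours.
    return {user: len(adj) / (n - 1) for user, adj in neighbours.items()}


def article_score(contributions, centrality):
    """Contribution-weighted mean centrality of an article's authors.

    contributions: mapping of user -> size of that user's contribution.
    Hypothetical aggregation, not the formula used in the preprint.
    """
    total = sum(contributions.values())
    if total == 0:
        return 0.0
    return sum(
        size * centrality.get(user, 0.0)
        for user, size in contributions.items()
    ) / total
```

Scores of this kind could then be compared against manual ratings such as those from WikiProject History, which is the evaluation the preprint describes.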
^Morton-Owens, E. G. (2012). A tool for extracting and indexing spatio-temporal information from biographical articles in Wikipedia. New York University. PDF
^Qin, Xiangju; Cunningham, Pádraig (2012). "Assessing the Quality of Wikipedia Pages Using Edit Longevity and Contributor Centrality". arXiv:1206.2517 [cs.SI].
With the 2012 Tour de France beginning at the end of this week, interested editors should head over to WikiProject Cycling to find out how they can help.
Submit your project's news and announcements for next week's WikiProject Report at the Signpost's WikiProject Desk.
This week, we spent some time with WikiProject Athletics which covers a variety of athletic competitions including running, jumping, and throwing. Started in May 2009, WikiProject Athletics is relatively young among the sport projects. It is home to 3 Featured Articles, 4 Featured Lists, and 18 Good Articles. The project maintains the Athletics Portal and various lists of articles that either do not exist or need considerable improvement. We interviewed Trackinfo and project founder Sillyfolkboy (SFB).
What motivated you to join WikiProject Athletics? Have you coached or competed in any athletic events?
Trackinfo: I've competed, officiated and administered for the sport for well over 40 years. The sport is a built-in element of my life; I've obviously read a lot about the sport and have been a journalist covering the sport at the international level for decades. I started editing when I noticed facts that I knew (and thought should be common) were missing. I've gotten more involved as I have discovered the immense lack of understanding of our sport amongst even people who have positions of authority. Between vandals and deletionists, I feel like watching Wikipedia is as much a defensive effort as it is adding accurate content that the world should have access to.
Sillyfolkboy (SFB): I initiated the Athletics WikiProject in May 2009 after developing an interest in the sport through the 2008 Beijing Olympics. I wanted to create a collaborative space for an underdeveloped topic area and was inspired by the equivalent French and German projects. My desires to both write about and participate in the sport grew in tandem – I have twice competed at the Great Manchester Run, albeit more for charity and leisure than a demonstration of athletic prowess!
Are some aspects of athletics better covered on Wikipedia than others? Are there any glaring holes in Wikipedia's coverage of athletics?
Trackinfo: I edit the aspects I know and understand. I am sometimes driven to research outside of my base, but it's hard to do without background that I do not possess. Frankly it seems like there are only a handful of us who are making significant contributions, I see the same handles. We each have our obvious pools of knowledge, but outside of that core group, we have to depend on unregistered editors to fill in the gaps. Such contributions are inconsistent.
SFB: As Trackinfo rightly points out, the quality of the coverage of athletics varies greatly and is largely dependent upon whether a person has dedicated time to the topic. The article for the mile run, one of the sport's most definitive events, is only a year old. The more historical aspects of the sport are our biggest blind spot: great athletes of the 19th century such as Lon Myers have only recently been covered, while the stories of George Seward or Lewis "Deerfoot" Bennett are still untold. There are some things for the WikiProject to celebrate, however, such as the much improved coverage of specific competitions and statistical lists.
Most of the project's Featured and Good Articles are biographies of athletes. What are some challenges faced by editors trying to improve athletics-related topics to FA or GA status?
Trackinfo: I'm not motivated by trying to elevate any single article to a higher status. I don't even know how that is decided.
SFB: Some of my first athletics efforts were aimed towards taking articles like Usain Bolt to good article status. I soon came to realise that coverage of athletics on Wikipedia was sparse. I decided to focus my time on general article creation and expansion, rather than refining a smaller number of articles. My last Good Article came in late 2009, but since then I have created many hundreds of substantial articles on athletics and expanded countless more.
The peer-review mechanisms for distinguishing articles are useful, but inherently time-consuming. Other sports projects, such as WikiProject Football, can rely on a large base of editors to cover current events, allowing others time to write extensively on specific topics. For our project, time is so limited that if one or two editors focused on one article, then some current events would simply not be covered.
How difficult is it to obtain images for athletics articles? Are there any specific pictures that the project is searching for?
Trackinfo: The administrators of the commons have taken their role to protect against copyright violations (copyvio) completely overboard. I have submitted my own work and they deleted it because of the fear of copyvio. I have signed their paperwork multiple times--years of this. Finally the most recent contributions have not (yet) been challenged. For the wider ramifications of getting images of historical figures or just ones from other sources, the obstacles to putting up good images are almost insurmountable.
SFB: Images of athletes prior to 2000 are particularly rare. It is difficult to get high quality images even for major names like Michael Johnson and Jan Železný, and hard to get any at all for people like Bob Beamon. It is much easier to get images of modern marathon runners, given the rise of good quality personal cameras and the athletes' proximity to the public. The number of images of track and field athletes in recent years has been greatly boosted by Erik van Leeuwen, an athletics photographer who has donated many high quality images to Wikimedia Commons. I'm very grateful for what he does.
Does WikiProject Athletics collaborate with any other projects? Are there ways the various sports and games projects could aid and reinforce each other?
Trackinfo: Directly we interface to Project Olympics. That's obvious. When we were going through a wave of deletionism (the BLP unsourced attack), I learned some good sources for Olympic information, so that technique assisted in rescuing many Olympians in other sports as well. All sports collaborate on WP:NSPORTS which has served to define a commonality of notability across all sports. Participating in that guideline has led me to become familiar with the issues in other sports.
SFB: WikiProject Running is another sister project. It's difficult to say what the role of WikiProject Sports is at the moment because editors gravitate more towards the sport-specific ones.
I think English Wikipedia could benefit by better integrating the project and portal spaces. The French Wikipedia, for example, displays clear and prominent portal links on their Main page. Their Sports portal is ten times more popular than ours and provides intuitive links to the project and category spaces, as well as an all purpose "Café" for discussion on the subject (a scheme I have tried to mirror on the athletics project). English Wikipedia's sports and games portal is buried quite deeply, discouraging development and usage.
Which articles will be the most vital to visitors drawn to Wikipedia after watching media coverage of major athletic events like the upcoming European Athletics Championships or the Summer Olympic Games? What needs to be done to prepare these articles for the spotlight?
SFB: Visitors are most driven to the main competition page and the relevant athlete biographies. For the larger events, we have event-specific articles which readers use to view the results. For example, the article on the Men's 100 m final for the 2011 World Championships in Athletics received nearly 10,000 views on the day of the final. There are six or seven editors who regularly provide to-the-minute results for major competitions. Following this news-style service, these event sub-articles still receive dozens of hits a day years afterwards.
It is comparatively easy to prepare competition articles, laying out the medal tables beforehand, but building athlete biographies can be more challenging. At each major athletics event, 24 men's and 23 women's champions are crowned, and the intensity of the sport means there are many new faces each year. Sometimes I take a guess at who the top performers are going to be. Christian Taylor's biography was written less than a month before he became triple jump world champion.
What are the project's most pressing needs? How can a new member help today?
SFB: Updating the results of major competitions is a mammoth task in itself and help is always welcome in that respect. The articles on athletics events like the 10,000 metres leave much to be desired. Athlete biographies can be updated and expanded; Melaine Walker and Brittney Reese are among the best ever in their respective events, but their achievements are covered in little detail. World champions like Bekele Debele remain red-linked, while former Olympic champions can barely merit three sentences. Some international competitions, such as the European Cup in Athletics, have little historical coverage and there are so many missing champions that I've made a worklist. It's a topic area ripe for development.
Anything else you'd like to add?
SFB: Since 2009, the athletics project has presented some unusual challenges, particularly because of the American/British language barrier. Much time and discussion has led to a new article: Athletics (U.S.). Both that title and the project's main topic title, Athletics (sport), remain less than ideal but have allowed the space to develop articles on the distinct concepts.
On top of this, we also now have a separate article for just Track and field, which had to be created from scratch. Athletics and track and field had been dealt with in one article – Athletics (track and field). However, the relationship between the two is similar to that of gridiron football and American football in that the latter is more prominent and constitutes a major part of the former, but the former is a broader concept. That situation meant most athletics efforts were wound up in arguments over title names and the actual topic matter suffered greatly. Article quality has really improved in the three years since the titles have been stable. It is the English language which fails us – no other Wikipedia has encountered these issues because their words are unambiguous.
The largest Wikipedia copyright investigation brought the project to a standstill for months as one of the most prolific editors was involved. To save the content, countless hours were spent analysing articles for infringement. I checked at least 2000 entries; it may be higher but the sheer workload makes it hard to determine.
The international nature of the sport is another challenge as source material is often in the local language. Arabic and Japanese tend to be difficult. Also, many of our members are not native English speakers. It can lead to some odd usage of prepositions, but the enthusiasm is refreshing nevertheless!
Trackinfo: The mention of the copyvio story brings up some bad memories for me to comment on as well. I too spent a lot of time checking articles by this individual and I will grant that a lot of it was copied. But the vast majority of his work was statistical--reformatting of public results. I submitted that all of that work was perfectly valid--that we editors needed only to review the articles by this editor for descriptive prose that he also copied and was a clear copyvio. Instead of accepting my analysis and my efforts to check articles, my own credibility was impugned by just one editor who maintained a position of authority. Out of thousands of contributions I have made, that editor found two articles that "didn't sufficiently" rewrite the story from the original sources (of my outside research). All of my efforts in review of those articles were reverted. This is one of a continuing series of situations that give evidence to the poor management structure of Wikipedia. It is like most small community organizations, where a small group of ideologues rush to take over total control to make sure their agenda can be carried forth. Here there is a small but powerful oligarchy of administrators who control this site. They can exert their control because there is a behind-the-scenes world of Wikipedia that the majority of the population does not venture into and is ill equipped to fight a battle within. From their back rooms, they are the deciders of what happens here. As long as you play with them by their rules, they don't bother the masses very often, but if one of them gets a burr up their butt, they can close ranks. It only takes a handful, but they gang up and slaughter initiative very effectively. Their senseless abuses of power have bruised my nose more than once and I am certain have driven away the contributions of many other potential editors. Fearless Leader decides what you and the rest of the world are allowed to know via Wikipedia.
WikiProject Athletics is the first in a series of sport-related projects the WikiProject Report will be highlighting in the next two months to celebrate major sporting events and summer pastimes (winter for our friends in the south). Next week's project will show off its need for speed. In the meantime, tune up your engine in the archive.
Avery Brundage (nom) by Wehwalt. Brundage (1887–1975) was the fifth, and only American, president of the International Olympic Committee (IOC). As a youth he was a track star, but later became known as a sports administrator. During the 1936 Olympics in Berlin he controversially ensured the American team's participation, and 16 years later he became president of the IOC. As president he was a fervent supporter of amateurism in sports; after the massacre of Israeli athletes at the 1972 Games, Brundage stirred controversy by saying the Games would go on. He retired soon afterwards.
Horseshoe Curve (Pennsylvania) (nom) by Niagara. Horseshoe Curve is a 3,485-foot (1,062 m), triple-tracked railroad curve in Blair County, in the U.S. state of Pennsylvania. Built by the Pennsylvania Railroad in 1854 as an alternative to the Allegheny Portage Railroad, it continues to be an important part of the area's railroad infrastructure. It is also a tourist attraction and was designated a National Historic Landmark in 1966 and National Historic Civil Engineering Landmark in 2004.
Charles Scott (governor) (nom) by Acdixon. Scott (1739–1813) was an American soldier who later served as the fourth governor of the US state of Kentucky. He began his military career in 1755 as a scout, but soon became an officer. He participated in the American revolution, and was taken prisoner by the British. He continued to serve during the Indian wars until he was elected as governor of Kentucky in 1808. During his four-year term Scott, left on crutches after a slip, depended on his son-in-law Jesse Bledsoe. His administration dealt with growing tensions between the US and Britain.
New Forest pony (nom) by ThatPeskyCommoner. The New Forest pony is a breed of pony native to Britain that is valued for hardiness, strength, and surefootedness. The breed's wild ancestors date back more than 500,000 years; currently, only purebreds can be registered. Although in 1945 their numbers were under 600, in recent years thousands of semi-feral ponies have ranged the New Forest in Hampshire; they are gathered annually to be checked for health, wormed, and tail-marked. The ponies can be ridden by adults or children.
William S. Sadler (nom) by Mark Arsten, Mathew Townsend, and Livitup. Sadler (1875–1969) was an American psychiatrist. Initially a Seventh-day Adventist, Sadler left the denomination after it excommunicated his father-in-law in 1907. He began his conversations with "the sleeping man" in 1910, and ultimately decided that the man was telling the truth. The revelations from his conversations formed the basis of The Urantia Book, published in 1955. The teachings in the book are similar to those in Adventist literature.
"Squeeze" (The X-Files) (nom) by Grapple X. "Squeeze" was the first "monster-of-the-week" episode of the American TV series The X-Files. In it, FBI special agents Fox Mulder and Dana Scully investigate a series of ritualistic killings by somebody seemingly capable of squeezing his body through impossibly narrow gaps. Written by Glen Morgan and James Wong, the episode had production problems that required more than the usual post-production work. The episode has received generally positive reviews.
William Burges (nom) by KJP1. Burges (1827–1881) was an English architect and designer who sought in his work to escape from both nineteenth-century industrialisation and the Neoclassical architectural style, to re-establish the architectural and social values of a utopian medieval England. In his 18-year career Burges created numerous designs for architectural works, although many were never executed or were later demolished. His most notable works are Cardiff Castle and Castell Coch. Burges also designed metalwork, sculpture, jewellery, furniture and stained glass.
William T. Anderson (nom) by Mark Arsten. Anderson (1839–1864) was one of the deadliest and most brutal pro-Confederate guerrilla leaders in the American Civil War. He began his career of violence in 1862 as a horse thief, before joining pro-Confederate rebels. After his sister was killed by Union forces, Anderson – already known for his brutality – began killing in a quest for revenge. In June 1864 a group under his control began fighting in Missouri, and in September they killed 24 Union soldiers in Centralia, and then more than 100 militiamen in an ambush. For this, Anderson was hunted and killed by the Union.
Reg Saunders (nom) by AustralianRupert and Ian Rose. Saunders (1920–1990) was the first Aboriginal Australian to be commissioned as an officer in the Australian Army. He enlisted in 1940 and fought in North Africa and Europe; after he was commissioned in November 1944, Saunders saw further action in New Guinea. He left the Army for five years, but reenlisted for the Korean War in 1950. He retired as a Captain in 1954 and went on to work in the logging and metal industries before joining the Office of Aboriginal Affairs as a liaison officer in 1969. In 1971, he was appointed a Member of the Order of the British Empire.
Typhoon Gay (1989) (nom) by Cyclonebiskit. Typhoon Gay caused more than 800 fatalities in and around the Gulf of Thailand in November 1989; it was the worst storm to hit the area in 35 years. It first struck Chumphon Province in Thailand with winds of 185 km/h (115 mph) before crossing the Bay of Bengal and striking Andhra Pradesh, India, with winds measuring 260 km/h (160 mph). The storm destroyed several towns in Thailand and caused 11 billion baht worth of damage; it damaged or destroyed about 20,000 homes in Andhra Pradesh, leaving 100,000 people homeless, but caused much lower financial losses.
Yogo sapphire (nom) by PumpkinSky and Montanabw. Yogo sapphires are only found in the Yogo Gulch in central Montana in the US, and are widely considered among the finest sapphires in the world. Mining for the typically cornflower blue stones has generally been unprofitable since it began in 1895. The stones were marketed as the only guaranteed "untreated" sapphire, exposing a practice of the time wherein 95 percent of all the world's sapphires were heat-treated to enhance their natural color.
Featured lists
Eight featured lists were promoted this week:
List of English Twenty20 cricket champions (nom) by Harrias. Since the establishment of the England and Wales Cricket Board Twenty20 competition for first-class cricket counties in 2003, Leicestershire has had the most successful team, with three championships, while Somerset has competed most. Six teams have won one championship each.
General Secretary of the Communist Party of Vietnam (nom) by TIAYN. Since the establishment of the Communist Party of Vietnam in 1930, the party has had twelve general secretaries. During French rule over the country many of these men were arrested, while in modern times the general secretaries have been among the most powerful persons in the country.
List of songs recorded by Chrisye (nom) by Crisco 1492. The Indonesian pop singer Chrisye recorded over 200 songs in his forty-year career. Although he was accused of plagiarism on two songs, four others were listed by Rolling Stone Indonesia as among the best Indonesian songs of all time.
List of municipalities in Rio Grande do Norte (nom) by Albacore. The Brazilian state of Rio Grande do Norte consists of 167 municipalities, grouped into four mesoregions and 23 microregions. The largest is Natal, which has 803,811 of the state's 3,168,133 inhabitants.
Malmö FF league record by opponent (nom) by Reckless182. The Swedish professional association football club Malmö Fotbollförening's main rivals are Helsingborgs IF, IFK Göteborg and, historically, IFK Malmö. It has played the most games against AIK.
List of chronometers on HMS Beagle (nom) by Spinningspark. The British Admiralty sailing ship HMS Beagle carried numerous chronometers on its three voyages during the early 19th century. On its second voyage it carried 22 of the instruments; of these, only two remain.
Marshal Foch Professor of French Literature (nom) by Bencherlite. There have been seven Marshal Foch Professors of French Literature at the University of Oxford in England since the post was first filled in 1920. The position was endowed by an arms trader and named after Supreme Commander of Allied Forces Ferdinand Foch.
Nebula Award for Best Novella (nom) by PresN. The Nebula Awards, described as the most important American awards in the genre, are given each year for the best science fiction or fantasy fiction published in the United States during the previous year. There have been 45 winners since the award was established in 1966.
Featured pictures
Six featured pictures were promoted this week:
L'Umbracle (nom, related article), created by Diliff and nominated by Tomer T. L'Umbracle is a landscaped walk that is part of the Ciutat de les Arts i les Ciències in Valencia, Spain. It has 99 arches standing 18 metres (59 ft) high.
Liocarcinus navigator (nom, related article), created by Lycaon and nominated by Tomer T. Liocarcinus navigator is a crab that averages 3.5 millimetres (0.14 in) wide. It is found in the northeastern Atlantic Ocean.
Pittsburgh, Allegheny & Birmingham (nom, related article), created by Otto Krebs, restored by Adam Cuerden, and nominated by Howcheng. Another restoration by retired Wikimedian Adam Cuerden, this lithograph depicts the south side of Pittsburgh, United States; in the late 18th century it was already highly industrialised.
Walt Disney Concert Hall (nom, related article) by jjron. The Walt Disney Concert Hall in Los Angeles, United States, was opened in 2003 and houses numerous musical groups. Reviewers were surprised by a lack of traffic in front of the building.
The case concerns alleged misconduct by Fæ, brought by MBisanz. Proposed decisions are due tomorrow (Tuesday 26 June).
In response to a Workshop proposal calling for his desysopping, Fæ's administrator rights were removed at his request on 18 June; he has declared he will not pursue RfA until June 2013, and that should another user nominate him and he feels confident to run, he will launch a reconfirmation RfA rather than requesting the tools back without community process.
The case was referred to the committee by Timotheus Canens, after TheSoundAndTheFury filed a "voluminous AE request" concerning behavioural issues in relation to Ohconfucius, Colipon, and Shrigley. The accused deny his claims and have decried TheSoundAndTheFury for his alleged "POV-pushing". According to TheSoundAndTheFury, the problem lies not with "these editors' points of view per se"; rather, it is "fundamentally about behaviour".
The case, filed by P.T. Aufrette, concerns the suitability of the new move review forum, after a contentious requested move discussion (initiated by the filer) was closed as successful by JHunterJ; the close was a matter of much contention, with allegations that the move was not supported by consensus. After a series of reverts by Deacon of Pndapetzim, Kwamikagami and Gnangarra, the partiality of JHunterJ's decision was discussed, as was the intensity of Deacon of Pndapetzim's academic interests in the topic.
Evidence submissions and proposed decisions are due 28 June and 12 July, respectively.
“There is plenty of evidence that wiki-markup is a substantial barrier that prevents many people from contributing to Wikipedia and our other projects. Formal user tests, direct feedback from new editors, and anecdotal evidence collected over the past several years have made the need for a visual editor clear ... It’s the biggest and most important change to our user experience we’ve ever undertaken.”
— The Visual Editor Team, Wikimedia Foundation, November 2011
A second prototype of the "Visual" (what you see is what you get) editor being developed by the Wikimedia Foundation went live to MediaWiki.org this week (Wikimedia blog), seven months after the first prototype (see previous Signpost coverage). The project is being assisted by developers for the wiki farm site Wikia, many of whose wikis use an existing, less powerful WYSIWYG editor at present.
Work on the editor had been delayed by a late decision to switch the "behind the scenes" framework used to power it. As a result, despite the passage of time, developers aimed only for "feature parity" with last December's prototype, though the newer version does add the ability to save articles after editing, the potential for mobile editing, and integration with browser spell-check features. It is further hoped that the newer framework will allow all remaining features – including tables, images and reference sections – to be rapidly integrated from now on. Nevertheless, publication of details of the new live test version has already provoked a long string of bug reports. It seems likely that the deployment of the visual editor to its first live wiki will be pushed back further, possibly until the late northern autumn.
As with the first prototype, the most significant limitation of this second demonstration version is undoubtedly its inability to understand potentially difficult wikitext constructs (manual override mode has been limited to administrators during the testing period for precisely this reason). Indeed, this concern over backwards compatibility has long been seen as the major challenge for developers of WYSIWYG editors. What has changed this time, developers say, is the introduction of a radically improved new parser, which should make a truly comprehensive editor possible. Even so, its deployment will almost certainly be accompanied by the "phasing out" of particularly complex wikitext structures.
Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for several weeks.
Volunteer-developed Wiktionary app launches: A new official Wikimedia app was launched this week on Google Play, the marketplace that serves as the central app repository for devices running Google Android (Wikimedia blog). The Wiktionary app has since been downloaded over 500 times, though it still has some way to go before it matches the success of its sister, the native Wikipedia app, which has now been downloaded well over a million times since its own launch. Most notably, the Wiktionary app was developed by a body of volunteers, all members of the Canadian "Undergraduate Capstone Open Source Projects" (UCOSP) group.
Real-time collaborative editing attracts attention: Fresh attention fell on the possibility of real-time collaborative editing, with a renewed Etherpad fork being shown off (wikitech-l mailing list). Few think that live collaborative editing – in short, the banishment of the edit conflict to the history books – will land any time soon; one current Google Summer of Code student, Ashish Dubey, is looking at integrating such features with the new Visual Editor, which itself may not be finished until well into next year. Nevertheless, the idea has proved conceptually popular, especially given Wikipedia's high-profile role as a news provider. Wikimedians are already large consumers of Etherpads; the latest attempt to make them more wiki-like was released this week, having received WMF technical support.
Foundation to use contractors for Wikimania training: The Wikimedia Foundation will be employing technology non-profit OpenHatch to help deliver tutorials at the Wikimania Hackathon, it was revealed this week (wikitech-l mailing list). The hackathon is due to be held in Washington, D.C. in the days immediately before Wikimania itself begins there in a fortnight's time. The move is the latest in a line of developments aimed at professionalising the way novice coders (and perhaps more importantly, experienced coders willing but unable to contribute to MediaWiki and related projects) are integrated into the wider development community. Newly promoted Engineering Community Manager Sumana Harihareswara, announcing the development, cited OpenHatch's experience in "teaching new contributors, and in building open source communities' capacity to nurture"; the aim, she wrote, was to "get scores of semitechnical Wikimedia editors over the barriers to technical contribution", although the difficulty of the challenge facing Harihareswara remains unclear.
Sshhhhhh, we're trying to talk here: There was a discussion this week on the wikitech-l mailing list about the possibility of silencing many of the bots (programs designed to automatically relay certain log messages) active in the #MediaWiki IRC channel. Critics argued that the bots – which relay (among other things) updated bug statuses, code review news and details of performance issues – constituted background "noise" for most users, frequently "drowning out" or "diluting" the words of those asking for help in the channel, which serves as a focal point for MediaWiki developers of all types. Supporters rejected such a critique, arguing that the updates were "important" and in any case a valuable part of the development workflow, attracting developer comment on controversial topics. As discussion continued, it was noted that a recent software change, yet to be reviewed, allows code changes targeted at parts of MediaWiki that have their own more specific IRC channel (such as #wikimedia-mobile) to be repeated only in that channel and not in the more general #MediaWiki.
ArticleFeedback version 5 goes large: Many English Wikipedians will get their first glimpse of the fifth version of the Article Feedback Tool over the coming fortnight, as it expands beyond its current 0.6% coverage, going live on some 10% of articles by 3 July. Version 5, which has been developed slowly and in response to extensive community contact, moves away from the numerical model adopted by version 4, replacing it with a system centred on short comments. Concerns include the possibility of libellous, offensive and/or spammy comments being displayed on the public feedback review pages.