Wikipedia:Wikipedia Signpost/Single/2014-10-15

The Signpost
Single-page Edition
WP:POST/1
15 October 2014

 

2014-10-15

Ships—sexist or sexy?

Does this look female to you? The Royal Navy's HMS Hood, 1924.

Sexism is a hot topic on Wikipedia at the moment. The Countering systemic bias WikiProject uses Tom Simonite's "The Decline of Wikipedia" to highlight "... the effect of systemic bias and policy creep on recent downward trends in the number of editors available to support Wikipedia's range and coverage of topics." It cites the New York Times to say that "Wikipedia has been criticized by some journalists and academics for lacking not only women contributors but also extensive and in-depth encyclopedic attention to many topics regarding gender."

A Wikimedia Foundation study found that fewer than 13% of contributors to Wikipedia are women. Former WMF Executive Director Sue Gardner said increasing diversity was about making the encyclopaedia "as good as it could be." Possible factors cited as discouraging women included the "obsessive fact-loving realm" and the necessity to be "open to very difficult, high-conflict people, even misogynists." In August 2014, Wikipedia co-founder Jimmy Wales announced in a BBC interview the Wikimedia Foundation's plans for "doubling down" on gender bias at Wikipedia.

Grammatical gender has not been a feature of English since the 12th century. The use of the feminine pronoun "she" to refer to countries survived in some writing until the early 20th century, but is almost unknown nowadays. Wikipedia, as a modern encyclopedia, follows this trend: we do not talk about France or the United States as "she", except occasionally in quotations.

In Wikipedia's articles, the use of "she" to describe naval ships is near-universal, despite a successful and ongoing effort to improve the quality of these articles by the Military History and Ship WikiProjects. The consensus is that the first major editor of an article gets to decide for all time whether an article uses "she" or "it". It's obvious from the preponderance of "she" in the articles that almost all of them have been written by those with a preference for "she", which under our current rules is fine. This leaves naval articles as the last bastion of grammatical gender on Wikipedia.

As a man with a fascination for machines, including war machines, I've always had a particular horror of men who describe their cars, motorbikes, or aeroplanes as "she". Without getting too psychoanalytical, this seems to be evidence of ingrained and systematic sexism. The AP style guide and the Lloyd's Register discourage "she" for ships, and the Chicago Manual of Style has stated since 2003: "When a pronoun is used to refer to a vessel, the neuter it or its (rather than she or her) is preferred". Some of my older naval books still use "she", but the modern academic standard in all serious works is to omit it as an archaic usage.

Seabees from Naval Mobile Construction Battalion 3 and other members of the US Navy women's dragon boat team cheer after winning first place over the US Army, Air Force and Marines in 2007.
The reasons some men give for hanging on to this terminology for ships are fascinating: "It takes a lot of work and tender loving care, as well as a lot of paint to make a ship look good" and "Some have a cute fantail, others are heavy in the stern, but all have double-bottoms which demand attention," are two of my favourites. Our Wikipedian usage still reflects the sentiment of "... it takes an experienced man to handle her correctly; and without a man at the helm, she is absolutely uncontrollable."

While these justifications are no doubt given tongue-in-cheek, in my value-system the casual sexism is obvious. Aesthetically this jars, and in terms of the embedded values of language, the use of a feminine pronoun to describe a killing machine crewed mainly by men jars too.

The place of women in Western society has undergone a huge change in the past 100 years. After much controversy, women won the right to vote in most countries after World War I, with Switzerland holding out until 1971. In the United States Navy, women have been recruited since 1917. In the 1940s, a special auxiliary service for women, WAVES, was set up. Women were expected to be non-combatants. By the 1970s, women were eligible for most surface combat roles and the first female naval aviators qualified. American submarines opened their hatches to women only in the last few years. In Britain, the Royal Navy first allowed women to go to sea in 1990 and it was 2014 before the first female submariners were admitted.

Perhaps as women penetrate this male preserve, this last remnant of grammatical gender could be allowed to wither from our project. Wikipedia generally has a proud tradition of being conservative in what we include in articles, but we claim to have a progressive attitude towards addressing systemic bias in how we write. Spinal Tap depicts a male rock star unable to understand criticism of the band's new album cover as being "sexist"; he asks "What's wrong with being sexy?" That was a 1984 satire on the problem of ingrained sexism; are male editors of ship-related articles in 2014 unconsciously perpetuating the same misogyny satirised in the film?

If Lila Tretikov and Jimmy Wales (not to mention the millions of volunteers who write our articles) are serious about helping us create a female-friendly editing environment, reforming the pronoun we use for naval ships might be an obvious place to start.

The views expressed in this op-ed are those of the author only; responses and critical commentary are invited in the comments section. The Signpost welcomes proposals for op-eds at our opinion desk or through email.


College player falsely linked to sports scandal by Wikipedia; the Nobel Prizes

Wikipedia article falsely links player to college sports scandal for six years

Baldwin the Eagle, the mascot of Boston College

Ben Koo of the sports blog Awful Announcing investigated (October 9) how player Joe Streater's name became involved in recent years with a historic sports scandal, the 1978–79 Boston College basketball point shaving scandal. The scandal involved Boston College basketball players conspiring with mobsters, including Henry Hill, who was immortalized by the film Goodfellas, to deliberately reduce the point spread so they could profit through illegal sports gambling.

Streater was a basketball player at Boston College, a private university in Boston, Massachusetts. One former Boston College player recalled that "He had mad skills and smarts." However, he was not even on the team at the time of the scandal, having left the team and college the previous season after playing only eleven games, fewer than half of the scheduled games for the 1977–78 season. Why Streater left, what he did following his time at Boston College, and even whether he is still alive are all unknown, and Koo was unable to locate Streater.

Despite the frequency with which he is associated with the scandal, Streater is not mentioned in any of the important accounts of the incident, including the famous 1981 Sports Illustrated article describing Hill's first-person account, Associated Press reporter David Porter's 2002 book, Fixed: How Goodfellas Bought Boston College Basketball, or ESPN's 2014 documentary Playing for the Mob. Porter told Koo that he did not know of any involvement in the scandal by Streater or why his name has been repeatedly mentioned. He said "I have seen the name over the years and am mystified as well."

Outside of these in-depth reports, Koo found many mentions of Streater's name in connection with the scandal, most prominently in a widely circulated 2012 Associated Press story. Some came from media outlets like the Associated Press, ESPN, and Sports Illustrated, which had previously reported on the scandal without mentioning Streater. Koo could not find a story mentioning Streater in conjunction with the scandal dating before 2008. Koo concluded that the connection resulted from writers and journalists consulting Wikipedia or other sources which had repeated inaccurate information from Wikipedia.

Koo traced the addition of Streater's name to the Wikipedia article on the scandal to an August 12, 2008 edit by User:155.212.229.132, a Massachusetts-based IP address belonging to Goodwill Industries. The edit added Streater's name to the article five times and changed the amount of a payment from Hill from $500 to $2000. In December 2008, edits from the same IP address deleted a large amount of material from the article on the scandal, including all of the references, as well as material from the article for NBA coach David Blatt, who Koo noted played against Streater when they were both high school basketball players in the Boston area. (The only other edits from the IP address were two November 2009 typographical corrections to the article Morgan Memorial Goodwill Industries, which is now a redirect to Goodwill Industries.)

The day before Koo's story was published, four of the mentions of Streater were removed from the Wikipedia article about the scandal by an IP address originating outside of Massachusetts. The remaining mention of his name was removed the next day by a different editor. Streater's name had been in the article for six years.

Wikipedia and the Nobel Prize

Nobel Peace Prize winner Malala Yousafzai

Each year, the week of announcements from the Swedish Academy regarding the new Nobel Prize laureates leaves many people, including professional journalists and commentators, scrambling to learn about winners who are often obscure outside their own fields, and Wikipedia is one of their first stops for information.

Slate reports (October 9) on a warning left for journalists in the article for the newest literature laureate, Patrick Modiano, by a Wikipedia editor adding a major update following the announcement. Lest a journalist who needed to make a quick blog post crib unverified details from the article, under the section heading "To The Reporter Now Copying from Wikipedia", the editor wrote "Be careful boy. Primary sources are still best for journos." The warning was removed from the article eleven minutes later.

Huffington Post UK complained (October 13) that the article for new economics laureate Jean Tirole contained little information about his work and was mostly a list of his lectures. It noted that an IP editor added the remark "YO, SOMEONE EDIT THIS STUFF IT LOOKS LIKE KRAP", though it was removed by another editor three minutes later.

IBN Live compares (October 13) Wikipedia traffic statistics for this year's two Nobel Peace Prize winners, Kailash Satyarthi and Malala Yousafzai. Pageviews for Satyarthi spiked on the day of the announcement, suggesting that readers wanted to learn more about the lesser known of the two, while pageviews for Yousafzai surpassed those for Satyarthi for the next two days.


One case closed and two opened

Banning Policy was closed on 12 October. Arbcom affirmed that users have "considerable leeway" in terms of how their talk pages are managed. Users Tarc (talk · contribs), Smallbones (talk · contribs), and Hell in a Bucket (talk · contribs) were all warned to refrain from edit warring and making inflammatory comments. Tarc was also topic-banned from editing any of the administrators' noticeboards or User talk:Jimbo Wales, and from reinstating any edits that were reverted because they were made by a banned user.

New cases

Two new cases have been opened since the last arbitration report. Gender Gap Task Force was opened on 2 October and is in its evidence phase until 17 October. Landmark Worldwide was opened on 16 October and is also currently in the evidence phase.


Bells ring out at the Temple of the Dragon at Peace



This Signpost "Featured content" report covers material promoted from 5 October 2014 through 11 October 2014.

Nine featured articles were promoted this week.

The market in Keswick, Cumbria.
The video game Fez is the subject of a new featured article, and this video is a new featured picture. User Czar obtained the necessary permission from the game's rights owner.

Twenty-six featured pictures were promoted this week.

The eyelash viper's name comes from the scales over its eyes, clearly visible in this image. I feel pretty... Oh so pretty...
Winslow Homer's The Fog Warning.
Les Invalides, Paris, France
Jean-François Millet's The Gleaners
Edward Hopper's Girl at Sewing Machine



<big>Attempting<ref>{{citation needed</ref>}} to parse <code>wikitext</code></big>

This week we sat down with The Earwig to learn about his wikitext parser, mwparserfromhell.

What is mwparserfromhell, and how did it get its name?

mwparserfromhell (which I will abbreviate as mwpfh) is a Python parser for wikicode. In short, it allows bot developers (like those using pywikibot) to systematically analyze and manipulate wikitext, even in cases where it is complex or ambiguous.
For example, let's say we want to see if a page transcludes a particular template, check whether it has a particular parameter, and if not, add it. A classic application would be a bot that dates {{citation needed}} tags. This isn't as simple as it sounds! A naive solution might use regexes, but then we need to check whether the parameter exists between the template's opening and closing brackets, but not get confused if it's inside of a template contained within the template (for example, if you had {{citation needed|reason=This fact is important.{{citation needed|date=October 2014}}}}), whether the template is between <nowiki> tags, and so on...
mwparserfromhell makes this easy by creating a tree representation of the wikicode (loosely described as a parse tree) that can be converted back to wikicode after any modifications are made. It focuses on being as accurate as possible, both in terms of the tree representation being accurate, and the outputted wikicode being as similar to the original as possible.
Its name comes courtesy of Σ, reflecting the somewhat insane nature of the project, and as an excuse for its frightening codebase.
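The nesting problem described above can be made concrete with a minimal, standard-library-only sketch. It is not how mwparserfromhell works internally; the function names (`naive_has_date`, `top_level_params`, `has_param`) are hypothetical, and the scanner deliberately ignores `{{{arguments}}}`, `<nowiki>` tags, and pipes inside links. It only shows why a regex is fooled by the nested example while a brace-depth count is not.

```python
import re

# The nested example from the interview: the outer tag has no date,
# but the tag nested inside its reason= parameter does.
OUTER = ("{{citation needed|reason=This fact is important."
         "{{citation needed|date=October 2014}}}}")


def naive_has_date(wikitext):
    """Regex check: take the first {{citation needed ...}} and look for date=."""
    match = re.search(r"\{\{citation needed(.*?)\}\}", wikitext, re.S)
    return bool(match and "date=" in match.group(1))


def top_level_params(template):
    """Split one template (outer braces included) on its top-level pipes only."""
    inner = template[2:-2]
    parts, current, depth, i = [], [], 0, 0
    while i < len(inner):
        pair = inner[i:i + 2]
        if pair == "{{":
            depth += 1            # entering a nested template
            current.append(pair)
            i += 2
        elif pair == "}}" and depth > 0:
            depth -= 1            # leaving a nested template
            current.append(pair)
            i += 2
        elif inner[i] == "|" and depth == 0:
            parts.append("".join(current))  # a pipe that belongs to the outer template
            current = []
            i += 1
        else:
            current.append(inner[i])
            i += 1
    parts.append("".join(current))
    return parts[1:]  # drop the template name


def has_param(template, name):
    """True if the outer template itself carries a parameter called `name`."""
    return any(p.split("=", 1)[0].strip() == name
               for p in top_level_params(template))
```

Here `naive_has_date(OUTER)` reports a date because the regex stops at the first closing braces it sees, which belong to the inner tag, while `has_param(OUTER, "date")` correctly finds only `reason` on the outer tag.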

What led you to develop it in the first place?

I’ve been writing bots and tools/scripts for many years – situations like the one above come up a lot. Sure, ad hoc solutions using regexes work sometimes, but I wanted something that would work in more general cases. mwparserfromhell seemed like a project that would be useful to most bot developers, and for which there was no existing equivalent.

What were some of the challenges you faced or things that didn't go according to plan while developing the parser? How did you manage them?

Oh, boy. It turns out that wikicode is a horrible, horrible language, for people and computers alike. It lacks a clear definition of how certain edge cases should be handled, and since mwparserfromhell’s goal is to be accurate, a lot of time was spent just trying to figure out how MediaWiki works. Many language parsers are designed to give up once they see a syntax error, like a missing bracket somewhere, but MediaWiki considers all possible wikitext to be valid, so a lot of mwpfh’s code involves making sense of some very questionable things (like templates nested inside of HTML tag attributes nested inside of external links, or the difference between {{{{{foo}}bar}}} and {{{{{foo}}}bar}}) and handling them as closely as possible to the way MediaWiki does. Sometimes this is hard, but other times it is outright impossible and we have to make guesses. For example, if we imagine that the template {{close ref}} transcludes </ref> and the parser encounters the wikicode <ref>{{cite web|…}}{{close ref}}, it will appear as if the <ref> tag does not end, even though it does. This is a limitation inherent in the nature of parsing wikicode: we have no knowledge of the contents of the template, so we can't figure out every situation. mwparserfromhell compromises as best as it can, by treating the <ref> tag as ordinary text and fully parsing the two templates.
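The `{{close ref}}` limitation can be illustrated with a small sketch. Everything here is an assumption for illustration: `TEMPLATE_BODIES` stands in for template contents that a wikitext-only parser never gets to see, and `ref_is_closed` and `expand_simple` are hypothetical helpers, not part of mwparserfromhell or MediaWiki.

```python
import re

# The wikicode from the interview, exactly as written there.
SOURCE = "<ref>{{cite web|…}}{{close ref}}"

# Hypothetical template contents. A parser that only sees the page's
# wikitext has no access to a table like this.
TEMPLATE_BODIES = {"close ref": "</ref>"}


def ref_is_closed(wikitext):
    """True if a closing </ref> appears somewhere after the first <ref>."""
    start = wikitext.find("<ref>")
    return start != -1 and "</ref>" in wikitext[start:]


def expand_simple(wikitext, bodies):
    """Substitute parameterless {{name}} calls whose body is known."""
    def substitute(match):
        # Unknown templates are left exactly as they were.
        return bodies.get(match.group(1).strip(), match.group(0))
    return re.sub(r"\{\{([^{}|]+)\}\}", substitute, wikitext)
```

Without the lookup table, `ref_is_closed(SOURCE)` is false, which is all a wikitext-level parser can conclude. Only after `expand_simple(SOURCE, TEMPLATE_BODIES)` does the closing tag become visible.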

How does mwparserfromhell compare to other re-implementations of the MediaWiki parser, like Parsoid?

Most projects like Parsoid (or MediaWiki’s own PHP parser) are designed to convert wikicode to HTML so that it can be viewed or edited by users. mwparserfromhell converts wikicode into a tree structure for bots, and that structure must contain enough information (such as HTML comments, whitespace, and malformed syntax that other parsers would outright ignore or try to correct) for it to be manipulated and converted back to wikitext with no unintentional modifications. Furthermore, it has less awareness of context than other parsers: because it is designed to deal with wikicode on a fairly abstract level, it doesn't know the contents of a template and can't make any substitutions. As noted above, this causes problems sometimes, but it's necessary for the parser to be useful to bots that are manipulating the templates themselves.
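The round-trip requirement described above can be sketched in a few lines. This is a toy under stated assumptions, not mwparserfromhell's data model: the `Node` class and the `parse` and `unparse` functions are hypothetical, and the only construct recognized is the HTML comment. The point is that every node keeps its exact source characters, so unmodified input always converts back unchanged.

```python
import re
from dataclasses import dataclass


@dataclass
class Node:
    kind: str  # "text" or "comment"
    raw: str   # exact source characters, never normalized


COMMENT = re.compile(r"<!--.*?-->", re.S)


def parse(wikitext):
    """Split wikitext into text and comment nodes without dropping a character."""
    nodes, pos = [], 0
    for match in COMMENT.finditer(wikitext):
        if match.start() > pos:
            nodes.append(Node("text", wikitext[pos:match.start()]))
        nodes.append(Node("comment", match.group(0)))
        pos = match.end()
    if pos < len(wikitext):
        nodes.append(Node("text", wikitext[pos:]))
    return nodes


def unparse(nodes):
    """Join the nodes back into wikitext."""
    return "".join(node.raw for node in nodes)


source = "Intro  <!-- keep me -->\n{{stub }}\n"
nodes = parse(source)
assert unparse(nodes) == source  # untouched input round-trips exactly

# A bot edits one node; the comment and the odd spacing elsewhere survive.
nodes[-1].raw = nodes[-1].raw.replace("{{stub }}", "{{stub|date=October 2014 }}")
edited = unparse(nodes)
```

A parser that rendered to HTML would be free to drop the comment and collapse the double space; a parser meant for bots cannot, because those differences would show up as unintended changes in the saved edit.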

What is the most significant challenge that mwparserfromhell currently faces, and why?

It’s a difficult, exhausting project that would ideally have multiple people working on it. Development has stalled recently as I've been busy with college, and additional eyes would be useful to point out potential issues or help out with open problems.

What's next for mwparserfromhell? Do you have any other cool projects you'd like to tell us about?

Some wikitext constructs (primarily tables, but also parser functions and #REDIRECTs) aren’t understood by mwparserfromhell, so I would like to implement those. There’s actually an open request to review some code for table support that I've been procrastinating on for a couple months now. Other than that, I have some plans to make it more efficient; mwpfh has some speed issues with ambiguous syntax on large pages.
My copyvio detection tool on Wikimedia Labs (which uses mwparserfromhell, by the way!) has seen a lot of improvements lately, including more accurate detection, more detailed search results, and a fresh new API. If you don't know about it or have only used it in the past, I invite you to give it a spin.


Now introducing ... mobile data

As reported in the Signpost last month, mobile views have not historically been included in the raw page count data provided by the Wikimedia Foundation. As a result, stats.grok.se and the WP:5000 report, on which this report and the WP:TOP25 are based, have lacked that data. This has led to a significant undercount in total page views, as mobile views now account for about 30% of Wikipedia traffic. However, we are pleased to report that the WP:5000 has now been updated to include mobile views, including a column reflecting the percentage of views coming from mobile devices. This week's report is the first using the additional data.

We've noticed two primary effects from the inclusion of mobile view data so far. First, and most obviously, view counts are up. This week's #1, Ebola virus disease, had almost 4.3 million views, the best showing of a #1 article by far since the incredible 9.1 million which Robin Williams received after his death in August. Simply to make the Top 25 this week, it took 484,791 views, a big jump from only 240,000 views last week.

Second, we can also see that the percentage of mobile views an article receives varies by the type of article it is, as well as the source of its popularity. This week's #3, Moose, became popular due to a Reddit thread but only had 26% mobile views. Perhaps that general percentage will prove to hold true over time for Reddit popularity: #6 this week, Age disparity in sexual relationships, was also made popular by a Reddit thread and had 26.5% mobile views.

Meanwhile, this week's #1 (Ebola virus disease) had 54.4% mobile views and #2 Ebola virus hit 64%. Contrast those numbers with this week's #10, Thor Heyerdahl, made popular by a Google Doodle. Only 15.7% of those views were from mobile sources. And Deaths in 2014, an article which often makes the Top 10, was reduced to #23 this week with only 19.9% mobile views. One might suppose that the very lengthy list-like (and sobering) nature of that article may make it less popular to read on the go. We'll continue to review how the inclusion of mobile data affects trends in article popularity; feel free to add your hypotheses to the comments.
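For readers who want to see what the new percentage column implies in absolute terms, here is a rough sketch that splits a total view count into mobile and non-mobile views. The `split_views` helper is hypothetical, and because the percentages quoted in this report are rounded, the resulting counts are approximations rather than official figures.

```python
def split_views(total_views, mobile_percent):
    """Estimate (mobile, non-mobile) views from a total and a rounded percentage."""
    mobile = round(total_views * mobile_percent / 100)
    return mobile, total_views - mobile


# Totals and mobile shares as quoted in this week's report.
ARTICLES = [
    ("Ebola virus disease", 4298499, 54.4),
    ("Moose", 966086, 26.0),
    ("Thor Heyerdahl", 779723, 15.7),
]

for title, total, percent in ARTICLES:
    mobile, other = split_views(total, percent)
    print(f"{title}: about {mobile:,} mobile and {other:,} non-mobile views")
```

On these figures, mobile readers account for more than half of the Ebola virus disease total, while the Moose and Thor Heyerdahl totals are dominated by non-mobile views.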

For the full top 25 list, see WP:TOP25. See this section for an explanation of any exclusions.

For the week of 5–11 October 2014, the ten most popular articles on Wikipedia, as determined from the report of the 5,000 most viewed pages, were:

Rank Article Class Views Image Notes
1 Ebola virus disease B-class 4,298,499
The death on October 8 of Thomas Eric Duncan, the first person to die in the United States from Ebola virus disease, has only continued to increase attention to this subject, which is #1 for the second week in a row.
2 Ebola virus Start-class 998,891
See #1.
3 Moose B-class 966,086
This week Reddit learned that "the Killer Whale is a natural predator of the Moose." The sentence which piqued their curiosity remarked that killer whales "are the moose's only known marine predator as they have been known to prey on them when swimming between islands out of North America's Northwest Coast."
4 American Horror Story: Freak Show Start-class 956,565
The fourth season of the American Horror Story series debuted on 8 October 2014. Series co-creator Ryan Murphy (pictured at left) directed the first episode of the season.
5 Gone Girl (film) C-class 953,715
This 2014 American mystery film starring Ben Affleck and Rosamund Pike (both pictured at left) was released in the United States on October 3.
6 Age disparity in sexual relationships Start-class 864,448
This article owes its popularity to a Reddit thread. That this subject might be a topic of interest is not surprising to anyone who peruses the incredibly long List of films featuring romances of significant age disparity listed in the "see also" section.
7 Annabelle (film) Start-class 853,990
This 2014 American horror film was released on October 3, and stars Annabelle Wallis, Ward Horton, and Alfre Woodard (pictured at left).
8 Ebola virus epidemic in West Africa C-class 826,103
See #1 and #2.
9 Facebook B-class 798,797
A perennially popular article.
10 Thor Heyerdahl C-class 779,723
A Google Doodle honored what would have been the 100th birthday of this Norwegian adventurer known for his 1947 Kon-Tiki expedition.



Signpost reaches the Midwest

Today, it's the turn of WikiProject Ohio to give us an interview probing deep into how they manage to run a project covering one fiftieth of the United States, and into the workings of how they manufacture their successes and other articles. They have gathered a staggering 66 pieces of Featured content, and 164 Good articles. 83 members might sound like a lot of Wikipedians to work on the topics of Ohio, but we selected just three to give us a flavor of what goes on behind their scenes. Our interviewees this week are Vjmlhds, Frank12 and Wikipelli.

Ohio centers of population

What motivated you to join WikiProject Ohio? Do you live or have you ever lived in the state?

  • Vjmlhds: As a lifelong Ohioan, I wanted to make sure that I did all I could to make Ohio-related articles the best they could be.
  • Frank12: I've only lived in Ohio and I was willing to team up with other Ohioans and those with interest in Ohio-related articles to provide useful and accurate content.
  • Wikipelli: My family's history is rooted in Ohio. I have a great interest in the history of Ohio – specifically, the Columbiana County area.

Do you contribute to the projects of any other US states? How would you compare activity at WikiProject Ohio to activity at other state projects?

  • Vjmlhds: No, and I can't really answer the second part, since I'm more focused on this project.
  • Frank12: I have, many of them in other Midwestern states due to my fascination with the region. I can't really say either, but I get the impression that Ohio has a great deal of pride among its residents, who are willing to showcase the great qualities of the state.
  • Wikipelli: I have also been active in Virginia history projects.

Have you contributed to any of the project's 39 featured articles, 17 featured lists, 2 A-class articles, and 164 good articles? Are you currently working on promoting an article to FA or GA status?

  • Vjmlhds: I've done my fair share of work on a couple of articles that meet those standards – The Miz (GA) and Cleveland (FA). And I've tried to get a few others up to that level, but to no avail...but I'm still working on it.
  • Frank12: Yes, but nothing of great addition. If anything I try to add interesting tidbits, but I'm sure I'll contribute a lot if it's of a strong personal interest to me, which I've done here and there with other articles.

In addition to cities, counties, and geographic features, what are some interesting articles covered by the project?

  • Vjmlhds: I like articles about the pro sports teams (especially the ones in Cleveland), as well as Ohio State Buckeyes football, plus I've done work on articles about pro wrestlers from the state (including The Miz as mentioned above).
  • Frank12: I find the geographical articles very interesting as well as sports and college/university articles.
  • Wikipelli: I like the articles relating to the NRHPs.

WikiProject Ohio has a large number of child projects, including ones for Cincinnati, Cleveland, Youngstown, Ohio State Highways, the Cincinnati Reds, Cleveland Browns, Columbus Blue Jackets, Cleveland Cavaliers, and Ohio Wesleyan University. Do you consider this a large project that needs to split off its major cities, or are those projects not doing so well as the statewide project?

  • Vjmlhds: I think things are going along OK as is.
  • Frank12: For now I think they can stay under the Ohio scope, but I wouldn't be surprised if one day the 3 Cs could go off on their own.

What are WikiProject Ohio's most urgent needs? How can a new member help today?

  • Vjmlhds: The biggest thing any current or future member can do is just keep a look-out to make sure things stay up to standard.
  • Frank12: Go after the articles that interest you most or you feel you can contribute to the best, and have fun with it!

Anything else you'd like to add?

  • Vjmlhds: O-H....I-O!
  • Frank12: I'll second that!

Next week, we'll take you on a trip to the orphanage. In the meantime, check out some other lost gems in the archive.
