Wikipedia talk:Wikipedia Signpost/2014-08-20/Op-ed

Discuss this story

Hi Denny, good article and interesting topic.

  • A small note: Maybe you should flip around File:A_new_metric_for_Wikimedia_-_3.png, because the next graph changes left-right direction. Maybe easier to understand.
  • As to your question, "But are these really the primary metrics we should keep an eye on? Is Editor engagement the ultimate goal? I agree that these are very interesting metrics. I am not sure if they answer the question whether we are achieving our mission. But how could an alternative look like?" I think the metrics we don't have and we should keep an eye on are:
    • (A) article quality metrics that scale (are the articles reliable, readable, sourced, grammatically correct, updated, vandalised etc.), for which we have pretty much nothing; and
    • (B) maintenance metrics: Which article types need more maintenance or less maintenance than others (Animal species? Biographies of Living People? Bio of dead people? Company articles? Sports Tournaments? Rivers/Mountains and other geographic features?) How can a stagnating editor base keep up a growing article mass? Which tools allow us to do more effective maintenance and push the boundaries?
I think these are far more important and pressing questions for us than "how many percent of the infinite 'all knowledge' do we cover?" (see User:Emijrp/All human knowledge). And, as you correctly pointed out: "The energy and interest of volunteers cannot be allocated arbitrarily." / "But the question that the movement as a whole faces is how to prioritize the allocation of resources that the movement can actually allocate?" --Atlasowa (talk) 11:42, 19 August 2014 (UTC)
Hi Atlasowa, thank you for the comments and your careful reading. I have changed the direction of the graph, that's a good catch. I agree that article quality and maintenance play a more important (and more immediate) role than many other metrics we are currently gathering. I took the liberty of extending my article based on your comments, and will also add quality and maintenance to the Meta project page. I would say maintenance metrics are a derivative for quality, and quality plays an important role for availability of knowledge -- but I stay with my statement that the ultimate goal is accessibility of knowledge to everyone. --denny vrandečić (talk) 15:37, 19 August 2014 (UTC)
Hi Denny, that was fast! WRT "I would say maintenance metrics are a derivative for quality", now if i look at your quality metrics at meta:Research:Measuring mission success i find this:
      • Quality
        • persistence (implied review)
          • content persistence
          • reverts
          • article survival
You're not measuring article quality at all! You're just implying that if the stuff sticks, it's good ^^ If sticking is quality and "maintenance metrics are a derivative for quality", then... you lost me... :-) Hm, I guess the bot-generated Cebuano and Waray-Waray WP have excellent quality according to these quality metrics? --Atlasowa (talk) 17:46, 19 August 2014 (UTC)
Good point and good examples! Note that the tree is from Aaron, whereas I am answering - there is the root of this inconsistency. And his tree was a first draft in order to have something out to start the conversation on wiki instead of emails - I really appreciate his fast work. But it doesn't mean we agree on every detail. I actually extended the points you referred to in order to reflect how they could start to measure quality - feel free to do so as well. --denny vrandečić (talk) 18:19, 19 August 2014 (UTC)
Hi Denny, OK, I'll try to write about quality metrics and maintenance at meta:Research talk:Measuring mission success. That you link for "collection of works on understandability" to de:Benutzer:Atlasowa/Verständlichkeit is very flattering, but... my notes and lists are an unreadable mess (ironic, I know) and mostly in German. How about a link to http://www.readabilityofwikipedia.com ? "Calculate readability for... Enter the title of any Wikipedia article here to test its readability using the Flesch Reading Ease Score." "Results for: Tokyo Stock Exchange@English Wikipedia The Flesch reading ease score is 55, this means that 64% of the articles on Wikipedia are harder to read than this one." (The outcome of this formula can be interpreted by the following table: 0 - 30 Very difficult; 30 - 50 Difficult; 50 - 60 Fairly difficult; 60 - 70 Standard; 70 - 80 Fairly Easy; 80 - 90 Easy; 90 - 100 Very Easy. When writing for a general public, one should aim for a score between 60 and 70. Academic publications mostly score below 30. Note that the formula only works for English texts. [1]) Best, --Atlasowa (talk) 13:18, 21 August 2014 (UTC)
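For readers who want to see what the Flesch Reading Ease Score quoted above actually computes, here is a minimal sketch. The formula itself (206.835 - 1.015 × words per sentence - 84.6 × syllables per word) is the standard published one; everything else is an assumption made for illustration: the function names are hypothetical, the syllable counter is a crude vowel-group heuristic, and the band boundaries are assigned to the upper band because the quoted table overlaps at its edges. It is not the code behind readabilityofwikipedia.com, which may count sentences and syllables differently.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels.

    This is a heuristic (assumption), not a dictionary lookup.
    """
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing silent "e" as non-syllabic ("make"), but keep "-le" ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease for English text (higher means easier to read)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))


def interpret(score: float) -> str:
    """Map a score to the bands of the table quoted above.

    Assumption: a score exactly on a boundary falls into the upper band.
    """
    bands = [(30, "Very difficult"), (50, "Difficult"), (60, "Fairly difficult"),
             (70, "Standard"), (80, "Fairly Easy"), (90, "Easy")]
    for upper, label in bands:
        if score < upper:
            return label
    return "Very Easy"


if __name__ == "__main__":
    sample = "The cat sat on the mat."
    score = flesch_reading_ease(sample)
    print(round(score, 1), interpret(score))
```

Under these assumptions, the score of 55 reported for Tokyo Stock Exchange falls in the "Fairly difficult" band. Very short, simple texts can score above 100, so the bands are a guide rather than a hard scale.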
I'll link to both, Atlasowa. I didn't mean to be flattering, but it was the most comprehensible list on the issue I could think of at that moment. :) --denny vrandečić (talk) 15:10, 21 August 2014 (UTC)
You both might be interested in Wikipedia:Short popular vital articles which was an attempt to quantify the most important articles which appeared (by their raw size) to be lacking in detail. Presumably the baseline there can be used to measure how fast the entire cohort is growing. Novis Ordo (talk) 02:08, 24 August 2014 (UTC)
That's an excellent list, Novis Ordo, thanks! --Atlasowa (talk) 20:34, 24 August 2014 (UTC)
  • Fascinating big-picture questions. At some point Wikipedia's scope comes into play in its ability to spread knowledge -- pressing for free mobile coverage in developing countries would help and is laudable, but so would teaching everyone English. The sharing of knowledge would be much more efficient if we all read English (bye-bye Alemannic asteroid project), but no one would argue that should be the goal of Wikipedia. How to focus resources is a very important question, even within areas we all agree are in our core scope. Paris Hilton has many dogs (not only one, you ignoramuses!), and it's not easy to get someone interested in that subject to expand Maryam Mirzakhani; people with different knowledge sets need to be encouraged to join the project as metrics of success are defined. --Milowenthasspoken 18:31, 22 August 2014 (UTC)
  • A bit of information that would guide my work as an editor is "which parts of our articles do people read most". Is it mostly the lead? Mostly certain sections? I already direct my work to English Wikipedia's most read medical articles, and would love to see this tool, Wikipedia:WikiProject_Medicine/Popular_pages, appear in other languages. Doc James (talk · contribs · email) (if I write on your page reply on mine) 23:53, 23 August 2014 (UTC)
    • I believe there is some research about which parts get read most, and to what extent, as someone mentioned some figures to me once. All the best: Rich Farmbrough, 02:35, 24 August 2014 (UTC).
Hi Rich! There is some analysis of session length at meta:Research:Mobile_sessions, and amongst the results "... we are able to identify a dataset-specific cutoff point to identify 'sessions' - 430 seconds. This provides a clean breakpoint, and is in line with existing research on session time as applied to Wikipedia." That is ~7 minutes for a Wikipedia article, and that is consistent with similar analysis for blog posts (see links at meta:Research talk:Mobile sessions). --Atlasowa (talk) 20:37, 24 August 2014 (UTC)
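To make the 430-second cutoff concrete, here is a minimal sketch of how such a threshold could be used to split one reader's request timestamps into sessions. It assumes the cutoff is applied as the maximum gap between two consecutive requests; the function name `split_sessions` and the sample timestamps are hypothetical, and this is not the code used in the Mobile sessions research.

```python
from typing import Iterable, List


def split_sessions(timestamps: Iterable[float],
                   cutoff_seconds: float = 430) -> List[List[float]]:
    """Group request timestamps (in seconds) into sessions.

    Assumption: a new session starts whenever the gap to the previous
    request exceeds cutoff_seconds.
    """
    sessions: List[List[float]] = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= cutoff_seconds:
            sessions[-1].append(t)   # still within the current session
        else:
            sessions.append([t])     # gap too large: start a new session
    return sessions


if __name__ == "__main__":
    # Made-up timestamps for illustration only.
    requests = [0, 100, 600, 700, 2000]
    for session in split_sessions(requests):
        print(session)
```

With the made-up timestamps above, the gaps of 500 and 1300 seconds exceed the cutoff, so the five requests fall into three sessions.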
Thanks for that, Atlasowa! All the best: Rich Farmbrough, 21:00, 24 August 2014 (UTC).
  • I enjoyed this rather eccentric approach, and agree with the main thrust, but a serious flaw is the idea of what is "done". Actually much of our "serious" content is very poor, and that it does not change is purely due to lack of editors. Too many Wikipedians just seem to look at the length of articles, and count the number of references. Actually reading them is too often a sobering experience, even without specialist knowledge, and worse if you have it. Much of our core content is unacceptably low in quality, and it is precisely this that tends to get left, because improving it essentially means starting from scratch. Articles that are only half-bad are much more attractive targets, and more likely to attract the small remaining band of editors ready to actually write text. Johnbod (talk) 16:19, 24 August 2014 (UTC)
If it is regarded as eccentric to focus on our vision instead of other measures, then this points to the fact that this article is rather timely :) --denny vrandečić (talk) 16:48, 24 August 2014 (UTC)
It is not the focus (which has long been my own) but the approach to measuring it that I thought eccentric. Johnbod (talk) 17:03, 24 August 2014 (UTC)
Thanks for the clarification. I would very much like to hear alternatives. This proposal is really not more than a first stab. --denny vrandečić (talk) 17:16, 24 August 2014 (UTC)
I would certainly change "done" to something else. If I were doing it I would be tempted to split the black area into two parts, one representing what we have some coverage of, and the other an essentially subjective guess at what is covered with some degree of quality. Unfortunately the article assessment system is of little use for this for the vast majority of articles below Good Article. Johnbod (talk) 14:23, 25 August 2014 (UTC)
Extend your sentence to "a person without internet access has access to none of the world's knowledge through the work of the Wikimedia movement". I did not mean to measure any possible access to knowledge, as I wouldn't know how to even start, but merely to focus on what the Wikimedia movement provides. This is a limitation of the approach, and it could (and maybe should) be extended in that sense, I do not know. If you have ideas on how to estimate access to knowledge outside Wikimedia, please add it to the Meta project page. Thanks for raising the point. --denny vrandečić (talk) 16:48, 24 August 2014 (UTC)
Fair enough, but even then it's not quite true. There are off-line readers that can be preloaded with wiki text. But they might not be common. Deltahedron (talk) 20:40, 24 August 2014 (UTC)
There are also a huge number of printed books that are made from WP content. I doubt that they penetrate the places where people have no Internet access though.
There are certainly a lot of schools in Kenya that have off-line copies of WP. All the best: Rich Farmbrough, 21:00, 24 August 2014 (UTC).

I quite agree! Also wanted to point out a new tool I've built, Quarry, that lets people explore our databases for research purposes in a web-friendly way and share the results. Might be useful for less technically minded Wikimedians who want more power than Wikimetrics :) YuviPanda (talk) 12:55, 25 August 2014 (UTC)

Sweet! Thanks, Yuvipanda! --denny vrandečić (talk) 16:01, 29 August 2014 (UTC)
  • One further point -- related to the one Johnbod made above -- is that longer does not always mean better. Almost always a short but well-written article can be preferable & more useful than a longer but diffuse article -- even if the longer version has more information. -- llywrch (talk) 17:52, 25 August 2014 (UTC)
Absolutely correct. --denny vrandečić (talk) 16:01, 29 August 2014 (UTC)
Quite devastating, I reckon. The file would probably not be accessible anymore. --denny vrandečić (talk) 16:01, 29 August 2014 (UTC)

Thank you for this, this is exactly the kind of discussion we should be having! Jan-Bart de Vreede 217.200.185.43 (talk) 09:05, 27 August 2014 (UTC)

Thanks! I hope you carry it to the relevant places, Jan-Bart! --denny vrandečić (talk) 16:01, 29 August 2014 (UTC)

Don't want to be ultra-pedantic, but you misspelled diphtheria... -- AnonMoos (talk) 04:39, 27 August 2014 (UTC)

Aww. Thanks. In my defense, smarter people than me do it too. Fixed it. --denny vrandečić (talk) 16:01, 29 August 2014 (UTC)