Do you ever wonder where Wikipedia articles come from? With a world of knowledge to represent, it’s a big question. In my role at Wiki Education, I am especially concerned with Wikipedia being an equitable and representative resource. Whether it’s a museum of paintings, a library full of books, or an online encyclopedia, systemic bias is inherent in every collection, and Wikipedia is not immune to it. So when we think about where Wikipedia articles come from, we must also ask: how do we ensure Wikipedia has the articles it needs to be a more representative resource?
With support from the Nielsen Foundation's Data for Good grants program, I have been developing a Wikipedia resource that encourages editors to create articles to improve representation of diverse groups and topics on Wikipedia. We have been inspired by some of the amazing projects that are already working to address this issue on Wikipedia — Women in Red, Art + Feminism, and Black Lunch Table, to name a few. It's our hope that this tool can complement the work of these projects.
Specifically, Women in Red uses Wikidata, the linked data knowledge base that connects all Wikimedia projects, to generate lists of articles that could exist in English Wikipedia, but don't yet. Building on their efforts, we are creating a resource that allows community members to do the same thing, but with a broader scope of demographic variables. In addition to individuals who identify as women, we have constructed pages that list thousands of potential articles based around sexual orientation, nationality, disability status, and ethnicity.
These lists query the other language versions of Wikipedia and pull only the results that don’t have English language articles. From there, community members can select individuals and generate English language versions of the articles. Since these articles exist in other language versions of Wikipedia, the idea is they already pass notability and have references. The article writing process will still take time, but not starting from scratch saves some effort. Check out the lists here.
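For readers curious what such a query might look like, here is a minimal sketch in Python that only builds a SPARQL string for the Wikidata Query Service. It is not the query behind our lists: the helper name `build_missing_article_query`, the template, and the default values are assumptions made for illustration. The Wikidata identifiers in the usage example (P31 "instance of", Q5 "human", P21 "sex or gender", Q6581072 "female") are real, but the overall query shape is just one plausible way to express "has an article on some Wikipedia, but not on the target one".

```python
import re
from string import Template

# Illustrative template only (an assumption, not the project's actual query).
QUERY_TEMPLATE = Template("""\
SELECT ?item ?itemLabel ?sitelinks WHERE {
  ?item wdt:P31 wd:Q5 ;                 # instance of: human
        wdt:$prop wd:$value ;           # the demographic filter
        wikibase:sitelinks ?sitelinks . # number of wiki pages linked to the item
  # Keep items that have an article on at least one Wikipedia ...
  FILTER EXISTS {
    ?other schema:about ?item ;
           schema:isPartOf ?site .
    ?site wikibase:wikiGroup "wikipedia" .
  }
  # ... but no article on the target-language Wikipedia.
  FILTER NOT EXISTS {
    ?article schema:about ?item ;
             schema:isPartOf <https://$lang.wikipedia.org/> .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "$lang,en" . }
}
ORDER BY DESC(?sitelinks)
LIMIT $limit
""")


def build_missing_article_query(prop, value, lang="en", limit=100):
    """Return a SPARQL string; hypothetical helper for illustration."""
    # Basic validation so malformed identifiers never reach the query text.
    if not re.fullmatch(r"P\d+", prop):
        raise ValueError(f"not a Wikidata property ID: {prop!r}")
    if not re.fullmatch(r"Q\d+", value):
        raise ValueError(f"not a Wikidata item ID: {value!r}")
    if not re.fullmatch(r"[a-z]+(-[a-z]+)*", lang):
        raise ValueError(f"not a language code: {lang!r}")
    return QUERY_TEMPLATE.substitute(
        prop=prop, value=value, lang=lang, limit=int(limit)
    )


if __name__ == "__main__":
    # Women (P21 = Q6581072) with no English Wikipedia article.
    print(build_missing_article_query("P21", "Q6581072"))
```

Changing `prop` and `value` swaps the demographic variable, and changing `lang` points the same query at a different language version of Wikipedia. A broad query like this may well time out on the public query service, so in practice it would likely need narrowing.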
I know what you’re thinking: can this get any cooler? And the answer is yes! Wiki Education has been developing and maintaining the Dashboard for the past few years. The Dashboard allows instructors and individuals to create courses that are scoped to a set of students, Wikipedians, edit-a-thon attendees, and so on: basically any set of individuals who want to participate in whatever the course is. Another feature is the ability to frame a course around a list of articles. Using the same query from our resource, anyone using a Dashboard can scope it to one of the lists we’ve developed. The idea here is to encourage Dashboard users to select articles about underrepresented groups or individuals, and write them for English Wikipedia. Follow this link for an example of an article-scoped Dashboard. Heads up: clicking the PSID list will take some time to load, because it is large.
And this, dear readers, is one place where Wikipedia articles come from.
The main idea is that we’re building a tool that encourages community members to write articles to increase the visibility of diverse groups and topics on Wikipedia. We’re doing this using Wikidata, queries, a list tool called Listeria, article scoping on the Dashboard, and the hard work of anyone taking a Dashboard course or attending an event that uses the Dashboard. Although systemic bias and underrepresentation will remain a significant problem on Wikipedia and beyond, we hope this tool can push new and old users alike to edit in a way that helps to improve representation on the platform. As the community and these tools mature, we also hope others can refine and adapt them to their specific needs. An amazing thing about pulling from Wikidata is that users can narrow and expand queries to generate new lists. These lists are configured to improve English Wikipedia, but in a snap they can point to other language versions.
We're still tinkering and ironing out the wrinkles, but we hope you can start taking advantage of it now! Get ready to make some edits.
This post was originally published on the Wiki Education blog on August 31st, 2023.
Discuss this story
Thanks, Will, for this interesting article on your efforts towards overcoming equity problems on Wikipedia and for mentioning Women in Red among the projects which have inspired your work. I was interested to see your lists on Ethnicity and Medical condition but for some reason those on "Sexual orientation", "Nationality" and "Gender" are empty. As for your dashboards, they may well be useful for monitoring progress in the WikiEdu environment but when they are used more widely, for example in connection with editathons, they are not always as accessible as traditional Wikipedia meetup pages. Perhaps somewhere (why not here on The Signpost?) you could offer more detailed explanations on the dashboards and how we can access them as normal Wikipedia users rather than as participants in an educational programme or at an editathon. Unlike Wikipedia pages, many seem to be difficult to access. WP:Dashboard does not appear to cover them.--Ipigott (talk) 10:48, 7 November 2023 (UTC)
"Since these articles exist in other language versions of Wikipedia, the idea is they already pass notability and have references." - I recognize this was originally a blog post meant to build excitement and not a dour "here are all the challenges", but... as someone who's definitely made several articles based on other language Wikipedia's entries, this is an... optimistic... statement that may be setting up some well-meaning volunteers for heartbreak. English Wikipedia, for all its faults, is much better curated than many other language wikis, and they have articles that would be deleted by NPP as overly promotional / sourced to just primary sources. Is there any way for editors to mark entries from this query as unlikely to be suitable for an English Wikipedia article? For example - and for whatever reason, a lot of the initial results from this query are porn stars, even if they thin out later on - Angie Savage has a bunch of articles on other Wikipedias. BUT they were almost surely translations of the English Wikipedia article, which was deleted at Wikipedia:Articles for deletion/Angie Savage. If there's any cleanup to be done, it's in deleting those other versions. One low-hanging fruit would be to exclude articles which once were on enwiki but have since been deleted, although I'm not sure if Wikidata tracks this. (@Will (Wiki Ed):). SnowFire (talk) 23:09, 10 November 2023 (UTC)
One thing I am interested in getting comments about is how best to make sure we are capturing variations in name when listing out potential article subjects. For instance, Sarah Robertson (painter) looks like a pretty decent article that has been in existence since 2014. However, Sarah Margaret Armour Robertson is still listed (as of this writing) as a redlink on both Wikipedia:WikiProject Women in Red/Art and also Wikipedia:WikiProject Women in Red/The World Contest/Missing articles/North America. Now, some of this is going to be inevitable, and obviously we can create a redirect, but what are ideas on how to minimize this going forward? List multiple variations of names on the Women in Red pages? Thanks to all for their contributions and comments. KConWiki (talk) 22:23, 16 November 2023 (UTC)