A couple weeks ago there was an article in Quillette about the English Wikipedia, in the context of our recent fork Justapedia. While there's plenty to discuss about the majority of the article — which can be read about in this month's In the media column — there is one particular rhetorical aside that caught my eye. It's brought up briefly, almost in passing, and the article moves on with its argument. But it's fairly interesting, and I think warrants some closer examination. Here is what it says:
An example of such a case, in which the problems with a Wikipedia article cannot be reduced to left vs. right, concerns the hacktivist “Cyber Anakin.” In 2022, Cyber Anakin launched an attack against Chinese computer networks that included government websites, satellite interfaces, and various industry- and infrastructure-related systems. In retaliation, most of the biographical details were removed from Cyber Anakin’s Wikipedia article, and the Taiwan News reports that this was likely done by employees or sympathizers of Xi Jinping's regime. These removals from the Wikipedia article provoked a further round of attacks against Chinese government websites by the hacker group known as Anonymous.
To an outside observer, removing information from someone’s biography in order to retaliate against a hacker looks like an abuse of Wikipedia. But from the standpoint of Wikipedia’s internal rules, the removals were done by means of (mostly) legitimate editing processes, with the perpetrators arguing extensively on the article’s talk page for why their removals were justified. Perhaps more importantly, the quantity of text posted to justify these removals was so immense that, in the words of one outside commenter, "most editors would walk away in an instant." For these reasons, the removals from Cyber Anakin's Wikipedia article have never been undone. But the removed material has been added back in his Justapedia article, and consequently, as of this writing, the Justapedia article about Cyber Anakin is around four times the Wikipedia article's length.
No.
There are a lot of claims made here, and a lot of facts to address. The accusation goes beyond "big if true" — it would be gargantuan if true — and, as such, it warrants being examined in full detail, rather than rejected out of hand. It should not be dismissed as pure fancy. After all, astroturfing is a real and ubiquitous phenomenon, and astroturfing on the English Wikipedia is attempted on a daily basis by any number of organizations and entities.
I think, however, that when the merits of this situation are examined in full detail, you will agree with me that the insinuation of the Chinese Communist Party being somehow involved with this Wikipedia article is bullshit.
To be more specific:
The article revision linked to with the text "Cyber Anakin's Wikipedia article" is from late November 2023. As best I can tell, the version they are actually referring to (the much-longer one, referred to by the Taiwan News article) is this, from September 2022. It is indeed true that the current revision is much shorter. It's also true that many articles on Wikipedia are made shorter (or longer) on a regular basis. But unlike most websites, the page history on Wikipedia articles is an open record for anyone to inspect. So we can inspect this and try to figure out the real story. First, though, allow me to take a minute to tell you a different story.
As a Wikipedia administrator with over a hundred thousand edits, and over a hundred article creations, I have gotten into all sorts of disagreements with people. But perhaps the greatest frustration in my editing career was this deletion discussion. In it, an extremely useful and detailed 122,174-character-long tabular list of technical specifications for Xilinx field-programmable gate arrays was unceremoniously redirected to a couple paragraphs in the larger article about Xilinx. Look how they massacred my boy. I didn't write the article, but I'd used it, and I gave it my all at the debate. It was me, a consummate Wikipedia nerd (and a handful of outraged hardware engineers) against an opposing contingent of equally consummate Wikipedia nerds. We lost. It made me angry, but the decision was compliant with policy, and part of working on a collaborative project is that sometimes stuff happens that makes you angry. Eventually you have to get over it, which I did.
Note that this single deletion discussion was about 3,200 words long, and it wasn't anywhere near the longest. I've written some software that keeps indices of the largest deletion debates of all time; to give you an idea of the Wikipedian capacity for argumentation, Wikipedia:Articles for deletion/List of bow tie wearers (4th nomination) is 22,271 words long. Talk:Cyber Anakin and its archive page (which contain the entirety of the talk page arguments mentioned in the Quillette piece) come out to about 17,708 words.
Sure, this is a lot. It's a whopping 0.79 bow-tie-wearers-fourth-nominations' worth.
Anyway, one of my adversaries during the Xilinx FPGA deletion debate was Drmies. He is, incidentally, one of the editors who removed content from the Cyber Anakin article in November 2022, and therefore, I suppose, one of the "employees or sympathizers of Xi Jinping's regime". Is this plausible?
Well, let's see: Drmies — who really is a Dr. — is an administrator, for which he had to submit to a grueling public seven-day job interview that was being voted on the entire time (and passed with 205 in favor, 2 opposed and 3 neutral). In the twelve years since then he's also been given the checkuser and oversight usergroups. Being an oversighter involves access to information so sensitive even administrators can't see it (e.g. they are the people who remove shock videos of executions and naked children), and for which you are required to sign a non-disclosure agreement with the Wikimedia Foundation. Since 2007 he's made 378,748 edits.
Just as a thought exercise, try to imagine you are a college professor, and you're approached by a foreign spymaster, who offers you a mission: to edit an encyclopedia in your free time, carrying out research, writing articles, fighting vandalism, making hundreds of thousands of individual edits, debating the finer points of policy, writing dozens of paragraphs arguing with engineers about field-programmable gate arrays, and signing legally binding documents to achieve a position of authority and prestige on said encyclopedia — to do all of this for sixteen years — as a ploy so that one day you can remove a couple paragraphs from an article about a hacktivist. How much money do you think you'd ask for? How many millions of dollars do you think Xi Jinping's budget is for each individual English Wikipedia article? My guess is not enough for this to be a viable strategy.
Anyway, you can look at the Xtools statistics for the article and see for yourself what the deal is on everyone else. Sideswipe9th, a heavy editor of the article, has often removed material from it, and has also made 9,996 edits over the last few years. They edit a lot of political stuff but they've never been blocked, something which seems fairly difficult to do if you're a saboteur. Jayen466, one of the editors who's argued for inclusion of material in the article, is not only a highly experienced user, but a former editor-in-chief and regular contributor to the Signpost; one would hope he'd be capable of noticing and saying something if he found himself surrounded by psyops agents.
Of course, it's impossible to know who people are in real life without instituting rather intrusive measures that destroy anonymity — something which would bode quite poorly for our editors and readers who live in, say, tin-pot dictatorships where all remotely political Internet activity is monitored and official arbiters of truth given central registries by which to control speech.
Ultimately, it's impossible to say for sure that none of the well-established editors arguing over that article were on the PLA payroll. Or that I'm not, for that matter. The same is true, of course, of the milkman, the firefighter, or the thinkpiece writer — can any empirical knowledge truly be known? — well, no. But some things are just not very likely to be the case, and it's just not very likely that thousands of volunteers who can't even agree on the notability of field-programmable gate array datasheets would be able to carry out a coordinated decades-long operation (after all, unmasking your interlocutor as an international psyops agent is a great way to win the argument).
You may wonder why I am getting so bent out of shape about the accusation of paid editing. Surely this stuff happens all the time. Yes and no — we're volunteers, and it takes a lot of time to track down people who are up to no good, and there sure are a lot of them. But we have a small army of volunteers who sniff out sockpuppet farms and astroturf operations. They are pretty good at it, and something like this would be a gigantic ordeal. The Signpost has reported on dozens of cases of influence operations on Wikipedia getting busted.
While we're on the topic of unmasking strange behavior, a couple other things may be worth mentioning about the history of the article. Primarily, one editor — Bugmenot123123123 — created both Cyber Anakin and the since-deleted page 2016 KM.RU and Nival Networks data breaches. Bugmenot123123123 doggedly (and unsuccessfully) defended both pages at their respective deletion nominations in late 2016 (here and here) — and was eventually blocked for disruptive editing. Like I said before, due to our principles of anonymity, it's difficult to know exactly who someone is in real life. The jury is still out on who this is: but they've been remarkably consistent in their agenda over the better part of a decade.
Throughout their tenure, they were persistent in advocating for Cyber Anakin to have an article, for the article to be retained, and for the article to be expanded. In fact, they were so dedicated to championing the cause of Cyber Anakin that, even after their block, they operated several sockpuppet accounts — including Mdikici4001, Mamasanju, and Wizzakk — all of whom were fixated on recreating the article. All of whom, I should note, were rather easily detected and their efforts stymied: it is pretty obvious when five brand-new accounts suddenly try to create articles about the same random hacktivist over and over. This is not the first time someone has tried to do this, and we're not idiots.
It is true that, despite all the sockpuppetry and abuse, Cyber Anakin has a page now. We're not idiots, but we're not Inspector Javert either, and we don't punish people simply because they have aggressive fanboys (or, for that matter, if they are the aggressive fanboys). The article was put to a second deletion nomination last November, at which it was concluded that there were enough independent, third-party sources to be able to write a neutral, accurate article. I mean, who knows: someone could nominate it again and maybe it would get deleted. Maybe it should. Maybe it will. Or maybe not. Part of working on a collaborative project is that sometimes stuff happens, and then other stuff happens.
But back to Bugmenot123123123.
There are many sockfarms — and I mean hundreds of farms and thousands of socks — with investigation casepages. But there are comparatively fewer long-term abuse pages; these are a distinction reserved for people whose abuse of the project is so persistent and relentless that it's necessary to keep tabs on their modus operandi (like this guy, whose LTA page's "see also" section includes a link to the "California" section of our cyberstalking legislation article — draw your own conclusions from that).
Bugmenot123123123 has been such a giant pain in the ass, for so many years, and in so many diverse ways, that they have a long-term abuse page of their very own:
BMN123 canvasses extensively onwiki, offwiki, and crosswiki, changing proxies frequently and using external rather than normal links (to avoid backlinking) in an attempt to conceal its extent, often seeding in random editors among those targeted as likely supporters.
Edits to Cyber Anakin focus on maintaining the sockmaster's preferred narrative. They will edit-war (Special:Diff/1113301828, Special:Diff/1112087106) and leave long rambling talk page posts (Special:Diff/1113134519, Special:Diff/1112288532) in effort to restore their preferred version when it is disturbed, occasionally leaving warning templates on the talk pages of people who revert them (Special:Diff/1112090413).
Edits on other Anonymous area pages attempt to spam mentions of Cyber Anakin or incidents associated with Cyber Anakin everywhere (Special:Diff/1082332024, Special:Diff/757938463) subsequently edit-warring to retain them (Special:Diff/1117016182 Special:Diff/757940419). In an unblock request they outright stated their intent to spam Cyber Anakin across many pages (Special:Permalink/1117032979#Request_to_downgrade_block_to_partial_block_of_some_pages).
[...]
It is possible that more than one editor is responsible. They've directly claimed to have hired others to edit [1] and have posted on and off-wiki attempting to form a dirty tricks cabal (Special:Diff/1113171407). Regardless policy is clear that when there is uncertainty whether a party is one user with sockpuppets or several users with similar editing habits they may be treated as one user with sockpuppets so the detail is not particularly important.
They've also shown an ability to trick journalists from some marginally reliable sources to defame Wikipedia editors and subsequently post the ultimately self-sourced statements in mainspace as part of their harassment campaign.
Man, that sure would be embarrassing.
The Italian Court of Audit has publicly opposed a recent decision by the Ministry of Culture, led by Gennaro Sangiuliano, to establish minimum fees for the production and publication of digital reproductions of cultural heritage, as recently reported by Wikimedia Italia, as well as by several national media outlets (in Italian; the latter two links are behind a paywall).
As written by Italian lawyers Deborah De Angelis and Giuditta Giardini for Communia last July, in Italy the so-called Cultural Heritage and Landscape Code (CCHL) has been in force since 2004; basically, it was intended to "support the role of cultural heritage institutions in sustainable economic and social development", granting them, among other privileges, discretion to choose whether to make art works such as paintings, frescoes and statues available in the public domain, through the attribution of a Creative Commons licence or, at least, the digital reproduction of images.
However, in recent years some state-owned institutions have taken advantage of this interpretation of the CCHL to start lawsuits against commercial uses of works by Italian artists which, theoretically, should already be in the public domain – for example, Michelangelo’s David, Leonardo da Vinci’s Vitruvian Man and Sandro Botticelli’s The Birth of Venus. As explained by De Angelis and Giardini, these initiatives are likely at odds with Article 14 of the Directive on Copyright in the Digital Single Market, adopted by the European Union in 2019 and transposed into domestic law by Italy two years later.
In April of this year, the Italian Ministry of Culture caused even more headaches by publishing "guidelines" for the introduction of minimum fees for the commercial use of digital reproductions of state-owned cultural heritage, including works in the public domain (see previous coverage on Diff and the Signpost). The decree, which was harshly criticized by numerous experts and researchers, contradicts the principles expressed both in the CCHL itself – more specifically, Articles 1 and 6 – and in the Faro Convention (which Italy signed and ratified), both of which stress the importance of full freedom of access to and sharing of reproductions of cultural heritage in the public domain. If officially implemented, the measures included in the MoC’s decree might not only impoverish Wikimedia’s projects, but also damage activities of research and promotion of Italian culture.
Now, though, the Italian Court of Audit has also expressed concern about the ministry's bill in a report named The results of monitoring activities done in the year 2022 and the consequential measures adopted by administrations. In its “Review of consequential measures adopted by administrations” – starting from page 157 of the report – the court gives credit to the MoC’s offices for their “important effort in digitization”, as per the goals set by the Digital Library and the National Recovery and Resilience Plan, while noting how the introduction of the aforementioned minimum fees looks to be “against [this] trend”, especially with regard to the benefits of open access:
For some time now, Open Access has proven to be a powerful multiplier of wealth not only for the cultural institutions themselves [...], but also in terms of increasing the GDP, and is therefore considered a strategic asset for the social, cultural and economic development of the [European] Union’s member countries. [...] The introduction of such a "fee schedule" seems, moreover, to take into account neither the operative peculiarities of the web, nor the potential damage to the community, which should be measured in terms of [...] lost opportunities, as well; therefore, [the decree] also stands in obvious contrast to the clear indications coming from the National Digitization Plan (PND) of cultural heritage.
What’s more, Avvenire and Il Sole 24 Ore (see the links cited at the top of this story) reported that the Court of Audit had already endorsed the free circulation of digital reproductions of cultural heritage in the public domain in an October 2022 document, which included the following quote:
The radical transformations digital [devices and services] have produced in our society encourage [...] the abandonment of traditional "proprietary" paradigms, in favor of a more democratic, inclusive and horizontal vision of cultural heritage. Forms of economic return based on the "sale" of the single image appear anachronistic and largely outdated since, moreover, they are patently uneconomic. There is evidence that, in some cases, the ratio of costs incurred in managing the collection service to the actual revenue generated produces a negative balance.
The question is: will the MoC get the memo this time around? – O
On December 19, 2023, Stas Kozlovsky, the Executive Director of Wikimedia Russia, posted a message on a community page in the Russian Wikipedia, saying that after almost 25 years of work as an associate professor at Moscow State University, he had recently been summoned by the vice-rector and told there was "reliable information" about the imminent intention by Russian authorities to declare him a "foreign agent". He said he was allowed to choose either being dismissed "for absenteeism" or resigning on his own, and eventually "chose the latter" option.
Kozlovsky proceeded to call an emergency meeting of Wikimedia Russia, where he shared this news, and a general decision was taken to close the organization; the liquidation process would take several months.
Stas had taken over as head of Wikimedia Russia earlier this year, after the previous director of the organization, Vladimir V. Medeyko, was indefinitely banned for establishing a government-approved fork of the Russian Wikipedia (see previous Signpost coverage). See the In focus column of this issue for more details on Wikimedia Russia's shut-down, as well as reports from The Moscow Times and Radio Free Europe. – AK, O
Following the end of the national contests in September and October of this year, the 2023 edition of Wiki Loves Monuments has come to its crucial final phase, and it’s now waiting for the international winners to be publicly announced. Historically, the annual photographic competition organized worldwide by the Wikipedia community has involved dozens of countries across the globe and gifted Wikimedia projects with hundreds of thousands of photos each year; 2023 was no exception, as users from at least 46 different nations uploaded more than 217,000 images.[1] 2,343 photos became quality images, 46 were assessed as featured pictures, and two received the valued image treatment.
Five countries made their debut in this year’s competition: Egypt, Togo, Uzbekistan, Zambia and the Dutch special municipality of Sint Eustatius. On the other hand, four nations – Belgium, Georgia, Greece and the United States – came back to the party after more or less prolonged hiatuses.
Taking a look at the statistics,[2] Italy recorded the highest number of uploaded images by a mile, with 52,004 contributions; to put it in context, that’s almost twice as many as second-placed Russia (28,761), and almost thrice as many as third-placed Ukraine (19,641), as well as a huge jump from the previous performances of the Bel Paese itself. Brazil was the highest-ranked non-European country on the list, coming in fifth place with 13,202 contributions, right above the United Kingdom (12,851); elsewhere, India led the Asian continent from 10th place (5,754 images), while Nigeria was the first of the African countries in 17th place (2,800), slightly outperforming the US (2,513).
Perhaps unsurprisingly, Italy also topped the chart for the total number of uploaders (946, 565 of whom signed up to Commons during the competition), with Russia (557, 406 of whom registered) and Iran (459, 391 of whom registered) following at a moderate distance. It is surprising, however, to see Uganda boast the highest percentage of images that were used in the wikis after being uploaded (about 85%), a feat Egypt (61%) and Malta (44%) are not even close to, despite being on the podium.
Most of the countries involved in Wiki Loves Monuments 2023 have already elected their national winners and/or selected their ten best submissions for the international stage: you can see a comprehensive gallery here.[3] Just like last year, Italy's committee has once again stood out for their decision to “kill two birds with one stone”, by hosting a traditional contest alongside one that was centered around a specific category of monuments, which in this case turned out to be religious buildings; you can see the winners and finalists of the Italian contest in detail here or here. Now, all we have to do is wait for the announcement of the international winners: let’s see which pictures will make our jaws drop this year! – O
Forks are everywhere. If you've got a barn or a stable, there should be a fork inside it to clean out the muck. There are forks in the road, on the internet, on the chess board, on antelopes, in rivers, in beards and tongues, in cryptocurrencies, and almost everybody has forks in their drawers. Maybe we should use chopsticks instead. – S
The Ledger's headline (paywalled) gives the main news: the Florida newspaper is asking for funds from its readers to support Wikipedia. But the bad news is that The Ledger needs to charge its readers to pay its bills. Otherwise, their readers will get cut off by the paywall. The good news is that they will give you "unlimited digital access (costing) $1 for the first 6 months". Everybody, it seems, needs a little green to support their publishing. The better news is that Wikipedia is still free for all readers and has no plans to change that. This reporter has no objections to you donating $2.75 or $25.00 or whatever amount you would prefer. It is not that the Wikimedia Foundation needs your cash now to forestall closing down this website next week, next month, or even next year, but it is just good planning for a non-profit organization to build a solid base of small donors who can ensure that this site will be around for a long time to come. The best news is that The Signpost will always be free – just as it has been for almost 19 years – so long as Wikipedia keeps publishing. And to return the plug, Signpost readers should feel free to consider paying a dollar for six months of The Ledger. – S
In his ever-informative column in Slate, Stephen Harrison explains in detail why editors from WikiProject Highways created a new website forking Wikipedia's road articles. (We note that The Signpost scooped him on this story.)
In his usual style, Harrison breaks the story into an intriguing introduction, and several tines accompanied by quotes from participants and analysis of Wikipedia's policies and guidelines. In this particular case, he grabs you in the intro with "Wikipedia, road infrastructure, and drama—one of these things doesn’t sound like the other" and a mention of a video that "spills the tea." He then focuses on an editor, identified only as Ben (or bmacs001), and the tines include the difference between editors who are roadgeeks and railfans, with a brief note on possible cultural differences between American and European railfans.
The Wiki-rules discussed include notability, reliable sources, pseudoscience, and no original research.
Of course, no newspaper story is ever perfect: Harrison might have emphasized the fact that the fork has enjoyed a fairly successful start, or that there are no rules against forking Wikipedia (as long as you give proper attribution). Or that there are no prohibitions on users editing both Wikipedia and the fork, and few on importing text from the fork into Wikipedia itself. And he certainly should have mentioned that the word "fork" is likely an inherently funny word. – S
In an article for the Australia-based online magazine Quillette, Shuichi Tezuka raises some pointed objections to the way the Wikipedia community handles disputes over coverage of contentious material; for example, he expresses concern about "cognitive distortions" that are perpetuated "by reducing the population of people who raise [objections]... as these users have either quit Wikipedia or been permanently blocked from editing". Tezuka mentions the famous "somewhat-viral tweet" of last October and related concerns about WMF spending (see previous Signpost coverage), and concludes that the newly formed fork Justapedia (which recently sparked a discussion on the administrators' noticeboard) is necessary to solve these problems, stating: "the need for such a competitor [to Wikipedia] is stronger now than it has been in past years, due to several recent controversies revolving around the manipulation and/or politicization of Wikipedia, along with a widespread perception that Wikipedia has not done enough to prevent this type of problem." The founder of Justapedia, user Atsme, wrote an op-ed expressing some of the same concerns for the Signpost back in 2020. – B
At the policy village pump: what's the deal with media companies uploading drawings and videos of highly recognizable characters under free licenses? Is it a proper release? Do they have the authority to make a proper release? Would it hold up in court? Who knows. One thing's for sure: in one week, the mouse will be freed from his prison. — J
On December 6, a discussion was opened at Commons' village pump regarding the proper tagging and use of pictures created using image models (e.g. Midjourney, DALL-E, Stable Diffusion and friends). Should they be permitted? Labeled? Forbidden? There's a litany of opinions. At the center of them is the newly-congealing Commons:AI-generated media. — J
Following a discussion on the Arabic Wikipedia (archive alt), the wiki took several actions in response to the war in Gaza. The actions, as summarized by Arabic Wikipedia checkuser and administrator Dr-Taher, are as follows:
The discussion began when Mervat posted a message to her fellow Arabic Wikimedians, encouraging the wiki to put out a banner message in solidarity with Palestinian Arabs and calling for an end to the war. As conversation developed, محمد أحمد عبد الفتاح lamented the state of the war's coverage on the Arabic Wikipedia, stating that it relied too much upon English sources and that several key articles were too short, and arguing that some changes should be made to encourage the creation of content related to the Palestinian cause and to encourage advocacy on its behalf.
Following initial discussion, a consensus was obtained on the wiki to issue a statement of solidarity. Dr-Taher made a formal proposal to take several actions in support of Palestinians, including a blackout of the Arabic Wikipedia's main page.
This is the proposal above, and I ask colleagues who are experts in images to choose a suitable image that highlights the Palestinian flag. The announcement will begin with a blackout day for the Wikipedia main page, on which nothing but the announcement will appear, and likewise for Wikipedia's pages on social media. After that, the page will return to how it was, with the announcement remaining at the top of the page and the logo changed to the colors of the Palestinian flag until further notice. The administrators will begin organizing a contest to develop content related to the war and to the Palestinian cause as a whole.
— Dr-Taher
Later on, حبيشان suggested that the logo of the Arabic Wikipedia be changed to bear the colors of the Palestinian flag. The user compared the proposal to the decision by the community of the Georgian Wikipedia to change their logos to bear the colors of the Ukrainian flag following Russia's 2022 invasion of Ukraine.
Throughout the discussion, there was near-unanimous support for a statement of support for the Palestinian people, the creation of new content related to the conflict, the blacking out of the main page, and the changing of the logo. A broader ban on editing on 23 December was also implemented; an interface administrator installed a fork of Wikibreak enforcer into the site's Common.js file (diff, archive alt). It is not clear to The Signpost which interface administrator made the edit, as the username of the editor who performed the edit was subject to revision deletion.
Update: NickK filed a Steward request (archive) on Meta immediately after the Wikibreak enforcer came online, requesting that the enforcer be removed on the basis that unsuspecting logged-in editors who visit the Arabic Wikipedia would be logged out without adequate warning. After a quick discussion produced a unanimous view that the protest should not affect logged-in editors on other projects, the enforcer was removed.
On December 19, 2023, the director of Wikimedia Russia, Stanislav Kozlovsky, made several important statements about his forced resignation from his job at the Moscow State University and the dissolution of the WMF's Russian branch.
Kozlovsky had been working at the Moscow State University for 25 years, where he most recently served as a Candidate of Psychological Sciences and Associate Professor at the Faculty of Psychology. In December 2023, he was at the University's branch in Baku, Azerbaijan, giving lectures on psychophysiology to local students, when he was "unexpectedly" forced to interrupt the course, having been called to Moscow by order of the vice-rector of the MSU due to "operational necessity".
On December 18, 2023, during a meeting at the dean's office of the MSU's Faculty of Psychology, Kozlovsky was told that there existed "reliable information" about his inclusion in the list of suspected foreign agents by Russian authorities [since lists of foreign agents are updated on Fridays, his inclusion on the list was expected to be made public on December 22, 2023]. According to Kozlovsky himself, he was offered two options by the MSU's board: either be fired for "absenteeism", or to resign "at his own request"; having been denied further time to think about his future at the university, he ultimately "chose the latter [option]". Later on the same day, Kozlovsky removed information about his place of work from his Wikipedia user page.
However, TASS later reported about a press statement by the MSU's Faculty of Psychology itself regarding the events associated with Kozlovsky’s dismissal, in which the office denied his version of the facts, while also claiming that neither the university, nor the faculty knew anything "on the inclusion of S. A. Kozlovsky in any lists".
On the evening of December 18, Kozlovsky called an emergency meeting of Wikimedia RU's NP members, where he reported his forced resignation from his job at the MSU. As he later said in an interview with RBC, the Wikimedia RU administrators agreed that "it [was] impossible to work in such conditions" and "decided to dissolve the organization", although the "closing formalities [would] take several months". After this was reported on the Russian Wikipedia's Village Pump, users expressed understanding, and many offered words of support.
At 07:38, December 19, 2023 (UTC), Kozlovsky posted on the Russian Wikipedia news forum with a story about the events described. He also gave several interviews to Russian media, including TASS, RBC, RTVI, and Vzglyad. He said that Wikimedia RU has never been responsible for Wikipedia. Instead, it supported Wikiprojects in Russian (those, along with Wikipedia, include Wiktionary, Wikinews, Wikisource, and others). It prepared textbooks and offered online courses on editing these projects, organized conferences, seminars, and lectures, and worked with copyright holders to facilitate transfer of works to Wikipedia under free licenses. According to him, in recent years, "everyone has become afraid" of dealing with Wikimedia RU, although there has been no obvious pressure on the organization until now.
In an interview with another publication, the business newspaper Vzglyad, Kozlovsky said that he does not intend to leave Russia: "I do not have anywhere to go."
Kozlovsky was also unsure what exactly the reason was for his possible inclusion on the list of foreign agents. Since he had been giving lectures in Baku, Kozlovsky joked: "Maybe they want to recognize me as a foreign agent of Azerbaijan? I don't know".
In addition, Kozlovsky recalled that there has been discussion about blocking Wikipedia for more than ten years, but it has never happened: "It [the blocking] could happen any day, but it doesn’t happen. I hope this never happens."
"When we’re at night, the white skin of Wikipedia dazzles us and it’s very uncomfortable. I suggest a night mode switch for users or at least a darker color. Also available for the mobile version." (VictorPines, 2017)
"Some kind of toggleable dark or night-mode. It would be most accessible as a feature for everyone and not just a new skin for logged-in users." (Premeditated Chaos, 2018)
"Colors of Wikimedia projects are white or near white, which on long view time causes damage to eyes, and consumes more energy on the laptop." (David L, 2021)
"Please add dark mode!!" (Crenshire, 2023)
The Wikimedia Foundation has seen many requests like these. Dark mode is available in the Wikipedia mobile apps, but still not in the web browser. It’s been a common request from editors in the Community Wishlist Surveys and during the rollouts of the Vector 2022 skin, with hundreds of comments! We would like to thank you for all of these.
Some time ago, a few Foundation staff members, Volker, Alex, Carolyn, and MusikAnimal, built a dark mode script as an experiment. It has become a popular gadget across wikis. But until this year, making dark mode a regular part of the interface was not possible. Now, with help from communities, we are finally ready to work on this feature! Continue reading to learn about the benefits of dark mode, what made it possible, and how to get involved.
Dark mode improves accessibility. The primary benefit is that it reduces eye strain. When we’re in a long reading or editing session, particularly when it’s dark around us, the contrast between a bright screen and the surrounding darkness can cause discomfort. Dark mode mitigates this by giving us a darker background with light text, reducing glare and minimizing eye fatigue. This feature is especially helpful for night-time readers or readers who spend lots of time on their devices.
Many readers and editors favor dark mode. The softer, darker hues can be less harsh on the eyes and create a more relaxed reading environment, enhancing the reading experience.
In the past, it was not possible to change our web interface based on the preferences of logged-out users. These users couldn’t set a preferred page density, change the font size, or set a dark mode. Also, the MediaWiki skin and design architecture made it difficult to maintain two color schemes (light and dark). It was necessary to improve these three facets first.
With this system in place, we could begin planning the Accessibility for reading project. This is our response to users’ need to read the wikis comfortably and to adjust the settings. In the first step, logged-in and logged-out users will be able to select different font sizes and text density. Dark mode will be next.
Editors control content which includes templates: amboxes, infoboxes, navboxes, as well as bitmaps, timelines, tables, and more. Some of those, like weather and sports tables, use colors in a meaningful, or semantic, way. Simply inverting these colors would immediately lose their meaning. We need to find other options.
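To illustrate why simple inversion loses meaning, here is a minimal Python sketch. The `invert` helper and the two color values are illustrative assumptions, not code from MediaWiki or from any wiki template; the sketch just flips each RGB channel, the way a naive dark-mode filter might.

```python
def invert(rgb):
    """Invert each 8-bit channel of an (r, g, b) color."""
    r, g, b = rgb
    return (255 - r, 255 - g, 255 - b)


# Hypothetical semantic colors for a weather table (assumed values).
HOT = (255, 0, 0)    # red stands for "hot"
COLD = (0, 0, 255)   # blue stands for "cold"

inverted_hot = invert(HOT)    # (0, 255, 255), a cyan
inverted_cold = invert(COLD)  # (255, 255, 0), a yellow
```

After inversion, the "hot" cell is cyan and the "cold" cell is yellow, so the warm/cool association that readers rely on no longer holds.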
Whatever technical approach we choose, we will coordinate with editors. We may build different solutions for big and small communities. In the coming weeks, we will reach out with specific questions and ideas.
We would like to start gradually, with a limited number of communities and wikis. First, the dark mode would be a beta feature. As such, it would only be available for logged-in users who decide to enable it. Any logged-in user will have an opportunity to test alongside us as we build out the final version.
We will talk to interface admins, template and module maintainers, and editors interested in making the wikis easier to read for everyone. Together with them, we would like to work on recommendations for making pages more friendly to dark mode. We will also help them adjust the current code on the wikis. When enough pages become dark-mode friendly, we will roll dark mode out for logged-out users. (On a side note, we aren’t sure how many pages are enough. We will ask about that, too!)
How do you feel about all this? Write on our project talk page. Be sure to subscribe to the Web team newsletter to never miss an update from us. Thank you! —OV, SG, JR (WMF)
A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
This paper[1] (by five graduate students at Stanford University's computer science department and Monica S. Lam as last author) sets out to show that
While large language models (LLMs) can answer many questions correctly, they can also hallucinate and give wrong answers. Wikidata, with its over 12 billion facts, can be used to ground LLMs to improve their factuality.
To do this, the paper "presents WikiSP, a few-shot sequence-to-sequence semantic parser for Wikidata that translates a user query, along with results from an entity linker, directly into SPARQL queries [to retrieve information from Wikidata]." It is obtained by fine-tuning one of Facebook/Meta's LLaMA 1 large language models.
For example, the user question "What year did giants win the world series?" is supposed to be converted into the following query: SELECT DISTINCT ?x WHERE {?y wdt:sports_season_of_league_or_competition wd:Q265538; wdt:winner wd:Q308966; wdt:point_in_time ?x. }
The paper uses a modified SPARQL syntax that replaces numerical property IDs (here, P3450) with their English-language label (here, "sports season of league or competition"). The authors motivate this choice by observing that "While zero-shot LLMs [e.g. ChatGPT] can generate SPARQL queries for the easiest and most common questions, they do not know all the PIDs and QIDs [property and item IDs in Wikidata], and nor is it possible to include them in a prompt."
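A label-based query like the one quoted above has to be mapped back to numeric property IDs before a standard Wikidata SPARQL endpoint can run it. The following is a rough Python sketch of that mapping, not the paper's actual code. The `labels_to_pids` helper and the lookup table are hypothetical; only P3450 is given in the text, while P1346 ("winner") and P585 ("point in time") are filled in here as assumptions.

```python
import re

# Label -> property ID. P3450 comes from the review text above;
# the other two IDs are assumptions added for this illustration.
LABEL_TO_PID = {
    "sports_season_of_league_or_competition": "P3450",
    "winner": "P1346",
    "point_in_time": "P585",
}


def labels_to_pids(query: str) -> str:
    """Replace every wdt:<label> with wdt:<PID>.

    Raises KeyError for a label that is missing from the table.
    """
    return re.sub(
        r"wdt:(\w+)",
        lambda m: "wdt:" + LABEL_TO_PID[m.group(1)],
        query,
    )


labelled = (
    "SELECT DISTINCT ?x WHERE {?y "
    "wdt:sports_season_of_league_or_competition wd:Q265538; "
    "wdt:winner wd:Q308966; wdt:point_in_time ?x. }"
)
standard = labels_to_pids(labelled)
```

Item IDs such as `wd:Q265538` are left untouched, since the pattern only matches the `wdt:` prefix.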
To evaluate the performance of "WikiSP", and as a second contribution of the paper, the authors present
[...] WikiWebQuestions, a high-quality question answering benchmark for Wikidata. Ported over from WebQuestions for Freebase, it consists of real-world data with SPARQL annotation. [...]
Despite being the most popular large knowledge base for a long time, existing benchmarks on Wikidata with labeled SPARQL queries are unfortunately either small or of low quality. On the other hand, benchmarks over the deprecated Freebase still dominate the KBQA research with better-quality data.
Using this new benchmark, "Our experimental results demonstrate the effectiveness of [WikiSP], establishing a strong baseline of 76% and 65% answer accuracy in the dev and test sets of WikiWebQuestions, respectively." However, the paper's "Limitations" section hints that despite the impressive "12 billion facts" factoid that the paper opens with, Wikidata's coverage may be too limited to answer most user questions in a satisfying manner:
Even though knowledge bases are an important source of facts, a large portion of the knowledge available in digital form (e.g. Wikipedia, news articles, etc.), is not organized into knowledge bases. As such, the results of this paper can be considered complementary to the larger body of fact-checking research based on free text.
To address this weakness, the authors combine this Wikidata-based setup with a standard LLM that provides the answer if the Wikidata query fails to return a result. They state that
By pairing our semantic parser with GPT-3, we combine verifiable results with qualified GPT-3 guesses to provide useful answers to 96% of the questions in dev.
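The fallback arrangement described here can be pictured with a short Python sketch. It is an illustration under assumptions, not the authors' implementation: `answer`, `query_wikidata` and `ask_llm` are hypothetical names, and the two callables stand in for the semantic parser plus Wikidata lookup and for the GPT-3 call respectively.

```python
from typing import Callable, List, Tuple


def answer(
    question: str,
    query_wikidata: Callable[[str], List[str]],
    ask_llm: Callable[[str], str],
) -> Tuple[List[str], bool]:
    """Return (answers, verified).

    verified is True only when the answers came from Wikidata;
    an LLM fallback is returned as an unverified guess.
    """
    results = query_wikidata(question)
    if results:
        return results, True
    return [ask_llm(question)], False


# Stub callables for demonstration only (hypothetical data).
hit = answer("q1", lambda q: ["2014"], lambda q: "guess")
miss = answer("q2", lambda q: [], lambda q: "guess")
```

Keeping the `verified` flag separate from the answer lets an interface show readers which answers are backed by a Wikidata query and which are only a model's guess.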
Data and evaluation code from the paper have been released in a GitHub repo, where the authors state that "We are now working on releasing fine-tuned models."
The paper's endeavour bears some similarity to a paper authored by a different team of Stanford graduate students with professor Lam that sought to use Wikipedia (rather than Wikidata) to reduce LLM hallucinations; see the review in our July issue: "Wikipedia-based LLM chatbot 'outperforms all baselines' regarding factual accuracy".
Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.
From the abstract:[2]
"In this work, we explore the use of Large Language Models (LLMs) for knowledge engineering tasks in the context of the ISWC 2023 LM-KBC Challenge. For this task, given subject and relation pairs sourced from Wikidata, we utilize pre-trained LLMs to produce the relevant objects in string format and link them to their respective Wikidata QIDs. [...] The method achieved a macro-averaged F1-score of 0.701 across the properties, with the scores varying from 1.00 to 0.328. These results demonstrate that the knowledge of LLMs varies significantly depending on the domain and that further experimentation is required to determine the circumstances under which LLMs can be used for automatic Knowledge Base (e.g., Wikidata) completion and correction. The investigation of the results also suggests the promising contribution of LLMs in collaborative knowledge engineering. LLMKE won Track 2 of the challenge."
From the abstract:[3]
"Knowledge bases such as WikiData provide large-scale, high-quality representations of inferential semantics and world knowledge. We show that large language models learn to organize concepts in ways that are strikingly similar to how concepts are organized in such knowledge bases. Knowledge bases model collective, institutional knowledge, and large language models seem to induce such knowledge from raw text. We show that bigger and better models exhibit more human-like concept organization, across four families of language models and three knowledge graph embeddings."
From the abstract:[4]
[...] we explore methods to make better use of the multilingual annotation and language agnostic property of KG [knowledge graph] triples, and present novel knowledge based multilingual language models (KMLMs) trained directly on the knowledge triples. We first generate a large amount of multilingual synthetic sentences using the Wikidata KG triples. Then based on the intra- and inter-sentence structures of the generated data, we design pretraining tasks to enable the LMs to not only memorize the factual knowledge but also learn useful logical patterns. Our pretrained KMLMs demonstrate significant performance improvements on a wide range of knowledge-intensive cross-lingual tasks, including named entity recognition (NER), factual knowledge retrieval, relation classification, and a newly designed logical reasoning task.
From the abstract:[5]
"We present KGConv, a large, conversational corpus of 71k conversations where each question-answer pair is grounded in a Wikidata fact. Conversations contain on average 8.6 questions and for each Wikidata fact, we provide multiple variants (12 on average) of the corresponding question using templates, human annotations, hand-crafted rules and a question rewriting neural model. We provide baselines for the task of Knowledge-Based, Conversational Question Generation. [...]"
From the abstract[6] of a paper presented by a team of Google researchers at last year's ICML conference:
"[...] conversational question answering (ConvQA) systems have long been stymied by scarce training data that is expensive to collect. To address this problem, we propose a new technique for synthetically generating diverse and high-quality dialog data: dialog inpainting. Our approach takes the text of any document and transforms it into a two-person dialog between the writer and an imagined reader: we treat sentences from the article as utterances spoken by the writer, and then use a dialog inpainter to predict what the imagined reader asked or said in between each of the writer's utterances. By applying this approach to passages from Wikipedia and the web, we produce WikiDialog and WebDialog, two datasets totalling 19 million diverse information-seeking dialogs -- 1,000x larger than the largest existing ConvQA dataset. Furthermore, human raters judge the answer adequacy and conversationality of WikiDialog to be as good or better than existing manually-collected datasets."
As "a real example of a dialog inferred from a Wikipedia passage using dialog inpainting" the paper presents the following (abridged) exchange between an "imagined reader" of the Freshman 15 article and a Wikipedia "Writer" who (after the initial greeting) always answers with excerpts from the article, with all other sentences filled in by the inpainter:
From the abstract of a 2021 paper by a team from Facebook AI Research:[7]
"Despite showing increasingly human-like conversational abilities, state-of-the-art dialogue models often suffer from factual incorrectness and hallucination of knowledge (Roller et al., 2020). In this work we explore the use of neural-retrieval-in-the-loop architectures [retrieving articles from Wikipedia] [...] for knowledge-grounded dialogue [...] We demonstrate that our best models obtain state-of-the-art performance on two knowledge-grounded conversational tasks. The models exhibit open-domain conversational capabilities, generalize effectively to scenarios not within the training data, and, as verified by human evaluations, substantially reduce the well-known problem of knowledge hallucination in state-of-the-art chatbots."
From the abstract:[8]
"Pre-trained language models (LMs) have recently [as of 2021] gained attention for their potential as an alternative to (or proxy for) explicit knowledge bases (KBs). In this position paper, we examine this hypothesis, identify strengths and limitations of both LMs and KBs, and discuss the complementary nature of the two paradigms."
The authors acknowledge that "Starting from [a 2019 paper], many works have explored whether this LM-as-KB paradigm [i.e. the ability of LLMs to answer factual questions, by now familiar to users of ChatGPT] could provide an alternative to structured knowledge bases such as Wikidata." However, the paper concludes, as of 2021,
[...] that LMs cannot broadly replace KBs as explicit repositories of structured knowledge. While the probabilistic nature of LM-based predictions is suitable for task-specific end-to-end learning, the inherent uncertainty of outputs does not meet the quality standards of KBs. LMs cannot separate facts from correlations, and this entails major impediments for KB maintenance. We advocate, on the other hand, that LMs can be valuable assets for KB curation, by providing a “second opinion” on new fact candidates or, in the absence of corroborated evidence, signal that the candidate should be refuted.
I love Christmas carols, especially the old ones. Charles Dickens's story A Christmas Carol is not that old — first published in 1843 — but is written in the form of a "Christmas carol in prose", according to the title page. Its chapters are even called staves. In the first stave, a passing caroler sings a small snippet of an old carol to Scrooge. Do you know the Christmas carol sung in A Christmas Carol?
"God Rest Ye Merry, Gentlemen" goes back to the 1650s, but songs have been associated with mid-Winter holidays for over 2,000 years. For example, the Roman holiday Saturnalia was associated with song, as well as wine and political incorrectness — though it should not be confused with Bacchanalia. There's even a modern Saturnalia song, sung in Latin, titled "Io, Saturnalia" (In English: "Yo, Saturnalia") which might be better to skip.
Carols are not necessarily religious, but they are almost always happy music you can dance to. "O Tannenbaum" means "Oh, fir tree" in German but is usually translated into English as "Oh, Christmas Tree". Other than the word "Christmas", the song has little to do with religion. It just praises the fir tree's "faithfulness" — its ability to stay green all Winter. In German, in French, and in English.
My favorite religious carols include:
"Good King Wenceslas" — celebrates the day after Christmas, the Feast of Stephen, and emphasizes the importance of charity (and gift-giving in general).
"It Came Upon the Midnight Clear" — a song that has lyrics from a poem of the same name, and is a very intellectual expression of the author's personal interpretation of the meaning of Christmas. It may mark his joy at the announcement of peace ending the Mexican-American War.
"O Holy Night" — sends a similar message.
Ramsey Lewis gives a jazz version of "We Three Kings".
To fully appreciate "O come all ye faithful", you need to hear it in a large packed church with a powerful organ belting it out on Christmas Eve. The original Latin version, Adeste Fideles, can be even more powerful. Strangely, though I only know a few words of Latin, I always think of it as Venite Adoremus from the words in the chorus that translate to "Oh come let us adore (him)".
The explanation is the quirky, sprightly carol "The Snow Lay On the Ground", which also uses the words venite adoremus. The lyrics are attributed to a 19th-century Italian folk song, but three quarters of the time you just sing venite adoremus.
Another folk song, an African-American spiritual, "Go Tell It on the Mountain", is an expression of pure joy. It was first mentioned in 1901, and published in 1909.
Modern Christmas carols and songs express many of the same themes as the earlier carols, adapted to the current state of the world. But I'm not going to link to "All I Want for Christmas is You" — you know where to find it, and you know that you have heard it enough already this year.
There are also many people who live in different circumstances in other countries, who celebrate different Winter holidays, and worship in different faiths. Nobody should be left out at this time of year. We are sorry that there is not enough time to cover everybody's circumstances.
"Silver Bells" brings great memories of "Christmastime in the city". But I also have mixed feelings on its message. Is it meant to honor the Salvation Army? Or is it just an advertisement for the modern commercialized holiday that seems to start in October? Or maybe it is just a great song, in a bad movie, starring an even worse comedian?
There is no doubt that Elvis Presley's "Blue Christmas" is a great song. But sometimes I wonder if it has anything to do with Christmas.
José Feliciano's song "Feliz Navidad" causes no such mixed feelings. A little bit of repetition never hurt a Christmas song.
Russia and Ukraine both have long traditions of celebrating Christmas and New Year's. And they share some of them.
В лесу родилась ёлочка ("In the woods is born a fir tree") is a Russian children's New Year's song. It mirrors Oh, Christmas Tree but includes a cute little bunny and an angry wolf, and most kiddy videos also include Father Frost (a Slavic Santa Claus).
The music to this little Christmas dance was written by a gay Russian composer whose grandfather was born in Ukraine.
Do not be fooled by a bit of chaos at the start to this video of Ukrainian carolers.
These shared traditions only make the current war more tragic.
There are other tragedies happening right now that involve different religions that share, in part, a common heritage.
You might think it would be difficult finding a Jewish Christmas carol, but a song often called "The best selling Christmas song of all time" was written by Irving Berlin, a Jew.
Hanukkah songs include "The Dreidel Song" and "Hanukkah Rocks" by The LeeVees (from NPR's Tiny Desk Concert).
You might think there are no Muslim Christmas songs, and you might be right. But Muslims are allowed to borrow the Christmas carols they like, just like anybody else. This is the view put forward in these two thought-provoking videos.
We all share part of our common human heritage. We all share in our common human tragedy.
As part of my grueling mountain training regimen to become an elite webshit, I have learned enough CSS to make the Signpost crossword template usable. Essentially, it is a grotesque hack of the InputBox extension — full documentation can be found here — so there are some issues. Namely, if you press "enter" in any of the cells, it will take you to another page. I can't do anything about this. I can change where it goes (enabling the fun surprise you'll see if you do it) but I can't make it go away.
Anyway, here's the deal for this issue: everything is an abbreviation except 1-Across.
[Crossword grid with numbered cells 1 through 13]
Across | ||
1 | One in real life often causes one on its Wikipedia article | WAR |
4 | Where naughty Signpost articles are taken to meet their doom | MFD |
5 | Overwhelmed by Neelix | RFD |
6 | Policy requiring that sources can be checked | V |
7 | Attention, calling all cars, we have a BLP vio in progress on 5th street and Centerline... we have additionally reports of a "poopoo peepee" past uw-4im in the 1600 block of Stamson... exercise extreme caution, suspect may be armed with a proxy... | AIV |
10 | Department where employees love adding "mission statement" to the company's infobox | PR |
11 | A parenthetical note made in passing | BTW |
13 | Where you could take Senkaku Islands and tree shaping edit-warriors to be dealt with, until quite recently | AE |
Down | ||
1 | They roll the nickels | WMF |
2 | Where the work of 10-across departments tends to wind up | AFD |
3 | 5-across deals with, and 2-down can close as | RD |
5 | Challenge-pissing extravaganza for demonstrating a need for the tools | RFA |
8 | Permission granted to public wifi enjoyers, college editors, and others frequently hit by rangeblocks | IPBE |
9 | Celeb email receivers | VRT |
12 | Retired annelid arbitrator | WTT |
13 | Zoomer/moomer version of "hella"; alternately, camera setting for non-supervised crispness | AF |
Note: the chronologically previous crossword appeared in the 26 June 2022 issue, in the humour column.
Rank | Article | Class | Views | Image | Notes/about |
---|---|---|---|---|---|
1 | Animal (2023 film) | | 5,560,820 | | I'd rather be with an animal... Ranbir Kapoor stars in this Bollywood movie about a man's quest for revenge on the guys who tried to kill his father, which managed to become the highest-grossing film in the world for its opening weekend (even entering the American top 10), overcoming mixed reviews on how the protagonist is an embodiment of toxic masculinity. |
2 | Ryan O'Neal | | 1,313,128 | | An American actor who died at the age of 82, whose best moments were in the 1970s with films like Love Story, What's Up, Doc?, Barry Lyndon, and Paper Moon, the last of which alongside daughter Tatum O'Neal (who won the Oscar at just 10 for her role). Regarding the rest of his career, let's just leave this. |
3 | Tripti Dimri | | 1,101,282 | | This Indian actress's small role (slightly larger than a cameo appearance) in #1 was apparently a "treat to look out for", leading to loads of Wikipedia views and Instagram followers. |
4 | Leave the World Behind (film) | | 986,506 | | Released in theaters in November and to streaming this week, this apocalyptic thriller, based on the novel of the same name, depicts what happens to humanity when technology fails us, or rather is controlled to do so. The film is produced by and stars Julia Roberts (pictured). |
5 | Deaths in 2023 | | 980,794 | | Last day of the rest of my life I wish I would've known 'Cause I didn't kiss my mama goodbye... |
6 | Shane MacGowan | | 926,977 | | People continued to mourn the frontman of The Pogues, a punk of many words and few teeth. His funeral was attended by the president of Ireland and celebrities such as Nick Cave, Johnny Depp, Bob Geldof, Aidan Gillen, and last but certainly not least former Sinn Féin leader Gerry Adams, and featured a performance of the holiday classic "Fairytale of New York". |
7 | Macaulay Culkin | | 926,247 | | Success might have waned for this actor as an adult, but Culkin is now immortalized with a star on the Hollywood Walk of Fame. Among those in attendance was Catherine O'Hara, who played his mom in Culkin's best-known role, Home Alone, and its sequel. |
8 | Godzilla Minus One | | 885,761 | | The giant radioactive dinosaur that is one of Japan's cultural icons will have its 70th anniversary in 2024, so Toho celebrated one year earlier with a period piece where Godzilla emerges shortly after the end of World War II. Godzilla Minus One earned positive reviews and, along with having already paid for itself at the Japanese box office, is performing well in North America, with two straight weekends at #3, behind The Hunger Games: The Ballad of Songbirds & Snakes as runner-up to Renaissance: A Film by Beyoncé and The Boy and the Heron. (In the meantime, the American Godzilla of the MonsterVerse got a trailer for his return in Godzilla x Kong: The New Empire.) |
9 | Norman Lear | | 876,342 | | The television king, who wrote, created, or developed over 100 shows in the 1970s and 1980s, died at age 101 on December 5. His sitcoms cleverly broached political and social themes of the time. |
10 | Premier League | | 822,401 | | The highest-level English football system made headlines over a stalemate to help fund other struggling systems. |
Rank | Article | Class | Views | Image | Notes/about |
---|---|---|---|---|---|
1 | Animal (2023 film) | | 3,332,995 | | It's animal, livin' in the human zoo... |
2 | Leave the World Behind (film) | | 2,807,615 | | The film (co-star Mahershala Ali pictured) with a simple message, and a lackluster ending, was the top film on Netflix with 41.7 million views. |
3 | Andre Braugher | | 2,557,735 | | This double-Emmy Award-winning actor of television, film, and stage died at age 61 from lung cancer on December 11. |
4 | Deaths in 2023 | | 1,001,973 | | If I wane, this could die If I wait, this could die... |
5 | Tommy DeVito (American football) | | 820,601 | | This "zero-to-hero" quarterback is keeping the New York Giants in NFL playoff contention. |
6 | UEFA Champions League | | 718,583 | | The group phase of Europe's top club tournament ended. Most of the qualified teams aren't surprising (including last year's finalists Manchester City and Inter Milan and perennial favorites Real Madrid, Bayern Munich and FC Barcelona), but there was still room for F.C. Copenhagen over the once mighty Manchester United. |
7 | Shohei Ohtani | | 744,102 | | "Shotime", now a Los Angeles Dodgers player, just signed the largest contract in professional sports history: ten years, US$700 million. |
8 | Wonka (film) | | 701,240 | | Charlie and the Chocolate Factory already inspired two hit movies, so now there's an attempt at a prequel telling Willy Wonka's beginnings as a chocolatier, starring Timothée Chalamet. Praised by reviewers as a great family picture with impressive production values and another catchy soundtrack, Wonka arrives in North America one week after being released in 37 countries, and is expected to debut atop the box office. |
9 | Premier League | | 643,562 | | The latest season of English football keeps on rolling. Arsenal F.C. are currently leading, and hoping they won't choke in the final rounds like last season, especially since another title would resonate on the 20th anniversary of their unbeaten championship. |
10 | List of highest-grossing Indian films | | 626,796 | | Ten of the 50 highest-grossing Indian films were released in 2023, with #1 quickly cracking the Top 10 upon release last week. |
There are some unique challenges in working with a content management system consisting of nineteen years of HTML, CSS, MediaWiki markup, Lua, two separate JavaScript user scripts, Python, and the specific template ecosystem of the English Wikipedia running a backend and a frontend on top of another backend and frontend. It is less of a "tech stack" and more of a "tech pile". Nevertheless, some progress has been made on the suite (SPS.js, SignpostTagger, Module:Signpost, Wegweiser, and the various internal Signpost templates). Most recently, two new features have been integrated into the pile: subheadings and images in the database.
The subheadings were particularly difficult. The primary issue is that they weren't saved anywhere: they were used in templates on the main page, at Wikipedia:Wikipedia Signpost, and then erased when the next issue was published. Which means that, well, they were saved somewhere. But the only way to get them out was to go through the entire history of the main Signpost page, manually identify each revision that was associated with a specific issue, and then manually copy the text of the subheading from each item into the templates on that issue's archive page. Well, not entirely manually: I wrote a script to do the last part. Then I put together a second script to extract those subheadings out of the issue pages and add them to the RSS description templates in the actual article pages.
But then, once they were in the RSS description templates, it was simple to add some code to the existing metadata fetcher script to incorporate the subheadings into the data it passed to the Lua serializer − and similarly simple to incorporate subhead parsing/formatting into the publishing script, SignpostTagger, the module's own output, and the snippet display templates.
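The extraction step described above might look roughly like the following Python sketch. It is not the actual script: the `subhead` parameter name, the sample wikitext and the `extract_subheadings` helper are assumptions made up for illustration, and a real revision of the main page would need a proper wikitext parser rather than a regular expression.

```python
import re

# Matches a {{Signpost/item|...|subhead=...}} call and captures the value.
# The "subhead" parameter name is an assumption for this sketch.
PATTERN = re.compile(
    r"\{\{\s*Signpost/item\s*\|[^{}]*?\bsubhead\s*=\s*([^|{}]*)"
)


def extract_subheadings(wikitext):
    """Return the subheading of each matching template call, in order."""
    return [value.strip() for value in PATTERN.findall(wikitext)]


# Hypothetical wikitext standing in for one old main-page revision.
sample = (
    "{{Signpost/item|title=News and notes|subhead=Big news this week}}\n"
    "{{Signpost/item|title=Op-ed|subhead= A view }}"
)
subheadings = extract_subheadings(sample)
```

Because the pattern never crosses a brace, each match stays inside a single template call; nested templates inside a subheading would need more careful handling.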
Well, for arbitrary definitions of "simple". The long and short of it is that I was able to recover all of the previously-lost subheadings for every Signpost article going back to July 2012 when subheadings were first used. It's also now possible to use the module to make dynamic article lists, now that the database contains the full set of information associated with an article, rather than the previous system of hardcoding everything into individual issue pages (yeesh!).
Some more work will be necessary to fully modernize various things — archive pages, for example, still use the weird redundant {{Signpost/item}} instead of {{Signpost/snippet}}, the module doesn't yet have the ability to fetch images, and CSS cropping for cover items has some bizarre mobile bugs that need to be worked out. By the time the next issue is out, I expect to have these resolved, as well as some miscellaneous other things.
There was also some pretty interesting stuff squirreled away in those old revisions, which I will go into further depth on in this issue's Apocrypha; for now, suffice it to say that, in the language of the old country, "we nao haz t3h piccys ^____^" [sic].
Yeah, you got clickbaited. Anyway, here's the deal:
Recently, I wrote and deployed an argosy of scripts (covered in more detail here) to extract 1,380 lost subheadings from the revision history of the Signpost's main page. These are now in their respective articles' header templates (and from there, in the module indices that serve as an article database). While this allows for much broader flexibility in our display methods, that isn't very exciting (or at least not until these display methods are actually put into practice). What is exciting — or at least mildly amusing — is a whirlwind tour of the never-before-seen Signpost greatest-hits compilation.
Basically, the subheadings were introduced in July 2012, as part of the perpetual effort to keep the Signpost modern and bumpin'. They started out as simple excerpts from articles that were shown on the main page. In 2017, they started being incorporated by default into the RSS-description templates (these are invisible: they don't display anywhere on the article page, but they provide metadata in the HTML) and began to assume their current form (brief, couple-sentence-long hooks). Well, I went through and put all of them into RSS-description templates, so now there are 2,464 articles with machine-readable subheadings, out of 5,462 Signpost articles in total; i.e. there are precisely 2,998 articles from before July 2012 that just never had headlines in the first place. Well, whatevs.
Some were missing, some were messed-up, some were typos: honorable mentions to the 2018-02-20 humor column ("headline?") and the 2017-12-18 blog feature ("."). Among the rest, a few extremes stood out, which I had nothing better to do than put in tables for my own amusement (and maybe, dear reader, yours as well).
The longest among these is 206 characters long. I wonder if that fits into the modern display template? Oh, if only a highly stable genius had made it easy to retrieve and format Signpost article metadata... if only you could type something short and memorable like {{Signpost/snippet/autofill|article|2018-10-28|Humour}} and have it automatically render the full snippet template... but alas: there's no sufficiently handsome and wise programmer among us, capable of such heroic deeds.
Haha sike.[1]
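Incidentally, digging the extremes out of a pile of subheadings takes very little code. The sketch below is one way it could be done, not the script actually used: `rank_by_length` and the shape of its input (a plain dict of article key to subheading) are assumptions for illustration.

```python
def rank_by_length(subheads, n=3):
    """Return (shortest, longest) lists of (key, subheading) pairs.

    `subheads` maps a hypothetical article key to its subheading text;
    length is counted in characters, as in the tables.
    """
    ranked = sorted(subheads.items(), key=lambda item: len(item[1]))
    shortest = ranked[:n]
    longest = ranked[-n:][::-1]  # longest first
    return shortest, longest
```

Feed it the two honorable mentions plus anything of normal length, and "." comes out as the shortest every time.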
The older subheadings tended to be longer (although I trimmed some of the most egregious ones when parsing them in). That last one is a whopping 1,070 characters. Let's see that monster in a snippet:
Boy oh boy!!!!!!! By the way, while we're on the subject, anyone wanna help fix all of this crap?
Team Karl Thruster Drag Racing Enterprise was a short-lived American Top Fuel drag racing team that participated in the 2023 racing season. The team was notable for its brief existence, spanning approximately four minutes.
Team Karl Thruster Drag Racing Enterprise was established just before the start of the 2023 Top Fuel drag racing season. The team, owned and operated by Karl Joles, fielded a fleet comprising two funny cars and two dragsters, each adorned with vibrant liveries and the team's iconic logo.
The team's debut was highly anticipated in the drag racing community, with fans and media outlets eager to see the new entrant's performance. However, just four minutes after its official launch, the team announced its dissolution due to financial constraints.
Although its existence was fleeting, Team Karl Thruster Drag Racing Enterprise left a lasting impression on the drag racing world. Its quick rise and even quicker fall became a topic of amusement and a cautionary tale about the financial demands of the sport. The team is fondly remembered for its ambitious start and its candid approach to the realities of racing economics.
Team Karl Thruster's story quickly became a viral sensation, with memes and jokes circulating on social media. It is often referenced in discussions about the financial challenges in motorsports and is remembered as a humorous footnote in the history of drag racing.