A politician in Baltimore, commenting on how long it will take to rebuild the Francis Scott Key Bridge, said large infrastructure projects usually take no less than 10 years, and most of that time is not the construction but the politics of who pays for it. -- GreenC22:59, 29 March 2024 (UTC)[reply]
The WMF is swimming in money. As of 6/30/2023, net assets were more than $250 million, with property and equipment making up only $14 million. (source) And that doesn't include the $100+ million in the WMF endowment fund. If it would cost (say) $3 million to fix this problem, that really shouldn't be an issue.
But it doesn't look like there is any obvious technical solution - at least, apparently no one has offered any that are agreeable to everyone, at which point someone could actually add the associated price tag.
More importantly, isn't the real question whether Wikipedia's volunteer base is interested in and able to use more complex solutions **to any large extent**? If not, then, at least for the next couple of years or until a good technical solution presents itself, a better approach might well be an automated pass over existing graphs to convert them to images that don't carry security risks. (Store the snippets of code and data so that they can be used in the future, if a suitable solution ever arrives.) -- John Broughton(♫♫)23:08, 29 March 2024 (UTC)[reply]
I would have a lot more sympathy to the view of "this is hard and needs great investment" (paraphrased) if we were talking about a new feature. But this is an old feature and from the user POV, it just stopped working. Recognizing this feature isn't used in the vast majority of Wikipedia, it nevertheless provided considerable value and it is missed. Why can't an alternative, secure module be found for the common use cases, and editors given migration instructions to make that work? Stefen Towers among the rest!Gab • Gruntwerk23:27, 29 March 2024 (UTC)[reply]
I agree. This is not rocket science, and the WMF is indeed swimming in money relative to any normal standard. This is a hard problem, but the world is full of people sufficiently talented to solve hard problems for money, so this is a management politics problem, not a technical one. I agree with StefenTower above, I'd start by solving a very few limited use cases to get at least some partial support in the meantime, and only then think about how to implement something better. — The Anome (talk) 00:41, 30 March 2024 (UTC)[reply]
Indeed, while reviewing quite a few discussions for writing this article, I didn't see a good explanation why, after this edit restriction idea (option 1) had been abandoned in favor of option 2 (iframes, as a more thorough solution enabling more editors), and then option 2 was itself abandoned many months later, folks apparently didn't consider falling back to option 1 again, as a much-better-than-nothing solution. Regards, HaeB (talk) 03:00, 30 March 2024 (UTC)[reply]
Actually, it looks like a workaround (for interactive content in general) akin to option 1 is happening right now, with volunteer developer User:Sophivorus recently announcing that
"it is now possible to load a specific gadget when a specific template is used in a page. This opens the door (or perhaps a window?) to interactive content using JavaScript. See for example this article in the Spanish Wikipedia <https://es.wikipedia.org/wiki/Juego_de_la_vida> for an interactive instance of Conway's Game of Life, and scroll down for more instances!"
I can imagine that this has various downsides compared to a fully planned out and better resourced WMF effort (and presumably nobody is eager to reimplement the broken graphs infrastructure using this), but can't help being reminded of the term Shadow IT.
(My apologies for missing this when writing this Signpost article, much of it was drafted before this announcement.)
Shadow IT is an apt term, as the downsides (and upsides) are kind of similar. The major worry is lack of isolation. User javascript has full access to anything the user can do, which is a bit of a security nightmare (albeit not a new one). If every interactive thing is custom JS, the amount of custom js multiplies. It just takes one mistake, and then everyone viewing the page has their account taken over. And that is just accidental mistakes. If the number of people writing these things multiplies, eventually you run into malicious people (seems particularly apt to think about right now given what happened with liblzma yesterday). That said, i do generally like the approach of giving the community low level tools, and letting them go wild. I think trying to anticipate needs is hard and it's better to let the community create what it needs. Like how Lua was more successful than just adding more and more special syntax to wikitext. (i've written about that previously). Krinkle of course has some fairly compelling counter-arguments. Bawolff (talk) 15:21, 30 March 2024 (UTC)[reply]
Well yes, this security angle was already discussed at length in the article. Do you agree with Ori that this is also a resourcing issue on WMF's side? Regards, HaeB (talk) 21:02, 30 March 2024 (UTC)[reply]
I partially agree with him. I agree that all these problems have solutions and that more resources could make some particular solution happen (nothing we do here is on the cutting edge of computer science. Other people have had every problem we have).
However, i don't think that is the problem we have here. The real problem is a product management & communication problem. Nobody agrees on what we want. Nobody is telling the community (explicitly) what is happening and what has been decided. Everyone [on the wikimedia-l thread] is talking past each other. Our issue is not solving problems. Our issue is deciding what problems to solve and owning that decision.
The "temporary" graph error message is a perfect demonstration of the problem. It would be totally reasonable for WMF to fix graphs. It would be totally reasonable for WMF to decide it's not worth it to fix graphs given their lack of use, and just tell everyone: sorry, it's not going to happen. At least people could get closure on the issue and move things to alternative solutions. What is unreasonable is just letting the status quo sit there in limbo for 2 years. No amount of extra resources can fix the problem where nobody is willing to make a decision and own it publicly. Bawolff (talk) 23:00, 30 March 2024 (UTC)[reply]
It should perhaps be emphasized that option 1 would need to be very, very restrictive. There are currently a grand total of nine interface administrators for the entire English Wikipedia, and one of them is User:MusikBot II. Why is it so restricted? Because an "evil" interface administrator would have de facto superuser privileges for the wiki. They can do at least all of the following:
Cause any logged-in user to perform any action their privileges allow, without that user being aware of the fact, the next time that user loads a page. For example, exercise CheckUser or Oversight privileges when someone with CheckUser or Oversight loads any page on the wiki.
Hide any log entry from everybody, by removing it with JavaScript. Can combine with suppression (via the previous bullet) to make it much harder for anyone to figure out what happened.
Hide or disguise the contents of any page on the wiki, as well as its history. Again, this combines well with suppression.
The only way around this problem would be for someone to manually read the source of the page with "view source" and spot the rogue JavaScript code (which can be obfuscated or minified as desired). Now, let me ask you a question: When you edit or even just read Wikipedia while logged in, do you check the source for rogue JavaScript? I know I don't. --NYKevin20:16, 11 April 2024 (UTC)[reply]
The continued failure to re-introduce graphs is the latest, most visible instance of Wikimedia's complete inability to do software projects. There are plenty of talented developers who work there, but for some reason the foundation can't project manage their way out of a paper bag. "We plan to, in a couple months, release the date for when we'll have a roadmap for doing the development" is a deranged statement for any software product feature. They could have implemented non-interactive graphs in a month with a single developer. Heck, it's been so long now they could have implemented interactive graphs from scratch. I get that this isn't a priority for them, but I can't help but notice that pretty much every single feature they work on has absurdly long timelines to produce anything. --PresN02:23, 30 March 2024 (UTC)[reply]
WMF already has significantly more than one FTE working on maintenance (however you define that). I'm not going to claim WMF is perfect in how it prioritizes things, but they have many people working on maintenance type things. Nobody really notices because maintenance is thankless work. Bawolff (talk) 05:10, 30 March 2024 (UTC)[reply]
People have no clue how much work and people are required to keep the thing running (mostly) as is. However that is still way too few people to make things evolve quick enough, let alone deal with major problems like these or dare I say the 2030 ambitions.
I agree. Recall the period 2005-2008. Forget graphs, everything was down, frequently. And this argument they have pools of money. The US Federal Government has the biggest budget of any institution in the world, and as everyone knows, they are the best run institution in the world... the more money there is, the harder it gets to manage due to politics and special interests, and the more complaints there are. But I can't help returning to the core fact, WMF overall does a pretty good job when you consider what they facilitate. -- GreenC16:03, 30 March 2024 (UTC)[reply]
"we have all the downsides of a company and all the downsides of a volunteer foss project at the same time" - that's an interesting thought, do you want to elaborate on that? (I know you have been a longtime observer of such matters, so maybe you already wrote about this elsewhere - feel free to post a link.) Regards, HaeB (talk) 21:08, 30 March 2024 (UTC)[reply]
The advantage of a company with money supporting a software project is that it has resources that it can bring to bear to resolve issues. For example, usually better user support is possible with more resources, as more testing can be performed in more environments. The downside is that with any sizeable project, it can't do everything at once, so it has to prioritize what to do. This means balancing the concerns of the various stakeholders, such as what benefits users (readers and editors in the case of Wikipedia) and the company's objectives. But this also means that companies are often less nimble. One reason is that they have to be able to maintain as much support for their user community as possible, whereas volunteer developers are free to pick slimmer segments of the community for their work. Another is prioritizing dollars spent means having to allocate effort towards making those decisions, and considering the effect of their decisions. (I think part of the problem with the graph situation is that no one on the WMF dev team wanted to be the one to tell the community that they just weren't going to support Vega anymore, knowing it would make people unhappy, and so they tried and continue to try to find a way forward while still using Vega. The result of that is unfortunately drawn-out uncertainty, making planning difficult.)
Volunteer projects are frequently hard up for skilled technical volunteers, and without defined commitments of effort, it's hard to set deadlines or make a roadmap for very far into the future. Thus short turn-around projects can get more easily done than longer-term ones. Putting out version 1 of a feature with limitations on users can fit. Moving that to full support for all users and managing long-term sustainability, such as resolving third-party component dependencies across the entire product, is a long-term project. If the initial volunteers lose interest, then it can be hard to find anyone to pick up the work.
In the context of Wikipedia: the WMF has resources, but it has to spend a lot of them on keeping a large project stable and working for a large community using a wide variety of platforms. By default it has to pick up support for code that was written by many developers no longer involved. All of this makes the speed of development slow. (In general, the industry tries to combat this through lots of automated testing and trying to keep the code in a ready-to-ship state as much as possible. It's a difficult problem for everyone, nonetheless.) This increases the time investment required for volunteers, which discourages them. There are certainly dedicated volunteers who have spent a lot of time on Wikimedia software, and their contributions are highly valued. But the reality is that there are a lot of choices for where people can spend their volunteer time, and absent targeted recruiting, it's hard to get the right helpers. isaacl (talk) 03:02, 31 March 2024 (UTC)[reply]
Thanks for sharing this perspective. This largely matches what I have observed about the dynamic between paid and volunteer contributions to open source. In addition to automated testing, clarity about API capabilities and versioning is another technique that I have seen successfully support diverse contributions and usage, while also enabling more predictability for maintenance needs over time. SDeckelmann-WMF (talk) 19:49, 1 April 2024 (UTC)[reply]
I find the comparison to WolframAlpha very silly. WP:NOT is among our largest and most referenced policies because, as it turns out, including too much information in a reference work makes it less useful. We have a lane and we should stick to it. Mach6104:47, 30 March 2024 (UTC)[reply]
I agree with Tisza and TheDJ; we should be focusing on getting the basic graph functionality running rather than trying to devise interactive solutions. While the fanciful visions of interactive content may sound all good and nice, fact is that we've got something broken, and the vast majority of use cases don't currently require interactivity. So we should expend the energy on fixing the broken thing and then the interactive stuff can be added later. And flashy graphs with complex interactive features are better served by other sites anyway; we shouldn't try to be everything to everyone. ― novov(tc)05:13, 30 March 2024 (UTC)[reply]
This problem with tables is only the most glaring example of software problems that don't get fixed. It takes a lot of resources, and I figure those resources ought to be beefed up. Amateurish enthusiasm works all right for editing articles, surprising as it may seem, but software coding seems to be a more delicate kind of clockwork. My own little problems are mostly with the interworkings of the WP App, Commons App, Wikidata, and their various mapping methods. They do not connect to each other neatly, and when teaching new editors we have to introduce them to various workarounds. I like to think there must be some other part of the budget that could be robbed to pay for better software, not just for tables but for mobile. The mobile site is good enough for readers, but it's poor for editors, and the internal mobile apps are what ought to work together to make life easier for mobile editors. Jim.henderson (talk) 15:33, 30 March 2024 (UTC)[reply]
First, HaeB, thanks for this write-up, I really appreciated getting up to speed without having to read all the mailing list discussions. In my view, this comes down to two lines:
the path forward will require a substantial investment – one that we have not yet started given the other priorities we’ve been working on
and
Still, on English Wikipedia, only 19,160 pages were affected.
The WMF isn't going to spend millions of dollars to fix something that's only a problem on 20k enwiki pages. Bottom line is bottom line, unfortunately: graphs aren't popular enough to care about, by and large. Personally, I think they're great and could be profitably used on millions of pages, but you can't argue with the fact that they just aren't very popular, not very widely used, and thus low-priority when compared with other features that are more widely used. Levivich (talk) 16:41, 30 March 2024 (UTC)[reply]
Meanwhile, how essential it is to the Arabic 'pedia is interesting, though it's debatable how much influence it has over the projects as a whole. Aaron Liu (talk) 16:55, 30 March 2024 (UTC)[reply]
Even that is significantly over-stating the impact. Most of those 20K pages are talk pages not content pages. Of the content pages, it was largely used for static graphs where an uploaded png/svg would serve the same purpose just as well. Much of the damage here is self-inflicted. Instead of replacing graphs with something that works when possible, we have replaced them with a giant error message, in the hopes that the ugliness of the error would blackmail WMF into re-enabling the feature, which clearly did not work. Bawolff (talk) 17:00, 30 March 2024 (UTC)[reply]
I think I would be willing and able to develop a Lua module that replaces the Graph extension for the most common use cases (bar charts, line graphs and maybe even pie charts). However, the Wikimedia Technology Fund has been marked as Permanently on hold with no explanation given (at least that I know of). Perhaps this year I'll find the time and energy to attempt it as a volunteer, but it would really help to get some compensation for it, since it will require a lot of time and effort that I could otherwise spend in remunerated work, or just reading a book. Sophivorus (talk) 23:01, 30 March 2024 (UTC)[reply]
Good point, but that's a bit "We have graphs at home" ;) This template existed long before the Graph extension, so it seems that the latter was meeting some needs that the former had left unfulfilled. Regards, HaeB (talk) 06:43, 1 April 2024 (UTC)[reply]
We probably don't want any more modules that use HTML and CSS to create graphs. It is essentially one big hack. I would advise technical contributors who have the time to instead consider working on phab:T334372 or phab:T66460, which would enable creating SVGs via Lua. The benefit would be that the resulting graphs would be real images that can be indexed by search engines, copied, downloaded, etc. Today's Lua-generated graphs are just elaborate HTML structures with which none of that is possible. – SD0001 (talk) 18:24, 31 March 2024 (UTC)[reply]
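To illustrate the "real image" point above, here is a minimal sketch of a static bar chart emitted as a standalone SVG string. It is written in Python purely for illustration (the tasks mentioned concern Lua), and the function name `bar_chart_svg`, its parameters, and the layout constants are all hypothetical, not part of any existing module or extension.

```python
from xml.sax.saxutils import escape


def bar_chart_svg(data, width=300, height=150, bar_gap=4):
    """Render (label, value) pairs as a static SVG bar chart string.

    Hypothetical helper for illustration only; not an existing API.
    """
    if not data:
        raise ValueError("data must contain at least one (label, value) pair")
    max_value = max(value for _, value in data)
    if max_value <= 0:
        raise ValueError("at least one value must be positive")

    label_space = 20                    # pixels reserved under the bars for labels
    plot_height = height - label_space  # vertical room available to the bars
    slot = width / len(data)            # horizontal space per bar, gap included

    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
    ]
    for i, (label, value) in enumerate(data):
        # Negative values are clamped to zero-height bars in this sketch.
        bar_height = plot_height * max(value, 0) / max_value
        x = i * slot + bar_gap / 2
        y = plot_height - bar_height
        parts.append(
            f'<rect x="{x:.1f}" y="{y:.1f}" width="{slot - bar_gap:.1f}" '
            f'height="{bar_height:.1f}" fill="steelblue"/>'
        )
        # Labels are escaped so that data cannot inject markup into the image.
        parts.append(
            f'<text x="{x + (slot - bar_gap) / 2:.1f}" y="{height - 5}" '
            f'font-size="10" text-anchor="middle">{escape(label)}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)
```

Because the output contains only shapes and text, with no scripts or event handlers, it can be saved, copied, or indexed like any other image, which is the property the HTML-and-CSS approach lacks.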
Hello everyone. I'm Marshall Miller; I'm a Senior Director of Product at WMF (I was quoted in the article above), and I work with teams that build features for the reading and editing experiences (e.g. Discussion Tools or Night Mode). Thank you all for thinking about and discussing the graphs issue. I know that it is frustrating that this remains unresolved -- it has been a deceptively complex problem, involving issues around security and scalability. I just wanted to post here saying that I am following this discussion, and that I have been working with colleagues to propose a path forward for graphs. We'll be resuming the discussion and updates on Meta, and I will also post a link here once there is a new update over on that wiki. -- MMiller (WMF) (talk) 05:06, 31 March 2024 (UTC)[reply]
Has WMF appointed a single-threaded owner for this? I wouldn't be surprised that no progress occurs if the person handling Graphs also happens to be working on Discussion Tools and Night Mode. They would just choose to spend their time on the less challenging assignments. – SD0001 (talk) 18:47, 31 March 2024 (UTC)[reply]
@SD0001 -- I'm sorry for the confusion; let me try to clarify. In my role, I am the manager of several product managers, each of whom is part of a product team. One of those several PMs works with the Editing team on Discussion Tools and another of those PMs works with the Web team on Night Mode. I just mentioned those two features to give examples of the sort of things my broader team is responsible for. I'm the point person for Graphs right now, as this is a complicated situation that spans the expertise of multiple teams. Does that all make sense? MMiller (WMF) (talk) 04:42, 1 April 2024 (UTC)[reply]
I agree with the frustration, but.... Those of you who consult resources hosted by the British Library are aware of the cyberattack that brought down their system in October. Some of the functionality is still not there and won't come back until the architecture is redesigned (see the report Learning Lessons From The Cyber-Attack). Security threats are real. I suspect most people would not want Wikimedia projects to be hacked and all of its content erased in such a way that it was not retrievable. Perhaps the best way forward is to ask WMF to devote more resources to security and replacing those parts of the infrastructure which are at risk. - kosboot (talk) 14:49, 1 April 2024 (UTC)[reply]
Just FYI the security risk graphs presented were more of either a mass account takeover, or to be used as a launching pad to attack users (e.g. to distribute malware). Data deletion/ransomware was probably not a threat from this particular issue. Bawolff (talk) 16:00, 1 April 2024 (UTC)[reply]
I find it hilarious that they had enough time, effort, and energy to debut the unwanted vector 2022 mandatory "upgrade", but can't (or won't) get this fixed. Incompetence be thy name, foundation. TomStar81 (Talk) 21:12, 1 April 2024 (UTC)[reply]
Vector 2022, which is just CSS and normal PHP and a smattering of JS, does seem a lot easier than graphs. Not to mention that they spent 4 years on it anyway. Aaron Liu (talk) 21:42, 1 April 2024 (UTC)[reply]
I, for one, did want the Vector redesign and welcome it much, and if it took more time and effort than it should have, I'm sure it's because of the unending complaints and the impossible attempt by the foundation to please everyone. I sometimes wish developers listened less to users and just did their job. — Preceding unsigned comment added by Sophivorus (talk • contribs) 18:42, 1 April 2024 (UTC)[reply]
In addition to updating CSS being much easier than patching the Vega security hole, V22 is on 100% of articles, graphs are on less than 0.3%. Apples and oranges. Levivich (talk) 23:46, 1 April 2024 (UTC)[reply]
Not to mention it's a totally different skillset. Being a developer is not just one skill. The people who work on vector would not be the people who work on graphs and vice versa. Bawolff (talk) 16:49, 2 April 2024 (UTC)[reply]
I get that the security problem with graphs was a zero-day vulnerability, but the level of visibility to readers meant a zero-day fix was needed. Instead we have had nothing but delay and misleading claims from the WMF. This is an ongoing embarrassment that, it seems, could have been avoided by a temporary, low-quality solution (like replacing each graph usage with a PNG image and then disabling graphs). I'm sure there are valid questions to be asked about interactivity and desired functionality, but it seems to me that the WMF would have been better to say "you're on your own, kid" on day zero and en.wiki volunteer programmers and editors would have had a mostly working solution rolled out months ago. — Bilorv (talk) 23:39, 3 April 2024 (UTC)[reply]
But a zero-day fix was implemented: Graphs were disabled. That's all a security response requires, mitigation of the exposure.
Beyond that, should the response have focused on eliminating the interactive graphs entirely? Perhaps. Though I don't feel like we can blame the WMF for us not doing that.
For example, the article mentions Another Wikipedian reported that "In ruwiki, interactive Lua-based graphs are used in more than 26000 articles about settlements and administrative units through https://ru.wikipedia.org/wiki/Module:Statistical." The primary uses of ruwiki Module:Statistical are in templates like ruwiki Шаблон:Население ("Template:Population"), which for years has had the ability to format population data as either an interactive <graph>, or a pure-HTML-based bar chart. When Vega went offline, instead of displaying the "can't do that" message, they simply updated their template so that all requests for the graph version instead get the HTML bar chart. It's less satisfying (note: discussion in Russian) than the interactive graphs, but it conveys most of the same data.
Here on enWiki, OTOH, seems to me we've either wrung our hands about the WMF not fixing graphs fast enough for our tastes, or complained about them not giving us permission to work around the absence of graphs. ...Do people feel that's an unfair characterization? ruWiki didn't need the WMF's permission, why do we? FeRDNYC (talk) 01:24, 4 April 2024 (UTC)[reply]
A quote from the top of the article from a WMF employee: "My hope is we can maybe restore some functionality in the next week or so". Given this it was reasonable to revert and delay attempts to replace graphs at the time. The point I'm making about the zero-day issue is not a security one but a reputation one. The zero-day fix needed for reputational purposes was restoring the information that was encoded into graphs (even in a lower-quality format). — Bilorv (talk) 15:41, 4 April 2024 (UTC)[reply]
It is worth noting, I think, that parts of the problem have been solved. For example, the much-discussed Template:OSM_Location_map, used by 5,600 pages on the English Wikipedia, has been functioning again since 11 March 2024, with some newly added features. See the Return to service article. On the other hand, those 5,600 pages were not included in the count of 19,160 -- a count that underestimates the true impact of the problem, leaving out all the templates that are indirectly rendered inoperable. Renerpho (talk) 07:46, 5 April 2024 (UTC)[reply]
Well, the article does allow, (Those numbers likely already reflect the manual removal of broken graphs from many pages.) But it goes on to say:
In a more detailed 2020 analysis, volunteer developer User:Bawolff had found that "the graph extension is used on 26,238 pages [on English Wikipedia]. However, most of these are in non-content namespaces, from a template that generates a graph of page views for a specific page (w:Template:PageViews graph). There are 4,140 pages on en.wikipedia.org in the main namespace that use graphs."
That information appears roughly consistent with both the numbers discussed here (19,160 + 5,600 = 24,760), and the current contents of Category:Pages with disabled graphs, which is actually down to 18,249 pages now, but is still overwhelmingly dominated by Talk-namespace pages. Since the analysis was done in 2020, when graphs were working, it wouldn't have been biased by any manual removals. We can assume that the numbers would've gone up a bit since, and the fact that Category:Articles containing OSM location maps alone contains 5319 members (and AFAICT is restricted to mainspace) bears that assumption out. So that 4,140 is obviously a bit out of date.
(Aside: Category:Pages with disabled graphs does still include Template:OSM Location map itself (despite it having been returned to service), because the category is manually added in the template's source for some reason. But it's done in the <noinclude> documentation section, so that it's only applied to the template itself, not its consumers.)
Beyond that, I don't see how the count underestimates the true impact of the problem, leaving out all the templates that are indirectly rendered inoperable. The tracking category is populated by MediaWiki itself, and any use of <graph> in a template, no matter how indirect, would cause every page that transcludes the template to be counted. It's only the pages where the transclusion was removed (as the article notes), or where the transcluded template was modified to eliminate its use of graphs (like {{OSM Location map}}), that wouldn't be counted. FeRDNYC (talk) 17:00, 6 April 2024 (UTC)[reply]
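The counting argument above can be sketched as a reachability problem: a page is counted if it uses a graph directly or transcludes, at any depth, something that does. The Python below is a simplified model under stated assumptions, not a description of how MediaWiki actually populates tracking categories (which happens at parse time, so conditional template branches and <noinclude> sections would behave differently). The function name and all page and template names are hypothetical.

```python
from collections import deque


def pages_counted(transclusions, direct_graph_users):
    """Return every page that uses a graph directly or via any chain of transclusions.

    transclusions: dict mapping a page or template to the templates it transcludes.
    direct_graph_users: pages or templates whose own source contains a graph tag.
    Simplified model for illustration; hypothetical, not MediaWiki's implementation.
    """
    # Invert the mapping: for each template, which pages transclude it?
    transcluded_by = {}
    for page, templates in transclusions.items():
        for template in templates:
            transcluded_by.setdefault(template, set()).add(page)

    # Breadth-first walk outward from the direct users to everything that
    # embeds them, however indirectly. The `counted` set also guards
    # against transclusion loops.
    counted = set(direct_graph_users)
    queue = deque(direct_graph_users)
    while queue:
        current = queue.popleft()
        for page in transcluded_by.get(current, ()):
            if page not in counted:
                counted.add(page)
                queue.append(page)
    return counted


# Hypothetical example: Article A reaches the graph through two template layers.
example = {
    "Article A": ["Template:Chart wrapper"],
    "Template:Chart wrapper": ["Template:Chart core"],
    "Article B": ["Template:Plain infobox"],
}
print(sorted(pages_counted(example, {"Template:Chart core"})))
```

In this model, "Article A" is counted even though it never mentions the graph tag itself, while "Article B" is not, which matches the point that indirect use through templates does not escape the count.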
-- This subject demonstrates Wikimedians' propensity to blather on, building towering walls of text in place of taking action. The way forward seems blindingly obvious: implement a safe static graph facility while separating off the issue of whether and how to provide dynamic and/or interactive graphics safely. The most efficient way is likely to be to subset the Vega framework (or just syntax) to those constructs that can only produce safe, static code. The point is to actually act on that rather than talk about that. --R. S. Shaw (talk) 23:09, 8 April 2024 (UTC)[reply]
Your most efficient way has been discussed. I can't find the specific discussion right now, but the conclusion was that Vega's syntax was too complex to provide a viable subset, or something like that. Aaron Liu (talk) 23:44, 8 April 2024 (UTC)[reply]
How many of the 4,000 articles that use broken graphs use only pie charts or bar charts? There are working templates for both of those,[1][2] so long as you don't miss the hover-over feature. And you shouldn't miss it. Pretty sure most users just want a simple, easy-to-read image in most cases - before they spend millions on making something, they should check if people would actually find it more useful. Wizmut (talk) 03:44, 10 April 2024 (UTC)[reply]
Hello everyone. I wanted to follow up here on my earlier comment above. I just posted an update to the Graph project page laying out a proposal in which WMF would build a new extension for making graphs. We've come to this proposal after talking to community members over the past weeks, analyzing data, and thinking through architecture with staff. This would be a substantial amount of work, and I hope that community members can weigh in on whether this seems like the right approach and to help us plan the project. Please join the discussion on the talk page! -- MMiller (WMF) (talk) 23:15, 10 April 2024 (UTC)[reply]