One thing that I think this discussion is highlighting to me is that there's very little support in the web standard (as implemented by browsers) for surfacing resources to users that aren't displayable by the browser.

Consider, for example, RSS/Atom feeds. Certainly there are <link /> tags you can add, but since none of the major browsers do anything with those anymore, we're left dropping clickable links to the feeds where users can see them. If someone doesn't know about RSS/Atom, what's their reward for clicking on those links? A screenful of robot barf.
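
(For reference, the autodiscovery tag looks like this; the feed path here is a made-up example:)

    <link rel="alternate" type="application/atom+xml" title="Site feed" href="/feed.xml">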

These resources in TFA are another example of that. The government or regulatory bodies in question want to provide structured data. They want people to be able to find the structured data. The only real way of doing that right now is a clickable link.

XSLT provides a stopgap solution, at least for XML-formatted data, because it allows you to provide that clickable, discoverable link, without risking dropping unsuspecting folks straight into the soup. In fact, it's even better than that, because the output of the XSLT can include an explainer that educates people on what they can do with the resource.
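
As a sketch (file names made up): the XML points at a stylesheet with a processing instruction, and the stylesheet can render whatever explainer you like:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="/feed.xsl"?>
    <rss version="2.0">...</rss>

    <!-- /feed.xsl -->
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/">
        <html xmlns="http://www.w3.org/1999/xhtml">
          <body>
            <h1>This is a web feed</h1>
            <p>Copy this page's URL into your feed reader to subscribe.</p>
          </body>
        </html>
      </xsl:template>
    </xsl:stylesheet>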

If browsers still respected the <link /> tag for RSS/Atom feeds, people probably wouldn't be pushing back on this as hard. But what's being overlooked in this conversation is that there is a real discoverability need here, and for a long time XSLT has been the best way to patch over it.



> One thing that I think this discussion is highlighting to me is that there's very little support in the web standard (as implemented by browsers) for surfacing resources to users that aren't displayable by the browser.

Really wish registerProtocolHandler were more popular. And I really wish registerContentHandler hadn't been dropped!

Web technology could be such a nexus of connectivity. We could have the web interacting with so much, offering tools for so much. Alas, support has largely gotten worse decade by decade. And few have taken up the chance.

Bluesky is largely using at:// URLs. Eventually we could probably argue for native support for the protocol. But web+at:// is permissionless: tools like https://pdsls.com can become web-based handlers with almost no effort, if they want.
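
A sketch of what that takes (the query shape is my guess, not pdsls's actual route; the handler URL must be same-origin with the page that registers it):

    <script>
      // Ask the browser to route web+at:// links to this web app.
      // "%s" is replaced with the full encoded URL when such a link is clicked.
      navigator.registerProtocolHandler("web+at", "https://pdsls.com/?uri=%s");
    </script>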


at:// URLs were unfortunately mis-designed, as explained in https://github.com/bluesky-social/atproto-website/issues/417


The suggestion to move the RSS/Atom feed links to a hidden link element is a horrible one for me, and presumably for others who want to copy the link and paste it into their podcast applications. It adds another layer of indirection: an application now has to fetch and inspect the page.

Part of the reason HTML 5/LS was created was to preserve the behaviour of existing sites and malformed markup such as omitting html/head/body tags or closing tags. I bet some of those quirks see about as much usage as XSLT does on the web.


You're right, it's not a great flow! And while many podcast/feed reader applications support pasting the URL of the page containing the <link /> element, that still leaves the problem of advertising that one can even do that, or that there's a feed available in the first place.


> The suggestion to move the RSS/Atom feed links to a hidden link element is a horrible one for me, and presumably for others who want to copy the link and paste it into their podcast applications. It adds another layer of indirection: an application now has to fetch and inspect the page.

This has already been the norm for years. Feed autodiscovery is decades old at this point.
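
For what it's worth, the discovery step is tiny. A sketch (pageUrl is whatever the user pasted; a native app or server does the fetch, since CORS would block it cross-origin in a web page):

    // Resolve a pasted page URL to its advertised feed.
    const html = await (await fetch(pageUrl)).text();
    const doc = new DOMParser().parseFromString(html, "text/html");
    const link = doc.querySelector(
      'link[rel~="alternate"][type="application/rss+xml"], ' +
      'link[rel~="alternate"][type="application/atom+xml"]');
    const feedUrl = link && new URL(link.getAttribute("href"), pageUrl).href;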

> Part of the reason HTML 5/LS was created was to preserve the behaviour of existing sites and malformed markup such as omitting html/head/body tags or closing tags.

This is not true. It has always been valid to omit html/head/body tags and the closing tags of several elements. This is a valid HTML 4.01 Strict document, it follows all of the parsing rules of HTML 4.01 correctly, it is not malformed, it parses correctly in every browser, there is no ambiguity, there is no error handling taking place, and there is nothing incorrect at all about it:

    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
    <title>Valid</title>
    <p>This is a valid HTML 4.01 Strict document.


The intention here is to drop existing behaviour (XSLT support), and that will break at least some websites. My point was that when they created HTML 5 they went to great lengths to preserve existing HTML parsing behaviour, resulting in the complex parsing and tree-construction rules, because their goal was to preserve compatibility with existing websites.

I could have chosen a better example, like quirks mode. My point was that they could have simplified the parser logic and implementation burden by dropping that. But they didn't.

They even went to great lengths to parse existing XML content (SVG, MathML, and XLink) in HTML documents without supporting namespaces.
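
For instance, this is valid HTML, and the parser puts the circle into the SVG namespace without an xmlns declaration anywhere:

    <!DOCTYPE html>
    <title>Inline SVG</title>
    <svg viewBox="0 0 10 10"><circle cx="5" cy="5" r="4"></circle></svg>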

So dropping XSLT support here is purely because they don't want to put in the effort to implement their own XSLT engine (in a "safer" language if they need to), or to properly support the maintenance of the open source libraries they rely on. And dropping XSLT goes against why HTML 5/LS was originally created, which was to support all existing HTML and web content no matter what it looked like.


> My point was that they could have simplified the parser logic and implementation burden by dropping that. But they didn't.

You’re right, and I think this is possibly the biggest mistake the HTML5 standards process ever made. They got that one badly wrong. Strict XML-based parsing was the clearly correct solution informed by many years of Postel’s Law constantly fucking the web up with incompatibilities and security vulnerabilities.

> So dropping XSLT support here is purely because they don't want to put in the effort to implement their own XSLT engine

That’s not the whole answer; they think the benefits don’t justify the effort, and I agree with that.

> And dropping XSLT goes against why HTML 5/LS was originally created, which was to support all existing HTML and web content no matter what it looked like.

That wasn’t why HTML5 was created. That makes no sense; you don’t need a new format to preserve the status quo.


> If someone doesn't know about RSS/Atom, what's their reward for clicking on those links? A screenful of robot barf.

The microformats folks have a standard for embedding machine-readable feed data into HTML, which seems a lot more practical to me after seeing how browsers just ignore RSS:

https://microformats.org/wiki/h-feed

(I haven't tried it but it seems fine)
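
The markup is just classes on the HTML you already serve, something like (content made up):

    <div class="h-feed">
      <h1 class="p-name">Example Blog</h1>
      <article class="h-entry">
        <a class="u-url p-name" href="/posts/hello">Hello world</a>
        <time class="dt-published" datetime="2025-01-01">2025-01-01</time>
      </article>
    </div>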


Why is it better to drop users onto the RSS page and say "copy the URL in the address bar", rather than just giving them a copy button that puts it on the clipboard and saying "put this in your RSS app"?


but why privilege XML? there's WASM, and sites can have any kind of codec they want.

unfortunately(?) modern browsers are turning more into these sandbox and state managers with a few lower-level rendering engines bolted together (JS driving the DOM, plus CSS and SVG and audio and video and font and various image codecs and ...)

and if people want to copy the raw data URL, the site can show it to them; plus HTTP provides the Accept header (among others) for content negotiation, so in theory the URL can even be the same.
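
e.g. (a sketch; example.gov is made up):

    // Same URL, different representation, selected by the Accept header.
    const page = await fetch("https://example.gov/filings/2024",
                             { headers: { Accept: "text/html" } });
    const data = await fetch("https://example.gov/filings/2024",
                             { headers: { Accept: "application/xml" } });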


> but why privilege XML?

Because it was privileged at the start and now it’s been a working feature you could rely on for a generation.

> there's WASM, and sites can have any kind of codec they want.

Can’t JS your way into the browser opening an XML document and automatically applying the stylesheet, as far as I know.


Features come and go. (Others mentioned Flash. Java applets were also pretty mainstream for a while. Cookie rules changed. There's now more security plumbing, e.g. CORS.) From a security and maintenance standpoint it makes sense to get rid of libxslt and libxml.

It's hard to say how high the cost to users will be. They will need to copy the URL and put it into some site that applies the XSLT. Likely there will be some page that does this from JS, and people can simply make URLs like https://blabla.gov/apply-xslt?xml=....&style=... and put those on websites, right?
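
A sketch of such a page, using the in-browser XSLTProcessor while it still exists (a WASM engine could slot into the same place; the query parameters are made up, and the fetches assume the host sends CORS headers):

    // Inside a module script (top-level await).
    const params = new URLSearchParams(location.search);
    const load = async (url) => new DOMParser().parseFromString(
      await (await fetch(url)).text(), "application/xml");
    const proc = new XSLTProcessor();
    proc.importStylesheet(await load(params.get("style")));
    const result = proc.transformToDocument(await load(params.get("xml")));
    document.documentElement.replaceWith(
      document.adoptNode(result.documentElement));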

> Can’t JS your way into the browser opening an XML document and automatically applying the stylesheet, as far as I know.

Yes, that's definitely not ideal, but maybe the result of this deprecation will be that we get some kind of handler registration. (Though I saw the comments about similar failed initiatives.)


Java applets were terrible in all the ways that something can be terrible, and it took no less than Steve Jobs to kill off Flash.

This stuff was foundational to the modern web and it's clear the maintainers, who probably are not Steve Jobs, have no idea what will break as a result. If it's removed, it will just get added back in after the outrage


XSLT being foundational to the modern web? How?

Outrage? In this economy? Are we watching the same movie?

(Anyway, my prediction is that likely they'll go with the WASM polyfill if they remove it ... but likely some folks will fight for some budget to keep it around behind some deprecation warnings and sandboxed for a while anyway.)



