This is nice. It reminds me of how miserable my life is.
— Which HTTP code should I return from my API? I already used 404 and 403, but I need another one. Damn, HTTP is so old and it makes no sense.
— You can't use HTTP codes like that Bob, they're not a free choice. They're for the protocol, not for your app.
— Let's look at the list. Hm... "412 Precondition Failed". Hey, it sounds nice. It fits my use case. I'm gonna document it. It means the account is out of balance.
— What is this garbage? Please read the spec. This is going to make our API gateways, CDNs, everything go crazy. Can't let you move on with this PR.
— Look. I documented it, made an enum with the code, it's clean. I'm an experienced REST developer.
— It... it doesn't work like that Bob. Please, read the spec.
— Hey, got enough approvals, "412 Account Out Of Balance" it is! It passes the tests.
For every dev who knows proper HTTP, there are 10,000 Bobs.
To address the shortcomings of conveying errors strictly through HTTP status codes, consider:
RFC-7807, Problem Details for HTTP APIs[0]
From the introduction:
HTTP [RFC7230] status codes are sometimes not sufficient to convey enough information about an error to be helpful. While humans behind Web browsers can be informed about the nature of the problem with an HTML [W3C.REC-html5-20141028] response body, non-human consumers of so-called "HTTP APIs" are usually not.

This specification defines simple JSON [RFC7159] and XML [W3C.REC-xml-20081126] document formats to suit this purpose. They are designed to be reused by HTTP APIs, which can identify distinct "problem types" specific to their needs.
My coworkers insisted on always returning 200 and having the status code in JSON.
At least at that point it’s clearly not HTTP anymore, and it’s better than pretending to be compliant like your Bob. But something dies inside me whenever I have to work with it.
I'd say returning 200 for all successes is reasonable if the responses are simple.
Returning 200 for an error makes no sense. Having 400s and 500s is the simplest way to have observability over protocol behavior (think logs, error rates, etc.). If you use all 200s, you'd have to re-implement observability yourself, so you lose the simplicity you gained by ignoring those statuses.
It's the same thing with caching stuff. You could implement those outside the protocol, but then you'd be writing your own protocol (trying to be smarter than decades of engineering efforts).
That’s a strong assertion. There’s plenty of status codes that indicate something bad happened at the HTTP level but don’t convey information about the RPC.
I've taken a very operational view of HTTP errors, which is "What do I want things receiving this error to do?" Unfortunately, that's not a clean question, since there's no list you can simply consult to get all behaviors that all HTTP error messages cause. The most important of these is, if this is being accessed by a browser, what will the error code make it do?
Fortunately, for a lot of my API-type work, I also get to not care. I don't want some smart cache thinking it knows how to cache my responses, and I don't care about the sort of infrastructure that thinks it understands HTTP doing anything with my requests.
200 {"error": "..."} is not necessarily invalid from this point of view, either. 200, the request was successfully processed and the successful result of that request as far as HTTP is concerned is an error. There doesn't seem a great need to tell HTTP there's an error, HTTP doesn't really care. Telling the browser there's an error has some marginal utility, but if it's an API and there's no browser involved, that doesn't matter much either. The 200 isn't going to fool it into thinking it should put the error into the history or whatever.
I've also learned to avoid getting too fancy with the codes. You will invoke some weird behaviors from systems you didn't even know cared about your connection. 200 {"error": "..."} may seem "wrong", but it is also generally safe. It will do what you expect.
It might be nice to live in a world where there are HTTP error codes that are suitable for everything I need, instead of a big pile of useless codes for abortive standards that never came to be and things nobody uses, and an underspecified set of codes for the things I actually want and use, but there's no point pretending that the standard is something other than it is, and as it stands now, a lot of times the HTTP result code is almost useless.
Varnish, HAProxy, Apache mod_proxy, nginx all can do similar things. Some of them can do this even if you always return 200 (by having rewrite rules and so on). It is often better to leave this kind of work to some upper abstraction layer. Some of these codes are only applicable in a layered system (502, for example, often seen when nginx can't reach a backend application), so they seem useless to developers, but they're not.
For APIs, other stuff uses those codes. Tools like DataDog and NewRelic work better if you use a generic 400 and a generic 500 for client and server errors respectively. You can make them work with all 200s and a little configuration, but it's extra work.
If you never needed any of this, it's better not to use it.
I should indeed have clarified that the browser web has a much richer set of headers and response codes in use, and they are truly useful, and anyone serving web pages at scale should indeed learn about them. IIRC it's still about 1/3rd to 1/4th of the nominally defined HTTP response codes that are useful, but it's still something.
For the non-browser web, they approach useless. Which I'm not happy about, and I'm not celebrating or advocating for it. It's just how it is.
400 "you screwed up" vs 500 "I screwed up" is always better than 200 "OK, not really".
You can get more specific, and for REST style APIs the correct specific HTTP status code is usually apparent for both successful and unsuccessful requests, but 2xx/4xx/5xx is simple and should be trivial to determine for anything you are using HTTP for even if it's not REST-like.
However, while your mileage may vary, I end up getting the same complaint from the users either way. Even when my 400 contains an exact reason why the input is incorrect.
Granted, on the one hand, this can be fixed on the individual level, but on the other hand, it's the same effect writ small that when writ large makes the response codes nearly useless, so this post is maybe half cathartic grousing. I can't push caring about response codes. I can document it, I can yield detailed errors, and I can be as careful as I like, but this is a "it takes two to tango" situation and at scale, on average, the other end doesn't want to tango.
I think that's one of the key differences between REST and other kinds of RPC architectures.
I've used SOAP and JSON-RPC, both of which (at least in many implementations) send RPCs as HTTP POST requests and receive 200 responses with any error messages in the body. They're just tunneling over HTTP. It's not necessarily wrong, although I'm convinced that leveraging the HTTP verbs and error codes with REST is a fundamentally better design for the use cases I've seen.
Status 200 with "InsufficientFunds" can be correct.
Let's assume your resource is "/account/1234/withdraw-availability"
It's a hypothetical endpoint you can GET to know if you can withdraw money. You hit it, and the request is successful (the server understood and will inform you whether a withdrawal is available or not).
Let's assume your resource is "/account/1234/withdraw"
This is another hypothetical endpoint, to which you POST a request for a money withdrawal. Returning 200 here means the server understood and processed your request, so a 200 that does not withdraw makes no sense.
The same endpoint could also return a success "202 Accepted" (the server understood the request, but has not processed it yet). In the body, there would be a link to "/withdrawals/48957987593845983475/status", a resource specific to this future processing, which you can GET later (maybe 1ms later, within the same socket). This GET could also return a 200 saying that such a withdrawal was not possible (the server successfully understood the request and will inform you about the status of the withdrawal).
For this modeling stuff, the Roy Fielding dissertation about REST is more enlightening than the spec. The spec is still needed though.
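As a rough sketch of those two hypothetical endpoints (Go here; the account number, amount, withdrawal ID and response fields are all made up, and a real handler would validate authentication, the amount, and so on):

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// Hypothetical in-memory balance, just to make the sketch runnable.
var balance = 50.0

func main() {
	// GET /account/1234/withdraw-availability
	// Always 200: the request succeeded, and the body says whether a withdrawal would work.
	http.HandleFunc("/account/1234/withdraw-availability", func(w http.ResponseWriter, r *http.Request) {
		amount := 100.0 // imagine this came from a query parameter
		json.NewEncoder(w).Encode(map[string]any{
			"available": balance >= amount,
		})
	})

	// POST /account/1234/withdraw
	// 202 Accepted: the request was understood and queued; the body links to a
	// status resource the client can poll.
	http.HandleFunc("/account/1234/withdraw", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}
		id := "48957987593845983475" // would normally be generated
		statusPath := fmt.Sprintf("/withdrawals/%s/status", id)
		w.Header().Set("Location", statusPath)
		w.WriteHeader(http.StatusAccepted)
		json.NewEncoder(w).Encode(map[string]string{"status": statusPath})
	})

	http.ListenAndServe(":8080", nil)
}
```

The availability check always answers 200 with the answer in the body; the withdrawal request is acknowledged with 202 and a status link the client can poll.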
One more example, if your request contains a batch of operations, you generally have to return 200, or maybe 204, if it was successfully received and should not be retried in full. In the response body, you might give other response codes for specific errors or failures to retry in a new request. So it can easily make sense to return 200 when there are e.g. partial failures and partial success and the request was properly formatted, authorized and acted upon.
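A minimal sketch of that pattern, assuming a made-up /batch endpoint and made-up field names:

```go
// Field names and the /batch path are made up for illustration.
package main

import (
	"encoding/json"
	"net/http"
)

type itemResult struct {
	ID     string `json:"id"`
	Status int    `json:"status"`          // per-item outcome, app-level
	Error  string `json:"error,omitempty"` // only present for failed items
}

// process stands in for whatever each batch item actually does.
func process(id string) error { return nil }

func batchHandler(w http.ResponseWriter, r *http.Request) {
	var items []struct {
		ID string `json:"id"`
	}
	if err := json.NewDecoder(r.Body).Decode(&items); err != nil {
		// The request itself is malformed: a plain 400 is appropriate.
		http.Error(w, "malformed batch", http.StatusBadRequest)
		return
	}

	results := make([]itemResult, 0, len(items))
	for _, it := range items {
		if err := process(it.ID); err != nil {
			results = append(results, itemResult{ID: it.ID, Status: 422, Error: err.Error()})
		} else {
			results = append(results, itemResult{ID: it.ID, Status: 200})
		}
	}

	// 200: the batch was received and acted upon, even if some items failed.
	// The client inspects the body to decide which items to retry.
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(results)
}

func main() {
	http.HandleFunc("/batch", batchHandler)
	http.ListenAndServe(":8080", nil)
}
```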
The problem is that there are multiple status codes which should be used (400, 422, 500, etc.), but others are massive footguns.
It would have been better to have a "technical" status code and an "application" status code, but hey, that would have required better engineering.
- The first line of an HTTP request has its own format (space-separated-ish), and mixes method and URL path.
- The URL path in that first line has its own format and weird escaping, and mixes one path with zero-or-more key value pairs.
- The headers have their own format.
- The body has its own format.
Most (all?) HTTP libraries for clients and servers abstract all that mess away into a neat object that could be easily represented in JSON (or bencode if you want something simple-ish while supporting binary data), but it's like using a nice program while knowing it's written in C[1].
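For example, here's a small sketch (in Go) that parses a hand-written raw request and dumps the same information as a single JSON object; the request contents are obviously made up:

```go
// Sketch: the request line, URL/query string and headers each have their own
// wire syntax, but the library hands you back one plain object.
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"strings"
)

func main() {
	raw := "GET /search?q=etag&page=2 HTTP/1.1\r\n" +
		"Host: example.com\r\n" +
		"Accept: application/json\r\n" +
		"\r\n"

	// Parse the several wire formats (request line, URL, headers) in one call.
	req, err := http.ReadRequest(bufio.NewReader(strings.NewReader(raw)))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}

	// ...and re-express the same information as a single nested structure.
	enc := json.NewEncoder(os.Stdout)
	enc.SetIndent("", "  ")
	enc.Encode(map[string]any{
		"method":  req.Method,
		"path":    req.URL.Path,
		"query":   req.URL.Query(),
		"headers": req.Header,
	})
}
```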
Of course, JSON has its own problems, but in that case at least we only need to deal with the problems of one format with proper nesting support, not 4 different weird formats masquerading as one.
[1]: Disclaimer, I don't like Rust, so don't take this as a RIIR thing.
EDIT: To be clear, I don't agree with how Bob misuses HTTP in your example. I just find it sad that we're locked into this weirdly complex protocol.
Many of these choices are there for backwards compatibility and reuse. I can totally understand why.
An HTTP body has no predefined format. It can be anything. It can be a stream (HTTP into WebSocket upgrade, for example). It is the media type that defines how the body should be interpreted.
HTTP requests are meant to be used before they are fully transmitted, and are formatted in a way that leverages socket communication. JSON, on the other hand, needs the whole document to be read before it can be safely interpreted.
HTTP has more moving parts, but it also does so much more. These two aren't even comparable, they're not in the same layer.
I understand the urge to "improve" on all of this "legacy", however, one must consider how much was built upon these standards and if there's anything real to gain by changing them.
I would love if we could improve on HTTP - even just clearly separating protocol from application would be so great. Maybe dropping some cruft, like the accept headers and so on.
But yes, doing that is total folly.
The separation is your choice. It is not enforced, as this would require limitations that are not worth having.
You can totally drop accept headers if you write your own client and server implementation. HTTP works fine without them. The web as a living organism, not so much.
But hey, we don't need content type negotiation, right? XML will reign forever, mp3 is the ultimate audio format. It's not like new codecs and document types appear all the time and some kind of underlying architecture has to reserve space for that kind of change.
As I said, if you're writing your own client and server, you don't need Accept. You probably use just one homogenized content type.
Using `?format=json` is not offensive. It won't mess up some cache layer like improper status code semantics, so I don't really care that much about these if I see it. I wouldn't block a PR on that.
The overall web, on the other hand, is supposed to be made of many different client and server implementations. Your browser still relies on Accept headers for displaying images, detecting language, uncompressing gzipped responses, resuming paused downloads, showing that JSON API in a nice UI when you open it in a dedicated tab, and so many other things.
To me, it makes sense to leverage the same content negotiation ideas for home grown stuff, even if only one content type is being used.
It starts being a problem if you're working on microservices and one of them uses `.json` while another uses `?format=json`, made by different teams. The standard is the obvious solution. Instead, they'll either create inefficient clients full of complexities or fight until one of the workarounds prevails. So much easier to follow the standard.
Ah yeah no disagreements there. Better something that already works and is widely used with good ecosystem around it, than risk an xkcd 927 situation.
I understand that the people at the time (presumably) did it the best they could with the information and knowledge they had at the moment, while keeping backwards compatibility.
I only know a bit about HTTP/0.9 and can see how we moved from that to what we have today. I just find the current situation sad.
Like when something only supports ASCII, or only supports IPv4, or assumes I only have 1 CPU thread. Or like when some binary file is encoded in base64 before being sent over the network, only to be decoded on the receiving end. Stuff like that.
But I only feel that way thanks to having the huge benefit of hindsight and modern technology.
If ASCII is obsolete compared to Unicode, IPv4 is obsolete compared to IPv6, and single threadness is obsolete compared to multi-threading (I don't think that is a valid comparison, but let's go with it), then which standard makes HTTP obsolete?
HTTP/2 and HTTP/3 don't look like that anymore. The HEADERS frame is just a key-value map where the reserved keys :method, :scheme, :path, etc. are used for the previously top-level message elements.
Have to admit I've never used this code, and didn't know what it was about. Quickly read up about it. So ETag is a hash of the resource. You must provide it with requests that modify the resource. If your hash doesn't match the server hash, then 412 Precondition Failed is returned?
You can provide all sorts of conditions using HTTP headers such as "If-Match", "If-None-Match", "If-Modified-Since", "If-Range", etc. The server can choose an HTTP code to indicate some sort of cache invalidation signal.
304 means "you're good, your cached version satisfies the conditions"
412 means "you're not good, your cached version does not satisfy the conditions"
412 usually applies to modifications, but it could be for reading too, in the case of ranged requests (getting a specific range of bytes from a large representation). See the "Range" header.
These are all interconnected: the headers, the codes, etc. They are very useful for caching and can save a lot of bandwidth. Browsers and CDNs use them extensively. Server-to-server communication could use them as well, but I haven't seen popular implementations (let's say, a web framework that provides abstraction over these mechanisms).
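To make the interplay concrete, here is a minimal sketch of a handler (Go, with a made-up /doc resource and version string) that answers If-None-Match with 304 on reads and If-Match with 412 on writes. Real implementations also handle lists of ETags, weak comparison and "*", which this skips:

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

var (
	mu   sync.Mutex
	body = "hello"
	etag = `"v1"` // ETags are quoted strings on the wire
)

func handler(w http.ResponseWriter, r *http.Request) {
	mu.Lock()
	defer mu.Unlock()

	switch r.Method {
	case http.MethodGet:
		w.Header().Set("ETag", etag)
		// Simplified: compares a single value, not a list, and ignores weak compare.
		if r.Header.Get("If-None-Match") == etag {
			w.WriteHeader(http.StatusNotModified) // 304: your cached copy is still good
			return
		}
		fmt.Fprint(w, body)

	case http.MethodPut:
		if im := r.Header.Get("If-Match"); im != "" && im != etag {
			w.WriteHeader(http.StatusPreconditionFailed) // 412: you're editing a stale copy
			return
		}
		// ...update body, bump etag...
		w.WriteHeader(http.StatusNoContent)

	default:
		w.WriteHeader(http.StatusMethodNotAllowed)
	}
}

func main() {
	http.HandleFunc("/doc", handler)
	http.ListenAndServe(":8080", nil)
}
```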
Also, other popular HTTP codes have cache implications.
For example, 404 implies "No indication is given of whether the condition is temporary or permanent". Cache invalidation headers don't apply to this code because a 404 means there's something about the _resource_ and not only the _representation_ that could not be found. That client cannot cache the 404 result, not even for a fraction of a second.
404's brother 410 implies "This condition is expected to be considered permanent.". A client that gets a 410 can cache that result, never needing to reach the server again. It means it's gone forever. That client can decide to never look up that URI again.
Very often, "400 Bad Request" is the best HTTP code you can use if you are not sure what to use. Then, describe what the error means using other HTTP components and/or the response body.
HTTP can be very simple. GET (ask for a representation) and POST (send a representation) as methods only. 200 (success), 400 (client error) and 500 (server error) as response codes only. It's the best way to start, then move to more elaborate protocol features as you learn.
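As a sketch of that minimal vocabulary (Go, with a made-up /echo path and a stand-in save function):

```go
package main

import (
	"encoding/json"
	"net/http"
)

// save stands in for whatever the application actually does with the data.
func save(map[string]any) error { return nil }

func echo(w http.ResponseWriter, r *http.Request) {
	switch r.Method {
	case http.MethodGet:
		// 200: here is a representation.
		w.Write([]byte(`{"hello":"world"}`))

	case http.MethodPost:
		var payload map[string]any
		if err := json.NewDecoder(r.Body).Decode(&payload); err != nil {
			http.Error(w, "bad request body", http.StatusBadRequest) // 400: you screwed up
			return
		}
		if err := save(payload); err != nil {
			http.Error(w, "internal error", http.StatusInternalServerError) // 500: I screwed up
			return
		}
		w.WriteHeader(http.StatusOK)

	default:
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
	}
}

func main() {
	http.HandleFunc("/echo", echo)
	http.ListenAndServe(":8080", nil)
}
```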
Bob has since moved on to crypto, leaving the cache invalidation eternally crippled. New Bob decided that everything is useless and wants to rewrite the whole backend using a faster language.
The ETag can be _anything_. I have an API that serves "files" from a backend storage system. Whenever files are written, a revision number is incremented. This is perfect for a weak validator, and so my ETags are also blissfully short and semantically useful, typically:
ETag: W/"750"
This also means the API can just check the revision number and avoid pulling out and decompressing some of the larger payloads that are stored there, and the implementation is absolutely minimal. It's a great standard.
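A stripped-down sketch of that approach (Go, with a made-up in-memory revision table standing in for the real storage backend):

```go
package main

import (
	"fmt"
	"net/http"
)

// revisions stands in for the backend's per-file revision counter.
var revisions = map[string]int64{"/files/report.csv": 750}

func serveFile(w http.ResponseWriter, r *http.Request) {
	rev, ok := revisions[r.URL.Path]
	if !ok {
		http.NotFound(w, r)
		return
	}

	etag := fmt.Sprintf(`W/"%d"`, rev) // e.g. W/"750"
	w.Header().Set("ETag", etag)

	if r.Header.Get("If-None-Match") == etag {
		// Cheap path: no need to pull or decompress the payload at all.
		w.WriteHeader(http.StatusNotModified)
		return
	}

	// Expensive path: actually fetch the file contents from storage.
	fmt.Fprintf(w, "contents of %s at revision %d\n", r.URL.Path, rev)
}

func main() {
	http.HandleFunc("/files/", serveFile)
	http.ListenAndServe(":8080", nil)
}
```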
Can't it be a strong validator if the files don't change at all between revisions? Or does the revision number only increment on "significant" revisions?
The data does not, but certain metadata elements might. It probably could be a strong validator anyway for its use cases, but I made the decision in a hurry.
An approach like https://github.com/benbjohnson/hashfs allows file names to be content-hashed at runtime. This removes the need for the extra "304 Not Modified" API calls from the client. Content-hash-based file renaming is usually done with a build step that renames the files. For applications where static file serving and HTTP request processing are done in the same application, the renaming can be done in memory without a build step.
I am using that approach in my project https://github.com/claceio/clace. It removes the need for a build step while making aggressive static file caching possible.
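For illustration, a rough sketch of the in-memory variant (Go; the naming scheme and paths are made up, and this is not how hashfs or clace actually implement it):

```go
// Hash each static file once at startup, expose it under a name that embeds
// the hash, and let clients cache it forever.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net/http"
	"os"
	"path/filepath"
)

// hashed maps "app.<hash>.js" -> original file path on disk.
// Templates would link to /static/<hashed name>; a lookup helper from logical
// name to hashed name is omitted here.
var hashed = map[string]string{}

func indexStatic(dir string) error {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return err
	}
	for _, e := range entries {
		if e.IsDir() {
			continue
		}
		path := filepath.Join(dir, e.Name())
		data, err := os.ReadFile(path)
		if err != nil {
			return err
		}
		sum := sha256.Sum256(data)
		ext := filepath.Ext(e.Name())
		base := e.Name()[:len(e.Name())-len(ext)]
		name := fmt.Sprintf("%s.%s%s", base, hex.EncodeToString(sum[:8]), ext)
		hashed[name] = path
	}
	return nil
}

func serveStatic(w http.ResponseWriter, r *http.Request) {
	path, ok := hashed[filepath.Base(r.URL.Path)]
	if !ok {
		http.NotFound(w, r)
		return
	}
	// The name embeds the content hash, so the response can never go stale.
	w.Header().Set("Cache-Control", "public, max-age=31536000, immutable")
	http.ServeFile(w, r, path)
}

func main() {
	if err := indexStatic("./static"); err != nil {
		panic(err)
	}
	http.HandleFunc("/static/", serveStatic)
	http.ListenAndServe(":8080", nil)
}
```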
I use content hashes for some of the images in parts of my site. And I use the IPFS scheme for it, and have the path be under /ipfs/ or some such.
And so you could find the same file on IPFS if anyone served it there, as the content hash in the url tells you what to look for.
Even though on my side I’ve done this all completely manually, so much so that it’s literally just me calculating the IPFS hash on my machine one time and then having symlinks with those content hashes so that /ipfs/ directory on my sites contains content that is served by their IPFS content hash, even though my server does not run an IPFS node or anything.
A very interesting side effect of this is that one time I loaded one of my pages, the web browser actually picked up on the pattern and offered to load those files over actual IPFS!
I was wondering how you can trust an IPFS gateway. Does the browser verify the file is legit using some checksum? Maybe subresource integrity supports IPFS content hashing or something? How does it generate the CID anyway?
How would you use SRI here to verify the CID (and not an additional out-of-band hash) to make sure the gateway isn’t returning some crap to, say, inject malicious JS?
Brave wrote in 2021 that they had plans to verify the CID. Not sure if they have added that or not yet.
But either way, I run Brave with a local IPFS gateway on my laptop. So I think when it said that it could retrieve those files via IPFS for me, it meant in my case that those files would be retrieved from actual IPFS and not via a public IPFS gateway hosted on the web
I've also been dissatisfied with HTTP caching not utilizing content hashes enough. If you're using server-side templating, one issue is that it's not efficient to calculate the hash while you're running the template; it would need to be precalculated to be efficient enough to use.
So I wrote https://github.com/infogulch/xtemplate to scan all assets at startup to precalculate the hash for templates that use it, and if a request comes in with a query parameter ?hash=sha384-xyz and it matches, then it gives it a 1-year immutable Cache-Control header automatically. If a file x.ext has a matching x.ext.gz/x.ext.zst/x.ext.br file then (after hashing the content to make sure it matches) client requests that support it are sent a compressed version streamed directly from disk with sendfile(2). I call this "Optimal asset serving" (a bit bold perhaps).
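For illustration, the ?hash= part could be sketched roughly like this (Go; the asset table and paths are made up, and this is not xtemplate's actual code, which also handles the precompressed sidecar files):

```go
package main

import (
	"crypto/sha512"
	"encoding/base64"
	"net/http"
	"os"
)

// assetHashes maps URL path -> "sha384-<base64 digest>", precomputed at startup.
var assetHashes = map[string]string{}

func precompute(path, file string) error {
	data, err := os.ReadFile(file)
	if err != nil {
		return err
	}
	sum := sha512.Sum384(data)
	assetHashes[path] = "sha384-" + base64.StdEncoding.EncodeToString(sum[:])
	return nil
}

func serveAsset(w http.ResponseWriter, r *http.Request) {
	want := assetHashes[r.URL.Path]
	if want != "" && r.URL.Query().Get("hash") == want {
		// The URL pins the exact content, so the response can never go stale.
		w.Header().Set("Cache-Control", "public, max-age=31536000, immutable")
	}
	http.ServeFile(w, r, "."+r.URL.Path)
}

func main() {
	precompute("/static/app.js", "./static/app.js")
	http.HandleFunc("/static/", serveAsset)
	http.ListenAndServe(":8080", nil)
}
```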
How is the sample `calculateETag()` function generating a weak ETag? It looks like it will generate a different hash due to any JSON formatting changes.
It seems like generating a weak ETag would take more effort, since you'd need to either ensure consistent ordering and formatting of the JSON, or generate the ETag on the content before converting it to a JSON string.
That's because it isn't really generating a weak ETag. From the article:
> You could make the `calculateETag` function format-agnostic, so the hash stays the same if the JSON format changes but the content does not. The current `calculateETag` implementation is susceptible to format changes, and I kept it that way to keep the code shorter.
They seem to agree, a true weak ETag implementation would probably be trickier and require more code :P I'd be fascinated to see how that might work in practice, though.
I could probably write one pretty quick, under the assumption that we are storing only JS-compatible JSON with no encoding hiccups (JSON sadly isn’t as standard as it appears at first glance…) just hash(JSON.stringify(JSON.parse(fileText))) and you’re done. This assumes the same parse and serialize methods are expected to be used at both ends, that they only normalize formatting and that you don’t have to worry about number representation doing weird things. I wouldn’t actually sort keys as sorting is technically a change in behavior and good browsers today do not re-order object keys for you, though your code can do that, of course. I considered skipping the second JSON serialize, but it makes a buffer out of an object so it’s easy enough to use. One could imagine a more efficient approach would modify the hashing to occur against buffer chunks of the JSON, but intentionally skip the whitespace. This avoids unintentional data serialization but obviously the parsing routine would have to match the recipient exactly to work correctly. And it still assumes you’re receiving oddly formatted but valid JSON, which doesn’t sound like a safe assumption to make. If your JSON varies in format I wouldn’t ever want to assume I’d be able to parse it correctly. I mean, what if a return character slips in by mistake amongst all the newlines?
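A rough Go equivalent of that parse-then-reserialize idea, with one caveat: unlike JSON.stringify on a parsed object, Go's json.Marshal sorts map keys, so key order gets normalized too (names below are made up):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// formatAgnosticETag hashes the parsed value rather than the raw bytes, so
// whitespace and key-order changes don't change the ETag. Like the JS sketch,
// it trusts the default number handling.
func formatAgnosticETag(raw []byte) (string, error) {
	var v any
	if err := json.Unmarshal(raw, &v); err != nil {
		return "", err
	}
	normalized, err := json.Marshal(v) // no whitespace, sorted map keys
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(normalized)
	return `W/"` + hex.EncodeToString(sum[:]) + `"`, nil
}

func main() {
	a := []byte(`{"b": 1, "a": 2}`)
	b := []byte("{\n  \"a\": 2,\n  \"b\": 1\n}")
	ea, _ := formatAgnosticETag(a)
	eb, _ := formatAgnosticETag(b)
	fmt.Println(ea == eb) // true: same content, different formatting
}
```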
I'm kind of thinking now, why bother with generating a weak ETag at all? Unless your backend is doing things that would commonly cause differences in JSON formatting for the same data, this is probably a rare case and not worth the extra effort or processing. Figure it out when you're at a scale that it actually matters, and stick with strong ETags for now.
It's good to know about this option for handling in the frontend if a system returns one though.
Yeah, I’ve never really heard of “weak etags” before in any sort of common usage of the term. Honestly, most people tend to skip etags by embedding hashes in filenames directly; this way you can avoid any bad proxies serving up stale content or dropping headers. It’s rare these days to be an issue given the use of TLS end-to-end encryption, but I’m sure it still occasionally happens. And yes, the more serious approach to possibly poorly formatted JSON is to “normalize it” into the expected format. It’s less about caching and more about ensuring what you serve to your front end is consistent, even if you are liberal in what inputs you can handle. E.g. if someone gives you XML, rather than write a front end that can handle both XML and JSON, pull the data out of both and make your own JSON later.
I've not seen a very convincing use-case for ETags vs Last-Modified date caching.
In the example request, the server still has to do all of the work generating the page, in order to calculate the ETag and then determine whether or not the page has changed. In most situations, it's simpler to have timestamps to compare against, because that gives the server a faster way to spot unmodified data.
e.g. you get an HTTP request for some data that you know is sourced from a particular file or a DB table. If the client sends an If-Modified-Since (or whatever the header name is), you have a good chance of being able to check the modified time of the data source before doing any complicated data processing, and are able to send back a not-modified response sooner.
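A small sketch of that early exit (Go; the data file path is a stand-in for wherever the data actually lives, be it a file or a DB column):

```go
package main

import (
	"net/http"
	"os"
	"time"
)

func report(w http.ResponseWriter, r *http.Request) {
	info, err := os.Stat("data/source.csv")
	if err != nil {
		http.Error(w, "data source unavailable", http.StatusInternalServerError)
		return
	}
	// HTTP dates have second precision, so truncate before comparing.
	modTime := info.ModTime().UTC().Truncate(time.Second)
	w.Header().Set("Last-Modified", modTime.Format(http.TimeFormat))

	// Cheap early exit: if the source hasn't changed since the client's copy,
	// skip the expensive processing entirely.
	if ims := r.Header.Get("If-Modified-Since"); ims != "" {
		if t, err := http.ParseTime(ims); err == nil && !modTime.After(t) {
			w.WriteHeader(http.StatusNotModified)
			return
		}
	}

	// ...expensive processing and response generation would go here...
	w.Write([]byte("freshly generated report\n"))
}

func main() {
	http.HandleFunc("/report", report)
	http.ListenAndServe(":8080", nil)
}
```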
When I use ETags, I'll generate the ETag value once (on startup), then cache and serve that value. When the resource changes, regenerate the ETag value.
Obviously doesn't work if you're using ETags on dynamic resources, but works well for non-dynamic, but unpredictably frequently changing resources.
I recently implemented this, great write-up. Regarding the hashing function, I’m curious about opinions. In my implementation I went for a cheap but weak cryptographic hash at first. Then I got worried that some auditor would flag it and time would be wasted convincing them to change their mind. But then I stumbled upon FNV [1], a non-cryptographic hash and part of Go’s standard library and went for it. Any thoughts?
Also, ETag is exactly the kind of thing non-cryptographic hashes are meant for, but if you can't convince them, Blake3 is a very fast, modern cryptographic hash function.
I feel like people tend to overthink in this regard. If SHA-256 hashing is good enough for GitHub's REST endpoints, it's good enough for me.
If you're implementing weak validation, then you might need to preprocess the payload before running it through the hash function. For example, if your payload is JSON and you want to make it format-agnostic, then you'll need to normalize the payload and then compute the hash.
In either case, the hashing algo probably doesn't matter as much.
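For what it's worth, the FNV route is tiny in Go, since hash/fnv is in the standard library; a sketch:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// etagFor hashes a response payload with 64-bit FNV-1a. Fast and fine for
// cache validation; just don't use it anywhere that needs collision resistance
// against an attacker.
func etagFor(body []byte) string {
	h := fnv.New64a()
	h.Write(body) // hash.Hash writes never return an error
	return fmt.Sprintf(`"%x"`, h.Sum64())
}

func main() {
	fmt.Println(etagFor([]byte(`{"hello":"world"}`)))
}
```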
I was refactoring a project serving user uploaded files yesterday, and had the occasion to test caching. Both Firefox and Chrome used ETag and If-None-Match properly to check and cache queries. Which problems did you encounter?
There was still one thing that surprised me a bit (but also makes sense). Images are fetched only once per page load in my testing. If an image with 60sec of cache is loaded, then removed by JS and added back after 2 minutes, then the browser will reuse the image from the initial load.
1) I (AWS CloudFront) supply the ETag and If-None-Match headers; I can see those headers in responses.
2) Browsers sometimes do respect that (once in a while I see 304 in responses), but 99% of the time they don't include ETag/If-None-Match in requests, and thus I never get 304 responses (even though nothing changed: not CloudFront, not the resource, not the data). Instead they perform some other caching and reload the whole resource again with a TTL that does not seem to come from my headers, totally disregarding the ETag/If-None-Match logic.
For videos it is even worse. Unless you set `preload="none"` in the HTML, Safari, Firefox and Chrome all have different policies and try to preload all videos on screen, ignoring lazy-loading HTML attributes. Worst of all, caching does not work well: videos will be re-requested almost every time, with ETag/If-None-Match totally ignored.
The same happens for all browsers (Chrome, Safari, Firefox) with default settings. But I do agree this is not normal behavior :/
Actually, caching is happening, but it does not follow the ETag or caching policy headers that backends return; instead some in-browser internal caching policy is being applied.
Yep, I tried countless variations of this while testing. Did not work!
My conclusion in the end: this was intentional by browsers. They give priority to their internal caching policies for performance or user experience :/
I was under the impression that LLMs don't generate grammatical errors often, if at all, so "what they needs to know" caught my attention. It did trigger my "empty praise" vacuous-spam detector though.
In the blog post that was shared in this Hacker News post they explore themes such as ETag, and leverage diagrams and examples to present a dynamic presentation that elevates understanding for the reader in a compelling manner.
Thank you for sharing your insights and highlighting the innovative approaches discussed in the blog post, especially in relation to ETag. The utilization of diagrams and real-world examples undoubtedly enhances the reader’s comprehension in a substantive way. We firmly believe that leveraging such dynamic presentations not only facilitates a deeper understanding but also fosters an environment of learning that is both engaging and informative. It’s encouraging to see the community’s positive reception and the valuable discussions that emerge around these key technological concepts.
We appreciate your insightful comment about the benefits of HTTP ETag. It's indeed a powerful mechanism for optimizing web performance and reducing bandwidth usage. Thank you for sharing your expertise on this topic!