SRE agents are the worst agents. I totally get why business and management will demand them and love them. After all, they are the n+1 of the customer support chatbots you get frustrated talking to before you find the magic way to get to a person.
We have been using a few different SRE agents and they all fucking suck. The way they are promoted and run always makes them eager to “please” by inventing processes, services, and work-arounds that don’t exist or make no sense. Giving examples will always sound petty or “dumb”. Every time I have to explain to management where the SRE agent failed, they just hand-wave it and assume it’s a small problem. And the problem is, I totally get it. When the SRE agent says “DNS propagation issues are common. I recommend flushing the DNS cache or trying again later” or “The edge proxy held a bad cache entry. The cache will eventually get purged and the issue should be solved eventually”, it sounds so reasonable and “smart”. The issue was in DNS or in the proxy configuration. How smart was the SRE agent to get there? Management thinks it’s phenomenal, and maybe it is. But I know that the “DNS issue” isn’t gonna resolve itself, because we have a bug in how we update DNS. I know the edge proxy cache issue is always gonna cause a particular use case to fail, because the way cache invalidation is implemented has a bug. Everyone loves deflection (including me) and “self correcting” systems. But it just means that a certain class of bugs will forever be “fine”, and maybe that’s fine. I don’t know anymore.
That’s my experience working with most SRE humans too. They’re more than happy to ignore the bug in DNS and build a cron job to flush the cache every day instead.
So in some sense the agent is doing a pretty good job…
I have no personal experience with the SRE agents, but I used Codex recently when trying to root cause an incident after we'd put in a stopgap. Once I had assembled a set of facts and log lines, it did the last-mile debugging of looking through the code for me and accurately pointed me to some code I had ignored in my mental model because it was so trivial I didn't think it could be an issue.
That experience made me think we're getting close to SRE agents being a thing.
And as the LLM makers like to reiterate, the underlying models will get better.
Which is to say, I think everyone should have some humility here because how useful the systems end up being is very uncertain. This of course applies just as much to execs who are ingesting the AI hype too.
I guess that depends on how you use agents (SRE or in general). If you ask it a question (even implicitly) and blindly trust the answer, I agree. But if you have it help you find the needle in the haystack, and then verify that it did indeed find the needle, suddenly it’s a powerful tool.
Have you used Amazon Q? It's actually pretty handy at investigating, diagnosing, and providing solutions for AWS issues. For some reason none of our teams use it, and waste their time googling or opening tickets for me to answer. I go to Q and ask it, it provides the answer, I send it back to the user. I don't think an "SRE Agent" will be useful because it's too generic, but "Agent customized to solve problems for one specific product/service/etc" can actually be very useful.
That said, I think you're right that you can't really replace an Operations staff, as there will always need to be a human making complex, multi-dimensional decisions around constantly changing scenarios, in order to keep a business operational.
I agree. I actually think CSS (and SQL or other “perfectly functional” interfaces) hold some kind of special power when it comes to AI.
I still feel that the main revolution of AI/LLMs will be in authoring text for such “perfectly functional”, text-based interfaces.
For example, building a “powerful and rich” query experience for any product I worked on was always an exercise in frustration. You know all the data is there, and you know SQL is infinitely capable. But you have to figure out the right UI and the right functions for that UI to call to run the right SQL query to get the right data back to the user.
Asking the user to write the SQL query is a non-starter. You either build some “UI” for it based on what you think are the main use cases, or go all in and invent a new “query language” that you think (or hope) makes sense to your user. Now you can ask your user to blurb whatever they feel like, and hope your LLM can look at that and your db schema and come up with the “right” SQL query for it.
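To make that concrete, here is a minimal Python sketch of the idea (not any particular product's implementation; the schema and the `ask_llm` callable are placeholders for whatever database and completion API you'd actually use):

    # Sketch: turn a user's free-text request into SQL by prompting an LLM
    # with the schema. `ask_llm` stands in for whatever completion API you use.
    SCHEMA = "CREATE TABLE orders (id INTEGER, customer TEXT, total REAL, created_at TEXT);"

    def nl_to_sql(user_request: str, ask_llm) -> str:
        prompt = (
            "Given this SQLite schema:\n"
            f"{SCHEMA}\n"
            f"Write a single SQL query that answers: {user_request}\n"
            "Return only the SQL."
        )
        return ask_llm(prompt).strip()

    # e.g. sql = nl_to_sql("total sales per customer last month", ask_llm=my_llm_client)

You would still want to validate or sandbox whatever query comes back before running it against real data.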
Hey! Don't you dare to compare SQL and CSS. SQL is not a cobbled together mess of incremental updates with 5 imperfect ways of achieving common tasks that interact in weird ways. Writing everything in SQL-92 in 2026 is not gonna get you weird looks or lock you out of features relevant for end users. If writing SQL for your problem feels difficult, it's a good sign you ought to look at alternatives (e.g. use multiple statements instead). Writing the right CSS being difficult is normal.
> Don't you dare to compare SQL and CSS. SQL is not a cobbled together mess of incremental updates with 5 imperfect ways of achieving common tasks that interact in weird ways.
Reminds me a little bit of Sacha Baron Cohen's democracy speech [1] in The Dictator ;-)
Both SQL and CSS have evolved through different versions and vendor-specific flavors, and have accumulated warts and different ways to do the same thing. Both feel like a superpower once you have mastered them, but are painful to get anything done with while you're still on the steep learning curve.
> Tailwind Labs relied on a weird monetization scheme. Revenue was proportional to the pain of using the framework.
Really? To me, Tailwind seemed like the pinnacle of how anyone here would say “open source software” should function. Provide solid, truly open source software and make money from consulting, helping others use it, and selling custom-built solutions around it. The main sin of Tailwind was assuming that type of business could scale to a “large business” structure as opposed to a “single dev”-type project. By “single dev”-type I don’t mean literally one guy, but rather a very lean structure that isn’t corporate or company-like.
Vercel (and Redis Labs, Mongo, etc.) are different because they are in the “we can run it for you” business, which is another “open source” model I have dabbled in for a while in my career, thinking that the honest and ethical position is to provide open source software, then offer to host it for people who don’t want to self-host and charge for that.
From the developer's perspective, not much changes despite the organizational structure being completely different in this comparison (trillion-dollar company vs. 10 individual contributors).
Tailwind Labs' revenue stream was tied to documentation visits; that was the funnel.
The author's argument was that this revenue stream was destroyed by a slight quality-of-life improvement (having LLMs fill in CSS classes).
Tailwind Labs benefits from:
a) documentation visits
b) the inability to implement a desired layout using the framework (and CSS being unpleasant).
It seems there is a conflict of interest between the developer expecting the best possible experience and the main revenue stream. Given that a slight accidental improvement in quality of life and autonomy for users destroyed the initiative's main revenue stream, it would be fair to say it doesn't just "seem like" a conflict of interest.
Definitely disagree with it being the "pinnacle" of how open source should function, but I also won't provide any examples because it is beside the point. I will point out that the FSF has been fine for many decades now, and a foundation with a completely different structure, like the Zig foundation, seems to be OK with somewhat proportional revenue (orders of magnitude less influence, adoption, and users; maybe 10-20x less funding).
Fish is also not POSIX, which has always been its, and my, issue. I use zsh+starship with my own very minimal init stuff plus the zsh autocomplete and syntax-highlighting plugins. It’s not a perfect setup. I wish fish would “just work” but it doesn’t; I frequently had to look up “workarounds” for my setup. 25 years in, I think I got it, and I just keep my zshrc and `machine-init.sh` on point for my-own-style experience. I think a lot of that could be simplified with fish+starship, but it’s just not there.
> Fish is also not POSIX which has always been its, and my, issue
Could you give some examples of issues you encountered because of that? I've been using fish for about 8 years now and I can't remember an instance where that was a problem in interactive use.
Same here. More than 5 years with fish, and there have been maybe 5 times when not-POSIX was an “issue”, which I’ve been solving by temporarily entering bash and rerunning the command there.
The issue is the cognitive overhead of knowing 2 distinct shell languages: one you use, and one (almost) everyone else uses. If the latter isn't a concern of yours and fish is all you interact with, then there's no issue whatsoever for interactive and/or scripting use.
Not to be funny, but is POSIX scripting even still relevant? It's well understood that shell scripts should only be used for quick and simple tasks, and anything more serious or demanding should be done using something like Python instead. But these quick and dirty tasks are very easy for LLM coding agents to do in Python. I used to have dozens of shell scripts, each no more than tens of lines long, in my ~/bin/, but I had an LLM rewrite all of them in Python, adding proper argument handling, --help messages, and error handling in the process. I sincerely don't think I'll ever write another bash script again.
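For what it's worth, the kind of rewrite I mean looks roughly like this (a made-up example, not one of my actual scripts): a ten-line grep-style wrapper becomes a Python script with argparse, a --help message, and an error path.

    #!/usr/bin/env python3
    # Hypothetical example of a tiny ~/bin script rewritten in Python:
    # print log lines containing a substring, with argument and error handling.
    import argparse
    import pathlib
    import sys

    def main() -> int:
        parser = argparse.ArgumentParser(description="Print log lines containing PATTERN.")
        parser.add_argument("pattern", help="substring to look for")
        parser.add_argument("logfile", type=pathlib.Path, help="path to the log file")
        args = parser.parse_args()

        if not args.logfile.is_file():
            print(f"error: no such file: {args.logfile}", file=sys.stderr)
            return 1

        for line in args.logfile.read_text(errors="replace").splitlines():
            if args.pattern in line:
                print(line)
        return 0

    if __name__ == "__main__":
        sys.exit(main())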
Do you know how many CI/CD pipelines run on shell scripts?
Another example is small utilities. I wrote one to log in to MySQL DBs at work. We have to use Teleport, which I dislike, and it has MFA. So I made a small SQLite DB that has cluster names, endpoints, and secret ARNs (no actual secrets, only metadata), and then wrote a shell function that uses fzf to pick from the SQLite DB contents, then ssh tunnels through an EC2 with Teleport to the selected MySQL DB, using expect with a call to 1Pass to answer the password request, and then picks the first available port >=6033 to connect the mysql client. It also tracks the MySQL DB : port assignments in the SQLite DB and waits for the client to exit so it can tear down the ssh tunnel. The only input I have to do beyond selecting the DB is answering a push notification on my phone for MFA.
> replacing 10-LOC shell scripts with Python
The startup time would drive me insane. I love Python, but it’s not great for stuff like that, IMO.
For me it’s always been the inability to copy a command from Stack Overflow (or, in the modern day, from ChatGPT) into your shell. Maybe it’s better now, but the last time I seriously gave fish a chance was 2014.
Also, one of my main use cases is documenting things other developers can do to make their lives easier. There are a handful of things where zsh behaves differently than bash. And while those handful of things are not even POSIX or shell things, they often come up.
The reality is, every day I’m fighting with “developers” who don’t know what the difference between AWS, Linux, and bash is. Throwing “fish” into the mix seems like I’m just being obtuse for no reason. I have spent hours trying to explain to some dumbass that git-bash on Windows is not the same thing as Linux, only for them to call me the “oh, he really cares about ‘bash’” guy, while claiming they are “Linux developers” as they use macOS.
I confess I don't really get this. Fish and Bash are different languages in the same way that Ruby and Perl are different languages. And if I want to run a Perl script, I don't try to run it in a Ruby interpreter, and I don't get grumpy at Ruby for not being source-compatible with Perl.
Which is to say, if you need to run a Bash script, run it via `bash foo.sh` rather than `./foo.sh`. There's no need to limit yourself to a worse shell just because there exist some scripts out there written in that shell's language.
There's nothing even preventing the second form from working either. Just put the right shebang at the top of the script and it'll run through that interpreter. I've been on fish for a decade, but still write all my shell scripts in Bash. It's never been an issue.
It’s not a gimmick feature. It’s just that the “user” is always, inherently, in “control” of the kernel itself when it comes to Linux. That’s not true with NT or Darwin. You (a 3rd party) can always verify NT or Darwin’s “integrity” by checking it’s cryptographically signed by Microsoft or Apple. Other than assuming that Valve (or Sony, Nintendo, Debian, SUSE, RedHat, etc.) is the “trusted kernel” vendor for your game, you can’t do that with Linux. And the moment you say “my application only runs on kernels signed by {insert organization}”, are you really “Linux”?
The reality is the overwhelming majority of desktop linux users are probably using a kernel shipped by their distro, be it Fedora, Debian, Ubuntu, Valve, whatever. Those kernels could be attested.
I agree with your sentiment though. It's a wild future we're considering, just so some people can play video games and complain less about supposed cheaters (or often, skill issues, but I digress).
Yeah, I agree. The majority of people running any OS expect a vendor that manages their OS. Even those running Arch are rarely patching things by hand; they just follow whatever is in the official repos or wiki.
However, I believe part of the huge positive sentiment about “Linux gaming” online is that, so far, it’s been truly “Linux gaming”. Once it becomes “Valve’s gaming”, it’s really no different from a PS5 or Switch using Linux as its base OS while really being Sony’s or Nintendo’s device.
I don’t really understand what that means. Are you, or anyone, expecting a signed Linux kernel by some organization (say Valve or Debian or whatever) that will be the “Gaming Kernel”? If not, no Linux kernel feature is safe from 1 patch and a custom build.
The stock Linux kernel in Fedora, for example, is signed by MS, so Secure Boot allows booting it without modification. A kernel booted via Secure Boot is locked down by default. To unlock it, you need to patch the kernel source, rebuild it, sign it with your own key, and install that key via UEFI to boot it in Secure Boot mode. Your custom key will not pass remote attestation.
They are not signed by MS; they are dual-signed by a CA that MS runs as a service for UEFI Secure Boot, as well as by the distro’s own CA.
If you were around in the late 2000s when UEFI Secure Boot was being proposed, you’d remember the massive hysteria about how “Secure Boot is an MS plot to block Linux installs”. Even though the proposal was just to allow the UEFI to verify the signature of the binary it boots, and to allow the user to provide the UEFI with the keys to trust, the massive fear was that motherboard manufacturers would be too lazy (or be bought by MS) and only allow MS keys, or that the process to enroll a new key would be difficult enough to discourage people from installing Linux (because you know, I’m all for the freedom and fuck-Microsoft camp, until it’s expected that I verify a signature). So Microsoft offered a CA service, like HTTPS CAs, but for boot signing.
Assuming you’re a good Linux user, you can always just put your favorite distro’s signing key in your UEFI without accepting the MS CA in there.
Well, if you walk backwards 10 paces and look at the big picture here, what MS did enables anti-cheat attestation via TPM, and that in turn can act as a feature that structurally, via the market, reduces the appeal of Linux.
Signing your own custom-built kernel (if you need to adjust flags etc., like I do) won't result in a certificate chain that will pass the kind of attestation being sketched out by the OP article here.
Yes because you’re trying to communicate that trust to other players of the game you’re playing as opposed to yourself.
It’s why I hate the term “self-signed” vs “signed” when it comes to TLS/HTTPS. I always try to explain to junior developers that there is no such thing as “self-signed”. A “self-signed” certificate isn’t less secure than a “signed” certificate. You are always choosing who you want to trust when it comes to encryption. Out of convenience, you delegate that to the vendor of your OS or browser, but it’s always a choice. But in practice, it’s a very different equation.
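A small illustration of that choosing-who-to-trust point, assuming Python's requests library (the URL and cert path below are made up): pointing verify= at a cert you obtained out of band is the exact same mechanism as the default, just with a trust root you chose yourself instead of the OS/browser bundle.

    import requests

    def fetch_with_pinned_ca(url: str, ca_path: str) -> requests.Response:
        # verify= accepts a path to a CA bundle or certificate file; a "self-signed"
        # cert distributed out of band isn't less secure, it's just a trust root you
        # picked explicitly instead of delegating to the default bundle.
        return requests.get(url, verify=ca_path, timeout=10)

    # fetch_with_pinned_ca("https://internal.example", "/path/to/pinned-cert.pem")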
I mean the approach the article is talking about: creating a safe hypervisor and safe kernel that games can get an attestation of, in order to trust that they are running on a secure platform.
Eh, and in 20 years, if SerpApi or whatever the fuck becomes the next Google, they’ll have a blog post titled “Why we’re taking legal action against BlemFlamApi data collection”.
The biggest joke was all the “hackers” 25 years ago shouting “Don’t be evil like Oracle, Microsoft, Apple or Adobe and charge for your software, be good like Google and just put like a banner ad or something and give it away for free”
We need a legal precedent that enshrines adversarial interoperability as legal so that we can have a competitive market of BlemFlamApis with no risks of being sued.
Yeah, Seattle food is a lost cause. There are really no reliable, available-every-day type places here anymore. It's all becoming the "super quirky and artisanal, we pretend to grow our own lettuce and provide great vibes" $25 (+tip, please, to support our essential workers) BLT. Then, predictably, the place closes down once their lease is over. Rinse and repeat.
Even the old reliable mom-and-pop-owned places (Teriyaki, Thai, Pho, Sichuanese) have all been hollowed out by what's been happening in the ID and have been transitioning to that "great vibe" model too as a last-ditch effort to survive. There was a bit of a resurgence of food trucks these last couple of years, but laws limit what can be made in a food truck.
I’m still frustrated by the criticism, because I internalized it a couple of years ago and tried to move to age+minisign, since those are the only 2 scenarios I personally care about. The overall experience was annoying, given that the problems with pgp/gpg are esoteric and abstract enough that, unless I’m personally worried about a targeted attack against me, they are fine-ish.
If someone scotch-tapes age+minisign together and convinces git/GitHub/GitLab/Codeberg to support it, I’ll be so game it’ll hurt. My biggest usage of pgp is asking people doing bug reports to send me logs and giving them my pgp key if they are worried and don’t want to publicly post their log file. 99.9% of people don’t care, but I understand the 0.1% who do. The other use is to sign my commits and to encrypt my backups.
PS: the fact that this post is recommending Tarsnap and magic-wormhole shows how badly it has aged in 6 years, IMO.
Is this about commit signing? Git and all of the mentioned forges (by uploading the public key in the settings) support SSH keys for that afaik.
git configuration:
gpg.format = ssh
user.signingkey = /path/to/key.pub
If you need local verification of commit signatures, you also need gpg.ssh.allowedSignersFile to list the known keys (including yours). ssh-add can remember credentials. Security keys are supported too.
Has Tarsnap become inadequate, security-wise? The service may be expensive for a standard backup. It had a serious bug in 2011, but hasn't it been adequate since then?
I don’t know anything that makes me think it’s inadequate per se, but it’s also been more than 10 years since I thought about it. Restic, gocryptfs, and/or age are far more flexible, more generic, and flat-out better at managing encrypted files/backups, depending on how you want to orchestrate it. Restic can do everything, gocryptfs+rclone can do more, etc.
It’s just not the same thing. There is significant overlap, but it’s not enough to be a reasonable suggestion. You can’t suggest a service as a replacement for a local offline tool. It’s like saying “Why do you need VLC when you can just run peertube?”. Also since then, age is the real replacement for pgp in terms of sending encrypted files. Wormhole is a different use case.
There are two parts of "sending encrypted files": the encryption and the sending. An offline tool (e.g. PGP or age) seems only necessary when you want to decouple the two. After all, you can't do the sending with an offline tool (except insofar as you can queue up a message while offline, such as with traditional mail clients).
The question thereby becomes "Why decouple the sending from encryption?"
As far as I can see, the main (only?) reason is if the communication channel used for sending doesn't align with your threat model. For instance, maybe there are multiple parties at the other end of the channel, but you only trust one of them. Then you'd need to do something like encrypt the message with that person's key.
But in the use-case you mentioned (not wanting to publicly post a log file), I don't see why that reason would hold; surely the people who would send you logs can trust Signal every bit as easily as PGP. Share your Signal username over your existing channel (the mailing list), thereby allowing these people to effectively "upgrade" their channel with you.
Sticking to the use case of serving that 0.1% of users, why can’t a service or other encrypted transport be a solution? Why doesn’t Signal fit the bill for instance?