As a data point, I've been running stuff at Hetzner for 10 years now, in two datacenters (physical servers). There were brief network outages when they replaced networking equipment, and exactly ONE outage for hardware replacement, scheduled weeks in advance, with a 4-hour window and around 1-2h duration.
It's just a single data point, but for me that's a pretty good record.
It's not that Hetzner is miraculously better at infrastructure; it's that physical servers are way simpler than the extremely complex software and networking systems that AWS provides.
Well, the complexity comes not from Kubernetes per se but from the fact that the problem it wants to solve (a generalized solution for distributed computing) is very hard in itself.
Only if you actually have a system complex enough to require it. A lot of systems that use Kubernetes are not complex enough to require it, but use it anyway. In that case Kubernetes does indeed add unnecessary complexity.
Except that k8s doesn't solve the problem of generalized distributed computing at all. (For that you need distributed fault-tolerant state handling which k8s doesn't do.)
K8s solves only one problem - the problem of organizational structure scaling. For example, when your Ops team and your Dev team have different product deadlines and different budgets. At this point you will need the insanity of k8s.
I am so happy to read that someone views Kubernetes the same way I do. For many years I have been surrounded by people who "kubernetes all the things" and that is absolute madness to me.
Yes, I remember when Kubernetes hit the scene and it was only used by huge companies who needed to spin-up fleets of servers on demand. The idea of using it for small startup infra was absurd.
As another data point, I run a k8s cluster on Hetzner (mainly for my own experience, as I'd rather learn on my pet projects vs production), and haven't had any Hetzner-related issues with it.
So Hetzner is OK for the overly complex as well, if you wish to do so.
I think it sounds quite realistic especially if you’re using something like Talos Linux.
I’m not using k8s personally but the moment I moved from traditional infrastructure (chef server + VMs) to containers (Portainer) my level of effort went down by like 10x.
Yes, and mobile phones existed before smartphones; what's the point? So far, in terms of scalability, nothing beats k8s. And from OpenAI and Google we also see that it works even for high-performance use cases such as LLM training with huge numbers of nodes.
On the other hand, I had the misfortune of having a hardware failure on one of my Hetzner servers. They got a replacement hard drive in fairly quickly, but it still meant complete data loss on that server, so I had to rebuild it from scratch.
This was extra painful because I wasn't using one of the OSes blessed by Hetzner, so it required a remote install. Remote installs require a system that can run their Java web plugin and a connection stable and fast enough not to time out. The only way I have reliably gotten them to work is by having an ancient Linux VM, also running in Hetzner, with the oldest Firefox version I could find that still supported Java in the browser.
My fault for trying to use what they provide in a way that is outside their intended use, and props to them for letting me do it anyway.
That can happen with any server, physical or virtual, at any time, and one should be prepared for it.
I learned a long time ago that servers should be an output of your declarative server management configuration, not something that is the source of any configuration state. In other words, you should have a system where you can recreate all your servers at any time.
In your case, I would indeed consider starting with one of the OS base installs that they provide. Much as I dislike the Linux distribution I'm using now, it is quite popular, so I can treat it as a common denominator that my ansible can start from.
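As a purely illustrative sketch of what I mean by "servers as an output of configuration" (my real setup is ansible; the package and service names below are made-up placeholders), the core of it is a desired-state file plus an idempotent converge step, so any server can be rebuilt from a base install at any time:

    #!/usr/bin/env python3
    """Sketch: converge a fresh Debian-style base install toward a desired state.
    Illustrative only; package and service names are placeholders."""
    import subprocess

    DESIRED_STATE = {
        "packages": ["nginx", "fail2ban"],  # placeholders
        "services": ["nginx"],
    }

    def run(*cmd):
        subprocess.run(cmd, check=True)

    def converge(state):
        # Idempotent: running this twice leaves the server in the same state,
        # so replacing a dead server is just "apply the config again".
        run("apt-get", "update")
        run("apt-get", "install", "-y", *state["packages"])
        for svc in state["services"]:
            run("systemctl", "enable", "--now", svc)

    if __name__ == "__main__":
        converge(DESIRED_STATE)

The point isn't the tooling; it's that nothing on the server is hand-crafted state you would mourn losing.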
For custom setups, they also allow netbooting into a recovery OS from which the disks can be provisioned over an ssh session.
There are likely cases that still require the remote "keyboard", but I wanted to mention that option.
Do you monitor your product closely enough to know that there weren't other brief outages? E.g. something on the scale of unscheduled server restarts or minute-long network outages?
I personally do, through status monitors at larger cloud providers with 30-second resolution, and I've never noticed any downtime. They will sometimes drop ICMP, though, even though the host is alive and kicking.
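A minimal sketch of that kind of check (illustrative only, not my actual monitor; the target address is a placeholder): poll a TCP port every 30 seconds instead of relying on ICMP, so a host that drops ping still shows as up:

    #!/usr/bin/env python3
    """Tiny uptime probe: TCP connect instead of ICMP ping."""
    import socket
    import time
    from datetime import datetime, timezone

    HOST, PORT = "203.0.113.10", 443  # placeholder target
    INTERVAL = 30                     # seconds, matching the resolution above

    def is_up(host, port, timeout=5):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    while True:
        status = "up" if is_up(HOST, PORT) else "DOWN"
        print(f"{datetime.now(timezone.utc).isoformat()} {HOST}:{PORT} {status}", flush=True)
        time.sleep(INTERVAL)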
Actually, why do people block ICMP? I remember in 1997-1998 there were some Cisco ICMP vulnerabilities, and people started blocking ICMP then and mostly never stopped, and I never understood why. ICMP is so valuable for troubleshooting in certain situations.
Security through obscurity, mostly. I don't know who keeps pushing the advice to block ICMP without a valid technical reason; at best, if you tilt your head and squint, you could almost maybe see a (very new) script kiddie being defeated by it.
I've rarely actually seen that advice anywhere, more so 20 years ago than now, but people are clearly still getting it from circles I don't run in.
I do. Routers, switches, and power redundancy are solved problems in datacenter hardware. Network outages rarely occur because of these systems, and if any component goes down, there's usually an automatic failover. The only thing you might notice is TCP connections resetting and reconnecting, which typically lasts just a few seconds.
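To make the "few seconds of resets" concrete, here is a generic reconnect-with-backoff sketch (illustrative only; the endpoint is a placeholder, not any particular service): a client written this way absorbs a short failover instead of surfacing it as an outage:

    #!/usr/bin/env python3
    """Sketch: ride out a few-second failover by reconnecting with backoff."""
    import socket
    import time

    HOST, PORT = "203.0.113.10", 5432  # placeholder endpoint

    def connect_with_retry(host, port, max_wait=30):
        delay = 0.5
        deadline = time.monotonic() + max_wait
        while True:
            try:
                return socket.create_connection((host, port), timeout=5)
            except OSError:
                if time.monotonic() >= deadline:
                    raise
                time.sleep(delay)
                delay = min(delay * 2, 5)  # exponential backoff, capped at 5 s

    conn = connect_with_retry(HOST, PORT)
    print("connected:", conn.getpeername())
    conn.close()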
I have for some time now, at the scale of around 20 hosts in their cloud offering. No restarts or network outages. I do see "migrations" from time to time (a VM migrating to different hardware, I presume), but without any impact on metrics.
To stick to the above point, this wasn't a minute-long outage.
If you care about seconds- or minutes-long outages, you monitor. Running on AWS, Hetzner, OVH, or a Raspberry Pi in a shoe box makes no difference.