Oh, but it could be improved. Linux can cold boot in under 300 ms (easier if you control the firmware and can tune it for speed, as coreboot can), and faster still when resuming from RAM. That should let you load-balance while powering off the extra capacity and waking it on demand with Wake-on-LAN.
If load grows too high for the SBC or gets close to capacity, wake the server and perform a handover once it's up. You can either hold the packets and use the SBC as a proxy, or change your router's config to point at the newly awakened server (alternatively, just swap IP or MAC addresses). With a bit of magic to avoid dropping existing connections (I believe home ISP routers keep NAT connections open when a port forward is changed), it would work. Obviously it's even easier with a proper load balancer.
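The wake-up part is the easy bit: a WoL "magic packet" is just a UDP broadcast with a fixed layout. A minimal sketch in Python (the MAC address, broadcast address, and threshold logic would come from your own setup; this is an illustration, not a full handover implementation):

```python
import socket

def magic_packet(mac: str) -> bytes:
    # A WoL magic packet is 6 bytes of 0xFF followed by the
    # target MAC address repeated 16 times (102 bytes total).
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    return b"\xff" * 6 + mac_bytes * 16

def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    # Magic packets are conventionally sent as UDP broadcast on port 9 (or 7);
    # the NIC only inspects the payload, so the port barely matters.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(magic_packet(mac), (broadcast, port))
```

The SBC would call `wake()` when its load crosses a threshold, keep proxying while the server boots, then hand over once it answers health checks.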
edit: actually even a router might be able to handle low loads
So yeah, it's still wasted power and computation in my opinion. "Ready and waiting" should not take 100W per server, but be closer to 0.1W (WoL), or lower if managing state from a central node. I guess it's not worth optimizing for most people, and big cloud probably does something similar already.
In a way, it's a bit like big.LITTLE with additional latency: small, power-efficient vs big, fast, inefficient for small loads.
Modern CPUs go to lower power states super quickly and draw almost nothing. The thing is, if the server is running many VMs, there's no way it's going to a low power state, even if some are doing nothing (others will be). You also have 10 jet engines blowing air at the front, which probably draw more power than the CPU does when both are idle.
Totally agreed that 100 W at idle is a waste, but I don’t think that’s what the parent was talking about. My read was that the parent comment was responding to capacity planning tending towards reducing the number of servers (without building out low-latency cold-boot infra) in the name of cost/energy savings, and that resulting in higher request latency. Anecdotally this seems plausible, but I don’t have metrics to back it up.
There seems to be surprisingly little interest in this (closest I found was https://github.com/kubernetes/kubernetes/issues/89271 ).