Hacker News | dosnem's comments

Anyone understand how this could work? My mental model for an LLM is predictive text, but here how can it understand that cell A1, which holds a string, is the “header” for all the values under it? How does it learn to understand table data like that?


LLMs already understand table data. "Predictive text" is somewhat true but so reductive that it leads to that kind of misconception.

HN is going to mangle this but here's a quick table:

| Type of Horse | Average Height | Typical Color   |
|---------------|----------------|-----------------|
| Arabian       | 15 hh          | Bay, Gray       |
| Thoroughbred  | 16 hh          | Chestnut, Bay   |
| Clydesdale    | 17.5 hh        | Bay with White  |
| Shetland Pony | 10.5 hh        | Black, Chestnut |

And after a prompt "pivot the table so rows are colors":

| Typical Color  | Type of Horse                     | Average Height        |
|----------------|-----------------------------------|-----------------------|
| Bay            | Arabian, Thoroughbred, Clydesdale | 15 hh, 16 hh, 17.5 hh |
| Gray           | Arabian                           | 15 hh                 |
| Chestnut       | Thoroughbred, Shetland Pony       | 16 hh, 10.5 hh        |
| Bay with White | Clydesdale                        | 17.5 hh               |
| Black          | Shetland Pony                     | 10.5 hh               |


> Anyone understand how this could work? My mental model for an LLM is predictive text, but here how can it understand that cell A1, which holds a string, is the “header” for all the values under it? How does it learn to understand table data like that?

I imagine it uses the new Agent Skills feature:

https://www.anthropic.com/news/skills


This doesn’t really matter. This type of error gets the whole "5 Whys" treatment, and every why needs to get fixed. Both problems will certainly have an action item.


It is not my claim that AWS is going to handle this badly, only that this thread is.


How does knowing this help you avoid these problems? It doesn’t seem to provide any guidance on what to do in the face of complex systems.


He's literally writing about Three Mile Island. He doesn't have anything to tell you about what concurrency primitives to use for your distributed DNS management system.

But: given finite resources, should you respond to this incident by auditing your DNS management systems (or all your systems) for race conditions? Or should you instead figure out how to make the Droplet Manager survive (in some degraded state) a partition from DynamoDB without entering congestive collapse? Is the right response an identification of the "most faulty components" and a project plan to improve them? Or is it closing the human expertise/process gap that prevented them from throttling DWFM for 4.5 hours?

Cook isn't telling you how to solve problems; he's asking you to change how you think about problems, so you don't rathole in obvious local extrema instead of being guided by the bigger picture.


It's entirely unclear to me whether a system the size and scope of AWS could be rethought using these principles, and whether AWS could successfully execute a complete restructuring of all its processes to reduce its failure rate a bit. It's a system that grew over time with many thousands of different developers, with a need to solve critical scaling issues that would have stopped the business in its tracks (far worse than this outage).


Another point is that DWFM is likely working in a privileged, isolated network because it needs access deep into the core control plane. After all, you don't want a rogue service to be able to add a malicious agent to a customer's VPC.

And since this network is privileged, observability tools, debugging support, and maybe even access to it are more complicated. Even just the set of engineers who have access is likely more limited, especially at 2AM.

Should AWS relax these controls to make recovery easier? But then it will also result in a less secure system. It's again a trade-off.


Both documents are "ceremonies for engineering personalities."

Even you can't help it - "enumerating a list of questions" is a very engineering thing to do.

Normal people don't talk or think like that. The way Cook is asking us to "think about problems" is kind of the opposite of what good leadership looks like. Thinking about thinking about problems is like, 200% wrong. On the contrary, be way more emotional and way simpler.


I don’t really follow what you are suggesting. If the system is complex and constantly evolving, as the article states, you aren’t going to be able to close any expertise/process gap. Operating in a degraded state is probably already built in; this was just a state of degradation they were not prepared for. You can’t figure out all the degraded states to operate in, because by definition the system is complex.


This seems so simple but I’m totally not understanding it...

If C = D^2, and you double the compute, then 2C ==> 2D^2. How do you and the original author get 1.41D from 2D^2?


If C ~ D^2, then D ~ sqrt(C).

In other words, the required amount of data scales with the square root of the compute. The square root of 2 ~= 1.414. If you double the compute, you need roughly 1.414 times more data.
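
A quick numeric sanity check in Python (the constant and compute values here are arbitrary, just to show that the ratio comes out to sqrt(2) regardless):

  import math

  k = 1.0                    # arbitrary proportionality constant in C = k * D^2
  C = 100.0                  # arbitrary compute budget
  D1 = math.sqrt(C / k)      # data implied by compute C
  D2 = math.sqrt(2 * C / k)  # data implied by doubled compute
  print(D2 / D1)             # 1.4142135623730951, i.e. sqrt(2), independent of k and C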


Thanks for the clarification!


> there’s a process of research and planning and perusing in careful steps, and I set the agent up for success

Are there any good articles you can share, or maybe your process? I’m really trying to get good at this, but I don’t find myself great at using agents and I honestly don’t know where to start. I’ve tried the memory bank in Cline and tried using more thinking directives, but I find I can’t get it to do complex things and it ends up being a time sink for me.



Providing context makes sense to me, but do you have any examples of providing context and then getting the AI to produce something complex? I am quite a proponent of AI, but even I find myself failing to produce significant results on complex problems, even when I have Cline + memory bank, etc. It ends up being a time sink of trying to get the AI to do something, only to have me eventually take over and do it myself.


Quite a few times, I've been able to give it enough context to write me an entire working piece of software in a single shot. I use that for plugins pretty often, eg this:

  llm -m openai/o3 \
    -f https://raw.githubusercontent.com/simonw/llm-hacker-news/refs/heads/main/llm_hacker_news.py \
    -f https://raw.githubusercontent.com/simonw/tools/refs/heads/main/github-issue-to-markdown.html \
    -s 'Write a new fragments plugin in Python that registers issue:org/repo/123 which fetches that issue
      number from the specified github repo and uses the same markdown logic as the HTML page to turn that into a fragment'
Which produced this: https://gist.github.com/simonw/249e16edffe6350f7265012bee9e3...


I had a series of prompts like “Using Manim, create an animation for formula X rearranging into formula Y, with a graph of values of the function”.

Beautiful one-shot results, and I now have really nice animations of some complex maths to help others understand. (I’ll put it up on YouTube soon.)

I don't know the Manim library at all, so this saved me about a week of work learning and implementing it.
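
A minimal sketch of the kind of scene such a prompt might produce (hypothetical formulas and class name, assuming the Manim Community edition API):

  from manim import Scene, MathTex, Write, TransformMatchingTex

  class RearrangeFormula(Scene):
      def construct(self):
          # made-up example: animate y = 2x + 4 rearranging into x = (y - 4) / 2
          start = MathTex("y", "=", "2x", "+", "4")
          end = MathTex("x", "=", "(y - 4)", "/", "2")
          self.play(Write(start))
          self.wait()
          self.play(TransformMatchingTex(start, end))
          self.wait()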


No way that makes sense. Email is for external conversations. Meetings are an hour long.


Sometimes you do it because if you just ask the question, you get ignored, but if you say hello and get a response, people are less likely to ignore the question that follows. That's the bigger reason than any rudeness reason, I would think.


The opposite is true as well. If someone pings me hello, I might not answer immediately because I don't know if what follows will be a big or small thing, but after having replied I feel obligated to continue answering. So my decision is to not answer until I'm less busy.

If they instead ask the question no-hello style, I can quickly gauge whether I can answer it immediately or should wait until a better time for me.

So the no-hello might get an immediate response, while the hello will wait until I can handle whatever it is.


And if they ask the question and I determine I don't have the bandwidth to context-switch or answer immediately, I can kindly reply and say so. They're more likely to get a reply faster by skipping the hanging, dangling hello. "If it's important, they'll leave a message."


This. The selfish point (there are other points too) of "hi" is to confirm you have their attention and to remove plausible deniability of "oops I missed your message."


Weird subthread.

> The selfish point (there are other points too) of "hi" is to confirm you have their attention

No one is unsure of the selfish/self-serving motivation behind the lone "hello". The singleminded self-centeredness at the expense of others is the _entire_ basis of the criticism.

This response is like encountering, in a thread about lunch theft in the workplace, "Some people take food that isn't theirs because they didn't bring anything for lunch, and they see food that someone else brought sitting there in the fridge." The power of this response to be able to explain something not already understood is nil, and so is its exculpatory power.

> to remove plausible deniability of "oops I missed your message."

I'll dispute this. The overwhelming purpose is so the sender can confirm they have the receiver's attention so the sender knows whether to bother themselves with typing out the rest of their inquiry. They're happy to trade the negative consequences on others for a minor convenience to themselves.


This is such a ridiculously cynical interpretation. I'm sure there are at least a few people out there who behave as you describe, but that is not normal. Greeting people before launching into a topic is a social norm. Even if you make a reasonable case that it is outdated in the context of instant messaging, that doesn't change the reality of it.

Someone doing something that you consider outdated or inefficient does not imply that he is malicious.


> This is such a ridiculously cynical interpretation.

no u

> Someone doing something that you consider outdated or inefficient does not imply that he is malicious.

The absence of malice does not erase the harmful effects.


You specifically attributed malice and I'm responding to that.

As to these supposed harmful effects. If you find the most basic of social pleasantries to be such an unmanageable burden then I'm likely better off not associating with you. Do you get angry at people who greet you as you walk by on the street? Navigating that interaction similarly demands some small part of your attention after all, however brief it might be.


> You specifically attributed malice

O RLY? Wanna give that one another shot, scooter?


It's the confirmation of attention (the response to "hello") that removes the deniability of "I missed your message." In case that wasn't clear.


> The power of this response to be able to explain something not already understood is nil


I agree with yuy. If it's not important enough to write out their inquiry, is it even necessary to inquire?


I would be okay with this if the conversation actually demanded a realtime response. But I can't know that until I see the actual first message, and they usually don't.


I’ve always envisioned TLA+ and other formal methods as specific to distributed systems and never needed to understand them. How is it used for a snake game? Also, how is the TLA+ spec determined from the code? Won’t it implicitly model incorrect bugs as correct behaviour, since it’s an existing state in the system? Also, when using TLA+ from the start, can it be applied to implementations? Or is it only for catching bugs during design? Therefore I’m assuming implementations still need to match the design exactly or else you would still get subtle bugs? Sorry for all the questions; I’ve never actually learned formal methods but have always been interested.


Here's how it caught my Snake bug: My snake representation is a vector of key points (head, turns, tail). A snake in a straight line, of length 3, facing right can look like this: [(0,0), (2,0)]. When a Snake moves (a single function called "step_forward"), the Snake representation is compressed by my code: if the last 2 points are the same, remove the last one. So if this snake changes direction to "left", then the new snake representation would be [(1, 1), (1, 1)] and compressed to [(1, 1)] before exiting step_forward.

Here's how the bug was caught: It should be impossible for the Snake representation to be < 2 points. So I told Opus to model the behavior of my snake, and also to write a TLA+ invariant that the snake length should never be under 2. TLA+ then basically simulates it and finds the exact sequence of steps ("turns") that causes that invariant to not hold. In this case it was quite trivial: I never thought to prevent a Snake from making turns that are not 90 degrees.
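
A rough sketch of the shape of that check in plain Python (my own names and toy data, not the actual project code): the representation is a list of key points, and the property being model-checked is simply that the list never shrinks below two entries.

  from typing import List, Tuple

  Point = Tuple[int, int]

  def compress(points: List[Point]) -> List[Point]:
      # compression rule described above: if the last two key points
      # coincide, drop the last one
      if len(points) >= 2 and points[-1] == points[-2]:
          return points[:-1]
      return points

  def snake_invariant(points: List[Point]) -> bool:
      # the TLA+ invariant: a snake is always described by at least
      # two key points (head and tail)
      return len(points) >= 2

  # A 180-degree turn can make the last two key points coincide, so
  # compression leaves a single point and the invariant fails.
  assert not snake_invariant(compress([(1, 1), (1, 1)]))

TLA+ does the part this snippet can't: it enumerates sequences of moves and produces the exact trace of turns that reaches the invariant-violating state.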


It's targeted at distributed systems, but it can be used to model any system over time. I've used it for distributed systems, but also for embedded systems with a peculiar piece of hardware that seemed to be (and, we found, was) misbehaving. I modeled the hardware and its spec in TLA+, then made changes to the behavior description to see if it broke any expected invariants (it did, in precisely the way we saw with the real hardware). The TLA+ model also helped me develop better reproducible test cases for that hardware compared to what we were doing before.


I'm not an expert, but my current understanding is that code execution is always a state transition to the next state. So what you do is fully specify each state and the relations between them. How the transition actually happens is your code, and it's not that important. What's important is that the relations do not conflict with each other. It's a supercharged type system.

> Also, how is the TLA+ spec determined from the code?

You start from the initial state, which is always known (or is fixed). Then you model the invariants for each line.

> Won’t it implicitly model incorrect bugs as correct behaviour since it’s an existing state in the system

Invariants will conflict with each other in this case.

> Also, when using TLA+ from the start, can it be applied to implementations?

Yes, by fully following the specs and handling possible incorrect states that may happen in practice. If your initial state in the TLA+ spec says that it only includes natural numbers between 1 and 5, you add assertions in your implementation (or throw exceptions) that check that, since the Int type in many type systems is not a full guarantee for that constraint. It's even more work when using a dynamic language.
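
For example, a minimal sketch of mirroring that kind of spec constraint at runtime (the function name is hypothetical; the 1..5 range is the constraint from the example above):

  def set_counter(value: int) -> int:
      # the TLA+ spec constrains this state variable to 1..5;
      # the int type alone doesn't guarantee that, so enforce it here
      if not (1 <= value <= 5):
          raise ValueError(f"counter must be in 1..5, got {value}")
      return value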

