Current frontier agents can solve every 2024 Advent of Code (AoC) puzzle in one shot: you just paste in the puzzle description and the input data.

Watching them work, I see the same loop each time: they read the spec, write the code, run it on the worked examples, and refine the code until it passes.
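To make that loop concrete, here is a minimal sketch of it in Python. This is my own illustration, not how any particular agent is implemented: `propose`, `failing_examples`, and `solve_with_refinement` are hypothetical names, and `fake_propose` is a stand-in for the model that simply returns a buggy solver first and a corrected one once it sees feedback.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical types: a "solver" maps puzzle input text to an answer string.
Solver = Callable[[str], str]
Example = Tuple[str, str]  # (example input, expected answer)


def failing_examples(solver: Solver, examples: List[Example]) -> List[str]:
    """Run the candidate on every worked example and describe each failure."""
    failures = []
    for example_input, expected in examples:
        try:
            got = solver(example_input)
        except Exception as exc:  # a crash counts as a failure too
            got = f"error: {exc!r}"
        if got != expected:
            failures.append(
                f"input={example_input!r} expected={expected!r} got={got!r}"
            )
    return failures


def solve_with_refinement(
    propose: Callable[[List[str]], Solver],
    examples: List[Example],
    max_attempts: int = 5,
) -> Optional[Solver]:
    """Ask for a candidate, test it on the examples, and feed failures back."""
    feedback: List[str] = []
    for _ in range(max_attempts):
        candidate = propose(feedback)
        feedback = failing_examples(candidate, examples)
        if not feedback:
            return candidate  # passes every worked example
    return None  # gave up after max_attempts


def fake_propose(feedback: List[str]) -> Solver:
    """Stand-in for the model: off by one at first, fixed after feedback."""
    if not feedback:
        return lambda text: str(sum(int(x) for x in text.split()) + 1)
    return lambda text: str(sum(int(x) for x in text.split()))


# Toy "puzzle": sum the numbers in the input.
solver = solve_with_refinement(fake_propose, [("1 2 3", "6")])
```

Note that the only stopping condition in this sketch is passing the worked examples, so a candidate that passes them can still be wrong on the real input.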

But we can’t tell whether the puzzle solutions are in the training data.

I’m looking forward to seeing how well current agents perform on 2025’s puzzles.