I'm not super familiar with ARM / ARM64 assembly and was confused as to how x0 was incremented.
Was going to ask here, but decided to not be lazy and just look it up.
const float f = *data++;
ldr s1, [x0], #4
Turns out this is post-index addressing: the instruction loads 4 bytes from the address in x0 into s1, then adds 4 to x0, all in one go.
It looks like the offset can be negative too (it's a signed immediate, -256 to 255), so you could iterate over something in reverse.
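For anyone who wants to poke at this themselves, here is a rough C sketch of walking an array in both directions. The function names are made up for illustration, and what the compiler actually emits depends on the compiler, the optimization flags, and whether it decides to vectorize the loop, so treat the instruction names in the comments as what it might produce, not a guarantee.

```c
#include <stddef.h>

/* Forward walk: each `*data++` reads the current float, then advances
 * the pointer by sizeof(float) (4 bytes). On AArch64 a compiler may
 * fold this into a post-index load such as `ldr s1, [x0], #4`. */
float sum_forward(const float *data, size_t n) {
    float total = 0.0f;
    while (n--) {
        total += *data++;
    }
    return total;
}

/* Reverse walk: start one past the end and step back before each read.
 * Note this is a pre-decrement (`*--end`), so it lines up with a
 * negative offset applied before the load (AArch64 also has a
 * pre-index writeback form, e.g. `ldr s1, [x0, #-4]!`). It also avoids
 * ever forming a pointer before the start of the array. */
float sum_reverse(const float *data, size_t n) {
    const float *end = data + n;
    float total = 0.0f;
    while (end != data) {
        total += *--end;
    }
    return total;
}
```

Compiling this for an arm64 target with optimizations on and reading the generated assembly is probably the quickest way to see which addressing forms your compiler picks.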
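Kind of cool. I don't think x86_64 has a general addressing mode like this; the closest I know of are the string instructions like lods, which load into rax/eax/al and bump rsi, but there's nothing that loads a float into an xmm register and increments the pointer in one go.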