Cache-friendly reads, user-hostile writes.
The destination is fast code. This blog is about the road there.
This blog isn’t really about fast code; it’s about how we got there.
SWAR tricks, bit-twiddling, branch-killing, cache-friendly layouts — the interesting part is rarely the final benchmark or the code itself. It’s the wrong turns, dead ends, small intuitions, and the “wait, why does THIS make it faster” moments along the way.