Day 6: Obliterating the Player Loop’s CPU Footprint

Today we went full-on code assassin on our player loop, because in this grimdark grinder every millisecond shaved is another soul crushed under your iron heel. Profiling showed our poor loop choking on a hefty 9 ms per frame, more than half of the 16.7 ms a frame gets at 60 fps. So we tore into it with ruthless efficiency and hunted down every unpredictable branch, cache miss, and allocation.

  • Branchless brutality: We refactored conditionals into lookup tables and bitwise ops, turning unpredictable branches into predictable math.
  • Data locality war: Arrays got re-ordered, structs packed tight, and we swapped out random memory hops for contiguous buffers that the CPU actually loves.
  • Loop unrolling & inlining: We manually unrolled hot loops and inlined tiny functions, erasing call overhead like a pitiless editor.
  • Job system tease: By offloading some per-player prep into lightweight jobs (sans heavy sync), we let the main thread breathe easier than a victory smoke.
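For the curious, here is a minimal C++ sketch of the first two tricks (lookup tables plus bitwise masks, and contiguous per-field buffers). It is an illustration under assumptions, not our shipping code: the Players layout, the four player states, the speed values, and the names update_players and apply_damage are all hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical player states (assumed for illustration only).
enum PlayerState : std::uint8_t { Idle = 0, Walking = 1, Sprinting = 2, Stunned = 3 };

// Lookup table standing in for an if/else chain on state.
// The speed values are made up for this sketch.
constexpr float kSpeedByState[4] = {0.0f, 2.0f, 5.0f, 0.0f};

// Structure-of-arrays layout: each field lives in its own contiguous
// buffer, so the hot loop streams through memory instead of hopping
// between scattered per-player objects.
struct Players {
    std::vector<float> x;             // position along one axis
    std::vector<std::uint8_t> state;  // PlayerState values
    std::vector<std::uint8_t> alive;  // 1 = alive, 0 = dead
};

// Branchless saturating subtraction: returns health - damage,
// or 0 when damage exceeds health. Unsigned wraparound is well
// defined in C++, and the mask zeroes out the wrapped result.
inline std::uint32_t apply_damage(std::uint32_t health, std::uint32_t damage) {
    const std::uint32_t diff = health - damage;
    // All ones when damage <= health, all zeros otherwise.
    const std::uint32_t mask = 0u - static_cast<std::uint32_t>(damage <= health);
    return diff & mask;
}

// Advances every player by dt seconds with no data-dependent branch
// in the loop body: speed comes from the table, and dead players are
// handled by multiplying by their 0/1 alive flag.
void update_players(Players& p, float dt) {
    const std::size_t n = p.x.size();
    assert(p.state.size() == n && p.alive.size() == n);
    for (std::size_t i = 0; i < n; ++i) {
        const float speed = kSpeedByState[p.state[i] & 3u];  // & 3 keeps the index in range
        p.x[i] += speed * dt * static_cast<float>(p.alive[i]);
    }
}
```

With dt = 0.5, a living Walking player moves 1.0 unit while dead or Stunned players stay put, and no if statement is needed in the loop. Whether this actually beats the branchy version depends on the compiler and the data, so measure before and after.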

Result? That beastly 9 ms loop now purrs at 1.5 ms, a 6x speedup that frees up 7.5 ms per frame for fancy particles, extra AIs, or anything else we want without tanking the frame rate.

Stay savage, keep slaying those bottlenecks, and remember: performance is the new black.

– Guts Glory Games