I've also recently seen poor performance on MSVC when "outside the loop" meant "a function argument", for a type that was essentially a struct wrapping a single primitive. It turned out to be ABI weirdness that was confusing MSVC, and forcing the function inline with __forceinline fixed it.
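
For anyone who wants to check their own code for this, here is a rough sketch of the shape I mean. The names (Scale, sum_scaled, sum_scaled_raw, FORCE_INLINE) are made up for illustration and are not from the real code, and I'm not claiming this exact snippet reproduces the slowdown; that will depend on the MSVC version and flags.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical portability macro: __forceinline is the MSVC keyword;
// GCC and Clang spell the equivalent request as an attribute.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#else
#  define FORCE_INLINE inline __attribute__((always_inline))
#endif

// Hypothetical wrapper type: a struct holding exactly one primitive.
struct Scale {
    float value;
};

// The wrapper is loop-invariant, but it reaches the loop as a function
// argument. Forcing the function inline removes the call boundary, so the
// calling convention for the struct no longer applies at this point.
FORCE_INLINE float sum_scaled(const float* data, std::size_t n, Scale s) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i] * s.value;
    }
    return total;
}

// Baseline that takes the bare primitive, for comparing generated code
// or timings against the wrapped version.
inline float sum_scaled_raw(const float* data, std::size_t n, float s) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i] * s;
    }
    return total;
}
```

Both functions compute the same thing, so any difference in timing or in the disassembly of the loop comes from how the argument is passed rather than from the arithmetic. Swapping FORCE_INLINE for plain inline on sum_scaled is the quickest way to see whether the annotation matters for your build.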