The problem is probably that the movs go through the address space instead of using specialized registers with optimized pipelining. Internally, though, many instructions might actually come down to conditional moves. I guess that happens after the microcode is decoded; and if I guessed wrong about that, register transfer logic still pretty much sounds like it's based on, well, transfers.
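To make the "it all comes down to conditional moves" idea a bit more concrete, here is a rough C sketch of a select done without any branch. The function names are made up for illustration and don't come from any real ISA or tool. Whether a compiler actually turns either version into a `cmov` isn't guaranteed. The second variant turns the condition into an address, which is roughly the trick mov-only code relies on, as far as I understand it.

```c
#include <stdint.h>

/* Hypothetical helper: returns a if cond is nonzero, otherwise b.
   No branch is taken; the condition is turned into a bit mask. */
static uint32_t select_u32(uint32_t cond, uint32_t a, uint32_t b)
{
    /* mask is all ones when cond != 0, all zeros otherwise */
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0);
    return (a & mask) | (b & ~mask);
}

/* Hypothetical helper: same result, but the condition is used as an
   index, so the "decision" is just a load from a computed address. */
static uint32_t select_by_lookup(uint32_t cond, uint32_t a, uint32_t b)
{
    uint32_t slots[2] = { b, a };   /* slot 0 = false case, slot 1 = true case */
    return slots[cond != 0];
}
```

The lookup version also hints at the cost mentioned above: every decision goes through memory rather than staying in registers.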