If the test failure is detected (so, you get super lucky) you should immediately reject the code, including the tests failing before some other fixups that didn't effect that test... Oftentimes it will take a long time to surface these, but I'm of the opinion that a broken build is show stopping until it's resolved - that doesn't mean 5-alarm all devs rush to the scene, but it does mean a free person picking it up as their next task or, barring that, bumping someone off of feature or other work to address the issue.
It may take a while... but while that flakey test exists in your codebase it will leverage a constant cost on all of your developers.
Which code? When you randomly run into the flakey test, in most cases it's not coming from the change which was just tested. You'd reject some random, unrelated PR
It may take a while... but while that flakey test exists in your codebase it will leverage a constant cost on all of your developers.