Okay, this is a serious question. For me, not an official RH position. In my time in HPC, nodes were baked with a specific image and then that basically never ever got updates. As I came to that as a sysadmin from other areas, I found that somewhat horrifying, but it seemed pretty universal. Have things changed such that applying patches regularly (like, more often than once a month or so except in emergencies) is a thing?
Not much, but in our setup the image is not something which can evolve or change over time. This practice has some very practical reasons though.
Scientific applications can be very picky about the libraries they use, down to the minor version, because the results they produce are very, very precise. Even when a result is not very accurate, you need to know how inaccurate it is. An optimization in a math library can change this, and that's not something we want. Also, program verification and certification generally cover the versions of the libraries used.
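To make the math library point concrete: floating-point addition is not associative, so an "optimization" that only reorders a sum can already change the last digits of a result. A minimal Python sketch (the numbers are purely illustrative, not taken from any real application):

```python
import math

# The same three numbers, summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# math.fsum tracks the rounding error and returns the correctly rounded sum.
compensated = math.fsum([0.1, 0.2, 0.3])

print(left_to_right == right_to_left)  # False
print(left_to_right, right_to_left, compensated)
```

The difference here is one unit in the last place, but in a long-running simulation such differences can accumulate, which is why a changed library version usually means re-validating the results.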
Piecewise upgrades are a no-go too. A cluster generally can't work well in a heterogeneous configuration (due to library mismatches), and draining a node is not a straightforward task (due to the length of the jobs). If your cluster has a steady stream of incoming jobs, reducing resources also means queue bloat, and recovering from it is sometimes not easy. If you want to drain the whole cluster, it takes 2-3 weeks, so you lose ~1 month of productivity. When you start an empty cluster to churn through its queues, saturating it takes time, so it doesn't go to 11 directly.
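A rough way to see why draining is so expensive: without preemption, a node is only empty once its longest remaining job ends, and the cluster is only drained once the slowest node is. The sketch below assumes made-up node names and remaining runtimes; it is not output from any real scheduler.

```python
from datetime import timedelta

# Hypothetical remaining runtimes of the jobs currently running on each node.
remaining = {
    "node001": [timedelta(days=2), timedelta(days=9)],
    "node002": [timedelta(days=16)],
    "node003": [timedelta(hours=6)],
}

def node_drain_time(jobs):
    """A node is empty only when its longest remaining job has finished."""
    return max(jobs, default=timedelta(0))

def cluster_drain_time(nodes):
    """The whole cluster is drained when the slowest node is drained."""
    return max((node_drain_time(j) for j in nodes.values()), default=timedelta(0))

def idle_node_time(nodes):
    """Node-time spent sitting empty while waiting for the slowest node."""
    total = cluster_drain_time(nodes)
    return sum((total - node_drain_time(j) for j in nodes.values()), timedelta(0))

print(cluster_drain_time(remaining))  # 16 days, 0:00:00
print(idle_node_time(remaining))      # 22 days, 18:00:00
```

With only three nodes, a single 16-day job already leaves almost 23 node-days idle, and that is before counting the time the cluster needs to fill up again afterwards.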
Also, worker nodes are highly isolated from the user's point of view. No users can log in, only known people submit jobs, etc. Unless there's a rogue academic trying to do nefarious things, the place is pretty safe and worry-free. In the past 15 years, we got two rootkit infections due to a server which is world-accessible by design. Other than that, nothing ever got infected.
At the end of the day, this approach has some valid reasons to be alive. It's not that we're a bunch of lazy academics who refrain from applying good system administration practices. :D
Addendum: The images generally get updated when new hardware is added, since new processors tend to work better with newer kernels. Also, sometimes we bite the bullet and update the whole cluster at once. xCAT helps a lot in this space. If your image is sane, you can install batches of 150+ servers in 15 minutes while sipping your coffee.
We will certainly try. Need to mirror a repo, freeze it and update our installation infra so it looks to the local repo rather than the national mirror.
All repo settings will point to the local repo, so we'd have no dependency problems or version creep if we need to install an additional package.
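As a sketch of what "pointing at the local repo" could look like, here is a small Python helper that renders a yum/dnf `.repo` stanza. The repo id, the description and the mirror URL (including the snapshot date in the path) are hypothetical placeholders, not our actual setup.

```python
import configparser
import io

def render_local_repo(repo_id, name, baseurl):
    """Render a yum/dnf .repo stanza pointing at a frozen local mirror."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.optionxform = str  # keep option names exactly as written
    cfg[repo_id] = {
        "name": name,
        "baseurl": baseurl,  # hypothetical local snapshot URL
        "enabled": "1",
        "gpgcheck": "1",
    }
    out = io.StringIO()
    cfg.write(out, space_around_delimiters=False)
    return out.getvalue()

repo_text = render_local_repo(
    "local-frozen-baseos",
    "Frozen local BaseOS snapshot",
    "http://mirror.cluster.example/frozen/2021-01/BaseOS/x86_64/os/",
)
print(repo_text)
```

The resulting text would go into a file under /etc/yum.repos.d/ in the image, with the upstream repo files disabled. Putting the snapshot date in the URL is one way to keep every node on the same frozen package set.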
Haven't completely thought through how to handle the occasional emergency update though.
Also, we need to compile in some packages. Hope they won't break. High performance stuff needs optimized/customized compilations.
I just want to add: Hope that the packages in CentOS Stream won't end up too cutting edge for the scientific software community. These communities move slowly due to stability requirements. We'll certainly see, but it might be another potential problem.
I can totally reassure you on your last concern: everything that goes into Stream is approved for a minor release in RHEL. That's not changing at all. Cutting edge is still Fedora's turf. :)
To be clear, I'm RHEL and CentOS _adjacent_, rather than actively _in_ them. But I think, rough launch and more than a few communication issues aside, this is generally gonna be positive.
I think that's because HPC users are largely non-technical developers. We changed a DHCP schema at one point and had a bunch of angry academics in the IT office because their MATLAB scripts were broken. Many of them had been hard-coding IP addresses into the code itself.
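For what it's worth, the fix on the script side is small: look the address up by name at run time instead of baking it in. A minimal sketch in Python rather than MATLAB, with "localhost" standing in for whatever service name the scripts actually needed and a made-up address in the comment:

```python
import socket

# Fragile: breaks as soon as the DHCP scheme or subnet changes.
# LICENSE_SERVER = "10.0.3.17"   # hypothetical hard-coded address

def resolve(hostname):
    """Look the address up at run time so renumbering does not break the script."""
    return socket.gethostbyname(hostname)

# "localhost" is a placeholder for a real service name.
address = resolve("localhost")
print(address)
```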