The lesson here seems to be: don't depend on tools written in languages with complex, obscure build systems that no one is able or interested to read. Using tools rewritten in Rust, Go, or any other language that resolves dependencies within the project seems the only way to do hardening here.
I agree there's safer languages than C, but nobody reads the 50,000 lines changed when you update the vendoring in a random golang project. It would be easy to introduce something there that nobody notices too.
It is generally harder to introduce vulnerabilities in a readable language, even more so when it is memory safe. Sure, life is not perfect and bad actors would have found ways to inject vulnerabilities into a Rust or Go codebase too. The benefit of modern languages is that there is one way to build things, and the source code is the only thing that needs to be audited.
You don't need a complex, obscure build system for most C code. There's a lot of historical baggage here, but many projects (including xz, I suspect) can get away with a fairly straightforward Makefile. Doubly so when using some GNU make extensions.
Thanks for that post. I wish people would stop pushing ever more complicated build systems, opaque and non-backward-compatible between their own versions, when a two-page Makefile would work just fine and still work in 20 years' time.
Rust is the worst in terms of build system transparency. Ever heard of build.rs? You can hide backdoors in any crate, or in any crate's build.rs, or the same recursively.
Most build systems are turing-complete. Rust, at least, drastically reduces the need for custom build scripts (most of my projects have empty build.rs files or lack one entirely), and build.rs being in the same language as the rest of the codebase aids transparency immensely.
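To illustrate why build.rs aids transparency: it's just an ordinary Rust program that Cargo compiles and runs before the crate itself, and lines it prints with a `cargo:` prefix become build instructions. A minimal, hypothetical sketch (the directive values here are made up for illustration):

```rust
// A typical build.rs: plain Rust, auditable with the same skills as the
// rest of the codebase. Cargo runs it before compiling the crate.
fn directives(profile: &str) -> Vec<String> {
    vec![
        // Ask Cargo to re-run this script only when it changes.
        "cargo:rerun-if-changed=build.rs".to_string(),
        // Expose a compile-time env var the crate can read via env!().
        format!("cargo:rustc-env=BUILD_PROFILE={}", profile),
    ]
}

fn main() {
    // In a real build.rs, Cargo parses these printed lines as instructions.
    for d in directives("release") {
        println!("{}", d);
    }
}
```

Anything beyond emitting directives like these (spawning processes, touching the network) is visible as ordinary Rust code in review, which is the transparency point being made.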
That doesn't make build.rs any less of a juicy target for a supply chain attack.
Arbitrary code downloaded from the internet and run at build time? That's a nightmare scenario for auditing, much worse than anything Autotools or CMake can offer.
You're not wrong about arbitrary code execution. It's just that your statement applies to most of the packages on any Linux distribution, Autotools and CMake included, regardless of language. Many more so than Rust, thanks to the aforementioned features of Cargo, with build.rs not requiring me to be an expert in a second language just to audit it.
Packages in a Linux distro are not built on my machine, they are built by the distro in a sandbox. Every time I type "cargo build" I am potentially running arbitrary code downloaded from the internet. Every time I type "make" in an Autotools program only my code runs.
> not requiring me to be an expert in another language just to audit it.
Do you do that every time your Cargo.lock changes?
> Every time I type "make" in an Autotools program only my code runs.
Says who? Make is just as good at calling arbitrary code as Cargo, including code that reaches out over the network. Have you audited every single Makefile to ensure that isn't the case?
So... you're complaining about what could happen in a Rust build if you include a library without examining that library first? How do you think that is different from doing the same in any other language?
The difference is that in another language the build step is delegated to someone else who has packaged the code, and every version has presumably gone through some kind of audit. With Rust I have no idea what new transitive dependencies could be included any time I update one of my dependencies, and what code could be triggered just by building my program without even running it.
Again, we're not talking about the dependencies that I choose, but the whole transitive closure of dependencies, including the most low-level. Did you examine serde the first time you used a dependency that used it? serde did have in the past a slightly sketchy case of using a pre-built binary. Or the whole dependency tree of Bevy?
I mean, Rust has many advantages, but the Cargo supply chain story is an absolute disaster. Not that it's alone: PyPI, npm, and Ruby gems are the same.
> Nothing prevents you from using the packaged libraries if you prefer them
Nothing except, in no particular order:
1) only having one version of crates
2) mismatched features
3) new transitive dependencies that can be introduced at any time without any warning
4) only supporting one version of Rust
5) packages being noarch and basically glorified distro-wide vendoring, so their build.rs code is still run on your machine at cargo build time
Same as any other library provided by the distribution in any other language.
> 2) mismatched features
Same as any other library provided by the distribution in any other language.
> 3) new transitive dependencies that can be introduced at any time without any warning
Not in packaged Rust libraries in Fedora, at least. Please read the aforementioned link.
> 4) only supporting one version of rust
Same as any other library provided by the distribution in any other language.
> 5) packages being noarch and basically glorified distro-wide vendoring
Packages containing only source are a consequence of the Rust ABI still stabilizing, see: https://github.com/rust-lang/rust/pull/105586 After ABI stabilization, Rust libraries will be first-class like in any other language.
Exactly. And at least Cargo will refuse to download a crate which has been yanked. So any crate which has been discovered to be compromised can be yanked, preventing further damage even when someone has already downloaded something which depends on it.
Building packages with up-to-date dependencies is also vastly preferable to building against ancient copies of libraries vendored into a codebase at some point in the past, a situation I see far too often in C/C++ codebases.
Wouldn't a supply chain attack like this be much worse with Rust and Cargo, given that it's not just a single dynamic library that needs to be reinstalled system-wide; instead, every binary would require a new release?
It would mean rebuilding more packages. I don't think that's meaningfully "much worse"; package managers are perfectly capable of rebuilding the world, and the end-user fix is the same "pacman -Syu"/"apt-get update && apt-get upgrade"/...
On the flip side the elegant/readable build system means that the place this exploit was hidden wouldn't exist. Though I wouldn't confidently say that 'no hiding places exist' (especially with the parts of the ecosystem that wrap dependencies in other languages).
It's much worse because it requires repackaging every affected system package instead of a single library. Knowing which packages are affected is difficult because that information isn't exposed to the larger system package manager. After all, it's all managed by the build system.
Those CI and build infrastructures rely on Debian and Red Hat being able to build system packages.
How would an automated CI or build infrastructure stop this attack? It was stopped because the competent package maintainer noticed a performance regression.
In this case, this imagined build system would have to track every rust library used in every package to know which packages to perform an emergency release for.
I... don't see your point. Tracking the dependencies a static binary is built with is already a feature for build systems, just maybe not the ones Debian and RH are using now, but I imagine they would if they were shipping static binaries.
Rust isn't really the point here; it's the age-old static vs. dynamic linking argument. Rust (or rather, Cargo) already tracks which version of a dependency a library depends on (or a pattern to resolve one), but that's beside the point.
Rust is the issue here because it doesn't give you much of an option. And that option is the wrong one if you need to do an emergency upgrade of a particular library system-wide.
It's really not, it's not hard to do a reverse search of [broken lib] <= depends on <= [rust application] and then rebuild everything that matches. You might have to rebuild more, but that's not really hard with modern build infrastructure.
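As a sketch of that reverse search, assuming a hypothetical map from each package to its direct dependencies (real build infrastructure would pull this from Cargo.lock files or package metadata):

```rust
use std::collections::{HashMap, HashSet};

// Collect every package that transitively depends on `broken`,
// i.e. everything that needs a rebuild after the fix. Iterates to a
// fixed point so indirect dependents are caught too.
fn needs_rebuild(deps: &HashMap<&str, Vec<&str>>, broken: &str) -> HashSet<String> {
    let mut out: HashSet<String> = HashSet::new();
    let mut changed = true;
    while changed {
        changed = false;
        for (pkg, ds) in deps {
            if out.contains(*pkg) {
                continue;
            }
            // Depends on the broken lib directly, or on something already marked.
            if ds.iter().any(|d| *d == broken || out.contains(*d)) {
                out.insert(pkg.to_string());
                changed = true;
            }
        }
    }
    out
}

fn main() {
    // Hypothetical package names for illustration.
    let mut deps: HashMap<&str, Vec<&str>> = HashMap::new();
    deps.insert("app-a", vec!["libfoo"]);
    deps.insert("app-b", vec!["libbar"]);
    deps.insert("libbar", vec!["libfoo"]);
    let hit = needs_rebuild(&deps, "libfoo");
    assert!(hit.contains("app-a") && hit.contains("app-b") && hit.contains("libbar"));
}
```

The point is only that the query is mechanical once the dependency data exists; distro tooling already answers "what depends on X" for dynamic libraries.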
Not to mention if you have a Rust application that depends on C libraries, it already dynamically links on most platforms. You only need to rebuild if a Rust crate needs to be updated.
So, I know that librustxz has been compromised. I'm Debian. I must dive into each rust binary I distribute as part of my system and inspect their Cargo.toml files. Then what? Do I fork each one, bump the version, hope it doesn't break everything, and then push an emergency release!??!
> I must dive into each rust binary I distribute as part of my system and inspect their Cargo.toml
A few things:
1. It'd be Cargo.lock
2. Debian, in particular, processes Cargo's output here and makes individual debs, so this information is already exposed via their regular package-manager tooling.
3. You wouldn't dive into and look through these by hand, you'd have it as a first-class concept. "Which packages use this package" should be table stakes for a package manager.
> Then what? Do I fork each one, bump the version, hope it doesn't break everything, and then push an emergency release!??!
The exact same thing you do in this current situation? It depends on what the issue is. Cargo isn't magic.
The point is just that "which libraries does the binary depend on" isn't a problem with actual tooling.
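As a toy illustration of how mechanical that lookup is, here's a hypothetical line-based scan of a Cargo.lock for a compromised crate name (real tooling such as `cargo tree` parses the TOML properly; the crate name below is just an example):

```rust
// Check whether a Cargo.lock mentions a given package. Cargo.lock lists
// every transitive dependency with exact versions, so a simple scan of
// its `name = "..."` lines is enough for a yes/no answer.
fn lock_contains(lock: &str, name: &str) -> bool {
    let needle = format!("name = \"{}\"", name);
    lock.lines().any(|l| l.trim() == needle)
}

fn main() {
    // Hypothetical lockfile fragment.
    let lock = r#"
[[package]]
name = "liblzma-sys"
version = "0.3.2"
"#;
    assert!(lock_contains(lock, "liblzma-sys"));
    assert!(!lock_contains(lock, "serde"));
}
```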
People already run tools like cargo-vet in CI to catch versions of packages that may have issues they care about.
Except it is. The system package maintainers release a new build of the package in question and then you install it. There's not really anything else to do here. There's nothing special about Rust in this context, it would be exactly the same scenario on, for example, Musl libc based distros with any C application.
Fundamentally there is no difference. In practice, Rust makes things a lot worse. It encourages the use of dependencies from random (i.e. published with cargo) sources without much quality control. It is a supply chain disaster waiting to happen. A problem like this would propagate much faster: here the threat actor had to work hard to get his library updated in distributions, and at each step there was a chance that this would be detected. Now think about a Rust package automatically pulling in hundreds of crates transitively. Sure, a distribution can later figure out what was affected and push upgrades to all the packages. But fundamentally, we should minimize dependencies and we should have quality control at each level (and ideally we should not run code at build time). Cargo goes in the full opposite direction. Rust got this wrong.
Whether a hypothetical alternate world in which Rust didn't have a package manager or didn't make sharing code easy would be better or worse than the world we live in isn't an interesting question, because in that world nobody would use Rust to begin with. Developers have expected to be able to share code with package managers ever since Perl 5 and CPAN took off. Like it or not, supply chain attacks are things we have to confront and take steps to solve. Telling developers to avoid dependencies just isn't realistic.
> It encourages the use of dependencies from random (i.e. published with cargo) sources without much quality control. It is a supply chain disaster waiting to happen.
Oh I 100% agree with this, but that's not what was being talked about. That being said, I don't think the distribution model is perfect either: it just has a different set of tradeoffs. Not all software has the same risk profile; not all software is a security boundary between a system and the internet. I 100% agree that the sheer number of crates the average Rust program pulls in is... not good, but it's also not the only language/platform that does this (npm, PyPI, pick-your-favorite-text-editor, etc.), so singling out Rust in that context doesn't make sense either; it only makes sense when comparing it to the C/C++ "ecosystem".
I'm also somewhat surprised that the conclusion people come to here is that dynamic linking is a solution to the problem at hand, or even a strong source of mitigation: it's really, really not. The ability to swap out, at almost any time, which version of a dependency something is running is what allowed this exploit to happen in the first place. The fact that there was dynamic linking at all dramatically increased the blast radius of what was affected by this, not decreased it. It only provides a benefit once the attack is discovered, and that benefit is mostly that fewer packages need to be rebuilt and updated by distro maintainers and users. Ultimately, supply-chain security is an incredibly tough problem that is far more nuanced than valueless "dynamic linking is better than static linking" statements can even come close to communicating.
> A problem like this would propagate much faster. Here the threat actor had to work hard to get his library updated in distributions and at each step there was a chance that this is detected.
It wouldn't though, because programs would have had to be rebuilt with the backdoored versions. The bookkeeping would be harder, but the blast radius would probably have been smaller with static linking, except where the package is meticulously maintained by someone who bumps their dependencies constantly, or where the exploit goes unnoticed for a long period of time. That's trouble no matter what.
> Now think about a Rust package automatically pulling in transitively 100s of crates.
Yup, but it only happens at build time. The blast radius has different time-domain properties than with shared libraries. See above. 100s of crates is ridiculous, and IMO the community could (and should) do a lot more to establish which crates are maintained appropriately and are actually being monitored.
> Sure, a distribution can later figure out what was affected and push upgrades to all the packages.
This is trivial to do with build system automation and a small modicum of effort. It's also what already happens, no?
> But fundamentally, we should minimize dependencies and we should have quality control at each level
Agreed, and the Rust ecosystem has its own tooling for quality control. Just because it's not maintained by the distro maintainers doesn't mean it's not there. There is a lot of room for improvement, though.
> (and ideally we should not run code at build time). Cargo goes into the full opposite direction. Rust got this wrong.
Hard, hard, hard disagree. Nearly every language requires executing arbitrary code at compile time; yes, even a good chunk of C/C++. A strong and consistent build system is a positive in this regard: it would be much harder to obfuscate an attack like this in a Rust build.rs because there aren't multiple stages of abstraction with an arbitrary number of ways to do it. As it stands, part of the reason the xz exploit was even possible was the disaster that is autotools. I would argue the Rust build story is significantly better than the average C/C++ build story. Look at all the comments here describing the "autotools gunk" that was used to obfuscate what was actually going on. Sure, you could do something similar in Rust, but it would look weird, not "huh, I don't understand this, but that's autotools for ya, eh?"
To be clear, I agree with you that the state of Rust and its packaging is not ideal, but I don't think it necessarily made wrong decisions; it's just immature as a platform, which is something that can and will be addressed.
I am not completely sure about this exploit, but it seems a binary needed to be modified for the exploit to work[1], which was later picked up by the build system.
This seems to be an orthogonal issue. Rust could build the same dynamic library with Cargo, which could then be distributed. The difference is that there would be a single way to build things.
Most Rust libraries are not dynamically linked; instead, versions are pinned and included statically during the build process. This is touted as a feature.
Only a few projects are built as system-wide libraries that expose a C-compatible ABI; rsvg comes to mind.
Once somebody actually does this people are gonna complain the same as always: "The sole purpose of your project is to rewrite perfectly fine stuff in Rust for the sake of it" or something along these lines.
Is this really the lesson here? We are talking about a maintainer, who had the signing keys and full access to the repository. The deb packages that were distributed also differ from the source code. Do you honestly believe that the (arguably awful) autotools syntax is the single root cause of this mess, that Rust will save us from everything, and that this is what we should take away from this situation?
The fundamental problem here was a violation of chain of trust.
Open source is only about the source being open. But if users are just downloading blobs with prebuilt binaries, or even _pre-generated scripts_ that aren't in the original source, there is nothing a less obscure build system will save you from, as you are staking your entire security on the chain of trust being maintained.