Minimal Versions

// #Rust #Devprod

A library’s dependency constraints are part of its interface with consumers. If a consumer wants to depend on the library’s code, they have to supply the dependencies within the given constraints. To make life easy on consumers, the library maintainer should choose constraints that are as broad as possible (low minimum, high maximum) while still capturing any real requirements.

Cargo Policy

Rust packages are published with semantic versioning, where each version has a MAJOR.MINOR.PATCH format.

  • MAJOR // Maintainers increment for incompatible API changes.
  • MINOR // Maintainers increment for backward-compatible functionality.
  • PATCH // Maintainers increment for backward-compatible bug fixes.
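To make the three levels concrete, here is a minimal sketch that classifies the step between two releases by the first component that changed. The `Version` alias, the `Bump` enum, and `classify_bump` are names made up for this post, not anything from Cargo, and pre-release tags are ignored.

```rust
/// (major, minor, patch); pre-release and build metadata are ignored here.
type Version = (u64, u64, u64);

/// Which component a release incremented (hypothetical helper type).
#[derive(Debug, PartialEq, Eq)]
enum Bump {
    Major,
    Minor,
    Patch,
    Unchanged,
}

/// Classify the step from `old` to `new` by the first component that differs.
fn classify_bump(old: Version, new: Version) -> Bump {
    if new.0 != old.0 {
        Bump::Major
    } else if new.1 != old.1 {
        Bump::Minor
    } else if new.2 != old.2 {
        Bump::Patch
    } else {
        Bump::Unchanged
    }
}
```

This only captures what the version number promises. Whether the release actually keeps that promise is a separate question.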

The catch here is that maintainers are human and make mistakes. It is very possible that a patch or minor bump contains an incompatible change by accident. Gotta keep that in mind when things get complex.

When a maintainer declares a dependency for their library in the Cargo.toml manifest, they generally choose a version like some_crate = "1.2.3". This is equivalent to writing ^1.2.3 (that little caret) which means “use any version greater than or equal to 1.2.3, but less than 2.0.0”. There is a little catch here: if the major version is 0, then only patches are upgraded (^0.2.3 means at least 0.2.3, but less than 0.3.0). Kinda goofy, but the idea is that a 0 version is unstable…but what isn’t?
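The caret rule is easier to see as code. Below is a small sketch of it for full MAJOR.MINOR.PATCH triples, ignoring pre-release tags and partial versions like ^1.2. The function names are my own for illustration; in a real project the semver crate is the thing to reach for.

```rust
/// (major, minor, patch); tuples compare lexicographically, which matches
/// SemVer precedence when pre-release tags are ignored.
type Version = (u64, u64, u64);

/// Exclusive upper bound implied by a caret requirement: bump the left-most
/// non-zero component and zero everything to its right.
fn caret_upper_bound(req: Version) -> Version {
    match req {
        (0, 0, patch) => (0, 0, patch + 1),
        (0, minor, _) => (0, minor + 1, 0),
        (major, _, _) => (major + 1, 0, 0),
    }
}

/// True if `candidate` satisfies `^req`, i.e. req <= candidate < upper bound.
fn caret_matches(req: Version, candidate: Version) -> bool {
    req <= candidate && candidate < caret_upper_bound(req)
}
```

So `caret_matches((1, 2, 3), (1, 9, 0))` holds, while `caret_matches((0, 2, 3), (0, 3, 0))` does not, because a 0.x minor bump counts as breaking.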

Anyways, a maintainer could choose to be more restrictive with an exact version =1.2.3, or only update patches ~1.2.3, or declare a range >=1.2.3, <1.5.0. This generally makes life easier for the maintainer, but the burden ends up on the consumer to deal with the narrower constraints.

That default behavior, a bare version being treated as ^, is a peek at the general cargo policy for updating dependencies. It tries to balance automatically getting bug fixes and improvements while avoiding breaking changes. “Hey maintainer, tell me the bare minimum you require for functionality, but I’ll roll in some fixes for you.” Cargo treats different major versions of a crate as separate dependencies which can coexist in the same build. This makes life easier in the short term, but does expose users to runtime issues if data is passed between those crates.
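One flavor of that runtime issue is duplicated global state. The sketch below fakes it with two modules standing in for two major versions of one made-up crate (`registry_v1` and `registry_v2` are hypothetical names, not real packages). Each copy has its own static, so something registered through one copy is invisible to the other.

```rust
// Two modules stand in for two major versions of one hypothetical crate
// that ended up side by side in the same build. Each copy owns its own state.
mod registry_v1 {
    use std::sync::atomic::{AtomicUsize, Ordering};

    static HANDLERS: AtomicUsize = AtomicUsize::new(0);

    pub fn register() {
        HANDLERS.fetch_add(1, Ordering::SeqCst);
    }

    pub fn count() -> usize {
        HANDLERS.load(Ordering::SeqCst)
    }
}

mod registry_v2 {
    use std::sync::atomic::{AtomicUsize, Ordering};

    static HANDLERS: AtomicUsize = AtomicUsize::new(0);

    #[allow(dead_code)]
    pub fn register() {
        HANDLERS.fetch_add(1, Ordering::SeqCst);
    }

    pub fn count() -> usize {
        HANDLERS.load(Ordering::SeqCst)
    }
}

/// The application registers through v1 while some dependency reads from v2.
/// Returns the handler count as seen by each copy.
fn handlers_seen_by_each_copy() -> (usize, usize) {
    registry_v1::register();
    (registry_v1::count(), registry_v2::count())
}
```

Everything compiles and nothing warns, yet the v2 side never sees the handler. Passing actual types between the two copies would usually be caught by the compiler instead, since they are distinct types.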

A big downside of this default, auto-update policy is that over time a library might end up accidentally depending on a newer version of a dependency than it declares. Kinda following the “any public part of an API will be depended upon” principle (TIL this is called Hyrum’s Law). This means that its dependency constraints are no longer valid. A consumer could resolve dependencies to something in the declared window and then run into a build or runtime error.

So in order to expose a high quality, easy to use interface for consumers, a library maintainer needs to fight against the current of cargo and test that their minimum dependencies are still valid.

Cargo-minimal.lock

rust-bitcoin has an interesting approach to this which sparked most of my thoughts here. They don’t check in a Cargo.lock and instead check in two separate Cargo-minimal.lock and Cargo-recent.lock files. The CI workflow runs tests twice, once where it copies Cargo-minimal.lock to Cargo.lock and once with Cargo-recent.lock. The minimal is supposed to be the “floor” dependencies and recent is the v2 maximum resolution. That strategy makes sense to me. The part I struggled with was how exactly the lockfiles are updated.

The CONTRIBUTING.md asks contributors to modify the manifest Cargo.toml, run the simple script below, and commit the updates. At first glance, I didn’t think this would change either lockfile.

for file in Cargo-minimal.lock Cargo-recent.lock; do
    cp -f "$file" Cargo.lock
    cargo check
    cp -f Cargo.lock "$file"
done

Lockfile update script.

Turns out, cargo check does update a lockfile if necessary! But not as aggressively as a straight up cargo update command. It is a conservative, focused update, only changing the parts of a lockfile which it has to in order to fit the constraints. This includes any transitive updates. Importantly, it only updates to the required version, not the highest available like the usual cargo update policy. So theoretically, if you initialize a Cargo-minimal.lock with cargo +nightly update -Z minimal-versions, you can then naturally inch it forward as constraints change. I’ll get to that magic cargo flag which finds the minimal dependency set of a manifest in a second. But one thought I have is why not just re-calculate the minimum set for every commit?

I think there are two viable options which strike a good balance.

  1. The current approach, check in a file which only needs to be touched on changes.
  2. Don’t check in a file at all, calculate a minimal set in the CI workflow.

A third option could be to check in a file which needs to be updated on every commit, but that is too much overhead.

Checking in a file leaves a nice paper trail. But I am curious if there are tricky scenarios when downgrading or removing a dependency. cargo check follows these rules.

  1. It adds new dependencies to the lockfile if they’re required by the manifest but not in the lockfile
  2. It removes dependencies from the lockfile if they’re no longer referenced in the manifest
  3. It updates dependencies to new versions if required by manifest changes
  4. It preserves existing dependency versions when possible (if they still satisfy the requirements)

So, #4, that doesn’t sound like what we want for this scenario. If we upgrade a dependency, and then roll it back for some reason, the version “floor” could be left in the raised state. We would no longer be testing the actual minimums.
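Here is a toy model of that drift, tracking a single lockfile entry. It is a sketch of the rules as described in this post, not of Cargo’s real resolver: it keeps the locked version whenever it still satisfies the manifest (rule 4) and otherwise moves to the lowest available version that does (my reading of rule 3, which is an assumption). Upper bounds are ignored, and `relock` and `replay` are made-up names.

```rust
/// (major, minor, patch), compared lexicographically.
type Version = (u64, u64, u64);

/// Update one lockfile entry after the manifest minimum changes.
/// Rule 4: keep the locked version if it still satisfies `min`.
/// Rule 3 (as assumed here): otherwise take the lowest available version >= `min`.
fn relock(locked: Option<Version>, min: Version, available: &[Version]) -> Option<Version> {
    match locked {
        Some(v) if v >= min => Some(v),
        _ => available.iter().copied().filter(|v| *v >= min).min(),
    }
}

/// Replay a history of manifest minimums against one lockfile entry,
/// starting from an empty lockfile.
fn replay(minimums: &[Version], available: &[Version]) -> Option<Version> {
    minimums
        .iter()
        .fold(None, |locked, &min| relock(locked, min, available))
}
```

With 1.0.0, 1.2.0, and 1.5.0 available, replaying the minimums 1.0.0, then 1.2.0, then back to 1.0.0 leaves the entry at 1.2.0. The manifest says 1.0.0 again, but 1.0.0 is no longer what gets tested.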

minimal-versions vs. direct-minimal-versions

Both the minimal-versions and direct-minimal-versions unstable flags are sitting on the nightly channel (hidden behind -Z). minimal-versions was added in early 2018 and direct-minimal-versions was added around November 2022. Since these are nightly-only, neither has ever been stabilized into a Rust release. Minimal version finding has proved to be useful, but it might not be as easy as it sounds, given that the docs for direct-minimal-versions say not to use minimal-versions. So how are they different?

[package]
name = "qux"
version = "0.1.0"

[dependencies]
foo = "1.0.0"
bar = "1.0.0"

Theoretical Cargo.toml manifest.

foo depends on baz = 1.0.0 while bar depends on baz = 1.2.0. With normal (let’s say v2 resolver) resolution, the latest available version of baz with major version 1 is used, say 1.5.0.

If the minimal-versions flag is used, then foo and bar are both pegged to 1.0.0. The shared transitive dependency, baz, is pegged to the lowest version which still satisfies all of the constraints. So in this case, 1.2.0 due to bar’s constraint.

If direct-minimal-versions is used instead, the same applies to foo and bar, they are at 1.0.0. But now baz is back to the general policy and gets its fresh 1.5.0 version.
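The three outcomes can be sketched as one small selection function. This is a toy model under a few assumptions: every requirement is a caret requirement with a non-zero major version, the list of published baz versions is invented, and the `Policy` enum and `pick` function are names I made up. The real resolver does far more than this.

```rust
/// (major, minor, patch), compared lexicographically.
type Version = (u64, u64, u64);

#[derive(Clone, Copy)]
enum Policy {
    Default,
    MinimalVersions,
    DirectMinimalVersions,
}

/// Caret check, simplified to requirements with a non-zero major version.
fn caret_matches(req: Version, v: Version) -> bool {
    v >= req && v.0 == req.0
}

/// Pick one version that satisfies every requirement placed on a dependency.
/// `is_direct` says whether the crate being resolved declares it directly.
fn pick(
    policy: Policy,
    is_direct: bool,
    reqs: &[Version],
    available: &[Version],
) -> Option<Version> {
    let candidates = available
        .iter()
        .copied()
        .filter(|v| reqs.iter().all(|r| caret_matches(*r, *v)));

    // Decide whether this policy wants the floor or the ceiling.
    let want_lowest = match policy {
        Policy::Default => false,
        Policy::MinimalVersions => true,
        Policy::DirectMinimalVersions => is_direct,
    };

    if want_lowest {
        candidates.min()
    } else {
        candidates.max()
    }
}
```

For baz, with requirements 1.0.0 (from foo) and 1.2.0 (from bar) and published versions 1.0.0, 1.2.0, 1.3.0, and 1.5.0, the default policy picks 1.5.0, minimal-versions picks 1.2.0, and direct-minimal-versions picks 1.5.0 because baz is not a direct dependency of qux.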

Why introduce this less strict policy? Well, lots of libraries don’t test their minimal versions and have fallen into the easy trap of actually depending on higher versions than they declare. In the simple example above, maybe bar actually needs baz = 1.3.0. So a maintainer would have to go and fix up some of their dependencies before they could effectively test their minimal versions. direct-minimal-versions is the “hey, I hope bar gets their shit together, but I don’t have time for that right now” option.

The tradeoff is that fewer scenarios are tested. But since Rust doesn’t allow a library to implicitly depend on things from its transitive dependencies, I think most of these scenarios are the responsibility of the dependencies. So minimal-versions can validate your library’s whole dependency tree, but some failures might be upstream (in different crates). direct-minimal-versions covers less, but any break would be in the maintainer’s domain to fix.

So it seems like it would be best to use minimal-versions if you can and fall back to direct-minimal-versions if you must.

Workspaces

If developing a group of libraries, it might make sense to put them in a workspace. A workspace, however, shares a single lockfile. This can be a helpful feature to ensure all apps and libraries in a workspace are using the same version of a dependency, which helps avoid runtime issues with data being passed around. But it is a little restrictive if a workspace is made up purely of libraries. If one of the libraries has more restrictive constraints on a shared transitive dependency, what does that mean for maintainers and consumers?

Ideally, each crate would have the broadest constraints possible. But if the libraries are closely coupled and designed to be used together, this may end up just implicitly being the highest version of a dependency. There is some version inflation. Funny enough, this is where minimal-versions is less strict than direct-minimal-versions. It will happily create a lockfile and just pick the highest minimum declared anywhere in the workspace for a dependency. But this means that the explicit minimal version of the other crates is not being tested. direct-minimal-versions, on the other hand, fails fast, which is painful for maintaining, but perhaps a good thing to point out that your library constraints are probably lying?

The serde dependency in the main rust-bitcoin workspace is a good example.

| Crate | Serde Version | Features | Optional |
| --- | --- | --- | --- |
| bitcoin | 1.0.103 | derive, alloc | Yes |
| fuzz | 1.0.103 | derive | No |
| hashes | 1.0 | - | Yes |
| internals | 1.0.103 | - | Yes |
| primitives | 1.0.103 | derive, alloc | Yes |
| units | 1.0.103 | derive | Yes |
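The table above can be fed into a simplified model of how the two flags treat a shared lockfile. This is a sketch of my understanding, not Cargo’s actual algorithm: minimal-versions settles on the highest declared minimum, while direct-minimal-versions is treated as rejecting any member whose declared minimum is below that pick. The function names are made up, and hashes’ "1.0" is read as 1.0.0.

```rust
/// (major, minor, patch), compared lexicographically.
type Version = (u64, u64, u64);

/// Simplified model of minimal-versions on a shared lockfile: the lowest
/// version that satisfies every member is the highest declared minimum.
fn minimal_versions_pick(members: &[(&str, Version)]) -> Option<Version> {
    members.iter().map(|(_, min)| *min).max()
}

/// Simplified model of direct-minimal-versions: list the members whose declared
/// minimum is below the shared pick. An empty list means resolution would
/// succeed in this model.
fn members_below_pick<'a>(members: &[(&'a str, Version)]) -> Vec<&'a str> {
    let Some(picked) = minimal_versions_pick(members) else {
        return Vec::new();
    };
    members
        .iter()
        .filter(|(_, min)| *min < picked)
        .map(|(name, _)| *name)
        .collect()
}
```

Given the six crates from the table, `minimal_versions_pick` lands on 1.0.103 and `members_below_pick` returns only hashes, which is the one crate whose stated floor never gets exercised.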

And here is the state of the lockfiles. There is no direct-minimal-versions entry since the above difference in direct dependencies makes that command fail.

| Lockfile | Serde Version |
| --- | --- |
| minimal-versions | 1.0.103 |
| Cargo-minimal.lock | 1.0.156 |
| Cargo-recent.lock | 1.0.210 |

A real life example of version drift! Something must have bumped serde in the past and then been removed or downgraded.

I am not sure of the best way to fix this. direct-minimal-versions forces you to be explicit about your minimal version, which does sound good on paper, but in this case hashes doesn’t have an internal dependency that would push up its serde version. It is just being in the same workspace that forces it. These libraries are supposed to be tightly coupled, so maybe that is not the worst outcome.