cargoxx/docs/version-resolution.md

Version-resolution algorithm

Status: in progress (Phases 1–2 of 6 done). This doc fixes the contract for (package, version) → nixpkgs commit_hash discovery and the flake-codegen pipeline that consumes it. It overrides SPEC.md §10's single-shared-rev model with a per-dep-rev model (user-directed; SPEC amendment is Phase 6).

Overview

                          cargoxx add <pkg>@<ver>
                                  │
                                  ▼
                       ┌──────────────────────┐
                       │ resolve_version(name,│
                       │    version)          │
                       └──────────────────────┘
                              │      │
                  primary HTTP │      │ offline fallback
                              ▼      ▼
            ┌──────────────────┐  ┌──────────────────────┐
            │ devbox_resolve   │  │ nixpkgs_git_resolve  │
            │ search.devbox.sh │  │ ~/.cache/cargoxx/    │
            │ /v1/resolve      │  │   nixpkgs/ (lazy)    │
            └──────────────────┘  └──────────────────────┘
                              │      │
                              └──┬───┘
                                  ▼
                  Result<std::string /*commit_hash*/>
                                  │
                                  ▼
              cmd_add writes nixpkgs_rev into Cargoxx.lock
                                  │
                                  ▼     (later)
                          cargoxx build
                                  │
                                  ▼
                  codegen::flake_nix reads lockfile
                  emits per-pinned-dep nixpkgs input

When does resolution run?

| Trigger | What gets resolved |
| --- | --- |
| cargoxx add <pkg>@<ver> | (pkg, ver) is resolved exactly once. The resulting commit is written to Cargoxx.lock next to the dep entry. |
| cargoxx add <pkg> (no @<ver>) | Not resolved. Lockfile entry's nixpkgs_rev stays nullopt. The generated flake.nix uses only the shared nixpkgs.url = github:NixOS/nixpkgs/nixos-unstable. |
| cargoxx build (lockfile already has rev) | Not re-resolved. cargoxx build reads existing lockfile entries and preserves nixpkgs_rev. Re-resolution would require an explicit cargoxx update (deferred to v0.3). |
| cargoxx build (lockfile missing the rev for a dep) | Synthesized as null, same as the wildcard path. (Future: also call resolve_version here when manifest spec is concrete.) |

cargoxx build is idempotent with respect to the lockfile — running it twice produces byte-identical flake.nix + Cargoxx.lock provided the manifest hasn't changed. This is the property the "lockfile merge" change in Phase 4 enforces.

resolve_version

auto resolve_version(name: string, version: string) -> Result<string /*sha40*/>:
    if r := devbox_resolve(name, version); r.has_value():
        return r->commit_hash
    if r := nixpkgs_git_resolve(name, version); r.has_value():
        return *r
    return std::unexpected(ResolutionVersionNotFound)

Implementation point: this orchestrator lives in src/resolver/resolver.cppm (declaration) + src/resolver/version_resolve.cpp (definition). Both probes are already implemented — Phase 3 just wires them into the orchestrator and into cmd_add.
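The orchestrator pseudocode above can be sketched in C++17 as follows. This is an illustration under stated assumptions, not the code in src/resolver/: the Result alias, the ResolveError enum, and the Probe type are hypothetical stand-ins, and the two probes are injected as callables so the fallback order can be exercised without network or git access.

```cpp
#include <functional>
#include <string>
#include <variant>

// Hypothetical error enum; the names mirror the Resolution* errors in this doc.
enum class ResolveError { UnknownPackage, VersionNotFound, NetworkError };

// Stand-in for the project's Result<T> (assumed shape, not the real type).
template <typename T>
using Result = std::variant<T, ResolveError>;

// A probe maps (name, version) to a commit hash or an error.
using Probe =
    std::function<Result<std::string>(const std::string&, const std::string&)>;

// Try the primary probe, then the fallback. As in the pseudocode, the probes'
// own errors are discarded and VersionNotFound is reported if both fail.
inline Result<std::string> resolve_version(const std::string& name,
                                           const std::string& version,
                                           const Probe& primary,
                                           const Probe& fallback) {
    if (auto r = primary(name, version); std::holds_alternative<std::string>(r))
        return r;
    if (auto r = fallback(name, version); std::holds_alternative<std::string>(r))
        return r;
    return ResolveError::VersionNotFound;
}
```

Passing the probes in as parameters is a choice made for this sketch; the real orchestrator presumably calls devbox_resolve and nixpkgs_git_resolve directly.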

Probe A — devbox_resolve (primary, HTTP)

File: src/resolver/search_devbox.cpp (committed df2c25b)

URL pattern:

GET https://search.devbox.sh/v1/resolve?name=<urlencoded-name>&version=<urlencoded-version>

This is the same endpoint devbox itself uses (devbox/internal/searcher/client.go Resolve()). Behind the URL is the same Jetify backend that powers nixhub.io.
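As an illustration of the URL pattern, the query string could be assembled as below. The helper names url_encode and devbox_resolve_url are hypothetical, and the encoder escapes everything outside the RFC 3986 unreserved set, which may be stricter than what search_devbox.cpp actually does.

```cpp
#include <string>
#include <string_view>

// Percent-encode every byte that is not an RFC 3986 unreserved character.
inline std::string url_encode(std::string_view in) {
    static constexpr char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in) {
        const bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                                (c >= '0' && c <= '9') || c == '-' || c == '.' ||
                                c == '_' || c == '~';
        if (unreserved) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back('%');
            out.push_back(hex[c >> 4]);
            out.push_back(hex[c & 0x0F]);
        }
    }
    return out;
}

// Build the resolve URL for a (name, version) pair.
inline std::string devbox_resolve_url(std::string_view name,
                                      std::string_view version) {
    return "https://search.devbox.sh/v1/resolve?name=" + url_encode(name) +
           "&version=" + url_encode(version);
}
```

For fmt 10.2.1 neither value needs escaping, so the result is the pattern above with the two placeholders filled in verbatim.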

Response shape (real, abbreviated for fmt 10.2.1):

{
  "commit_hash": "f4b140d5b253f5e2a1ff4e5506edbf8267724bde",
  "version": "10.2.1",
  "name": "fmt",
  "attr_paths": ["fmt"],
  "systems": {
    "x86_64-linux": {
      "commit_hash": "f4b140d5b253f5e2a1ff4e5506edbf8267724bde",
      "attr_paths": ["fmt"], ...
    }, ...
  }
}

Parser contract (parse_devbox_resolve):

  • commit_hash is mandatory. If the top-level field is missing, fall back to the first non-empty systems.<plat>.commit_hash.
  • name, version, attr_paths are best-effort; absence leaves them blank.
  • 404 / curl exit 22 → ResolutionUnknownPackage.
  • Empty commit_hash after fallback → ResolutionVersionNotFound.
  • Other curl exits, JSON parse errors → ResolutionNetworkError.
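The fallback and error-mapping rules in the contract above can be sketched as two small pure functions. JSON decoding is left out: DevboxResponse is a hypothetical, already-decoded view of the response, and "first" per-system hash is taken to mean first in the order the entries were decoded, which is an assumption.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

enum class ResolveError { UnknownPackage, VersionNotFound, NetworkError };

// Hypothetical decoded response; only the fields the contract needs.
struct DevboxResponse {
    std::string commit_hash;  // top-level field, empty if missing
    std::vector<std::pair<std::string, std::string>> system_hashes;  // (platform, hash)
};

// Top-level commit_hash if present, else the first non-empty per-system hash.
// nullopt corresponds to ResolutionVersionNotFound.
inline std::optional<std::string> effective_commit_hash(const DevboxResponse& r) {
    if (!r.commit_hash.empty()) return r.commit_hash;
    for (const auto& entry : r.system_hashes) {
        if (!entry.second.empty()) return entry.second;
    }
    return std::nullopt;
}

// Map a curl exit code to an error; nullopt means the transfer succeeded.
inline std::optional<ResolveError> classify_curl_exit(int code) {
    if (code == 0) return std::nullopt;
    if (code == 22) return ResolveError::UnknownPackage;  // HTTP error such as 404
    return ResolveError::NetworkError;
}
```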

Timeout: 10 s via curl's --max-time, wrapped by a 15 s ExecOptions.timeout on the subprocess.

Probe B — nixpkgs_git_resolve (offline fallback)

File: src/resolver/nixpkgs_git.cpp (committed in Phase 2 series)

Setup: lazy clone of https://github.com/NixOS/nixpkgs.git into $XDG_CACHE_HOME/cargoxx/nixpkgs/ (or $HOME/.cache/...) on first use. The clone is ~9 GB and slow (5–15 min); subsequent calls are fast and offline.
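A minimal sketch of the cache-directory choice, assuming an empty XDG_CACHE_HOME is treated like an unset one (as the XDG Base Directory spec prescribes). The function name nixpkgs_cache_dir is hypothetical, and the environment values are passed in as arguments so the logic stays pure; the real code would presumably read them with std::getenv.

```cpp
#include <filesystem>
#include <optional>
#include <string>

// Pick $XDG_CACHE_HOME if set and non-empty, otherwise $HOME/.cache,
// then append the cargoxx/nixpkgs subdirectory.
inline std::filesystem::path nixpkgs_cache_dir(
    const std::optional<std::string>& xdg_cache_home, const std::string& home) {
    namespace fs = std::filesystem;
    const fs::path base = (xdg_cache_home && !xdg_cache_home->empty())
                              ? fs::path(*xdg_cache_home)
                              : fs::path(home) / ".cache";
    return base / "cargoxx" / "nixpkgs";
}
```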

Search:

git -C <repo> log --all                   \
    -S 'version = "<version>"'            \
    --pretty='%H %ct'                     \
    -- pkgs/

-S '<term>' returns commits that introduced or removed the literal string. --pretty='%H %ct' emits <sha40> <committer-time> per line. We restrict to pkgs/ to keep noise down (out-of-tree match sites in lib/, nixos/, etc. don't matter).

Pick: youngest committer-time (%ct highest) wins. The pure helper pick_youngest_commit(text) does this; it tolerates malformed lines (skips them).
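One way the selection could look, as a sketch rather than the helper in nixpkgs_git.cpp: each line is expected to hold a 40-character hex sha and a decimal committer time, anything else is skipped, and on equal timestamps the line seen first wins (that tie rule is an assumption).

```cpp
#include <cctype>
#include <optional>
#include <sstream>
#include <string>

// Return the sha with the highest committer time, or nullopt if no line parses.
inline std::optional<std::string> pick_youngest_commit(const std::string& text) {
    std::optional<std::string> best;
    long long best_ct = -1;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string sha;
        long long ct = 0;
        if (!(fields >> sha >> ct)) continue;  // missing field or non-numeric time
        if (sha.size() != 40) continue;        // not a full sha
        bool hex = true;
        for (unsigned char c : sha) hex = hex && (std::isxdigit(c) != 0);
        if (!hex) continue;
        if (ct > best_ct) {
            best_ct = ct;
            best = sha;
        }
    }
    return best;
}
```

A nullopt result is what the error list below maps to ResolutionVersionNotFound.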

Errors:

  • pick_youngest_commit returns nullopt → ResolutionVersionNotFound.
  • Clone failure → ResolutionNetworkError.
  • Subsequent git log failure → ResolutionNetworkError.

Test fixture trick: instead of cloning real nixpkgs in tests, the unit test builds a tiny throwaway repo with pkgs/development/libraries/<pkg>/default.nix files at two versions and asserts introducing-commit detection works.

Heuristic limits

-S 'version = "<v>"' is fuzzy — it matches any file in pkgs/ that has that literal. Two real-world failure modes:

  1. Unrelated package match. version = "1.0.0" appears in many nix derivations. Picking the youngest commit biases toward "the most recent thing that touched this string", which is usually the package's bump commit, but that is not guaranteed.
  2. Non-string-formed versions. Some derivations build the version via lib.removeSuffix, interpolation, or an inherited pname/finalAttrs.version. -S won't see those. For those packages, only the devbox HTTP path can answer.

Both are accepted as known limits — the HTTP path is primary and fast when reachable; the git fallback exists only for offline determinism.

Lockfile interaction

Cargoxx.lock already carries LockfilePackage.nixpkgs_rev (std::optional<std::string>). No schema change.

Add path

cmd_add fmt@10.2.1:

  1. existing manifest validation, duplicate check, linkdb resolve / discover (separate auto-resolution feature, already shipped).
  2. NEW: call resolve_version("fmt", "10.2.1"). On success, capture commit_hash.
  3. existing manifest write of [dependencies] fmt = "10.2.1".
  4. NEW: load lockfile (or initialize empty), find/insert the LockfilePackage{ name="fmt", version="10.2.1" } entry, set nixpkgs_rev = "<commit_hash>", write lockfile back.

cmd_add fmt (wildcard) skips step 2 and step 4's nixpkgs_rev assignment.

Build path (Phase 4 fix)

Today, synthesize_lockfile overwrites the lockfile every time. With per-dep revs in scope this would erase pinned revs on every build.

The fix:

build_lockfile(manifest, recipes):
    let prior = parse(project_root / "Cargoxx.lock") or empty
    for each dep in manifest.dependencies:
        let prior_entry = prior.find(dep.name, dep.version_spec)
        new_entry = LockfilePackage{ name, version=dep.version_spec, ... }
        if prior_entry: new_entry.nixpkgs_rev = prior_entry.nixpkgs_rev
        emit new_entry

The lookup key is (name, version). If the user changes the version, the prior rev is dropped (correct, since the rev was for the old version). If the user neither edits the version nor runs cargoxx update, the rev survives.
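The merge rule can be sketched in C++17 as below. LockfilePackage is reduced to the three fields this doc mentions, Dependency is a hypothetical view of a manifest entry, and the prior lockfile is passed in already parsed; the real build_lockfile also takes recipes, which are omitted here.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

struct Dependency {  // hypothetical manifest view
    std::string name;
    std::string version_spec;
};

struct LockfilePackage {  // reduced to the fields discussed here
    std::string name;
    std::string version;
    std::optional<std::string> nixpkgs_rev;
};

// Rebuild the lockfile from the manifest, carrying nixpkgs_rev over only
// when the prior lockfile has an exact (name, version) match.
inline std::vector<LockfilePackage> build_lockfile(
    const std::vector<Dependency>& deps,
    const std::vector<LockfilePackage>& prior) {
    std::vector<LockfilePackage> out;
    out.reserve(deps.size());
    for (const auto& dep : deps) {
        LockfilePackage entry{dep.name, dep.version_spec, std::nullopt};
        for (const auto& old : prior) {
            if (old.name == dep.name && old.version == dep.version_spec) {
                entry.nixpkgs_rev = old.nixpkgs_rev;
                break;
            }
        }
        out.push_back(std::move(entry));
    }
    return out;
}
```

Because the output depends only on the manifest and the prior lockfile, feeding the result back in as prior yields the same entries, which is the idempotence property described earlier.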

Update path (deferred to v0.3)

cargoxx update <pkg> would call resolve_version again with the existing manifest version_spec, possibly upgrading the rev to a newer one, even when the user-visible version string is unchanged. Out of scope for this milestone.

Flake codegen — per-dep inputs

Phase 5. Today's flake.nix template has a single @@NIXPKGS_REV@@ placeholder. The new template emits:

Inputs block

inputs = {
  nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
  # one line per dep with non-null nixpkgs_rev:
  nixpkgs-fmt-10_2_1.url = "github:NixOS/nixpkgs/f4b140d5b...";
  nixpkgs-spdlog-1_13_0.url = "github:NixOS/nixpkgs/abcdef0...";
  flake-utils.url = "github:numtide/flake-utils";
};

Outputs lambda

outputs = { self, nixpkgs, nixpkgs-fmt-10_2_1, nixpkgs-spdlog-1_13_0,
            flake-utils }: ...

Let bindings

let
  pkgs = import nixpkgs { inherit system; };
  pkgs_fmt_10_2_1 = import nixpkgs-fmt-10_2_1 { inherit system; };
  pkgs_spdlog_1_13_0 = import nixpkgs-spdlog-1_13_0 { inherit system; };
  llvmPkgs = pkgs.llvmPackages;
in {...}

buildInputs

buildInputs = [
  pkgs_fmt_10_2_1.fmt        # pinned dep
  pkgs_spdlog_1_13_0.spdlog  # pinned dep
  pkgs.zlib                  # unpinned: uses default nixpkgs
];

Unpinned deps (where nixpkgs_rev is null) reference the shared pkgs set as today.
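As a rough sketch of how the inputs block and the buildInputs list could be emitted from lockfile data: the Dep struct and both function names are hypothetical, and the attr and let-binding names are assumed to arrive already sanitized. The real template in src/codegen/flake.cpp may be organised quite differently.

```cpp
#include <optional>
#include <string>
#include <vector>

struct Dep {  // hypothetical codegen view of one dependency
    std::string nix_attr;            // e.g. "fmt"
    std::string input_attr;          // e.g. "nixpkgs-fmt-10_2_1"
    std::string let_name;            // e.g. "pkgs_fmt_10_2_1"
    std::optional<std::string> rev;  // nullopt = unpinned
};

// Shared nixpkgs first, then one input per pinned dep, then flake-utils.
inline std::string emit_inputs(const std::vector<Dep>& deps) {
    std::string out = "inputs = {\n";
    out += "  nixpkgs.url = \"github:NixOS/nixpkgs/nixos-unstable\";\n";
    for (const auto& d : deps) {
        if (!d.rev) continue;  // unpinned deps add no input
        out += "  " + d.input_attr + ".url = \"github:NixOS/nixpkgs/" + *d.rev + "\";\n";
    }
    out += "  flake-utils.url = \"github:numtide/flake-utils\";\n";
    out += "};\n";
    return out;
}

// Pinned deps come from their own package set, unpinned ones from shared pkgs.
inline std::string emit_build_inputs(const std::vector<Dep>& deps) {
    std::string out = "buildInputs = [\n";
    for (const auto& d : deps) {
        out += "  " + (d.rev ? d.let_name : std::string("pkgs")) + "." + d.nix_attr + "\n";
    }
    out += "];\n";
    return out;
}
```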

Sanitization

Helper in src/codegen/flake.cpp:

auto sanitize_input_attr(std::string_view name, std::string_view version)
    -> std::string;

Steps:

  1. Concatenate nixpkgs-<name>-<version>.
  2. Replace every char outside [a-zA-Z0-9_-] with _. In practice this mostly converts dots in versions: 10.2.1 → 10_2_1.
  3. Use the sanitized form in all three places: inputs.<attr>, the outputs = { …, <attr>, … } parameter list, and the let binding pkgs_<name>_<version>, which additionally turns hyphens into underscores.

Examples:

  • fmt + 10.2.1 → input attr nixpkgs-fmt-10_2_1, let binding pkgs_fmt_10_2_1
  • range-v3 + 0.12.0 → nixpkgs-range-v3-0_12_0, pkgs_range_v3_0_12_0
  • boost_filesystem + 1.84.0 → nixpkgs-boost_filesystem-1_84_0

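The let-binding name replaces every non-alphanumeric character with _ (hyphens included), while the input attr keeps its hyphens, so each dep has two derived forms. Nix identifiers may legally contain hyphens after the first character (the outputs parameter list above relies on this), so the underscore form is a naming convention rather than a language requirement.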
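The two derived forms can be sketched as follows. sanitize_input_attr uses the signature given above; let_binding_name is a hypothetical companion for the pkgs_ form, and both use explicit ASCII checks so the result does not depend on the C locale.

```cpp
#include <string>
#include <string_view>

// ASCII-only alphanumeric check, independent of the current locale.
inline bool is_ascii_alnum(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
}

// Input attr: nixpkgs-<name>-<version>, chars outside [a-zA-Z0-9_-] become '_'.
inline std::string sanitize_input_attr(std::string_view name,
                                       std::string_view version) {
    std::string out = "nixpkgs-";
    out.append(name).append("-").append(version);
    for (char& c : out) {
        if (!(is_ascii_alnum(c) || c == '_' || c == '-')) c = '_';
    }
    return out;
}

// Let binding: pkgs_<name>_<version>, every non-alphanumeric char becomes '_'.
inline std::string let_binding_name(std::string_view name,
                                    std::string_view version) {
    std::string out = "pkgs_";
    out.append(name).append("_").append(version);
    for (char& c : out) {
        if (!is_ascii_alnum(c)) c = '_';
    }
    return out;
}
```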

Collision detection

Two pinned deps with the same (sanitized_name, sanitized_version) collide. With the version stored fully (e.g. 10.2.1, never the manifest spec 10.2) and dep names being unique within a manifest, collisions are pathologically rare. If a real one is ever reported, mitigation is to append -<short-sha> to the input attr.
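If that mitigation is ever needed, it could look roughly like this. The PinnedInput struct, the function name disambiguate, and the choice of 7 characters for the short sha are all assumptions made for this sketch; two colliding deps that also share a rev would still collide.

```cpp
#include <map>
#include <string>
#include <vector>

struct PinnedInput {  // hypothetical: attr is already sanitized
    std::string input_attr;
    std::string rev;
};

// Append "-<first 7 chars of rev>" to every attr that occurs more than once.
inline void disambiguate(std::vector<PinnedInput>& inputs) {
    std::map<std::string, int> seen;
    for (const auto& in : inputs) ++seen[in.input_attr];
    for (auto& in : inputs) {
        // The count is looked up under the original attr before it is modified.
        if (seen[in.input_attr] > 1) in.input_attr += "-" + in.rev.substr(0, 7);
    }
}
```

A collision of this kind would need two dep names that differ only in characters the sanitizer rewrites, for example a.b and a_b at the same version.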

Phase status

| Phase | Status | Commit |
| --- | --- | --- |
| 1. devbox_resolve + parser | | df2c25b |
| 2. nixpkgs_git_resolve fallback | | cb82e91 |
| 3. resolve_version + cmd_add wire-up | | 6f8e9c4 |
| 4. cmd_build lockfile merge | | c4b2a1b |
| 5. flake codegen for per-dep inputs | | (this commit) |
| 6. SPEC §7/§10 amendment + smoke | | (this commit) |

End-to-end verification (Phase 6)

cd /tmp && rm -rf demo && mkdir demo && cd demo
cargoxx new app && cd app
cargoxx add fmt@10.2.1
grep "nixpkgs-fmt-10_2_1" flake.nix      # input present
grep "f4b140d5" flake.nix                # commit_hash substituted
cargoxx build && ./build/debug/app       # binary builds + runs
cp flake.nix ../flake.nix.prev           # snapshot before the second build
cargoxx build                            # second run is no-op
diff ../flake.nix.prev flake.nix         # byte-identical

A second cargoxx build regenerates byte-identical Cargoxx.lock + flake.nix, which shows that the merge path preserves the rev rather than re-resolving it.

ABI note

Mixing nixpkgs revisions across pinned deps trades the single-rev ABI guarantee (SPEC §10) for flexibility. Two pinned deps may have been compiled against different glibc / libc++ majors and fail to link cleanly. v0.2 silently accepts the risk; surfacing a compatibility warning is a future polish item.