# Version-resolution algorithm
Status: in progress (Phases 1–2 of 6 done). This doc fixes the contract
for **`(package, version) → nixpkgs commit_hash`** discovery and the
flake-codegen pipeline that consumes it. It overrides `SPEC.md` §10's
single-shared-rev model with a per-dep-rev model (user-directed; SPEC
amendment is Phase 6).
## Overview
```
             cargoxx add <pkg>@<ver>
                        │
                        ▼
             ┌──────────────────────┐
             │ resolve_version(name,│
             │              version)│
             └──────────────────────┘
                  │            │
     primary HTTP │            │ offline fallback
                  ▼            ▼
   ┌──────────────────┐  ┌──────────────────────┐
   │ devbox_resolve   │  │ nixpkgs_git_resolve  │
   │ search.devbox.sh │  │ ~/.cache/cargoxx/    │
   │ /v1/resolve      │  │ nixpkgs/ (lazy)      │
   └──────────────────┘  └──────────────────────┘
            │                       │
            └───────────┬───────────┘
                        ▼
       Result<std::string /*commit_hash*/>
                        │
                        ▼
  cmd_add writes nixpkgs_rev into Cargoxx.lock
                        │
                        ▼ (later)
                  cargoxx build
                        │
                        ▼
        codegen::flake_nix reads lockfile
       emits per-pinned-dep nixpkgs input
```
## When does resolution run?
| Trigger | What gets resolved |
| --- | --- |
| `cargoxx add <pkg>@<ver>` | `(pkg, ver)` is resolved exactly once. The resulting commit is written to `Cargoxx.lock` next to the dep entry. |
| `cargoxx add <pkg>` (no `@<ver>`) | **Not** resolved. Lockfile entry's `nixpkgs_rev` stays `nullopt`. The generated flake.nix uses only the shared `nixpkgs.url = github:NixOS/nixpkgs/nixos-unstable`. |
| `cargoxx build` (lockfile already has rev) | **Not re-resolved.** `cargoxx build` reads existing lockfile entries and preserves `nixpkgs_rev`. Re-resolution would require an explicit `cargoxx update` (deferred to v0.3). |
| `cargoxx build` (lockfile missing the rev for a dep) | Synthesized as null — same as the wildcard path. (Future: also call `resolve_version` here when manifest spec is concrete.) |
`cargoxx build` is **idempotent with respect to the lockfile**:
running it twice produces byte-identical `flake.nix` + `Cargoxx.lock`
provided the manifest hasn't changed. This is the property the
"lockfile merge" change in Phase 4 enforces.
## resolve_version
```
auto resolve_version(name: string, version: string) -> Result<string /*sha40*/>:
    if r := devbox_resolve(name, version); r.has_value():
        return r->commit_hash
    if r := nixpkgs_git_resolve(name, version); r.has_value():
        return *r
    return std::unexpected(ResolutionVersionNotFound)
```
Implementation point: this orchestrator lives in
`src/resolver/resolver.cppm` (declaration) +
`src/resolver/version_resolve.cpp` (definition). Both probes are
already implemented — Phase 3 just wires them into the orchestrator
and into `cmd_add`.
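The fallback order can be sketched in C++ as follows. This is a minimal illustration, not the code in `version_resolve.cpp`: the `ResolveError` enum, the `ResolveResult` alias (a C++17 `std::variant` standing in for `Result<std::string>`), the `Probe` alias, and the name `resolve_version_sketch` are all assumptions made for the example, and the two probes are injected as callables so the control flow can be exercised without network or git.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <variant>

// Hypothetical error kinds, named after the Resolution* errors in this doc.
enum class ResolveError { UnknownPackage, VersionNotFound, NetworkError };

// C++17 stand-in for Result<std::string>: either a commit hash or an error.
using ResolveResult = std::variant<std::string, ResolveError>;

// A probe maps (name, version) to a commit hash or an error.
using Probe = std::function<ResolveResult(const std::string& name,
                                          const std::string& version)>;

// Try the primary probe; on any failure try the fallback; if both fail,
// report VersionNotFound, mirroring the pseudocode above.
inline ResolveResult resolve_version_sketch(const std::string& name,
                                            const std::string& version,
                                            const Probe& primary,
                                            const Probe& fallback) {
    assert(primary && fallback);  // both probes must be callable
    ResolveResult r = primary(name, version);
    if (std::holds_alternative<std::string>(r)) return r;
    r = fallback(name, version);
    if (std::holds_alternative<std::string>(r)) return r;
    return ResolveError::VersionNotFound;
}
```

Note that, as in the pseudocode, the primary probe's specific error is discarded once the fallback also fails; a caller that wants to distinguish "unknown package" from "network down" would need a richer return type.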
### Probe A — devbox_resolve (primary, HTTP)
**File:** `src/resolver/search_devbox.cpp` (committed `df2c25b`)
**URL pattern:**
```
GET https://search.devbox.sh/v1/resolve?name=<urlencoded-name>&version=<urlencoded-version>
```
This is the same endpoint devbox itself uses
(`devbox/internal/searcher/client.go` `Resolve()`). Behind the URL is
the same Jetify backend that powers nixhub.io.
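As an illustration of the URL pattern, the sketch below builds the request URL with both query values percent-encoded. The helper names `percent_encode` and `devbox_resolve_url` are hypothetical, and encoding everything outside the RFC 3986 unreserved set is an assumption about what "urlencoded" means here; the real code in `search_devbox.cpp` may differ.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// RFC 3986 "unreserved" characters pass through unchanged.
inline bool is_unreserved(unsigned char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '.' || c == '_' || c == '~';
}

// Percent-encode every byte that is not unreserved.
inline std::string percent_encode(std::string_view in) {
    static const char kHex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in) {
        if (is_unreserved(c)) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back('%');
            out.push_back(kHex[c >> 4]);
            out.push_back(kHex[c & 0x0F]);
        }
    }
    return out;
}

// Assemble the resolve URL from the pattern shown above.
inline std::string devbox_resolve_url(std::string_view name,
                                      std::string_view version) {
    assert(!name.empty() && !version.empty());
    std::string url = "https://search.devbox.sh/v1/resolve?name=";
    url += percent_encode(name);
    url += "&version=";
    url += percent_encode(version);
    return url;
}
```

For plain inputs such as `fmt` and `10.2.1` the encoding is a no-op; it only matters for versions containing characters like `+`.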
**Response shape (real, abbreviated for `fmt 10.2.1`):**
```json
{
  "commit_hash": "f4b140d5b253f5e2a1ff4e5506edbf8267724bde",
  "version": "10.2.1",
  "name": "fmt",
  "attr_paths": ["fmt"],
  "systems": {
    "x86_64-linux": {
      "commit_hash": "f4b140d5b253f5e2a1ff4e5506edbf8267724bde",
      "attr_paths": ["fmt"], ...
    }, ...
  }
}
```
**Parser contract** (`parse_devbox_resolve`):
- `commit_hash` is mandatory. If the top-level field is missing, fall
back to the first non-empty `systems.<plat>.commit_hash`.
- `name`, `version`, `attr_paths` are best-effort; absence leaves them
blank.
- 404 / curl exit 22 → `ResolutionUnknownPackage`.
- Empty `commit_hash` after fallback → `ResolutionVersionNotFound`.
- Other curl exits, JSON parse errors → `ResolutionNetworkError`.
**Timeout:** 10 s on `--max-time`, 15 s wrapping `ExecOptions.timeout`.
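The parser contract above can be summarized as two small pure functions. This is a sketch under stated assumptions: JSON decoding is left out, `DevboxResponse` is a hypothetical already-decoded view of the body (with `systems` kept in document order), and `DevboxOutcome`, `effective_commit_hash`, and `classify_devbox` are illustrative names rather than the identifiers used by `parse_devbox_resolve`.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

enum class DevboxOutcome { Ok, UnknownPackage, VersionNotFound, NetworkError };

// Hypothetical decoded response; absent fields are left empty.
struct DevboxResponse {
    std::string commit_hash;  // top-level field
    // (platform, commit_hash) pairs from "systems", in document order.
    std::vector<std::pair<std::string, std::string>> systems;
};

// Top-level commit_hash if present, else the first non-empty per-system one.
inline std::string effective_commit_hash(const DevboxResponse& r) {
    if (!r.commit_hash.empty()) return r.commit_hash;
    for (const auto& entry : r.systems) {
        if (!entry.second.empty()) return entry.second;
    }
    return {};
}

// Map (curl exit code, decoded body or nullopt on JSON failure) to an outcome.
inline DevboxOutcome classify_devbox(int curl_exit,
                                     const std::optional<DevboxResponse>& body) {
    if (curl_exit == 22) return DevboxOutcome::UnknownPackage;   // HTTP 404
    if (curl_exit != 0 || !body) return DevboxOutcome::NetworkError;
    return effective_commit_hash(*body).empty()
               ? DevboxOutcome::VersionNotFound
               : DevboxOutcome::Ok;
}
```

Keeping the classification separate from the HTTP call makes each branch of the contract testable with canned inputs.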
### Probe B — nixpkgs_git_resolve (offline fallback)
**File:** `src/resolver/nixpkgs_git.cpp` (committed in Phase 2 series)
**Setup:** lazy clone of
`https://github.com/NixOS/nixpkgs.git` into
`$XDG_CACHE_HOME/cargoxx/nixpkgs/` (or `$HOME/.cache/...`) on first
use. ~9 GB and slow (5–15 min); subsequent calls are fast and offline.
**Search:**
```
git -C <repo> log --all \
    -S 'version = "<version>"' \
    --pretty='%H %ct' \
    -- pkgs/
```
`-S '<term>'` returns commits that *introduced or removed* the literal
string. `--pretty='%H %ct'` emits `<sha40> <committer-time>` per
line. We restrict to `pkgs/` to keep noise down (out-of-tree match
sites in `lib/`, `nixos/`, etc. don't matter).
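A possible way to assemble that command is as an argument vector, which sidesteps shell quoting of the search term. The function name `git_log_search_argv` is hypothetical, and the sketch assumes the version is inserted verbatim, since `-S` matches literal text in the tree.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Build the argv for the pickaxe search; no shell is involved, so the
// embedded double quotes in the search term need no extra escaping.
inline std::vector<std::string> git_log_search_argv(const std::string& repo_dir,
                                                    const std::string& version) {
    // A quote inside the version would change what the term matches.
    assert(version.find('"') == std::string::npos);
    return {
        "git", "-C", repo_dir, "log", "--all",
        "-S", "version = \"" + version + "\"",
        "--pretty=%H %ct",
        "--", "pkgs/",
    };
}
```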
**Pick:** youngest committer-time (`%ct` highest) wins. The pure
helper `pick_youngest_commit(text)` does this; it tolerates malformed
lines (skips them).
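A minimal sketch of that selection logic is shown below. It is named `pick_youngest_commit_sketch` to make clear it is not the shipped helper; treating a line as malformed unless it is exactly a 40-character hex sha followed by a non-negative integer is an assumption about what "malformed" covers.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <optional>
#include <sstream>
#include <string>

// Scan "<sha40> <committer-time>" lines and return the sha with the highest
// committer time. Lines that do not match that shape are skipped.
inline std::optional<std::string> pick_youngest_commit_sketch(const std::string& text) {
    std::optional<std::string> best_sha;
    long long best_time = -1;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string sha, extra;
        long long committed = 0;
        // Require exactly two fields, the second one numeric.
        if (!(fields >> sha >> committed) || (fields >> extra)) continue;
        const bool is_sha40 =
            sha.size() == 40 &&
            std::all_of(sha.begin(), sha.end(),
                        [](unsigned char c) { return std::isxdigit(c) != 0; });
        if (!is_sha40 || committed <= best_time) continue;
        best_time = committed;
        best_sha = sha;
    }
    assert(!best_sha || best_sha->size() == 40);
    return best_sha;
}
```

On a tie in committer time this sketch keeps the first line seen; the doc does not specify tie-breaking, so that choice is arbitrary.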
**Errors:**
- `pick_youngest_commit` returns `nullopt` → `ResolutionVersionNotFound`.
- Clone failure → `ResolutionNetworkError`.
- Subsequent `git log` failure → `ResolutionNetworkError`.
**Test fixture trick:** instead of cloning real nixpkgs in tests, the
unit test builds a tiny throwaway repo with
`pkgs/development/libraries/<pkg>/default.nix` files at two versions
and asserts introducing-commit detection works.
### Heuristic limits
`-S 'version = "<v>"'` is fuzzy — it matches **any** file in `pkgs/`
that has that literal. Two real-world failure modes:
1. **Unrelated package match.** `version = "1.0.0"` appears in many
nix derivations. The youngest-commit tiebreaker biases toward
"the most recent thing that touched this string", which usually
*is* the package's bump commit, but this is not guaranteed.
2. **Non-string-formed versions.** Some derivations build the version
via `lib.removeSuffix`, interpolation, or an inherited
`pname`/`finalAttrs.version`. `-S` won't see those. For those
packages, only the devbox HTTP path can answer.
Both are accepted as known limits — the HTTP path is primary and fast
when reachable; the git fallback exists only for offline determinism.
## Lockfile interaction
`Cargoxx.lock` already carries `LockfilePackage.nixpkgs_rev`
(`std::optional<std::string>`). No schema change.
### Add path
`cmd_add fmt@10.2.1`:
1. existing manifest validation, duplicate check, linkdb resolve /
discover (separate auto-resolution feature, already shipped).
2. **NEW:** call `resolve_version("fmt", "10.2.1")`. On success,
capture `commit_hash`.
3. existing manifest write of `[dependencies] fmt = "10.2.1"`.
4. **NEW:** load lockfile (or initialize empty), find/insert the
`LockfilePackage{ name="fmt", version="10.2.1" }` entry, set
`nixpkgs_rev = "<commit_hash>"`, write lockfile back.
`cmd_add fmt` (wildcard) skips step 2 and step 4's `nixpkgs_rev`
assignment.
### Build path (Phase 4 fix)
Today, `synthesize_lockfile` overwrites the lockfile every time. With
per-dep revs in scope this would erase pinned revs on every build.
The fix:
```
build_lockfile(manifest, recipes):
    let prior = parse(project_root / "Cargoxx.lock") or empty
    for each dep in manifest.dependencies:
        let prior_entry = prior.find(dep.name, dep.version_spec)
        new_entry = LockfilePackage{ name, version=dep.version_spec, ... }
        if prior_entry: new_entry.nixpkgs_rev = prior_entry.nixpkgs_rev
        emit new_entry
```
The lookup key is `(name, version)`. If the user changes the version,
the prior rev is dropped (correct — the rev was for the old version).
If the user neither edited nor `cargoxx update`d, the rev survives.
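The merge rule can be illustrated with a small self-contained C++ sketch. `LockfilePackage` here is reduced to the three fields this section discusses, and `ManifestDep` and `merge_lockfile` are hypothetical names; parsing and writing the lockfile are deliberately left out.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Reduced to the fields relevant to the merge.
struct LockfilePackage {
    std::string name;
    std::string version;
    std::optional<std::string> nixpkgs_rev;
};

// Hypothetical view of one [dependencies] entry.
struct ManifestDep {
    std::string name;
    std::string version_spec;
};

// Rebuild the lockfile from the manifest, carrying over nixpkgs_rev from a
// prior entry only when both name and version match.
inline std::vector<LockfilePackage> merge_lockfile(
        const std::vector<ManifestDep>& deps,
        const std::vector<LockfilePackage>& prior) {
    std::vector<LockfilePackage> merged;
    merged.reserve(deps.size());
    for (const ManifestDep& dep : deps) {
        LockfilePackage entry{dep.name, dep.version_spec, std::nullopt};
        for (const LockfilePackage& old : prior) {
            if (old.name == dep.name && old.version == dep.version_spec) {
                entry.nixpkgs_rev = old.nixpkgs_rev;  // rev survives the rebuild
                break;
            }
        }
        merged.push_back(std::move(entry));
    }
    return merged;
}
```

Because the output depends only on the manifest and the prior lockfile, running the merge on its own output changes nothing, which is the idempotence property described earlier.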
### Update path (deferred to v0.3)
`cargoxx update <pkg>` would call `resolve_version` again with the
existing manifest version_spec, possibly upgrading the rev to a
newer one, even when the user-visible version string is unchanged.
Out of scope for this milestone.
## Flake codegen — per-dep inputs
**Phase 5.** Today's `flake.nix` template has a single
`@@NIXPKGS_REV@@` placeholder. The new template emits:
### Inputs block
```nix
inputs = {
  nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
  # one line per dep with non-null nixpkgs_rev:
  nixpkgs-fmt-10_2_1.url = "github:NixOS/nixpkgs/f4b140d5b...";
  nixpkgs-spdlog-1_13_0.url = "github:NixOS/nixpkgs/abcdef0...";
  flake-utils.url = "github:numtide/flake-utils";
};
```
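One way the inputs block could be generated from the pinned lockfile entries is sketched below. `PinnedInput` and `emit_inputs_block` are hypothetical names, the attr is assumed to be sanitized already, and the two-space indentation is a formatting assumption rather than something the template mandates.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One pinned dependency: the sanitized input attr and its nixpkgs commit.
struct PinnedInput {
    std::string attr;  // e.g. "nixpkgs-fmt-10_2_1"
    std::string rev;   // full commit hash from Cargoxx.lock
};

// Emit the shared nixpkgs input, one input per pinned dep, then flake-utils.
inline std::string emit_inputs_block(const std::vector<PinnedInput>& pinned) {
    std::string out = "inputs = {\n";
    out += "  nixpkgs.url = \"github:NixOS/nixpkgs/nixos-unstable\";\n";
    for (const PinnedInput& in : pinned) {
        assert(!in.attr.empty() && !in.rev.empty());
        out += "  " + in.attr + ".url = \"github:NixOS/nixpkgs/" + in.rev + "\";\n";
    }
    out += "  flake-utils.url = \"github:numtide/flake-utils\";\n";
    out += "};\n";
    return out;
}
```

With no pinned deps the output degenerates to the single shared `nixpkgs` input plus `flake-utils`, matching the wildcard case.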
### Outputs lambda
```nix
outputs = { self, nixpkgs, nixpkgs-fmt-10_2_1, nixpkgs-spdlog-1_13_0,
            flake-utils }: ...
```
### Let bindings
```nix
let
  pkgs = import nixpkgs { inherit system; };
  pkgs_fmt_10_2_1 = import nixpkgs-fmt-10_2_1 { inherit system; };
  pkgs_spdlog_1_13_0 = import nixpkgs-spdlog-1_13_0 { inherit system; };
  llvmPkgs = pkgs.llvmPackages;
in {...}
```
### buildInputs
```nix
buildInputs = [
  pkgs_fmt_10_2_1.fmt         # pinned dep
  pkgs_spdlog_1_13_0.spdlog   # pinned dep
  pkgs.zlib                   # unpinned: uses default nixpkgs
];
```
Unpinned deps (where `nixpkgs_rev` is null) reference the shared
`pkgs` set as today.
### Sanitization
Helper in `src/codegen/flake.cpp`:
```cpp
auto sanitize_input_attr(std::string_view name, std::string_view version)
    -> std::string;
```
Steps:
1. Concatenate `nixpkgs-<name>-<version>`.
2. Replace every char outside `[a-zA-Z0-9_-]` with `_`. Mostly
converts dots in versions: `10.2.1` → `10_2_1`.
3. Use the sanitized form in **all three** places: `inputs.<attr>`,
the `outputs = { …, <attr>, … }` parameter list, and (with the
`nixpkgs-` prefix dropped and hyphens turned into underscores) the
`pkgs_<name>_<version>` `let` binding.
Examples:
- `fmt` + `10.2.1` → input attr `nixpkgs-fmt-10_2_1`,
`let` binding `pkgs_fmt_10_2_1`
- `range-v3` + `0.12.0` → `nixpkgs-range-v3-0_12_0`,
`pkgs_range_v3_0_12_0`
- `boost_filesystem` + `1.84.0` → `nixpkgs-boost_filesystem-1_84_0`
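The `let`-binding name has **all** non-alphanumeric characters
replaced with `_`, hyphens included. Strictly speaking, Nix
identifiers may contain hyphens (the `outputs` parameter list above
relies on that), so this is a naming convention for the generated
bindings rather than a syntax requirement. The **input** attr keeps
its hyphens. The helper therefore yields two derived forms.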
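The two derived forms can be sketched as follows. The first function reuses the declared helper name `sanitize_input_attr` but is only an illustration of the steps listed above, not the code in `src/codegen/flake.cpp`; `let_binding_name` and `is_ascii_alnum` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <string_view>

inline bool is_ascii_alnum(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9');
}

// Input attr: "nixpkgs-<name>-<version>", with every character outside
// [a-zA-Z0-9_-] replaced by '_'. Hyphens are kept.
inline std::string sanitize_input_attr(std::string_view name,
                                       std::string_view version) {
    assert(!name.empty() && !version.empty());
    std::string attr = "nixpkgs-" + std::string(name) + "-" + std::string(version);
    for (char& c : attr) {
        if (!is_ascii_alnum(c) && c != '_' && c != '-') c = '_';
    }
    return attr;
}

// let binding: "pkgs_<name>_<version>", with every non-alphanumeric
// character (hyphens included) replaced by '_'.
inline std::string let_binding_name(std::string_view name,
                                    std::string_view version) {
    assert(!name.empty() && !version.empty());
    std::string id = "pkgs_" + std::string(name) + "_" + std::string(version);
    for (char& c : id) {
        if (!is_ascii_alnum(c)) c = '_';
    }
    return id;
}
```

Deriving both names from the raw `(name, version)` pair, rather than deriving the binding from the attr, keeps the `nixpkgs-` prefix out of the binding name.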
### Collision detection
Two pinned deps with the same `(sanitized_name, sanitized_version)`
collide. With the version stored fully (e.g. `10.2.1`, never the
manifest spec `10.2`) and dep names being unique within a manifest,
collisions are pathologically rare. If a real one is ever reported,
mitigation is to append `-<short-sha>` to the input attr.
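
Should that mitigation ever be needed, it might look like the sketch below. Everything here is hypothetical: the `FlakeInput` struct, the function name, and the choice of a 7-character short sha are assumptions, since the doc only says a `-<short-sha>` suffix would be appended.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// A generated input: sanitized attr plus the pinned nixpkgs commit.
struct FlakeInput {
    std::string attr;
    std::string rev;
};

// Append "-<short-sha>" to every attr that occurs more than once;
// unique attrs are left untouched.
inline void disambiguate_collisions(std::vector<FlakeInput>& inputs) {
    std::map<std::string, int> occurrences;
    for (const FlakeInput& in : inputs) ++occurrences[in.attr];
    for (FlakeInput& in : inputs) {
        assert(!in.rev.empty());
        if (occurrences.at(in.attr) > 1) {
            in.attr += "-" + in.rev.substr(0, 7);  // 7 chars: assumed length
        }
    }
}
```

For example, name `a-1` at version `2` and name `a` at version `1-2` both sanitize to `nixpkgs-a-1-2`; the suffix separates them as long as their revs differ.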
## Phase status
| Phase | Status | Commit |
| --- | --- | --- |
| 1. devbox_resolve + parser | ✅ | `df2c25b` |
| 2. nixpkgs_git_resolve fallback | ✅ | `cb82e91` |
| 3. resolve_version + cmd_add wire-up | ✅ | `6f8e9c4` |
| 4. cmd_build lockfile merge | ✅ | `c4b2a1b` |
| 5. flake codegen for per-dep inputs | ✅ | (this commit) |
| 6. SPEC §7/§10 amendment + smoke | ✅ | (this commit) |
## End-to-end verification (Phase 6)
```sh
cd /tmp && rm -rf demo && mkdir demo && cd demo
cargoxx new app && cd app
cargoxx add fmt@10.2.1
grep "nixpkgs-fmt-10_2_1" flake.nix      # input present
grep "f4b140d5" flake.nix                # commit_hash substituted
cargoxx build && ./build/debug/app       # binary builds + runs
cp flake.nix ../flake.nix.prev           # snapshot for the idempotence check
cp Cargoxx.lock ../Cargoxx.lock.prev
cargoxx build                            # second run is a no-op
diff ../flake.nix.prev flake.nix         # byte-identical
diff ../Cargoxx.lock.prev Cargoxx.lock   # byte-identical
```
A second `cargoxx build` regenerates byte-identical
`Cargoxx.lock` + `flake.nix`, which shows that the merge path
preserves the rev rather than re-resolving it.
## ABI note
Mixing nixpkgs revisions across pinned deps trades the single-rev
ABI guarantee (SPEC §10) for flexibility. Two pinned deps may have
been compiled against different glibc / libc++ majors and fail to
link cleanly. v0.2 silently accepts the risk; surfacing a
compatibility warning is a future polish item.