# Auto-resolution for non-curated packages

Status: in progress. Tracks the implementation of `cargoxx add <pkg>` for
packages that are not in `data/linkdb.json`. See `SPEC.md` §9 steps 4–6 for
the contract this implements.

## Goal

Today `cargoxx add` only succeeds for the 25 packages baked into
`data/linkdb.json`. This work extends `cargoxx add <pkg>` to fall through
to on-demand discovery on the user's local machine and, on success, to
persist the discovered recipe to the SQLite overlay so subsequent runs are
instant.

The user-stated steps:

1. confirm the package exists in `nixpkgs` (`nixos-unstable`),
2. discover its CMake `find_package` / target rules via Conan, then vcpkg,
   then by scanning `lib/cmake/**/*Config.cmake` under the package's nix
   store path,
3. verify the candidate by building an empty program that links the dep,
4. record the version (already in hand from step 1's `nix eval`),
5. write the recipe to the overlay so it sticks.

## Design decisions

| Decision | Choice | Why |
| --- | --- | --- |
| Verify depth | full `cargoxx build` of a tmp project | catches link / ABI errors that configure-only would miss (e.g. abseil-cpp's libstdc++ vs libc++ mismatch already exposed by `verify-curated-db.sh`) |
| Probe order | Conan → vcpkg → nix-cmake-scan; first that *passes verification* wins; failed candidates fall through | maximizes hit rate without polluting overlay |
| Discovery side-effects | `Database::resolve()` stays pure (overlay+curated only); a separate `Database::discover()` does network + verify + persist | preserves the existing test surface; `cmd_add` orchestrates the chain |
| Failure caching | populate `resolution_failures` (already in schema) when *all* probes fail; subsequent retries within 24 h short-circuit | prevents repeated minute-long retries |
| Verification result handling | scaffold tmp project, write provisional overlay row with `verified_at = 0`, build; on success rewrite `verified_at = now`; on failure delete the row | overlay only ever holds verified recipes |

## Resolution chain

```
db.resolve(name, version, components)
  ├─ overlay rows (existing)
  ├─ curated JSON (existing)
  └─ on LinkdbUnknownPackage → cmd_add calls db.discover(name, project_root)
       ├─ nixpkgs probe: nix eval nixpkgs#<name> for { version, path }
       │    fail → resolution_failures, return error
       ├─ Conan probe: GET conan-center-index/recipes/<name>/all/conanfile.py
       │    regex out cmake_target_name + cmake_file_name
       ├─ vcpkg probe: GET microsoft/vcpkg/ports/<name>/usage
       │    parse the literal CMake snippet
       ├─ nix-cmake-scan: walk <path>/lib/cmake/**/*Config.cmake
       │    regex add_library(<name> ... IMPORTED) for targets
       │    derive find_package name from the *Config.cmake filename stem
       │
       ├─ for each candidate (in order above):
       │    verify_link(candidate, name, version, components, overlay_path)
       │      - scaffold tmp project (cmd_new),
       │      - provisional overlay row pointing at the candidate,
       │      - write empty src/main.cpp,
       │      - call cmd_build(no_build = false) to run nix develop -c
       │        cmake configure + build,
       │      - succeeds → rewrite overlay row with verified_at = now;
       │        return Recipe to caller
       │      - fails → delete provisional row, try next probe
       │
       └─ all candidates failed → record to resolution_failures;
            return ResolutionUnsatisfiable
```

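
The candidate loop at the bottom of the chain might be sketched as follows. This is an illustration only: `Candidate`, `Probe`, `Verifier`, and `first_verified` are hypothetical names, not the project's API, and the sketch leaves out the provisional overlay row that the real flow writes and deletes around each verification.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a discovered recipe; not the project's Recipe type.
struct Candidate {
    std::string source;        // "conan", "vcpkg", or "nix-probe"
    std::string find_package;  // name passed to find_package()
    std::string target;        // CMake target to link against
};

// A probe yields a candidate or nothing; a verifier reports whether it links.
using Probe = std::function<std::optional<Candidate>()>;
using Verifier = std::function<bool(const Candidate&)>;

// Try probes in order and return the first candidate that passes verification.
// A probe that finds nothing, or whose candidate fails to verify, falls through.
std::optional<Candidate> first_verified(const std::vector<Probe>& probes,
                                        const Verifier& verify) {
    for (const auto& probe : probes) {
        std::optional<Candidate> candidate = probe();
        if (candidate && verify(*candidate)) {
            return candidate;
        }
    }
    return std::nullopt;  // caller records the failure and reports unsatisfiable
}
```

Keeping the probes behind a uniform callable is one way to let the order (Conan, vcpkg, nix-cmake-scan) live in a single list; the actual `Database::discover` may be structured differently.
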
## File layout

```
src/resolver/
├── resolver.cppm        # public API surface for all resolver helpers
├── nixpkgs_probe.cpp    # ✅ Phase 1 (committed: 1c7ff39)
├── nix_cmake_scan.cpp   # Phase 2
├── conan_probe.cpp      # Phase 3
├── vcpkg_probe.cpp      # Phase 4
└── verify_link.cpp      # Phase 5
```

`Database::discover` and the `cmd_add` wire-up land in Phase 6 by editing
`src/linkdb/curated.cpp`, `src/linkdb/overlay.cpp`, and
`src/cli/cmd_add.cpp`.

The deferred files in `TECH_SPEC.md` §1 (`nixhub.cpp`, `lazamar.cpp`,
`nixpkgs_git.cpp`) belong to a separate feature: the *version* resolver
that picks a concrete version from a range. They are out of scope here.

## Critical files (re-)used

| File | Why |
| --- | --- |
| `src/linkdb/linkdb.cppm` | extend with `Database::discover()` declaration |
| `src/linkdb/curated.cpp:158` | `Database::resolve` already does overlay → curated; discovery is *not* folded in here, kept side-effect free |
| `src/linkdb/overlay.cpp` | split `overlay_insert_manual` → `overlay_insert_recipe(row, source)` so non-`manual` sources are persistable; add `overlay_delete_recipe`; add `overlay_record_failure` for `resolution_failures` |
| `src/cli/cmd_add.cpp:48` | after `db->resolve(...)` returns `LinkdbUnknownPackage`, call `db->discover(name, project_root)` and use the returned recipe |
| `src/exec/exec.cppm`, `src/exec/subprocess.cpp` | reuse `exec::run` for `nix eval` and `curl`; no new tooling, just new call sites |
| `src/util/util.cppm` | reuse `ResolutionUnknownPackage` (E40), `ResolutionNetworkError` (E41), `ResolutionUnsatisfiable` (E42); no new error codes |
| `src/cli/cmd_build.cpp` | called by `verify_link.cpp`; takes `overlay_path` and `project_root`; no signature change needed |
| `scripts/verify-curated-db.sh` | conceptual template for the `verify_link` flow: same pattern as that script, in code form |

## Probe specs

### A. nixpkgs_probe (✅ done, Phase 1, 1c7ff39)

```
nix eval nixpkgs#<pkg> --json --apply 'p: { version = p.version or ""; path = p.outPath; }'
```

- `--extra-experimental-features 'nix-command flakes'` is baked into the call
  so it works without user-side `nix.conf` flags.
- 60 s `ExecOptions.timeout`.
- Failure modes: missing attribute (`stderr` has `does not provide attribute`)
  → `ResolutionUnknownPackage`; otherwise `ResolutionNetworkError`.
- Returned: `NixpkgsInfo { attr, version, out_path }`.
- The field name in the `--apply` result **must** be `path`, not `outPath`.
  nix's `--json` mode coerces any attrset containing `outPath` to a
  bare-string derivation reference, which would lose the `version` field.

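
The failure-mode rule above is small enough to show in code. The sketch below is not the committed `nixpkgs_probe` code: `ProbeError` and `classify_nix_eval_failure` are hypothetical names, and the only grounded detail is the `does not provide attribute` marker quoted in the bullet list.

```cpp
#include <string_view>

// Hypothetical mirror of the two resolver error kinds named above.
enum class ProbeError { UnknownPackage, NetworkError };

// Map a failed `nix eval` run to an error kind by inspecting its stderr.
// Anything that is not a missing-attribute message is treated as a
// network-class failure, following the "otherwise" rule in the spec.
ProbeError classify_nix_eval_failure(std::string_view stderr_text) {
    constexpr std::string_view kMissingAttr = "does not provide attribute";
    if (stderr_text.find(kMissingAttr) != std::string_view::npos) {
        return ProbeError::UnknownPackage;
    }
    return ProbeError::NetworkError;
}
```
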
### B. nix_cmake_scan (Phase 2, next)

- Walk `<out_path>/lib/cmake/` recursively.
- For each `<X>Config.cmake` or `<X>-config.cmake`:
  - `find_package` name = stem `<X>`.
  - Read file. Regex
    `add_library\(([^ ]+)\s+(STATIC|SHARED|INTERFACE|UNKNOWN)\s+IMPORTED\)`
    to extract IMPORTED targets.
  - Also pick up `add_library(<alias> ALIAS <real>)` so the canonical
    `<alias>::<sub>` form gets detected.
- Pick best candidate:
  1. case-insensitive equality between stem and `package_name`,
  2. prefix match,
  3. first config with non-empty target list.
- Returns `NixCmakeCandidate { find_package, targets, config_file }` or
  `ResolutionUnknownPackage`.

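
As a rough illustration of the scan and ranking rules above, the sketch below applies the quoted regex with `std::regex` and then ranks config files. It is not the committed implementation: `ConfigCandidate`, `lower`, and `pick_best` are hypothetical, ALIAS handling is omitted, and "prefix match" is assumed to mean that the lowercased stem starts with the lowercased package name.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <optional>
#include <regex>
#include <string>
#include <vector>

// Collect the names of IMPORTED targets declared in a *Config.cmake text.
std::vector<std::string> scan_imported_targets(const std::string& text) {
    static const std::regex kImported(
        R"re(add_library\(([^ ]+)\s+(STATIC|SHARED|INTERFACE|UNKNOWN)\s+IMPORTED\))re");
    std::vector<std::string> targets;
    for (std::sregex_iterator it(text.begin(), text.end(), kImported), end; it != end; ++it) {
        targets.push_back((*it)[1].str());
    }
    return targets;
}

// Hypothetical shape; loosely mirrors NixCmakeCandidate without config_file.
struct ConfigCandidate {
    std::string find_package;          // stem <X> of <X>Config.cmake
    std::vector<std::string> targets;  // IMPORTED targets found in that file
};

std::string lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Rank: exact case-insensitive stem match, then prefix match, then the first
// config that declares any target. Returns an index into `configs`.
std::optional<std::size_t> pick_best(const std::vector<ConfigCandidate>& configs,
                                     const std::string& package_name) {
    const std::string want = lower(package_name);
    std::optional<std::size_t> prefix, non_empty;
    for (std::size_t i = 0; i < configs.size(); ++i) {
        const std::string stem = lower(configs[i].find_package);
        if (stem == want) return i;
        if (!prefix && stem.rfind(want, 0) == 0) prefix = i;
        if (!non_empty && !configs[i].targets.empty()) non_empty = i;
    }
    return prefix ? prefix : non_empty;
}
```
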
### C. Conan probe (Phase 3)

- Text-only: never executes Python. SPEC §14 mandates this.
- `curl -fsSL https://raw.githubusercontent.com/conan-io/conan-center-index/master/recipes/<pkg>/all/conanfile.py`.
- Regex `cmake_target_name\s*=\s*['"]([^'"]+)['"]` and same for
  `cmake_file_name`. Handle both `cpp_info.set_property("cmake_target_name", ...)`
  and the legacy `self.cpp_info.names["cmake"] = "..."` forms.
- Pure parser exposed as `parse_conanfile(text)`; the network adapter
  wraps `curl` via `exec::run`.
- 404 → `ResolutionUnknownPackage`; transport errors → `ResolutionNetworkError`.

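
A text-only parser along these lines could look like the sketch below. It only covers the `set_property(...)` and legacy `names["cmake"]` forms; `ConanNames`, `first_capture`, and `parse_conanfile_sketch` are hypothetical names, and reusing the legacy name for both fields is an assumption rather than documented behavior of the real `parse_conanfile`.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical result shape; the real parse_conanfile may return something else.
struct ConanNames {
    std::optional<std::string> target_name;  // e.g. "fmt::fmt"
    std::optional<std::string> file_name;    // e.g. "fmt"
};

// Return the first capture group of `pattern` in `text`, if it matches.
std::optional<std::string> first_capture(const std::string& text, const std::regex& pattern) {
    std::smatch m;
    if (std::regex_search(text, m, pattern)) return m[1].str();
    return std::nullopt;
}

// Text-only extraction: the Python source is searched, never executed.
ConanNames parse_conanfile_sketch(const std::string& text) {
    static const std::regex kTarget(
        R"re(set_property\(\s*['"]cmake_target_name['"]\s*,\s*['"]([^'"]+)['"])re");
    static const std::regex kFile(
        R"re(set_property\(\s*['"]cmake_file_name['"]\s*,\s*['"]([^'"]+)['"])re");
    static const std::regex kLegacy(
        R"re(names\[\s*['"]cmake['"]\s*\]\s*=\s*['"]([^'"]+)['"])re");

    ConanNames out;
    out.target_name = first_capture(text, kTarget);
    out.file_name = first_capture(text, kFile);
    // Assumption: a legacy recipe carries one bare name, used for any missing field.
    if (auto legacy = first_capture(text, kLegacy)) {
        if (!out.target_name) out.target_name = legacy;
        if (!out.file_name) out.file_name = legacy;
    }
    return out;
}
```

Because the parser takes plain text, it can be unit-tested against embedded `conanfile.py` snippets without any network access.
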
### D. vcpkg probe (Phase 4)

- `curl -fsSL https://raw.githubusercontent.com/microsoft/vcpkg/master/ports/<pkg>/usage`.
- The file is plain CMake. Extract first `find_package(<name> ...)` line and
  any `target_link_libraries(... <pkg>::...)` lines.
- Pure parser exposed as `parse_vcpkg_usage(text)`.

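
The extraction described above might be sketched like this. `VcpkgUsage` and `parse_vcpkg_usage_sketch` are hypothetical names, and the sketch collects every namespaced (`a::b`) argument of `target_link_libraries` instead of filtering by package prefix, which is a simplifying assumption.

```cpp
#include <algorithm>
#include <optional>
#include <regex>
#include <string>
#include <vector>

// Hypothetical result shape for a parsed vcpkg `usage` file.
struct VcpkgUsage {
    std::optional<std::string> find_package;  // name from the first find_package()
    std::vector<std::string> targets;         // namespaced targets, first-seen order
};

VcpkgUsage parse_vcpkg_usage_sketch(const std::string& text) {
    static const std::regex kFind(R"re(find_package\(\s*([A-Za-z0-9_.+-]+))re");
    static const std::regex kLink(R"re(target_link_libraries\(([^)]*)\))re");
    static const std::regex kTarget(R"re([A-Za-z0-9_.+-]+::[A-Za-z0-9_.+-]+)re");

    VcpkgUsage out;
    std::smatch m;
    if (std::regex_search(text, m, kFind)) out.find_package = m[1].str();

    // Walk every target_link_libraries(...) call and keep its `ns::name` arguments.
    for (std::sregex_iterator it(text.begin(), text.end(), kLink), end; it != end; ++it) {
        const std::string args = (*it)[1].str();
        for (std::sregex_iterator t(args.begin(), args.end(), kTarget); t != end; ++t) {
            const std::string name = t->str();
            if (std::find(out.targets.begin(), out.targets.end(), name) == out.targets.end()) {
                out.targets.push_back(name);  // de-duplicate, preserving order
            }
        }
    }
    return out;
}
```
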
### E. verify_link (Phase 5)

```cpp
// Builds a throwaway project that links `candidate`.
// Returns {} on success, or the build error on failure.
auto verify_link(const Recipe& candidate,
                 const std::string& name,
                 const std::string& version_spec,
                 const std::vector<std::string>& components,
                 const std::filesystem::path& cargoxx_overlay_path)
    -> util::Result<void>;
```

- Create `<tmp>/cargoxx-verify-<name>` (mktemp).
- `cmd_new(name, /*lib_only=*/false, tmp_parent)`.
- Insert `candidate` into `cargoxx_overlay_path` with the right `source`
  and `verified_at = 0` (provisional).
- Mutate the scaffolded manifest to declare `name` with `version_spec`
  and `components`.
- Overwrite `src/main.cpp` with `int main() {}` (empty body). The point
  is to exercise find_package + target_link_libraries + linker, *not* to
  call any specific API (which would require per-package knowledge).
- Call `cmd_build(tmp_proj, no_build=false, release=false,
  target=nullopt, overlay_path=cargoxx_overlay_path)`.
- On success: rewrite the overlay row with `verified_at = now()`,
  return `{}`.
- On failure: delete the provisional row, return the build error.
- Always: `std::filesystem::remove_all(tmp_dir)` (RAII helper).

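
The "always clean up" step is the kind of thing an RAII guard handles well. The class below is a minimal sketch of such a helper, not the one in `verify_link.cpp`: `ScopedTempDir` is a hypothetical name, and the random suffix merely stands in for mktemp-style uniqueness.

```cpp
#include <filesystem>
#include <random>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Creates a uniquely named directory and removes it recursively on scope exit.
class ScopedTempDir {
public:
    explicit ScopedTempDir(const std::string& name) {
        std::random_device rd;
        // Assumption: a random suffix is unique enough for a short-lived build dir.
        path_ = fs::temp_directory_path() /
                ("cargoxx-verify-" + name + "-" + std::to_string(rd()));
        fs::create_directories(path_);
    }
    ~ScopedTempDir() {
        std::error_code ec;  // use the non-throwing overload inside a destructor
        fs::remove_all(path_, ec);
    }
    ScopedTempDir(const ScopedTempDir&) = delete;
    ScopedTempDir& operator=(const ScopedTempDir&) = delete;

    const fs::path& path() const { return path_; }

private:
    fs::path path_;
};
```

With a guard like this, both the success and failure branches of the verification can simply return; the directory disappears when the guard goes out of scope.
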
## Persistence semantics

| Probe path | `source` column | `verified_at` | TTL (existing `overlay_is_fresh`) |
| --- | --- | --- | --- |
| Conan probe verified | `conan` | now | 30 days |
| vcpkg probe verified | `vcpkg` | now | 30 days |
| nix-cmake-scan verified | `nix-probe` | now | 30 days |
| Manual via `linkdb add` | `manual` | now | never expires |

`resolution_failures` is populated only when **all** probes fail. Subsequent
`cargoxx add` calls within 24 h skip probing and return the cached error.

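
The TTL rules in the table can be read as two small predicates. The sketch below is not the existing `overlay_is_fresh`; the function names are hypothetical, timestamps are assumed to be Unix seconds, and treating a row as expired at exactly the TTL boundary is an arbitrary choice.

```cpp
#include <cstdint>
#include <string_view>

constexpr std::int64_t kDay = 24 * 60 * 60;     // seconds in a day
constexpr std::int64_t kRecipeTtl = 30 * kDay;  // probed recipes: 30 days
constexpr std::int64_t kFailureTtl = 1 * kDay;  // cached failures: 24 h

// True if an overlay row can be used without re-probing.
bool recipe_is_fresh(std::string_view source, std::int64_t verified_at, std::int64_t now) {
    if (verified_at == 0) return false;   // provisional row, never verified
    if (source == "manual") return true;  // manual rows never expire
    return now - verified_at < kRecipeTtl;
}

// True if a recorded failure should still short-circuit a new `cargoxx add`.
bool failure_is_cached(std::int64_t failed_at, std::int64_t now) {
    return now - failed_at < kFailureTtl;
}
```
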
## Phasing (one commit per phase)

| Phase | Status | Commit |
| --- | --- | --- |
| 1. nixpkgs_probe + JSON parser | ✅ | `1c7ff39` |
| 2. nix_cmake_scan | ✅ | `e63ac69` |
| 3. conan_probe + parse_conanfile | ✅ | `e5c173b` |
| 4. vcpkg_probe + parse_vcpkg_usage | ✅ | `941d5b3` |
| 5. verify_link (tmp project + cmd_build) | ✅ | `816ec99` |
| 6. resolver::discover + cmd_add wire-up | ✅ | (this commit) |

## Testing strategy

| Test | Mechanism |
| --- | --- |
| `parse_nix_eval_json(text)` | ✅ Catch2 unit (`tests/nixpkgs_probe_parse.cpp`) |
| `nixpkgs_probe(name)` | ✅ network-gated (`tests/nixpkgs_probe_live.cpp`); requires `CARGOXX_NETWORK_TESTS=1` |
| `scan_imported_targets(text)` | Catch2 unit |
| `nix_cmake_scan(tmp)` | Catch2 unit using a fixture tree |
| `parse_conanfile(text)` | Catch2 unit; embedded conanfile.py snippets covering both old and new forms |
| `parse_vcpkg_usage(text)` | Catch2 unit |
| `conan_probe(name)` | network-gated; against `fmt` |
| `vcpkg_probe(name)` | network-gated; against `fmt` |
| `verify_link` end-to-end | network-gated; uses `simdjson` (small, present in nixpkgs, not in our curated DB) |
| `cmd_add` end-to-end on uncurated package | network-gated; full flow on `simdjson` |

Failure-mode coverage:

- Conan/vcpkg 404 → `ResolutionUnknownPackage`
- `nix eval` missing-attribute errors → `ResolutionUnknownPackage`
- All probes return candidates that fail to verify-link → record failure,
  return `ResolutionUnsatisfiable`
- `resolution_failures` cache hit → returns the recorded error without
  re-probing

## Definition of done

After Phase 6:

```sh
nix develop -c cmake --build build && \
  ctest --test-dir build --output-on-failure                    # all unit tests green
CARGOXX_NETWORK_TESTS=1 nix develop -c ctest --test-dir build   # live tests too
```

Manual smoke (matches the user's request 1–5):

```sh
cd /tmp && rm -rf simd-smoke && mkdir simd-smoke && cd simd-smoke
~/cargoxx/build/cargoxx new app && cd app
~/cargoxx/build/cargoxx add simdjson   # not in curated; triggers discover
# Expected output:
#   probing nixpkgs#simdjson ... ok (3.x.y)
#   probing conan-center-index ... ok (cmake_target_name = simdjson::simdjson)
#   verifying ... ok
#   Added simdjson 3.x.y (linkdb: conan)
~/cargoxx/build/cargoxx build          # ordinary build path now
                                       # picks up the freshly cached
                                       # overlay row
```

A second `cargoxx add simdjson` in another fresh project hits the overlay
directly and returns instantly, which proves persistence (step 5).

## Risks / known limits

- **Network**: Conan + vcpkg probes need outbound HTTPS. The
  network-gated test layer covers this; the unit tests on pure parsers
  don't need network.
- **Conan recipe shape variation**: ~10 % of recipes use Python
  conditionals to set `cmake_target_name` per option, and text parsing
  will miss these. Such packages fall through to vcpkg / nix-scan, which
  is the point of the chain.
- **nix-cmake-scan heuristics**: packages without the standard
  `lib/cmake/<X>/<X>Config.cmake` layout won't be picked up. Acceptable
  for v0.2; the manual escape hatch (`cargoxx linkdb add`) covers
  edge cases.
- **Overlay growth**: long-tail packages will accumulate in the user's
  overlay sqlite. No cleanup in v0.2; this is not a concern at human-scale
  package counts.
- **Verify-link slowness**: full `cargoxx build` per candidate. The first
  probe usually wins, so it's typically one build. Worst case: three
  builds (Conan fail, vcpkg fail, nix-scan ok). Document this as expected
  behavior in the CLI output (`verifying...` progress message).