# Auto-resolution for non-curated packages Status: in progress. Tracks the implementation of `cargoxx add ` for packages that are not in `data/linkdb.json`. See `SPEC.md` §9 step 4–6 for the contract this implements. ## Goal Today `cargoxx add` only succeeds for the 25 packages baked into `data/linkdb.json`. This work extends `cargoxx add ` to fall through to the user's local machine and, on success, persist the discovered recipe to the SQLite overlay so subsequent runs are instant. The user-stated steps: 1. confirm the package exists in `nixpkgs` (`nixos-unstable`), 2. discover its CMake `find_package` / target rules via Conan, then vcpkg, then by scanning `lib/cmake/**/*Config.cmake` under the package's nix store path, 3. verify the candidate by building an empty program that links the dep, 4. record the version (already in hand from step 1's `nix eval`), 5. write the recipe to the overlay so it sticks. ## Design decisions | Decision | Choice | Why | | --- | --- | --- | | Verify depth | full `cargoxx build` of a tmp project | catches link / ABI errors that configure-only would miss (e.g. abseil-cpp's libstdc++ vs libc++ mismatch already exposed by `verify-curated-db.sh`) | | Probe order | Conan → vcpkg → nix-cmake-scan; first that *passes verification* wins; failed candidates fall through | maximizes hit rate without polluting overlay | | Discovery side-effects | `Database::resolve()` stays pure (overlay+curated only); a separate `Database::discover()` does network + verify + persist | preserves the existing test surface; `cmd_add` orchestrates the chain | | Failure caching | populate `resolution_failures` (already in schema) when *all* probes fail; subsequent retries within 24 h short-circuit | prevents repeated minute-long retries | | Verification result handling | scaffold tmp project, write provisional overlay row with `verified_at = 0`, build; on success rewrite `verified_at = now`; on failure delete the row | overlay only ever holds verified recipes | ## Resolution chain ``` db.resolve(name, version, components) ├─ overlay rows (existing) ├─ curated JSON (existing) └─ on LinkdbUnknownPackage → cmd_add calls db.discover(name, project_root) ├─ nixpkgs probe: nix eval nixpkgs# for { version, path } │ fail → resolution_failures, return error ├─ Conan probe: GET conan-center-index/recipes//all/conanfile.py │ regex out cmake_target_name + cmake_file_name ├─ vcpkg probe: GET microsoft/vcpkg/ports//usage │ parse the literal CMake snippet ├─ nix-cmake-scan: walk /lib/cmake/**/*Config.cmake │ regex add_library( ... IMPORTED) for targets │ derive find_package name from the *Config.cmake filename stem │ ├─ for each candidate (in order above): │ verify_link(candidate, name, version, components, overlay_path) │ — scaffold tmp project (cmd_new), │ — provisional overlay row pointing at the candidate, │ — write empty src/main.cpp, │ — call cmd_build(no_build = false) to run nix develop -c │ cmake configure + build, │ — succeeds → rewrite overlay row with verified_at = now; │ return Recipe to caller │ — fails → delete provisional row, try next probe │ └─ all candidates failed → record to resolution_failures; return ResolutionUnsatisfiable ``` ## File layout ``` src/resolver/ ├── resolver.cppm # public API surface for all resolver helpers ├── nixpkgs_probe.cpp # ✅ Phase 1 (committed: 1c7ff39) ├── nix_cmake_scan.cpp # Phase 2 ├── conan_probe.cpp # Phase 3 ├── vcpkg_probe.cpp # Phase 4 └── verify_link.cpp # Phase 5 ``` `Database::discover` and the `cmd_add` wire-up land in Phase 6 by editing `src/linkdb/curated.cpp`, `src/linkdb/overlay.cpp`, and `src/cli/cmd_add.cpp`. The deferred files in `TECH_SPEC.md` §1 (`nixhub.cpp`, `lazamar.cpp`, `nixpkgs_git.cpp`) belong to a separate feature — the *version* resolver that picks a concrete version from a range. Out of scope here. ## Critical files (re-)used | File | Why | | --- | --- | | `src/linkdb/linkdb.cppm` | extend with `Database::discover()` declaration | | `src/linkdb/curated.cpp:158` | `Database::resolve` already does overlay → curated; discovery is *not* folded in here, kept side-effect free | | `src/linkdb/overlay.cpp` | split `overlay_insert_manual` → `overlay_insert_recipe(row, source)` so non-`manual` sources are persistable; add `overlay_delete_recipe`; add `overlay_record_failure` for `resolution_failures` | | `src/cli/cmd_add.cpp:48` | after `db->resolve(...)` returns `LinkdbUnknownPackage`, call `db->discover(name, project_root)` and use the returned recipe | | `src/exec/exec.cppm`, `src/exec/subprocess.cpp` | reuse `exec::run` for `nix eval` and `curl` — no new tooling, just new call sites | | `src/util/util.cppm` | reuse `ResolutionUnknownPackage` (E40), `ResolutionNetworkError` (E41), `ResolutionUnsatisfiable` (E42); no new error codes | | `src/cli/cmd_build.cpp` | called by `verify_link.cpp`; takes `overlay_path` and `project_root`; no signature change needed | | `scripts/verify-curated-db.sh` | conceptual template for the `verify_link` flow — same pattern as that script, in code form | ## Probe specs ### A. nixpkgs_probe (✅ done — Phase 1, 1c7ff39) ``` nix eval nixpkgs# --json --apply 'p: { version = p.version or ""; path = p.outPath; }' ``` - `--extra-experimental-features 'nix-command flakes'` baked into the call so it works without user-side `nix.conf` flags. - 60 s `ExecOptions.timeout`. - Failure modes: missing attribute (`stderr` has `does not provide attribute`) → `ResolutionUnknownPackage`; otherwise `ResolutionNetworkError`. - Returned: `NixpkgsInfo { attr, version, out_path }`. - Field name **must** be `path`, not `outPath`. nix's `--json` mode coerces any attrset containing `outPath` to a bare-string derivation reference, which would lose the `version` field. ### B. nix_cmake_scan (Phase 2, next) - Walk `/lib/cmake/` recursively. - For each `Config.cmake` or `-config.cmake`: - `find_package` name = stem ``. - Read file. Regex `add_library\(([^ ]+)\s+(STATIC|SHARED|INTERFACE|UNKNOWN)\s+IMPORTED\)` to extract IMPORTED targets. - Also pick up `add_library( ALIAS )` so the canonical `::` form gets detected. - Pick best candidate: 1. case-insensitive equality between stem and `package_name`, 2. prefix match, 3. first config with non-empty target list. - Returns `NixCmakeCandidate { find_package, targets, config_file }` or `ResolutionUnknownPackage`. ### C. Conan probe (Phase 3) - Text-only — never executes Python. SPEC §14 mandates this. - `curl -fsSL https://raw.githubusercontent.com/conan-io/conan-center-index/master/recipes//all/conanfile.py`. - Regex `cmake_target_name\s*=\s*['"]([^'"]+)['"]` and same for `cmake_file_name`. Handle both `cpp_info.set_property("cmake_target_name", ...)` and the legacy `self.cpp_info.names["cmake"] = "..."` forms. - Pure parser exposed as `parse_conanfile(text)`; the network adapter wraps `curl` via `exec::run`. - 404 → `ResolutionUnknownPackage`; transport errors → `ResolutionNetworkError`. ### D. vcpkg probe (Phase 4) - `curl -fsSL https://raw.githubusercontent.com/microsoft/vcpkg/master/ports//usage`. - The file is plain CMake. Extract first `find_package( ...)` line and any `target_link_libraries(... ::...)` lines. - Pure parser exposed as `parse_vcpkg_usage(text)`. ### E. verify_link (Phase 5) ```cpp auto verify_link(const Recipe& candidate, const std::string& name, const std::string& version_spec, const std::vector& components, const std::filesystem::path& cargoxx_overlay_path) -> util::Result; ``` - Create `/cargoxx-verify-` (mktemp). - `cmd_new(name, /*lib_only=*/false, tmp_parent)`. - Insert `candidate` into `cargoxx_overlay_path` with the right `source` and `verified_at = 0` (provisional). - Mutate the scaffolded manifest to declare `name` with `version_spec` and `components`. - Overwrite `src/main.cpp` with `int main() {}` — empty body. The point is to exercise find_package + target_link_libraries + linker, *not* to call any specific API (which would require per-package knowledge). - Call `cmd_build(tmp_proj, no_build=false, release=false, target=nullopt, overlay_path=cargoxx_overlay_path)`. - On success: rewrite the overlay row with `verified_at = now()`, return `{}`. - On failure: delete the provisional row, return the build error. - Always: `std::filesystem::remove_all(tmp_dir)` (RAII helper). ## Persistence semantics | Probe path | `source` column | `verified_at` | TTL (existing `overlay_is_fresh`) | | --- | --- | --- | --- | | Conan probe verified | `conan` | now | 30 days | | vcpkg probe verified | `vcpkg` | now | 30 days | | nix-cmake-scan verified | `nix-probe` | now | 30 days | | Manual via `linkdb add` | `manual` | now | never expires | `resolution_failures` populated only when **all** probes fail. Subsequent `cargoxx add` calls within 24 h skip probing and return the cached error. ## Phasing (one commit per phase) | Phase | Status | Commit | | --- | --- | --- | | 1. nixpkgs_probe + JSON parser | ✅ | `1c7ff39` | | 2. nix_cmake_scan | ✅ | `e63ac69` | | 3. conan_probe + parse_conanfile | ✅ | `e5c173b` | | 4. vcpkg_probe + parse_vcpkg_usage | ✅ | `941d5b3` | | 5. verify_link (tmp project + cmd_build) | ✅ | (this commit) | | 6. Database::discover + cmd_add wire-up + failure caching | pending | — | ## Testing strategy | Test | Mechanism | | --- | --- | | `parse_nix_eval_json(text)` | ✅ Catch2 unit (`tests/nixpkgs_probe_parse.cpp`) | | `nixpkgs_probe(name)` | ✅ network-gated (`tests/nixpkgs_probe_live.cpp`); requires `CARGOXX_NETWORK_TESTS=1` | | `scan_imported_targets(text)` | Catch2 unit | | `nix_cmake_scan(tmp)` | Catch2 unit using a fixture tree | | `parse_conanfile(text)` | Catch2 unit; embedded conanfile.py snippets covering both old and new forms | | `parse_vcpkg_usage(text)` | Catch2 unit | | `conan_probe(name)` | network-gated; against `fmt` | | `vcpkg_probe(name)` | network-gated; against `fmt` | | `verify_link` end-to-end | network-gated; uses `simdjson` (small, present in nixpkgs, not in our curated DB) | | `cmd_add` end-to-end on uncurated package | network-gated; full flow on `simdjson` | Failure-mode coverage: - Conan/vcpkg 404 → `ResolutionUnknownPackage` - `nix eval` errors → `ResolutionUnknownPackage` - All probes return candidates that fail to verify-link → record failure, return `ResolutionUnsatisfiable` - `resolution_failures` cache hit → returns the recorded error without re-probing ## Definition of done After Phase 6: ```sh nix develop -c cmake --build build && \ ctest --test-dir build --output-on-failure # all unit tests green CARGOXX_NETWORK_TESTS=1 nix develop -c ctest --test-dir build # live tests too ``` Manual smoke (matches the user's request 1–5): ```sh cd /tmp && rm -rf simd-smoke && mkdir simd-smoke && cd simd-smoke ~/cargoxx/build/cargoxx new app && cd app ~/cargoxx/build/cargoxx add simdjson # not in curated; triggers discover # Expected output: # probing nixpkgs#simdjson ... ok (3.x.y) # probing conan-center-index ... ok (cmake_target_name = simdjson::simdjson) # verifying ... ok # Added simdjson 3.x.y (linkdb: conan) ~/cargoxx/build/cargoxx build # ordinary build path now # picks up the freshly cached # overlay row ``` A second `cargoxx add simdjson` in another fresh project hits the overlay directly and returns instantly — proves persistence step (5). ## Risks / known limits - **Network**: Conan + vcpkg probes need outbound HTTPS. The network-gated test layer covers this; the unit tests on pure parsers don't need network. - **Conan recipe shape variation**: ~10 % of recipes use Python conditionals to set `cmake_target_name` per option — text parsing will miss these. Falls through to vcpkg / nix-scan, which is the point of the chain. - **nix-cmake-scan heuristics**: packages without standard `lib/cmake//Config.cmake` layout won't be picked up. Acceptable for v0.2; the manual escape hatch (`cargoxx linkdb add`) covers edge cases. - **Overlay growth**: long-tail packages will accumulate in the user's overlay sqlite. No cleanup in v0.2 — not a concern at human-scale package counts. - **Verify-link slowness**: full `cargoxx build` per candidate. First probe usually wins, so it's typically one build. Worst case: three builds (Conan fail, vcpkg fail, nix-scan ok). Document as expected behavior in the CLI output (`verifying...` progress message).