[M5+] add resolver::nix_cmake_scan
273
docs/auto-resolution.md
Normal file
@@ -0,0 +1,273 @@
# Auto-resolution for non-curated packages

Status: in progress. Tracks the implementation of `cargoxx add <pkg>` for
packages that are not in `data/linkdb.json`. See `SPEC.md` §9 steps 4–6 for
the contract this implements.

## Goal

Today `cargoxx add` only succeeds for the 25 packages baked into
`data/linkdb.json`. This work extends `cargoxx add <pkg>` to fall through
to discovery on the user's local machine and, on success, persist the
discovered recipe to the SQLite overlay so subsequent runs are instant.

The user-stated steps:

1. confirm the package exists in `nixpkgs` (`nixos-unstable`),
2. discover its CMake `find_package` / target rules via Conan, then vcpkg,
   then by scanning `lib/cmake/**/*Config.cmake` under the package's nix
   store path,
3. verify the candidate by building an empty program that links the dep,
4. record the version (already in hand from step 1's `nix eval`),
5. write the recipe to the overlay so it sticks.

## Design decisions

| Decision | Choice | Why |
| --- | --- | --- |
| Verify depth | full `cargoxx build` of a tmp project | catches link / ABI errors that configure-only would miss (e.g. abseil-cpp's libstdc++ vs libc++ mismatch already exposed by `verify-curated-db.sh`) |
| Probe order | Conan → vcpkg → nix-cmake-scan; first that *passes verification* wins; failed candidates fall through | maximizes hit rate without polluting the overlay |
| Discovery side-effects | `Database::resolve()` stays pure (overlay+curated only); a separate `Database::discover()` does network + verify + persist | preserves the existing test surface; `cmd_add` orchestrates the chain |
| Failure caching | populate `resolution_failures` (already in schema) when *all* probes fail; subsequent retries within 24 h short-circuit | prevents repeated minute-long retries |
| Verification result handling | scaffold tmp project, write provisional overlay row with `verified_at = 0`, build; on success rewrite `verified_at = now`; on failure delete the row | once verification finishes, the overlay holds only verified recipes |

## Resolution chain

```
db.resolve(name, version, components)
├─ overlay rows (existing)
├─ curated JSON (existing)
└─ on LinkdbUnknownPackage → cmd_add calls db.discover(name, project_root)
   ├─ nixpkgs probe: nix eval nixpkgs#<name> for { version, path }
   │     fail → resolution_failures, return error
   ├─ Conan probe: GET conan-center-index/recipes/<name>/all/conanfile.py
   │     regex out cmake_target_name + cmake_file_name
   ├─ vcpkg probe: GET microsoft/vcpkg/ports/<name>/usage
   │     parse the literal CMake snippet
   ├─ nix-cmake-scan: walk <path>/lib/cmake/**/*Config.cmake
   │     regex add_library(<name> ... IMPORTED) for targets
   │     derive find_package name from the *Config.cmake filename stem
   │
   ├─ for each candidate (in order above):
   │     verify_link(candidate, name, version, components, overlay_path)
   │       - scaffold tmp project (cmd_new),
   │       - provisional overlay row pointing at the candidate,
   │       - write empty src/main.cpp,
   │       - call cmd_build(no_build = false) to run nix develop -c
   │         cmake configure + build,
   │       - succeeds → rewrite overlay row with verified_at = now;
   │         return Recipe to caller
   │       - fails → delete provisional row, try next probe
   │
   └─ all candidates failed → record to resolution_failures;
      return ResolutionUnsatisfiable
```
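
The "first candidate that passes verification wins" rule in the chain above can be sketched as a small loop. This is only an illustration under stated assumptions: `Candidate`, `Probe`, `Verifier`, and `first_verified` are hypothetical names, not the project's API, and the real flow also touches the overlay and `resolution_failures`.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the real Recipe type.
struct Candidate {
    std::string source;        // "conan", "vcpkg", or "nix-probe"
    std::string find_package;  // e.g. "fmt"
    std::string target;        // e.g. "fmt::fmt"
};

// A probe yields a candidate or nothing; a verifier accepts or rejects it.
using Probe = std::function<std::optional<Candidate>(const std::string&)>;
using Verifier = std::function<bool(const Candidate&)>;

// Returns the first candidate that passes verification, in probe order.
// A probe that yields nothing, or whose candidate fails to verify, falls
// through to the next one. std::nullopt means every probe was exhausted
// (the ResolutionUnsatisfiable case in the diagram).
std::optional<Candidate> first_verified(const std::string& name,
                                        const std::vector<Probe>& probes,
                                        const Verifier& verify) {
    for (const auto& probe : probes) {
        std::optional<Candidate> candidate = probe(name);
        if (candidate && verify(*candidate)) {
            return candidate;
        }
    }
    return std::nullopt;
}
```

Keeping probes and the verifier as plain callables makes the ordering testable without network access or a real build.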

## File layout

```
src/resolver/
├── resolver.cppm        # public API surface for all resolver helpers
├── nixpkgs_probe.cpp    # ✅ Phase 1 (committed: 1c7ff39)
├── nix_cmake_scan.cpp   # Phase 2
├── conan_probe.cpp      # Phase 3
├── vcpkg_probe.cpp      # Phase 4
└── verify_link.cpp      # Phase 5
```

`Database::discover` and the `cmd_add` wire-up land in Phase 6 by editing
`src/linkdb/curated.cpp`, `src/linkdb/overlay.cpp`, and
`src/cli/cmd_add.cpp`.

The deferred files in `TECH_SPEC.md` §1 (`nixhub.cpp`, `lazamar.cpp`,
`nixpkgs_git.cpp`) belong to a separate feature: the *version* resolver
that picks a concrete version from a range. They are out of scope here.

## Critical files (re-)used

| File | Why |
| --- | --- |
| `src/linkdb/linkdb.cppm` | extend with `Database::discover()` declaration |
| `src/linkdb/curated.cpp:158` | `Database::resolve` already does overlay → curated; discovery is *not* folded in here, kept side-effect free |
| `src/linkdb/overlay.cpp` | split `overlay_insert_manual` → `overlay_insert_recipe(row, source)` so non-`manual` sources are persistable; add `overlay_delete_recipe`; add `overlay_record_failure` for `resolution_failures` |
| `src/cli/cmd_add.cpp:48` | after `db->resolve(...)` returns `LinkdbUnknownPackage`, call `db->discover(name, project_root)` and use the returned recipe |
| `src/exec/exec.cppm`, `src/exec/subprocess.cpp` | reuse `exec::run` for `nix eval` and `curl`; no new tooling, just new call sites |
| `src/util/util.cppm` | reuse `ResolutionUnknownPackage` (E40), `ResolutionNetworkError` (E41), `ResolutionUnsatisfiable` (E42); no new error codes |
| `src/cli/cmd_build.cpp` | called by `verify_link.cpp`; takes `overlay_path` and `project_root`; no signature change needed |
| `scripts/verify-curated-db.sh` | conceptual template for the `verify_link` flow: same pattern as that script, in code form |

## Probe specs

### A. nixpkgs_probe (✅ done: Phase 1, 1c7ff39)

```
nix eval nixpkgs#<pkg> --json --apply 'p: { version = p.version or ""; path = p.outPath; }'
```

- `--extra-experimental-features 'nix-command flakes'` baked into the call
  so it works without user-side `nix.conf` flags.
- 60 s `ExecOptions.timeout`.
- Failure modes: missing attribute (`stderr` has `does not provide attribute`)
  → `ResolutionUnknownPackage`; otherwise `ResolutionNetworkError`.
- Returned: `NixpkgsInfo { attr, version, out_path }`.
- Field name **must** be `path`, not `outPath`. nix's `--json` mode coerces
  any attrset containing `outPath` to a bare-string derivation reference,
  which would lose the `version` field.
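
As a rough illustration of the call and the failure mapping described above, the sketch below builds the argument vector and classifies a failed run from its stderr text. The names `nix_eval_argv`, `classify_nix_eval_failure`, and `ProbeError` are hypothetical; how `exec::run` actually receives its arguments is not shown in this document.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical error kinds mirroring ResolutionUnknownPackage and
// ResolutionNetworkError.
enum class ProbeError { UnknownPackage, NetworkError };

// Builds the argv for the eval call. Passing arguments as a vector (rather
// than one shell string) avoids quoting problems in the --apply expression.
std::vector<std::string> nix_eval_argv(const std::string& pkg) {
    return {
        "nix", "eval",
        "--extra-experimental-features", "nix-command flakes",
        "nixpkgs#" + pkg,
        "--json",
        "--apply", "p: { version = p.version or \"\"; path = p.outPath; }",
    };
}

// Maps a failed eval to an error kind by inspecting its stderr output:
// a missing attribute means the package is unknown, anything else is
// treated as a network-class failure.
ProbeError classify_nix_eval_failure(std::string_view stderr_text) {
    constexpr std::string_view missing = "does not provide attribute";
    return stderr_text.find(missing) != std::string_view::npos
               ? ProbeError::UnknownPackage
               : ProbeError::NetworkError;
}
```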

### B. nix_cmake_scan (Phase 2, next)

- Walk `<out_path>/lib/cmake/` recursively.
- For each `<X>Config.cmake` or `<X>-config.cmake`:
  - `find_package` name = stem `<X>`.
  - Read file. Regex
    `add_library\(([^ ]+)\s+(STATIC|SHARED|INTERFACE|UNKNOWN)\s+IMPORTED\)`
    to extract IMPORTED targets.
  - Also pick up `add_library(<alias> ALIAS <real>)` so the canonical
    `<alias>::<sub>` form gets detected.
- Pick best candidate:
  1. case-insensitive equality between stem and `package_name`,
  2. prefix match,
  3. first config with non-empty target list.
- Returns `NixCmakeCandidate { find_package, targets, config_file }` or
  `ResolutionUnknownPackage`.
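
A minimal sketch of the target extraction and the ranking rules above, using `std::regex` with the pattern exactly as written in this spec. `ConfigFile`, `rank`, and `pick_best` are hypothetical names, and "prefix match" is assumed here to mean that the stem starts with the package name; ALIAS handling and the directory walk are left out.

```cpp
#include <algorithm>
#include <cctype>
#include <regex>
#include <string>
#include <vector>

// Extracts IMPORTED target names from the text of a *Config.cmake file.
std::vector<std::string> scan_imported_targets(const std::string& text) {
    static const std::regex imported(
        R"re(add_library\(([^ ]+)\s+(STATIC|SHARED|INTERFACE|UNKNOWN)\s+IMPORTED\))re");
    std::vector<std::string> targets;
    for (std::sregex_iterator it(text.begin(), text.end(), imported), end;
         it != end; ++it) {
        targets.push_back((*it)[1].str());
    }
    return targets;
}

std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Hypothetical view of one scanned config file.
struct ConfigFile {
    std::string stem;                  // <X> from <X>Config.cmake
    std::vector<std::string> targets;  // result of scan_imported_targets
};

// Lower is better: 0 = exact stem match, 1 = prefix match,
// 2 = any config with targets, 3 = unusable.
int rank(const ConfigFile& cfg, const std::string& package_name) {
    const std::string stem = to_lower(cfg.stem);
    const std::string pkg = to_lower(package_name);
    if (stem == pkg) return 0;
    if (stem.rfind(pkg, 0) == 0) return 1;  // assumption: stem starts with pkg
    return cfg.targets.empty() ? 3 : 2;
}

// Returns the best-ranked config (earliest wins ties), or nullptr if none
// qualifies.
const ConfigFile* pick_best(const std::vector<ConfigFile>& configs,
                            const std::string& package_name) {
    const ConfigFile* best = nullptr;
    int best_rank = 3;
    for (const auto& cfg : configs) {
        const int r = rank(cfg, package_name);
        if (r < best_rank) {
            best = &cfg;
            best_rank = r;
        }
    }
    return best;
}
```

Because the comparison is a strict `<`, the first config seen at a given rank is kept, which matches the "first config with non-empty target list" rule.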

### C. Conan probe (Phase 3)

- Text-only: never executes Python. SPEC §14 mandates this.
- `curl -fsSL https://raw.githubusercontent.com/conan-io/conan-center-index/master/recipes/<pkg>/all/conanfile.py`.
- Regex `cmake_target_name\s*=\s*['"]([^'"]+)['"]` and same for
  `cmake_file_name`. Handle both `cpp_info.set_property("cmake_target_name", ...)`
  and the legacy `self.cpp_info.names["cmake"] = "..."` forms. Note that
  `set_property` passes key and value as two quoted arguments separated by
  a comma, so the pattern must accept `",` between key and value as well
  as `=`.
- Pure parser exposed as `parse_conanfile(text)`; the network adapter
  wraps `curl` via `exec::run`.
- 404 → `ResolutionUnknownPackage`; transport errors → `ResolutionNetworkError`.
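
The sketch below shows one way a text-only `parse_conanfile` could cover both forms with `std::regex`. The `ConanNames` struct, its field names, and the helper functions are assumptions made for illustration. The legacy pattern accepts any `names[...]` key starting with `cmake`, since recipes in the wild commonly use keys such as `cmake_find_package`.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical result type; the real parser may return something richer.
struct ConanNames {
    std::optional<std::string> target_name;  // e.g. "fmt::fmt"
    std::optional<std::string> file_name;    // e.g. "fmt"
    std::optional<std::string> legacy_name;  // from cpp_info.names[...]
};

// First capture group of `pattern` in `text`, if the pattern matches.
std::optional<std::string> first_capture(const std::string& text,
                                         const std::string& pattern) {
    std::smatch m;
    if (std::regex_search(text, m, std::regex(pattern))) return m[1].str();
    return std::nullopt;
}

// Looks up `key` in the set_property("key", "value") form first, then in
// the plain `key = "value"` form.
std::optional<std::string> find_property(const std::string& text,
                                         const std::string& key) {
    const std::string quote = R"re(['"])re";
    const std::string value = R"re(['"]([^'"]+)['"])re";
    if (auto v = first_capture(text, "set_property\\(\\s*" + quote + key +
                                         quote + "\\s*,\\s*" + value)) {
        return v;
    }
    return first_capture(text, key + "\\s*=\\s*" + value);
}

ConanNames parse_conanfile(const std::string& text) {
    ConanNames out;
    out.target_name = find_property(text, "cmake_target_name");
    out.file_name = find_property(text, "cmake_file_name");
    out.legacy_name = first_capture(
        text,
        R"re(cpp_info\.names\[\s*['"]cmake\w*['"]\s*\]\s*=\s*['"]([^'"]+)['"])re");
    return out;
}
```

Values set through Python conditionals or variables are not visible to this kind of parser; those recipes simply yield no candidate.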

### D. vcpkg probe (Phase 4)

- `curl -fsSL https://raw.githubusercontent.com/microsoft/vcpkg/master/ports/<pkg>/usage`.
- The file is plain text containing a literal CMake snippet. Extract the
  first `find_package(<name> ...)` line and any
  `target_link_libraries(... <pkg>::...)` lines.
- Pure parser exposed as `parse_vcpkg_usage(text)`.
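
One possible shape for `parse_vcpkg_usage`, sketched with `std::regex`. The `VcpkgUsage` struct is a hypothetical return type, and as a simplification the sketch collects every namespaced `ns::name` token inside `target_link_libraries(...)` rather than only those prefixed with the package name.

```cpp
#include <algorithm>
#include <optional>
#include <regex>
#include <string>
#include <vector>

// Hypothetical result type for the usage-file parser.
struct VcpkgUsage {
    std::optional<std::string> find_package;  // first find_package(<name> ...)
    std::vector<std::string> targets;         // unique, in order of appearance
};

VcpkgUsage parse_vcpkg_usage(const std::string& text) {
    VcpkgUsage out;

    // First argument of the first find_package(...) call.
    static const std::regex find_re(R"re(find_package\(\s*([A-Za-z0-9_.+-]+))re");
    std::smatch m;
    if (std::regex_search(text, m, find_re)) out.find_package = m[1].str();

    // Every target_link_libraries(...) call, then each ns::name token in it.
    static const std::regex link_re(R"re(target_link_libraries\(([^)]*)\))re");
    static const std::regex target_re(R"re([A-Za-z0-9_.+-]+::[A-Za-z0-9_.+-]+)re");
    const std::sregex_iterator end;
    for (std::sregex_iterator call(text.begin(), text.end(), link_re);
         call != end; ++call) {
        const std::string args = (*call)[1].str();
        for (std::sregex_iterator t(args.begin(), args.end(), target_re);
             t != end; ++t) {
            const std::string name = t->str();
            if (std::find(out.targets.begin(), out.targets.end(), name) ==
                out.targets.end()) {
                out.targets.push_back(name);
            }
        }
    }
    return out;
}
```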

### E. verify_link (Phase 5)

```cpp
auto verify_link(const Recipe& candidate,
                 const std::string& name,
                 const std::string& version_spec,
                 const std::vector<std::string>& components,
                 const std::filesystem::path& cargoxx_overlay_path)
    -> util::Result<void>;
```

- Create `<tmp>/cargoxx-verify-<name>` (mktemp).
- `cmd_new(name, /*lib_only=*/false, tmp_parent)`.
- Insert `candidate` into `cargoxx_overlay_path` with the right `source`
  and `verified_at = 0` (provisional).
- Mutate the scaffolded manifest to declare `name` with `version_spec`
  and `components`.
- Overwrite `src/main.cpp` with `int main() {}`, an empty body. The point
  is to exercise find_package + target_link_libraries + linker, *not* to
  call any specific API (which would require per-package knowledge).
- Call `cmd_build(tmp_proj, no_build=false, release=false,
  target=nullopt, overlay_path=cargoxx_overlay_path)`.
- On success: rewrite the overlay row with `verified_at = now()`,
  return `{}`.
- On failure: delete the provisional row, return the build error.
- Always: `std::filesystem::remove_all(tmp_dir)` (RAII helper).
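
The "always clean up" step and the empty `main.cpp` could look roughly like the sketch below. `ScopedDir` and `write_empty_main` are hypothetical helpers; unlike `mktemp`, this sketch takes a caller-chosen path and does not generate a unique name.

```cpp
#include <filesystem>
#include <fstream>
#include <system_error>
#include <utility>

// Owns a directory and removes it recursively when it goes out of scope,
// so cleanup happens on success, failure, and early return alike.
class ScopedDir {
public:
    explicit ScopedDir(std::filesystem::path dir) : dir_(std::move(dir)) {
        std::filesystem::create_directories(dir_);
    }
    ~ScopedDir() {
        std::error_code ec;  // non-throwing overload: destructors must not throw
        std::filesystem::remove_all(dir_, ec);
    }
    ScopedDir(const ScopedDir&) = delete;
    ScopedDir& operator=(const ScopedDir&) = delete;

    const std::filesystem::path& path() const { return dir_; }

private:
    std::filesystem::path dir_;
};

// Writes the empty translation unit used for the link check.
void write_empty_main(const std::filesystem::path& project) {
    std::filesystem::create_directories(project / "src");
    std::ofstream(project / "src" / "main.cpp") << "int main() {}\n";
}
```

Using the `std::error_code` overload of `remove_all` keeps a failed cleanup from masking the build error that `verify_link` is supposed to return.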

## Persistence semantics

| Probe path | `source` column | `verified_at` | TTL (existing `overlay_is_fresh`) |
| --- | --- | --- | --- |
| Conan probe verified | `conan` | now | 30 days |
| vcpkg probe verified | `vcpkg` | now | 30 days |
| nix-cmake-scan verified | `nix-probe` | now | 30 days |
| Manual via `linkdb add` | `manual` | now | never expires |

`resolution_failures` is populated only when **all** probes fail. Subsequent
`cargoxx add` calls within 24 h skip probing and return the cached error.
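
The table and the 24 h failure window can be expressed as two small predicates. This is a sketch, not the existing `overlay_is_fresh`: the function names are hypothetical, timestamps are assumed to be Unix seconds, and a row exactly at the TTL boundary is treated as expired.

```cpp
#include <chrono>
#include <cstdint>
#include <string_view>

constexpr std::chrono::seconds kRecipeTtl = std::chrono::hours(24 * 30);  // 30 days
constexpr std::chrono::seconds kFailureTtl = std::chrono::hours(24);      // 24 h

// True when an overlay row may be used without re-probing.
bool recipe_is_fresh(std::string_view source, std::int64_t verified_at,
                     std::int64_t now) {
    if (verified_at == 0) return false;   // provisional row, never trusted
    if (source == "manual") return true;  // manual rows never expire
    return now - verified_at < kRecipeTtl.count();
}

// True when a recorded failure should short-circuit a new probe attempt.
bool failure_is_cached(std::int64_t failed_at, std::int64_t now) {
    return now - failed_at < kFailureTtl.count();
}
```

Treating `verified_at = 0` as never fresh means a provisional row left behind by an interrupted verification is ignored rather than trusted.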

## Phasing (one commit per phase)

| Phase | Status | Commit |
| --- | --- | --- |
| 1. nixpkgs_probe + JSON parser | ✅ | `1c7ff39` |
| 2. nix_cmake_scan | pending | - |
| 3. conan_probe + parse_conanfile | pending | - |
| 4. vcpkg_probe + parse_vcpkg_usage | pending | - |
| 5. verify_link (tmp project + cmd_build) | pending | - |
| 6. Database::discover + cmd_add wire-up + failure caching | pending | - |

## Testing strategy

| Test | Mechanism |
| --- | --- |
| `parse_nix_eval_json(text)` | ✅ Catch2 unit (`tests/nixpkgs_probe_parse.cpp`) |
| `nixpkgs_probe(name)` | ✅ network-gated (`tests/nixpkgs_probe_live.cpp`); requires `CARGOXX_NETWORK_TESTS=1` |
| `scan_imported_targets(text)` | Catch2 unit |
| `nix_cmake_scan(tmp)` | Catch2 unit using a fixture tree |
| `parse_conanfile(text)` | Catch2 unit; embedded conanfile.py snippets covering both old and new forms |
| `parse_vcpkg_usage(text)` | Catch2 unit |
| `conan_probe(name)` | network-gated; against `fmt` |
| `vcpkg_probe(name)` | network-gated; against `fmt` |
| `verify_link` end-to-end | network-gated; uses `simdjson` (small, present in nixpkgs, not in our curated DB) |
| `cmd_add` end-to-end on uncurated package | network-gated; full flow on `simdjson` |

Failure-mode coverage:

- Conan/vcpkg 404 → `ResolutionUnknownPackage`
- `nix eval` missing attribute → `ResolutionUnknownPackage`; any other
  `nix eval` failure → `ResolutionNetworkError`
- All probes return candidates that fail to verify-link → record failure,
  return `ResolutionUnsatisfiable`
- `resolution_failures` cache hit → returns the recorded error without
  re-probing

## Definition of done

After Phase 6:

```sh
nix develop -c cmake --build build && \
  nix develop -c ctest --test-dir build --output-on-failure       # all unit tests green
CARGOXX_NETWORK_TESTS=1 nix develop -c ctest --test-dir build   # live tests too
```

Manual smoke test (matches steps 1–5 of the user's request):

```sh
cd /tmp && rm -rf simd-smoke && mkdir simd-smoke && cd simd-smoke
~/cargoxx/build/cargoxx new app && cd app
~/cargoxx/build/cargoxx add simdjson        # not in curated; triggers discover
# Expected output:
#   probing nixpkgs#simdjson ... ok (3.x.y)
#   probing conan-center-index ... ok (cmake_target_name = simdjson::simdjson)
#   verifying ... ok
#   Added simdjson 3.x.y (linkdb: conan)
~/cargoxx/build/cargoxx build               # ordinary build path now picks up
                                            # the freshly cached overlay row
```

A second `cargoxx add simdjson` in another fresh project hits the overlay
directly and returns instantly, which proves persistence (step 5).

## Risks / known limits

- **Network**: Conan + vcpkg probes need outbound HTTPS. The
  network-gated test layer covers this; the unit tests on pure parsers
  don't need network.
- **Conan recipe shape variation**: ~10 % of recipes use Python
  conditionals to set `cmake_target_name` per option; text parsing
  will miss these. Such packages fall through to vcpkg / nix-scan, which
  is the point of the chain.
- **nix-cmake-scan heuristics**: packages without the standard
  `lib/cmake/<X>/<X>Config.cmake` layout won't be picked up. Acceptable
  for v0.2; the manual escape hatch (`cargoxx linkdb add`) covers
  edge cases.
- **Overlay growth**: long-tail packages will accumulate in the user's
  overlay sqlite. No cleanup in v0.2; this is not a concern at human-scale
  package counts.
- **Verify-link slowness**: full `cargoxx build` per candidate. The first
  probe usually wins, so it's typically one build. Worst case: three
  builds (Conan fail, vcpkg fail, nix-scan ok). Document this as expected
  behavior in the CLI output (`verifying...` progress message).