Auto-resolution for non-curated packages

Status: in progress. Tracks the implementation of cargoxx add <pkg> for packages that are not in data/linkdb.json. See SPEC.md §9 step 46 for the contract this implements.

Goal

Today cargoxx add only succeeds for the 25 packages baked into data/linkdb.json. This work extends cargoxx add <pkg> to fall through to the user's local machine and, on success, persist the discovered recipe to the SQLite overlay so subsequent runs are instant.

The user-stated steps:

  1. confirm the package exists in nixpkgs (nixos-unstable),
  2. discover its CMake find_package / target rules via Conan, then vcpkg, then by scanning lib/cmake/**/*Config.cmake under the package's nix store path,
  3. verify the candidate by building an empty program that links the dep,
  4. record the version (already in hand from step 1's nix eval),
  5. write the recipe to the overlay so it sticks.

Design decisions

| Decision | Choice | Why |
|---|---|---|
| Verify depth | full cargoxx build of a tmp project | catches link / ABI errors that configure-only would miss (e.g. abseil-cpp's libstdc++ vs libc++ mismatch already exposed by verify-curated-db.sh) |
| Probe order | Conan → vcpkg → nix-cmake-scan; first that passes verification wins; failed candidates fall through | maximizes hit rate without polluting overlay |
| Discovery side-effects | Database::resolve() stays pure (overlay+curated only); a separate Database::discover() does network + verify + persist | preserves the existing test surface; cmd_add orchestrates the chain |
| Failure caching | populate resolution_failures (already in schema) when all probes fail; subsequent retries within 24 h short-circuit | prevents repeated minute-long retries |
| Verification result handling | scaffold tmp project, write provisional overlay row with verified_at = 0, build; on success rewrite verified_at = now; on failure delete the row | overlay only ever holds verified recipes |

Resolution chain

db.resolve(name, version, components)
   ├─ overlay rows (existing)
   ├─ curated JSON   (existing)
   └─ on LinkdbUnknownPackage → cmd_add calls db.discover(name, project_root)
       ├─ nixpkgs probe: nix eval nixpkgs#<name> for { version, path }
       │     fail → resolution_failures, return error
       ├─ Conan probe: GET conan-center-index/recipes/<name>/all/conanfile.py
       │     regex out cmake_target_name + cmake_file_name
       ├─ vcpkg probe: GET microsoft/vcpkg/ports/<name>/usage
       │     parse the literal CMake snippet
       ├─ nix-cmake-scan: walk <path>/lib/cmake/**/*Config.cmake
       │     regex add_library(<name> ... IMPORTED) for targets
       │     derive find_package name from the *Config.cmake filename stem
       │
       ├─ for each candidate (in order above):
       │     verify_link(candidate, name, version, components, overlay_path)
       │       — scaffold tmp project (cmd_new),
       │       — provisional overlay row pointing at the candidate,
       │       — write empty src/main.cpp,
       │       — call cmd_build(no_build = false) to run nix develop -c
       │         cmake configure + build,
       │       — succeeds → rewrite overlay row with verified_at = now;
       │                    return Recipe to caller
       │       — fails  → delete provisional row, try next probe
       │
       └─ all candidates failed → record to resolution_failures;
          return ResolutionUnsatisfiable
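
The fall-through part of the chain can be sketched in a few lines. This is a minimal illustration, not the project's code: `Recipe`, `Probe`, `Verifier`, and `discover_first_verified` are hypothetical names, and the real `Database::discover()` also handles the nixpkgs probe, the provisional overlay row, and failure caching.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the project's Recipe type.
struct Recipe {
    std::string source;                // "conan", "vcpkg", or "nix-probe"
    std::string find_package;          // name passed to find_package()
    std::vector<std::string> targets;  // CMake targets to link
};

// A probe yields a candidate or nothing; a verifier accepts or rejects it.
using Probe = std::function<std::optional<Recipe>(const std::string&)>;
using Verifier = std::function<bool(const Recipe&)>;

// Try the probes in order and return the first candidate that verifies.
// An empty result corresponds to the "all candidates failed" branch.
std::optional<Recipe> discover_first_verified(const std::string& name,
                                              const std::vector<Probe>& probes,
                                              const Verifier& verify) {
    for (const auto& probe : probes) {
        std::optional<Recipe> candidate = probe(name);
        if (!candidate) continue;                  // probe found nothing
        if (verify(*candidate)) return candidate;  // first verified candidate wins
        // Verification failed: fall through to the next probe.
    }
    return std::nullopt;
}
```

Passing the probes as an ordered list keeps the Conan → vcpkg → nix-cmake-scan priority in one place, and lets unit tests substitute fake probes and a fake verifier.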

File layout

src/resolver/
├── resolver.cppm        # public API surface for all resolver helpers
├── nixpkgs_probe.cpp    # ✅ Phase 1 (committed: 1c7ff39)
├── nix_cmake_scan.cpp   # ✅ Phase 2 (committed: e63ac69)
├── conan_probe.cpp      # ✅ Phase 3 (committed: e5c173b)
├── vcpkg_probe.cpp      # ✅ Phase 4 (committed: 941d5b3)
└── verify_link.cpp      # Phase 5

Database::discover and the cmd_add wire-up land in Phase 6 by editing src/linkdb/curated.cpp, src/linkdb/overlay.cpp, and src/cli/cmd_add.cpp.

The deferred files in TECH_SPEC.md §1 (nixhub.cpp, lazamar.cpp, nixpkgs_git.cpp) belong to a separate feature — the version resolver that picks a concrete version from a range. Out of scope here.

Critical files (re-)used

| File | Why |
|---|---|
| src/linkdb/linkdb.cppm | extend with Database::discover() declaration |
| src/linkdb/curated.cpp:158 | Database::resolve already does overlay → curated; discovery is not folded in here, so it stays side-effect free |
| src/linkdb/overlay.cpp | split overlay_insert_manual into overlay_insert_recipe(row, source) so non-manual sources are persistable; add overlay_delete_recipe; add overlay_record_failure for resolution_failures |
| src/cli/cmd_add.cpp:48 | after db->resolve(...) returns LinkdbUnknownPackage, call db->discover(name, project_root) and use the returned recipe |
| src/exec/exec.cppm, src/exec/subprocess.cpp | reuse exec::run for nix eval and curl; no new tooling, just new call sites |
| src/util/util.cppm | reuse ResolutionUnknownPackage (E40), ResolutionNetworkError (E41), ResolutionUnsatisfiable (E42); no new error codes |
| src/cli/cmd_build.cpp | called by verify_link.cpp; takes overlay_path and project_root; no signature change needed |
| scripts/verify-curated-db.sh | conceptual template for the verify_link flow: the same pattern as that script, in code form |

Probe specs

A. nixpkgs_probe (done, Phase 1, 1c7ff39)

nix eval nixpkgs#<pkg> --json --apply 'p: { version = p.version or ""; path = p.outPath; }'
  • --extra-experimental-features 'nix-command flakes' baked into the call so it works without user-side nix.conf flags.
  • 60 s ExecOptions.timeout.
  • Failure modes: missing attribute (stderr has does not provide attribute) → ResolutionUnknownPackage; otherwise ResolutionNetworkError.
  • Returned: NixpkgsInfo { attr, version, out_path }.
  • Field name must be path, not outPath. nix's --json mode coerces any attrset containing outPath to a bare-string derivation reference, which would lose the version field.

B. nix_cmake_scan (Phase 2, e63ac69)

  • Walk <out_path>/lib/cmake/ recursively.
  • For each <X>Config.cmake or <X>-config.cmake:
    • find_package name = stem <X>.
    • Read file. Regex add_library\(([^ ]+)\s+(STATIC|SHARED|INTERFACE|UNKNOWN)\s+IMPORTED\) to extract IMPORTED targets.
    • Also pick up add_library(<alias> ALIAS <real>) so the canonical <alias>::<sub> form gets detected.
  • Pick best candidate:
    1. case-insensitive equality between stem and package_name,
    2. prefix match,
    3. first config with non-empty target list.
  • Returns NixCmakeCandidate { find_package, targets, config_file } or ResolutionUnknownPackage.
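
The target extraction and ranking steps above could look roughly like the following. It is a sketch under stated assumptions, not the committed code. The regex is slightly relaxed compared with the one in the spec so that trailing keywords such as GLOBAL still match. "Prefix match" is assumed to mean that either name is a prefix of the other. ALIAS handling and the config_file field are omitted.

```cpp
#include <algorithm>
#include <cctype>
#include <optional>
#include <regex>
#include <string>
#include <vector>

// Simplified candidate: the real struct also carries config_file.
struct NixCmakeCandidate {
    std::string find_package;          // stem of <X>Config.cmake
    std::vector<std::string> targets;  // IMPORTED targets found in the file
};

// Extract IMPORTED target names from the text of a *Config.cmake file.
std::vector<std::string> scan_imported_targets(const std::string& text) {
    static const std::regex re(
        R"re(add_library\s*\(\s*([^\s)]+)\s+(STATIC|SHARED|INTERFACE|UNKNOWN)\s+IMPORTED\b)re");
    std::vector<std::string> out;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it)
        out.push_back((*it)[1].str());
    return out;
}

static std::string lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Rank: case-insensitive equality, then prefix match, then the first
// candidate with a non-empty target list.
std::optional<NixCmakeCandidate> pick_best(const std::vector<NixCmakeCandidate>& cs,
                                           const std::string& package_name) {
    const std::string pkg = lower(package_name);
    const NixCmakeCandidate* prefix = nullptr;
    const NixCmakeCandidate* any = nullptr;
    for (const auto& c : cs) {
        const std::string stem = lower(c.find_package);
        if (stem == pkg) return c;
        // Assumption: a prefix in either direction counts as a match.
        if (!prefix && (stem.rfind(pkg, 0) == 0 || pkg.rfind(stem, 0) == 0)) prefix = &c;
        if (!any && !c.targets.empty()) any = &c;
    }
    if (prefix) return *prefix;
    if (any) return *any;
    return std::nullopt;
}
```

Keeping `scan_imported_targets` a pure function of the file text is what allows the Catch2 unit test listed in the testing strategy to run without a nix store.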

C. Conan probe (Phase 3)

  • Text-only — never executes Python. SPEC §14 mandates this.
  • curl -fsSL https://raw.githubusercontent.com/conan-io/conan-center-index/master/recipes/<pkg>/all/conanfile.py.
  • Regex set_property\(\s*['"]cmake_target_name['"]\s*,\s*['"]([^'"]+)['"] and the same for cmake_file_name (in set_property the value follows a comma, not an = sign). Also handle the legacy self.cpp_info.names["cmake"] = "..." form with a separate pattern.
  • Pure parser exposed as parse_conanfile(text); the network adapter wraps curl via exec::run.
  • 404 → ResolutionUnknownPackage; transport errors → ResolutionNetworkError.
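
A text-only parser along these lines might look like the sketch below. The `ConanNames` struct and the helper names are hypothetical, so the actual return type of parse_conanfile may differ. It matches the set_property call form and the legacy names["cmake"] assignment with separate patterns, and never evaluates the Python.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical result type for this sketch.
struct ConanNames {
    std::optional<std::string> cmake_target_name;  // from set_property(...)
    std::optional<std::string> cmake_file_name;    // from set_property(...)
    std::optional<std::string> legacy_cmake_name;  // from names["cmake"] = ...
};

// Return the first capture group of the first match, if any.
static std::optional<std::string> first_capture(const std::string& text,
                                                const std::string& pattern) {
    const std::regex re(pattern);
    std::smatch m;
    if (std::regex_search(text, m, re)) return m[1].str();
    return std::nullopt;
}

// Pattern for set_property("<key>", "<value>") with either quote style.
static std::string set_property_pattern(const std::string& key) {
    return R"re(set_property\(\s*['"])re" + key + R"re(['"]\s*,\s*['"]([^'"]+)['"])re";
}

ConanNames parse_conanfile(const std::string& text) {
    ConanNames out;
    out.cmake_target_name = first_capture(text, set_property_pattern("cmake_target_name"));
    out.cmake_file_name = first_capture(text, set_property_pattern("cmake_file_name"));
    out.legacy_cmake_name = first_capture(
        text, R"re(names\[\s*['"]cmake['"]\s*\]\s*=\s*['"]([^'"]+)['"])re");
    return out;
}
```

Recipes that compute the value in Python (for example inside a conditional or an f-string) produce no match here, which is the case where the chain falls through to the next probe.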

D. vcpkg probe (Phase 4)

  • curl -fsSL https://raw.githubusercontent.com/microsoft/vcpkg/master/ports/<pkg>/usage.
  • The file is plain text containing a literal CMake snippet. Extract the first find_package(<name> ...) line and any target_link_libraries(... <pkg>::...) lines.
  • Pure parser exposed as parse_vcpkg_usage(text).
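
One possible shape for this parser is sketched below. The `VcpkgUsage` struct is a hypothetical result type, and the sketch assumes that every namespaced token (`a::b`) inside a target_link_libraries call is a link target. It does not treat generator expressions specially.

```cpp
#include <optional>
#include <regex>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct VcpkgUsage {
    std::string find_package;          // first find_package() name in the file
    std::vector<std::string> targets;  // namespaced targets, in file order
};

std::optional<VcpkgUsage> parse_vcpkg_usage(const std::string& text) {
    static const std::regex find_re(R"re(find_package\s*\(\s*([A-Za-z0-9_.+-]+))re");
    static const std::regex link_re(R"re(target_link_libraries\s*\(([^)]*)\))re");
    static const std::regex target_re(R"re([A-Za-z0-9_.+-]+::[A-Za-z0-9_.+:-]+)re");

    std::smatch m;
    if (!std::regex_search(text, m, find_re)) return std::nullopt;
    VcpkgUsage usage{m[1].str(), {}};

    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), link_re); it != end; ++it) {
        // Arguments of one target_link_libraries(...) call.
        const std::string args = (*it)[1].str();
        for (std::sregex_iterator t(args.begin(), args.end(), target_re); t != end; ++t)
            usage.targets.push_back(t->str());
    }
    return usage;
}
```

A usage file that lists several alternative targets yields all of them. Choosing which ones to link is left to the caller.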

E. verify_link (Phase 5)

auto verify_link(const Recipe& candidate,
                 const std::string& name,
                 const std::string& version_spec,
                 const std::vector<std::string>& components,
                 const std::filesystem::path& cargoxx_overlay_path)
    -> util::Result<void>;
  • Create <tmp>/cargoxx-verify-<name> (mktemp).
  • cmd_new(name, /*lib_only=*/false, tmp_parent).
  • Insert candidate into cargoxx_overlay_path with the right source and verified_at = 0 (provisional).
  • Mutate the scaffolded manifest to declare name with version_spec and components.
  • Overwrite src/main.cpp with int main() {} — empty body. The point is to exercise find_package + target_link_libraries + linker, not to call any specific API (which would require per-package knowledge).
  • Call cmd_build(tmp_proj, no_build=false, release=false, target=nullopt, overlay_path=cargoxx_overlay_path).
  • On success: rewrite the overlay row with verified_at = now(), return {}.
  • On failure: delete the provisional row, return the build error.
  • Always: std::filesystem::remove_all(tmp_dir) (RAII helper).
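
The "always clean up" step is a natural fit for a small RAII guard. The class below is a minimal sketch, and `ScopedTempDir` is a hypothetical name. It uses a random suffix from std::random_device instead of mktemp, which is simpler but not race-free the way mkdtemp is.

```cpp
#include <filesystem>
#include <random>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Creates <tmp>/cargoxx-verify-<name>-<random> and removes it on scope exit.
class ScopedTempDir {
public:
    explicit ScopedTempDir(const std::string& name) {
        std::random_device rd;
        path_ = fs::temp_directory_path() /
                ("cargoxx-verify-" + name + "-" + std::to_string(rd()));
        fs::create_directories(path_);
    }

    // Use the error_code overload so the destructor never throws.
    ~ScopedTempDir() {
        std::error_code ec;
        fs::remove_all(path_, ec);
    }

    ScopedTempDir(const ScopedTempDir&) = delete;
    ScopedTempDir& operator=(const ScopedTempDir&) = delete;

    const fs::path& path() const { return path_; }

private:
    fs::path path_;
};
```

With a guard like this, both the success and failure paths of verify_link can return early without a separate cleanup call.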

Persistence semantics

| Probe path | source column | verified_at | TTL (existing overlay_is_fresh) |
|---|---|---|---|
| Conan probe verified | conan | now | 30 days |
| vcpkg probe verified | vcpkg | now | 30 days |
| nix-cmake-scan verified | nix-probe | now | 30 days |
| Manual via linkdb add | manual | now | never expires |

resolution_failures is populated only when discovery fails as a whole: the nixpkgs probe fails, or every candidate fails verification. Subsequent cargoxx add calls within 24 h skip probing and return the cached error.
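
The freshness rules can be expressed as two small predicates. This is a hypothetical sketch of the rules as written in this section, not the existing overlay_is_fresh. It assumes timestamps are Unix seconds and that an entry stops being fresh exactly at the TTL boundary.

```cpp
#include <cstdint>
#include <string_view>

constexpr std::int64_t kSecondsPerDay = 24 * 60 * 60;

// True if an overlay row may be used without re-probing.
bool recipe_is_fresh(std::string_view source, std::int64_t verified_at, std::int64_t now) {
    if (verified_at == 0) return false;   // provisional row, never verified
    if (source == "manual") return true;  // manual rows never expire
    return now - verified_at < 30 * kSecondsPerDay;
}

// True if a recorded failure should short-circuit a new probe attempt.
bool failure_is_cached(std::int64_t failed_at, std::int64_t now) {
    return now - failed_at < kSecondsPerDay;
}
```

Treating `verified_at == 0` as never fresh means a provisional row left behind by a crashed verification run would not be served as a verified recipe.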

Phasing (one commit per phase)

| Phase | Status | Commit |
|---|---|---|
| 1. nixpkgs_probe + JSON parser | done | 1c7ff39 |
| 2. nix_cmake_scan | done | e63ac69 |
| 3. conan_probe + parse_conanfile | done | e5c173b |
| 4. vcpkg_probe + parse_vcpkg_usage | done | 941d5b3 |
| 5. verify_link (tmp project + cmd_build) | in progress | (this commit) |
| 6. Database::discover + cmd_add wire-up + failure caching | pending | |

Testing strategy

| Test | Mechanism |
|---|---|
| parse_nix_eval_json(text) | Catch2 unit (tests/nixpkgs_probe_parse.cpp) |
| nixpkgs_probe(name) | network-gated (tests/nixpkgs_probe_live.cpp); requires CARGOXX_NETWORK_TESTS=1 |
| scan_imported_targets(text) | Catch2 unit |
| nix_cmake_scan(tmp) | Catch2 unit using a fixture tree |
| parse_conanfile(text) | Catch2 unit; embedded conanfile.py snippets covering both old and new forms |
| parse_vcpkg_usage(text) | Catch2 unit |
| conan_probe(name) | network-gated; against fmt |
| vcpkg_probe(name) | network-gated; against fmt |
| verify_link end-to-end | network-gated; uses simdjson (small, present in nixpkgs, not in our curated DB) |
| cmd_add end-to-end on uncurated package | network-gated; full flow on simdjson |

Failure-mode coverage:

  • Conan/vcpkg 404 → ResolutionUnknownPackage
  • nix eval missing attribute → ResolutionUnknownPackage; other nix eval errors → ResolutionNetworkError
  • All probes return candidates that fail to verify-link → record failure, return ResolutionUnsatisfiable
  • resolution_failures cache hit → returns the recorded error without re-probing
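
For the Conan and vcpkg cases, the curl exit status is the only signal when the probes call `curl -fsSL`. The sketch below shows one way to map it. `FetchOutcome` and `classify_curl_exit` are hypothetical names. With `-f`, curl exits with code 22 for any HTTP status of 400 or above, so this sketch assumes that every such response is treated like a 404. Telling a 404 apart from a 5xx would need the status code itself.

```cpp
// Hypothetical stand-in for the project's result and error types.
enum class FetchOutcome { Ok, UnknownPackage, NetworkError };

// Map a `curl -fsSL` exit status to a probe outcome.
FetchOutcome classify_curl_exit(int exit_code) {
    // curl's exit code for "HTTP page not retrieved" when -f is given.
    constexpr int kHttpReturnedError = 22;
    if (exit_code == 0) return FetchOutcome::Ok;
    // Assumption: any HTTP error is read as "no such recipe/port".
    if (exit_code == kHttpReturnedError) return FetchOutcome::UnknownPackage;
    // DNS failures, timeouts, TLS errors, and so on.
    return FetchOutcome::NetworkError;
}
```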

Definition of done

After Phase 6:

nix develop -c cmake --build build && \
  ctest --test-dir build --output-on-failure                   # all unit tests green
CARGOXX_NETWORK_TESTS=1 nix develop -c ctest --test-dir build  # live tests too

Manual smoke (matches the user's steps 1-5):

cd /tmp && rm -rf simd-smoke && mkdir simd-smoke && cd simd-smoke
~/cargoxx/build/cargoxx new app && cd app
~/cargoxx/build/cargoxx add simdjson         # not in curated; triggers discover
# Expected output:
#   probing nixpkgs#simdjson ... ok (3.x.y)
#   probing conan-center-index ... ok (cmake_target_name = simdjson::simdjson)
#   verifying ... ok
#   Added simdjson 3.x.y (linkdb: conan)
~/cargoxx/build/cargoxx build                # ordinary build path now
                                             # picks up the freshly cached
                                             # overlay row

A second cargoxx add simdjson in another fresh project hits the overlay directly and returns instantly — proves persistence step (5).

Risks / known limits

  • Network: Conan + vcpkg probes need outbound HTTPS. The network-gated test layer covers this; the unit tests on pure parsers don't need network.
  • Conan recipe shape variation: ~10 % of recipes use Python conditionals to set cmake_target_name per option — text parsing will miss these. Falls through to vcpkg / nix-scan, which is the point of the chain.
  • nix-cmake-scan heuristics: packages without standard lib/cmake/<X>/<X>Config.cmake layout won't be picked up. Acceptable for v0.2; the manual escape hatch (cargoxx linkdb add) covers edge cases.
  • Overlay growth: long-tail packages will accumulate in the user's overlay sqlite. No cleanup in v0.2 — not a concern at human-scale package counts.
  • Verify-link slowness: full cargoxx build per candidate. First probe usually wins, so it's typically one build. Worst case: three builds (Conan fail, vcpkg fail, nix-scan ok). Document as expected behavior in the CLI output (verifying... progress message).