Auto-resolution for non-curated packages
Status: in progress. Tracks the implementation of cargoxx add <pkg> for
packages that are not in data/linkdb.json. See SPEC.md §9 steps 4–6 for
the contract this implements.
Goal
Today cargoxx add only succeeds for the 25 packages baked into
data/linkdb.json. This work extends cargoxx add <pkg> to fall through
to a discovery chain (a nixpkgs lookup, upstream Conan and vcpkg
metadata, and a scan of the package's nix store path) and, on success,
persist the discovered recipe to the SQLite overlay so subsequent runs
hit the overlay without re-probing.
The user-stated steps:
- confirm the package exists in nixpkgs (nixos-unstable),
- discover its CMake find_package / target rules via Conan, then vcpkg, then by scanning lib/cmake/**/*Config.cmake under the package's nix store path,
- verify the candidate by building an empty program that links the dep,
- record the version (already in hand from step 1's nix eval),
- write the recipe to the overlay so it sticks.
Design decisions
| Decision | Choice | Why |
|---|---|---|
| Verify depth | full cargoxx build of a tmp project | catches link / ABI errors that configure-only would miss (e.g. abseil-cpp's libstdc++ vs libc++ mismatch already exposed by verify-curated-db.sh) |
| Probe order | Conan → vcpkg → nix-cmake-scan; first that passes verification wins; failed candidates fall through | maximizes hit rate without polluting overlay |
| Discovery side-effects | Database::resolve() stays pure (overlay+curated only); a separate Database::discover() does network + verify + persist | preserves the existing test surface; cmd_add orchestrates the chain |
| Failure caching | populate resolution_failures (already in schema) when all probes fail; subsequent retries within 24 h short-circuit | prevents repeated minute-long retries |
| Verification result handling | scaffold tmp project, write provisional overlay row with verified_at = 0, build; on success rewrite verified_at = now; on failure delete the row | overlay only ever holds verified recipes |
Resolution chain
db.resolve(name, version, components)
├─ overlay rows (existing)
├─ curated JSON (existing)
└─ on LinkdbUnknownPackage → cmd_add calls db.discover(name, project_root)
├─ nixpkgs probe: nix eval nixpkgs#<name> for { version, path }
│ fail → resolution_failures, return error
├─ Conan probe: GET conan-center-index/recipes/<name>/all/conanfile.py
│ regex out cmake_target_name + cmake_file_name
├─ vcpkg probe: GET microsoft/vcpkg/ports/<name>/usage
│ parse the literal CMake snippet
├─ nix-cmake-scan: walk <path>/lib/cmake/**/*Config.cmake
│ regex add_library(<name> ... IMPORTED) for targets
│ derive find_package name from the *Config.cmake filename stem
│
├─ for each candidate (in order above):
│ verify_link(candidate, name, version, components, overlay_path)
│ — scaffold tmp project (cmd_new),
│ — provisional overlay row pointing at the candidate,
│ — write empty src/main.cpp,
│ — call cmd_build(no_build = false) to run nix develop -c
│ cmake configure + build,
│ — succeeds → rewrite overlay row with verified_at = now;
│ return Recipe to caller
│ — fails → delete provisional row, try next probe
│
└─ all candidates failed → record to resolution_failures;
return ResolutionUnsatisfiable
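The candidate loop in the chain above can be sketched as follows. This is a minimal illustration, not the project's code: Recipe, Overlay, Probe, Verifier and this free-standing discover() are hypothetical stand-ins, the overlay is an in-memory map rather than SQLite, and the verifier is injected so the control flow can be exercised without a build.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins; the real Recipe and overlay live in src/linkdb/.
struct Recipe {
    std::string find_package;
    std::string target;
    std::string source;         // "conan", "vcpkg", or "nix-probe"
    long long verified_at = 0;  // 0 marks a provisional row
};

using Overlay = std::map<std::string, Recipe>;  // package name -> row
using Probe = std::function<std::optional<Recipe>(const std::string&)>;
using Verifier = std::function<bool(const Recipe&)>;

// Try probes in order; the first candidate that passes verification wins.
// A failed candidate's provisional row is deleted before the next probe runs.
std::optional<Recipe> discover(const std::string& name,
                               const std::vector<Probe>& probes,
                               const Verifier& verify,
                               Overlay& overlay,
                               long long now) {
    for (const auto& probe : probes) {
        std::optional<Recipe> candidate = probe(name);
        if (!candidate) continue;             // this probe found nothing
        candidate->verified_at = 0;           // provisional until the build passes
        overlay[name] = *candidate;
        if (verify(*candidate)) {
            overlay[name].verified_at = now;  // promote to a verified row
            return overlay[name];
        }
        overlay.erase(name);                  // never keep an unverified row
    }
    return std::nullopt;  // caller records the failure in resolution_failures
}
```

Passing the verifier as a parameter is a choice made for this sketch; it lets a unit test substitute a predicate for the full tmp-project build.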
File layout
src/resolver/
├── resolver.cppm # public API surface for all resolver helpers
├── nixpkgs_probe.cpp # ✅ Phase 1 (committed: 1c7ff39)
├── nix_cmake_scan.cpp # ✅ Phase 2 (committed: e63ac69)
├── conan_probe.cpp # Phase 3
├── vcpkg_probe.cpp # Phase 4
└── verify_link.cpp # Phase 5
Database::discover and the cmd_add wire-up land in Phase 6 by editing
src/linkdb/curated.cpp, src/linkdb/overlay.cpp, and
src/cli/cmd_add.cpp.
The deferred files in TECH_SPEC.md §1 (nixhub.cpp, lazamar.cpp,
nixpkgs_git.cpp) belong to a separate feature — the version resolver
that picks a concrete version from a range. Out of scope here.
Critical files (re-)used
| File | Why |
|---|---|
| src/linkdb/linkdb.cppm | extend with Database::discover() declaration |
| src/linkdb/curated.cpp:158 | Database::resolve already does overlay → curated; discovery is not folded in here, kept side-effect free |
| src/linkdb/overlay.cpp | split overlay_insert_manual → overlay_insert_recipe(row, source) so non-manual sources are persistable; add overlay_delete_recipe; add overlay_record_failure for resolution_failures |
| src/cli/cmd_add.cpp:48 | after db->resolve(...) returns LinkdbUnknownPackage, call db->discover(name, project_root) and use the returned recipe |
| src/exec/exec.cppm, src/exec/subprocess.cpp | reuse exec::run for nix eval and curl; no new tooling, just new call sites |
| src/util/util.cppm | reuse ResolutionUnknownPackage (E40), ResolutionNetworkError (E41), ResolutionUnsatisfiable (E42); no new error codes |
| src/cli/cmd_build.cpp | called by verify_link.cpp; takes overlay_path and project_root; no signature change needed |
| scripts/verify-curated-db.sh | conceptual template for the verify_link flow: same pattern as that script, in code form |
Probe specs
A. nixpkgs_probe (✅ done — Phase 1, 1c7ff39)
nix eval nixpkgs#<pkg> --json --apply 'p: { version = p.version or ""; path = p.outPath; }'
- --extra-experimental-features 'nix-command flakes' baked into the call so it works without user-side nix.conf flags.
- 60 s ExecOptions.timeout.
- Failure modes: missing attribute (stderr has "does not provide attribute") → ResolutionUnknownPackage; otherwise ResolutionNetworkError.
- Returned: NixpkgsInfo { attr, version, out_path }.
- Field name must be path, not outPath. nix's --json mode coerces any attrset containing outPath to a bare-string derivation reference, which would lose the version field.
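A small sketch of how the call and its failure classification could be expressed. The function names nix_eval_argv and classify_nix_eval_failure and the ResolveError enum are hypothetical, chosen for illustration; only the flags, the --apply expression and the stderr marker come from the notes above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mirror of the two error codes this probe can return.
enum class ResolveError { UnknownPackage, NetworkError };

// Build the argument vector for the probe. The experimental-features flag is
// passed explicitly so the call does not depend on the user's nix.conf.
std::vector<std::string> nix_eval_argv(const std::string& pkg) {
    return {
        "nix", "--extra-experimental-features", "nix-command flakes",
        "eval", "nixpkgs#" + pkg, "--json", "--apply",
        // The field is named `path` on purpose: an attrset that contains
        // `outPath` would be serialized as a bare string and lose `version`.
        "p: { version = p.version or \"\"; path = p.outPath; }",
    };
}

// Map a failed `nix eval` to an error code by inspecting stderr.
ResolveError classify_nix_eval_failure(const std::string& stderr_text) {
    if (stderr_text.find("does not provide attribute") != std::string::npos)
        return ResolveError::UnknownPackage;
    return ResolveError::NetworkError;  // every other failure
}
```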
B. nix_cmake_scan (✅ done; Phase 2, e63ac69)
- Walk <out_path>/lib/cmake/ recursively.
- For each <X>Config.cmake or <X>-config.cmake:
  - find_package name = stem <X>.
  - Read file. Regex add_library\(([^ ]+)\s+(STATIC|SHARED|INTERFACE|UNKNOWN)\s+IMPORTED\) to extract IMPORTED targets.
  - Also pick up add_library(<alias> ALIAS <real>) so the canonical <alias>::<sub> form gets detected.
- Pick best candidate, in priority order:
  - case-insensitive equality between stem and package_name,
  - prefix match,
  - first config with non-empty target list.
- Returns NixCmakeCandidate { find_package, targets, config_file } or ResolutionUnknownPackage.
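The target extraction and candidate ranking above can be sketched with std::regex. This is an illustration under assumptions: the Config struct and pick_best are hypothetical names, the ALIAS pattern is my own, and "prefix match" is read here as "the stem starts with the package name", which the notes do not spell out.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <regex>
#include <string>
#include <vector>

// Collect IMPORTED and ALIAS target names from the text of a *Config.cmake file.
std::vector<std::string> scan_imported_targets(const std::string& text) {
    static const std::regex imported(
        R"re(add_library\(([^ ]+)\s+(STATIC|SHARED|INTERFACE|UNKNOWN)\s+IMPORTED\))re");
    static const std::regex alias(R"re(add_library\(([^ ]+)\s+ALIAS\s+[^ )]+\))re");
    std::vector<std::string> targets;
    for (const std::regex* re : {&imported, &alias}) {
        for (std::sregex_iterator it(text.begin(), text.end(), *re), end; it != end; ++it)
            targets.push_back((*it)[1].str());  // group 1 is the target name
    }
    return targets;
}

// Hypothetical per-file scan result: filename stem plus the targets it declares.
struct Config {
    std::string stem;
    std::vector<std::string> targets;
};

static std::string lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Priority: exact case-insensitive stem match, then prefix match (assumed
// direction: stem starts with the package name), then the first config with
// a non-empty target list. Returns an index into configs, or -1.
int pick_best(const std::vector<Config>& configs, const std::string& package_name) {
    const std::string want = lower(package_name);
    int prefix = -1;
    int any = -1;
    for (int i = 0; i < static_cast<int>(configs.size()); ++i) {
        const std::string stem = lower(configs[i].stem);
        if (stem == want) return i;
        if (prefix < 0 && stem.compare(0, want.size(), want) == 0) prefix = i;
        if (any < 0 && !configs[i].targets.empty()) any = i;
    }
    return prefix >= 0 ? prefix : any;
}
```

The quoted IMPORTED pattern requires the closing parenthesis directly after IMPORTED, so a declaration such as IMPORTED GLOBAL would not match in this sketch.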
C. Conan probe (Phase 3)
- Text-only: never executes Python. SPEC §14 mandates this.
- curl -fsSL https://raw.githubusercontent.com/conan-io/conan-center-index/master/recipes/<pkg>/all/conanfile.py.
- Regex cmake_target_name\s*=\s*['"]([^'"]+)['"] and same for cmake_file_name. Handle both cpp_info.set_property("cmake_target_name", ...) and the legacy self.cpp_info.names["cmake"] = "..." forms.
- Pure parser exposed as parse_conanfile(text); the network adapter wraps curl via exec::run.
- 404 → ResolutionUnknownPackage; transport errors → ResolutionNetworkError.
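A minimal sketch of a text-only parser covering the two forms named above. The ConanNames struct, the exact patterns, and the rule that a legacy name X implies the target X::X are assumptions made for illustration; the patterns target the set_property call form and the names[...] assignment form rather than reproducing the regex quoted in the list.

```cpp
#include <cassert>
#include <optional>
#include <regex>
#include <string>

// Hypothetical result type for the pure parser.
struct ConanNames {
    std::optional<std::string> target_name;  // e.g. "fmt::fmt"
    std::optional<std::string> file_name;    // name passed to find_package
};

// Return capture group 1 of the first match, if any.
static std::optional<std::string> first_capture(const std::string& text,
                                                const std::regex& re) {
    std::smatch m;
    if (std::regex_search(text, m, re)) return m[1].str();
    return std::nullopt;
}

ConanNames parse_conanfile(const std::string& text) {
    // Modern form: cpp_info.set_property("cmake_target_name", "fmt::fmt")
    static const std::regex target_prop(
        R"re(set_property\(\s*["']cmake_target_name["']\s*,\s*["']([^"']+)["'])re");
    static const std::regex file_prop(
        R"re(set_property\(\s*["']cmake_file_name["']\s*,\s*["']([^"']+)["'])re");
    // Legacy form: self.cpp_info.names["cmake..."] = "Name"
    static const std::regex legacy_names(
        R"re(cpp_info\.names\[\s*["']cmake[^"']*["']\s*\]\s*=\s*["']([^"']+)["'])re");

    ConanNames out;
    out.target_name = first_capture(text, target_prop);
    out.file_name = first_capture(text, file_prop);
    if (const auto legacy = first_capture(text, legacy_names)) {
        // Assumption: legacy generators expose the target as <Name>::<Name>.
        if (!out.file_name) out.file_name = *legacy;
        if (!out.target_name) out.target_name = *legacy + "::" + *legacy;
    }
    return out;
}
```

Because only string literals are matched, names computed by Python conditionals come back empty and the chain falls through to the next probe.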
D. vcpkg probe (Phase 4)
- curl -fsSL https://raw.githubusercontent.com/microsoft/vcpkg/master/ports/<pkg>/usage.
- The file is plain text that embeds a literal CMake snippet. Extract the first find_package(<name> ...) line and any target_link_libraries(... <pkg>::...) lines.
- Pure parser exposed as parse_vcpkg_usage(text).
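One way the pure parser could look, as a sketch. The VcpkgUsage struct and the patterns are assumptions for illustration; in particular, any namespaced token inside a target_link_libraries(...) call is treated as a target, and generator expressions containing parentheses are not handled.

```cpp
#include <algorithm>
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical result type for the pure parser.
struct VcpkgUsage {
    std::string find_package;          // empty when no find_package(...) line exists
    std::vector<std::string> targets;  // namespaced targets, e.g. "fmt::fmt"
};

VcpkgUsage parse_vcpkg_usage(const std::string& text) {
    static const std::regex find_re(R"re(find_package\(\s*([-A-Za-z0-9_.+]+))re");
    static const std::regex link_re(R"re(target_link_libraries\(([^)]*)\))re");
    static const std::regex target_re(R"re([-A-Za-z0-9_.+]+::[-A-Za-z0-9_.+]+)re");

    VcpkgUsage out;
    std::smatch m;
    if (std::regex_search(text, m, find_re)) out.find_package = m[1].str();  // first only

    const std::sregex_iterator end;
    for (std::sregex_iterator call(text.begin(), text.end(), link_re); call != end; ++call) {
        const std::string args = (*call)[1].str();  // text between the parentheses
        for (std::sregex_iterator t(args.begin(), args.end(), target_re); t != end; ++t) {
            const std::string name = t->str();
            // Keep first-seen order and drop duplicates.
            if (std::find(out.targets.begin(), out.targets.end(), name) == out.targets.end())
                out.targets.push_back(name);
        }
    }
    return out;
}
```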
E. verify_link (Phase 5)
auto verify_link(const Recipe& candidate,
const std::string& name,
const std::string& version_spec,
const std::vector<std::string>& components,
const std::filesystem::path& cargoxx_overlay_path)
-> util::Result<void>;
- Create <tmp>/cargoxx-verify-<name> (mktemp).
- cmd_new(name, /*lib_only=*/false, tmp_parent).
- Insert candidate into cargoxx_overlay_path with the right source and verified_at = 0 (provisional).
- Mutate the scaffolded manifest to declare name with version_spec and components.
- Overwrite src/main.cpp with int main() {} (empty body). The point is to exercise find_package + target_link_libraries + linker, not to call any specific API (which would require per-package knowledge).
- Call cmd_build(tmp_proj, no_build=false, release=false, target=nullopt, overlay_path=cargoxx_overlay_path).
- On success: rewrite the overlay row with verified_at = now(), return {}.
- On failure: delete the provisional row, return the build error.
- Always: std::filesystem::remove_all(tmp_dir) (RAII helper).
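The "always clean up" step lends itself to a small RAII guard. The sketch below is an assumption about how such a helper might look: ScopedTempDir and write_empty_main are hypothetical names, and a random suffix stands in for mktemp.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <random>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Creates <tmp>/cargoxx-verify-<name>-<random> and removes it on scope exit,
// whether verification succeeded, failed, or threw.
class ScopedTempDir {
public:
    explicit ScopedTempDir(const std::string& name) {
        std::random_device rd;  // stand-in for mktemp-style uniqueness
        path_ = fs::temp_directory_path() /
                ("cargoxx-verify-" + name + "-" + std::to_string(rd()));
        fs::create_directories(path_);
    }
    ~ScopedTempDir() {
        std::error_code ec;  // non-throwing overload: destructors must not throw
        fs::remove_all(path_, ec);
    }
    ScopedTempDir(const ScopedTempDir&) = delete;
    ScopedTempDir& operator=(const ScopedTempDir&) = delete;

    const fs::path& path() const { return path_; }

private:
    fs::path path_;
};

// Write the empty translation unit used for the link check.
void write_empty_main(const fs::path& project_root) {
    fs::create_directories(project_root / "src");
    std::ofstream out(project_root / "src" / "main.cpp");
    out << "int main() {}\n";
}
```

Holding the guard as a local in verify_link would cover both the success and failure returns without explicit cleanup calls on each path.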
Persistence semantics
| Probe path | source column | verified_at | TTL (existing overlay_is_fresh) |
|---|---|---|---|
| Conan probe verified | conan | now | 30 days |
| vcpkg probe verified | vcpkg | now | 30 days |
| nix-cmake-scan verified | nix-probe | now | 30 days |
| Manual via linkdb add | manual | now | never expires |
resolution_failures populated only when all probes fail. Subsequent
cargoxx add calls within 24 h skip probing and return the cached error.
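The freshness rules in the table and the 24 h failure cache reduce to two small predicates. This is a sketch, not the existing overlay_is_fresh: the function names, the use of Unix seconds, and the strict "less than" boundary are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

constexpr std::int64_t kDay = 24 * 60 * 60;     // seconds
constexpr std::int64_t kRecipeTtl = 30 * kDay;  // conan, vcpkg, nix-probe rows
constexpr std::int64_t kFailureTtl = kDay;      // resolution_failures rows

// A recipe row is usable when it is verified and either manual or within TTL.
bool recipe_is_fresh(const std::string& source, std::int64_t verified_at,
                     std::int64_t now) {
    if (verified_at == 0) return false;   // provisional row, never trusted
    if (source == "manual") return true;  // manual rows never expire
    return now - verified_at < kRecipeTtl;
}

// A recorded failure short-circuits probing for 24 h after it was written.
bool failure_is_cached(std::int64_t failed_at, std::int64_t now) {
    return now - failed_at < kFailureTtl;
}
```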
Phasing (one commit per phase)
| Phase | Status | Commit |
|---|---|---|
| 1. nixpkgs_probe + JSON parser | ✅ | 1c7ff39 |
| 2. nix_cmake_scan | ✅ | e63ac69 |
| 3. conan_probe + parse_conanfile | ✅ | (this commit) |
| 4. vcpkg_probe + parse_vcpkg_usage | pending | — |
| 5. verify_link (tmp project + cmd_build) | pending | — |
| 6. Database::discover + cmd_add wire-up + failure caching | pending | — |
Testing strategy
| Test | Mechanism |
|---|---|
| parse_nix_eval_json(text) | ✅ Catch2 unit (tests/nixpkgs_probe_parse.cpp) |
| nixpkgs_probe(name) | ✅ network-gated (tests/nixpkgs_probe_live.cpp); requires CARGOXX_NETWORK_TESTS=1 |
| scan_imported_targets(text) | Catch2 unit |
| nix_cmake_scan(tmp) | Catch2 unit using a fixture tree |
| parse_conanfile(text) | Catch2 unit; embedded conanfile.py snippets covering both old and new forms |
| parse_vcpkg_usage(text) | Catch2 unit |
| conan_probe(name) | network-gated; against fmt |
| vcpkg_probe(name) | network-gated; against fmt |
| verify_link end-to-end | network-gated; uses simdjson (small, present in nixpkgs, not in our curated DB) |
| cmd_add end-to-end on uncurated package | network-gated; full flow on simdjson |
Failure-mode coverage:
- Conan/vcpkg 404 → ResolutionUnknownPackage
- nix eval errors → ResolutionUnknownPackage
- All probes return candidates that fail to verify-link → record failure, return ResolutionUnsatisfiable
- resolution_failures cache hit → returns the recorded error without re-probing
Definition of done
After Phase 6:
nix develop -c cmake --build build && \
ctest --test-dir build --output-on-failure # all unit tests green
CARGOXX_NETWORK_TESTS=1 nix develop -c ctest --test-dir build # live tests too
Manual smoke (matches the user's request 1–5):
cd /tmp && rm -rf simd-smoke && mkdir simd-smoke && cd simd-smoke
~/cargoxx/build/cargoxx new app && cd app
~/cargoxx/build/cargoxx add simdjson # not in curated; triggers discover
# Expected output:
# probing nixpkgs#simdjson ... ok (3.x.y)
# probing conan-center-index ... ok (cmake_target_name = simdjson::simdjson)
# verifying ... ok
# Added simdjson 3.x.y (linkdb: conan)
~/cargoxx/build/cargoxx build # ordinary build path now
# picks up the freshly cached
# overlay row
A second cargoxx add simdjson in another fresh project hits the overlay
directly and returns instantly — proves persistence step (5).
Risks / known limits
- Network: Conan + vcpkg probes need outbound HTTPS. The network-gated test layer covers this; the unit tests on pure parsers don't need network.
- Conan recipe shape variation: ~10 % of recipes use Python conditionals to set cmake_target_name per option; text parsing will miss these. Falls through to vcpkg / nix-scan, which is the point of the chain.
- nix-cmake-scan heuristics: packages without standard lib/cmake/<X>/<X>Config.cmake layout won't be picked up. Acceptable for v0.2; the manual escape hatch (cargoxx linkdb add) covers edge cases.
- Overlay growth: long-tail packages will accumulate in the user's overlay sqlite. No cleanup in v0.2; not a concern at human-scale package counts.
- Verify-link slowness: full cargoxx build per candidate. First probe usually wins, so it's typically one build. Worst case: three builds (Conan fail, vcpkg fail, nix-scan ok). Document as expected behavior in the CLI output (verifying... progress message).