# Use-case registry — the machine-checkable definition of abicheck's
# application/library ABI-API change use cases.
#
# This is the SOURCE OF TRUTH behind the human-readable scorecard in
# docs/development/usecase-coverage-evaluation.md. Each entry is validated by
# tests/test_usecase_registry.py, which enforces that:
#   - every entry has a known `axis` and `status`;
#   - `complete` / `partial` / `modeled` entries cite evidence whose paths
#     actually exist in the repo (so coverage claims cannot silently rot);
#   - `partial` / `modeled` / `planned` entries carry a `gap` id, `next_steps`,
#     and a `plan:` pointing at a real plan file under docs/development/plans/;
#   - `by_design_excluded` entries explain themselves in `note`.
#
# To extend coverage: add or update an entry here, point `evidence` at the
# real test/module/example that backs it, and the test keeps everyone honest.
#
# status legend:
#   complete            fully supported and validated
#   partial             works, but with documented caveats/gaps
#   modeled             code/parsers exist but NOT validated end-to-end in CI
#   planned             not implemented; has a tracked plan (gap + next_steps)
#   by_design_excluded  deliberate non-goal (see docs/development/goals.md)
#
# axis legend: change_class | archetype | platform | workflow | reporting | toolchain
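#
# Example entry shape. This is an illustrative sketch only: the id, gap, plan,
# and evidence paths below are hypothetical placeholders, not real registry
# content, and the block is commented out so it is never validated.
#
#   - id: UC-WF-example                 # hypothetical id
#     axis: workflow                    # one of the axis legend values above
#     name: Short human-readable title
#     status: partial                   # one of the status legend values above
#     gap: G99                          # hypothetical; needed for partial/modeled/planned
#     plan: docs/development/plans/g99-example.md   # hypothetical; must be a real plan file
#     evidence:
#       modules: [abicheck/example.py]  # hypothetical; cited paths must exist in the repo
#       tests: [tests/test_example.py]  # hypothetical
#     next_steps: >
#       What remains before this entry can be marked complete.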

schema_version: 1

use_cases:
  # ── change_class ──────────────────────────────────────────────────────────
  - id: UC-CHANGE-taxonomy
    axis: change_class
    name: ABI/API change taxonomy (254 ChangeKinds, 5-tier policy)
    status: complete
    evidence:
      modules: [abicheck/change_registry.py, abicheck/checker_policy.py]
      tests: [tests/test_changekind_completeness.py]
      examples: [examples/ground_truth.json]

  - id: UC-CHANGE-semver-recommendation
    axis: change_class
    name: Release recommendation (semver bump + SONAME action)
    status: complete
    evidence:
      modules: [abicheck/semver.py]
      tests: [tests/test_semver_recommendation.py]

  - id: UC-CHANGE-inline-ns-version
    axis: change_class
    name: Inline-namespace version-stamp normalization (ICU-style)
    status: partial
    gap: G15
    plan: docs/development/plans/g15-inline-namespace-version.md
    evidence:
      modules: [abicheck/versioned_symbol_scheme.py, abicheck/post_processing.py]
      tests: [tests/test_versioned_symbol_scheme.py]
    next_steps: >
      DONE (detector half, field eval P08): the versioned-symbol-scheme recogniser
      emits one advisory `versioned_symbol_scheme_detected` (RISK) when a strong
      majority of removed symbols reappear as added symbols differing only by a
      numeric version token (ICU `u_*_NN`). Additive — it explains the churn and
      never downgrades the artifact-proven removals (authority rule).
      STILL PLANNED: (1) normalize symbol keys *through* the detected token and
      re-diff so the surface collapses to the real delta (ICU 73→74: BREAKING/6288
      → +34/-0), behind an opt-in suppression preset; (2) cross-check the token
      against the SONAME and still surface the soname bump as the relink signal;
      (3) extend tokens to libstdc++ versioned namespaces / Abseil `lts_<date>`.

  # ── archetype ─────────────────────────────────────────────────────────────
  - id: UC-ARCH-c-library
    axis: archetype
    name: Pure-C shared library (extern "C", SONAME-versioned)
    status: complete
    evidence:
      examples: [examples/ground_truth.json]
    note: 35 pure-C example pairs (examples/case*/v1.c).

  - id: UC-ARCH-cpp-library
    axis: archetype
    name: C++ library (templates, vtables, inline namespaces)
    status: complete
    evidence:
      examples: [examples/ground_truth.json]
    note: 52 C++ example pairs (examples/case*/v1.cpp).

  - id: UC-ARCH-plugin
    axis: archetype
    name: Plugin host↔plugin load contract (dlopen)
    status: complete
    note: >
      G5 closed: first-class host-contract check
      (appcompat.check_plugin_host_contract) answers "does plugin v2 still
      satisfy host H's required entrypoints?" — the plugin-load mirror of
      appcompat, reusing its consumer-scoping. Exposed via the `plugin-check`
      CLI (binary or JSON-snapshot inputs, --require / --host-contract manifest)
      and the Python API, with the plugin_abi policy as the default. Validated
      end-to-end through snapshot-driven scenario tests and CLI tests; the
      host-safe-vs-host-breaking distinction (a library-wide BREAKING drop the
      host never resolves stays COMPATIBLE for the host) is asserted. A compiled
      host/plugin binary demo in examples/ remains optional (needs the
      integration toolchain) and is not required for the capability.
    evidence:
      modules: [abicheck/appcompat.py, abicheck/cli_plugin.py]
      tests: [tests/test_workflow_scenarios.py, tests/test_cli_plugin.py]
      docs: [docs/user-guide/plugin-systems.md]

  - id: UC-ARCH-header-only
    axis: archetype
    name: Header-only / inline-only libraries
    status: planned
    gap: G4
    plan: docs/development/plans/g4-header-ast-extractor.md
    evidence:
      modules: [abicheck/dumper_castxml.py]
    next_steps: >
      Add a libclang-based header-AST extractor alongside castxml to unblock
      concept tightening, hidden friends, and user-ctor mangled names
      (dormant fixtures: cases 78/105/106/111).

  - id: UC-ARCH-kernel-btf
    axis: archetype
    name: Kernel / eBPF modules (BTF/CTF)
    status: complete
    note: >
      G6 (kernel half) closed. The "module vs vmlinux BTF" workflow runs
      end-to-end through compare: real BTF bytes → parse_btf_from_bytes →
      layout detectors → BREAKING, with a CTF mirror scenario
      (parse_ctf_from_bytes). A committed BTF-blob example
      (examples/case121_kernel_btf_struct_field_added) carries v1.btf/v2.btf and
      a ground_truth.json entry; resolve_input now ingests bare BTF/CTF blobs by
      magic so `abicheck compare a.btf b.btf` works toolchain-free. A real
      `.BTF`-section fixture is validated via `gcc -gbtf`
      (tests/test_btf_integration.py, integration-marked). Full kernel-module
      __ksymtab namespace analysis beyond BTF type layout stays out of scope.
    evidence:
      modules: [abicheck/btf_metadata.py, abicheck/ctf_metadata.py, abicheck/service.py]
      tests: [tests/test_btf_metadata.py, tests/test_workflow_kernel_accel.py, tests/test_btf_integration.py]
      examples: [examples/case121_kernel_btf_struct_field_added]
      docs: [docs/user-guide/kernel-btf.md]

  - id: UC-ARCH-sycl
    axis: archetype
    name: SYCL / heterogeneous accelerator stacks (PI/UR)
    status: complete
    note: >
      G6 (accelerator half) closed. SYCL plugin-interface detection is driven
      through the standard compare + report path at the workflow level for BOTH
      interface generations: a dropped PI entrypoint (libpi_*.so) and a dropped
      UR adapter entrypoint (libur_adapter_*.so) each → BREAKING and reach the
      JSON/Markdown reports (tests/test_workflow_kernel_accel.py). CUDA device
      code (.cubin/PTX) stays deferred by design.
    evidence:
      modules: [abicheck/sycl_metadata.py, abicheck/diff_sycl.py]
      tests: [tests/test_diff_sycl.py, tests/test_workflow_kernel_accel.py]
      examples: [examples/case82_sycl_overload_set_removed]
      docs: [docs/user-guide/kernel-btf.md]

  - id: UC-ARCH-static-lib
    axis: archetype
    name: Static libraries (.a / .lib)
    status: by_design_excluded
    note: >
      G8 decision (option A): static/import library archives are a non-goal.
      abicheck compares single linkable images (shared libraries + objects); an
      `ar` archive (.a / .lib, magic `!<arch>\n`) has no runtime ABI surface
      (no SONAME, no dynamic symbol table, no symbol versioning). The CLI now
      detects archives and fails with actionable guidance (extract members or
      compare the shared library) instead of a misleading "unknown format"
      error. See docs/development/goals.md (Non-goals),
      docs/concepts/limitations.md, and the plan
      docs/development/plans/g8-static-libraries.md.
    evidence:
      modules: [abicheck/binary_utils.py, abicheck/service.py]
      tests: [tests/test_compare_input_modes.py, tests/test_service_unit.py]
      docs: [docs/concepts/limitations.md]

  - id: UC-ARCH-ffi-consumers
    axis: archetype
    name: FFI consumers in other languages (Rust/Go/Python)
    status: by_design_excluded
    note: >
      The C ABI such consumers bind to is covered; first-class support for
      non-C/C++ languages is a stated non-goal (docs/development/goals.md).

  # ── platform ──────────────────────────────────────────────────────────────
  - id: UC-PLAT-linux-elf
    axis: platform
    name: Linux ELF (the CI-validated baseline)
    status: complete
    evidence:
      modules: [abicheck/elf_metadata.py]
      examples: [examples/ground_truth.json]

  - id: UC-PLAT-windows-pe
    axis: platform
    name: Windows PE/COFF + PDB (MSVC / MinGW)
    status: complete
    note: >
      G1 (Windows half) closed. The `compare` workflow is validated end-to-end
      on native PE binaries in CI: the `cross-platform-e2e` lane builds DLLs
      with MinGW gcc and drives `abicheck compare` directly (binary↔binary),
      asserting BREAKING on a removed export and COMPATIBLE on identical builds
      (tests/test_cross_platform_integration.py). The MSVC+PDB lane
      (windows-msvc) now asserts both struct-growth (PDB layout) and
      exported-function removal (PE export table) verdicts. PDB struct-layout
      extraction depth on some MSVC versions remains best-effort (tests skip
      rather than fail when a layout can't be parsed), and the example-catalog
      platform tags stay a deliberate subset of Linux (see
      tests/test_platform_coverage_honesty.py).
    evidence:
      modules: [abicheck/pe_metadata.py, abicheck/pdb_metadata.py]
      tests: [tests/test_pe_metadata_unit.py, tests/test_pdb_metadata.py, tests/test_msvc_pdb_e2e.py, tests/test_cross_platform_integration.py]
      docs: [docs/reference/platforms.md]

  - id: UC-PLAT-macos-macho
    axis: platform
    name: macOS Mach-O (x86-64 / ARM64)
    status: complete
    note: >
      G1 (macOS half) closed. The `cross-platform-e2e` CI lane (macos-latest)
      builds .dylib files with Apple clang and drives `abicheck compare`
      directly on native Mach-O binaries, asserting BREAKING on a removed
      export and COMPATIBLE on identical builds
      (tests/test_cross_platform_integration.py). AArch64 AAPCS64 by-value
      aggregate passing (HFA/HVA, the 16-byte small-struct → indirect boundary)
      is modeled by macho_metadata.classify_aapcs64_aggregate and unit-tested in
      tests/test_macos_arm64_abi.py; the calling-convention divergence from SysV
      x86-64 is documented in docs/reference/platforms.md. Example-catalog
      platform tags remain a deliberate subset of Linux
      (tests/test_platform_coverage_honesty.py).
    evidence:
      modules: [abicheck/macho_metadata.py]
      tests: [tests/test_macho_metadata_unit.py, tests/test_macos_arm64_abi.py, tests/test_cross_platform_integration.py]
      docs: [docs/reference/platforms.md]

  - id: UC-PLAT-arch-guard
    axis: platform
    name: Cross-architecture comparison guardrail (ELF e_machine)
    status: planned
    gap: G13
    plan: docs/development/plans/g13-arch-mismatch-guard.md
    next_steps: >
      Capture ELF e_machine / EI_CLASS / endianness into the snapshot (PE and
      Mach-O already carry a machine field) and treat a machine mismatch in
      compare/compare-release as a hard ARCHITECTURE_MISMATCH guard instead of a
      false-green COMPATIBLE_WITH_RISK verdict. Confirmed by empirical scanning:
      comparing x86-64 and aarch64 builds of the same version currently reports
      100% binary-compatible.

  # ── workflow ──────────────────────────────────────────────────────────────
  - id: UC-WF-compare
    axis: workflow
    name: Pairwise compare (CI PR gate)
    status: complete
    evidence:
      tests: [tests/test_reporter.py]
      examples: [examples/ground_truth.json]

  - id: UC-WF-appcompat
    axis: workflow
    name: Application compatibility (consumer-scoped)
    status: complete
    evidence:
      modules: [abicheck/appcompat.py]
      tests: [tests/test_appcompat.py, tests/test_workflow_scenarios.py]

  - id: UC-WF-baseline
    axis: workflow
    name: Baseline pinning / registry
    status: complete
    evidence:
      modules: [abicheck/baseline.py]
      tests: [tests/test_baseline.py]

  - id: UC-WF-debian-symbols
    axis: workflow
    name: Debian symbols file generate/validate/diff
    status: complete
    evidence:
      modules: [abicheck/debian_symbols.py]
      tests: [tests/test_debian_symbols.py]

  - id: UC-WF-abicc-compat
    axis: workflow
    name: ABICC drop-in replacement
    status: complete
    evidence:
      modules: [abicheck/compat/cli.py]
      tests: [tests/test_abicc_parity.py]

  - id: UC-WF-stack-deps
    axis: workflow
    name: Full-stack dependency / sysroot validation
    status: complete
    evidence:
      modules: [abicheck/cli_stack.py, abicheck/resolver.py]
      tests: [tests/test_stack_checker.py]

  - id: UC-WF-mcp
    axis: workflow
    name: MCP server (AI-agent integration)
    status: complete
    evidence:
      modules: [abicheck/mcp_server.py]
      tests: [tests/test_mcp_server_unit.py]
    note: Unit-tested with mocks; no live client/server integration test.

  - id: UC-WF-bundle
    axis: workflow
    name: Multi-library bundle / cohort analysis
    status: complete
    note: >
      Linux/ELF only by design (ADR-018/ADR-023 — no DT_NEEDED/.gnu.version_*
      equivalent elsewhere); cross-platform bundle analysis is tracked under
      G1 / UC-PLAT-*. All bundle detectors run through compare-release; case84
      is validated end-to-end. bundle_soname_skew is opt-in via
      --bundle-cohort PREFIX so independent libraries are never inferred to be
      co-versioned from their filenames. Caveat: wheels whose vendored
      dependencies carry auditwheel/delocate content-hash sonames are not yet
      paired across rebuilds — that topology is tracked separately as
      UC-WF-wheel-vendored (G9).
    evidence:
      modules: [abicheck/bundle.py, abicheck/diff_cpp_patterns.py]
      tests: [tests/test_bundle.py, tests/test_cpp_pattern_detectors.py]
      examples: [examples/case84_bundle_soname_skew]

  - id: UC-WF-oneshot-deep
    axis: workflow
    name: One-shot deep compare (auto-collect L3–L5) + CLI usability
    status: partial
    gap: G21
    plan: docs/development/plans/g21-oneshot-deep-compare.md
    evidence:
      modules: [abicheck/cli.py, abicheck/cli_help.py, abicheck/cli_dump_helpers.py]
      tests: [tests/test_depth_vocabulary.py, tests/test_cov95_cli.py]
    next_steps: >
      Superseded: the standalone `deep-compare` orchestrator was removed and
      folded into `compare`. `compare old.so new.so --old-sources ./src1
      --new-sources ./src2 --max` collects L3-L5 inline from each raw source
      tree (a pre-built `collect` pack is accepted too) and embeds it in the
      per-side snapshot before comparing — the one-shot deep route, now on
      `compare`. The text below is retained as historical context for G21.
      Shipped in PR #422: the --depth headers|build|graph|source|full dial
      (--max = --depth full) on dump reusing the scan_levels vocabulary/mapping;
      the one-shot `deep-compare` orchestrator (cli_max.py) that dumps both sides
      with --sources at --depth then compares, collapsing the six-stage
      dump→collect→merge→compare sequence the oneDAL eval required into one
      command; option collapsing (rich-click groups M1, presets M2, vocab
      unification M5); a cross-platform --gcc-option; and a fail-loud signal on
      an empty requested layer (including empty-PARTIAL). The strict-mode honesty
      half (empty requested L4 => skipped) also shipped. Remaining (deferred as
      explicit opt-in per the P09 "warn, don't guess" design): header/source
      auto-discovery (G21.2) and compile_commands.json auto-synthesis (G21.6) —
      both guess inputs, so they must stay opt-in flags, not defaults.

  - id: UC-WF-cli-contract
    axis: workflow
    name: CLI interface contract, config balance, and extension policy
    status: complete
    gap: G22
    plan: docs/development/plans/g22-cli-consolidation.md
    evidence:
      modules: [abicheck/api_types.py, abicheck/service.py, abicheck/cli_options.py, abicheck/buildsource/inline.py]
      tests: [tests/test_cli_contract.py, tests/test_api_types.py, tests/test_config_rebalance.py]
      docs: [docs/development/adr/037-cli-interface-contract.md, docs/development/plans/g22-cli-consolidation.md]
    next_steps: >
      ADR-037. Formalises the CLI/API as three tiers (core/service/front-end)
      with the service layer as the only chokepoint, typed CompareRequest
      dataclasses (mirroring ADR-035 ScanRequest), one decorator per shared
      option family, a single --depth vocabulary (dropping the "evidence"
      naming and the user-facing L5 graph rung), folding compare-release and
      deep-compare into compare, renaming --header-backend to --ast-frontend,
      a CLI/config rebalance into .abicheck.yml, an explicit exit-code scheme,
      and a cli-contract CI gate. Backward-compat mechanism designed but left
      advisory until 1.0 (pre-1.0, breaking invocations is acceptable now).
      Phased in plans/g22-cli-consolidation.md (P1 chokepoint + typed request,
      P2 decorators, P3 depth vocab, P4 command fold, P5 config rebalance,
      P6 ast-frontend/MCP-name-map/validation, P7 deprecation scaffolding).

  - id: UC-WF-wheel-vendored
    axis: workflow
    name: manylinux/auditwheel vendored-library pairing (hashed sonames)
    status: planned
    gap: G9
    plan: docs/development/plans/g9-wheel-vendored-matching.md
    next_steps: >
      Normalize the auditwheel/delocate content-hash suffix (-[0-9a-f]{6,16})
      on filename AND soname before pairing in compare-release; pair on the
      unhashed soname stem so a bundled libpng16.so.16 matches across rebuilds.
      Add a two-wheel fixture. Empirically confirmed: today every vendored dep
      of a manylinux wheel shows as removed+added.

  - id: UC-WF-stable-abi-subset
    axis: workflow
    name: CPython Limited-API / abi3 import-contract conformance
    status: planned
    gap: G14
    plan: docs/development/plans/g14-stable-abi-subset.md
    next_steps: >
      An abi3 wheel's compatibility surface is the set of CPython C-API symbols
      it IMPORTS, not what it exports. Add a check that classifies an extension's
      imported Py* symbols against the stable-ABI allowlist for a target
      Py_LIMITED_API floor. Empirically confirmed on cryptography 42→43: abicheck
      returns COMPATIBLE from the export table while the imported surface grows
      +7 Py* symbols, entirely invisible. A symbol outside abi3 or newer than the
      floor would fail to import on older Python yet stay COMPATIBLE today.

  - id: UC-WF-audit
    axis: workflow
    name: Single-binary ABI audit / lint (no baseline)
    status: planned
    gap: G11
    plan: docs/development/plans/g11-single-binary-audit.md
    next_steps: >
      Add an `audit`/`dump --lint` driver running the single-snapshot subset of
      detectors (executable stack, insecure RPATH/RUNPATH, missing SONAME,
      unversioned symbols, internal-looking globals) so a library author can
      scan before their first release. The substrate is already on the snapshot
      model; only the one-sided driver and hygiene rules are missing (F3).

  - id: UC-WF-security-hardening
    axis: workflow
    name: Security-hardening drift scan (checksec across releases)
    status: complete
    note: >
      G12 closed. The ELF snapshot now captures the full checksec-equivalent
      surface — RELRO (none/partial/full), BIND_NOW, PIE, stack-canary,
      FORTIFY_SOURCE, and writable+executable (W^X) segments — alongside the
      pre-existing executable_stack. Weakening transitions emit dedicated
      COMPATIBLE_WITH_RISK kinds (relro_weakened, pie_disabled,
      stack_canary_removed, fortify_source_weakened, writable_executable_segment).
      A shipped, turnkey policy (abicheck/policies/security.yaml) is referenced
      by name as `--policy-file security` and promotes all hardening kinds to
      break. Validated end-to-end: real hardened-vs-unhardened .so parsing
      (integration), the diff detectors and policy gating (unit), and the
      built-in policy resolution.
    evidence:
      modules: [abicheck/elf_metadata.py, abicheck/diff_platform.py, abicheck/checker_policy.py, abicheck/policies/security.yaml]
      tests: [tests/test_diff_platform_deep.py, tests/test_elf_metadata_unit.py, tests/test_policy_file.py, tests/test_elf_parse_integration.py]
      docs: [docs/user-guide/security-hardening.md]

  - id: UC-WF-probe-matrix
    axis: workflow
    name: Build-configuration matrix (probe harness)
    status: complete
    note: >
      G2 closed. Matrix findings fold into `compare`/`compare-release` via
      --probe-matrix-old/--probe-matrix-new. Both build-config kinds are now
      proven end-to-end through the mainline command:
      CXX_STANDARD_FLOOR_RAISED and — after the relocatable-object symbol-surface
      fix — API_DEPENDS_ON_CONSUMER_ENV. parse_elf_metadata now falls back to
      `.symtab` when a `.o` has no `.dynsym`, so a probe object's defined global
      symbols are captured and the env-dependence detector fires over the real
      compiled surface (tests/test_probe_examples.py +
      tests/test_elf_object_surface.py).
    evidence:
      modules: [abicheck/probe_harness.py, abicheck/diff_build_config.py, abicheck/elf_metadata.py]
      tests: [tests/test_probe_harness.py, tests/test_probe_examples.py, tests/test_elf_object_surface.py]
      examples: [examples/probes/onedpl.yaml, examples/probes/cxx_standard.yaml, examples/probes/feature_macro.yaml]

  # ── reporting ─────────────────────────────────────────────────────────────
  - id: UC-REP-json
    axis: reporting
    name: JSON report (versioned schema)
    status: complete
    evidence:
      modules: [abicheck/reporter.py, abicheck/schemas/__init__.py]
      tests: [tests/test_reporter.py, tests/test_report_schema.py]

  - id: UC-REP-sarif
    axis: reporting
    name: SARIF 2.1.0 (GitHub Code Scanning)
    status: complete
    evidence:
      modules: [abicheck/sarif.py]
      tests: [tests/test_sarif.py]

  - id: UC-REP-junit
    axis: reporting
    name: JUnit XML (CI dashboards)
    status: complete
    evidence:
      modules: [abicheck/junit_report.py]
      tests: [tests/test_junit_report.py]

  - id: UC-REP-markdown-html
    axis: reporting
    name: Markdown / HTML reports
    status: complete
    note: >
      Structural coverage across verdict tiers and the major sections (summary,
      severity groups, impact, release recommendation, confidence) plus HTML
      escaping is asserted in tests/test_report_sections.py; appcompat and
      stack-check render paths are exercised by the workflow E2E tests.
    evidence:
      modules: [abicheck/reporter.py, abicheck/html_report.py]
      tests: [tests/test_format_compliance.py, tests/test_sprint9_html.py, tests/test_report_sections.py]

  # ── toolchain / language standard ───────────────────────────────────────────
  - id: UC-TC-glibc-floor
    axis: toolchain
    name: Platform baseline floor (manylinux glibc requirement)
    status: planned
    gap: G10
    plan: docs/development/plans/g10-glibc-floor-check.md
    next_steps: >
      Compare the max GLIBC_2.x in elf.versions_required (already captured)
      against a declared floor (--glibc-floor or the wheel's manylinux tag) and
      emit a deployment-RISK finding when exceeded. New ChangeKind
      (platform_baseline_floor_raised); composes with diff_versioning.py (F2).

  - id: UC-TC-dual-abi
    axis: toolchain
    name: libstdc++ dual ABI flip (_GLIBCXX_USE_CXX11_ABI)
    status: complete
    evidence:
      examples: [examples/case104_glibcxx_dual_abi_flip]

  - id: UC-TC-flag-drift
    axis: toolchain
    name: Toolchain flag drift (DW_AT_producer)
    status: complete
    evidence:
      modules: [abicheck/diff_build_config.py]
      examples: [examples/case103_toolchain_flag_drift]

  - id: UC-TC-modern-cxx-types
    axis: toolchain
    name: Integer model / char8_t / _BitInt / atomic / ABI tags
    status: complete
    evidence:
      modules: [abicheck/diff_integer_model.py, abicheck/diff_char8t.py, abicheck/diff_bit_int.py, abicheck/diff_atomic.py]
      examples: [examples/case112_lp64_ilp64]

  - id: UC-TC-header-scope-robustness
    axis: toolchain
    name: Header-scoped source-mode robustness on stock host toolchains
    status: partial
    gap: G16
    plan: docs/development/plans/g16-header-scope-toolchain-robustness.md
    evidence:
      modules: [abicheck/dumper.py]
      tests: [tests/test_castxml_toolchain_robustness.py]
    next_steps: >
      Header-scoped scans (the castxml source path) are how abicheck separates
      public source API from private/internal surface, but in the 2026-06
      real-world cron the scoped re-run aborted before any comparison in 21 issue
      records — always the same host-toolchain parse failures, never an abicheck
      logic bug: glibc sized-float types (`unknown type name '_Float32'`,
      `_Float64`/`_Float128`), a GCC 13 libstdc++ `__assume__` attribute, and
      `--lang c` rejecting `extern "C"` headers guarded by `#ifdef __cplusplus`.
      DONE (this increment): `_castxml_failure_hint` classifies all three stderr
      signatures into a single actionable remediation, and on a sized-float /
      `__assume__` failure `_castxml_version_note` probes `castxml --version` and
      folds in the recommended Clang floor (`_RECOMMENDED_CLANG_MAJOR` = 18) so
      the user is told exactly what to upgrade — unit-tested in
      `tests/test_castxml_toolchain_robustness.py`. A `-D_FloatN` preprocessor
      shim was prototyped and REJECTED: glibc's own `typedef float _Float32;`
      fallback would be rewritten into `typedef float float;` (PR review). The
      durable cure is a castxml built against a newer Clang, or the libclang
      extractor (G4). ALSO DONE: per-side L2 backend selection
      (`--old-ast-frontend`/`--new-ast-frontend` on `compare`,
      each inheriting `--ast-frontend`) so a release whose new headers need the
      host toolchain parses on clang while the old release keeps the castxml schema
      reference — the backend mirror of `--old-header`/`--new-header`
      (tests/test_cli_new_features.py::TestPerSideHeaderBackend). REMAINING: an
      integration-marked end-to-end check that `compare --headers` over a
      `<math.h>`-including header succeeds on a real CI host, and a dedicated
      `HeaderToolchainError` type.

  - id: UC-TC-cxx-standard-floor
    axis: toolchain
    name: C++ standard floor raised (per-consumer source break)
    status: complete
    note: >
      Surfaced through the mainline `compare`/`compare-release` command via
      --probe-matrix-old/--probe-matrix-new; the finding reaches the verdict and
      the JSON/SARIF output (proven in tests/test_probe_examples.py).
    evidence:
      examples: [examples/case98_cxx_standard_floor_raised, examples/probes/cxx_standard.yaml]
      tests: [tests/test_probe_examples.py, tests/test_diff_build_config.py]

  # ── workflow: real-world validation corpus (field-eval G17) ────────────────
  - id: UC-WORKFLOW-real-world-corpus
    axis: workflow
    name: Real-world upstream-library validation corpus (conda-forge)
    status: partial
    gap: G17
    plan: docs/development/plans/g17-real-world-corpus.md
    evidence:
      modules: [eval/runner.py, eval/condafetch.py]
      tests: [eval/manifest.yaml, eval/results/latest.json]
    next_steps: >
      Reproducible benchmark (eval/) of abicheck against real conda-forge
      libraries: a curated manifest with expected verdicts, a runner that fetches
      + scans, and a generated report. DONE: binary (L0/L1) tier — verdicts match
      the manifest 'expect'. PLANNED: (1) a source (L3/L4/L5) tier in the runner
      that clones at the tag and runs `dump --sources`; (2) a scheduled CI lane
      that fails on verdict drift; (3) corpus growth to Rust/Go/Qt/Boost and
      win-64/osx-64 subdirs.

  # ── toolchain: Bazel build-evidence (field-eval P21 / G18) ─────────────────
  - id: UC-TC-bazel-build-evidence
    axis: toolchain
    name: Bazel-built C++ project build evidence (L3)
    status: modeled
    gap: G18
    plan: docs/development/plans/g18-bazel-build-evidence.md
    evidence:
      modules: [abicheck/buildsource/adapters/bazel.py]
    next_steps: >
      The cquery/aquery jsonproto adapter exists but is unvalidated end-to-end on
      a real Bazel C++ project (field-eval P21: oneDAL is Bazel + makefile, no
      CMake, so its L3/L4/L5 path was unreachable without the full Intel
      toolchain). Capture a pre-recorded `bazel aquery --output=jsonproto` from a
      small Bazel C++ project, add it as a non-executing fixture + test asserting
      non-empty BuildEvidence, then flip to partial/complete.

  # ── PR-tier source intelligence & cross-source validation (ADR-035 / G19) ──
  - id: UC-WORKFLOW-pr-source-tier
    axis: workflow
    name: Always-on compiler-free PR pre-scan + deterministic level selection
    status: complete
    gap: G19
    plan: docs/development/plans/g19-pr-source-intelligence.md
    evidence:
      modules:
        - abicheck/buildsource/pattern_scan.py
        - abicheck/buildsource/preprocessor_scan.py
        - abicheck/buildsource/risk.py
        - abicheck/buildsource/scan_levels.py
        - abicheck/cli_scan.py
      tests:
        - tests/test_pattern_scan.py
        - tests/test_preprocessor_scan.py
        - tests/test_risk.py
        - tests/test_scan_levels.py
        - tests/test_cli_scan.py
    next_steps: >
      ADR-035 D2/D3. DONE: the compiler-free lexical pattern pre-scan
      (`buildsource/pattern_scan.py`); and the deterministic `scan` orchestrator
      (`cli_scan.py`) — classify → always-on tier (pattern S3 + crosscheck D4) →
      the pinned level (`--mode` preset or explicit `--source-method`/`--depth`,
      resolved by `buildsource/scan_levels.py`) → one coverage-annotated report,
      with `--baseline` comparison and a `--budget` failure guard (exit 5 on
      overflow, never shrinks scope). The numeric risk score (`buildsource/risk.py`,
      tunable `risk_rules`) changes depth ONLY under `--source-method auto`
      (opt-in). Reuses the existing collect-mode/replay-scope machinery; emits
      RISK/API_BREAK only (authority rule). The D7 POI work-list
      (`buildsource/poi.py`) focuses the expensive scan (Phase 3b, done). The S2
      preprocessor pre-scan (`buildsource/preprocessor_scan.py`) — per-TU ABI
      macro-value capture + divergence and public-header private/generated-header
      leak detection, conditional on a compile DB + `clang -E` — is done and wired
      into `scan` with honest coverage. Phase 1 (D2) complete.

  - id: UC-CHANGE-crosscheck-hygiene
    axis: change_class
    name: Cross-source validation findings (intra-version ABI hygiene)
    status: complete
    gap: G19
    plan: docs/development/plans/g19-pr-source-intelligence.md
    evidence:
      modules:
        - abicheck/buildsource/crosscheck.py
      tests:
        - tests/test_crosscheck.py
    next_steps: >
      ADR-035 D4. DONE: the intra-version cross-source engine
      (`buildsource/crosscheck.py`, `run_crosschecks`) and its four
      correctly-partitioned ChangeKinds — exported_not_public (RISK),
      public_not_exported (RISK), header_build_context_mismatch (API_BREAK),
      private_header_leak (RISK) — which diff ONE merged snapshot's evidence
      sources against each other, with per-check coverage rows and the §6.8
      provider-agreement matrix; each skips (never false-positives) when its
      evidence is absent and is never BREAKING (authority rule). REMAINS:
      odr_type_variant and public_to_internal_dependency; wiring the engine into
      the Phase-3 `scan`/`audit` orchestrator and the `crosschecks:` severity
      config; and FP-rate-gate corpus cases before any check is promoted to gate.

  - id: UC-WORKFLOW-single-release-audit
    axis: workflow
    name: Single-release ABI-hygiene audit (no baseline compare)
    status: complete
    gap: G19
    plan: docs/development/plans/g19-pr-source-intelligence.md
    evidence:
      modules:
        - abicheck/cli_scan.py
        - abicheck/buildsource/crosscheck.py
        - abicheck/mcp_server.py
      tests:
        - tests/test_scan_estimate.py
        - tests/test_crosscheck.py
    next_steps: >
      ADR-035 D8. DONE: `scan --audit` runs the D2 pattern facts + D4 cross-checks
      intra-version with no baseline and renders the hygiene catalog (accidental
      ABI surface, public-not-exported, header/build-context mismatch,
      private-header leaks + advisory pattern facts), severity-mapped via
      `--crosscheck KEY=LEVEL` (`_audit_exit_code`: RISK advisory, API_BREAK →
      exit 2, promoted checks gate); the MCP `abi_audit` tool exposes the same
      catalog. REMAINS: `surface-report` reuse and the deeper one-time checks
      (ODR variants, visibility/versioning hygiene, RTTI-for-internal-types) once
      their ChangeKinds land (Phase 2 tail).

  - id: UC-WORKFLOW-evidence-directed-scope
    axis: workflow
    name: Evidence-directed scan focusing (binary/header facts steer source scan)
    status: complete
    gap: G19
    plan: docs/development/plans/g19-pr-source-intelligence.md
    evidence:
      modules:
        - abicheck/buildsource/poi.py
        - abicheck/cli_scan.py
      tests:
        - tests/test_poi.py
    next_steps: >
      ADR-035 D7. DONE: `buildsource/poi.py` (`build_points_of_interest`) computes
      a points-of-interest work-list from the changed-path floor + pattern-scan
      escalation triggers + L0/L1/L2 export deltas + risk score (reverse of the
      explain-finding localization walk); the floor is unconditional (risk/deltas
      only add, never drop a changed TU). `scan` runs the pattern pre-scan first
      and feeds `poi.changed_paths()` into the source-replay scope. The
      `scan --baseline` path now reads a cheap, header-free L0 view of both sides
      (`_load_exports_for_poi`) so the export-delta walk runs live, and
      `resolve_symbol_tus` turns the resulting symbol POIs into declaring TUs
      (via the baseline's cached L5 graph) that join the replay seed *and* the
      `crosscheck` changed-path work-list. REMAINS: symbol→TU resolution only
      fires when the baseline carries an L5 graph (the full-depth ADR-035 D9
      baseline); a shallow baseline still focuses on changed paths only.

  - id: UC-TC-build-emitted-facts
    axis: toolchain
    name: Build-emitted source facts (artifact protocol + Clang plugin/wrapper)
    status: complete
    gap: G19
    plan: docs/development/plans/g19-pr-source-intelligence.md
    evidence:
      modules:
        - abicheck/buildsource/inputs_pack.py
        - abicheck/buildsource/inputs_emit.py
        - abicheck/cc_wrapper.py
        - contrib/abicheck-clang-plugin/README.md
      tests:
        - tests/test_inputs_pack.py
        - tests/test_inputs_emit.py
    next_steps: >
      ADR-035 D5, complete. The `abicheck_inputs/` dump/facts artifact protocol
      (normalized source_facts/*.jsonl canonical → L4, build/compile_commands.json
      → L3; raw AST forensic-only) is ingested without re-running a frontend via
      the existing `merge` (`ingest_inputs_pack`, dir input auto-detected). The
      supported portable producer is the `abicheck-cc` compiler wrapper
      (pass-through compile + best-effort castxml/clang extraction →
      `inputs_emit`); the Clang plugin (`contrib/abicheck-clang-plugin/`) is the
      optional optimization that removes the second frontend pass, with GCC/MSVC
      fallbacks documented. This lets vendor/closed builds contribute
      exact-build-context facts without shipping sources; replay stays the
      portable default.

  - id: UC-REPORTING-scan-coverage-estimate
    axis: reporting
    name: Scan coverage/confidence report + per-project cost estimate
    status: complete
    gap: G19
    plan: docs/development/plans/g19-pr-source-intelligence.md
    evidence:
      modules:
        - abicheck/cli_scan.py
        - abicheck/service.py
        - abicheck/mcp_server.py
      tests:
        - tests/test_cli_scan.py
        - tests/test_scan_estimate.py
    next_steps: >
      ADR-035 D9/D10. DONE: `scan` emits one report (text/JSON) carrying the
      per-tier coverage table (intrinsic L0-L2 + pattern-scan S3 + L3/L4/L5 pack
      coverage + per-crosscheck rows + the POI focus summary), the resolved level,
      and the risk score, so a partial scan is legible (never a bare "source scan
      failed"). DONE: the typed `ScanRequest`/`CostEstimate`/`LayerResult`/`Budget`
      API + `service.estimate_scan()` (probes TU count from the compile DB /
      source tree + header fan-out), surfaced as `scan --estimate` and the MCP
      `abi_estimate` tool — a dry-run that scans nothing. REMAINS: the full
      `ScanResult`/`run_scan` refactor (move the `cli_scan` orchestration body into
      `service.py`) and the provider-agreement/confidence matrix rendered by
      reporter/PR-comment/SARIF.

  # ── Source-scan & cross-source example corpus (ADR-035 / G20) ──────────────
  - id: UC-WORKFLOW-audit-example-corpus
    axis: workflow
    name: Single-release audit example corpus (no baseline)
    status: partial
    gap: G20
    plan: docs/development/plans/g20-source-scan-example-catalog.md
    evidence:
      modules: [abicheck/buildsource/crosscheck.py, scripts/gen_g20_fixtures.py]
      tests: [tests/test_g20_catalog.py]
      examples:
        - examples/case143_audit_accidental_export
        - examples/case144_audit_private_header_leak
        - examples/case145_audit_unversioned_export
        - examples/case146_audit_rtti_for_internal
        - examples/case147_scan_depth_ladder
    next_steps: >
      ADR-035 D8 (G20.1). DONE: the Phase 0 `ground_truth.json` v4 schema
      (`mode: audit`, `expected_crosscheck_kinds`, `expected_providers`,
      `fixtures`) plus five single-build catalog cases (143-147) reaching a
      verdict from one artifact with no baseline — `exported_not_public`,
      `private_header_leak`, `unversioned_exported_symbol`,
      `rtti_for_internal_type`, and the depth-ladder case. Each ships a committed
      `snapshot.abi.json` validated compiler-free by `tests/test_g20_catalog.py`.
      The depth-ladder legibility property (case147) is now asserted compiler-free
      as provider escalation — the same input flags `private_header_leak` from
      `public_header_ast` alone at shallow depth and gains the `source_index`
      corroboration with the L5 graph attached
      (`test_g20_catalog.py::test_depth_ladder_deepens_corroboration_same_input`).
      REMAINING: a live S3→S5 depth-ladder run under the `integration` (castxml)
      lane to show the real per-depth *timing* (the compiler-free test proves the
      coverage escalation, not the cost), and Flow-2 `abicheck_inputs/` pack
      fixtures alongside the committed snapshots. See plan Phase 1.

  - id: UC-CHANGE-crosscheck-example-corpus
    axis: change_class
    name: Cross-source corroboration example corpus (combination beats one source)
    status: partial
    gap: G20
    plan: docs/development/plans/g20-source-scan-example-catalog.md
    evidence:
      modules: [abicheck/buildsource/crosscheck.py]
      tests: [tests/test_xcheck_scenarios.py, tests/test_g20_catalog.py]
      examples:
        - examples/case148_xcheck_header_build_mismatch
        - examples/case149_xcheck_odr_variant
        - examples/case150_xcheck_export_public_pair
        - examples/case151_xcheck_provider_matrix
    next_steps: >
      ADR-035 D4 (G20.2). DONE: four `examples/caseNN` cases whose finding is
      invisible/ambiguous to any single source and resolves only by crosschecking
      two — `header_build_context_mismatch` (L2 macros vs L3 flags),
      `odr_type_variant` (L4 layout vs layout), the bidirectional
      `exported_not_public`/`public_not_exported` pair, and the provider-agreement
      matrix (§6.8) asserting that the recorded provider list differs between
      rich and thin corroboration. `tests/test_xcheck_scenarios.py` adds the clean negative
      counterpart per finding (FP-rate gate stays 0/0). REMAINING: deriving a
      per-finding confidence tag from provider count is a separate reporting
      enhancement, deliberately out of scope here. See plan Phase 2.

  - id: UC-WORKFLOW-focusing-example-corpus
    axis: workflow
    name: Evidence-directed focusing scenarios (sources steer sources)
    status: partial
    gap: G20
    plan: docs/development/plans/g20-source-scan-example-catalog.md
    evidence:
      modules: [abicheck/buildsource/poi.py, abicheck/buildsource/source_link.py]
      tests: [tests/test_poi_scenarios.py, tests/test_source_evidence_integrity.py]
    next_steps: >
      ADR-035 D7 (G20.3). DONE: test-only scenario suites
      (`tests/test_poi_scenarios.py`, `tests/test_source_evidence_integrity.py`)
      asserting on the scan plan, not just the verdict — an export delta targeting
      the changed symbol, an exported template instantiation seed, the D7
      changed-path floor (a mis-weighted `risk_rules` cannot drop a changed TU),
      and the D4 unlinked-source-evidence integrity guard (the oneDAL shape: many
      exports, TUs parsed, zero matched symbols; reported as degraded, never clean).
      ALSO DONE: the §3.3 `_layers_from_coverage` plumbing — the L4 source-link
      boundary integrity counters (`exported_symbols`/`matched_symbols`/
      `unmatched_symbols` + `facts`) now ride the crosscheck coverage row
      (`crosscheck._coverage_row`/`_CheckOutput`) onto the rendered
      `ScanResult` layer (`service_scan.LayerResult.counters`), so a degraded
      link is named on the report even when ODR runs clean
      (`test_source_evidence_integrity.py::test_integrity_counters_surface_on_rendered_scan_layers`).
      REMAINING: nothing in this entry. See plan Phase 3.