Architecture
Overview
abicheck is a Python CLI tool that compares two versions of a C/C++ shared library
to detect ABI and API incompatibilities. Its core design idea is to reason over
five independent sources of information about a library — the binary, its
debug symbols, its public headers, its build-system data, and (optionally) its
sources — instead of relying on a single data source. Each source is an additive
evidence layer (L0–L4); feeding more layers both finds breaks the
weaker layers are blind to and suppresses false positives they would raise. See
Evidence layers: the five sources below for
the model, and Evidence & Detectability for the
conceptual companion.
Supported platforms and binary formats:
| Platform | Binary format | Binary metadata | Header AST (castxml) | Debug info cross-check |
|---|---|---|---|---|
| Linux | ELF (.so) | Yes (pyelftools) | Yes (GCC, Clang) | Yes (DWARF) |
| Windows | PE/COFF (.dll) | Yes (pefile) | Yes (MSVC, MinGW) | Yes (PDB) |
| macOS | Mach-O (.dylib) | Yes (macholib) | Yes (Clang, GCC) | Yes (DWARF) |
Analysis pipeline
The CLI dumps each input into a normalized snapshot, enriches it with header AST and debug-info layers, then diffs the two snapshots to produce a verdict:
flowchart TD
CLI["abicheck CLI<br/>(dump · compare · compat check/dump)"]
FMT{"Format detection<br/>(ELF / PE / Mach-O)"}
ELF["ELF<br/>pyelftools"]
PE["PE/COFF<br/>pefile"]
MACHO["Mach-O<br/>macholib"]
SNAP["L0 — Binary metadata<br/>Snapshot (JSON model)"]
AST["L2 — Header AST<br/>castxml (all platforms)"]
DBG["L1 — Debug-info cross-check<br/>DWARF (Linux, macOS) · PDB (Windows)"]
CHK["Checker → Changes → Verdict"]
CLI --> FMT
FMT --> ELF
FMT --> PE
FMT --> MACHO
ELF --> SNAP
PE --> SNAP
MACHO --> SNAP
SNAP --> AST
AST --> DBG
DBG --> CHK
The analysis layers are independent and additive — each catches changes the others miss, and the checker reconciles them into a single verdict. The artifact layers (L0/L1/L2) are described in detail below; the build/source layers (L3/L4, plus the optional L5 reachability graph) are covered in Build & Source Packs.
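The format-detection step at the top of the pipeline can be illustrated with a magic-number check. This is a minimal sketch using only the standard library; `detect_format` is a hypothetical helper, not abicheck's actual detection code, and it relies only on the publicly documented ELF, PE, and Mach-O file signatures.

```python
import struct

# Mach-O magic numbers as they read when the first four bytes are
# interpreted big-endian: thin 32/64-bit in both byte orders, plus fat.
_MACHO_MAGICS = {
    0xFEEDFACE, 0xFEEDFACF,  # thin, stored big-endian
    0xCEFAEDFE, 0xCFFAEDFE,  # thin, stored little-endian
    0xCAFEBABE, 0xBEBAFECA,  # fat/universal (0xCAFEBABE is shared with Java class files)
}


def detect_format(data: bytes) -> str:
    """Return "elf", "pe", "macho", or "unknown" for the leading bytes of a file."""
    if data[:4] == b"\x7fELF":
        return "elf"
    if data[:2] == b"MZ" and len(data) >= 0x40:
        # The DOS header stores the offset of the "PE\0\0" signature at 0x3C.
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\0\0":
            return "pe"
    if len(data) >= 4 and struct.unpack_from(">I", data, 0)[0] in _MACHO_MAGICS:
        return "macho"
    return "unknown"
```

In practice the first few kilobytes of the file are enough input, since all three signatures sit at or near the start of the file.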
Evidence layers: the five sources
abicheck's accuracy comes from treating compatibility analysis as a question of
evidence: the more independent sources of information you give it about a
library, the more it can prove — and the fewer false positives it raises. You
provide five sources (L0–L4); abicheck derives a sixth, the L5
graph — six evidence layers in all, layered from the least input to the most:
| Layer | Source | Collected from | Authority | Reveals |
|---|---|---|---|---|
| L0 | Just the binary | ELF/PE/Mach-O parsers (elf_metadata.py, pe_metadata.py, macho_metadata.py) | Authoritative | Exported symbols, SONAME/install-name, versions, visibility, binding, dependencies |
| L1 | Debug symbols | DWARF/PDB/BTF/CTF (dwarf_*, pdb_*, btf_metadata.py, ctf_metadata.py) | Authoritative when matched to the binary | Type layout: sizes, field offsets, enum values, vtable slots, calling convention, packing |
| L2 | Public headers | castxml or clang AST (dumper_castxml.py / dumper_clang.py, --ast-frontend) | Authoritative for header-visible API | Source API: signatures, overloads, access, final/explicit/noexcept, templates, public/internal scoping |
| L3 | Build system data & options | compile DB / CMake / Ninja / Bazel / Make (build_context.py, build/source pack ADR-029) | Context / confidence | ABI-relevant flags (-std, _GLIBCXX_USE_CXX11_ABI, -fvisibility, -fabi-version), toolchain, target graph, export policy |
| L4 | Sources | per-TU source ABI replay (build/source pack ADR-030) | Source-/API-risk evidence, never sole shipped-ABI authority | Macro/constexpr values, default-argument values, inline/template bodies, uninstantiated templates |
| L5 | Source/build graph (derived) | folded from L3 (+ any L4 surface) into a graph summary (build/source pack ADR-031) | Explanation / localization / impact, never shipped-ABI authority | Include/type/call reachability: which public surface a change reaches; prioritizes cross-symbol impact |
flowchart LR
L0["L0 · binary<br/>(stripped .so)"] --> L1["L1 · + debug<br/>(DWARF/PDB)"]
L1 --> L2["L2 · + headers<br/>(castxml / clang AST)"]
L2 --> L3["L3 · + build data<br/>(compile DB)"]
L3 --> L4["L4 · + sources<br/>(build/source pack)"]
L3 -.derived.-> L5["L5 · source/build graph<br/>(reachability)"]
L4 -.derived.-> L5
L0 -.weaker evidence.-> L4
The authority rule (ADR-028). The layers are not a fallback chain — abicheck overlays everything it is given and computes one worst-wins verdict. But not all evidence carries the same weight:
Artifact-backed L0/L1/L2 evidence is authoritative for the shipped-ABI verdict. Build/source L3/L4/L5 evidence may explain, localize, scope, or add confidence to a finding, and may raise its own source-/API-level findings (default API_BREAK or risk), but it never silently deletes an artifact-proven break.
So L3 noticing a -std bump or L4 noticing a changed macro can add a finding
or explain one, but only L0/L1/L2 can declare a binary BREAKING. Every
compare that uses build/source evidence prints an evidence-coverage table
(and a structured layer_coverage array in JSON) so consumers can tell which
findings are artifact-proven vs. context-only — see Build & Source Packs.
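One way to read the authority rule is as a severity cap: every finding is kept, but a finding from L3/L4/L5 can contribute at most API_BREAK to the overall verdict. The sketch below illustrates that reading. `Finding`, `effective_verdict`, and `overall_verdict` are hypothetical names, and the cap is an interpretation of the rule, not abicheck's actual implementation.

```python
from dataclasses import dataclass
from typing import List

# Verdicts ordered from least to most severe (names from the verdict table).
SEVERITY = ["NO_CHANGE", "COMPATIBLE", "COMPATIBLE_WITH_RISK", "API_BREAK", "BREAKING"]
ARTIFACT_LAYERS = {"L0", "L1", "L2"}


@dataclass(frozen=True)
class Finding:  # hypothetical record, not abicheck's Change type
    kind: str
    layer: str    # "L0" .. "L5"
    verdict: str  # one of SEVERITY


def effective_verdict(finding: Finding) -> str:
    """Artifact findings pass through; build/source findings are capped at API_BREAK."""
    if finding.layer in ARTIFACT_LAYERS:
        return finding.verdict
    cap = SEVERITY.index("API_BREAK")
    return SEVERITY[min(SEVERITY.index(finding.verdict), cap)]


def overall_verdict(findings: List[Finding]) -> str:
    """Worst-wins across all layers; no finding is dropped along the way."""
    ranks = [SEVERITY.index(effective_verdict(f)) for f in findings]
    return SEVERITY[max(ranks, default=0)]
```

Because the cap only lowers the contribution of context evidence and never removes a finding, an artifact-proven BREAKING result survives regardless of what L3/L4/L5 report.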
Graceful degradation. abicheck dump --show-data-sources reports exactly
which of L0/L1/L2 a binary affords and how many detectors that enables
(symbols-only ≈ 6/30, DWARF-only ≈ 24/30, with headers 30/30). With less input
abicheck degrades down the staircase rather than failing; with more it both
finds more breaks and raises fewer false positives. The empirical per-tier behaviour across the
example catalog is benchmarked in
Tool Comparison §Benchmarking by evidence tier.
Artifact layers in detail
Layer L0: Binary metadata
Reads native binary metadata using format-specific parsers:
ELF (Linux, via pyelftools):
- Exported symbols (functions, variables) from .dynsym
- SONAME, symbol binding (GLOBAL, WEAK, LOCAL), symbol versioning
- NEEDED dependencies, visibility attributes
PE/COFF (Windows, via pefile):
- Exported functions and ordinals from the export table
- Imported DLLs and functions from the import table
- Machine type, characteristics, DLL characteristics
- File and product version from VS_FIXEDFILEINFO resource
Mach-O (macOS, via macholib):
- Exported symbols from the symbol table (including weak definitions)
- Install name (LC_ID_DYLIB — equivalent of ELF SONAME)
- Dependent libraries (LC_LOAD_DYLIB — equivalent of ELF DT_NEEDED)
- Re-exported libraries (LC_REEXPORT_DYLIB)
- Current and compatibility versions, minimum OS version
- Fat/universal binary support (automatic architecture selection)
Layer L2: Header AST (castxml / Clang): all platforms
Parses C/C++ headers through a selectable frontend — --ast-frontend
auto|castxml|clang (or ABICHECK_AST_FRONTEND);
auto prefers castxml and
falls back to clang -ast-dump=json on clang-only hosts (ADR-003). The rest of
this section describes the castxml backend. The clang backend exposes the same
declaration surface (signatures, classes/bases, enums, typedefs, access,
noexcept, templates) but is a syntactic AST: it does not compute record
layout, so size_bits/offset_bits/vtable slots stay unset and the layout
detectors skip an unknown-vs-unknown comparison — DWARF (L1) remains the layout
authority on a clang-only host. With that caveat, the header AST extracts:
- Function signatures (parameters, return types)
- Class/struct definitions; layout when backed by castxml or DWARF evidence
- Virtual method tables (vtable slot ordering) when backed by castxml or DWARF evidence
- Enum values and member names
- Typedefs and template instantiations
- noexcept specifications
- Access levels (public, protected, private)
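The layout caveat for the clang frontend (unset size_bits/offset_bits mean "unknown", not "zero") can be sketched as a guard in a size comparison. `compare_size_bits` is a hypothetical helper. As a conservative assumption, it skips whenever either side is unknown, whereas the text above only states that unknown-vs-unknown comparisons are skipped.

```python
from typing import Optional


def compare_size_bits(old: Optional[int], new: Optional[int]) -> Optional[str]:
    """Return a finding label, or None when there is nothing to report."""
    if old is None or new is None:
        # No layout evidence on at least one side (e.g. clang frontend
        # without DWARF): skip rather than guess.
        return None
    if old != new:
        return f"size changed: {old} -> {new} bits"
    return None
```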
castxml is a cross-platform tool maintained by Kitware (available via conda-forge,
system packages, or direct download for Linux, Windows, and macOS). It is the primary
source for type-level analysis, catching changes invisible to debug-info-only tools:
noexcept, static qualifier, const qualifier, access level changes.
Compiler support: castxml uses an internal Clang compiler for parsing but
emulates the preprocessor defines, include paths, and target platform of an external
compiler via --castxml-cc-<id> <compiler-binary>. At invocation castxml calls the
external compiler to discover its built-in defines (e.g. __GNUC__, __GNUC_MINOR__,
_MSC_VER) and default include search paths, then injects those into its internal Clang
so the resulting AST matches what the external compiler would produce.
| Compiler ID | Compiler | Typical platforms |
|---|---|---|
| gnu | GCC / g++ | Linux, macOS, Windows (MinGW) |
| gnu-c | GCC / gcc (C mode) | Linux, macOS, Windows (MinGW) |
| msvc | Microsoft Visual C++ (cl) | Windows |
| msvc-c | Microsoft Visual C (cl, C mode) | Windows |
Auto-detection logic (see dumper.py:_castxml_dump()): abicheck extracts the
filename from the compiler binary path (via Path(cc_bin).name), lower-cases it, and
checks whether it is cl or cl.exe. If so, it passes --castxml-cc-msvc; otherwise it
passes --castxml-cc-gnu. The comparison is case-insensitive so CL.EXE, Cl.exe, etc.
are all correctly detected on Windows.
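The auto-detection rule described above fits in a few lines. This is a sketch of the described behaviour, not a copy of dumper.py; the function name `castxml_cc_flag` is hypothetical.

```python
from pathlib import Path


def castxml_cc_flag(cc_bin: str) -> str:
    """Pick the castxml compiler-ID flag from the compiler binary's filename."""
    name = Path(cc_bin).name.lower()  # case-insensitive: CL.EXE, Cl.exe, ...
    if name in ("cl", "cl.exe"):
        return "--castxml-cc-msvc"
    return "--castxml-cc-gnu"
```

Anything that is not `cl` or `cl.exe`, including clang and MinGW drivers, falls through to the GNU flag.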
Compiler resolution priority (highest to lowest):
- --gcc-path /path/to/compiler: explicit path override, used as-is
- --gcc-prefix <prefix>: cross-toolchain prefix; abicheck appends g++ (C++ mode) or gcc (C mode) automatically
- Default mapping: logical name (c++→g++, cc→gcc, clang++→clang++)
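The priority order above can be sketched as a small resolver. `resolve_compiler` and its parameter names are hypothetical; only the three-step order and the default mapping come from the text. Returning an unmapped logical name unchanged is an assumption.

```python
from typing import Optional

# Default mapping from logical compiler name to binary name.
DEFAULT_COMPILERS = {"c++": "g++", "cc": "gcc", "clang++": "clang++"}


def resolve_compiler(logical: str = "c++", *,
                     gcc_path: Optional[str] = None,
                     gcc_prefix: Optional[str] = None,
                     c_mode: bool = False) -> str:
    """Resolve the compiler binary: explicit path, then prefix, then default."""
    if gcc_path:
        return gcc_path  # highest priority, used as-is
    if gcc_prefix:
        return gcc_prefix + ("gcc" if c_mode else "g++")
    # Assumption: unknown logical names are passed through unchanged.
    return DEFAULT_COMPILERS.get(logical, logical)
```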
Scanning with a specific compiler version: use --gcc-path to point at the exact
binary. castxml queries that binary for its version-specific predefined macros and include
paths, so the parse reflects exactly what that compiler version defines:
abicheck dump libfoo.so -H foo.h --gcc-path /usr/bin/g++-9 # GCC 9
abicheck dump libfoo.so -H foo.h --gcc-path /usr/bin/g++-12 # GCC 12
Limitations — non-C/C++ languages and compiler extensions:
castxml can only parse C and C++ because its internal engine is Clang. It cannot parse
Fortran, Rust, Ada, or other languages — there is no --castxml-cc-fortran equivalent.
For compilers that add language extensions beyond standard C/C++ (e.g. Intel DPC++/SYCL
__attribute__((sycl_kernel)), CUDA __global__, OpenACC pragmas), castxml can query
the external compiler's preprocessor state but its internal Clang will reject
extension-specific syntax during parsing. To scan such headers you would need either a
CastXML build linked against the matching Clang fork (e.g. Intel's DPC++ Clang for SYCL)
or a different AST extraction tool that uses that compiler's libclang directly.
Layer L1: Debug info cross-check (optional)
When debug info is available in the binary:
DWARF (Linux .so, macOS .dylib — via pyelftools):
- Cross-validates struct/class sizes against header-computed sizes
- Verifies member offsets (catches #pragma pack or -march-specific alignment differences)
- Checks vtable slot offsets
- Detects calling convention and frame register changes
PDB (Windows .dll — via built-in PDB parser):
- Extracts struct/class/union sizes and field layouts from TPI stream
- Extracts enum underlying types and member values
- Detects calling convention changes (__cdecl, __stdcall, __fastcall,
__thiscall, __vectorcall) from LF_PROCEDURE / LF_MFUNCTION records
- Extracts MSVC toolchain info (version, machine type, ABI flags) from DBI stream
- Auto-discovers PDB files from PE debug directory; use --pdb-path to override
Debug artifact resolution (via debug_resolver module):
When debug info is not embedded, abicheck searches a configurable resolver
chain: split DWARF (.dwo/.dwp), build-id trees, path mirrors, dSYM bundles,
PDB files, and optionally debuginfod servers. Use --debug-root to point at
separate debug file directories, or --debuginfod for network-based resolution.
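Part of such a resolver chain can be sketched as a function that lists candidate locations for separate debug files. The build-id tree and dSYM bundle layouts follow the usual GDB and macOS conventions. The function name, the candidate ordering, and the exact path-mirror layout are assumptions, not the debug_resolver module's actual behaviour.

```python
from pathlib import PurePosixPath
from typing import List, Optional


def candidate_debug_paths(binary: str, build_id: Optional[str],
                          debug_roots: List[str]) -> List[PurePosixPath]:
    """List places where separate debug info for `binary` might live."""
    bin_path = PurePosixPath(binary)
    candidates: List[PurePosixPath] = []
    for root in map(PurePosixPath, debug_roots):
        if build_id:
            # Build-id tree: <root>/.build-id/<first 2 hex digits>/<rest>.debug
            candidates.append(root / ".build-id" / build_id[:2] / f"{build_id[2:]}.debug")
        # Path mirror: <root>/<absolute path of the binary>.debug
        candidates.append(root / f"{str(bin_path).lstrip('/')}.debug")
    # dSYM bundle next to a macOS binary.
    candidates.append(
        PurePosixPath(f"{binary}.dSYM") / "Contents" / "Resources" / "DWARF" / bin_path.name
    )
    return candidates
```

A caller would probe these in order and stop at the first file that exists and matches the binary's build ID.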
Layers L3 / L4: Build & source evidence (optional)
The build (L3) and source (L4) layers are post-build, opt-in, and never authoritative on their own — abicheck reads existing build outputs and build-system query interfaces; it does not rebuild your project. They are collected into a content-addressed build/source pack and attached to a snapshot:
- L3: build context (build_context.py, ADR-029). Parses a compile_commands.json (-p build/) or a CMake/Ninja/Bazel/Make graph to recover the exact ABI-relevant flags and toolchain the library was built with. Diffs emit context/risk kinds like abi_relevant_build_flag_changed, toolchain_version_changed, and link_export_policy_changed.
- L4: source ABI replay (ADR-030). Parses selected TUs and public headers under their real per-TU build context and links the result against the exported surface, catching public_macro_value_changed, default_argument_changed, constexpr_value_changed, and the uninstantiated templates that no artifact carries.
Both are described in full in Build & Source Packs. Per the
authority rule, every L3/L4 finding defaults to API_BREAK or risk and carries
an explicit evidence-tier boundary so it is never read as a proven shipped-ABI
break.
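To illustrate the L3 idea, the sketch below pulls ABI-relevant flags out of a compile_commands.json document and reports which translation units changed between two builds. The helper names and the flag-prefix list are illustrative assumptions drawn from the flags named in the layer table; this is not the build_context.py implementation.

```python
import json
import shlex

# Flag prefixes the layer table lists as ABI-relevant.
ABI_FLAG_PREFIXES = ("-std=", "-D_GLIBCXX_USE_CXX11_ABI", "-fvisibility", "-fabi-version")


def abi_flags_per_tu(compile_db_text: str) -> dict:
    """Map each translation unit to its sorted ABI-relevant flags."""
    result = {}
    for entry in json.loads(compile_db_text):
        # An entry carries either an "arguments" list or one "command" string.
        args = entry.get("arguments") or shlex.split(entry.get("command", ""))
        result[entry["file"]] = sorted(a for a in args if a.startswith(ABI_FLAG_PREFIXES))
    return result


def changed_flags(old: dict, new: dict) -> dict:
    """Return {tu: (old_flags, new_flags)} for every TU whose flags differ."""
    return {
        tu: (old.get(tu, []), new.get(tu, []))
        for tu in sorted(set(old) | set(new))
        if old.get(tu) != new.get(tu)
    }
```

A difference found this way is context evidence only: it explains or flags risk, and per the authority rule it does not by itself prove a shipped-ABI break.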
Key modules
CLI & service layer
| Module | Responsibility |
|---|---|
| cli.py | CLI entrypoint: dump, compare, compat check, compat dump, deps (tree/compare), baseline, appcompat commands |
| service.py | Service layer: shared orchestration for CLI and MCP server (resolve_input, run_dump, run_compare, render_output) |
| mcp_server.py | MCP (Model Context Protocol) server for AI agent integration |
| build_context.py | compile_commands.json parsing and per-TU flag extraction |
| debug_resolver.py | Debug artifact resolution chain (DWARF, PDB, dSYM, debuginfod) |
| baseline.py | Baseline registry: push/pull/list/delete with SHA-256 integrity verification |
Data model & serialization
| Module | Responsibility |
|---|---|
| model.py | Data models for snapshots (AbiSnapshot, Function, RecordType, EnumType, etc.) |
| checker_types.py | Core result types (Change, DiffResult, DetectorSpec, LibraryMetadata), extracted from checker.py to break circular dependencies |
| serialization.py | JSON snapshot serialization/deserialization |
| errors.py | Custom exception definitions |
Snapshot generation (dumper)
| Module | Responsibility |
|---|---|
| dumper.py | Snapshot generation: reads binary + headers → JSON snapshot |
| elf_metadata.py | ELF reader: Linux .so binaries (via pyelftools) |
| pe_metadata.py | PE/COFF reader: Windows .dll binaries (via pefile) |
| macho_metadata.py | Mach-O reader: macOS .dylib binaries (via macholib) |
| binary_utils.py | Shared binary format utilities |
Diff engine (checker)
| Module | Responsibility |
|---|---|
| checker.py | Diff orchestration: compares two snapshots, delegates to sub-modules, collects changes |
| checker_policy.py | ChangeKind enum, built-in policy profiles (strict_abi, sdk_vendor, plugin_abi), verdict computation |
| diff_symbols.py | Symbol-level ABI diff detectors (functions, variables, parameters) |
| diff_types.py | Type-level ABI diff detectors (structs, enums, unions, typedefs, fields) |
| diff_platform.py | Platform-specific ABI diff detectors (ELF, PE, Mach-O, DWARF) |
| diff_filtering.py | Post-processing: enrichment, redundancy filtering, AST-DWARF deduplication |
| detectors.py | Individual ABI change detection rules |
Policy & suppression
| Module | Responsibility |
|---|---|
| policy_file.py | Custom YAML policy file parsing (--policy-file) |
| suppression.py | Suppression rules, symbol/type filtering |
| severity.py | Severity classification for changes |
Report output
| Module | Responsibility |
|---|---|
| reporter.py | Markdown and JSON output formatting |
| html_report.py | HTML report generation |
| sarif.py | SARIF output for GitHub Code Scanning |
| report_classifications.py | Change classification helpers for reports |
| report_summary.py | Report summary generation |
Debug info (DWARF & PDB)
| Module | Responsibility |
|---|---|
| dwarf_unified.py | Unified DWARF handling (layer L1, Linux/macOS) |
| dwarf_advanced.py | Advanced DWARF analysis (calling convention, packing, toolchain flags) |
| dwarf_metadata.py | DWARF metadata extraction (Linux/macOS) |
| dwarf_snapshot.py | DWARF-based snapshot enrichment |
| dwarf_utils.py | DWARF parsing utility functions |
| pdb_parser.py | Minimal PDB parser (MSF container, TPI, DBI streams) |
| pdb_metadata.py | PDB debug info → DwarfMetadata/AdvancedDwarfMetadata |
| pdb_utils.py | PDB file location from PE debug directory |
Dependency & stack analysis
| Module | Responsibility |
|---|---|
| resolver.py | Dependency tree resolution (ELF DT_NEEDED / Mach-O LC_LOAD_DYLIB) |
| binder.py | Symbol binding simulation across loaded DSOs |
| stack_checker.py | Full-stack ABI validation across dependency trees |
| stack_report.py | Stack-check report formatting |
| appcompat.py | Application compatibility checking (filters diff to app-used symbols) |
| package.py | Package-level comparison (RPM, DEB, conda) |
ABICC compatibility
| Module | Responsibility |
|---|---|
| compat/ | ABICC compatibility layer (compat check, compat dump, XML parsing) |
| compat/abicc_dump_import.py | Import Perl-format ABICC dump files |
| demangle.py | C++ symbol demangling utilities |
Policy model
Policies control how detected changes are classified (BREAKING, API_BREAK, COMPATIBLE).
Built-in profiles:
| Profile | Behavior |
|---|---|
| strict_abi (default) | Every ABI change at maximum severity |
| sdk_vendor | Source-only changes downgraded to COMPATIBLE |
| plugin_abi | Calling-convention changes downgraded to COMPATIBLE |
Custom policies: YAML files with per-kind break|warn|ignore overrides.
Source of truth: BREAKING_KINDS, API_BREAK_KINDS, COMPATIBLE_KINDS, and RISK_KINDS sets in checker_policy.py.
Verdict system
| Verdict | Exit code | Meaning |
|---|---|---|
| NO_CHANGE | 0 | Identical snapshots |
| COMPATIBLE | 0 | Safe changes (new symbols, weak binding) |
| COMPATIBLE_WITH_RISK | 0 | Binary-compatible but deployment risk present |
| API_BREAK | 2 | Source-level break, binary-safe (rename, access change) |
| BREAKING | 4 | Binary ABI break: old binaries will fail |
Error model
Public exceptions are defined in abicheck/errors.py. Tool errors produce exit code 1.
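A CI script consuming these exit codes might decode them as in the sketch below. The helper names and the `allow_api_break` switch are hypothetical; only the code-to-meaning mapping comes from the verdict table and the error model.

```python
# Exit codes as documented: 0 covers three verdicts, 1 is a tool error.
EXIT_MEANINGS = {
    0: "compatible (NO_CHANGE, COMPATIBLE, or COMPATIBLE_WITH_RISK)",
    1: "tool error",
    2: "API_BREAK",
    4: "BREAKING",
}


def interpret_exit_code(code: int) -> str:
    """Translate an abicheck exit code into a human-readable meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def should_block_release(code: int, *, allow_api_break: bool = False) -> bool:
    """Block on BREAKING, tool errors, and unknown codes; API_BREAK is configurable."""
    if code == 0:
        return False
    if code == 2:
        return not allow_api_break
    return True
```

Because exit code 0 does not distinguish COMPATIBLE_WITH_RISK from NO_CHANGE, a gate that cares about deployment risk needs the report output as well as the exit code.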