Part 3 — Type Layout Breaks¶
Series navigation: 0. Product Contract · 1. Foundations · 2. Symbol Contracts · 3. Type Layout · 4. C++ ABI · 5. Linker & ELF · 6. Transitive Breaks · 7. Designing for Stability · 8. Detecting Breaks
What you'll learn on this page
- Why a struct published in a header is a byte-level contract, and how the
compiler turns
offsetofandsizeofinto immediate constants the caller never re-checks. - The six axes along which layout breaks: size/offset, alignment, enum values, union size, bitfields/flexible arrays, and pointer/array element types.
- The exact rule that makes one union-field addition safe and another breaking.
- How to design types whose layout you can evolve forever.
Prerequisites: Part 1 — Foundations. The "compiler bakes it in" idea from there is the engine behind everything here.
The core mechanism: layout is a frozen constant in the caller¶
Every aggregate type published in a header is a byte-level contract: its size, its members' offsets, its alignment, and — for C++ — its vtable shape. The consumer does not re-read that contract at load time. The compiler bakes it into every caller:
offsetof(s, field)becomes an immediate displacement in amovinstruction.sizeof(T)becomes an allocation constant (malloc(12), a stack frame size, an array element stride).- array indexing multiplies the index by a stride chosen at compile time.
struct Point { int x; int y; }; // v1: sizeof = 8
caller code, compiled against v1:
mov dword [rbp-8], 0 ; p.x -> offset 0
mov dword [rbp-4], 0 ; p.y -> offset 4 ← "4" is now a constant in the binary
sub rsp, 8 ; reserve sizeof(Point) = 8 ← "8" is now a constant too
When the library's next release shifts even one offset, every call site compiled against the old layout reads or writes at the wrong address — silently, without a linker error, usually without a crash until adjacent memory is eventually read back.
1. Struct / class size and offsets¶
The most common layout break is appending, inserting, or reordering a field.
/* v1 */ struct Point { int x; int y; }; // sizeof = 8
/* v2 */ struct Point { int x; int y; int z; }; // sizeof = 12
Every caller that allocates Point on the stack or inside another struct now
under-allocates; every caller passing Point by value sends 8 bytes while the
library reads 12. The C++ analog (case14)
grows a char data[64] buffer to char data[128]: new Buffer() hands the
constructor a 64-byte allocation that it zero-fills with 128 bytes, smashing
whatever lives next on the heap.
Layout breaks are also transitive through embedding and inheritance. Adding
int extra_field to class Base (case43)
shifts Derived::value from offset 12 to 16 — every subclass member in the
ecosystem silently moves.
How abicheck sees it
type_size_changed / data-member offset deltas → 🔴 BREAKING, from
DWARF or headers. case07,
case14,
case40 (five field-level
mutations in one struct to show how "just one field" cascades).
2. Alignment and packing¶
Alignment is the second axis — fields and sizes can be identical while the alignment requirement changes.
/* v1 */ struct __attribute__((aligned(8))) CacheBlock { /* ... */ };
/* v2 */ struct __attribute__((aligned(64))) CacheBlock { /* ... */ };
v1 callers allocate CacheBlock on 8-byte boundaries. v2 code may emit
aligned-load instructions (vmovdqa) and fault on misaligned access —
SIGSEGV on x86-64 Linux, SIGBUS on strict-alignment platforms — and
malloc (typically 16-byte aligned) can no longer hand out correctly-aligned
storage without aligned_alloc.
Packing is the inverse:
case56 adds #pragma pack(1),
eliminating all padding. sizeof shrinks, every field after the first moves,
and on ARM/SPARC the now-unaligned int access traps. Because alignas and
#pragma pack propagate across translation-unit boundaries through the header, a
single-line change rewrites offsets for every TU that includes it.
3. Enum value stability¶
Enumerations look like constants, but they are part of the wire format — the integer values get persisted to disk, sent over the network, and switched on.
/* v1 */ enum Color { RED=0, GREEN=1, BLUE=2 };
/* v2 */ enum Color { RED=0, YELLOW=1, GREEN=2, BLUE=3 }; // inserted YELLOW
Inserting YELLOW in the middle shifts GREEN and BLUE by one. Every binary
that tested == 1 for green now hits the yellow branch.
Three breaking variants and one safe one:
| Change | Effect | Verdict |
|---|---|---|
| Insert member in the middle (case08) | shifts later values | 🔴 BREAKING |
Reassign a value, same name, e.g. ERROR=1→ERROR=99 (case20) |
protocol rewrite with no negotiation | 🔴 BREAKING |
| Remove a member (case19) | persisted/transmitted value becomes undefined | 🔴 BREAKING |
| Append at the end (case25) | existing values unchanged | 🟢 COMPATIBLE |
A subtler trap: case57
adds a sentinel = 0x100000000LL that forces the compiler to widen the
underlying type from int to long. sizeof(Color) jumps from 4 to 8, and
every struct embedding Color silently grows and relocates its later fields.
4. Union layout¶
Unions share offset 0 across all members, so adding a variant doesn't move existing fields — but the union's size equals its largest member, and that size propagates.
/* breaking */ union Value { int i; float f; }; // sizeof 4
union Value { int i; float f; double d; }; // sizeof 8 ← grows
The rule is precise:
A new union member is safe if and only if
sizeof(new) <= sizeof(old_union)andalignof(new) <= alignof(old_union).
That's why case26 (adds double,
grows 4→8) is 🔴 BREAKING, while
case26b (adds int to a
union already 8 bytes wide) is 🟢 COMPATIBLE. Removing a variant
(case24) is a semantic break
even at unchanged size: consumers compiled to write d.f = 3.14f have no
replacement access path.
5. Bitfields and flexible arrays¶
Bitfields are the most fragile layout primitive, because storage-unit
allocation is implementation-defined.
case63 widens mode from 3 bits to
5 inside a 32-bit register map. sizeof is unchanged — naive size checks pass
— but channel, priority, and reserved all shift two bit positions, so
every consumer reads corrupt values with no crash and no diagnostic. This bites
hardest in hardware-register maps and protocol headers, exactly where bitfields
are most used.
Flexible array members have the opposite static-size profile, same failure:
case70 changes
float data[] to double data[]. The fixed header sizeof(Packet) compares
equal, but every caller that allocated
sizeof(Packet) + count*sizeof(float) now holds half the needed memory, and
p->data[i] indexes with stride 8 instead of 4.
6. Pointer chains and arrays¶
Multi-level type changes propagate through indirection.
case45 changes
float data[4][4] to double data[4][4] inside a struct: the element type
changes, the struct doubles from 72 to 136 bytes, the stride doubles, and the
accessor functions change return/parameter widths.
case46 reaches further — a
function returning int ** becomes long **, a two-level chain where only the
ultimate pointee changes; callers that write an int through the chain write 4
bytes into what v2 treats as an 8-byte cell.
How abicheck sees it
abicheck walks pointer and array types structurally during
func_return_changed and func_params_changed detection — a surface-level
"both sides return a pointer" comparison would miss these.
How to defend type layout¶
Design patterns for Part 3
- Opaque handles. Expose only
struct foo *; define the struct in a.cfile. Callers can't takesizeoforoffsetof, so layout is free to change. OpenSSL 1.1.0's move toEVP_MD_CTX_new/_freeopaque handles is the canonical precedent. - Pimpl (C++). The public class holds one
d_ptrto a privateImpl; all state lives inImpl, sosizeofof the public class never changes. Qt enforces this across every public class. - Reserved padding. End every public struct with
void *reserved[N]/uint64_t _pad[N]; future releases repurpose slots without changingsizeofor shifting offsets (POSIXpthread_attr_t, kernel UAPI). - Freeze the enum underlying type. Write
enum class Color : int32_tin C++; in C keep values inintrange or add an explicit sentinel. Never let a new enumerator silently widen the type. - Append-only evolution. Reordering, inserting, or removing a field is
always breaking. Appending at the end is not automatically safe either:
it is ABI-safe only when the type is never caller-allocated, embedded
by value, or passed by value, or the API has an explicit
size/version-negotiation contract (a
struct_size/abi_versionfield, or reserved storage the caller never sizes itself). If any consumer cansizeof(T),new T, or hold aTby value, appending grows the type and breaks them — see Part 7 — Reserved padding (union analog: case26 vs case26b).
These patterns are developed in full, with code, in Part 7 — Designing for Stability.
Next¶
C++ adds a whole second layer of binary contract on top of plain layout: vtables, mangled qualifiers, templates, and the trivially-copyable trait that silently flips the calling convention.
See also: ABI Cheat Sheet · BREAKING examples · COMPATIBLE examples