.. SPDX-License-Identifier: MIT OR Apache-2.0
SPDX-FileCopyrightText: The Coding Guidelines Subcommittee Contributors
.. default-domain:: coding-guidelines
.. guideline:: Do not read from union fields that may contain uninitialized bytes
:id: gui_UnionPartialInit
:category: required
:status: draft
:release: 1.85.0
:decidability: undecidable
:scope: expression
:tags: unions, initialization, undefined-behavior
Do not read from a union field unless all bytes of that field have been explicitly
initialized. Partial initialization of a union's composite field leaves some bytes
in an uninitialized state, and reading those bytes is undefined behavior.
When working with unions:
- Initialize all bytes of a field before reading from it
- Do not assume that initializing one variant preserves the initialized state of another
- Do not rely on prior initialization of a union before reassignment
- Use ``MaybeUninit`` with proper initialization patterns rather than custom unions for managing uninitialized memory
You can read a union field that was never written directly, provided that its backing bytes were initialized through another field and that:
- The resulting value has an unspecified but well-defined bit pattern.
- Interpreting that value still complies with the validity requirements of the accessed type
  (e.g., no invalid enum discriminants, no invalid pointer values, etc.).
For example, reading a ``u32`` field whose bytes were initialized through a ``[u8; 4]`` field is allowed;
reading a ``bool`` field whose byte was written through a ``u8`` field is undefined behavior unless that byte is 0 or 1, because not all bit patterns are valid for ``bool``.
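The validity requirement on the accessed type can be sketched as a checked read. The union ``Flag`` and the helper ``read_flag`` are hypothetical names introduced only for illustration. The idea is to inspect the byte through a field that accepts every bit pattern before reading it through the ``bool`` field.

.. code-block:: rust

    // Hypothetical union for illustration: both fields share one byte.
    union Flag {
        byte: u8,
        flag: bool,
    }

    /// Returns the `bool` view only when the stored byte is a valid `bool`.
    fn read_flag(u: &Flag) -> Option<bool> {
        // SAFETY: in this sketch a `Flag` is always built through one of its
        // one-byte fields, so the byte is initialized, and every initialized
        // bit pattern is a valid `u8`.
        let raw = unsafe { u.byte };
        match raw {
            // SAFETY: 0 and 1 are the only valid `bool` representations.
            0 | 1 => Some(unsafe { u.flag }),
            _ => None,
        }
    }

With this helper, ``read_flag(&Flag { byte: 255 })`` yields ``None`` instead of producing an invalid ``bool``.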
.. rationale::
:id: rat_UnionPartialInitReason
:status: draft
Unions in Rust allow multiple fields to share the same memory. When a union field
is a composite type (tuple, struct, array), writing to only some components leaves
the remaining bytes in an indeterminate state. Reading these uninitialized bytes
is undefined behavior [RUST-REF-UB]_.
This issue is particularly insidious because:
* **Silent data corruption**: The program may appear to work, reading stale or
garbage values that happen to be "reasonable" in testing.
* **Optimization interactions**: The compiler may merge, inline, or deduplicate
functions in ways that change which code paths execute. A function that fully
initializes a union may be merged with one that partially initializes it,
causing UB to appear in previously-safe code paths [LLVM-MERGE]_.
* **Function pointer comparisons**: Relying on function pointer equality to
select code paths is unreliable (see gui_FnPtrEquality). Combined with partial
initialization, this can lead to UB being introduced through seemingly unrelated
optimizations.
* **Reassignment resets initialization**: Assigning a new value to a union
(e.g., ``*u = MyUnion { uninit: () }``) does not preserve the initialized
state of other fields. All fields must be considered uninitialized after
such an assignment.
The Rust memory model requires that all bytes be initialized before a typed
read occurs. There is no exception for "partial" reads of composite types —
the entire field must be valid.
Unions do behave like C unions in one respect:
any union field may be read even if it was never written, as long as its bytes were initialized through another field.
The resulting bytes must, however, form a valid representation for the field's type,
which is not guaranteed if the union contains arbitrary data.
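Where a union would serve only to reinterpret fully initialized bytes, the standard library's safe conversions are one way to avoid the unsafe read altogether. The following is a sketch under that assumption; the helper names ``bytes_to_u32`` and ``bits_to_f32`` are hypothetical.

.. code-block:: rust

    /// Reinterprets four initialized bytes as a `u32` in native byte order,
    /// the same result a `[u8; 4]`/`u32` union read would give.
    fn bytes_to_u32(bytes: [u8; 4]) -> u32 {
        u32::from_ne_bytes(bytes)
    }

    /// Reinterprets a `u32` bit pattern as an `f32`.
    fn bits_to_f32(bits: u32) -> f32 {
        f32::from_bits(bits)
    }

Because the inputs are ordinary initialized values, neither function can observe uninitialized bytes.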
.. non_compliant_example::
:id: non_compl_ex_PartialInit1
:status: draft
This noncompliant example partially initializes a tuple field, leaving the second element uninitialized.
.. code-block:: rust

    union MyMaybeUninit {
        uninit: (),
        init: (u8, u8),
    }

    fn write_first(a: &mut MyMaybeUninit) {
        *a = MyMaybeUninit { uninit: () };
        unsafe { a.init.0 = 1; } // Only initializes the first byte
    }

    fn main() {
        let mut a = MyMaybeUninit { init: (0, 0) };
        write_first(&mut a);
        // Undefined behavior reading uninitialized value
        println!("{}", unsafe { a.init.1 }); // noncompliant
    }
.. non_compliant_example::
:id: non_compl_ex_PartialInit2
:status: draft
This noncompliant example assumes prior initialization is preserved after reassignment.
.. code-block:: rust

    union Data {
        uninit: (),
        raw: [u8; 4],
        value: u32,
    }

    fn partial_update(d: &mut Data) {
        // Reassignment through the zero-sized field invalidates all prior
        // initialization: no byte of 'raw' is initialized afterwards
        *d = Data { uninit: () };
        // Only update the first two bytes
        unsafe {
            d.raw[0] = 0xAB;
            d.raw[1] = 0xCD;
        }
    }

    fn main() {
        let mut d = Data { value: 0xFFFFFFFF };
        partial_update(&mut d);
        // 'raw[2]' and 'raw[3]' are uninitialized after reassignment
        println!("{:?}", unsafe { d.raw }); // noncompliant
    }
.. non_compliant_example::
:id: non_compl_ex_PartialInit3
:status: draft
This noncompliant example combines function pointer comparison with partial initialization,
creating subtle undefined behavior that may only manifest after optimization.
.. code-block:: rust

    union MyMaybeUninit {
        uninit: (),
        init: (u8, u8),
    }

    #[no_mangle]
    fn write_first(a: &mut MyMaybeUninit) {
        *a = MyMaybeUninit { uninit: () };
        unsafe { a.init.0 = 1; }
    }

    #[no_mangle]
    fn write_both(a: &mut MyMaybeUninit) {
        *a = MyMaybeUninit { uninit: () };
        unsafe {
            a.init.0 = 1;
            a.init.1 = 2;
        }
    }

    fn main() {
        let mut a = MyMaybeUninit { init: (0, 0) };
        // Non-compliant: function pointer comparison is unreliable,
        // and 'write_first' leaves 'a.init.1' uninitialized
        if write_first as usize == write_both as usize {
            write_first(&mut a);
        }
        // UB if the branch was taken (functions may be merged by optimizer)
        println!("{}", unsafe { a.init.1 }); // noncompliant
    }
.. compliant_example::
:id: compl_ex_FullInit1
:status: draft
This compliant example initializes all bytes of the field before reading.
.. code-block:: rust

    union MyMaybeUninit {
        uninit: (),
        init: (u8, u8),
    }

    fn write_both(a: &mut MyMaybeUninit) {
        *a = MyMaybeUninit { uninit: () };
        unsafe {
            a.init.0 = 1;
            a.init.1 = 2; // Initialize all bytes
        }
    }

    fn main() {
        let mut a = MyMaybeUninit { init: (0, 0) };
        write_both(&mut a);
        // Both bytes are initialized
        println!("{}", unsafe { a.init.1 }); // compliant
    }
.. compliant_example::
:id: compl_ex_FullInit2
:status: draft
This compliant example uses ``MaybeUninit`` with proper initialization patterns.
.. code-block:: rust

    use std::mem::MaybeUninit;

    fn init_tuple() -> (u8, u8) {
        let mut data: MaybeUninit<(u8, u8)> = MaybeUninit::uninit();
        unsafe {
            let ptr = data.as_mut_ptr();
            (*ptr).0 = 1;
            (*ptr).1 = 2; // Initialize all fields
            // data is fully initialized before call to 'assume_init'
            data.assume_init()
        }
    }

    fn main() {
        let result = init_tuple();
        println!("{}, {}", result.0, result.1); // compliant
    }
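When the whole value is available at once, a single write can replace the field-by-field writes shown above. This is a minimal sketch using ``MaybeUninit::write``; the function name ``init_tuple_at_once`` is hypothetical.

.. code-block:: rust

    use std::mem::MaybeUninit;

    fn init_tuple_at_once() -> (u8, u8) {
        let mut data: MaybeUninit<(u8, u8)> = MaybeUninit::uninit();
        // `write` stores the complete tuple, so no byte is left uninitialized.
        data.write((1, 2));
        // SAFETY: `data` was fully initialized by the `write` call above.
        unsafe { data.assume_init() }
    }

Writing the value in one step leaves no intermediate state in which only part of the tuple is initialized.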
.. compliant_example::
:id: compl_ex_FullInit3
:status: draft
This compliant example initializes through the composite field directly.
.. code-block:: rust

    union Data {
        raw: [u8; 4],
        value: u32,
    }

    fn full_init(d: &mut Data) {
        // Initialize entire field at once
        *d = Data { raw: [0xAB, 0xCD, 0xEF, 0x12] };
    }

    fn main() {
        let mut d = Data { value: 0 };
        full_init(&mut d);
        // All bytes in 'd' are initialized
        println!("{:?}", unsafe { d.raw }); // compliant
    }
.. compliant_example::
:id: compl_ex_FullInit4
:status: draft
This compliant example avoids relying on function pointer comparisons and tracks the initialization level explicitly.
.. code-block:: rust

    union MyMaybeUninit {
        uninit: (),
        init: (u8, u8),
    }

    enum InitLevel {
        Partial,
        Full,
    }

    fn write_first(a: &mut MyMaybeUninit) {
        *a = MyMaybeUninit { uninit: () };
        unsafe { a.init.0 = 1; }
    }

    fn write_both(a: &mut MyMaybeUninit) {
        *a = MyMaybeUninit { uninit: () };
        unsafe {
            a.init.0 = 1;
            a.init.1 = 2;
        }
    }

    fn main() {
        let mut a = MyMaybeUninit { init: (0, 0) };
        let level = InitLevel::Full; // Explicit tracking, not pointer comparison
        match level {
            InitLevel::Full => {
                write_both(&mut a);
                // Compliant: safe to read both fields
                println!("{}", unsafe { a.init.1 });
            }
            InitLevel::Partial => {
                write_first(&mut a);
                // Only read the initialized field
                println!("{}", unsafe { a.init.0 });
            }
        }
    }
.. compliant_example::
:id: compl_ex_Ke869nSXuShU
:status: draft
Types such as ``u8``, ``u16``, ``u32``, and ``i128`` allow all possible bit patterns.
Provided the memory is initialized, there is no undefined behavior.
.. rust-example::

    union U {
        n: u32,
        bytes: [u8; 4],
    }

    # fn main() {
    let u = U { bytes: [0xFF, 0xEE, 0xDD, 0xCC] };
    let n = unsafe { u.n }; // OK: all bit patterns are valid for u32
    # }
.. compliant_example::
:id: compl_ex_Ke869nSXuShT
:status: draft
The following code writes one union field and reads another.
Every bit pattern is a valid ``f32``, so the read is well defined:
.. rust-example::

    union U {
        x: u32,
        y: f32,
    }

    # fn main() {
    let u = U { x: 123 }; // write to one field
    let f = unsafe { u.y }; // reading the other field is allowed
    # }
.. non_compliant_example::
:id: non_compl_ex_Qb5GqYTP6db3
:status: draft
Even though unions allow reads of any field, not all bit patterns are valid for a ``bool``.
Unions do not relax type validity requirements.
Only the read itself is allowed;
the resulting bytes must still be a valid bool.
.. rust-example::

    union U {
        b: bool,
        x: u8,
    }

    # fn main() {
    let u = U { x: 255 }; // 255 is not a valid bool representation
    let b = unsafe { u.b }; // UB: invalid bool
    # }
.. bibliography::
:id: bib_UnionFieldValidity
:status: draft
.. list-table::
    :header-rows: 0
    :widths: auto
    :class: bibliography-table

    * - .. [RUST-REF-UB]
      - The Rust Project Developers. "Behavior Considered Undefined." *The Rust
        Reference*, n.d.
        https://doc.rust-lang.org/reference/behavior-considered-undefined.html.
    * - .. [RUST-REF-UNION]
      - The Rust Project Developers. "Unions." *The Rust Reference*, n.d.
        https://doc.rust-lang.org/reference/items/unions.html.
    * - .. [LLVM-MERGE]
      - LLVM Project. "MergeFunctions Pass." *LLVM Documentation*, n.d.
        https://llvm.org/docs/MergeFunctions.html.
    * - .. [UCG-VALIDITY]
      - Rust Unsafe Code Guidelines Working Group. "Validity and Safety
        Invariant." *Rust Unsafe Code Guidelines*, n.d.
        https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#validity-and-safety-invariant.