@@ -7,7 +7,7 @@ are exhaustive.
77## Pattern usefulness
88
99The central question that usefulness checking answers is:
10- "in this match expression, is that branch reachable ?".
10+ "in this match expression, is that branch redundant?".
1111More precisely, it boils down to computing whether,
1212given a list of patterns we have already seen,
1313a given new pattern might match any new value.
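
As a rough illustration of that definition, here is a brute-force sketch over the tiny type `(bool, bool)`. It is not how rustc computes usefulness (the real algorithm avoids enumerating values), and the names `BoolPat`, `PairPat`, `matches_pair` and `is_useful` are hypothetical, introduced only for this example.

```rust
/// A pattern over `bool`: `None` plays the role of the wildcard `_`.
type BoolPat = Option<bool>;
/// A pattern over `(bool, bool)`.
type PairPat = (BoolPat, BoolPat);

fn matches_bool(v: bool, p: BoolPat) -> bool {
    // A wildcard matches anything; a literal matches only itself.
    p.map_or(true, |expected| expected == v)
}

fn matches_pair(v: (bool, bool), p: PairPat) -> bool {
    matches_bool(v.0, p.0) && matches_bool(v.1, p.1)
}

/// `new` is useful if some value matches it while matching none of the
/// patterns in `seen`. We simply try all four values of `(bool, bool)`.
fn is_useful(seen: &[PairPat], new: PairPat) -> bool {
    let bools = [false, true];
    bools.iter().any(|&a| {
        bools.iter().any(|&b| {
            let v = (a, b);
            matches_pair(v, new) && !seen.iter().any(|&p| matches_pair(v, p))
        })
    })
}
```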
@@ -42,10 +42,8 @@ because a match expression can return a value).
4242
4343## Where it happens
4444
45- This check is done to any expression that desugars to a match expression in MIR.
46- That includes actual `match` expressions,
47- but also anything that looks like pattern matching,
48- including `if let`, destructuring `let`, and similar expressions.
45+ This check is done anywhere you can write a pattern: `match` expressions, `if let`, `let else`,
46+ plain `let`, and function arguments.
4947
5048```rust
5149// `match`
@@ -80,9 +78,141 @@ fn foo(Foo { x, y }: Foo) {
8078
8179## The algorithm
8280
83- Exhaustiveness checking is implemented in [`check_match`].
84- The core of the algorithm is in [`usefulness`].
81+ Exhaustiveness checking is run before MIR building in [`check_match`].
82+ It is implemented in the [`rustc_pattern_analysis`] crate,
83+ with the core of the algorithm in the [`usefulness`] module.
8584That file contains a detailed description of the algorithm.
8685
86+ ## Important concepts
87+
88+ ### Constructors and fields
89+
90+ In the value `Pair(Some(0), true)`, `Pair` is called the constructor of the value, and `Some(0)` and
91+ `true` are its fields. Every matchable value can be decomposed in this way. Examples of
92+ constructors are: `Some`, `None`, `(,)` (the 2-tuple constructor), `Foo {..}` (the constructor for
93+ a struct `Foo`), and `2` (the constructor for the number `2`).
94+
95+ Each constructor takes a fixed number of fields; this is called its arity. `Pair` and `(,)` have
96+ arity 2, `Some` has arity 1, `None` and `42` have arity 0. Each type has a known set of
97+ constructors. Some types have many constructors (like `u64`) or even infinitely many (like `&str`
98+ and `&[T]`).
99+
100+ Patterns are similar: `Pair(Some(_), _)` has constructor `Pair` and two fields. The difference is
101+ that we get some extra pattern-only constructors, namely: the wildcard `_`, variable bindings,
102+ integer ranges like `0..=10`, and variable-length slices like `[_, .., _]`. We treat or-patterns
103+ separately.
104+
105+ Now to check if a value `v` matches a pattern `p`, we check if `v`'s constructor matches `p`'s
106+ constructor, then recursively compare their fields if necessary. A few representative examples:
107+
108+ - `matches!(v, _) := true`
109+ - `matches!((v0, v1), (p0, p1)) := matches!(v0, p0) && matches!(v1, p1)`
110+ - `matches!(Foo { a: v0, b: v1 }, Foo { a: p0, b: p1 }) := matches!(v0, p0) && matches!(v1, p1)`
111+ - `matches!(Ok(v0), Ok(p0)) := matches!(v0, p0)`
112+ - `matches!(Ok(v0), Err(p0)) := false` (incompatible variants)
113+ - `matches!(v, 1..=100) := matches!(v, 1) || ... || matches!(v, 100)`
114+ - `matches!([v0], [p0, .., p1]) := false` (incompatible lengths)
115+ - `matches!([v0, v1, v2], [p0, .., p1]) := matches!(v0, p0) && matches!(v2, p1)`
116+
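The rules above can be sketched as a small recursive function. This is a toy model under simplifying assumptions, not the representation used by `rustc_pattern_analysis`: the `Value` and `Pat` types and the `value_matches` and `all_match` helpers are hypothetical, constructors are identified by name only, and or-patterns and bindings are left out.

```rust
/// A value is a constructor applied to fields.
enum Value {
    Int(i64),
    /// A named constructor with its fields, e.g. `Pair(Some(0), true)`.
    Ctor(&'static str, Vec<Value>),
    /// A slice value of known length.
    Slice(Vec<Value>),
}

/// Patterns mirror values, plus the pattern-only constructors.
enum Pat {
    Wild,
    Int(i64),
    /// Inclusive integer range `lo..=hi`.
    Range(i64, i64),
    Ctor(&'static str, Vec<Pat>),
    /// Variable-length slice `[prefix.., .., suffix..]`.
    VarLenSlice(Vec<Pat>, Vec<Pat>),
}

fn value_matches(v: &Value, p: &Pat) -> bool {
    match (v, p) {
        (_, Pat::Wild) => true,
        (Value::Int(n), Pat::Int(m)) => n == m,
        (Value::Int(n), Pat::Range(lo, hi)) => lo <= n && n <= hi,
        // Same constructor: compare the fields pairwise.
        (Value::Ctor(vc, vs), Pat::Ctor(pc, ps)) => vc == pc && all_match(vs, ps),
        (Value::Slice(vs), Pat::VarLenSlice(prefix, suffix)) => {
            // The value must be long enough for the prefix and the suffix.
            vs.len() >= prefix.len() + suffix.len()
                && all_match(&vs[..prefix.len()], prefix)
                && all_match(&vs[vs.len() - suffix.len()..], suffix)
        }
        // Incompatible constructors.
        _ => false,
    }
}

fn all_match(vs: &[Value], ps: &[Pat]) -> bool {
    vs.len() == ps.len() && vs.iter().zip(ps).all(|(v, p)| value_matches(v, p))
}
```
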
117+ This concept is absolutely central to pattern analysis. The [`constructor`] module provides
118+ functions to extract, list and manipulate constructors. This is a useful enough concept that
119+ variations of it can be found in other places of the compiler, like in the MIR-lowering of a match
120+ expression and in some clippy lints.
121+
122+ ### Constructor grouping and splitting
123+
124+ The pattern-only constructors (`_`, ranges and variable-length slices) each stand for a set of
125+ normal constructors, e.g. `_: Option<T>` stands for the set {`None`, `Some`} and `[_, .., _]` stands
126+ for the infinite set {`[,]`, `[,,]`, `[,,,]`, ...} of the slice constructors of arity >= 2.
127+
128+ In order to manage these constructors, we keep them as grouped as possible. For example:
129+
130+ ```rust
131+ match (0, false) {
132+ (0 ..=100, true) => {}
133+ (50..=150, false) => {}
134+ (0 ..=200, _) => {}
135+ }
136+ ```
137+
138+ In this example, all of `0`, `1`, .., `49` match the same arms, and thus can be treated as a group.
139+ In fact, in this match, the only ranges we need to consider are: `0..50`, `50..=100`,
140+ `101..=150`, `151..=200` and `201..`. Similarly:
141+
142+ ```rust
143+ enum Direction { North, South, East, West }
144+ # let wind = (Direction::North, 0u8);
145+ match wind {
146+ (Direction::North, 50..) => {}
147+ (_, _) => {}
148+ }
149+ ```
150+
151+ Here we can treat all the non-`North` constructors as a group, giving us only two cases to handle:
152+ `North`, and everything else.
153+
154+ This is called "constructor splitting" and is crucial to having exhaustiveness run in reasonable
155+ time.
156+
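To make the range case concrete, here is a minimal sketch of how the groups in the first example could be computed for a single column of unsigned integers. It assumes the column has type `u32` and only looks at range boundaries; the function name `split_ranges` is hypothetical, and the real implementation in rustc is more general (it also handles enums, slices and wildcards).

```rust
/// Splits the whole `u32` domain into sub-ranges (inclusive on both ends)
/// such that all integers in one sub-range are matched by exactly the
/// same subset of `arm_ranges`.
fn split_ranges(arm_ranges: &[(u32, u32)]) -> Vec<(u32, u32)> {
    // A group can only start at 0, at the start of an arm's range,
    // or just past the end of an arm's range.
    let mut starts = vec![0u32];
    for &(lo, hi) in arm_ranges {
        starts.push(lo);
        // `checked_add` avoids overflow when a range ends at `u32::MAX`.
        if let Some(next) = hi.checked_add(1) {
            starts.push(next);
        }
    }
    starts.sort_unstable();
    starts.dedup();

    // Each group runs from its start to just before the next start.
    let mut groups = Vec::new();
    for (i, &start) in starts.iter().enumerate() {
        let end = match starts.get(i + 1) {
            Some(&next_start) => next_start - 1,
            None => u32::MAX,
        };
        groups.push((start, end));
    }
    groups
}
```
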
157+ ### Usefulness vs reachability in the presence of empty types
158+
159+ This is likely the subtlest aspect of exhaustiveness. To be fully precise, a match doesn't operate
160+ on a value, it operates on a place. In certain unsafe circumstances, it is possible for a place to
161+ not contain valid data for its type. This has subtle consequences for empty types. Take the
162+ following:
163+
164+ ```rust
165+ enum Void {}
166+ let x: u8 = 0;
167+ let ptr: *const Void = &x as *const u8 as *const Void;
168+ unsafe {
169+ match *ptr {
170+ _ => println!("Reachable!"),
171+ }
172+ }
173+ ```
174+
175+ In this example, `ptr` is a valid pointer pointing to a place with invalid data. The `_` pattern
176+ does not look at the contents of the place `*ptr`, so this code is ok and the arm is taken. In other
177+ words, despite the place we are inspecting being of type `Void`, there is a reachable arm. If the
178+ arm had a binding however:
179+
180+ ```rust
181+ # #[derive(Copy, Clone)]
182+ # enum Void {}
183+ # let x: u8 = 0;
184+ # let ptr: *const Void = &x as *const u8 as *const Void;
185+ # unsafe {
186+ match *ptr {
187+ _a => println!("Unreachable!"),
188+ }
189+ # }
190+ ```
191+
192+ Here the binding loads the value of type `Void` from the `*ptr` place. In this example, this causes
193+ UB since the data is not valid. In the general case, this asserts validity of the data at `*ptr`.
194+ Either way, this arm will never be taken.
195+
196+ Finally, let's consider the empty match `match *ptr {}`. If we consider this exhaustive, then
197+ having invalid data at `*ptr` must be UB. In other words, the empty match is semantically
198+ equivalent to the `_a => ...` match. In the interest of explicitness, we prefer the case with an
199+ arm, hence we won't tell the user to remove the `_a` arm. The `_a` arm is therefore
200+ unreachable yet not redundant. This is why we lint on redundant arms rather than unreachable
201+ arms, despite the fact that the lint says "unreachable".
202+
203+ These considerations only affect certain places, namely those that can contain non-valid data
204+ without UB. These are: pointer dereferences, reference dereferences, and union field accesses. We
205+ track during exhaustiveness checking whether a given place is known to contain valid data.
206+
207+ Having said all that, the current implementation of exhaustiveness checking does not follow the
208+ above considerations. On stable, empty types are for the most part treated as non-empty. The
209+ [`exhaustive_patterns`] feature errs on the other end: it allows omitting arms that could be
210+ reachable in unsafe situations. The [`never_patterns`] experimental feature aims to fix this and
211+ permit the correct behavior of empty types in patterns.
212+
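In safe code, one way to stay independent of how the compiler treats empty types is to keep an explicit arm and discharge it with an empty match on the bound value. The sketch below illustrates this for a by-value scrutinee; `Void` and `unwrap_infallible` are hypothetical names chosen for the example, and the `allow` attribute is there only in case a given compiler version reports the arm as unreachable.

```rust
/// An empty type: it has no constructors, so it has no valid values.
enum Void {}

#[allow(unreachable_patterns)]
fn unwrap_infallible(r: Result<u32, Void>) -> u32 {
    match r {
        Ok(n) => n,
        // `never` is bound by value, so a valid `Void` would have to exist
        // for this arm to run. The empty match has no arms to write because
        // `Void` has no constructors.
        Err(never) => match never {},
    }
}
```
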
87213[`check_match`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_build/thir/pattern/check_match/index.html
88- [`usefulness`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_build/thir/pattern/usefulness/index.html
214+ [`rustc_pattern_analysis`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_pattern_analysis/index.html
215+ [`usefulness`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_pattern_analysis/usefulness/index.html
216+ [`constructor`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_pattern_analysis/constructor/index.html
217+ [`never_patterns`]: https://github.com/rust-lang/rust/issues/118155
218+ [`exhaustive_patterns`]: https://github.com/rust-lang/rust/issues/51085