Update patterns.md

Neal Gafter 2018-12-19 17:04:12 -08:00 committed by GitHub
parent 1f0ca65d22
commit 443f521cb7


@ -1,20 +1,10 @@
# pattern matching
* [x] Proposed
* [ ] Prototype:
* [ ] Implementation:
* [ ] Specification:
# Recursive Pattern Matching
## Summary
[summary]: #summary
Pattern matching extensions for C# enable many of the benefits of algebraic data types and pattern matching from functional languages, but in a way that smoothly integrates with the feel of the underlying language. The basic features are: [record types](records.md), which are types whose semantic meaning is described by the shape of the data; and pattern matching, which is a new expression form that enables extremely concise multilevel decomposition of these data types. Elements of this approach are inspired by related features in the programming languages [F#](http://www.msr-waypoint.net/pubs/79947/p29-syme.pdf "Extensible Pattern Matching Via a Lightweight Language") and [Scala](http://lampwww.epfl.ch/~emir/written/MatchingObjectsWithPatterns-TR.pdf "Matching Objects With Patterns").
## Motivation
[motivation]: #motivation
Why are we doing this? What use cases does it support? What is the expected outcome?
## Detailed design
[design]: #detailed-design
@ -24,6 +14,10 @@ The `is` operator is extended to test an expression against a *pattern*.
```antlr
relational_expression
: is_pattern_expression
;
is_pattern_expression
: relational_expression 'is' pattern
;
```
@ -34,257 +28,202 @@ Every *identifier* of the pattern introduces a new local variable that is *defin
### Patterns
Patterns are used in the `is` operator and in a *switch_statement* to express the shape of data against which incoming data is to be compared. Patterns may be recursive so that parts of the data may be matched against sub-patterns.
Patterns are used in the *is_pattern* operator, in a *switch_statement*, and in a *switch_expression* to express the shape of data against which incoming data (which we call the input value) is to be compared. Patterns may be recursive so that parts of the data may be matched against sub-patterns.
```antlr
pattern
: type_pattern
| constant_pattern
| discard_pattern
: declaration_pattern
| constant_pattern
| var_pattern
| recursive_pattern
;
type_pattern
: type identifier
;
discard_pattern
: '_'
;
var_pattern
: 'var' identifier
;
| deconstruction_pattern
| property_pattern
;
declaration_pattern
: type simple_designation
;
constant_pattern
: shift_expression
: expression
;
var_pattern
: 'var' designation
;
recursive_pattern
: positional_pattern
| property_pattern
;
positional_pattern
: type? '(' subpattern_list? ')'
;
subpattern_list
: subpattern
| subpattern ',' subpattern_list
;
deconstruction_pattern
: type? '(' subpatterns? ')' property_subpattern? simple_designation?
;
subpatterns
: subpattern
| subpattern ',' subpatterns
;
subpattern
: argument_name? pattern
;
property_pattern
: type? '{' property_subpattern_list? '}'
| type identifier '{' property_subpattern_list? '}'
| var identifier '{' property_subpattern_list? '}'
;
property_subpattern_list
: property_subpattern
| property_subpattern ',' property_subpattern_list
;
: pattern
| identifier ':' pattern
;
property_subpattern
: identifier 'is' pattern
;
: '{' subpatterns? '}'
;
property_pattern
: type? property_subpattern simple_designation?
;
simple_designation
: single_variable_designation
| discard_designation
;
```
> Note: There is technically an ambiguity between *type* in an `is-expression` and *constant_pattern*, either of which might be a valid parse of a qualified identifier. We try to bind it as a type for compatibility with previous versions of the language; only if that fails do we resolve it as we do in other contexts, to the first thing found (which must be either a constant or a type). This ambiguity is only present on the right-hand-side of an `is` expression.
> Note: There is technically an ambiguity between *type* in an `is-expression` and *constant_pattern*, either of which might be a valid parse of a qualified identifier. We try to bind it as a type for compatibility with previous versions of the language; only if that fails do we resolve it as we do an expression in other contexts, to the first thing found (which must be either a constant or a type). This ambiguity is only present on the right-hand-side of an `is` expression.
#### Type Pattern
The *type_pattern* both tests that an expression is of a given type and casts it to that type if the test succeeds. This introduces a local variable of the given type named by the given identifier. That local variable is *definitely assigned* when the result of the pattern-matching operation is true.
#### Declaration Pattern
```antlr
type_pattern
: type identifier
declaration_pattern
: type simple_designation
;
```
The runtime semantic of this expression is that it tests the runtime type of the left-hand *relational_expression* operand against the *type* in the pattern. If it is of that runtime type (or some subtype), the result of the `is operator` is `true`. It declares a new local variable named by the *identifier* that is assigned the value of the left-hand operand when the result is `true`.
The *declaration_pattern* both tests that an expression is of a given type and casts it to that type if the test succeeds. This may introduce a local variable of the given type named by the given identifier, if the designation is a *single_variable_designation*. That local variable is *definitely assigned* when the result of the pattern-matching operation is `true`.
Certain combinations of static type of the left-hand-side and the given type are considered incompatible and result in compile-time error. A value of static type `E` is said to be *pattern compatible* with the type `T` if there exists an identity conversion, an implicit reference conversion, a boxing conversion, an explicit reference conversion, or an unboxing conversion from `E` to `T`. It is a compile-time error if an expression of type `E` is not pattern compatible with the type in a type pattern that it is matched with.
The runtime semantic of this expression is that it tests the runtime type of the left-hand *relational_expression* operand against the *type* in the pattern. If it is of that runtime type (or some subtype) and not `null`, the result of the `is operator` is `true`.
Certain combinations of static type of the left-hand-side and the given type are considered incompatible and result in compile-time error. A value of static type `E` is said to be *pattern compatible* with the type `T` if there exists an identity conversion, an implicit reference conversion, a boxing conversion, an explicit reference conversion, or an unboxing conversion from `E` to `T`, or if one of those types is an open type. It is a compile-time error if an input of type `E` is not pattern compatible with the type in a type pattern that it is matched with.
The type pattern is useful for performing run-time type tests of reference types, and replaces the idiom
```cs
var v = expr as Type;
if (v != null) { // code using v }
if (v != null) { // code using v
```
With the slightly more concise
```cs
if (expr is Type v) { // code using v }
if (expr is Type v) { // code using v
```
It is an error if *type* is a nullable value type and the *identifier* is present.
It is an error if *type* is a nullable value type.
The type pattern can be used to test values of nullable types: a value of type `Nullable<T>` (or a boxed `T`) matches a type pattern `T2 id` if the value is non-null and the type of `T2` is `T`, or some base type or interface of `T`. For example, in the code fragment
```cs
int? x = 3;
if (x is int v) { // code using v }
if (x is int v) { // code using v
```
The condition of the `if` statement is `true` at runtime and the variable `v` holds the value `3` of type `int` inside the block.
The condition of the `if` statement is `true` at runtime and the variable `v` holds the value `3` of type `int` inside the block. After the block the variable `v` is in scope but not definitely assigned.
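The following is a minimal sketch of the declaration pattern as described above, not code from the proposal. The class `DeclarationPatternDemo` and its method names are hypothetical and exist only for illustration.

```cs
using System;

// Hypothetical demo type; not part of the proposal.
public static class DeclarationPatternDemo
{
    // Tests the runtime type of the input and binds it to a typed local on success.
    public static string Describe(object input)
    {
        if (input is int i)          // matches a boxed int; i is definitely assigned here
            return $"int {i}";
        if (input is string s)       // never matches null
            return $"string of length {s.Length}";
        return "something else";     // reached for null and for every other type
    }

    // A nullable input matches 'int v' only when it holds a value.
    public static int ValueOrMinusOne(int? x)
    {
        if (x is int v)
            return v;
        return -1;
    }
}
```

Calling `DeclarationPatternDemo.Describe(null)` takes the last branch, because a declaration pattern does not match `null`.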
#### Constant Pattern
A constant pattern tests the value of an expression against a constant value. The constant may be any constant expression, such as a literal, the name of a declared `const` variable, or an enumeration constant, or a `typeof` expression. The expression is implicitly converted to the type of the matched expression. If no suitable implicit conversion exists, or the result is not a constant, the pattern-matching operation is an error. Otherwise the pattern *c* is considered matching the expression *e* if `object.Equals(c, e)` would return `true`.
```antlr
constant_pattern
: constant_expression
;
```
A constant pattern tests the value of an expression against a constant value. The constant may be any constant expression, such as a literal, the name of a declared `const` variable, or an enumeration constant. When the input value is not an open type, the constant expression is implicitly converted to the type of the matched expression; if the type of the input value is not *pattern compatible* with the type of the constant expression, the pattern-matching operation is an error.
The pattern *c* is considered matching the converted input value *e* if `object.Equals(c, e)` would return `true`.
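As an illustrative sketch only, the constant forms listed above (a literal, a declared `const`, and an enumeration constant) might be used as follows. `ConstantPatternDemo`, `Limit`, and `Color` are hypothetical names introduced for this example.

```cs
using System;

// Hypothetical demo type; not part of the proposal.
public static class ConstantPatternDemo
{
    private const int Limit = 10;

    public enum Color { Red, Green }

    // Each test compares the input against a constant value.
    public static string Classify(object input)
    {
        if (input is null) return "null";
        if (input is Limit) return "limit";          // name of a declared const
        if (input is "ok") return "ok string";       // literal
        if (input is Color.Green) return "green";    // enumeration constant
        return "other";
    }
}
```

Note that `Color.Green` on the right-hand side of `is` is the case covered by the ambiguity note earlier: it is first tried as a type and only then bound as a constant.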
#### Var Pattern
An expression *e* matches the pattern `var identifier` always. In other words, a match to a *var pattern* always succeeds. At runtime the value of *e* is bound to a newly introduced local variable. The type of the local variable is the static type of *e*.
``` antlr
var_pattern
: 'var' designation
;
designation
: simple_designation
| tuple_designation
;
simple_designation
: single_variable_designation
| discard_designation
;
single_variable_designation
: identifier
;
discard_designation
: '_'
;
tuple_designation
: '(' designations? ')'
;
designations
: designation
| designations ',' designation
;
```
If the name `var` binds to a type, then we instead treat the pattern as a *type_pattern*.
If the *designation* is a *simple_designation*, an expression *e* matches the pattern. In other words, a match to a *var pattern* always succeeds with a *simple_designation*. If the *simple_designation* is a *single_variable_designation*, the value of *e* is bound to a newly introduced local variable. The type of the local variable is the static type of *e*.
If the *designation* is a *tuple_designation*, then the pattern is equivalent to a *deconstruction_pattern* of the form `(var ` *designation*, ... `)` where the *designation*s are those found within the *tuple_designation*. For example, the pattern `var (x, (y, z))` is equivalent to `(var x, (var y, var z))`.
It is an error if the name `var` binds to a type.
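A small sketch of both designation forms, assuming a value-tuple input; `VarPatternDemo` and its methods are hypothetical names used only here.

```cs
using System;

// Hypothetical demo type; not part of the proposal.
public static class VarPatternDemo
{
    // 'var (x, (y, z))' is shorthand for '(var x, (var y, var z))'.
    public static int SumIfAllPositive((int, (int, int)) input)
    {
        if (input is var (x, (y, z)) && x > 0 && y > 0 && z > 0)
            return x + y + z;
        return 0;
    }

    // With a simple designation the var pattern always succeeds, even for null,
    // so the null check has to be written separately.
    public static int Length(string text)
    {
        if (text is var t && t != null)
            return t.Length;
        return 0;
    }
}
```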
#### Discard Pattern
An expression *e* matches the pattern `_` always. In other words, every expression matches the discard pattern.
#### Positional Pattern
A discard pattern may not be used as the pattern of an *is_pattern_expression*.
A positional pattern enables the program to invoke an appropriate `operator is`, and (if the operator has a `void` return type, or returns `true`) perform further pattern matching on the values that are returned from it. It also supports a tuple-like pattern syntax when the static type is the same as the type containing `operator is`, or if the runtime type of the expression implements `ITuple`.
#### Deconstruction Pattern
A deconstruction pattern checks that the input value is not `null`, invokes an appropriate `Deconstruct` method, and performs further pattern matching on the resulting values. It also supports a tuple-like pattern syntax (without the type being provided) when the type of the input value is the same as the type containing `Deconstruct`, or if the type of the input value is a tuple type, or if the type of the input value is `object` or `ITuple` and the runtime type of the expression implements `ITuple`.
```antlr
positional_pattern
: type? '(' subpattern_list? ')'
;
deconstruction_pattern
: type? '(' subpatterns? ')' property_subpattern? simple_designation?
;
subpatterns
: subpattern
| subpattern ',' subpatterns
;
subpattern
: pattern
| identifier ':' pattern
;
```
If the *type* is omitted, we take it to be the static type of *e*. In this case it is an error if *e* does not have a type.
If the *type* is omitted, we take it to be the static type of the input value.
Given a match of an expression *e* to the pattern *type* `(` *subpattern_list* `)`, a method is selected by searching in *type* for accessible declarations of `operator is` and selecting one among them using *match operator overload resolution*.
Given a match of an input value to the pattern *type* `(` *subpattern_list* `)`, a method is selected by searching in *type* for accessible declarations of `Deconstruct` and selecting one among them using the same rules as for the deconstruction declaration.
- If a suitable `operator is` exists, it is a compile-time error if the expression *e* is not *pattern compatible* with the type of the first argument of the selected operator. If the *type* is omitted, it is an error if the `operator is` found does not have the static type of *e* as its first parameter. At runtime the value of the expression is tested against the type of the first parameter as in a type pattern. If this fails then the positional pattern match fails and the result is `false`. If it succeeds, the operator is invoked with fresh compiler-generated variables to receive the `out` parameters. Each value that was received is matched against the corresponding *subpattern*, and the match succeeds if all of these succeed. The order in which subpatterns are matched is unspecified, and a failed match may not match all subpatterns.
- If no suitable `operator is` exists, but the expression is *pattern compatible* with the type `System.ITuple`, and no *argument_name* appears among the subpatterns, then we match using `ITuple`. [Note: this needs to be made more precise.]
It is an error if a *deconstruction_pattern* omits the type, has a single *subpattern* without an *identifier*, has no *property_subpattern* and has no *simple_designation*. This disambiguates between a *constant_pattern* that is parenthesized and a *deconstruction_pattern*.
In order to extract the values to match against the patterns in the list,
- If *type* was omitted and the input value's type is a tuple type, then the number of subpatterns is required to be the same as the cardinality of the tuple. Each tuple element is matched against the corresponding *subpattern*, and the match succeeds if all of these succeed. If any *subpattern* has an *identifier*, then that must name a tuple element at the corresponding position in the tuple type.
- Otherwise, if a suitable `Deconstruct` exists as a member of *type*, it is a compile-time error if the type of the input value is not *pattern compatible* with *type*. At runtime the input value is tested against *type*. If this fails then the positional pattern match fails. If it succeeds, the input value is converted to this type and `Deconstruct` is invoked with fresh compiler-generated variables to receive the `out` parameters. Each value that was received is matched against the corresponding *subpattern*, and the match succeeds if all of these succeed. If any *subpattern* has an *identifier*, then that must name a parameter at the corresponding position of `Deconstruct`.
- Otherwise if *type* was omitted, and the input value is of type `object` or `ITuple` or some type that can be converted to `ITuple` by an implicit reference conversion, and no *identifier* appears among the subpatterns, then we match using `ITuple`.
- Otherwise the pattern is a compile-time error.
If a *subpattern* has an *argument_name*, then every subsequent *subpattern* must have an *argument_name*. In this case each argument name must match a parameter name (of an overloaded `operator is` in the first bullet above). [Note: this needs to be made more precise.]
The order in which subpatterns are matched at runtime is unspecified, and a failed match may not attempt to match all subpatterns.
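The first two cases above (a tuple-typed input, and a type with a suitable `Deconstruct`) can be sketched as follows. `Point` and `DeconstructionPatternDemo` are hypothetical types written for this illustration; the named subpatterns `x:` and `y:` refer to the parameter names of this particular `Deconstruct`.

```cs
using System;

// Hypothetical type with a Deconstruct method; not part of the proposal.
public sealed class Point
{
    public int X { get; }
    public int Y { get; }
    public Point(int x, int y) { X = x; Y = y; }
    public void Deconstruct(out int x, out int y) { x = X; y = Y; }
}

public static class DeconstructionPatternDemo
{
    // The input is type-tested against Point, then Deconstruct is invoked and
    // each resulting value is matched against the corresponding subpattern.
    public static string Locate(object input)
    {
        if (input is Point(0, 0)) return "origin";
        if (input is Point(var x, 0)) return $"on the x-axis at {x}";
        if (input is Point(x: 0, y: var y)) return $"on the y-axis at {y}";
        return "elsewhere";          // also reached when input is null
    }

    // With the type omitted and a tuple-typed input, the subpatterns are
    // matched directly against the tuple elements.
    public static bool IsDiagonal((int, int) pair) => pair is (var a, var b) && a == b;
}
```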
#### Property Pattern
A property pattern enables the program to recursively match values extracted by the use of properties.
A property pattern checks that the input value is not `null` and recursively matches values extracted by the use of accessible properties or fields.
```antlr
property_pattern
: type? '{' property_subpattern_list? '}'
| type identifier '{' property_subpattern_list? '}'
| var identifier '{' property_subpattern_list? '}'
;
property_subpattern_list
: property_subpattern
| property_subpattern ',' property_subpattern_list
;
: type? property_subpattern simple_designation?
;
property_subpattern
: identifier 'is' pattern
;
: '{' subpatterns? '}'
;
```
Given a match of an expression *e* to the pattern *type* `{` *property_pattern_list* `}`, it is a compile-time error if the expression *e* is not *pattern compatible* with the type *T* designated by *type*. If the type is absent or designated by `var`, we take it to be the static type of *e*. If the *identifier* is present, it declares a pattern variable of type *type*. Each of the identifiers appearing on the left-hand-side of its *property_pattern_list* must designate a readable property or field of *T*. If the *identifier* of the *property_pattern* is present, it defines a pattern variable of type *T*.
It is an error if any _subpattern_ of a _property_pattern_ does not contain an _identifier_ (it must be of the second form, which has an _identifier_).
At runtime, the expression is tested against *T*. If this fails then the property pattern match fails and the result is `false`. If it succeeds, then each *property_subpattern* field or property is read and its value matched against its corresponding pattern. The result of the whole match is `false` only if the result of any of these is `false`. The order in which subpatterns are matched is not specified, and a failed match may not match all subpatterns at runtime. If the match succeeds and the *identifier* of the *property_pattern* is present, it is assigned the matched value.
Note that a null-checking pattern falls out of a trivial property pattern. To check if the string `s` is non-null, you can write any of the following forms
> Note: The property pattern can be used to pattern-match with anonymous types.
#### Scope of Pattern Variables
The scope of a pattern variable is as follows:
- If the pattern appears in the condition of an `if` statement, its scope is the condition and controlled statement of the `if` statement, but not its `else` clause.
- If the pattern appears in the `when` clause of a `catch`, its scope is the *catch_clause*.
- If the pattern appears in a *switch_label*, its scope is the *switch_section*.
- If the pattern is the *pattern* of or in the *expression* of a *match_section*, its scope is that *match_section*.
- If the pattern appears in the `when` clause of a *switch_label* or *match_label*, its scope is that *switch_section* or *match_section*.
- If the pattern appears in the body of an expression_bodied lambda, its scope is that lambda's body.
- If the pattern appears in the body of an expression_bodied method or property, its scope is that expression body.
- If the pattern appears in the body of an expression_bodied local function, its scope is that method body.
- If the pattern appears in a *ctor_initializer*, its scope is the constructor body.
- If the pattern appears in a field initializer, its scope is that field initializer.
- If the pattern appears in the pattern of a *let_statement*, its scope is the enclosing block.
- If the pattern appears in the pattern of a *case_expression*, its scope is the *case_expression*.
- Otherwise if the pattern appears directly in some *statement*, its scope is that *statement*.
Other cases are errors for other reasons (e.g. in a parameter's default value or an attribute, both of which are an error because those contexts require a constant expression).
The use of a pattern variable is a value, not a variable. In other words pattern variables are read-only.
### User_defined operator is
An explicit `operator is` may be declared to extend the pattern matching capabilities. Such a method is invoked by the `is` operator or a *switch_statement* with a *positional_pattern*.
For example, suppose we have a type representing a Cartesian point in 2-space:
```cs
public class Cartesian
{
public int X { get; }
public int Y { get; }
}
``` c#
if (s is object o) ... // o is of type object
if (s is string x) ... // x is of type string
if (s is {} x) ... // x is of type string
if (s is {}) ...
```
We may sometimes think of them in polar coordinates:
Given a match of an expression *e* to the pattern *type* `{` *property_pattern_list* `}`, it is a compile-time error if the expression *e* is not *pattern compatible* with the type *T* designated by *type*. If the type is absent, we take it to be the static type of *e*. If the *identifier* is present, it declares a pattern variable of type *type*. Each of the identifiers appearing on the left-hand-side of its *property_pattern_list* must designate an accessible readable property or field of *T*. If the *simple_designation* of the *property_pattern* is present, it defines a pattern variable of type *T*.
```cs
public static class Polar
{
public static bool operator is(Cartesian c, out double R, out double Theta)
{
R = Math.Sqrt(c.X*c.X + c.Y*c.Y);
Theta = Math.Atan2(c.Y, c.X);
return c.X != 0 || c.Y != 0;
}
}
```
At runtime, the expression is tested against *T*. If this fails then the property pattern match fails and the result is `false`. If it succeeds, then each *property_subpattern* field or property is read and its value matched against its corresponding pattern. The result of the whole match is `false` only if the result of any of these is `false`. The order in which subpatterns are matched is not specified, and a failed match may not match all subpatterns at runtime. If the match succeeds and the *simple_designation* of the *property_pattern* is a *single_variable_designation*, it defines a variable of type *T* that is assigned the matched value.
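As a hedged sketch of these runtime rules, the example below nests one property pattern inside another and binds the whole match to a designation. `Person`, `Address`, and `PropertyPatternDemo` are hypothetical types made up for the illustration.

```cs
using System;

// Hypothetical data types; not part of the proposal.
public sealed class Address
{
    public string City { get; set; }
}

public sealed class Person
{
    public string Name { get; set; }
    public Address Home { get; set; }
}

public static class PropertyPatternDemo
{
    public static string Describe(object input)
    {
        // Type test against Person, then each named property is read and matched.
        // The nested pattern requires Home to be non-null; 'ada' is the designation.
        if (input is Person { Name: "Ada", Home: { City: var city } } ada)
            return $"{ada.Name} lives in {city}";
        if (input is Person { Home: null })
            return "a person with no home";
        if (input is { })            // trivial property pattern: a null check
            return "some non-null value";
        return "null";
    }
}
```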
And now we can operate on `Cartesian` values using polar coordinates
```cs
var c = Cartesian(3, 4);
if (c is Polar(var R, _)) Console.WriteLine(R);
```
Which prints `5`.
### Switch Statement
The `switch` statement is extended to select for execution the first block having an associated pattern that matches the *switch expression*.
```antlr
switch_label
: 'case' complex_pattern case_guard? ':'
| 'case' constant_expression case_guard? ':'
| 'default' ':'
;
case_guard
: 'when' expression
;
```
[TODO: we need to explain the interaction with definite assignment here.]
[TODO: we need to describe the scope of pattern variables appearing in the *switch_label*.]
The order in which patterns are matched is not defined. A compiler is permitted to match patterns out of order, and to reuse the results of already matched patterns to compute the result of matching of other patterns.
In some cases the compiler can prove that a switch section can have no effect at runtime because its pattern is subsumed by a previous case. In these cases a warning may be produced. [TODO: these warnings should be mandatory and we should specify precisely when they are produced.]
If a *case-guard* is present, its expression is of type `bool`. It is evaluated as an additional condition that must be satisfied for the case to be considered satisfied.
> Note: The property pattern can be used to pattern-match with anonymous types.
### Match Expression
@ -294,108 +233,60 @@ The C# language syntax is augmented with the following syntactic productions:
```antlr
relational_expression
: match_expression
: switch_expression
;
match_expression
: relational_expression 'switch' match_block
switch_expression
: relational_expression 'switch' '{' switch_expression_arms? '}'
;
match_block
: '(' match_sections ','? ')'
;
match_sections
: match_section
| match_sections ',' match_section
switch_expression_arms
: switch_expression_arm
| switch_expression_arms ',' switch_expression_arm
;
match_section
: 'case' pattern case_guard? ':' expression
switch_expression_arm
: pattern case_guard? '=>' null_coalescing_expression
;
case_guard
: 'when' expression
: 'when' null_coalescing_expression
;
```
The *match_expression* is not allowed as an *expression_statement*.
The type of the *match_expression* is the *best common type* of the expressions appearing to the right of the `:` tokens of the *match section*s.
> We are looking at relaxing this in a future revision.
It is an error if the compiler can prove (using a set of techniques that has not yet been specified) that some *match_section*'s pattern cannot affect the result because some previous pattern will always match.
The type of the *switch_expression* is the *best common type* of the expressions appearing to the right of the `=>` tokens of the *switch_expression_arm*s.
At runtime, the result of the *match_expression* is the value of the *expression* of the first *match_section* for which the expression on the left-hand-side of the *match_expression* matches the *match_section*'s pattern, and for which the *case_guard* of the *match_section*, if present, evaluates to `true`.
It is an error if the compiler proves (using a set of techniques that has not yet been specified) that some *switch_expression_arm*'s pattern cannot affect the result because some previous pattern will always match. The compiler shall produce a warning if it proves (using those techniques) that some possible input value might not match some *switch_expression_arm* at runtime.
#### Throw expression
At runtime, the result of the *switch_expression* is the value of the *expression* of the first *switch_expression_arm* for which the expression on the left-hand-side of the *switch_expression* matches the *switch_expression_arm*'s pattern, and for which the *case_guard* of the *switch_expression_arm*, if present, evaluates to `true`. If there is no such *switch_expression_arm*, the *switch_expression* throws an instance of the exception `System.Runtime.CompilerServices.SwitchExpressionException`.
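A minimal sketch of a switch expression with a guarded arm, assuming a small hypothetical shape hierarchy (`Shape`, `Circle`, `Rectangle`) that is not part of the proposal. The arms are tried in order and the final discard arm keeps the expression exhaustive, so the no-match exception described above cannot occur here.

```cs
using System;

// Hypothetical shape hierarchy; not part of the proposal.
public abstract class Shape { }

public sealed class Circle : Shape
{
    public double Radius { get; }
    public Circle(double radius) { Radius = radius; }
}

public sealed class Rectangle : Shape
{
    public double Width { get; }
    public double Height { get; }
    public Rectangle(double width, double height) { Width = width; Height = height; }
}

public static class SwitchExpressionDemo
{
    // The result is the value of the first arm whose pattern matches
    // and whose guard, if present, evaluates to true.
    public static double Area(Shape shape) => shape switch
    {
        Circle c => Math.PI * c.Radius * c.Radius,
        Rectangle { Width: var w, Height: var h } when w > 0 && h > 0 => w * h,
        Rectangle _ => 0.0,          // rectangles rejected by the guard above
        _ => throw new ArgumentException("Unknown or null shape", nameof(shape))
    };
}
```

All non-throwing arms have type `double`, which is therefore the type of the whole expression.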
We extend the set of expression forms to include
### Optional parens when switching on a tuple literal
```antlr
throw_expression
: 'throw' null_coalescing_expression
;
In order to switch on a tuple literal using the *switch_statement*, you have to write what appear to be redundant parens
null_coalescing_expression
: throw_expression
;
``` c#
switch ((a, b))
{
```
The type rules are as follows:
To permit
- A *throw_expression* has no type.
- A *throw_expression* is convertible to every type by an implicit conversion.
The flow-analysis rules are as follows:
- For every variable *v*, *v* is definitely assigned before the *null_coalescing_expression* of a *throw_expression* iff it is definitely assigned before the *throw_expression*.
- For every variable *v*, *v* is definitely assigned after *throw_expression*.
A *throw expression* is permitted in only the following syntactic contexts:
- As the second or third operand of a ternary conditional operator `?:`
- As the second operand of a null coalescing operator `??`
- After the colon of a *match section*
- As the body of an expression-bodied lambda or method.
### Destructuring assignment
Inspired by an [F# feature](https://msdn.microsoft.com/en-us/library/dd233238.aspx) and a [conversation on github](https://github.com/dotnet/roslyn/issues/5154#issuecomment-151974994), and similar features in [Swift](https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/Statements.html) and proposed for [Rust](https://github.com/mbrubeck/rfcs/blob/if-not-let/text/0000-let-else.md), we support decomposition with a *let statement*:
```antlr
block_statement
: let_statement
;
let_statement
: 'let' identifier '=' expression ';'
| 'let' pattern '=' expression ';'
| 'let' pattern '=' expression 'else' embedded_statement
| 'let' pattern '=' expression 'when' expression 'else' embedded_statement
;
``` c#
switch (a, b)
{
```
`let` is an existing contextual keyword.
the parentheses of the switch statement are optional when the expression being switched on is a tuple literal.
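For illustration, a switch statement over a tuple literal written without the extra parentheses might look like the sketch below. `TupleSwitchDemo` and the game encoded in it are hypothetical.

```cs
using System;

// Hypothetical demo type; not part of the proposal.
public static class TupleSwitchDemo
{
    public static string Play(string first, string second)
    {
        // Equivalent to 'switch ((first, second))'.
        switch (first, second)
        {
            case ("rock", "scissors"):
            case ("scissors", "paper"):
            case ("paper", "rock"):
                return "first wins";
            case var (a, b) when a == b:
                return "tie";
            default:
                return "second wins";
        }
    }
}
```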
The form
> `let` *identifier* `=` *expression* `;`
### Order of evaluation in pattern-matching
is shorthand for
Giving the compiler flexibility to reorder the operations executed during pattern-matching can improve the efficiency of pattern-matching. The (unenforced) requirement would be that properties accessed in a pattern, and the `Deconstruct` methods, are required to be "pure" (side-effect free, idempotent, etc). That doesn't mean that we would add purity as a language concept, only that we would allow the compiler flexibility in reordering operations.
> `let` `var` *identifier* `=` *expression* `;`
(i.e. a *var_pattern*) and is a convenient way for declaring a read-only local variable.
Semantically, it is an error unless precisely one of the following is true
- the compiler can prove that the expression always matches the pattern; or
- an `else` clause is present.
Any pattern variables in the *pattern* are in scope throughout the enclosing block. They are not definitely assigned before the `else` clause. They are definitely assigned after the *let_statement* if there is no `else` clause or they are definitely assigned at the end of the `else` clause (which could only occur because the end point of the `else` clause is unreachable). It is an error to use these variables before their point of definition.
A *let_statement* is a *block_statement* and not an *embedded_statement* because its primary purpose is to introduce names into the enclosing scope. It therefore does not introduce a dangling-else ambiguity.
If a `when` clause is present, the expression following it must be of type `bool`.
At runtime the expression to the right of `=` is evaluated and matched against the *pattern*. If the match fails, control transfers to the `else` clause. If the match succeeds and there is a `when` clause, the expression following `when` is evaluated, and if its value is `false` control transfers to the `else` clause.
**Resolution 2018-04-04 LDM**: confirmed: the compiler is permitted to reorder calls to `Deconstruct`, property accesses, and invocations of methods in `ITuple`, and may assume that returned values are the same from multiple calls. The compiler should not invoke functions that cannot affect the result, and we will be very careful before making any changes to the compiler-generated order of evaluation in the future.
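To make the consequence concrete, here is a sketch with a deliberately impure property. `Reading` and `EvaluationOrderDemo` are hypothetical. The result of `Bucket` is well defined, but the number of times `Value` is read is not something a program should rely on, since the compiler may reuse a value it has already read.

```cs
using System;

// Hypothetical type with a side-effecting getter; not part of the proposal.
public sealed class Reading
{
    private readonly int value;

    public Reading(int value) { this.value = value; }

    // Counts how often Value was read. Patterns should not depend on this.
    public int Reads { get; private set; }

    public int Value
    {
        get { Reads++; return value; }
    }
}

public static class EvaluationOrderDemo
{
    // Value appears in three arms, yet the compiler is free to read it once
    // and reuse the result, or to read it in a different order.
    public static string Bucket(Reading reading) => reading switch
    {
        { Value: 0 } => "zero",
        { Value: 1 } => "one",
        { Value: var n } when n < 0 => "negative",
        _ => "many"
    };
}
```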
### Some Possible Optimizations
@ -406,158 +297,3 @@ When some of the patterns are integers or strings, the compiler can generate the
For more on these kinds of optimizations, see [[Scott and Ramsey (2000)]](http://www.cs.tufts.edu/~nr/cs257/archive/norman-ramsey/match.pdf "When Do Match-Compilation Heuristics Matter?").
It would be possible to support declaring a type hierarchy closed, meaning that all subtypes of the given type are declared in the same assembly. In that case the compiler can generate an internal tag field to distinguish among the different subtypes and reduce the number of type tests required at runtime. Closed hierarchies enable the compiler to detect when a set of matches are complete. It is also possible to provide a slightly weaker form of this optimization while allowing the hierarchy to be open.
### Some Examples of Pattern Matching
#### Is-As
We can replace the idiom
```cs
var v = expr as Type;
if (v != null) {
// code using v
}
```
With the slightly more concise and direct
```cs
if (expr is Type v) {
// code using v
}
```
#### Testing nullable
We can replace the idiom
```cs
Type? v = x?.y?.z;
if (v.HasValue) {
var value = v.GetValueOrDefault();
// code using value
}
```
With the slightly more concise and direct
```cs
if (x?.y?.z is Type value) {
// code using value
}
```
#### Arithmetic simplification
Suppose we define a set of recursive types to represent expressions (per a separate proposal):
```cs
abstract class Expr;
class X() : Expr;
class Const(double Value) : Expr;
class Add(Expr Left, Expr Right) : Expr;
class Mult(Expr Left, Expr Right) : Expr;
class Neg(Expr Value) : Expr;
```
Now we can define a function to compute the (unreduced) derivative of an expression:
```cs
Expr Deriv(Expr e)
{
switch (e) {
case X(): return Const(1);
case Const(_): return Const(0);
case Add(var Left, var Right):
return Add(Deriv(Left), Deriv(Right));
case Mult(var Left, var Right):
return Add(Mult(Deriv(Left), Right), Mult(Left, Deriv(Right)));
case Neg(var Value):
return Neg(Deriv(Value));
}
}
```
An expression simplifier demonstrates positional patterns:
```cs
Expr Simplify(Expr e)
{
switch (e) {
case Mult(Const(0), _): return Const(0);
case Mult(_, Const(0)): return Const(0);
case Mult(Const(1), var x): return Simplify(x);
case Mult(var x, Const(1)): return Simplify(x);
case Mult(Const(var l), Const(var r)): return Const(l*r);
case Add(Const(0), var x): return Simplify(x);
case Add(var x, Const(0)): return Simplify(x);
case Add(Const(var l), Const(var r)): return Const(l+r);
case Neg(Const(var k)): return Const(-k);
default: return e;
}
}
```
#### A match expression (contributed by @orthoxerox):
```cs
var areas =
from primitive in primitives
let area = primitive switch (
case Line l: 0,
case Rectangle r: r.Width * r.Height,
case Circle c: Math.PI * c.Radius * c.Radius,
case _: throw new ApplicationException()
)
select new { Primitive = primitive, Area = area };
```
#### Tuple decomposition
The *let_statement* would apply to tuples as follows. Given
```cs
public (int, int) Coordinates => …
```
You could receive the results into a block scope this way
```cs
let (int x, int y) = Coordinates;
```
(This assumes that the tuple types define an appropriate `operator is`.)
#### Roslyn diagnostic analyzers
Much of the Roslyn compiler code base, and client code written to use Roslyn for producing user-defined diagnostics, could have its core logic simplified by using syntax-based pattern matching.
#### Cloud computing applications
[NOTE: This section needs much more explanation and examples.]
* Records are very convenient for communicating data in a distributed system (client-server and server-server).
It is also useful for returning multiple results from an async method.
* "Views", or user-written operator "is", is useful for treating, for example,
json as if it is an application-specific data structure. Pattern matching is very convenient for
dispatching in an actors framework.
## Drawbacks
[drawbacks]: #drawbacks
Why should we *not* do this?
## Alternatives
[alternatives]: #alternatives
What other designs have been considered? What is the impact of not doing this?
## Unresolved questions
[unresolved]: #unresolved-questions
Open questions are described in the proposal, inline
## Design meetings
Link to design notes that affect this proposal, and describe in one sentence for each what changes they led to.