csharplang/meetings/2020/LDM-2020-12-14.md
2020-12-14 20:33:50 -08:00

8.2 KiB

C# Language Design Meeting for December 14th, 2020

Agenda

Quote of the Day

  • "My Monday is your Friday... you're going to get unadulterated truth here"

Discussion

List Patterns

https://github.com/dotnet/csharplang/pull/3245

Today, we took our first in-depth look at list patterns, which will be the next big turn of the crank in generalized pattern support in C#. The intent of this type of pattern is to allow matching on arbitrary collection types, matching against both the length and the content of the collection in a single, simple-to-understand fashion, much like our other pattern forms enable for positional and nominal content. To bring context to the discussion around our general goals and principles for pattern matching in C#, we again brought up the table from discussion 3107:

Type Declaration Creation Decomposition Dispatch (pattern)
int int x 3 int x int x or 3
class with mutable fields class Point { public int X, Y; } new Point { X = 3, Y = 4 } p.X Point { X: int x, Y: int y } or Point { X: 3, Y: 4 }
anonymous type - new { X = 3, Y = 4 } p.X { X: int x, Y: int y } or { X: 3, Y: 4 }
record class class Point(int X, int Y); (proposed) Point(3, 4) (proposed) (int x, int y) = p Point(int x, int y) or Point(3, 4)
tuple (int x, int y) (3, 4) (int x, int y) = p (int x, int y) or (3, 4)
List List<int> new List<int> { 3, 4 } ? List patterns fit in here
Dictionary Dictionary<K,V> new Dictionary<K,V> { { K, V } } ? ?

While we are not looking at list decomposition with this proposal, we should keep it in mind, as we will want whatever form we use for pattern dispatch to be used for decomposition as well. In this table, we can see a correspondence between the different syntactic forms of creation and destructuring: object initializers correspond with recursive object patterns, tuple literals correspond with tuple patterns, etc. By this principle, our initial instinct is to make the collection pattern correspond with the collection initializer syntax, which uses curly braces. However, this runs into an immediate issue: { } is already a valid pattern today, the empty object pattern. This means it cannot serve as the empty collection pattern, as that pattern must specifically check that the collection is actually empty. The current proposal instead takes the approach of using square brackets [] to represent a collection pattern, instead of using the curlies. We're pretty divided on this approach: C# has not used square brackets to represent a list or array in the past. Even C# 1.0 used the curly brackets for array initializers, reserving the brackets for array size or index access. This would make the proposed syntax a really big break with C# tradition. We could "retcon" this by enabling new types of collection literals using square brackets, but that's an issue that LDM has not intensely looked at beyond previously rejecting https://github.com/dotnet/csharplang/issues/414 and related issues. After some discussion, we've come to the realization that the empty collection (ie, the base case for recursive algorithms) is the most important pattern to design for, and the rest of the syntax falls out from that design. We've come up with a few different syntactic proposals:

  1. The existing proposal as is. Notably, this pattern form is not part of a recursive pattern, and that means that you can't specify a pattern like this: int[] [1, 2, 3]. Indeed, such a pattern is potentially ambiguous, as int[] [] already means a type test against a jagged int array today. Instead, such a pattern would have to be expressed as int[] and []. The first part narrows the type to int[], and the second part specifies that the array must be empty. We're not huge fans of needing the and combinator for a base case (when the input type to the pattern is not narrowed enough to use a collection pattern) given that one is not needed for tuple deconstruction patterns or property patterns, but it is elegant in its simplicity.
  2. A similar version to 1, except that it allows nesting the square brackets inside the curly braces of the property pattern. This would allow int[] { [1, 2, 3] } for the case where you need to both narrow the input type and test the array content. There are some concerns with this syntax: we've also envisioned a dictionary pattern, that would match content using a form like this: { [1]: { /* some nested pattern */ } }. This would mean that the colon at the end of the brackets would determine whether the contents of the brackets are used as arguments to an indexer or the patterns the collection is being tested against.
  3. Using curlies to match the list contents. There are a couple of sub proposals in this section, separated by the way they enable testing for the empty collection case. They share the content tests, which look like { 1, 2, 3 }.
    1. No empty case. Instead, use a property pattern on Length or Count to check for the empty case. This has issues with our previously-desired support for IEnumerable and general foreach-able type support, as they do not have any such property to check.
    2. Empty case represented by a single comma: {,} would represent an array with Length == 0. This was suggested, but no one argued in favor.
    3. Square brackets for Length tests. This proposal would look something like this: int[] [0]. The interesting angle with this version is that it allows for succinct length tests that could be composed of patterns itself. For example, the BCL has some cases where they need to check to see whether an array has some content and Length between two cases, and that could be expressed as [>= 0 and < 256] { 1, 2, .. }. This would also allow a general length check to be expressed for foreach-able types, though there are some concerns that it would become a perf pitfall if enumerating the entire sequence was necessary to check a non-zero length. The length or count of the collection could also be extracted with a declaration pattern, which could turn into a nice shorthand for not having to know whether this collection uses Length or Count, something we didn't standardize and now can't. How this version combines with other property tests on the same object would still need to be discussed: could you do MyType { Property: a } [10] { 1, 2, 3, .. }, for example, or would the property and collection patterns need to be combined with an and?
    4. Add a new combinator keyword to make the transition to a collection pattern explicit. This is similar to how VB uses From to indicate collection initializers. Such a pattern might look something like int[] with { } for the empty case. (with was the word we spitballed here, but likely wouldn't end up being the final word for confusion with with expressions).

We came to no solid conclusions on the syntax topic today, as we were mostly generating ideas and need some time to mull over the various forms. We'll come back to this at a later date.

We also took a brief look at the slice pattern and whether it could be extended to foreach-able types. A trailing .. in a foreach match would be easy to implement and not have any hidden costs, as it would just skip a check to MoveNext() after the leading bits of the pattern match. However, a leading .. would be much more concerning. Depending on implementation strategy, we'd have emit a much larger state machine or keep track of a potentially large number of previous values as we iterate the enumerable, so that when we get to the end we can ensure that the previous slots matched correctly. We're not sure if this difference will be obvious enough to users, and will need to think more about whether we should enable the trailing slice, or enable both slice patterns and let the codegen be what it will. In all likelihood, if the user needs this pattern they're going to code it by hand if they can't do it with a pattern, and we can make it less likely to introduce a bug for it if we generate the states programmatically instead of the user doing it by hand.

Again, we came to no solid conclusions here, as we spent most of our time on the syntax aspects.