Create LDM-2019-11-25.md

Mads Torgersen 2019-11-27 13:10:40 -08:00 committed by GitHub
parent 7590a8e8f4
commit 7cd7ddf9c7


@ -0,0 +1,126 @@
# C# Language Design Notes for Nov 25, 2019
## Agenda
1. Allow `T?` where `T` is a type parameter that's not constrained to be a value or reference type
# "Unconstrained" `T?`
In C# 8.0 we entertained the possibility of allowing `T?` on type parameters `T` that are unconstrained, or constrained only in a way that they can still be instantiated with both value and reference types.
The idea was to accurately express the type of `default(T)`. To this end, `T?` would end up being the same as `T`, except when `T` is instantiated with a non-nullable reference type `C`, in which case `T?` is `C?`. The reason is that for non-nullable reference types `C`, `default(C)` is still null, and hence of the type `C?`, not `C`.
`T?` would, for instance, allow the return type of methods like `FirstOrDefault()` to be better expressed:
``` c#
public T? FirstOrDefault<T>(this IEnumerable<T> src);
```
In the generic context where `T` is in scope, the meaning of `T?` would be "may be null, even when `T` is not nullable". The expression `default(T)` would no longer warn, but the type of it would be `T?`, and assigning `T?` to `T` *would* warn.
In the end we decided not to allow `T?` due to a couple of problems which we'll get back to below. Instead we addressed many of the scenarios with the addition of nullable attributes such as `[MaybeNull]`, which can be used to annotate e.g. the `FirstOrDefault()` method:
``` c#
[return: MaybeNull] public T FirstOrDefault<T>(this IEnumerable<T> src); // attributes must precede modifiers
```
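As a rough, compilable illustration of the attribute-based approach, the sketch below declares a stand-in for `FirstOrDefault()`. The names `SequenceSketch` and `FirstOrDefaultSketch` are hypothetical and this is not the framework's implementation; it only assumes that `MaybeNullAttribute` from `System.Diagnostics.CodeAnalysis` is available.

``` c#
#nullable enable
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

public static class SequenceSketch
{
    // Hypothetical stand-in for FirstOrDefault(), for illustration only.
    // [return: MaybeNull] tells callers the result may be null even when
    // T is instantiated with a non-nullable reference type.
    [return: MaybeNull]
    public static T FirstOrDefaultSketch<T>(this IEnumerable<T> src)
    {
        foreach (var item in src)
        {
            return item; // first element, if there is one
        }
        return default; // null for reference types, zero-initialized for value types
    }
}
```

Called on an empty `List<string>`, this returns null even though the declared return type is `string`; the attribute is what lets the compiler tell the caller about it.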
## Recent changes
Default values keep causing grief, and we've recently [(Nov 13)](https://github.com/dotnet/csharplang/blob/master/meetings/2019/LDM-2019-11-13.md#improved-analysis-of-maybenullt) decided on some changes to handle them:
1. Add a third null-state next to "not null" and "maybe null", which is "maybe null *even* when the type doesn't allow", and make that the null state of `default(T)` for an unconstrained type parameter `T`, as well as results of method calls etc. (such as `FirstOrDefault`) that were annotated with `[MaybeNull]`
2. Remove the "W warning" on assignment of such values to locals of type `T`, and instead track the new state *for* those locals, which may later lead to warnings if they are used unsafely
3. Remove warnings when producing such values as a result (e.g. return value) that is itself annotated as `[MaybeNull]`
Put together, these three eliminate most of the friction around the use of `default(T)` and `[MaybeNull]`. However, they also get very close to the previously proposed semantics of allowing `T?` on unconstrained `T`. We could take this two ways:
1. Since we mostly succeeded in addressing the problem without `T?`, it further reduces the need for it.
2. This shows that "the type of `default(T)`" is an important concept that should be properly reified in the language instead of tricks with attributes.
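The three changes listed above might play out roughly as in the following sketch. `DefaultFlowSketch` and `FindOrDefault` are hypothetical names, and the comments about warnings restate what these notes describe rather than the behavior of any particular compiler version.

``` c#
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

public static class DefaultFlowSketch
{
    // Hypothetical helper, for illustration only.
    [return: MaybeNull]
    public static T FindOrDefault<T>(IEnumerable<T> items, Func<T, bool> predicate)
    {
        // Change 1: default(T) carries the new "maybe null even when the
        // type doesn't allow it" state.
        // Change 2: per the notes, the assignment itself is not warned about;
        // the state is tracked on 'result' instead.
        T result = default;
        foreach (var item in items)
        {
            if (predicate(item))
            {
                result = item;
                break;
            }
        }
        // Change 3: per the notes, returning such a value is fine because
        // the return value is itself annotated [MaybeNull].
        return result;
    }
}
```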
## Taking another look at "unconstrained" `T?`
The main arguments in favor of a syntax for "the type of `default(T)`" even with the above changes are:
1. By adding "it" as a tracked null state, we're already half embracing it, but by leaving it inexpressible as a type we need to fall back on "tricks" to make use of it
2. `[MaybeNull]` doesn't help when "the type of `default(T)`" is needed in a constructed (array, generic, tuple, ...) type
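To make the second argument concrete, here is a minimal sketch with a constructed type. `ConstructedTypeSketch` and `CreateSlots` are hypothetical names. An attribute on the return value describes the array reference as a whole, so there is no place to say that the *elements* are of "the type of `default(T)`".

``` c#
#nullable enable

public static class ConstructedTypeSketch
{
    // Every element of the new array is default(T), yet the only element
    // type that can be written is T. For T = string the static type says
    // "non-nullable string" while each element is actually null.
    public static T[] CreateSlots<T>(int count) => new T[count];
}
```

For example, `ConstructedTypeSketch.CreateSlots<string>(2)[0]` is null at run time although its static type is `string`.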
The use of the syntax `T?` to mean "the type of `default(T)`" has two significant problems as well, which ultimately led us to not embrace it for C# 8.0:
1. Its meaning is different from the meaning of `T?` when `T` is constrained to be a value type.
2. Overrides cannot restate their constraint, and `T?` in those today always means the value type kind.
## Problem 1: `T?` and `T?` mean different things
The first problem is not technical, but one of perception and language regularity. Consider:
``` c#
public T? M1<T>(T t) where T : struct;
public T? M2<T>(T t);
var i1 = M1(7); // i1 is int?
var i2 = M2(7); // i2 is int
```
The declaration of `M1` is legal today. Because `T` is constrained to a (non-nullable) value type, `T?` is known to be a nullable value type, and hence, when instantiated with `int`, the return type is `int?`.
The declaration of `M2` is what's proposed to allow. Because `T` is unconstrained, `T?` is "the type of `default(T)`". When instantiated with `int` the type of `default(int)` is `int`, so that is the return type.
In other words, for the same provided `T` these two methods have *different* return types, even though the only difference is that one has a constraint on `T` but the other does not.
The cognitive dissonance here was a major part of why we didn't embrace `T?` for unconstrained `T`.
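The difference can be observed with declarations that compile today, under one assumption: since the proposed `T? M2<T>(T t)` is not available, the sketch approximates it with `[return: MaybeNull] T`. `ReturnTypeSketch` and `StaticTypeOf` are hypothetical names introduced for the illustration.

``` c#
#nullable enable
using System;
using System.Diagnostics.CodeAnalysis;

public static class ReturnTypeSketch
{
    // Legal today: T is constrained to a value type, so T? is Nullable<T>.
    public static T? M1<T>(T t) where T : struct => t;

    // Stand-in for the proposed unconstrained `T? M2<T>(T t)`.
    // The declared return type stays T; only the annotation differs.
    [return: MaybeNull]
    public static T M2<T>(T t) => t;

    // Reports the compile-time type inferred for an expression.
    public static Type StaticTypeOf<T>(T value) => typeof(T);
}
```

With these declarations, `StaticTypeOf(M1(7))` is `typeof(int?)` while `StaticTypeOf(M2(7))` is `typeof(int)`, which is the mismatch described above.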
## Problem 2: pseudo-"unconstraint" in overrides for disambiguation
In C# overrides of generic methods cannot re-specify constraints, but must inherit them from the original declaration. Before C# 8.0, when `T?` appeared in signatures of such overrides, it was always assumed to mean the nullable value type `Nullable<T>`, because what else could it mean?
In C# 8.0 `T?` acquired the possible second meaning of nullable reference type. In order to disambiguate the search for the original declaration, we reluctantly introduced the ability to specify "pseudo-constraints" on the overrides: When `T` is a type parameter constrained to be a reference type, you can now say so by adding `where T: class` to the declaration. If you want the default assumption of it being constrained to a value type to be made explicit, you can also optionally specify `where T: struct`.
These helper constraints are only there to help find the right declaring method up the inheritance chain (which may be overloaded). The *actual* constraint that's emitted into generated code is still the one that's inherited from the original declaration, once that one has been identified. Thus, only `class` and `struct` can be used as constraints on an override.
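A small sketch of these pseudo-constraints, using hypothetical types `Base` and `Derived`: the base class overloads `M` on the two existing meanings of `T?`, and each override picks its original declaration either through the default value type assumption or through an explicit `where T : class`.

``` c#
#nullable enable

public class Base
{
    // T? here is Nullable<T>.
    public virtual string M<T>(T? x) where T : struct => "base struct";

    // T? here is a nullable reference type.
    public virtual string M<T>(T? x) where T : class => "base class";
}

public class Derived : Base
{
    // No constraint restated: T? is assumed to mean Nullable<T>,
    // so this overrides the 'struct' overload.
    public override string M<T>(T? x) => "derived struct";

    // The pseudo-constraint selects the 'class' overload; the emitted
    // constraint is still the one inherited from Base.
    public override string M<T>(T? x) where T : class => "derived class";
}
```

Calling `new Derived().M<int>(5)` reaches the first override, and `new Derived().M<string>("a")` reaches the second.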
Now here comes a third `T?`, the defining characteristic of which is that it is *not* constrained to be either a reference or value type. To distinguish it from the other two in an override of a generic method, we would need a third "constraint" that is actually an *unconstraint* - that specifically says that it is *not* constrained!
## Solutions
We have two general approaches to address these problems (beyond the ever-present solution of giving up on the feature):
1. Live with problem 1, and try to explain things as best we can. Find a syntax to express the "unconstraint" of problem 2.
2. Use a different syntax than `T?` to express "the type of `default(T)`".
## Solution 1
A brainstorm for solution 1 syntaxes yielded:
``` c#
override void M1<[Unconstrained]T,U>(T? x) // a
override void M1<T,U>(T? x) where T: object? // b
override void M1<T,U>(T? x) where T: unconstrained // c
override void M1<T,U>(T? x) where T: // d
override void M1<T,U>(T? x) where T: ? // e
override void M1<T,U>(T? x) where T: null // f
override void M1<T,U>(T? x) where T: class|struct // g
override void M1<T,U>(T? x) where T: class or struct // h
override void M1<T,U>(T? x) where T: cluct // joke
```
Of these we have a vague preference for c, expressing explicitly that there is *not* a constraint, closely followed by b, which is the most general constraint one could state. None of the others resonated.
## Solution 2
A brainstorm for solution 2 syntaxes yielded:
``` c#
void M1<T,U>(default(T) x) // X
void M1<T,U>(default<T> x) // Y
void M1<T,U>(T?? x) // Z
```
Solutions X and Y try to be explicit about the type meaning "the type of `default(T)`", but run the risk of implying "the result *is* `default(T)`". Solution Z is mostly just the shortest `?`-like token we could think of that isn't `?`.
We unanimously preferred Z. `??` can be read as *may be* (`?`) nullable (`?`).
## Conclusion
Comparing the two approaches we do favor solution 2, adopting the syntax `T??` to mean "the type of `default(T)`" (when `T` is unconstrained). We do think that this is the best solution, because it addresses both problems, is terse, and looks reasonable. We will move ahead with prototyping and fleshing it out.
We would probably only allow it to be used on type parameters `T` that are not constrained to be either value or reference types. That way you can never use both `??` and `?` on the same thing.