Add design notes

This commit is contained in:
Mads Torgersen 2017-08-08 17:17:09 -07:00
parent bd52a5bbf1
commit 411106b010
2 changed files with 168 additions and 0 deletions

View file

@ -0,0 +1,159 @@
# C# Language Design Notes for Aug 7, 2017
## Agenda
We continued refining the nullable reference types feature set with the aim of producing a public prototype for experimentation and learning.
1. Warnings
2. Local variables revisited
3. Opt-in mechanisms
# Warnings
The nullable features lead to three distinct kinds of warnings:
1. *Direct dereference* of nullable values
2. *Indirect dereference* of nullable values - conversion to not-nullable
3. *Not-nullable is non-null* - Conversion of null value to not-nullable type
Examples:
``` c#
string s = ...;
string? n = ...;
var l = n.Length; // Warning 1: Direct dereference
s = n; // Warning 2: Indirect dereference
s = null; // Warning 3: null conversion
```
The design decisions in the following influence which of these warnings need to be on by default.
There are two dual viewpoints on the value of these warnings:
- Warning on assignment of null is the most important part of the feature, because that's how null values get in there in the first place. (This is true on API boundaries)
- Warning of dereference is the most important part of the feature, because that's what protects you from null reference exceptions. (This is true in code bodies)
The fact is that both kinds are important, and go hand in hand in protecting against null reference exceptions.
# Local variables revisited
The current plan of record is that nullability annotations (`string` vs `string?`) matter just the same in local variable declarations as elsewhere.
``` c#
class C
{
public static string S { get; } // won't be null
public static string? N { get; } // might be null
public void M()
{
int l;
string s; // won't be null
s = C.S; // ok
s = C.N; // warning 2: Indirect dereference
s = null; // warning 3: null conversion
l = s.Length; // ok
l = (s != null) ? s.length : 0; // ok
string? n; // might be null
n = C.S; // ok
n = C.N; // ok
n = null; // ok
l = n.Length; // warning 1: Direct dereference
l = (n != null) ? n.length : 0; // ok
}
}
```
With this approach, like everywhere else, unannotated local variables are non-null, and are protected by warnings from having null assigned to them, as where warning 2 and 3 are given above. `?`-annotated locals on the other hand are subjected to a flow-analysis, and if this shows them to potentially be null at a particular point in the code, a warning is given on dereference, as with warning 1 above.
This causes a problem with existing code. Since the `?` annotation wasn't previously available in C#, many local variables that *are* supposed to be null will not have it in their declaration. Their subsequent use may be completely safe (i.e. have null checks everywhere necessary), but with the advent of nullability checking we'd still be bothering the developer with warning 2 (as soon as `C.N` has gotten a `?` of its own) and 3 above.
An alternative proposal is to make all locals (of reference type) nullable, regardless of their annotation. There'd be no notion of preventing locals from assuming a null value. Instead, they would all be uniformly protected by flow analysis. Thus, the `s` part of the above code would act as follows:
``` c#
string s; // might be null
s = C.S; // ok
s = C.N; // ok
s = null; // ok
l = s.Length; // warning 1: Direct dereference
l = (s != null) ? s.length : 0; // ok
```
That is, identical to the `n` code above.
The upside of this approach is that you would get no "cosmetic" warnings when the nullable feature is turned on and used; i.e. no flood of warnings telling you to add `?` to a bunch of perfectly safe locals.
The downside of course is that locals would be treated completely differently from all other declarations in the language. It might be very confusing to read and maintain the code.
Unlike the current plan, this would require all of the warnings 1-3 to be opt-in, as there would be nullable things in code even without any explicit `?`s. The current plan at least has a hope of letting warnings 1-2 be always on for source code (since only new code introduces nullable types). This may or may not be important.
## Conclusion
We do recognize the problem that the alternative approach sets out to solve. However, we believe that locals should be consistent with other declarations, and that code is better off getting fixed up to reflect nullability in local variable declarations. Explicitly typed locals should declare their intent the same way other declarations do.
There are mitigations:
- People who use `var` won't have this problem, as the nullability flows with the type.
- We can do a lot with "fix all" style tooling to help you quickly fix all local declarations that e.g. are initialized with a nullable expression.
Also, if the "cosmetic" warnings get too noisy during upgrades to use the feature, there may be some version of the alternative approach available as an intermediate setting.
Let's get a public prototype out. It is fine (maybe actually great) if it's configurable, but should have a default according to where the LDM is trending. It would be nice to have some of the tooling as well ("fix all"), because that's part of the experience we are comparing.
# Opt-in mechanisms
There are various parts of the nullable reference types feature set that lead to new warnings on existing code:
- Assignment of `null` to unannotated reference types is perfectly allowed today: `string s = null;`
- Libraries with `?` annotations (which would be compiler-encoded into attributes in the assembly) may be used via older compilers that don't recognize the annotations, then all of a sudden would take effect when upgrading to C# 8.0.
In addition, libraries should be encouraged to add annotations to new versions, and upgrading to those versions would yield new warnings.
On the whole, all these new warnings are a good thing, and in a sense the purpose of the feature. However, there needs to be *some* way to avoid them, so that they don't constitute a breaking change that requires existing code to be edited in order to get clean.
We've been discussing the various upgrade scenarios and possible kinds and granularities of opt-in mechanisms to help the developers achieve a smooth upgrade at their own pace. However, these mechanisms come with their own complexities, and we need to decide which ones are worth that.
## Opt-in for all new nullable warnings
This is the notion that there's one big switch on the compiler. If it's on, you get all nullable warnings on your code, if its off you get none. You can leave it off forever if you're just in maintenance mode. You can repeatedly turn it on, fix up some stuff, then turn it back off until your code is properly null-annotated and your null-reference-exception bugs have been fixed.
TypeScript took this approach when adding nullable types and the accompanying checks. When on, it means that all existing libraries are seen as non-nullable on all input and output, regardless of whether that's what they intended. This sounds quite unfortunate on the face of it, but has actually proven a lot less of a problem than anticipated: preventing people from passing null to an API is most often the right thing, occasionally unnecessary but benign, and rarely actually limiting to the use of the API. In those cases you can work around it - in C# that would be with use of the postfix `!` operator to shut up the warning.
So there's at least an argument to be made that this is "good enough", and it has a lot going for it in terms of simplicity!
## Per-reference opt-in
This approach lets the client of a library decide whether to "see" its nullability warnings or not. This might be useful when
- A new version of a library adds annotations, but the client isn't ready to deal with them yet
- The client upgrades to C# 8.0 and discovers annotations in libraries they were already using, but aren't ready to deal with the warnings yet
- The library has not been annotated, and the client experience of treating the library as all non-null is too frustrating
This is expected to come with heavy tooling costs, because we need to wire through UI to turn null warnings on and off for each referenced assembly, add support in config files, etc. But it gives the client great control over what they would like to be bothered with right now.
## Library self-opt-in
This approach lets the library express, through attributes, whether to consider its unnanotated types non-nullable. The default would be no, so that an existing nullability-unaware library would continue working as it does today, i.e. lead to no warnings.
We sometimes call this option "URTANN" because the meaning of the attribute is "**U**nannotated **R**eference **T**ypes **A**re **N**on-**N**ullable".
The attribute could be used at the assembly level, but you can also imagine making it more granular, using it to mark parts of the code that "have been annotated". The client's compiler would then emit warnings only when nulls are passed to the URTANN-marked unannotated parameters.
In this way, the approach becomes a poor man's version of the feature design that has separate syntaxes for "non-null" (`string!`) and "oblivious" (`string`). We decided not to go that syntactic route (for good reasons described elsewhere!), but this would allow close-to-similar expressiveness on API boundaries using an extralingual mechanism (attributes) rather than syntax.
A problem with this approach is that it's not clear what code means. Am I currently under URTANN? I'd have to look around and make sure. Also, if highly granular application is allowed, it may be tempting to leave libraries half-annotated in some complicated pattern.
Another issue is that this would make it impossible for a client to interpret existing (unannotated) libraries as all-non-null, even when that would be the best thing to do.
But it does provide a potentially very granular mechanism for a library to express its own intent, nullability-wise.
## Conclusion
We're imagining all kinds of ways the user could get screwed by this feature, and coming up with opt-ins/opt-outs to mitigate. But we don't actually know if it's all that bad, and if those mechanisms are worth the complexity that they add.
What we'll do for now is to have only "big switch" feature opt-in in the prototype. We'll use that to learn what scenarios are painful enough that we should reach back into our catalogue of opt-in ideas and try one out.

View file

@ -279,3 +279,12 @@ We started putting a series of stakes in the ground for nullable reference types
2. Nullable reference types
3. Rarely-null members
## Aug 7, 2017
[C# Language Design Notes for Aug 7, 2017](LDM-2017-08-07.md)
We continued refining the nullable reference types feature set with the aim of producing a public prototype for experimentation and learning.
1. Warnings
2. Local variables revisited
3. Opt-in mechanisms