csharplang/meetings/2017/LDM-2017-08-07.md
2018-01-25 13:11:49 -08:00

10 KiB

C# Language Design Notes for Aug 7, 2017

Agenda

We continued refining the nullable reference types feature set with the aim of producing a public prototype for experimentation and learning.

  1. Warnings
  2. Local variables revisited
  3. Opt-in mechanisms

Warnings

The nullable features lead to three distinct kinds of warnings:

  1. Direct dereference of nullable values
  2. Indirect dereference of nullable values - conversion to not-nullable
  3. Not-nullable is non-null - Conversion of null value to not-nullable type

Examples:

string s = ...;
string? n = ...;

var l = n.Length; // Warning 1: Direct dereference
s = n;            // Warning 2: Indirect dereference
s = null;         // Warning 3: null conversion

The design decisions in the following influence which of these warnings need to be on by default.

There are two dual viewpoints on the value of these warnings:

  • Warning on assignment of null is the most important part of the feature, because that's how null values get in there in the first place. (This is true on API boundaries)
  • Warning of dereference is the most important part of the feature, because that's what protects you from null reference exceptions. (This is true in code bodies)

The fact is that both kinds are important, and go hand in hand in protecting against null reference exceptions.

Local variables revisited

The current plan of record is that nullability annotations (string vs string?) matter just the same in local variable declarations as elsewhere.

class C
{
    public static string  S { get; } // won't be null
    public static string? N { get; } // might be null

    public void M()
    {
        int l;

        string s; // won't be null
        s = C.S; // ok
        s = C.N; // warning 2: Indirect dereference
        s = null; // warning 3: null conversion
        l = s.Length; // ok
        l = (s != null) ? s.length : 0; // ok

        string? n; // might be null
        n = C.S; // ok
        n = C.N; // ok
        n = null; // ok
        l = n.Length; // warning 1: Direct dereference
        l = (n != null) ? n.length : 0; // ok
    }
}

With this approach, like everywhere else, unannotated local variables are non-null, and are protected by warnings from having null assigned to them, as where warning 2 and 3 are given above. ?-annotated locals on the other hand are subjected to a flow-analysis, and if this shows them to potentially be null at a particular point in the code, a warning is given on dereference, as with warning 1 above.

This causes a problem with existing code. Since the ? annotation wasn't previously available in C#, many local variables that are supposed to be null will not have it in their declaration. Their subsequent use may be completely safe (i.e. have null checks everywhere necessary), but with the advent of nullability checking we'd still be bothering the developer with warning 2 (as soon as C.N has gotten a ? of its own) and 3 above.

An alternative proposal is to make all locals (of reference type) nullable, regardless of their annotation. There'd be no notion of preventing locals from assuming a null value. Instead, they would all be uniformly protected by flow analysis. Thus, the s part of the above code would act as follows:

        string s; // might be null
        s = C.S; // ok
        s = C.N; // ok
        s = null; // ok
        l = s.Length; // warning 1: Direct dereference
        l = (s != null) ? s.length : 0; // ok

That is, identical to the n code above.

The upside of this approach is that you would get no "cosmetic" warnings when the nullable feature is turned on and used; i.e. no flood of warnings telling you to add ? to a bunch of perfectly safe locals.

The downside of course is that locals would be treated completely differently from all other declarations in the language. It might be very confusing to read and maintain the code.

Unlike the current plan, this would require all of the warnings 1-3 to be opt-in, as there would be nullable things in code even without any explicit ?s. The current plan at least has a hope of letting warnings 1-2 be always on for source code (since only new code introduces nullable types). This may or may not be important.

Conclusion

We do recognize the problem that the alternative approach sets out to solve. However, we believe that locals should be consistent with other declarations, and that code is better off getting fixed up to reflect nullability in local variable declarations. Explicitly typed locals should declare their intent the same way other declarations do.

There are mitigations:

  • People who use var won't have this problem, as the nullability flows with the type.
  • We can do a lot with "fix all" style tooling to help you quickly fix all local declarations that e.g. are initialized with a nullable expression.

Also, if the "cosmetic" warnings get too noisy during upgrades to use the feature, there may be some version of the alternative approach available as an intermediate setting.

Let's get a public prototype out. It is fine (maybe actually great) if it's configurable, but should have a default according to where the LDM is trending. It would be nice to have some of the tooling as well ("fix all"), because that's part of the experience we are comparing.

Opt-in mechanisms

There are various parts of the nullable reference types feature set that lead to new warnings on existing code:

  • Assignment of null to unannotated reference types is perfectly allowed today: string s = null;
  • Libraries with ? annotations (which would be compiler-encoded into attributes in the assembly) may be used via older compilers that don't recognize the annotations, then all of a sudden would take effect when upgrading to C# 8.0.

In addition, libraries should be encouraged to add annotations to new versions, and upgrading to those versions would yield new warnings.

On the whole, all these new warnings are a good thing, and in a sense the purpose of the feature. However, there needs to be some way to avoid them, so that they don't constitute a breaking change that requires existing code to be edited in order to get clean.

We've been discussing the various upgrade scenarios and possible kinds and granularities of opt-in mechanisms to help the developers achieve a smooth upgrade at their own pace. However, these mechanisms come with their own complexities, and we need to decide which ones are worth that.

Opt-in for all new nullable warnings

This is the notion that there's one big switch on the compiler. If it's on, you get all nullable warnings on your code, if its off you get none. You can leave it off forever if you're just in maintenance mode. You can repeatedly turn it on, fix up some stuff, then turn it back off until your code is properly null-annotated and your null-reference-exception bugs have been fixed.

TypeScript took this approach when adding nullable types and the accompanying checks. When on, it means that all existing libraries are seen as non-nullable on all input and output, regardless of whether that's what they intended. This sounds quite unfortunate on the face of it, but has actually proven a lot less of a problem than anticipated: preventing people from passing null to an API is most often the right thing, occasionally unnecessary but benign, and rarely actually limiting to the use of the API. In those cases you can work around it - in C# that would be with use of the postfix ! operator to shut up the warning.

So there's at least an argument to be made that this is "good enough", and it has a lot going for it in terms of simplicity!

Per-reference opt-in

This approach lets the client of a library decide whether to "see" its nullability warnings or not. This might be useful when

  • A new version of a library adds annotations, but the client isn't ready to deal with them yet
  • The client upgrades to C# 8.0 and discovers annotations in libraries they were already using, but aren't ready to deal with the warnings yet
  • The library has not been annotated, and the client experience of treating the library as all non-null is too frustrating

This is expected to come with heavy tooling costs, because we need to wire through UI to turn null warnings on and off for each referenced assembly, add support in config files, etc. But it gives the client great control over what they would like to be bothered with right now.

Library self-opt-in

This approach lets the library express, through attributes, whether to consider its unannotated types non-nullable. The default would be no, so that an existing nullability-unaware library would continue working as it does today, i.e. lead to no warnings.

We sometimes call this option "URTANN" because the meaning of the attribute is "Unannotated Reference Types Are Non-Nullable".

The attribute could be used at the assembly level, but you can also imagine making it more granular, using it to mark parts of the code that "have been annotated". The client's compiler would then emit warnings only when nulls are passed to the URTANN-marked unannotated parameters.

In this way, the approach becomes a poor man's version of the feature design that has separate syntaxes for "non-null" (string!) and "oblivious" (string). We decided not to go that syntactic route (for good reasons described elsewhere!), but this would allow close-to-similar expressiveness on API boundaries using an extralingual mechanism (attributes) rather than syntax.

A problem with this approach is that it's not clear what code means. Am I currently under URTANN? I'd have to look around and make sure. Also, if highly granular application is allowed, it may be tempting to leave libraries half-annotated in some complicated pattern.

Another issue is that this would make it impossible for a client to interpret existing (unannotated) libraries as all-non-null, even when that would be the best thing to do.

But it does provide a potentially very granular mechanism for a library to express its own intent, nullability-wise.

Conclusion

We're imagining all kinds of ways the user could get screwed by this feature, and coming up with opt-ins/opt-outs to mitigate. But we don't actually know if it's all that bad, and if those mechanisms are worth the complexity that they add.

What we'll do for now is to have only "big switch" feature opt-in in the prototype. We'll use that to learn what scenarios are painful enough that we should reach back into our catalogue of opt-in ideas and try one out.