# C# Language Design Notes for Jul 15, 2016
## Agenda
In this meeting we took a look at what the scope rules should be for variables introduced by patterns and out vars.
# Scope of locals introduced in expressions
So far in C#, local variables have only been:
1. Introduced by specific kinds of statements, and
2. Scoped by specific kinds of statements
`for`, `foreach` and `using` statements are all able to introduce locals, but at the same time also constitute their scope. Declaration statements can introduce local variables into their immediate surroundings, but those surroundings are prevented by grammar from being anything other than a block `{...}`. So for most statement forms, questions of scope are irrelevant.
Well, not anymore! Expressions can now contain patterns and out arguments that introduce fresh variables. For any statement form that can contain expressions, we therefore need to decide how it relates to the scope of such variables.
## Current design
Our default approach has been fairly restrictive:
- All such variables are scoped to the nearest enclosing statement (even if that is a declaration statement)
- Variables introduced in the condition of an `if` statement aren't in scope in the `else` clause (to allow reuse of variable names in nested `else if`s)
This approach caters to the "positive" scenarios of `is` expressions with patterns and invocations of `Try...` style methods with out parameters:
``` c#
if (o is bool b) ...b...; // b is in scope
else if (o is byte b) ...b...; // fine because bool b is out of scope
...; // no b's in scope here
```
It doesn't handle unconditional uses of out vars, though:
``` c#
GetCoordinates(out var x, out var y);
...; // x and y not in scope :-(
```
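For comparison, a minimal sketch of the fallback that remains available under the restrictive rules: declare the locals in an ordinary declaration statement first, then pass them as plain `out` arguments. The class name `OutVarFallback` and the body of `GetCoordinates` are hypothetical stand-ins, not part of the design.

``` c#
static class OutVarFallback
{
    // Hypothetical stand-in for the GetCoordinates method used above.
    static void GetCoordinates(out int x, out int y) { x = 1; y = 2; }

    public static int Sum()
    {
        int x, y;                      // ordinary locals, scoped to the method block
        GetCoordinates(out x, out y);  // plain out arguments, no new variables introduced
        return x + y;                  // x and y are in scope and definitely assigned
    }
}
```

This works, but it gives up the type inference and brevity that `out var` is meant to provide.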
It also fits poorly with the "negative" scenarios embodied by what is sometimes called the "bouncer pattern", where a method body starts out with a bunch of tests (of parameters etc.) and jumps out if the tests fail. After the tests, you can write code at the outermost level of indentation that can assume that all the tests succeeded:
``` c#
void M(object o)
{
    if (o == null) throw new ArgumentNullException(nameof(o));
    ...; // o is known not to be null
}
```
However, the strict scope rules above make it intractable to extend the bouncer pattern to use patterns and out vars:
``` c#
void M(object o)
{
    if (!(o is int i)) throw new ArgumentException("Not an int", nameof(o));
    ...; // we know o is int, but i is out of scope :-(
}
```
## Guard statements
In Swift, this scenario was found so important that it earned its own language feature, `guard`, that acts like an inverted `if`, except that a) variables introduced in the conditions are in scope after the `guard` statement, and b) there must be an `else` branch that leaves the enclosing scope. In C# it might look something like this:
``` c#
void M(object o)
{
    guard (o is int i) else throw new ArgumentException("Not an int", nameof(o)); // else must leave scope
    ...i...; // i is in scope because the guard statement is specially leaky
}
```
A new statement form seems like a heavy price to pay. And a guard statement wouldn't deal with non-error bouncers that correct the problem instead of bailing out:
``` c#
void M(object o)
{
    if (!(o is int i)) i = 0;
    ...; // would be nice if i was in scope and definitely assigned here
}
```
(In the bouncer analogy I guess this is equivalent to the bouncer lending the non-conforming customer a tie instead of throwing them out for not wearing one.)
## Looser scope rules
It would seem better to address the scenarios and avoid a new statement form by adopting more relaxed scoping rules for these variables.
_How_ relaxed, though?
**Option 1: Expression variables are only scoped by blocks**
This is as lenient as it gets. It would create some odd situations, though:
``` c#
for (int i = foo(out int j);;);
// j in scope but not i?
```
It seems that these new variables should at least be scoped by the same statements as old ones:
**Option 2: Expression variables are scoped by blocks, for, foreach and using statements, just like other locals:**
This seems more sane. However, it still leads to possibly confusing and rarely useful situations where a variable "bleeds" out many levels:
``` c#
if (...)
    if (...)
        if (o is int i) ...i...
...; // i is in scope but almost certainly not definitely assigned
```
It is unlikely that the inner `if` intended `i` to bleed out so aggressively, since it would almost certainly not be useful at the outer level, and would just pollute the scope.
One could say that this can easily be avoided by the guidance of using curly brace blocks in all branches and bodies, but it is unfortunate if that changes from being a style guideline to having semantic meaning.
**Option 3: Expression variables are scoped by blocks, for, foreach and using statements, as well as all embedded statements:**
An embedded statement here means one that is used as a nested statement in another statement, except inside a block. Thus the branches of an `if` statement, the bodies of `while`, `foreach`, etc. would all be considered embedded.
The consequence is that variables would always escape the condition of an `if`, but never its branches. It's as if you put curlies in all the places you were "supposed to".
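A minimal sketch of how option 3 would play out, assuming the rules exactly as described above. The names `ScopeSketch`, `SumCoordinates` and `ParseOrMinusOne`, and the body of `GetCoordinates`, are hypothetical and only serve the illustration; `int.TryParse` is the existing framework method.

``` c#
static class ScopeSketch
{
    // Hypothetical non-Try method with out parameters.
    static void GetCoordinates(out int x, out int y) { x = 3; y = 4; }

    public static int SumCoordinates()
    {
        GetCoordinates(out var x, out var y); // expression statement directly in a block
        return x + y;                         // so x and y are scoped to that block
    }

    public static int ParseOrMinusOne(object o)
    {
        if (o is string s)                    // s escapes the condition into the method block
            if (int.TryParse(s, out var n))   // this inner if is an embedded statement...
                return n;                     // ...so n is scoped to it and goes no further
        // Here s is in scope but not definitely assigned; n is not in scope at all.
        return -1;
    }
}
```

The inner `if` behaves as though it had been wrapped in curly braces, which is what keeps `n` from spilling out several levels.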
## Conclusion
While a little subtle, we will adopt option 3. It strikes a good balance:
- It enables key scenarios, including out vars for non-`Try` methods, as well as patterns and out vars in bouncer if-statements.
- It doesn't lead to egregious and counter-intuitive multi-level "spilling".
It does mean that you will get more variables in scope than under the current restrictive regime. This does not seem dangerous, because definite assignment analysis will prevent uninitialized use. However, it prevents the variable names from being reused, and leads to more names showing in completion lists. This seems like a reasonable tradeoff.
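As a sketch of the two bouncer scenarios under the adopted rules, assuming definite assignment analysis treats the pattern variable as assigned wherever the `is` test succeeded (the class and method names are hypothetical):

``` c#
using System;

static class BouncerSketch
{
    // Error bouncer: bail out when the test fails.
    public static int Twice(object o)
    {
        if (!(o is int i)) throw new ArgumentException("Not an int", nameof(o));
        return i * 2; // i is in scope, and assigned on the only path that reaches here
    }

    // Correcting bouncer: fix up the value instead of bailing out.
    public static int TwiceOrZero(object o)
    {
        if (!(o is int i)) i = 0; // assigns i on the path where the pattern did not match
        return i * 2;             // i is definitely assigned on both paths
    }
}
```

The cost mentioned above also shows up here: within each method body, the name `i` cannot be reused for another local after the `if`.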