csharplang/meetings/2014/LDM-2014-09-03.md
2017-02-09 09:25:09 -08:00

8.9 KiB
Raw Permalink Blame History

C# Design Notes for Sep 3, 2014

Quote of the day: “Its a design smell. But its a good smell.”

Agenda

The meeting focused on rounding out the design of declaration expressions

  1. Removing “spill out” from declaration expressions in simple statements <yes, remove>
  2. Same name declared in subsequent else-ifs <condition decls out of scope in else-branch>
  3. Add semicolon expressions <not in this version>
  4. Make variables in declaration expressions readonly <no>

“Spill out”

The scope rules for variables introduced in declaration expressions are reasonably regular: the scope of such a variable extends to the nearest enclosing statement, and like all local variables, it may be used only after it has been defined, textually.

We did make a couple of exceptions, though: an expression-statement or a declaration-statement does not serve as a boundary for such a variable instead it “spills out” to the directly enclosing block if there is one.

Similarly, a declaration expression in one field initializer is in scope for neighboring field initializers (as long as they are in the same part of the type declaration).

This was supposed to enable scenarios such as this:

GetCoordinates(out var x, out var y);
 // use x and y;

to address the complaint that it is too much of a hassle to use out and ref parameters. But we have a nagging suspicion that this scenario pick up the value in one statement and use it in the next is not very common. Instead the typical scenario looks like this:

if (int.TryParse(s, out int i)) {  i  }

Where the introduced local is used in the same statement as it is declared in.

Outside of conditions, probably the most common use is the inline common-subexpression refactoring, where the result of an expression is captured into a variable the first time it is used, so the variable can be applied to the remaining ones:

Console.WriteLine("Result: {0}", (var x = GetValue()) * x);

The spill-out is actually a bit of a nuisance for the somewhat common scenario of passing dummies to ref or out parameters that you dont need (common in COM interop scenarios), because you cannot use the same dummy names in subsequent statements.

From a rule regularity perspective, the spilling is quite complicated to explain. It would be a meaningful simplification to get rid of it. While complexity of specing and implementing shouldnt stand in the way of a good feature, it is often a smell that the design isnt quite right.

Conclusion

Lets get rid of the spilling. Every declaration expression is now limited in scope to it nearest enclosing statement. Well live with the (hopefully) slight reduction in usage scenarios.

Else-ifs

Declaration expressions lend themselves particularly well to a style of programming where an if/else-if chain goes through various options, each represented by a variable declared in a condition, using those variables in the then-clause:

if ((var i = o as int?) != null) {  i  }
else if ((var s = o as string) != null) {  s  }
else if 

This particular pattern looks like a chain of subsequent options, and even indents like that, but linguistically the else clauses are nested. For that reason, with our current scope rules the variable I introduced in the first condition is in scope in all the rest of the statement even though it is only meaningful and interesting in the then-branch. In particular, it blocks another variable with the same name from being introduced in a subsequent condition, which is quite annoying.

We do want to solve this problem. There is no killer option that we can think of, but there are a couple of plausible approaches:

  1. Change the scope rules so that variables declared in the condition of an if are in scope in the then-branch but not in the else-branch
  2. Remove the restriction that locals cannot be shadowed by other locals
  3. Do something very scenario specific

Changing the scope rules

Changing the scope rules would have the unfortunate consequence of breaking the symmetry of if-statements, so that

if (b) S1 else S2

No longer means exactly the same as

if (!b) S2 else S1

It kind of banks on the fact that the majority of folks who would introduce declaration expressions in a condition would do so for use in the then-branch only. That certainly seems to be likely, given the cases we have seen (type tests, uses of the Try… pattern, etc.). But it still may be surprising to some, and downright bothersome in certain cases.

Worse, there may be tools that rely on this symmetry principle. Refactorings to swap then and else branches (negating the condition) abound. These would no longer always work.

Moreover, of course, this breaks with the nice simple scoping principle for declaration expressions that we just established above: that they are bounded (only) by their enclosing statement.

Removing the shadowing restriction

Since C# 1.0, it has been forbidden to shadow a local variable or parameter with another one. This is seen as one of the more successful rules of hygiene in C# - it makes code safe for refactoring in many scenarios, and just generally easier to read.

There are existing cases where this rule is annoying:

task.ContinueWith(task =>  task ); // Same task! Why cant I name it the same?

Here it seems the rule even runs counter to refactoring, because you need to change every occurrence of the name when you move code into a lambda.

Lifting this restriction would certainly help the else-if scenario. While previous variables would still be in scope, you could now just shadow them with new ones if you choose.

If you do not choose to use the same name, however, the fact that those previous variables are in scope may lead to confusion or accidental use.

More importantly, are we really ready to part with this rule? It seems to be quite well appreciated as an aid to avoid subtle bugs.

Special casing the scenario

Instead of breaking with general rules, maybe we can do something very localized? Some combination of the two? It would have to work both in then and else branches; otherwise, it would still break the if symmetry, and be as bad as the first option.

We could allow only variables introduced in conditions of if-statements to be shadowed only by other variables introduced in conditions of if-statements?

This might work, but seems inexcusably ad-hoc, and is almost certain to cause a bug tail in many tools down the line, as well as confusion when refactoring code or trying to experiment with language semantics.

Conclusion

It seems there truly is no great option here. However, wed rather solve the problem with a wart or two than not address it at all. On balance, option 1, the special scope rule for else clauses, seems the most palatable, so thats what well do.

Semicolon expressions

We previously proposed a semicolon operator, to be commonly used with declaration expressions, to make “let-like” scenarios a little nicer:

Console.WriteLine("Result: {0}", (var x = GetValue(); x * x));

Instead of being captured on first use, the value is now captured first, then used multiple times.

We are not currently on track to include this feature in the upcoming version of the language. The question is; should we be? Theres an argument that declaration expressions only come to their full use when they can be part of such a let-like construct. Also, there are cases (such as conditional expressions) where you cannot just declare the variable on first use, since the use is in a branch separate from later uses.

Nevertheless, it might be rash to say that this is our let story. Is this how we want let to look like in C#? We dont easily get another shot at addressing the long-standing request for a let-like expression. It probably needs more thought than we have time to devote to it now.

Conclusion

Lets punt on this feature and reconsider in a later version.

Should declaration expression variables be mutable?

C# is an imperative language, and locals are often used in a way that depends on mutating them sometime after initialization. However, you could argue that this is primarily useful when used across statements, whereas it generally would be a code smell to have a declaration thats only visible inside one statement rely on mutation.

This may or may not be the case, but declaration expressions also benefit from a strong analogy with declaration statements. It would be weird that var s = GetString() introduces a readonly variable in one setting but not another. (Note: it does in fact introduce a readonly variable in a few situations, like foreach and using statements, but those can be considered special).

Conclusion

Lets keep declaration expressions similar to declaration statements. It is too weird if a slight refactoring causes the meaning to change. It may be worth looking at adding readonly locals at a later point, but that should be done in an orthogonal way.