csharplang/meetings/2019/LDM-2019-09-16.md
2019-09-24 10:38:16 -07:00

4 KiB

C# Language Design Meeting for September 16, 2019

Agenda

  1. UTF-8 strings (https://github.com/dotnet/corefxlab/issues/2350)
  2. Triage

Discussion

UTF-8 Strings

Motivation: UTF-8 is everywhere, and converting to and from the .NET native string type (UCS2/UTF-16), can be expensive and confusing.

Language impact

Literals

Current proposal is to emit the data as UTF-16, and use a runtime helper to project to UTF-8. The main question for the language is if this encoding limits the usability in some way, and whether it blocks us off from introducing an optimal strategy later. There are two things to consider: the cost in the compiler, and the cost to the language. At the moment we don't see anything that would be considered a breaking change in the language.

Enumeration

The proposal doesn't include any default enumeration. It's worth considering if this violates C# user expectations. There are different forms of enumeration available by calling properties, but none of them are the default. Should there be a default?

One problem is that users who see enumeration may expect indexing, which is not cheap, and does not match expectations. Another problem is that there is almost always a better operation than enumerating a UTF-8 string. It seems like adding this more likely to encourage a user to write a bug, or inefficient code, than to help them.

Why language support

The main advantages of language support are:

  • O(n) string concatenation (calling utf8string.Concat with all n arguments)
  • String literals

Target-typing

Target-typing of existing string literals is possible but produces "bad" behavior for seemingly the most common scenario:

void M(string s) { ... }
void M(Utf8String s) { ... }

M("some string"); // This would call the `string` version, because backwards compatibility requires
                  // "" always be a `string` first, and a `Utf8String` second

Syntax

As always, there's debate about the syntax. The "u" prefix is somewhat unsatisfying because UTF-8, UTF-16, UCS-2, and unsigned all begin with "u". The same complaints hold for the proposed ustring contextual keyword. However, there are no enthusiastically favored alternatives.

In addition, the value of the ustring contextual keyword seems questionable. If the brevity is important, then it seems important enough that the framework could call it ustring or utf8. One argument is that Utf8String could be a new "primitive" type, and we should add a keyword for all primitive types. However, this is not proposed as a primitive type at the same level, so that weakens support. In addition, the original aliases were all added at the inception of the language, so we have no precedence for adding either primitive types or aliases.

Conclusion

We think the feature is valuable and probably worth some language support. We think a literal syntax (like the "u" prefix) is good, but we don't like target typing. We're also not convinced of the need for a contextual keyword.

More triage

Null parameter checking

Issue #2145

ASAP, 9.0

CallerArgumentExpression

Issue #287

We could potentially use this to implement "Null parameter checking" as a method call, instead of new syntax. Thus, consider for 9.0.

Relax ordering constraints about modifiers (especially ref and partial)

Issue #946

9.0

Zero- and one-element tuples

Issue #883

In terms of language value, zero- and one-element tuples are different features. Zero-element tuples are most useful as a unit type, which is not necessarily the unit type we would choose to standardize on (e.g. against System.Void), while one-element tuples are simply useful as a wrapper type and don't have an obvious competitor.

We need to revisit the decisions here. Moving to X.X

Mix declarations and variable in deconstruction

Issue #125

Not an urgent feature, but useful for completeness. On the other hand, it's always very clear whether or not the deconstruction is using existing variables, or creating new ones.

Any time

Discard for lambda parameters

Issue #111

Any time