Added LDM notes for September 27th, 2021

This commit is contained in:
Fredric Silberberg 2021-10-27 17:50:30 -07:00
parent 2e10a1b56d
commit 81f2670ec5
No known key found for this signature in database
GPG key ID: 5D1C0E06CE5CD929
2 changed files with 86 additions and 6 deletions

View file

@ -0,0 +1,79 @@
# C# Language Design Meeting for October 27th, 2021
## Agenda
1. [UTF-8 String Literals](#utf-8-string-literals)
2. [Readonly modifiers for primary constructors](#readonly-modifiers-for-primary-constructors)
## Quote of the Day
- "I fought for that when I was young and idealistic"
## Discussion
### UTF-8 String Literals
https://github.com/dotnet/csharplang/issues/184
UTF-8 strings are an extremely important part of modern day programming, particularly for the web. We've previously talked about including special support for literals
based on UTF-8 in C#, as currently all our literals are UTF-16. It is possible to manually obtain byte arrays of UTF-8 bytes, but the process for doing so is either:
1. Error-prone, if you're hand-encoding a byte array.
2. Cumbersome and slightly inefficient, if you're creating static data to be later used.
3. Very inefficient, if you're converting the bytes every invocation.
We'd like to address these issues, doing the minimum possible work to make current scenarios more palatable without blocking our future ability to innovate in this space.
Currently, the runtime does not have a real `Utf8String` type, and instead uses `byte[]`, `Span<byte>`, or `ReadOnlySpan<byte>` as the de-facto interchange type for UTF-8
data. Many members of the LDM are concerned that, if we bless these types with conversions in the language, we will limit our future ability to react to the addition of
such a type into the runtime itself. In particular, if `var myStr = "Hello world"u8;` meant any of those three types, we lose out on the ability to make it mean `Utf8String`
in a future where that is added. This issue is further compounded because we don't know that such a type _will_ be added in the future: ongoing discussions are still being
had over whether a dedicated type will be added, or if the runtime could just have a flag to opt into having all `System.String` instances just become UTF-8 under the hood.
The u8 suffix had mixed reception with LDM. On the one hand, turning strings into one of the interchange types (either implicitly via target-typing, or only via explicit
cast) is convenient from a programming perspective. It also doesn't create a blessed syntax form that runs into the problems of the previous paragraph. On the other hand,
there is no direct indication _how_ the strings are being converted to bytes, which the suffix is useful for. Are the bytes just a direct representation of the UTF-16 data,
or an encoding the string in UTF-8 bytes?
Another question is whether a language-specific conversion is the correct approach here, or if we could create a user-defined conversion from `string` to `byte[]`. This
would allow non-constants to be converted as well, and the language/compiler could make the conversion special such that it could be optimized to occur at compile-time
when constant data is involved. It does suffer from the same problems of what byte encoding is being used, however, and we have existing solutions for converting non-constant
strings to UTF-8.
We also discussed a few other questions:
* Should we have a u8 character literal?
* People cast `char`s to `byte`s in a number of places today, this would need a suffix.
* It would also still need to return a sequence of bytes, rather than a single byte, so what advantage would it bring over a single-character string literal?
* Should the byte sequence have a null terminator?
* Interop scenarios will want a null terminator. However, they can add one by including a `\0` at the end of the string, and an analyzer can catch these cases.
* On the other hand, most managed code scenarios would break if a terminator was included.
* How specific should we make the language around the compile-time encoding of a given string literal?
* We'd like to make it relatively loose, such that we can say "the compiler is free to implement the most efficient and allocation-free form possible."
#### Conclusion
We don't feel confident enough to make any final calls today, but we'd like to make a prototype with target-typed literals and get some usage feedback. We'll implement
that behind a feature flag, and give the compiler to our BCL and ASP.NET partners so they can give us their thoughts from real API usage.
### Readonly modifiers for primary constructors
https://github.com/dotnet/csharplang/discussions/5314
https://github.com/dotnet/csharplang/issues/188
We wanted to revisit this question after we received a large amount of feedback from the community after our [last meeting](./LDM-2021-10-20.md#primary-constructors) on
primary constructors. The feedback was notable in particular because of the volume, as our discussion issues don't normally generate as much discussion from as many separate
community members as this one did. The feedback generally tended in one direction: mutability is an ok default, especially given the history of C#. However, they would like
a succinct way to make the generated fields `readonly`, without having to fall back to manual field declaration and assignment. To address this, we took a look at another
longstanding C# proposal, allowing `readonly` on parameters and locals. Several LDM members are concerned about this as a general feature, mainly because of the "attractive
nuisance" issue. `readonly` is often a sensible default for locals and (in particular) parameters, but in practice LDM members do not find accidental local/parameter mutation
to be a real source of bugs. Meanwhile, if we introduce this feature in general, people will start to use it, and require usage where possible in their codebases. It can add
clarity of intent for reading later, but many members are not convinced it meets the bar with C# 20 years old.
However, scoping the feature down to just primary constructor parameters was slightly less controversial. Most members of the LDM are not opposed to the concept in general,
but we remain conflicted as to whether we need to have such a feature immediately. While we did get a good deal of immediate feedback, some LDM members are unsure whether
this feedback will persist after users actually get their hands on the feature and give it a try. We could also take primary constructors in a more radically different direction,
where we'd adopt a more F#-like approach and consider these parameters captures, not fields.
#### Conclusion
We did not come to any conclusions today. We will revisit soon to talk more about this area.

View file

@ -27,16 +27,17 @@ All schedule items must have a public issue or checked in proposal that can be l
## Nov 1, 2021
## Oct 27, 2021
- Utf8 strings (Jared): https://github.com/dotnet/csharplang/blob/main/proposals/utf8-string-literals.md
- Primary constructors and readonly fields feedback (Fred, Cyrus): https://github.com/dotnet/csharplang/discussions/5314
- Collection literals (if time). (Cyrus): https://github.com/dotnet/csharplang/issues/5354
# C# Language Design Notes for 2021
Overview of meetings and agendas for 2021
## Oct 27, 2021
[C# Language Design Notes for October 27th, 2021](https://github.com/dotnet/csharplang/blob/main/meetings/2021/LDM-2021-10-27.md)
1. UTF-8 String Literals
2. Readonly modifiers for primary constructors
## Oct 25, 2021
[C# Language Design Notes for October 25th, 2021](https://github.com/dotnet/csharplang/blob/main/meetings/2021/LDM-2021-10-25.md)