# Records v2

In the past we've thought about records as a feature to enable working with data.

"Working with data" is a big group with a number of facets, so it may be interesting to look at
each in isolation. Let's start by looking at an example of records today and some of its drawbacks.

For instance, a simple record would be defined today as follows

```C#
public class UserInfo
{
    public string Username { get; set; }
    public string Email { get; set; }
    public bool IsAdmin { get; set; } = false;
}
```

and the usage would read

```C#
void M()
{
    var userInfo = new UserInfo()
    {
        Username = "andy",
        Email = "angocke@microsoft.com",
        IsAdmin = true
    };
}
```

There are significant advantages in this code:

1. The definition is version resilient: properties can easily be added or moved
2. Properties can be set in any order, and the names in the initialization always
   match the accessors
3. Properties with default values can simply be skipped

The first flaw, however, is that the properties must now be mutable.

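For comparison, here is a minimal sketch of how the same type could be made immutable in C# as it exists today, using get-only properties and a constructor with named and optional arguments. The `UserInfoDemo` helper is a hypothetical name introduced only for illustration.

```C#
public class UserInfo
{
    // Get-only auto-properties can be assigned only in the constructor.
    public string Username { get; }
    public string Email { get; }
    public bool IsAdmin { get; }

    public UserInfo(string username, string email, bool isAdmin = false)
    {
        Username = username;
        Email = email;
        IsAdmin = isAdmin;
    }
}

// Hypothetical helper, only here to show the call site.
public static class UserInfoDemo
{
    // Named arguments allow any order, and isAdmin could be omitted.
    public static UserInfo Create() =>
        new UserInfo(email: "angocke@microsoft.com", username: "andy", isAdmin: true);
}
```

This keeps the type immutable, but the call site now names constructor parameters rather than properties, and every added member changes the constructor signature, which weakens the first two advantages listed above.
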
## Mutability

What we'd like is for C# to provide a way to set a `readonly` member in object initializers.
Since some types may not have been designed with this initialization in mind, we'd also like
it to be opt-in.

The proposed solution is a new modifier, `initonly`, that can be applied to
properties and fields:

```C#
public class UserInfo
{
    public initonly string Username { get; }
    public initonly string Email { get; }
    public initonly bool IsAdmin { get; } = false;
}
```

The codegen for this is surprisingly straightforward: we just set the readonly field.
Specifically, the lowered properties would look like:

```C#
public class UserInfo
{
    private readonly string <Backing>_username;
    public string get_Username() => <Backing>_username;
    [return: modreq(initonly)]
    public void set_Username(string value) { <Backing>_username = value; }
    ...
}
```

The CLR considers setting readonly fields to be unverifiable, but not unsafe. To support
a more advanced verifier, the following rule is proposed: a readonly field can be modified
only inside `initonly` methods, or on a new object that is on the CLR stack and has not been
published via a store or method call.

This should solve many of the problems with mutability in the `UserInfo` example, while not
requiring complicated or brittle emit strategies. However, immutability does present a new
problem: easily constructing an object with changes.

## With-ing

When programming with immutability, making changes to an object is done by constructing a
copy with changes instead of making the changes directly on the object. Unfortunately, there's
no convenient way to do this in C#, even with current-style records. It's been previously
proposed that some kind of autogenerated "With" method be provided for records that implements
that functionality. If we have such a mechanism, it's important that it work with `initonly`
members. To achieve this, it's proposed that we add a `with` expression, analogous to an object
initializer. Sample usage would be as follows:

```C#
var userInfo = new UserInfo()
{
    Username = "andy",
    Email = "angocke@microsoft.com",
    IsAdmin = true
};
var newUserName = userInfo with { Username = "angocke" };
```

The resulting `newUserName` object would be a copy of `userInfo`, with `Username` set to "angocke".
The codegen on the `with` expression would also be similar to the object initializer: a new object
is constructed, and then the `initonly` `Username` setter would be called in the method body.

Of course, the difference here is that the new object being constructed is not a simple new object
creation; it is a duplicate of the original object. To provide this functionality, we require that
the object provide a "With constructor" that provides a duplicate object. A sample `With` constructor
would look like:

```C#
class UserInfo
{
    ...
    [WithConstructor] // placeholder syntax, up for debate
    public UserInfo With()
    {
        return new UserInfo() { Username = this.Username, Email = this.Email, IsAdmin = this.IsAdmin };
    }
}
```

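The copy-then-set lowering described above can be approximated in C# as it exists today. The following is only a sketch under stated assumptions: `private set` stands in for the proposed `initonly` setters, and `WithUsername` is a hypothetical hand-written stand-in for `userInfo with { Username = ... }`, not something the proposal generates.

```C#
public class UserInfo
{
    // Assumption: private setters play the role of the proposed initonly setters.
    public string Username { get; private set; }
    public string Email { get; private set; }
    public bool IsAdmin { get; private set; }

    public UserInfo(string username, string email, bool isAdmin = false)
    {
        Username = username;
        Email = email;
        IsAdmin = isAdmin;
    }

    // Plays the role of the "With constructor": returns a fresh duplicate.
    public UserInfo With() => new UserInfo(Username, Email, IsAdmin);

    // Hypothetical stand-in for: this with { Username = username }
    public UserInfo WithUsername(string username)
    {
        UserInfo copy = With();     // 1. duplicate the original
        copy.Username = username;   // 2. set the member before the copy is published
        return copy;                // 3. only now does the copy escape
    }
}
```

The original object is never modified, and the copy is only mutated before any other code can observe it, which is the property the verifier rule in the previous section is meant to guarantee.
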
Notably, the `with` expression will set `initonly` members, just like the object initializer, so to
support verification we must ensure that the object cannot have been published before the `initonly`
members are set. To enforce this, the `WithConstructor` attribute (or equivalent syntax) will enforce
a new rule for the method: all return statements must directly contain an object creation expression,
possibly with an object initializer.

If the `With` constructor requires validation, the user may introduce a constructor to do that validation,
e.g.

```C#
class UserInfo
{
    ...
    private UserInfo(UserInfo original)
    {
        // validation code
    }
    [WithConstructor]
    public UserInfo With() => new UserInfo(this);
}
```

The last piece of complexity associated with `With` is inheritance. If your record is extensible, you
will need to provide a new `With` for the subclass. This can be achieved as follows:

```C#
class Base
{
    ...
    protected Base(Base original)
    {
        // validation
    }
    [WithConstructor]
    public virtual Base With() => new Base(this);
}
class Derived : Base
{
    ...
    protected Derived(Derived original)
        : base(original)
    {
        // validation
    }
    [WithConstructor]
    public override Derived With() => new Derived(this);
}
```

Note one additional piece of complexity here: in order to override the `With` constructor with
the derived type, the language will also need to support covariant return types in overrides.
There is already a separate proposal for this feature
[here](https://github.com/dotnet/csharplang/blob/725763343ad44a9251b03814e6897d87fe553769/proposals/covariant-returns.md).

**Drawbacks**

- Making all return statements in `WithConstructor`s contain new object expressions is restrictive.
  This could possibly be mitigated by flow analysis that ensures the new object doesn't escape
  the method.
- Supporting variance in overrides through compiler tricks will require stub methods, which will
  grow quadratically with the inheritance depth. The need for a stub method is due to a runtime
  requirement that override signatures match exactly. If the runtime requirement were loosened,
  the stub methods would not be required at all.
- Using chained constructors of the form `Type(Type original)` effectively reserves that constructor
  for the use of the pattern. Since constructors have unique semantics and cannot be renamed, this
  could be limiting.

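To make the covariant-return requirement concrete, here is a sketch of how a derived-typed `With` can be emulated by hand without covariant overrides. This is one common workaround, not the stub strategy the compiler would use; `WithCore`, `Name` and `Level` are hypothetical names introduced for the example.

```C#
public class Base
{
    public string Name { get; }

    public Base(string name) { Name = name; }

    // Copy constructor, following the Type(Type original) pattern.
    protected Base(Base original) { Name = original.Name; }

    // Non-virtual entry point; the virtual WithCore does the real work.
    public Base With() => WithCore();

    protected virtual Base WithCore() => new Base(this);
}

public class Derived : Base
{
    public int Level { get; }

    public Derived(string name, int level) : base(name) { Level = level; }

    protected Derived(Derived original) : base(original) { Level = original.Level; }

    // Hides Base.With so callers holding a Derived get a Derived back.
    public new Derived With() => new Derived(this);

    // Override with the exact base signature, forwarding to the hiding method,
    // so a call through a Base reference still produces a Derived copy.
    protected override Base WithCore() => With();
}
```

Each derived type has to write both the hiding method and the forwarding override by hand, which is the kind of boilerplate that covariant return types would remove.
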
## Wrapping it all up: Records

The above features enable a style of programming that was very difficult before. But even with
the new features it could be quite verbose and error prone to annotate everything yourself. There
are also a few items, like `Equals` and `GetHashCode`, which can already be written today but are
laborious to write. Moreover, a significant flaw in implementing equality on top of these new
primitives is that structural equality is something that should change with your data type as new
data is added, but when handling it manually it is likely that these things can get out of sync.

Therefore, it is proposed that C# support new syntax for records, not for providing new features,
but for setting defaults and generating code designed for use in records. Example syntax would
look like

```C#
data class UserInfo
{
    public string Username { get; }
    public string Email { get; }
    public bool IsAdmin { get; } = false;
}
```

The generated code for this class would regard all public fields and auto-properties as structural
members of the record. Record members could be customized using a new `RecordMember(bool)` attribute
that could be used to either include or exclude members. Record members would be `initonly` by default
and equality would be autogenerated for the class based on the record members. At any point the behavior
of these members could be customized simply by declaring them in source. The user-written implementation
would replace the default implementation in all pattern usage.

Note that equality in the face of inheritance is complex, but seems to have been
adequately solved in the [other records proposal](records.md).

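As an illustration of the laborious code that generated equality would replace, here is a hand-written sketch of structural equality for `UserInfo` in today's C#. It is an assumption about what such code could look like, not the proposal's actual output; the class is `sealed` to sidestep the inheritance question, and a constructor is used because `initonly` does not exist yet.

```C#
using System;

public sealed class UserInfo : IEquatable<UserInfo>
{
    public string Username { get; }
    public string Email { get; }
    public bool IsAdmin { get; }

    public UserInfo(string username, string email, bool isAdmin = false)
    {
        Username = username;
        Email = email;
        IsAdmin = isAdmin;
    }

    // Member-by-member comparison: every new member must be added here...
    public bool Equals(UserInfo other) =>
        !(other is null)
        && Username == other.Username
        && Email == other.Email
        && IsAdmin == other.IsAdmin;

    public override bool Equals(object obj) => Equals(obj as UserInfo);

    // ...and here, or equality and hashing drift out of sync with the data.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (Username?.GetHashCode() ?? 0);
            hash = hash * 31 + (Email?.GetHashCode() ?? 0);
            hash = hash * 31 + IsAdmin.GetHashCode();
            return hash;
        }
    }
}
```

Adding a fourth property means touching the constructor, `Equals` and `GetHashCode` together, which is the synchronization burden the record syntax is meant to take over.
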
## Primary constructors

Previous record proposals have also included a new syntax for a parameter list on the type itself, e.g.

```C#
class Point(int X, int Y);
```

In the new design, the parameter list would be an orthogonal C# feature, which could be cleanly integrated
with records. If a primary constructor is included in a record, it would have new defaults, just like
public fields and auto-properties: the parameters in the primary constructor would be used to generate
public record-member properties with the same name. In addition, the primary constructor could now be
used to auto-generate a deconstructor.

For example, the following record with a primary constructor

```C#
data class Point(int X, int Y);
```

would be equivalent to

```C#
data class Point
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    public void Deconstruct(out int X, out int Y)
    {
        X = this.X;
        Y = this.Y;
    }
}
```

and the final generation of the above would be

```C#
class Point
{
    public initonly int X { get; }
    public initonly int Y { get; }

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    protected Point(Point other)
        : this(other.X, other.Y)
    { }

    [WithConstructor]
    public virtual Point With() => new Point(this);

    public void Deconstruct(out int X, out int Y)
    {
        X = this.X;
        Y = this.Y;
    }

    // Generated equality
}
```

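The parts of this generated shape that do not depend on new syntax can already be written and exercised in C# 7. The sketch below drops `initonly` and `[WithConstructor]`, which do not exist today, and adds a hypothetical `PointDemo` helper to show the copy constructor, `With` and `Deconstruct` working together.

```C#
public class Point
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Copy constructor delegating to the primary-style constructor.
    protected Point(Point other)
        : this(other.X, other.Y)
    { }

    public virtual Point With() => new Point(this);

    public void Deconstruct(out int x, out int y)
    {
        x = X;
        y = Y;
    }
}

// Hypothetical helper used only to demonstrate usage.
public static class PointDemo
{
    public static int SumOfCopy(Point p)
    {
        Point copy = p.With();   // duplicate via the With method
        var (x, y) = copy;       // deconstruction binds to Point.Deconstruct
        return x + y;
    }
}
```

What cannot be expressed this way is `p with { X = 5 }`: without `initonly`, the get-only properties of the copy can no longer be changed once `With` returns.
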
Note that we've taken one other piece of information into account
for a data class with a primary constructor: instead of setting
the primary fields inside the generated protected constructor, we delegate
to the primary constructor. If the Point class had another non-primary
record member, e.g.

```C#
data class Point(int X, int Y)
{
    public int Z { get; }
}
```

then that would change the generated protected constructor as follows:

```C#
class Point
{
    // ...
    protected Point(Point other)
        : this(other.X, other.Y)
    {
        Z = other.Z;
    }
    // ...
}
```

Notably, this doesn't answer what to do about inheritance of records
with primary constructors. For instance,

```C#
data class A(int X, int Y);
data class B(int X, int Y, int Z) : A;
```

Rather than resolving in an arbitrary manner, a more explicit approach
could require that a parameter list be provided with the base list, e.g.

```C#
data class A(int X, int Y);
data class B(int X, int Y, int Z) : A(X, Y);
```

The parameter list in the base list would then be applied to a `base` call
in the generated primary constructor:

```C#
class B : A
{
    // ...
    public B(int x, int y, int z)
        : base(x, y)
    // ...
}
```

As for what a primary constructor could mean outside of a record, that is still open to further proposal.