diff --git a/proposals/params-span.md b/proposals/params-span.md
new file mode 100644
index 0000000..2e31955
--- /dev/null
+++ b/proposals/params-span.md
@@ -0,0 +1,150 @@
+# `params Span<T>`
+
+## Summary
+Avoid heap allocation of implicitly created arrays in specific scenarios with `params` arguments.
+
+## Motivation
+`params` array parameters provide a convenient way to call a method that takes an arbitrary-length list of arguments.
+However, using an array type for the parameter means the compiler must implicitly allocate an array on the heap at each call site.
+
+If we extend `params` types to include the `ref struct` types `Span<T>` and `ReadOnlySpan<T>`, where values of those types cannot escape the call stack, the array at the call site may be created on the stack instead.
+
+And if we're extending `params` to other types, we could also allow `params IEnumerable<T>` to avoid allocating and copying collections at call sites that have an `IEnumerable<T>` rather than `T[]`.
+
+The benefits of `params ReadOnlySpan<T>` and `params Span<T>` are primarily for new APIs. Existing commonly used APIs such as `Console.WriteLine()` and `StringBuilder.AppendFormat()` already have overloads that avoid array allocations for common cases, and those overloads would need to be retained for backward compatibility.
+```csharp
+public static class Console
+{
+    public static void WriteLine(string value);
+    public static void WriteLine(string format, object arg0);
+    public static void WriteLine(string format, object arg0, object arg1);
+    public static void WriteLine(string format, object arg0, object arg1, object arg2);
+    public static void WriteLine(string format, params object[] arg);
+}
+```
+
+## Detailed design
+
+### Extending `params`
+`params` parameters will be supported with types `Span<T>`, `ReadOnlySpan<T>`, and `IEnumerable<T>`.
+
+A call in [_expanded form_](../../spec/expressions.md#applicable-function-member) to a method with a `params T[]` or `params IEnumerable<T>` parameter will result in an array `T[]` allocated on the heap.
+
+A call in [_expanded form_](../../spec/expressions.md#applicable-function-member) to a method with a `params ReadOnlySpan<T>` or `params Span<T>` parameter will result in an array `T[]` created on the stack _if the `params` array is within limits (if any) set by the compiler_.
+Otherwise, the array will be allocated on the heap.
+
+```csharp
+Console.WriteLine(fmt, x, y, z); // WriteLine(string format, params ReadOnlySpan<object> arg)
+```
+
+The compiler will report an error when compiling the method declaring the `params` parameter if the `ReadOnlySpan<T>` or `Span<T>` parameter value is returned from the method or assigned to an `out` parameter.
+That ensures call-sites can create the underlying array on the stack and reuse the array across call-sites without concern for aliases.
+
+A `params` parameter must be the last parameter in the method signature.
+
+Two overloads cannot differ by the `params` modifier alone.
+
+`params` parameters will be marked in metadata with a `System.ParamArrayAttribute` regardless of type.
+
+### Overload resolution
+Overload resolution will continue to prefer overloads that are applicable in [_normal form_](../../spec/expressions.md#applicable-function-member) rather than [_expanded form_](../../spec/expressions.md#applicable-function-member).
+
+For overloads that are applicable in _expanded form_, [better function member](../../spec/expressions.md#better-function-member) will be updated to prefer `params` types in a specific order:
+
+> When performing this evaluation, if `Mp` or `Mq` is applicable in its expanded form, then `Px` or `Qx` refers to a parameter in the expanded form of the parameter list.
+>
+> In case the parameter type sequences `{P1, P2, ..., Pn}` and `{Q1, Q2, ..., Qn}` are equivalent (i.e. each `Pi` has an identity conversion to the corresponding `Qi`), the following tie-breaking rules are applied, in order, to determine the better function member.
+>
+> * If `Mp` is a non-generic method and `Mq` is a generic method, then `Mp` is better than `Mq`.
+> * ...
+> * **Otherwise, if both methods have `params` parameters and are applicable only in their expanded forms, and the `params` types are distinct types with equivalent element type (there is an identity conversion between element types), the more specific `params` type is the first of:**
+>   * **`ReadOnlySpan<T>`**
+>   * **`Span<T>`**
+>   * **`T[]`**
+>   * **`IEnumerable<T>`**
+> * Otherwise, if one member is a non-lifted operator and the other is a lifted operator, the non-lifted one is better.
+> * Otherwise, neither function member is better.
+
+### Array creation expressions
+Array creation expressions that are target-typed to `ReadOnlySpan<T>` or `Span<T>` will be created on the stack _if the length of the array is a constant value within limits (if any) set by the compiler_.
+Otherwise, the array will be allocated on the heap.
+
+```csharp
+Span<int> s = new[] { i, j, k };   // int[] on the stack
+WriteLine(fmt, new[] { x, y, z }); // object[] on the stack for WriteLine(string fmt, ReadOnlySpan<object> args);
+```
+
+### Array re-use
+The compiler _may_ reuse an implicitly allocated array across multiple uses within a single thread executing a method:
+- At the same call-site (within a loop) or
+- At distinct call-sites if the lifetimes of the spans do not overlap, and the array length is sufficient, and
+  - the element types are managed types that are considered identical by the runtime, or
+  - the element types are unmanaged types of the same size.
+
+An implicitly allocated array may be reused regardless of whether the array was created on the stack or the heap.
+
+### Lowering implicit allocation
+For the `params` and array creation cases above that are target-typed to `Span<T>` or `ReadOnlySpan<T>`, the compiler will lower the creation of spans using an efficient approach, specifically avoiding heap allocations when possible.
+The exact details are still to be determined and may differ based on the target framework and runtime.
+
+The guarantee the compiler gives is that the span will be the expected size and will contain the expected items at any point in user code.
+
+## Open issues
+### Is `params Span<T>` necessary?
+Is there a reason to support `params` parameters of type `Span<T>` in addition to `ReadOnlySpan<T>`? Is allowing mutation within the `params` method useful?
+
+### Is `params IEnumerable<T>` necessary?
+If the compiler allows `params ReadOnlySpan<T>`, then new APIs that require `params` could use `params ReadOnlySpan<T>` instead of `params T[]` because `T[]` is implicitly convertible to `ReadOnlySpan<T>`. And existing APIs could add a `params ReadOnlySpan<T>` overload where the existing `params T[]` simply delegates to the new overload.
+
+However, there is no conversion from `IEnumerable<T>` to `ReadOnlySpan<T>`, so allowing `params IEnumerable<T>` is essentially asking APIs to provide two overloads for `params` methods: `params ReadOnlySpan<T>` and `params IEnumerable<T>`.
+
+Are scenarios for `params IEnumerable<T>` sufficiently compelling to justify that?
+
+### Array limits
+The compiler may use heuristics to determine when to fall back to heap allocation for the underlying data for spans.
+If heuristics are necessary, experimentation should establish the limits we agree on.
+
+### Lowering approach
+We need to determine the particular approach used to lower `params` and array creation expressions to avoid heap allocation.
+
+For instance, one potential approach to represent a `Span<T>` of constant length `N` is to synthesize a `struct` with `N` fields of type `T`
+where the layout and alignment of the fields matches the alignment of elements in `T[]`, and create the `Span<T>` from a `ref` to the first field of the `struct`.
+
+With that approach, `Console.WriteLine(fmt, x, y, z);` would be emitted as:
+```csharp
+[StructLayout(LayoutKind.Sequential)]
+internal struct __ValueArray3<T> { public T Item1, Item2, Item3; };
+
+var values = new __ValueArray3<object>() { Item1 = x, Item2 = y, Item3 = z };
+var span = MemoryMarshal.CreateSpan(ref values.Item1, 3);
+Console.WriteLine(fmt, (ReadOnlySpan<object>)span); // WriteLine(string format, params ReadOnlySpan<object> arg)
+```
+
+Alternative approaches may require runtime support.
+
+### Explicit `stackalloc`
+Should we allow explicit stack allocation of arrays of managed types with `stackalloc` as well?
+```csharp
+public static ImmutableArray<TResult> Select<TSource, TResult>(this ImmutableArray<TSource> source, Func<TSource, TResult> map)
+{
+    int n = source.Length;
+    Span<TResult> result = n <= 16 ? stackalloc TResult[n] : new TResult[n];
+    for (int i = 0; i < n; i++)
+        result[i] = map(source[i]);
+    return ImmutableArray.Create(result); // requires ImmutableArray.Create<T>([DoesNotEscape] ReadOnlySpan<T> items)
+}
+```
+
+This would require runtime support for stack allocation of arrays of non-constant length and any type, and GC tracking of the elements.
+
+Direct runtime support for stack allocation of arrays of managed types might be useful for lowering implicit allocation as well.
+
+The GC does not currently track the lifetime of a `stackalloc` array, so if the contents of the array have a shorter lifetime than the method, the compiler will need to zero the contents of the array so the lifetime of elements matches expectations.
+
+### Opting out
+Should we allow opting out of _implicit allocation_ on the call stack?
+Perhaps an attribute that can be applied to a method, type, or assembly.
+
+## Related proposals
+- https://github.com/dotnet/csharplang/issues/1757
+- https://github.com/dotnet/csharplang/blob/main/proposals/format.md#extending-params