2019-04-01 01:00:29 +02:00
# Index and Range Changes

## Summary

This proposes several changes to the
[index and range design](https://github.com/dotnet/csharplang/blob/master/proposals/csharp-8.0/ranges.md) based on
customer feedback, particularly from the CoreFX team and their experience adding Index / Range support to .NET Core.
The change is not to the syntax but rather to how the language maps the syntax to APIs.

## Motivation

The level of API churn necessary to adopt `Index` and `Range` is quite high today. Every collection type must have at
least two new indexers added for `Index` and `Range` respectively. The implementation of `Index` is exactly the same
for virtually every collection in .NET. The implementation for `Range` is likewise often similar by simply deferring to
the underlying collection type.

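
As a rough illustration of that churn, the sketch below shows the two extra indexers a collection author would write by hand today. `IntBag` is a hypothetical type invented for this example; only `Index.GetOffset` and `Range.GetOffsetAndLength` are real framework members.

```csharp
using System;

// Hypothetical collection, used only to illustrate the boilerplate.
public sealed class IntBag
{
    private readonly int[] _items;

    public IntBag(params int[] items) => _items = items;

    public int Count => _items.Length;

    // The indexer most collections already have.
    public int this[int index] => _items[index];

    // Extra indexer 1: the same shape in virtually every collection.
    public int this[Index index] => _items[index.GetOffset(Count)];

    // Extra indexer 2: usually defers to the underlying storage.
    public int[] this[Range range]
    {
        get
        {
            var (offset, length) = range.GetOffsetAndLength(Count);
            var result = new int[length];
            Array.Copy(_items, offset, result, 0, length);
            return result;
        }
    }
}
```

With these in place `bag[^1]` and `bag[1..3]` compile, but every collection author has to repeat the same two members.
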
This work must be done by the collection author. The reliance on indexers as the primary use case means developers can't
update existing collections using extension methods. This means the adoption of the feature will be slowed by waiting
for library authors to update their types.

The use of `Index` does come with a small performance hit. Typically a consumer of `Index` pays for an extra branch
(`IsFromEnd` plus bounds) compared with an `int` parameter (bounds only). Moving to `Range` doubles this cost
as it occurs on each `Index`.

This penalty is small and insignificant in the vast majority of code. For highly optimized code, though, this extra
branching can be unacceptable. Moving from a call to `span.Slice(start, end)` to `span[start..end]` simply can't be
thought of as a syntactic transformation. Instead the performance implications must be evaluated on each
usage. This will likely lead to the feature being banned in certain portions of code bases.

This proposal seeks to remove the abstraction penalties around `Index` and `Range`, both at the API and performance
layers. The goal is that most collection types just work today with minimal change and the feature can be adopted
in existing code without fear of lost performance.

This proposal specifically does not want to change how `Index` and `Range` type binding occurs. The index and range
expressions continue to have the same syntax and types.

## Detailed Design

### Indexable types

Any type which has an accessible getter property named `Length` or `Count` with a return type of `int` is considered
Indexable. The language can make use of this property to convert an index expression into an `int` at the point of
the expression without the need to use the type `Index` at all.

Note: For simplicity going forward the proposal will use the name `Length` to represent `Count` or `Length`.

For example it allows the following simplification:

``` csharp
Span<char> span = ...;
char c = span[^1];

// Can be translated to
Span<char> span = ...;
char c = span[span.Length - 1];
```

Transforming an index expression to an `int` at the call site significantly reduces the burden on frameworks to adopt
`Index`. Virtually any collection type will automatically work with `Index` now as the compiler can translate it to
`int` in all cases.

Further, this can improve performance by eliminating extra branching. A callee accepting an `Index` parameter must
test both whether the value is from the end, `Index.IsFromEnd`, and whether the value is inside the bounds of the
collection. While a small check, this can be important in performance sensitive areas.

Doing the translation to `int` at the call site means the `IsFromEnd` check can often be eliminated. For example when
dealing with an `int` the compiler can pass the value through. Or in the cases where `^` is used the computation from
end can be done directly without the additional branching.

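
The difference can be sketched by writing both shapes out by hand. This is only an illustration under the assumption that the callee computes the offset the way `Index.GetOffset` does; `IndexCost` and its method names are made up for the example.

```csharp
using System;

public static class IndexCost
{
    // Callee-side: an Index parameter forces a test of IsFromEnd
    // before the element access performs its usual bounds check.
    public static char GetByIndex(ReadOnlySpan<char> span, Index index)
    {
        int offset = index.IsFromEnd ? span.Length - index.Value : index.Value;
        return span[offset];
    }

    // Call-site form of `span[^1]`: the `^` is visible in the source,
    // so only the subtraction and the bounds check remain.
    public static char GetLast(ReadOnlySpan<char> span)
    {
        return span[span.Length - 1];
    }
}
```

Both methods return the same element for `^1`; the second simply has no `IsFromEnd` branch left to execute.
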
### Index and Range implementations are known

The implementations of `Index` and `Range` are considered to be known and side effect free. Much like
`ValueTuple<T1, T2>` the language can assume a standard implementation and emit code inline which represents the
implementation of methods on `Index` and `Range`. This implementation includes methods like `GetOffset` or conversions
like the implicit conversion from `int` to `Index`.

All arithmetic operations which are emitted will use an `unchecked` context. That matches the context
in which `Index` and `Range` are compiled.

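
A minimal sketch of what "emitting the implementation inline" could look like for `GetOffset`, assuming the standard behavior of `System.Index`. The helper name `InlineGetOffset` is hypothetical; it stands in for code the compiler would emit directly rather than a real API.

```csharp
using System;

public static class KnownIndex
{
    // Inline equivalent of index.GetOffset(length), assuming the standard
    // Index implementation. Arithmetic is done in an unchecked context.
    public static int InlineGetOffset(Index index, int length)
    {
        unchecked
        {
            return index.IsFromEnd ? length - index.Value : index.Value;
        }
    }
}
```

For any `Index` value the result should agree with calling `GetOffset` on the real type, which is what lets the compiler substitute one for the other.
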
### Index target type conversion

Whenever an expression with type `Index` is used as an argument to an instance member invocation and the receiver is
Indexable then the expression will have a target type conversion to `int`. The member invocations applicable for this
conversion include methods, indexers, properties, extension methods, etc. Only constructors are excluded as they
have no receiver.

The target type conversion will be implemented as follows on the index expression:

- When the expression is `^expr` and the type of `expr` is `int` it will be translated to `receiver.Length - expr`.
- Else it will be translated as `expr.GetOffset(receiver.Length)` where `expr` is the expression typed as `Index`.

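
The two rules above can be written out by hand against a `Count`-based receiver. This is a sketch of the intended result, not compiler output; `TargetTypedIndex`, `FromEnd`, and `At` are names invented for the example.

```csharp
using System;
using System.Collections.Generic;

public static class TargetTypedIndex
{
    // Rule 1: `list[^n]` with n typed as int  =>  list[list.Count - n]
    public static T FromEnd<T>(List<T> list, int n) => list[list.Count - n];

    // Rule 2: `list[index]` with index typed as Index
    //         =>  list[index.GetOffset(list.Count)]
    public static T At<T>(List<T> list, Index index) => list[index.GetOffset(list.Count)];
}
```

In both cases the member that finally runs is the existing `int` indexer on `List<T>`, which is why the receiver needs no new API.
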
The receiver will be spilled as appropriate to ensure the side effects of obtaining the receiver are only executed
once. For example:

``` csharp
using System;

class SideEffect {
    int[] Get() {
        Console.Write("Get ");
        return new[] { 1, 2, 3 };
    }

    void Use() {
        // Get() runs once even though the receiver is needed for both Length and the element access.
        int i = Get()[^1];
        Console.WriteLine(i);
    }
}
```

This code will print "Get 3".

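
Written by hand, the spilled form would look roughly like the following. The temporary named `receiver` and the `Calls` counter are assumptions added for illustration; the compiler would use an unnamed temporary.

```csharp
using System;

public static class SpillDemo
{
    // Counts how often the receiver expression is evaluated.
    public static int Calls;

    static int[] Get()
    {
        Calls++;
        return new[] { 1, 2, 3 };
    }

    // Hand-written equivalent of `Get()[^1]` under the proposed lowering.
    public static int Use()
    {
        int[] receiver = Get();               // spilled: evaluated exactly once
        return receiver[receiver.Length - 1]; // Length and element access share it
    }
}
```

A naive expansion to `Get()[Get().Length - 1]` would run the side effect twice, which is what the spill avoids.
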
When a range expression is used as an argument to an instance member invocation then the target type conversion to
`int` extends to both `Index` operands. If either of the `Index` operands is omitted then the appropriate
start or end value will be inserted, using `0` or `receiver.Length` respectively.

``` csharp
using System;

class RangeTargettype {
    void Example() {
        var array = new[] { 1, 2, 3 };
        Console.WriteLine(array[1..]);

        // Becomes
        Console.WriteLine(array[new Range(1, array.Length)]);
    }
}
```

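
To make the handling of omitted operands concrete, the sketch below resolves a `Range` against a known length. `RangeEnds.Resolve` is a hypothetical helper; it relies only on the real behavior that an omitted start is `0` and an omitted end is `^0`.

```csharp
using System;

public static class RangeEnds
{
    // Hypothetical helper: turns both operands of a range into int offsets.
    // An omitted start is 0 and an omitted end is ^0, which resolves to length.
    public static (int Start, int End) Resolve(Range range, int length)
    {
        int start = range.Start.GetOffset(length);
        int end = range.End.GetOffset(length);
        return (start, end);
    }
}
```

For a receiver of length 3, `1..` resolves to `(1, 3)` and `..^1` resolves to `(0, 2)`.
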
### Indexing on Range

When binding an index member on an Indexable type where the single argument is of type `Range` the language will
attempt to translate it to a `Slice` call. The arguments to `Slice` will be both `Index` values of the range converted
to `int` using the target typed conversion described in the previous section. If this translation is not successful
then normal index binding will occur.

``` csharp
class Collection {
    public int Length { get; }
    public int[] this[Range range] => ...;
}

class Slice {
    void Example(Span<int> span, Collection collection) {
        Span<int> slicedSpan = span[2..];
        int[] slicedCollection = collection[2..];

        // Translated to
        Span<int> slicedSpan = span.Slice(2, span.Length);
        int[] slicedCollection = collection[new Range(2, collection.Length)];
    }
}
```

The `Slice` method can be an instance or extension method so long as it is accessible and has parameter types that are
convertible from `int`.

The compiler will special case the following receiver types when binding to `Slice`:

- `string`: instead of `Slice` the method `Substring` will be used.
- array: the runtime helper for array slicing will be used.

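
The following sketch shows one way the `Slice` binding and the `string` special case could look when lowered by hand. `Window` and `SliceLowering` are hypothetical. `Window.Slice` is assumed to take `(start, end)` to mirror the examples in this proposal, whereas the real `string.Substring(int, int)` takes `(startIndex, length)`, so that path passes `end - start`.

```csharp
using System;

// Hypothetical Indexable type: exposes Length plus a Slice(int, int) method.
public sealed class Window
{
    private readonly int[] _data;

    public Window(int[] data) => _data = data;

    public int Length => _data.Length;

    public int this[int i] => _data[i];

    // Assumption for this sketch: the second argument is an exclusive end.
    public Window Slice(int start, int end)
    {
        var copy = new int[end - start];
        Array.Copy(_data, start, copy, 0, copy.Length);
        return new Window(copy);
    }
}

public static class SliceLowering
{
    // Hand-lowered form of `window[start..end]`.
    public static Window LowerSlice(Window window, Index start, Index end)
    {
        int s = start.GetOffset(window.Length);
        int e = end.GetOffset(window.Length);
        return window.Slice(s, e);
    }

    // The `string` special case: Substring is used instead of Slice.
    public static string LowerSubstring(string text, Index start, Index end)
    {
        int s = start.GetOffset(text.Length);
        int e = end.GetOffset(text.Length);
        return text.Substring(s, e - s); // Substring takes a length, not an end
    }
}
```

`Span<T>.Slice(int, int)` also takes a length as its second argument, so a lowering that targets it would need the same `end - start` adjustment.
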
## Open Issues

## Considerations

### Detect Indexable based on ICollection

The inspiration for this proposal was collection initializers: using the structure of a type to convey that it has
opted into a feature. In the case of collection initializers, types can opt into the feature by implementing the
interface `IEnumerable` (non generic).

Initially this proposal required that types implement `ICollection` in order to qualify as Indexable. That, though,
required a number of special cases:

- `ref struct`: these cannot implement interfaces, yet types like `Span<T>` are ideal for index / range support.
- `string`: does not implement `ICollection` and adding that `interface` has a large cost.

This means that special casing is already needed to support key types. The special casing of `string` is less
interesting as the language does this in other areas (`foreach` lowering, constants, etc.). The special casing of
`ref struct` is more concerning as it's special casing an entire class of types. They get labeled as Indexable if they
simply have a property named `Count` with a return type of `int`.

After consideration the design was normalized to say that any type which has a property `Count` / `Length` with a
return type of `int` is Indexable. That removes all special casing, even for `string` and arrays.

### Detect just Count

Detecting on the property names `Count` or `Length` does complicate the design a bit. Picking just one to standardize
on, though, is not sufficient as it ends up excluding a large number of types:

- Use `Length`: excludes pretty much every collection in System.Collections and sub-namespaces. Those tend to derive
from `ICollection` and hence prefer `Count` over `Length`.
- Use `Count`: excludes `string`, arrays, `Span<T>` and most `ref struct` based types.

The extra complication in the initial detection of Indexable types is outweighed by its simplification in other
aspects.

### Choice of Slice as a name

The name `Slice` was chosen as it's the de-facto standard name for slice style operations in .NET. Starting with
netcoreapp2.1 all span style types use the name `Slice` for slicing operations. Prior to netcoreapp2.1 there really
aren't any examples of slicing to look to. Types like `List<T>`, `ArraySegment<T>`, `SortedList<TKey, TValue>`
would've been ideal for slicing but the concept didn't exist when the types were added.

Thus, `Slice` being the sole example, it was chosen as the name.

## Related Issues

- https://github.com/dotnet/csharplang/blob/master/proposals/csharp-8.0/ranges.cs
- https://github.com/dotnet/csharplang/blob/master/proposals/csharp-8.0/ranges.md

## Design Meetings