Merge pull request #866 from stephentoub/update_async_stream_proposal

Update async stream proposal based on design meeting
Stephen Toub 2017-08-31 00:25:18 -04:00 committed by GitHub
commit cd1c38a546


@ -8,7 +8,7 @@
## Summary
[summary]: #summary
C# has support for iterator methods and async methods, but no support for a method that is both an iterator and an async method. We should rectify this by allowing for await to be used in a new form of iterator, one that returns an `IAsyncEnumerable<T>` or `IAsyncEnumerator<T>` rather than an `IEnumerable<T>` or `IEnumerator<T>`. An optional `IAsyncDisposable` interface is also used to enable for asynchronous cleanup.
C# has support for iterator methods and async methods, but no support for a method that is both an iterator and an async method. We should rectify this by allowing for `await` to be used in a new form of `async` iterator, one that returns an `IAsyncEnumerable<T>` or `IAsyncEnumerator<T>` rather than an `IEnumerable<T>` or `IEnumerator<T>`, with `IAsyncEnumerable<T>` consumable in a new `foreach await`. An `IAsyncDisposable` interface is also used to enable asynchronous cleanup.
## Related discussion
- https://github.com/dotnet/roslyn/issues/261
@ -33,9 +33,9 @@ namespace System
}
}
```
Types may implement both `IDisposable` and `IAsyncDisposable`. If they do, consumers of an instance should not invoke both `DisposeAsync` and `Dispose`; rather, if a type implements `IAsyncDisposable`, `DisposeAsync` should be used, else if it implements `IDisposable`, `Dispose` should be used.
As with `Dispose`, invoking `DisposeAsync` multiple times is acceptable, and subsequent invocations after the first should be treated as nops, returning a synchronously completed successful task (`DisposeAsync` need not be thread-safe, though, and need not support concurrent invocation). Further, types may implement both `IDisposable` and `IAsyncDisposable`, and if they do, it's similarly acceptable to invoke `Dispose` and then `DisposeAsync` or vice versa, but only the first should be meaningful and subsequent invocations of either should be a nop. As such, if a type does implement both, consumers are encouraged to call once and only once the more relevant method based on the context, `Dispose` in synchronous contexts and `DisposeAsync` in asynchronous ones.
I'm leaving discussion of how `IAsyncDisposable` interacts with `using` to a separate discussion. And coverage of how it interacts with `foreach` is handled later in this proposal.
(I'm leaving discussion of how `IAsyncDisposable` interacts with `using` to a separate discussion. And coverage of how it interacts with `foreach` is handled later in this proposal.)
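The disposal guidance above (only the first of `Dispose`/`DisposeAsync` does work, later calls are nops that complete synchronously) can be sketched roughly as follows. This is an illustration only: `DualDisposable`, `CleanupCount`, and `FlushAsync` are hypothetical names, and `IAsyncDisposable` is declared locally so that it matches the `Task`-returning shape used in this proposal.

```C#
using System;
using System.Threading.Tasks;

namespace DisposalSketch
{
    // Local stand-in mirroring the interface shape proposed in this document.
    public interface IAsyncDisposable
    {
        Task DisposeAsync();
    }

    // Hypothetical resource that supports both synchronous and asynchronous disposal.
    public sealed class DualDisposable : IDisposable, IAsyncDisposable
    {
        private bool _disposed; // not thread-safe, which the guidance above permits

        // Counts how many times real cleanup ran (for demonstration only).
        public int CleanupCount { get; private set; }

        public void Dispose()
        {
            if (_disposed) return; // nop after the first Dispose/DisposeAsync
            _disposed = true;
            CleanupCount++;
        }

        public Task DisposeAsync()
        {
            if (_disposed) return Task.CompletedTask; // synchronously completed nop
            _disposed = true;
            CleanupCount++;
            return FlushAsync();
        }

        // Placeholder for genuinely asynchronous cleanup work (assumption).
        private static Task FlushAsync() => Task.Delay(1);
    }

    public static class Program
    {
        public static async Task Main()
        {
            var resource = new DualDisposable();
            await resource.DisposeAsync(); // performs the cleanup
            resource.Dispose();            // nop
            await resource.DisposeAsync(); // nop
            Console.WriteLine(resource.CleanupCount); // prints 1
        }
    }
}
```

A consumer in an asynchronous context would normally call only `DisposeAsync`; the extra calls in `Main` are there just to show that they are harmless.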
Alternatives considered:
- _`DisposeAsync` accepting a `CancellationToken`_: while in theory it makes sense that anything async can be canceled, disposal is about cleanup, closing things out, freeing resources, etc., which is generally not something that should be canceled; cleanup is still important for work that's canceled. The same `CancellationToken` that caused the actual work to be canceled would typically be the same token passed to `DisposeAsync`, making `DisposeAsync` worthless because cancellation of the work would cause `DisposeAsync` to be a nop. If someone wants to avoid being blocked waiting for disposal, they can avoid waiting on the resulting `Task`, or wait on it only for some period of time.
@ -52,7 +52,7 @@ namespace System.Collections.Generic
{
public interface IAsyncEnumerable<out T>
{
IAsyncEnumerator<T> GetAsyncEnumerator(CancellationToken cancellationToken = default);
IAsyncEnumerator<T> GetAsyncEnumerator();
}
public interface IAsyncEnumerator<out T> : IAsyncDisposable
@ -83,7 +83,7 @@ Discarded options considered:
- _`ITask<T?> TryMoveNextAsync();`_: Not covariant, allocations on every call, etc.
- _`ITask<(bool,T)> TryMoveNextAsync();`_: Not covariant, allocations on every call, etc.
- _`Task<bool> TryMoveNextAsync(out T result);`_: The `out` result would need to be set when the operation returns synchronously, not when it asynchronously completes the task potentially sometime long in the future, at which point there'd be no way to communicate the result.
- _`IAsyncEnumerator<T>` not implementing `IAsyncDisposable`_: We could choose to separate these. However, doing so complicates certain other areas of the proposal, as code must then be able to deal with the possibility that an enumerator doesn't provide disposal, which makes it difficult to write pattern-based helpers (see `ConfigureAwait` discussion). Further, it will be common for enumerators to have a need for disposal (e.g. any C# async iterator that has a finally block, most things enumerating data from a network connection, etc.), and if one doesn't, it is simple to implement the method purely as `public Task DisposeAsync() => Task.CompletedTask;` with minimal additional overhead.
- _`IAsyncEnumerator<T>` not implementing `IAsyncDisposable`_: We could choose to separate these. However, doing so complicates certain other areas of the proposal, as code must then be able to deal with the possibility that an enumerator doesn't provide disposal, which makes it difficult to write pattern-based helpers. Further, it will be common for enumerators to have a need for disposal (e.g. any C# async iterator that has a finally block, most things enumerating data from a network connection, etc.), and if one doesn't, it is simple to implement the method purely as `public Task DisposeAsync() => Task.CompletedTask;` with minimal additional overhead.
#### Viable alternative:
```C#
@ -91,7 +91,7 @@ namespace System.Collections.Generic
{
public interface IAsyncEnumerable<out T>
{
IAsyncEnumerator<T> GetAsyncEnumerator(CancellationToken cancellationToken = default);
IAsyncEnumerator<T> GetAsyncEnumerator();
}
public interface IAsyncEnumerator<out T>
@ -108,15 +108,15 @@ while (await enumerator.WaitForNextAsync())
{
while (true)
{
int item = e2.TryGetNext(out bool success);
int item = enumerator.TryGetNext(out bool success);
if (!success) break;
Use(item);
}
}
```
Consumption of this interface is obviously more complex. However, the advantage of this is two-fold, one minor and one major:
- _Allows for an enumerator to support multiple consumers_. There may be scenarios where it's valuable for an enumerator to support multiple concurrent consumers. That can't be achieved when `MoveNextAsync` and `Current` are separate such that an implementation can't make their usage atomic. In contrast, this approach provides a single method `TryGetNext` that supports pushing the enumerator forward and getting the next item, so the enumerator can enable atomicity if desired. However, it's likely that such scenarios could also be enabled by giving each consumer its own enumerator from a shared enumerable. Further, we don't want to enforce that every enumerator support concurrent usage, as that would add non-trivial overheads to the majority case that doesn't require it, which means a consumer of the interface generally couldn't rely on this anyway.
- _Performance_. The `MoveNextAsync`/`Current` approach requires two interface calls per operation, whereas the best case for `WaitForNextAsync`/`TryGetNext` is that most iterations complete synchronously, enabling a tight inner loop with `TryGetNext`, such that we only have one interface call per operation. This can have a measurable impact.
- _Minor: Allows for an enumerator to support multiple consumers_. There may be scenarios where it's valuable for an enumerator to support multiple concurrent consumers. That can't be achieved when `MoveNextAsync` and `Current` are separate such that an implementation can't make their usage atomic. In contrast, this approach provides a single method `TryGetNext` that supports pushing the enumerator forward and getting the next item, so the enumerator can enable atomicity if desired. However, it's likely that such scenarios could also be enabled by giving each consumer its own enumerator from a shared enumerable. Further, we don't want to enforce that every enumerator support concurrent usage, as that would add non-trivial overheads to the majority case that doesn't require it, which means a consumer of the interface generally couldn't rely on this anyway.
- _Major: Performance_. The `MoveNextAsync`/`Current` approach requires two interface calls per operation, whereas the best case for `WaitForNextAsync`/`TryGetNext` is that most iterations complete synchronously, enabling a tight inner loop with `TryGetNext`, such that we only have one interface call per operation. This can have a measurable impact.
From a performance perspective, here's a set of BenchmarkDotNet microbenchmarks just to highlight the best case impact (typical iterators probably wouldn't benefit as much). The enumerator in use here simply returns integers counting up from 0 to some max value, and may asynchronously yield (i.e. not complete synchronously when getting an item) every N items.
```C#
@ -206,7 +206,7 @@ public class Program
}
// IAsyncEnumerable1<T>
IAsyncEnumerator1<int> IAsyncEnumerable1<int>.GetAsyncEnumerator(CancellationToken cancellationToken) => this;
IAsyncEnumerator1<int> IAsyncEnumerable1<int>.GetAsyncEnumerator() => this;
Task<bool> IAsyncEnumerator1<int>.MoveNextAsync()
{
if (_current >= _items - 1) return s_falseTask;
@ -226,7 +226,7 @@ public class Program
int IAsyncEnumerator1<int>.Current => _current;
// IAsyncEnumerable2<T>
IAsyncEnumerator2<int> IAsyncEnumerable2<int>.GetAsyncEnumerator(CancellationToken cancellationToken) => this;
IAsyncEnumerator2<int> IAsyncEnumerable2<int>.GetAsyncEnumerator() => this;
Task<bool> IAsyncEnumerator2<int>.WaitForNextAsync()
{
if (_current >= _items - 1) return s_falseTask;
@ -284,7 +284,7 @@ namespace System.Collections.Generic
{
public interface IAsyncEnumerable1<out T>
{
IAsyncEnumerator1<T> GetAsyncEnumerator(CancellationToken cancellationToken = default(CancellationToken));
IAsyncEnumerator1<T> GetAsyncEnumerator();
}
public interface IAsyncEnumerator1<out T>
{
@ -294,7 +294,7 @@ namespace System.Collections.Generic
public interface IAsyncEnumerable2<out T>
{
IAsyncEnumerator2<T> GetAsyncEnumerator(CancellationToken cancellationToken = default(CancellationToken));
IAsyncEnumerator2<T> GetAsyncEnumerator();
}
public interface IAsyncEnumerator2<out T>
{
@ -316,7 +316,7 @@ On my machine, I get output like the following:
```
In these benchmarks, the enumerator hands back 10,000,000 integers. In the first three lines, it's yielding (completing asynchronously) every 100 items, and in the second three lines, it's essentially just yielding once at the very beginning and then never again.
Note that the difference becomes even more stark if an async method is used to implement `MoveNextAsync` and `WaitForNextAsync`, as async methods have more overhead associated with them than do normal methods. If I change the methods accordingly:
Note that the difference becomes even more stark if `async` is used to implement `MoveNextAsync` and `WaitForNextAsync`, as async methods have more overhead associated with them than do normal methods. If I change the methods accordingly:
```C#
async Task<bool> IAsyncEnumerator1<int>.MoveNextAsync()
{
@ -365,16 +365,133 @@ Discarded options considered:
(The remainder of this proposal is written in terms of the `MoveNextAsync`/`Current`-based interface, but it could be changed to use the `WaitForNextAsync`/`TryGetNext` interface if desired.)
#### Cancellation
Logically it makes sense for each individual `MoveNextAsync` operation to accept a `CancellationToken` so that it can be canceled in a very fine-grained way. However:
1. That's prohibitively expensive in many situations, as the code in many iterators needs to register/unregister for a callback on each item.
2. There's rarely a need to supply a different `CancellationToken` per item.
3. Logically the individual `MoveNextAsync` calls are just part of the larger async operation to do with enumerating the whole enumerator, so it makes sense to treat it as a unit from a cancellation perspective.
Given this, cancellation is defined at the enumerator level rather than at the sub-`MoveNextAsync` level.
There are several possible approaches to supporting cancellation:
1. `IAsyncEnumerable<T>`/`IAsyncEnumerator<T>` are cancellation-agnostic: `CancellationToken` doesn't appear anywhere. Cancellation is achieved by logically baking the `CancellationToken` into the enumerable and/or enumerator in whatever manner is appropriate, e.g. when calling an iterator, passing the `CancellationToken` as an argument to the iterator method and using it in the body of the iterator, as is done with any other parameter.
2. `IAsyncEnumerable<T>.GetEnumerator(CancellationToken)`: You pass a `CancellationToken` to `GetEnumerator`, and subsequent `MoveNextAsync` operations respect it however they can.
3. `IAsyncEnumerator<T>.MoveNextAsync(CancellationToken)`: You pass a `CancellationToken` to each individual `MoveNextAsync` call.
4. 1 && 2: You both embed `CancellationToken`s into your enumerable/enumerator and pass `CancellationToken`s into `GetEnumerator`.
5. 1 && 3: You both embed `CancellationToken`s into your enumerable/enumerator and pass `CancellationToken`s into `MoveNextAsync`.
#### ConfigureAwait
From a purely theoretical perspective, (5) is the most robust, in that (a) `MoveNextAsync` accepting a `CancellationToken` enables the most fine-grained control over what's canceled, and (b) `CancellationToken` is just like any other type that can be passed as an argument into iterators, embedded in arbitrary types, etc.
In support of `foreach`, the following types would be added to .NET as well, likely to System.Threading.Tasks.Extensions.dll:
However, there are multiple problems with that approach:
- How does a `CancellationToken` passed to `GetEnumerator` make it into the body of the iterator? We could expose a new `iterator` keyword that you could dot off of to get access to the `CancellationToken` passed to `GetEnumerator`, but a) that's a lot of additional machinery, b) we're making it a very first-class citizen, and c) the 99% case would seem to be the same code both calling an iterator and calling `GetEnumerator` on it, in which case it can just pass the `CancellationToken` as an argument into the method.
- How does a `CancellationToken` passed to `MoveNextAsync` get into the body of the method? This is even worse, as if it's exposed off of an `iterator` local object, its value could change across awaits, which means any code that registered with the token would need to unregister from it prior to awaits and then re-register after; it's also potentially quite expensive to need to do such registering and unregistering in every `MoveNextAsync` call, regardless of whether implemented by the compiler in an iterator or by a developer manually.
- How does a developer cancel a `foreach` loop? If it's done by giving a `CancellationToken` to an enumerable/enumerator, then either a) we need to support `foreach`'ing over enumerators, which raises them to being first-class citizens, and now you need to start thinking about an ecosystem built up around enumerators (e.g. LINQ methods) or b) we need to embed the `CancellationToken` in the enumerable anyway by having some `WithCancellation` extension method off of `IAsyncEnumerable<T>` that would store the provided token and then pass it into the wrapped enumerable's `GetEnumerator` when the `GetEnumerator` on the returned struct is invoked (ignoring that token). Or, you can just use the `CancellationToken` you have in the body of the foreach.
- If/when query comprehensions are supported, how would the `CancellationToken` supplied to `GetEnumerator` or `MoveNextAsync` be passed into each clause? The easiest way would simply be for the clause to capture it, at which point whatever token is passed to `GetEnumerator`/`MoveNextAsync` is ignored.
Due to all of this, the simplest and most consistent solution is simply to do (1): `IAsyncEnumerable<T>`/`IAsyncEnumerator<T>` are cancellation-agnostic. If you want to cancel a `foreach` loop, you can use a `CancellationToken` in the body and in any methods you call:
```C#
CancellationToken ct = ...;
foreach await (var i in GetData())
{
ct.ThrowIfCancellationRequested();
await UseAsync(i, ct);
...
}
```
If you want to pass a `CancellationToken` into an iterator, you simply do so as an argument, just as with other `async` methods:
```C#
static async IAsyncEnumerable<T> GetData(CancellationToken cancellationToken = default)
{
using (cancellationToken.Register(...))
{
...
await ...
...
}
}
...
foreach await (T i in GetData(ct))
{
...
}
```
If you want to use a `CancellationToken` in a query comprehension, you just capture it:
```C#
CancellationToken ct = ...
IAsyncEnumerable<string> results = from url in source
select DownloadAsync(url, ct);
```
Etc.
For now, we should pursue (1).
## foreach
`foreach` will be augmented to support `IAsyncEnumerable<T>` in addition to its existing support for `IEnumerable<T>`. And it will support the equivalent of `IAsyncEnumerable<T>` as a pattern if the relevant members are exposed publicly, falling back to using the interface directly if not, in order to enable struct-based extensions that avoid allocating as well as using alternative awaitables as the return type of `MoveNextAsync` and `DisposeAsync`.
### Syntax
Using the syntax:
```C#
foreach (var i in enumerable)
```
C# will continue to treat `enumerable` as a synchronous enumerable, such that even if it exposes the relevant APIs for async enumerables (exposing the pattern or implementing the interface), it will only consider the synchronous APIs.
To force `foreach` to instead only consider the asynchronous APIs, `await` is inserted as follows:
```C#
foreach await (var i in enumerable)
```
No syntax would be provided that would support using either the async or the sync APIs; the developer must choose based on the syntax used.
Discarded options considered:
- _`foreach (var i in await enumerable)`_: This is already valid syntax, and changing its meaning would be a breaking change. This means to `await` the `enumerable`, get back something synchronously iterable from it, and then synchronously iterate through that.
- _`foreach (var i await in enumerable)`, `foreach (var await i in enumerable)`, `foreach (await var i in enumerable)`_: These all suggest that we're awaiting the next item, but there are other awaits involved in foreach, in particular if the enumerable is an `IAsyncDisposable`, we will be `await`'ing its async disposal. That await is at the scope of the foreach rather than for each individual element, and thus the `await` keyword deserves to be at the `foreach` level. Further, having it associated with the `foreach` gives us a way to describe the `foreach` with a different term, e.g. a "foreach await". But more importantly, there's value in considering `foreach` syntax at the same time as `using` syntax, so that they remain consistent with each other, and `using (await ...)` is already valid syntax.
- _`await foreach (var i in enumerable)`_: This suggests that the entire `foreach` is somehow returning something that's being `await`'d, but it's not.
- _`async foreach (var i in enumerable)`_: This reads very nicely, but we've established it's important to have `await`s in async methods, yet with this you could have an async method that awaited but without the `await` keyword anywhere in sight, making it potentially confusing.
Still to consider:
- `foreach` today does not support iterating through an enumerator. We expect it will be more common to have `IAsyncEnumerator<T>`s handed around, and thus it's tempting to support `foreach await` with both `IAsyncEnumerable<T>` and `IAsyncEnumerator<T>`. But once we add such support, it introduces the question of whether `IAsyncEnumerator<T>` is a first-class citizen, and whether we need to have overloads of combinators that operate on enumerators in addition to enumerables? Do we want to encourage methods to return enumerators rather than enumerables? We should continue to discuss this. If we decide we don't want to support it, we might want to introduce an extension method `public static IAsyncEnumerable<T> AsEnumerable<T>(this IAsyncEnumerator<T> enumerator);` that would allow an enumerator to still be `foreach`'d. If we decide we do want to support it, we'll need to also decide on whether the `foreach await` would be responsible for calling `DisposeAsync` on the enumerator, and the answer is likely "no, control over disposal should be handled by whoever called `GetEnumerator`."
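One possible shape for the `AsEnumerable` adapter mentioned above is sketched here, under several assumptions: the interfaces are declared locally to mirror the shapes proposed in this document, `SingleUseEnumerable<T>` and `RangeEnumerator` are hypothetical helper types, and the consuming loop is written out by hand in the lowered form rather than with `foreach await`.

```C#
using System;
using System.Threading.Tasks;

namespace EnumeratorAdapterSketch
{
    // Local stand-ins mirroring the interface shapes proposed in this document.
    public interface IAsyncDisposable { Task DisposeAsync(); }
    public interface IAsyncEnumerable<out T> { IAsyncEnumerator<T> GetAsyncEnumerator(); }
    public interface IAsyncEnumerator<out T> : IAsyncDisposable
    {
        Task<bool> MoveNextAsync();
        T Current { get; }
    }

    public static class AsyncEnumeratorExtensions
    {
        // Hypothetical adapter: wraps an existing enumerator so it can be consumed as an enumerable.
        public static IAsyncEnumerable<T> AsEnumerable<T>(this IAsyncEnumerator<T> enumerator) =>
            new SingleUseEnumerable<T>(enumerator);

        private sealed class SingleUseEnumerable<T> : IAsyncEnumerable<T>
        {
            private readonly IAsyncEnumerator<T> _enumerator;
            private bool _handedOut;

            public SingleUseEnumerable(IAsyncEnumerator<T> enumerator) => _enumerator = enumerator;

            public IAsyncEnumerator<T> GetAsyncEnumerator()
            {
                // An enumerator cannot be rewound, so a second enumeration is rejected.
                if (_handedOut) throw new InvalidOperationException("The wrapped enumerator can be enumerated only once.");
                _handedOut = true;
                return _enumerator;
            }
        }
    }

    // Demo enumerator yielding 0..count-1, always completing synchronously.
    public sealed class RangeEnumerator : IAsyncEnumerator<int>
    {
        private readonly int _count;
        private int _current = -1;
        public RangeEnumerator(int count) => _count = count;
        public int Current => _current;
        public Task<bool> MoveNextAsync() => Task.FromResult(++_current < _count);
        public Task DisposeAsync() => Task.CompletedTask;
    }

    public static class Program
    {
        // Hand-written equivalent of iterating the enumerable and summing its items.
        public static async Task<int> SumAsync(IAsyncEnumerable<int> source)
        {
            int sum = 0;
            IAsyncEnumerator<int> e = source.GetAsyncEnumerator();
            try
            {
                while (await e.MoveNextAsync()) sum += e.Current;
            }
            finally
            {
                await e.DisposeAsync();
            }
            return sum;
        }

        public static async Task Main() =>
            Console.WriteLine(await SumAsync(new RangeEnumerator(4).AsEnumerable())); // prints 6
    }
}
```

Because the adapter hands back the original enumerator, the consuming loop ends up disposing it. Whether that is the desired ownership model is part of the open question above.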
### Pattern-based Compilation
The compiler will bind to the pattern-based APIs if they exist, preferring those over using the interface (the pattern may be satisfied with instance methods or extension methods). The requirements for the pattern are:
- The enumerable must expose a `GetAsyncEnumerator` method that may be called with no arguments and that returns an enumerator that meets the relevant pattern.
- The enumerator must expose a `MoveNextAsync` method that may be called with no arguments and that returns something which may be `await`ed and whose `GetResult()` returns a `bool`.
- The enumerator must also expose a `Current` property whose getter returns a `T` representing the kind of data being enumerated.
- The enumerator may optionally expose a `DisposeAsync` method that may be invoked with no arguments and that returns something that can be `await`ed and whose `GetResult()` returns `void`.
This code:
```C#
var enumerable = ...;
foreach await (T item in enumerable)
{
...
}
```
is translated to the equivalent of:
```C#
var enumerable = ...;
var enumerator = enumerable.GetAsyncEnumerator();
try
{
while (await enumerator.MoveNextAsync())
{
T item = enumerator.Current;
...
}
}
finally
{
await enumerator.DisposeAsync(); // omitted, along with the try/finally, if the enumerator doesn't expose DisposeAsync
}
```
If the iterated type doesn't expose the right pattern, the interfaces will be used.
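As a rough illustration of the pattern requirements, the following sketch shows a struct-based enumerable that implements no interface and exposes only the pattern members, together with the hand-written equivalent of the translation shown above. `CountingEnumerable` and `SumAsync` are hypothetical names. The enumerator omits `DisposeAsync`, which the pattern treats as optional, so the lowered loop has no `try`/`finally`.

```C#
using System;
using System.Threading.Tasks;

namespace PatternSketch
{
    // Hypothetical enumerable that satisfies the pattern without implementing any interface.
    public readonly struct CountingEnumerable
    {
        private readonly int _count;
        public CountingEnumerable(int count) => _count = count;

        // Pattern member: callable with no arguments, returns a pattern-conforming enumerator.
        public Enumerator GetAsyncEnumerator() => new Enumerator(_count);

        public struct Enumerator
        {
            private readonly int _count;
            private int _current;

            internal Enumerator(int count) { _count = count; _current = -1; }

            // Pattern member: the item type here is int.
            public int Current => _current;

            // Pattern member: returns something awaitable whose GetResult() yields a bool.
            public Task<bool> MoveNextAsync()
            {
                if (_current >= _count - 1) return Task.FromResult(false);
                _current++;
                return Task.FromResult(true);
            }
        }
    }

    public static class Program
    {
        // Hand-lowered form of iterating the enumerable and summing the items.
        public static async Task<int> SumAsync(CountingEnumerable enumerable)
        {
            int sum = 0;
            var enumerator = enumerable.GetAsyncEnumerator();
            while (await enumerator.MoveNextAsync())
            {
                int item = enumerator.Current;
                sum += item;
            }
            return sum;
        }

        public static async Task Main() =>
            Console.WriteLine(await SumAsync(new CountingEnumerable(5))); // prints 10
    }
}
```

The loop allocates no enumerator object because both the enumerable and the enumerator are structs, which is the kind of allocation-avoiding extension the pattern is meant to allow.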
### ConfigureAwait
This pattern-based compilation will allow `ConfigureAwait` to be used on all of the awaits, via a `ConfigureAwait` extension method:
```C#
foreach await (T item in enumerable.ConfigureAwait(false))
{
...
}
```
This will be based on types we'll add to .NET as well, likely to System.Threading.Tasks.Extensions.dll:
```C#
// Approximate implementation, omitting arg validation and the like
namespace System.Threading.Tasks
@ -384,9 +501,6 @@ namespace System.Threading.Tasks
public static ConfiguredAsyncEnumerable<T> ConfigureAwait<T>(this IAsyncEnumerable<T> enumerable, bool continueOnCapturedContext) =>
new ConfiguredAsyncEnumerable<T>(enumerable, continueOnCapturedContext);
public static ConfiguredAsyncEnumerator<T> ConfigureAwait<T>(this IAsyncEnumerator<T> enumerator, bool continueOnCapturedContext) =>
new ConfiguredAsyncEnumerable<T>.Enumerator(enumerator, continueOnCapturedContext);
public struct ConfiguredAsyncEnumerable<T>
{
private readonly IAsyncEnumerable<T> _enumerable;
@ -424,112 +538,10 @@ namespace System.Threading.Tasks
}
}
```
Use of this will be shown in the subsequent section.
## foreach
`foreach` will be augmented to support `IAsyncEnumerable<T>` in addition to its existing support for `IEnumerable<T>`. Further, `foreach` will be augmented to support `IAsyncEnumerator<T>`. And it will support these APIs as patterns if the relevant members are exposed publicly, falling back to using the interface directly if not.
### Syntax
Using the syntax:
```C#
foreach (var i in enumerable)
```
C# will continue to treat `enumerable` as a synchronous enumerable, such that even if it exposes the relevant APIs for async enumerables (exposing the pattern or implementing the interface), it will only consider the synchronous APIs.
To force `foreach` to instead only consider the asynchronous APIs, `await` is inserted as follows:
```C#
foreach await (var i in enumerable)
```
No syntax would be provided that would support using either the async or the sync APIs; the developer must choose based on the syntax used.
Discarded options considered:
- _`foreach (var i in await enumerable)`_: This is already valid syntax, and changing its meaning would be a breaking change. This means to `await` the `enumerable`, get back something synchronously iterable from it, and then synchronously iterate through that.
- _`foreach (var i await in enumerable)`, `foreach (var await i in enumerable)`, `foreach (await var i in enumerable)`_: These all suggest that we're awaiting the next item, but there are other awaits involved in foreach, in particular if the enumerable is an `IAsyncDisposable`, we will be `await`'ing its async disposal. That await is at the scope of the foreach rather than for each individual element, and thus the `await` keyword deserves to be at the `foreach` level. Further, having it associated with the `foreach` gives us a way to describe the `foreach` with a different term, e.g. a "foreach await".
- _`await foreach (var it in enumerable)`_: This suggests that the entire `foreach` is somehow returning something that's being `await`'d, but it's not.
### Pattern-based Compilation
The compiler will bind to the pattern-based APIs, if they exist, preferring those over using the interface (the pattern may be satisfied with instance methods or extension methods). The requirements for the pattern are:
- The enumerable may expose a `GetAsyncEnumerator` method that may be called with no arguments and that returns something that may be enumerated. If it does, that enumerator is then used for the remainder of the requirements; if it doesn't, then it itself must meet the enumerator requirements.
- The enumerator must expose a `MoveNextAsync` method that may be called with no arguments and that returns something which may be `await`ed and whose `GetResult()` returns a `bool`.
- The enumerator must also expose a `Current` property whose getter returns a `T` representing the kind of data being enumerated.
- The enumerator may optionally expose a `DisposeAsync` method that may be invoked with no arguments and that returns something that can be `await`ed and whose `GetResult()` returns `void`.
This code:
```C#
var enumerable = ...;
foreach await (T item in enumerable)
{
...
}
```
is translated to the equivalent of:
```C#
var enumerable = ...;
var enumerator = enumerable.GetAsyncEnumerator();
try
{
while (await enumerator.MoveNextAsync())
{
T item = enumerator.Current;
...
}
}
finally
{
await enumerator.DisposeAsync(); // omitted, along with the try/finally, if the enumerator doesn't expose DisposeAsync
}
```
Further, this code:
```C#
foreach await (T item in enumerator)
{
...
}
```
is translated to the equivalent of:
```C#
try
{
while (await enumerator.MoveNextAsync())
{
T item = enumerator.Current;
...
}
}
finally
{
await enumerator.DisposeAsync(); // omitted, along with the try/finally, if the enumerator doesn't expose DisposeAsync
}
```
If the iterated type doesn't expose the right pattern, the interfaces will be used.
### ConfigureAwait
This pattern-based compilation will allow `ConfigureAwait` to be used on all of the awaits, via the `ConfigureAwait` extension method described earlier:
```C#
foreach await (T item in enumerable.ConfigureAwait(false))
{
...
}
```
Note that this approach will not enable `ConfigureAwait` to be used with pattern-based enumerables, but then again it's already the case that the `ConfigureAwait` is only exposed as an extension on `Task`/`Task<T>`/`ValueTask<T>` and can't be applied to arbitrary awaitable things, as it only makes sense when applied to Tasks (it controls a behavior implemented in Task's continuation support), and thus doesn't make sense when using a pattern where the awaitable things may not be tasks. Anyone returning awaitable things can provide their own custom behavior in such advanced scenarios.
### Cancellation
If code desires to provide a `CancellationToken` that can be used to cancel the enumerator, that can then be done simply by calling `GetEnumerator` on the enumerable rather than having the compiler do it explicitly:
```C#
foreach await (T item in enumerable.GetEnumerator(cancellationToken))
{
...
}
```
That `CancellationToken` will be respected by the enumerator however it sees fit.
(If we can come up with some way to support a scope- or assembly-level `ConfigureAwait` solution, then this won't be necessary.)
## Async Iterators
@ -558,7 +570,7 @@ but `await` can't be used in the body of these iterators. We will add that supp
The existing language support for iterators infers the iterator nature of the method based on whether it contains any `yield`s. The same will be true for async iterators. Such async iterators will be demarcated and differentiated from synchronous iterators via adding `async` to the signature, and must then also have either `IAsyncEnumerable<T>` or `IAsyncEnumerator<T>` as its return type. For example, the above example could be written as an async iterator as follows:
```C#
static IAsyncEnumerable<int> MyIterator()
static async IAsyncEnumerable<int> MyIterator()
{
try
{
@ -577,9 +589,9 @@ static IAsyncEnumerable<int> MyIterator()
```
Alternatives considered:
- _Not using `async` in the signature_: This may not be technically required, as the use of `IAsyncEnumerable<T>`/`IAsyncEnumerator<T>` should be sufficient to differentiate from synchronous iterators. However, we've established that `await` may only be used in methods marked as `async`, and it seems important to keep the consistency.
- _Not using `async` in the signature_: Using `async` is likely technically required by the compiler, as it uses it to determine whether `await` is valid in that context. But even if it's not required, we've established that `await` may only be used in methods marked as `async`, and it seems important to keep the consistency.
- _Enabling custom builders for `IAsyncEnumerable<T>`_: That's something we could look at for the future, but the machinery is complicated and we don't support that for the synchronous counterparts.
- _Having an `iterator` keyword_: Async iterators would use `async iterator` in the signature, and `yield` could only be used in `async` methods that included `iterator`; `iterator` would then be made optional on synchronous iterators. Depending on your perspective, this has the benefit of making it very clear by the signature of the method whether `yield` is allowed and whether the method is actually meant to return instances of type `IAsyncEnumerable<T>` rather than the compiler manufacturing one based on whether the code uses `yield` or not. But it is different from synchronous iterators, which don't and can't be made to require one. Plus some developers don't like the extra syntax.
- _Having an `iterator` keyword in the signature_: Async iterators would use `async iterator` in the signature, and `yield` could only be used in `async` methods that included `iterator`; `iterator` would then be made optional on synchronous iterators. Depending on your perspective, this has the benefit of making it very clear by the signature of the method whether `yield` is allowed and whether the method is actually meant to return instances of type `IAsyncEnumerable<T>` rather than the compiler manufacturing one based on whether the code uses `yield` or not. But it is different from synchronous iterators, which don't and can't be made to require one. Plus some developers don't like the extra syntax. If we were designing it from scratch, we'd probably make this required, but at this point there's much more value in keeping async iterators close to sync iterators.
### Compilation
@ -713,39 +725,6 @@ async Task IAsyncDisposable.DisposeAsync()
```
We will need to avoid introducing any `await`s that weren't in the written C# code, so as not to introduce any unexpected behaviors (e.g. introducing a continuation that may then be scheduled back to an undesired context). And we will want to consider possible optimizations that could avoid various kinds of overhead involved in such a code generation approach.
### Cancellation
Cancellation may be achieved simply by passing a `CancellationToken` into the iterator as is the case for any other method:
```C#
static async IAsyncEnumerable<int> MyIterator(CancellationToken cancellationToken)
{
for (int i = 0; i < 100; i++)
{
await Task.Delay(1000, cancellationToken);
yield return i;
}
}
```
The compiler doesn't treat a `CancellationToken` argument as special in any way.
This does "bake" the `CancellationToken` into the enumerable, such that every call to `GetEnumerator` on it implicitly shares the same token, but we expect it to be rare with async enumerables to enumerate the result of a single iterator call more than once. In other words, while we expect it'll be common to have code that does something like:
```C#
foreach await (T item in ProduceDataAsync(cancellationToken)) { ... }
foreach await (T item in ProduceDataAsync(cancellationToken)) { ... }
foreach await (T item in ProduceDataAsync(cancellationToken)) { ... }
```
we expect it to be much, much less common to want to do:
```C#
var enumerable = ProduceDataAsync(cancellationToken);
foreach await (T item in enumerable) { ... }
foreach await (T item in enumerable) { ... }
foreach await (T item in enumerable) { ... }
```
and to care about using a different `CancellationToken` for each enumeration (if you do care, you can just invoke the iterator multiple times, passing a different token each time).
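The "baked in" behavior can be observed with a minimal sketch. It assumes the syntax that eventually shipped in C# 8 (`await foreach`, where this draft writes `foreach await`); `BakedTokenDemo`, `ProduceDataAsync`, and `RunAsync` are hypothetical names used only for illustration. The token is an ordinary parameter captured by the iterator, so cancelling it after the first enumeration also cancels a second enumeration of the same enumerable:

```C#
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class BakedTokenDemo
{
    // Hypothetical producer: the token is a plain parameter, captured once per call.
    static async IAsyncEnumerable<int> ProduceDataAsync(CancellationToken cancellationToken)
    {
        for (int i = 0; i < 3; i++)
        {
            await Task.Delay(10, cancellationToken);
            yield return i;
        }
    }

    public static async Task<(int FirstCount, bool SecondCanceled)> RunAsync()
    {
        using var cts = new CancellationTokenSource();

        // A single call to the iterator; both loops below enumerate this one object.
        IAsyncEnumerable<int> enumerable = ProduceDataAsync(cts.Token);

        int firstCount = 0;
        await foreach (int item in enumerable)
        {
            firstCount++;
        }

        // Cancelling now affects every later enumeration of the same enumerable,
        // because they all share the token captured above.
        cts.Cancel();

        bool secondCanceled = false;
        try
        {
            await foreach (int item in enumerable) { }
        }
        catch (OperationCanceledException)
        {
            secondCanceled = true;
        }

        return (firstCount, secondCanceled);
    }
}
```

To give each enumeration its own token under this design, call `ProduceDataAsync` again with a different token rather than reusing `enumerable`.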
Alternatives considered:
- _Exposing the `CancellationToken` passed to the enumerable's `GetEnumerator` into the body of the method_: This could be done, for example, if the compiler added a new keyword like `iterator` that referred to state associated with the iterator, and the `CancellationToken` could be exposed off of that such that the body of the iterator could access `iterator.CancellationToken`. But that's a lot of machinery for little gain, and it could be introduced in the future if desired / if additional functionality might be exposed off of `iterator` to motivate it.
## LINQ
There are ~200 overloads of methods on the `System.Linq.Enumerable` class, all of which work in terms of `IEnumerable<T>`; some of these accept `IEnumerable<T>`, some of them produce `IEnumerable<T>`, and many do both. Adding LINQ support for `IAsyncEnumerable<T>` would likely entail duplicating all of these overloads for it, for another ~200. And since `IAsyncEnumerator<T>` is likely to be more common as a standalone entity in the asynchronous world than `IEnumerator<T>` is in the synchronous world, we could potentially need another ~200 overloads that work with `IAsyncEnumerator<T>`. Plus, a large number of the overloads deal with predicates (e.g. `Where` that takes a `Func<T, bool>`), and it may be desirable to have `IAsyncEnumerable<T>`-based overloads that deal with both synchronous and asynchronous predicates (e.g. `Func<T, Task<bool>>` in addition to `Func<T, bool>`). While this isn't applicable to all of the now ~400 new overloads, a rough calculation is that it'd be applicable to half, which means another ~200 overloads, for a total of ~600 new methods.
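To make the overload multiplication concrete, here is a minimal sketch of one operator, `Where`, written twice for `IAsyncEnumerable<T>`: once with a synchronous predicate and once with an asynchronous one. It is an illustration only, not the proposed library surface. It assumes the syntax that eventually shipped in C# 8 (`await foreach`, where this draft writes `foreach await`), and `AsyncEnumerableSketch`, `Range`, and `ToListAsync` are hypothetical helpers introduced for the example:

```C#
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class AsyncEnumerableSketch
{
    // Overload 1: synchronous predicate over an asynchronous source.
    public static async IAsyncEnumerable<T> Where<T>(
        this IAsyncEnumerable<T> source, Func<T, bool> predicate)
    {
        await foreach (T item in source)
        {
            if (predicate(item)) yield return item;
        }
    }

    // Overload 2: asynchronous predicate over an asynchronous source.
    public static async IAsyncEnumerable<T> Where<T>(
        this IAsyncEnumerable<T> source, Func<T, Task<bool>> predicate)
    {
        await foreach (T item in source)
        {
            if (await predicate(item)) yield return item;
        }
    }

    // Hypothetical helper: an asynchronous counterpart of Enumerable.Range.
    public static async IAsyncEnumerable<int> Range(int start, int count)
    {
        for (int i = 0; i < count; i++)
        {
            await Task.Yield();
            yield return start + i;
        }
    }

    // Hypothetical helper: drains the sequence into a list.
    public static async Task<List<T>> ToListAsync<T>(this IAsyncEnumerable<T> source)
    {
        var list = new List<T>();
        await foreach (T item in source) list.Add(item);
        return list;
    }
}
```

A plain lambda binds to the first overload and an `async` lambda binds to the second, so `AsyncEnumerableSketch.Range(0, 10).Where(x => x % 2 == 0).Where(async x => { await Task.Yield(); return x % 3 == 0; })` uses both. Every predicate-taking operator would need the same pair, which is where the extra ~200 overloads in the estimate above come from.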
@ -770,9 +749,9 @@ public static IAsyncEnumerable<TResult> Select<TSource, TResult>(this IAsyncEnum
```
then this would "just work":
```C#
IAsyncEnumerable<int> result = from item in enumerable
where item % 2 == 0
select SomeAsyncMethod(item);
IAsyncEnumerable<int> result = from item in enumerable
where item % 2 == 0
select SomeAsyncMethod(item);
async Task<int> SomeAsyncMethod(int item)
{
@ -790,6 +769,7 @@ IAsyncEnumerable<int> result = from item in enumerable
return item * 2;
};
```
or to enabling `await` to be used directly in expressions, such as by supporting `async from`. However, it's unlikely a design here would impact the rest of the feature set one way or the other, and this isn't a particularly high-value thing to invest in right now, so the proposal is to do nothing additional here right now.
## Integration with other asynchronous frameworks