# Function Pointers
## Summary
This proposal provides language constructs that expose IL opcodes that cannot currently be accessed efficiently,
or at all, in C# today: `ldftn` and `calli`. These IL opcodes can be important in high performance code and developers
need an efficient way to access them.
## Motivation
The motivations and background for this feature are described in the following issue (as is a
potential implementation of the feature):
https://github.com/dotnet/csharplang/issues/191

This is an alternate design proposal to [compiler intrinsics](https://github.com/dotnet/csharplang/blob/master/proposals/intrinsics.md)
## Detailed Design
### Function pointers
The language will allow for the declaration of function pointers using the `func*` syntax. The full syntax is described
in detail in the next section but it is meant to resemble the syntax used by `delegate` declarations.
``` csharp
unsafe class Example {
    delegate void DAction(int a);

    // Named Use rather than Example: a member cannot share the name of its enclosing type
    void Use(DAction d, func* void(int) f) {
        d(42);
        f(42);
    }
}
```
These types are represented using the function pointer type as outlined in ECMA-335. This means invocation
of a `func*` will use `calli` where invocation of a `delegate` will use `callvirt` on the `Invoke` method.
Syntactically, though, invocation is identical for both constructs.

The ECMA-335 definition of method pointers includes the calling convention as part of the type signature (section 7.1).
The default calling convention will be `managed`. Alternate forms can be specified by adding the appropriate modifier
after the `func*` syntax: `cdecl`, `fastcall`, `stdcall`, `thiscall` or `winapi`. Example:
``` csharp
// This method will be invoked using the cdecl calling convention
func* cdecl int(int value);

// This method will be invoked using the stdcall calling convention
func* stdcall int(int value);
```
Conversions between `func*` types are done based on their signature, including the calling convention.
``` csharp
unsafe class Example {
    void Conversions() {
        func* int(int, int) p1 = ...;
        func* managed int(int, int) p2 = ...;
        func* cdecl int(int, int) p3 = ...;

        p1 = p2; // okay: p1 and p2 have compatible signatures
        Console.WriteLine(p2 == p1); // True
        p2 = p3; // error: calling conventions are incompatible
    }
}
```
A `func*` type is a pointer type which means it has all of the capabilities and restrictions of a standard pointer
type:
- Only valid in an `unsafe` context.
- Methods which contain a `func*` parameter or return type can only be called from an `unsafe` context.
- Cannot be converted to `object`.
- Cannot be used as a generic argument.
- Can implicitly convert `func*` to `void*`.
- Can explicitly convert from `void*` to `func*`.

Restrictions:
- Custom attributes cannot be applied to a `func*` or any of its elements.
- A `func*` parameter cannot be marked as `params`.
- A `func*` type has all of the restrictions of a normal pointer type.
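
The pointer conversions listed above can be sketched as follows. This is only an illustration written in the proposed `func*` syntax, so it is not expected to compile with any shipping compiler. The `PointerConversions` class and `Log` method are hypothetical names, and the cast syntax shown is an assumption about how an explicit conversion would be spelled.

``` csharp
unsafe class PointerConversions {
    static void Log() { }

    void Use() {
        func* void() f = &Log;

        // Implicit conversion: func* to void*
        void* v = f;

        // Explicit conversion: void* back to func* requires a cast (assumed spelling)
        func* void() g = (func* void())v;

        // Error: a func* cannot be converted to object
        // object o = f;
    }
}
```
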
### Function pointer syntax
The full function pointer syntax is represented by the following grammar:
```
funcptr_type =
    'func' '*' [calling_convention] type method_arglist |
    '(' funcptr_type ')' ;

calling_convention =
    'managed' |
    'cdecl' |
    'winapi' |
    'fastcall' |
    'stdcall' |
    'thiscall' ;
```
When there is a nested function pointer, that is, a function pointer which takes or returns a function pointer, parens
can optionally be used to disambiguate the signature. They are not required, though, and the resulting types are equivalent.
``` csharp
delegate int Func1(string s);
delegate Func1 Func2(Func1 f);

// Function pointer equivalent without parens or calling convention
func* int(string);
func* func* int(string) int(func* int(string));

// Function pointer equivalent without parens and with calling convention
func* managed int(string);
func* managed func* managed int(string) int(func* managed int(string));

// Function pointer equivalent with parens and without calling convention
func* int(string);
func* (func* int(string)) int((func* int(string)));

// Function pointer equivalent with parens and calling convention
func* managed int(string);
func* managed (func* managed int(string)) int((func* managed int(string)));
```
When the calling convention is omitted from the syntax then `managed` will be used as the calling convention. That means
all of the forms of `Func1` and `Func2` defined above are equivalent signatures.
The calling convention cannot be omitted when the return type of the function pointer has the same name as a calling
convention. In that case the parser would process the return type as a calling convention instead of a type. To resolve
this the developer must specify both the calling convention and the return type.
``` csharp
class cdecl { }
// Function pointer which has a cdecl calling convention, a cdecl return type and takes a single
// parameter of type cdecl
func* cdecl cdecl(cdecl);
```
### Allow address-of to target methods
Method groups will now be allowed as arguments to an address-of expression. The type of such an
expression will be a `func*` which has the equivalent signature of the target method and a managed
calling convention:
``` csharp
unsafe class Util {
    public static void Log() { }

    void Use() {
        func* void() ptr1 = &Util.Log;

        // Error: type "func* void()" not compatible with "func* int()"
        func* int() ptr2 = &Util.Log;

        // Okay. Conversion to void* is always allowed.
        void* v = &Util.Log;
    }
}
```
The conversion of an address-of method group to `func*` has roughly the same process as method group to `delegate`
conversion. There are two additional restrictions to the existing process:
- Only members of the method group that are marked as `static` will be considered.
- Only a `func*` with a managed calling convention can be the target of such a conversion.

This means developers can depend on overload resolution rules to work in conjunction with the
address-of operator:
``` csharp
unsafe class Util {
    public static void Log() { }
    public static void Log(string p1) { }
    public static void Log(int i) { }

    void Use() {
        func* void() a1 = &Log; // Log()
        func* void(int) a2 = &Log; // Log(int i)

        // Error: ambiguous conversion from method group Log to "void*"
        void* v = &Log;
    }
}
```
The address-of operator will be implemented using the `ldftn` instruction.
Restrictions of this feature:
- Only applies to methods marked as `static`.
- Local functions cannot be used in `&`. The implementation details of these methods are
deliberately not specified by the language. This includes whether they are static vs. instance or
exactly what signature they are emitted with.
### Better function member
The better function member specification will be changed to include the following line:
> A `func*` is more specific than `void*`

This means that it is possible to overload on `void*` and a `func*` and still sensibly use the address-of operator.
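
As a sketch of how this rule would play out, consider two overloads that differ only in taking `void*` versus a `func*`. The example uses the proposed `func*` syntax, so it does not compile with any shipping compiler, and the `Run` overloads are hypothetical.

``` csharp
unsafe class Util {
    public static void Log() { }

    static void Run(void* p) { }
    static void Run(func* void() p) { }

    void Use() {
        // Both overloads are applicable: &Log converts to func* void() and to void*.
        // Under the proposed rule the func* overload is the better function member.
        Run(&Log);
    }
}
```

Without the added rule, a call like `Run(&Log)` would presumably have no clear winner between the two overloads.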
## Open Issues
- The address-of operator is limited to `static` methods in this proposal. It can be made to work with instance methods
but the behavior can be confusing to developers. The `this` type becomes an explicit first parameter on the `func*`
type. This means the behavior and usage would differ significantly from `delegate`. This extra confusion was the main
reason it was not included in the design.
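
To illustrate the concern, here is a hypothetical sketch of what instance method support might look like. None of this is part of the proposal. The `Counter` type and the exact shape of the resulting `func*` type are assumptions made purely for illustration.

``` csharp
using System;

unsafe class Counter {
    int _count;

    public void Add(int amount) { _count += amount; }

    static void Use(Counter c) {
        // With a delegate, the receiver is captured when the delegate is created.
        Action<int> d = c.Add;
        d(1);

        // Hypothetical: with a func*, the receiver would surface as an explicit first parameter.
        func* void(Counter, int) f = &Counter.Add;
        f(c, 1);
    }
}
```

The call sites `d(1)` and `f(c, 1)` show how the usage would diverge from `delegate`.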
## Considerations
### Don't require unsafe at declaration
Instead of requiring `unsafe` at every use of a `func*`, only require it at the point where a method group is
converted to a `func*`. This is where the core safety issues come into play (knowing that the containing assembly
cannot be unloaded while the value is alive). Requiring `unsafe` on the other locations can be seen as excessive.
This is how the design was originally intended. But the resulting language rules felt very awkward. It's impossible to
hide the fact that this is a pointer value and it kept peeking through even without the `unsafe` keyword. For example
the conversion to `object` can't be allowed, it can't be a member of a `class`, etc. The C# design is to require
`unsafe` for all pointer uses and hence this design follows that.

Developers will still be capable of presenting a _safe_ wrapper on top of `func*` values the same way that they do
for normal pointer types today. Consider:
``` csharp
unsafe struct Action {
    func* void() _ptr;

    Action(func* void() ptr) => _ptr = ptr;
    public void Invoke() => _ptr();
}
```
### Using delegates
Instead of using a new syntax element, `func*`, simply use existing `delegate` types with a `*` following the type:
``` csharp
Func<object, object, bool>* ptr = &object.ReferenceEquals;
```
Handling calling convention can be done by annotating the `delegate` types with an attribute that specifies
a `CallingConvention` value. The lack of an attribute would signify the managed calling convention.
Encoding this in IL is problematic. The underlying value needs to be represented as a pointer yet it also must:
1. Have a unique type to allow for overloads with different function pointer types.
1. Be equivalent for OHI purposes across assembly boundaries.

The last point is particularly problematic. This means that every assembly which uses `Func<int>*` must encode
an equivalent type in metadata even though `Func<int>*` is defined in an assembly they don't control.
Additionally any other type which is defined with the name `System.Func<T>` in an assembly that is not mscorlib
must be different than the version defined in mscorlib.
One option that was explored was emitting such a pointer as `mod_req(Func<int>) void*`. This doesn't
work though as a `mod_req` cannot bind to a `TypeSpec` and hence cannot target generic instantiations.
### No names altogether
Given that a `func*` can be used without names why even allow names at all? The underlying CLI primitive doesn't have
names hence the use of names is purely a C# invention. That ends up being a leaky abstraction in some cases (like
not allowing overloads when `func*` differ by only names).

At the same time, `func*` look and feel so much like `delegate` types, not allowing them to be named would be seen
as an enormous gap by customers. The leaky abstraction is worth the trade-offs here.
### Requiring names always
Given that names are allowed for `func*` why not just require them always? Given that `func*` is an existing CLI
type there are uses of it in the ecosystem today. None of those uses will have the metadata serialization format
chosen by the C# compiler. This means the feature would be using CLI function pointers but not interoperating with any
existing usage.
## Future Considerations
### static local functions
This refers to [the proposal](https://github.com/dotnet/csharplang/issues/1565) to allow the
`static` modifier on local functions. Such a function would be guaranteed to be emitted as
`static` and with the exact signature specified in source code. Such a function should be a valid
argument to `&` as it contains none of the problems local functions have today.
*** static delegates https://github.com/dotnet/csharplang/blob/master/proposals/static-delegates.md
*** PR feedback needs to be gone through.