Merge pull request #2165 from jaredpar/fix-funcptr

Function pointer updates
Jared Parsons 2019-01-23 07:59:43 -08:00 committed by GitHub
commit b897a4b8c2


@ -11,20 +11,20 @@ potential implementation of the feature):
https://github.com/dotnet/csharplang/issues/191
This is an alternate design propsoal to [compiler intrinsics](https://github.com/dotnet/csharplang/blob/master/proposals/intrinsics.md)
This is an alternate design proposal to [compiler intrinsics]
(https://github.com/dotnet/csharplang/blob/master/proposals/intrinsics.md)
## Detailed Design
### funcptr
The language will allow for the declaration of function pointers using the `funcptr` contextual keyword. The
declaration and usage of function pointers closely resemble that of `delegate`:
### Function pointers
The language will allow for the declaration of function pointers using the `func*` syntax. The full syntax is described
in detail in the next section but it is meant to resemble the syntax used by `delegate` declarations.
``` csharp
unsafe class Example {
delegate void DAction(int a);
funcptr void FAction(int a);
void Example(DAction d, FAction f) {
void Example(DAction d, func* void(int) f) {
d(42);
f(42);
}
@ -32,90 +32,125 @@ unsafe class Example {
```
These types are represented using the function pointer type as outlined in ECMA-335. This means invocation
of a `funcptr` will use `calli` where invocation of a `delegate` will use `callvirt` on the `Invoke` method.
of a `func*` will use `calli` where invocation of a `delegate` will use `callvirt` on the `Invoke` method.
Syntactically though invocation is identical for both constructs.
The `calli` instruction requires the calling convention be specified as a part of the invocation. The default
for `funcptr` will be managed. Alternate forms can be specified by adding the appropriate modifier after the
`funcptr` keyword: `cdecl`, `fastcall`, `stdcall`, `thiscall` or `winapi`. Example:
The ECMA-335 definition of method pointers includes the calling convention as part of the type signature (section 7.1).
The default calling convention will be `managed`. Alternate forms can be specified by adding the appropriate modifier
after the `func*` syntax: `cdecl`, `fastcall`, `stdcall`, `thiscall` or `winapi`. Example:
``` csharp
// This method will be invoked using the cdecl calling convention
funcptr cdecl int F1(int value);
func* cdecl int(int value);
// This method will be invoked using the stdcall calling convention
funcptr stdcall int F2(int value);
func* stdcall int(int value);
```
Conversions between `funcptr` types is done based on their signature, not name. When two `funcptr` declarations
have the same signature they have an identity conversion no matter what the name is.
Conversions between `func*` types are done based on their signature, including the calling convention.
``` csharp
unsafe class Example {
funcptr int Func1(int left, int right);
funcptr int Func2(int x);
funcptr int Func3(int x, int y);
funcptr cdecl int Func4(int x, int y);
void Conversions() {
Func1 p1 = ...;
Func2 p2 = ...;
func* int(int, int) p1 = ...;
func* managed int(int, int) p2 = ...;
func* cdecl int(int, int) p3 = ...;
Func3 p3 = p1; // okay Func1 and Func3 have compatible signatures
Console.WriteLine(p3 == p1); // True
Func3 p4 = p2; // error: Func3 and Func2 have incompatible signatures (parameter)
Func4 p5 = p1; // error: Func4 and Func1 have incompatible signatures (calling convention)
p1 = p2; // okay: p1 and p2 have compatible signatures
Console.WriteLine(p2 == p1); // True
p2 = p3; // error: calling conventions are incompatible
}
}
```
In addition to declaring a named `funcptr` type, as you declare a `delegate`, it is possible to use an unnamed
`funcptr` type without first declaring it. This type can be used anywhere a type declaration would occur:
``` csharp
unsafe struct Example {
funcptr int (int) Field;
unsafe void UnnamedExample(funcptr int(int) ptr) {
int x = ptr(42);
Field = ptr;
...
}
}
```
A `funcptr` type is a pointer type which means it has all of the capabilities and restrictions of a standard pointer
A `func*` type is a pointer type which means it has all of the capabilities and restrictions of a standard pointer
type:
- Only valid in an `unsafe` context.
- Methods which contain a `funcptr` parameter or return type can only be called from an `unsafe` context.
- Methods which contain a `func*` parameter or return type can only be called from an `unsafe` context.
- Cannot be converted to `object`.
- Cannot be used an a generic argument.
- Can implicitly convert `funcptr` to `void*`.
- Can explicitly convert from `void*` to `funcptr`.
- Cannot be used as a generic argument.
- Can implicitly convert `func*` to `void*`.
- Can explicitly convert from `void*` to `func*`.
Restrictions:
- Cannot overload when the only difference in parameter types is the name of the function pointer.
- Custom attributes cannot be applied to a `funcptr` or any of its elements.
- A `funcptr` parameter cannot be marked as `params`
- A `funcptr` type has all of the restrictions of a normal pointer type.
- Custom attributes cannot be applied to a `func*` or any of its elements.
- A `func*` parameter cannot be marked as `params`
- A `func*` type has all of the restrictions of a normal pointer type.
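The pointer-like capabilities and restrictions listed above can be sketched as follows. This is a minimal illustration written in the proposed `func*` syntax, so it is not expected to compile with any shipping compiler. The `PointerRules` class and `Square` method are hypothetical names introduced only for this sketch.

``` csharp
unsafe class PointerRules {
    // Hypothetical target method; it is static because address-of only targets static methods.
    static int Square(int x) => x * x;

    void Use() {
        // A func* is only valid in an unsafe context.
        func* int(int) f = &Square;

        // Implicit conversion from func* to void*.
        void* raw = f;

        // Explicit conversion from void* back to func*.
        func* int(int) g = (func* int(int))raw;

        // object o = f;               // error: cannot be converted to object
        // List<func* int(int)> list;  // error: cannot be used as a generic argument

        // Invocation uses the same syntax as a delegate invocation.
        int result = g(4);
    }
}
```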
### Function pointer syntax
The full function pointer syntax is represented by the following grammar:
```
funcptr_type =
'func' '*' [calling_convention] type method_arglist |
'(' funcptr_type ')' ;
calling_convention =
'managed' |
'unmanaged' |
'cdecl' |
'winapi' |
'fastcall' |
'stdcall' |
'thiscall' ;
```
The `unmanaged` calling convention represents the default calling convention for native code on the current platform.
When there is a nested function pointer, that is, a function pointer which takes or returns a function pointer, parens can
optionally be used to disambiguate the signature. They are not required, and the resulting types are equivalent.
``` csharp
delegate int Func1(string s);
delegate Func1 Func2(Func1 f);
// Function pointer equivalent without parens or calling convention
func* int(string);
func* func* int(string) int(func* int(string));
// Function pointer equivalent without parens and with calling convention
func* managed int(string);
func* managed func* managed int(string) int(func* managed int(string));
// Function pointer equivalent with parens and without calling convention
func* int(string);
func* (func* int(string)) int((func* int(string)));
// Function pointer equivalent with parens and calling convention
func* managed int(string);
func* managed (func* managed int(string)) int((func* managed int(string)));
```
When the calling convention is omitted from the syntax then `managed` will be used as the calling convention. That means
all of the forms of `Func1` and `Func2` defined above are equivalent signatures.
The calling convention cannot be omitted when the return type of the function pointer has the same name as a calling
convention. In that case the parser would process the return type as a calling convention instead of a type. To resolve
this the developer must specify both the calling convention and the return type.
``` csharp
class cdecl { }
// Function pointer which has a cdecl calling convention, a cdecl return type and takes a single
// parameter of type cdecl
func* cdecl cdecl(cdecl);
```
### Allow address-of to target methods
Method groups will now be allowed as arguments to an address-of expression. The type of such an
expression will be an unnamed `funcptr` which has the equivalent signature of the target method and a managed
expression will be a `func*` which has the equivalent signature of the target method and a managed
calling convention:
``` csharp
unsafe class Util {
public static void Log() { }
funcptr void Action();
funcptr int Func();
void Use() {
funcptr void() ptr1 = &Util.Log;
Action ptr2 = &Util.Log;
func* void() ptr1 = &Util.Log;
// Error: type "funcptr void()" not compatible with "funcptr int()";
Func ptr3 = &Util.Log;
// Error: type "func* void()" not compatible with "func* int()";
func* int() ptr2 = &Util.Log;
// Okay. Conversion to void* is always allowed.
void* v = &Util.Log;
@ -123,10 +158,10 @@ unsafe class Util {
}
```
The conversion of an address-of method group to `funcptr` has roughly the same process as method group to `delegate`
The conversion of an address-of method group to `func*` has roughly the same process as method group to `delegate`
conversion. There are two additional restrictions to the existing process:
- Only members of the method group that are marked as `static` will be considered.
- Only a `funcptr` with a managed calling convention can be the target of such a conversion.
- Only a `func*` with a managed calling convention can be the target of such a conversion.
This means developers can depend on overload resolution rules to work in conjunction with the
address-of operator:
@ -137,12 +172,9 @@ unsafe class Util {
public static void Log(string p1) { }
public static void Log(int i) { }
funcptr void Action1();
funcptr void Action2();
void Use() {
Action1 a1 = &Log; // Log()
Action2 a2 = &Log; // Log(int i)
func* void() a1 = &Log; // Log()
func* void(int) a2 = &Log; // Log(int i)
// Error: ambiguous conversion from method group Log to "void*"
void* v = &Log;
@ -160,25 +192,83 @@ exactly what signature they are emitted with.
### Better function member
The better function member specification will be changed to include the following line:
> A `funcptr` is more specific than `void*`
> A `func*` is more specific than `void*`
This means that it is possible to overload on `void*` and a `funcptr` and still sensibly use the address-of operator.
This means that it is possible to overload on `void*` and a `func*` and still sensibly use the address-of operator.
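A minimal sketch of the overload scenario this rule enables, written in the proposed `func*` syntax and therefore not compilable today. The `OverloadExample` class and the `Register` and `Log` methods are hypothetical names used only for illustration.

``` csharp
unsafe class OverloadExample {
    static void Log() { }

    // Two overloads that differ only by void* versus func*.
    static void Register(void* p) { }
    static void Register(func* void() p) { }

    void Use() {
        // &Log converts to both parameter types. Under the proposed rule
        // func* void() is more specific, so the second overload is chosen.
        Register(&Log);

        // Converting to void* first leaves only the void* overload applicable.
        void* raw = &Log;
        Register(raw);
    }
}
```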
## Open Issuess
## Open Issues
- Round tripping function pointer names, as well as parameter names, through metadata will require additional work. The
function pointer type itself is natively supported by CLI but that does not include any names. This is not anticipated
to be a big issue, just needs design work.
- The address-of operator is limited to `static` methods in this proposal. It can be made to work with instance methods
but the behavior can be confusing to developers. The `this` type becomes an explicit first parameter on the `funcptr`
type. This means the behavior and usage would differ significantly from `delegate`. This extra confusion was the main
reason it was not included in the design.
### NativeCallback Attribute
This is an attribute used by the CLR to avoid the managed to native prologue when invoking. Methods marked by this
attribute are only callable from native code, not managed (can't call methods, create a delegate, etc.). The attribute
is not special to mscorlib; the runtime will treat any attribute with this name with the same semantics.
It's possible for the runtime and language to work together to fully support this. The language could choose to treat
address-of `static` members with a `NativeCallback` attribute as a `func*` with the specified calling convention.
``` csharp
unsafe class NativeCallbackExample {
[NativeCallback(CallingConvention.CDecl)]
static extern bool CloseHandle(IntPtr p);
void Use() {
func* bool(IntPtr) p1 = &CloseHandle; // Error: Invalid calling convention
func* cdecl bool(IntPtr) p2 = &CloseHandle; // Okay
}
}
```
Additionally the language would likely also want to:
- Flag any managed calls to a method tagged with `NativeCallback` as an error. Given the function can't be invoked from
managed code the compiler should prevent developers from attempting such an invocation.
- Prevent method group conversions to `delegate` when the method is tagged with `NativeCallback`.
This is not necessary to support `NativeCallback` though. The compiler can support the `NativeCallback` attribute as is
using the existing syntax. The runtime would simply need to cast to `void*` before casting to the correct `func*`
signature. That would be no worse than the support today.
``` csharp
void* v = &CloseHandle;
func* cdecl bool(IntPtr) f1 = (func* cdecl bool(IntPtr))v;
```
## Considerations
### Allow instance methods
The proposal could be extended to support instance methods by taking advantage of the `EXPLICITTHIS` CLI calling
convention (named `instance` in C# code). This form of CLI function pointers puts the `this` parameter as an explicit
first parameter of the function pointer syntax.
``` csharp
unsafe class Instance {
void Use() {
func* instance string(Instance) f = &ToString;
f(this);
}
}
```
This is sound but adds some complication to the proposal. Particularly because function pointers which differed by the
calling convention `instance` and `managed` would be incompatible even though both cases are used to invoke managed
methods with the same C# signature. Also in every case considered where this would be valuable to have there was a
simple workaround: use a `static` local function.
``` csharp
unsafe class Instance {
void Use() {
static string toString(Instance i) => i.ToString();
func* string(Instance) f = &toString;
f(this);
}
}
```
### Don't require unsafe at declaration
Instead of requiring `unsafe` at every use of a `funcptr`, only require it at the point where a method group is
converted to a `funcptr`. This is where the core safety issues come into play (knowing that the containing assembly
Instead of requiring `unsafe` at every use of a `func*`, only require it at the point where a method group is
converted to a `func*`. This is where the core safety issues come into play (knowing that the containing assembly
cannot be unloaded while the value is alive). Requiring `unsafe` on the other locations can be seen as excessive.
This is how the design was originally intended. But the resulting language rules felt very awkward. It's impossible to
@ -186,20 +276,20 @@ hide the fact that this is a pointer value and it kept peeking through even with
the conversion to `object` can't be allowed, it can't be a member of a `class`, etc ... The C# design is to require
`unsafe` for all pointer uses and hence this design follows that.
Developers will still be capable of preventing a _safe_ wrapper on top of `funcptr` values the same way that they do
Developers will still be capable of presenting a _safe_ wrapper on top of `func*` values the same way that they do
for normal pointer types today. Consider:
``` csharp
unsafe struct Action {
funcptr void() _ptr;
func* void() _ptr;
Action(funcptr void() ptr) => _ptr = ptr;
Action(func* void() ptr) => _ptr = ptr;
public void Invoke() => _ptr();
}
```
### Using delegates
Instead of using a new syntax element, `funcptr`, simply use exisiting `delegate` types with a `*` following the type:
Instead of using a new syntax element, `func*`, simply use existing `delegate` types with a `*` following the type:
``` csharp
Func<object, object, bool>* ptr = &object.ReferenceEquals;
@ -221,25 +311,70 @@ must be different than the version defined in mscorlib.
One option that was explored was emitting such a pointer as `mod_req(Func<int>) void*`. This doesn't
work though as a `mod_req` cannot bind to a `TypeSpec` and hence cannot target generic instantiations.
### No names altogether
Given that a `funcptr` can be used without names why even allow names at all? The underlying CLI primitive doesn't have
names hence the use of names is purely a C# invention. That ends up being a leaky abstraction in some cases (like
not allowing overloads when `funcptr` differ by only names).
### Named function pointers
The function pointer syntax can be cumbersome, particularly in complex cases like nested function pointers. Rather than
have developers type out the signature every time, the language could allow for named declarations of function pointers
as is done with `delegate`.
At the same time, `funcptr` look and feel so much like `delegate` types, not allowing them to be named would be seen
as an enormous gap by customers. The leaky abstraction is wort the trade offs here.
``` csharp
func* void Action();
### Requiring names always
Given that names are allowed for `funcptr` why not just require them always? Given that `funcptr` is an existing CLI
type there are uses of it in the ecosystem today. None of those uses will have the metadata serialization format
chosen by the C# compiler. This means the feature would be using CLI function pointers but not interopting with any
existing usage.
unsafe class NamedExample {
void M(Action a) {
a();
}
}
```
Part of the problem here is the underlying CLI primitive doesn't have names hence this would be purely a C# invention
and require a bit of metadata work to enable. That is doable but is a significant amount of work. It essentially requires
C# to have a companion to the type def table purely for these names.
Also when the arguments for named function pointers were examined we found they could apply equally well to a number of
other scenarios. For example it would be just as convenient to declare named tuples to reduce the need to type out
the full signature in all cases.
``` csharp
(int x, int y) Point;
class NamedTupleExample {
void M(Point p) {
Console.WriteLine(p.x);
}
}
```
After discussion we decided to not allow named declaration of `func*` types. If we find there is significant need for
this based on customer usage feedback then we will investigate a naming solution that works for function pointers,
tuples, generics, etc ... This is likely to be similar in form to other suggestions like full `typedef` support in
the language.
## Future Considerations
### static local functions
This refers to [the proposal](https://github.com/dotnet/csharplang/issues/1565) to allow the
`static` modifier on local functions. Such a function would be guaranteed to be emitted as
`static` and with the exact signature specified in source code. Such a function should be a valid
argument to `&` as it contains none of the problems local functions have today.
### static delegates
This refers to [the proposal](https://github.com/dotnet/csharplang/issues/302) to allow for the declaration of
`delegate` types which can only refer to `static` members. The advantage being that such `delegate` instances can be
allocation free and better in performance sensitive scenarios.
If the function pointer feature is implemented the `static delegate` proposal will likely be closed out. The proposed
advantage of that feature is the allocation free nature. However recent investigations have found that this is not possible
to achieve due to assembly unloading. There must be a strong handle from the `static delegate` to the method it refers
to in order to keep the assembly from being unloaded out from under it.
To maintain this handle, every `static delegate` instance would be required to allocate a new handle which runs counter to the goals
of the proposal. There were some designs where the allocation could be amortized to a single allocation per call-site
but that was a bit complex and didn't seem worth the trade off.
That means developers essentially have to decide between the following trade offs:
1. Safety in the face of assembly unloading: this requires allocations and hence `delegate` is already a sufficient
option.
1. No safety in the face of assembly unloading: use a `func*`. This can be wrapped in a `struct` to allow usage outside
an `unsafe` context in the rest of the code.