Merge pull request #2165 from jaredpar/fix-funcptr

Function pointer updates
2019-01-23 07:59:43 -08:00 · 2019-01-23 07:59:43 -08:00 · b897a4b8c2
parent 24f7fe1f22 7f75c8c42f
commit b897a4b8c2
1 changed files with 225 additions and 90 deletions
--- a/proposals/function-pointers.md
+++ b/proposals/function-pointers.md
@ -11,20 +11,20 @@ potential implementation of the feature):

 https://github.com/dotnet/csharplang/issues/191

-This is an alternate design propsoal to [compiler intrinsics](https://github.com/dotnet/csharplang/blob/master/proposals/intrinsics.md)
+This is an alternate design propsoal to [compiler intrinsics]
+(https://github.com/dotnet/csharplang/blob/master/proposals/intrinsics.md)

 ## Detailed Design 

-### funcptr
-The language will allow for the declaration of function pointers using the `funcptr` contextual keyword. The 
-declaration and usage of function pointers closely resemble that of `delegate`:
+### Function pointers
+The language will allow for the declaration of function pointers using the `func*` syntax. The full syntax is described
+in detail in the next section but it is meant to resemble the syntax used by `delegate` declarations.

 ``` csharp
 unsafe class Example {
    delegate void DAction(int a);
-    funcptr void FAction(int a);

-    void Example(DAction d, FAction f) {
+    void Example(DAction d, func* void(int) f) {
        d(42);
        f(42);
    }
@ -32,90 +32,125 @@ unsafe class Example {
 ```

 These types are represented using the function pointer type as outlined in ECMA-335. This means invocation
-of a `funcptr` will use `calli` where invocation of a `delegate` will use `callvirt` on the `Invoke` method.
+of a `func*` will use `calli` where invocation of a `delegate` will use `callvirt` on the `Invoke` method.
 Syntactically though invocation is identical for both constructs.

-The `calli` instruction requires the calling convention be specified as a part of the invocation. The default 
-for `funcptr` will be managed. Alternate forms can be specified by adding the appropriate modifier after the 
-`funcptr` keyword: `cdecl`, `fastcall`, `stdcall`, `thiscall` or `winapi`. Example:
+The ECMA-335 definition of method pointers includes the calling convention as part of the type signature (section 7.1).
+The default calling convention will be `managed `. Alternate forms can be specified by adding the appropriate modifier 
+after the `func*` syntax: `cdecl`, `fastcall`, `stdcall`, `thiscall` or `winapi`. Example:

 ``` csharp
 // This method will be invoked using the cdecl calling convention
-funcptr cdecl int F1(int value);
+func* cdecl int(int value);

 // This method will be invoked using the stdcall calling convention
-funcptr stdcall int F2(int value);
+func* stdcall int(int value);
 ```

-Conversions between `funcptr` types is done based on their signature, not name. When two `funcptr` declarations 
-have the same signature they have an identity conversion no matter what the name is. 
+Conversions between `func*` types is done based on their signature including the calling convention. 

 ``` csharp
 unsafe class Example {
-    funcptr int Func1(int left, int right);
-    funcptr int Func2(int x);
-    funcptr int Func3(int x, int y);
-    funcptr cdecl int Func4(int x, int y);
-
    void Conversions() {
-        Func1 p1 = ...;
-        Func2 p2 = ...;
+        func* int(int, int) p1 = ...;
+        func* managed int(int, int) p2 = ...;
+        func* cdecl int(int, int) p3 = ...;

-        Func3 p3 = p1; // okay Func1 and Func3 have compatible signatures
-        Console.WriteLine(p3 == p1); // True
-        Func3 p4 = p2; // error: Func3 and Func2 have incompatible signatures (parameter)
-        Func4 p5 = p1; // error: Func4 and Func1 have incompatible signatures (calling convention)
+        p1 = p2; // okay Func1 and Func3 have compatible signatures
+        Console.WriteLine(p2 == p1); // True
+        p2 = p2; // error: calling conventions are incompatible
    }
 }
 ```

-In addition to declaring a named `funcptr` type, as you declare a `delegate`, it is possible to use an unnamed 
-`funcptr` type without first declaring it. This type can be used anywhere a type declaration would occur:
-
-``` csharp
-unsafe struct Example {
-    funcptr int (int) Field;
-    unsafe void UnnamedExample(funcptr int(int) ptr) {
-        int x = ptr(42);
-        Field = ptr;
-        ...
-    }
-}
-```
-
-A `funcptr` type is a pointer type which means it has all of the capabilities and restrictions of a standard pointer
+A `func*` type is a pointer type which means it has all of the capabilities and restrictions of a standard pointer
 type:
 - Only valid in an `unsafe` context.
- Methods which contain a `funcptr` parameter or return type can only be called from an `unsafe` context.
+- Methods which contain a `func*` parameter or return type can only be called from an `unsafe` context.
 - Cannot be converted to `object`.
- Cannot be used an a generic argument.
- Can implicitly convert `funcptr` to `void*`.
- Can explicitly convert from `void*` to `funcptr`.
+- Cannot be used as a generic argument.
+- Can implicitly convert `func*` to `void*`.
+- Can explicitly convert from `void*` to `func*`.

 Restrictions:
- Cannot overload when the only difference in parameter types is the name of the function pointer. 
- Custom attributes cannot be applied to a `funcptr` or any of its elements.
- A `funcptr` parameter cannot be marked as `params`
- A `funcptr` type has all of the restrictions of a normal pointer type.
+- Custom attributes cannot be applied to a `func*` or any of its elements.
+- A `func*` parameter cannot be marked as `params`
+- A `func*` type has all of the restrictions of a normal pointer type.
+
+### Function pointer syntax
+The full function pointer syntax is represented by the following grammar:
+
+``` 
+funcptr_type = 
+    'func' '*' [calling_convention] type method_arglist |
+    '(' funcptr_type ')' ;
+
+calling_convention = 
+    'managed' |
+    'unmanaged' |
+    'cdecl' |
+    'winapi' | 
+    'fastcall' | 
+    'stdcall' | 
+    'thiscall' ;
+```
+
+The `unmanaged` calling convention represents the default calling convention for native code on the current platform.
+
+When there is a nested function pointer, a function pointer which has or returns a function pointer, parens can be 
+opitionally used to disambiguate the signature. Though they are not required and the resulting types are equivalent.
+
+``` csharp
+delegate int Func1(string s);
+delegate Func1 Func2(Func1 f);
+
+// Function pointer equivalent without parens or calling convention
+func* int(string);
+func* func* int(string) int(func* int(string));
+
+// Function pointer equivalent without parens and with calling convention
+func* managed int(string);
+func* managed func* managed int(string) int(func* managed int(string));
+
+// Function pointer equivalent with parens and without calling convention
+func* int(string);
+func* (func* int(string)) int((func* int(string));
+
+// Function pointer equivalent of with parens and calling convention
+func* int(string)
+func* managed (func* managed int(string)) int((func* managed int(string));
+```
+
+When the calling convention is omitted from the syntax then `managed` will be used as the calling convention. That means
+all of the forms of `Func1` and `Func2` defined above are equivalent signatures.
+
+The calling convention cannot be omitted when the return type of the function pointer has the same name as a calling 
+convention. Inthat case the parser would process the return type as a calling convention instead of a type. To resolve
+this the developer must specify both the calling convention and the return type. 
+
+``` csharp
+class cdecl { }
+
+// Function pointer which has a cdecl calling convention, a cdecl return type and takes a single 
+// paramater of type cdecl;
+func* cdecl cdecl(cdecl);
+```

 ### Allow addresss-of to target methods

 Method groups will now be allowed as arguments to an address-of expression. The type of such an 
-expression will be an unnamed `funcptr` which has the equivalent signature of the target method and a managed 
+expression will be a `func*` which has the equivalent signature of the target method and a managed 
 calling convention:

 ``` csharp
 unsafe class Util { 
    public static void Log() { } 

-    funcptr void Action();
-    funcptr int Func();
    void Use() {
-        funcptr void() ptr1 = &Util.Log; 
-        Action ptr2 = &Util.Log;
+        func* void() ptr1 = &Util.Log;

-        // Error: type "funcptr void()" not compatible with "funcptr int()";
-        Func ptr3 = &Util.Log; 
+        // Error: type "func* void()" not compatible with "func int()";
+        func* int() ptr2 = &Util.Log; 

        // Okay. Conversion to void* is always allowed.
        void* v = &Util.Log;
@ -123,10 +158,10 @@ unsafe class Util {
 }
 ```

-The conversion of an address-of method group to `funcptr` has roughly the same process as method group to `delegate`  
+The conversion of an address-of method group to `func*` has roughly the same process as method group to `delegate`  
 conversion. There are two additional restrictions to the existing process:
 - Only members of the method group that are marked as `static` will be considered. 
- Only a `funcptr` with a managed calling convention can be the target of such a conversion.
+- Only a `func*` with a managed calling convention can be the target of such a conversion.

 This means developers can depend on overload resolution rules to work in conjunction with the 
 address-of operator:
@ -137,12 +172,9 @@ unsafe class Util {
    public static void Log(string p1) { } 
    public static void Log(int i) { };

-    funcptr void Action1();
-    funcptr void Action2();
-
    void Use() {
-        Action1 a1 = &Log; // Log()
-        Action2 a2 = &Log; // Log(int i)
+        func* void() a1 = &Log; // Log()
+        func* void(int) a2 = &Log; // Log(int i)

        // Error: ambiguous conversion from method group Log to "void*"
        void* v = &Log; 
@ -160,25 +192,83 @@ exactly what signature they are emitted with.
 ### Better function member
 The better function member specification will be changed to include the following line:

-> A `funcptr` is more specific than `void*`
+> A `func*` is more specific than `void*`

-This means that it is possible to overload on `void*` and a `funcptr` and still sensibly use the address-of operator.
+This means that it is possible to overload on `void*` and a `func*` and still sensibly use the address-of operator.

-## Open Issuess
+## Open Issues

- Round tripping function pointer names, as well as parameter names, through metadata will require additional work. The
-function pointer type itself is natively supported by CLI but that does not include any names. This is not anticipated
-to be a big issue, just needs design work.
- The address-of operator is limited to `static` methods in this proposal. It can be made to work with instance methods 
-but the behavior can be confusing to developers. The `this` type becomes an explicit first parameter on the `funcptr` 
-type. This means the behavior and usage would differ significantly from `delegate`. This extra confusion was the main 
-reason it was not included in the design.
+### NativeCallback Attribute
+This is an attribute used by the CLR to avoid the managed to native prologue when invoking. Methods marked by this 
+attribute are only callable from native code, not managed (can’t call methods, create a delegate, etc …). The attribute
+is not special to mscorlib; the runtime will treat any attribute with this name with the same semantics. 
+
+It's possible for the runtime and language to work together to fully support this. The language could choose to treat
+address-of `static` members with a `NativeCallback` attribute as a `func*` with the specified calling convention.
+
+``` csharp
+unsafe class NativeCallbackExample {
+    [NativeCallback(CallingConvention.CDecl)]
+    static extern bool CloseHandle(IntPtr p);
+
+    void Use() {
+        func* bool(IntPtr) p1 = &CloseHandle; // Error: Invalid calling convention
+
+        func* cdecl bool(IntPtr) p2 = &CloseHandle; // Okay
+    }
+}
+
+```
+
+Additionally the language would likely also want to: 
+
+- Flag any managed calls to a method tagged with `NativeCallback` as an error. Given the function can't be invoked from
+managed code the compiler should prevent developers from attempting such an invocation.
+- Prevent method group conversions to `delegate` when the method is tagged with `NativeCallback`. 
+
+This is not necessary to support `NativeCallback` though. The compiler can support the `NativeCallback` attribute as is
+using the existing syntax. The runtime would simply need to cast to `void*` before casting to the corrcect `func*` 
+signature. That would be no worse than the support today.
+
+``` csharp
+void* v = &CloseHandle;
+func* cdecl bool(IntPtr) f1 = (func* cdecl bool(IntPtr))v;
+```

 ## Considerations

+### Allow instance methods
+The proposal could be extended to support instance methods by taking advantage of the `EXPLICITTHIS` CLI calling 
+convention (named `instance` in C# code). This form of CLI function pointers puts the `this` parameter as an explicit
+first parameter of the function pointer syntax. 
+
+``` csharp
+unsafe class Instance {
+    void Use() {
+        func* instance string(Instance) f = &ToString;
+        f(this);
+    }
+}
+```
+
+This is sound but adds some complication to the proposal. Particularly because function pointers which differed by the
+calling convention `instance` and `managed` would be incompatbile even though both cases are used to invoke managed 
+methods with the same C# signature. Also in every case considered where this would be valuable to have there was a 
+simple work around: use a `static` local function.
+
+``` csharp
+unsafe class Instance {
+    void Use() {
+        static string toString(Instance i) = i.ToString();
+        func* string(Instance) f = &toString;
+        f(this);
+    }
+}
+```
+
 ### Don't require unsafe at declaration
-Instead of requiring `unsafe` at every use of a `funcptr`, only require it at the point where a method group is
-converted to a `funcptr`. This is where the core safety issues come into play (knowing that the containing assembly
+Instead of requiring `unsafe` at every use of a `func*`, only require it at the point where a method group is
+converted to a `func*`. This is where the core safety issues come into play (knowing that the containing assembly
 cannot be unloaded while the value is alive). Requiring `unsafe` on the other locations can be seen as excessive.

 This is how the design was originally intended. But the resulting language rules felt very awkward. It's impossible to
@ -186,20 +276,20 @@ hide the fact that this is a pointer value and it kept peeking through even with
 the conversion to `object` can't be allowed, it can't be a member of a `class`, etc ... The C# design is to require
 `unsafe` for all pointer uses and hence this design follows that.

-Developers will still be capable of preventing a _safe_ wrapper on top of `funcptr` values the same way that they do 
+Developers will still be capable of preventing a _safe_ wrapper on top of `func*` values the same way that they do 
 for normal pointer types today. Consider:

 ``` csharp
 unsafe struct Action {
-    funcptr void() _ptr;
+    func* void() _ptr;

-    Action(funcptr void() ptr) => _ptr = ptr;
+    Action(func* void() ptr) => _ptr = ptr;
    public void Invoke() => _ptr();
 }
 ```

 ### Using delegates
-Instead of using a new syntax element, `funcptr`, simply use exisiting `delegate` types with a `*` following the type:
+Instead of using a new syntax element, `func*`, simply use exisiting `delegate` types with a `*` following the type:

 ``` csharp
 Func<object, object, bool>* ptr = &object.ReferenceEquals;
@ -221,25 +311,70 @@ must be different than the version defined in mscorlib.
 One option that was explored was emitting such a pointer as `mod_req(Func<int>) void*`. This doesn't 
 work though as a `mod_req` cannot bind to a `TypeSpec` and hence cannot target generic instantiations.

-### No names altogether
-Given that a `funcptr` can be used without names why even allow names at all? The underlying CLI primitive doesn't have
-names hence the use of names is purely a C# invention. That ends up being a leaky abstraction in some cases (like
-not allowing overloads when `funcptr` differ by only names). 
+### Named function pointers
+The function pointer syntax can be cumbersome, particlarly in complex cases like nested function pointers. Rather than
+have developers type out the signature every time the language could allow for named declarations of function pointers
+as is done with `delegate`. 

-At the same time, `funcptr` look and feel so much like `delegate` types, not allowing them to be named would be seen
-as an enormous gap by customers. The leaky abstraction is wort the trade offs here.
+``` csharp
+func* void Action();

-### Requiring names always
-Given that names are allowed for `funcptr` why not just require them always? Given that `funcptr` is an existing CLI
-type there are uses of it in the ecosystem today. None of those uses will have the metadata serialization format 
-chosen by the C# compiler. This means the feature would be using CLI function pointers but not interopting with any 
-existing usage.
+unsafe class NamedExample {
+    void M(Action a) {
+        a();
+    }
+}
+```
+
+Part of the problem here is the underlying CLI primitive doesn't have names hence this would be purely a C# invention 
+and require a bit of metadata work to enable. That is doable but is a significant about of work. It essentially requires
+C# to have a companion to the type def table purely for these names.
+
+Also when the arguments for named function pointers was examined we found they could apply equally well to a number of
+other scenarios. For example it would be just as convenient to declare named tuples to reduce the need to type out
+the full signature in all cases. 
+
+``` csharp
+(int x, int y) Point;
+
+class NamedTupleExample {
+    void M(Point p) {
+        Console.WriteLine(p.x);
+    }
+}
+```
+
+After discussion we decided to not allow named declaration of `func*` types. If we find there is significant need for
+this based on customer usage feedback then we will investigate a naming solution that works for function pointers, 
+tuples, generics, etc ... This is likely to be similar in form to other suggestions like full `typedef` support in
+the language.

 ## Future Considerations

 ### static local functions
-
 This refers to [the proposal](https://github.com/dotnet/csharplang/issues/1565) to allow the 
 `static` modifier on local functions. Such a function would be guaranteed to be emitted as 
 `static` and with the exact signature specified in source code. Such a function should be a valid
 argument to `&` as it contains none of the problems local functions have today
+
+### static delegates
+This refers to [the proposal](https://github.com/dotnet/csharplang/issues/302) to allow for the declaration of 
+`delegate` types which can only refer to `static` members. The advantage being that such `delegate` instances can be 
+allocation free and better in performance sensitive scenarios.
+
+If the function pointer feature is implemented the `static delegate` proposal will likely be closed out. The proposed
+advantage of that feature is the allocation free nature. However recent investigations have found that is not possible
+to achieve due to assembly unloading. There must be a strong handle from the `static delegate` to the method it refers
+to in order to keep the assembly from being unloaded out from under it.
+
+To maintain every `static delegate` instance would be required to allocate a new handle which runs counter to the goals
+of the proposal. There were some designs where the allocation could be amortized to a single allocation per call-site
+but that was a bit complex and didn't seem worth the trade off. 
+
+That means developers essentially have to decide between the following trade offs:
+
+1. Safety in the face of assembly unloading: this requires allocations and hence `delegate` is already a sufficient 
+option.
+1. No safety in face of assembly unloading: use a `func*`. This can be wrapped in a `struct` to allow usage outside 
+an `unsafe` context in the rest of the code. 
+