* Fix some tracing issues.
- Add endpoints for `startUpdate` and `postEngineEventsBatch` so that
spans for these invocations have proper names
- Inject a tracing span when walking a plan so that resource operations
are properly parented
- When handling gRPC calls, inject a tracing span into the call's
metadata if no span is already present so that resource monitor and
engine spans are properly parented
- Do not trace client gRPC invocations of the empty method so that these
calls (which are used to determine server availability) do not muddy
the trace. Note that I tried parenting these spans appropriately, but
doing so broke the trace entirely.
With these changes, the only unparented span in a typical Pulumi
invocation is a single call to `getUser`. This span is unparented
because that call does not have a context available. Plumbing a context
into that particular call is surprisingly tricky, as it is often called
by other context-less functions.
* Make tracing support more flexible.
- Add support for writing trace data to a local file using Appdash
- Add support for viewing Appdash traces via the CLI
This commit adds CLI support for resource providers to provide partial
state upon failure. For resource providers that model resource
operations across multiple API calls, the Provider RPC interface can now
accomodate saving bags of state for resource operations that failed.
This is a common pattern for Terraform-backed providers that try to do
post-creation steps on resource as part of Create or Update resource
operations.
* Improve the error message arising from missing required configs for
resource providers
If the resource provider that we are speaking to is new enough, it will send
across a list of keys and their descriptions alongside an error
indicating that the provider we are configuring is missing required
config. This commit packages up the list of missing keys into an error
that can be presented nicely to the user.
* Code review feedback: renaming simplification and correcting errors in comments
* Send structured errors across RPC boundaries
This brings us closer to gRPC best practices where we send structured
errors with error codes across RPC endpoints. The new "rpcerrors"
package can wrap errors from RPC endpoints, so RPC servers can attach
some additional context as to why a request failed.
* Code review feedback:
1. Rename rpcerrors -> rpcerror, better package name
2. Rename RPCError -> Error, RPCErrorCause -> ErrorCause, names
suggested by gometalinter to improve their package-qualified names
3. Fix import organization in rpcerror.go
Adds OpenTracing in the Pulumi engine and plugin + langhost subprocesses.
We currently create a single root span for any `Enging.plan` operation - which is a single `preview`, `update`, `destroy`, etc.
The only sub-spans we currently create are at gRPC boundaries, both on the client and server sides and on both the langhost and provider plugin interfaces.
We could extend this to include spans for any other semantically meaningful sections of compute inside the engine, though initial examples show we get pretty good granularity of coverage by focusing on the gRPC boundaries.
In the future, this should be easily extensible to HTTP boundaries and to track other bulky I/O like datastore read/writes once we hook up to the PPC and Pulumi Cloud.
We expose a `--trace <endpoint>` option to enable tracing on the CLI, which we will aim to thread through to subprocesses.
We currently support sending tracing data to a Zipkin-compatible endpoint. This has been validated with both Zipkin and Jaeger UIs.
We do not yet have any tracing inside the TypeScript side of the JS langhost RPC interface. There is not yet automatic gRPC OpenTracing instrumentation (though it looks like it's in progress now) - so we would need to manually create meaningful spans on that side of the interface.
Trapping these signals hijacks the usual termination behavior for any
program that happens to link in the engine and perform an operation
that starts a gRPC server. These servers already provide a cancellation
mechanism via a `cancel` channel parameter; if the using program wants
to gracefully terminate these servers on some signal, it is responsible
for providing that behavior.
This also fixes a leak in which the goroutine responsible for waiting on
a server's signals and cancellation channel would never exit.
Instead of binding on 0.0.0.0 (which will listen on every interface)
let's only listen on localhost. On windows, this both makes the
connection Just Work and also prevents the Windows Firewall from
blocking the listen (and displaying UI saying it has blocked an
application and asking if the user should allow it)
This continues the previous commit and establishes the interpreter
context so that we can use the new host interface. In summary:
* Instead of using the NullSource for destructions -- which
doesn't hook up an interpreter and so any reads of configuration
variables will fail -- we will enlighten the EvalSource to know
how to orchestrate destruction interpretation. The primary
difference is that we don't actually run the code, but *we do*
perform all of the necessary configuration and variable init.
* Associate the active interpreter with the plugin context as
we are executing, so that the host object can actually read the
state from the heap as requested to do so by attached plugins.
* Rename anything "engine" related to use the term "host"; this
avoids introducing unnecesarily new terminology.
* Add a new pkg/resource/provider/ package where we can begin
consolidating helper functionality for resource providers.
Right now, this includes a wrapper interface atop the gRPC
machinery necessary to contact the host, in addition to a
Main function that hides some boilerplate entrypoint code.
* Add a rpcutil.IsBenignCloseErr routine to let us ignore
"benign" gRPC errors that are knowingly returned at shutdown.
This commit completes pulumi/lumi#117.
This change adds an engine gRPC interface, and associated implementation,
so that plugins may do interesting things that require "phoning home".
Previously, the engine would fire up plugins and talk to them directly,
but there was no way for a plugin to ask the engine to do anything.
The motivation here is so that plugins can read evaluator state, such
as config information, but this change also allows richer logging
functionality than previously possible. We will still auto-log any
stdout/stderr writes; however, explicit errors, warnings, informational,
and even debug messages may be written over the Log API.