* experimental: separate language host from node
* Remove langhost details from the NodeJS SDK runtime
* Cleanup
* Work around an issue where Node sometimes loads the same module twice in two different contexts, resulting in two distinct module objects. Some additional cleanup.
* Add some tests
* Fix up the Windows script
* Fix up the install scripts and Windows build
* Code review feedback
* Code review feedback: error capitalization
As documented in issue #616, the inputs/defaults/outputs model we have
today has fundamental problems. The crux of the issue is that our
current design requires that defaults present in the old state of a
resource are applied to the new inputs for that resource.
Unfortunately, it is not possible for the engine to decide which
defaults remain applicable and which do not; only the provider has that
knowledge.
These changes take a more tactical approach to resolving this issue than
that originally proposed in #616 that avoids breaking compatibility with
existing checkpoints. Rather than treating the Pulumi inputs as the
provider input properties for a resource, these inputs are first
translated by `Check`. In order to accommodate provider defaults that
were chosen for the old resource but should not change for the new,
`Check` now takes the old provider inputs as well as the new Pulumi
inputs. Rather than the Pulumi inputs and provider defaults, the
provider inputs returned by `Check` are recorded in the checkpoint file.
Put simply, these changes remove defaults as a first-class concept
(except inasmuch as is required to retain the ability to read old
checkpoint files) and move the responsibilty for manging and
merging defaults into the provider that supplies them.
Fixes#616.
This change adds a new manifest section to the checkpoint files.
The existing time moves into it, and we add to it the version of
the Pulumi CLI that created it, along with the names, types, and
versions of all plugins used to generate the file. There is a
magic cookie that we also use during verification.
This is to help keep us sane when debugging problems "in the wild,"
and I'm sure we will add more to it over time (checksum, etc).
For example, after an up, you can now see this in `pulumi stack`:
```
Current stack is demo:
Last updated at 2017-12-01 13:48:49.815740523 -0800 PST
Pulumi version v0.8.3-79-g1ab99ad
Plugin pulumi-provider-aws [resource] version v0.8.3-22-g4363e77
Plugin pulumi-langhost-nodejs [language] version v0.8.3-79-g77bb6b6
Checkpoint file is /Users/joeduffy/dev/code/src/github.com/pulumi/pulumi-aws/.pulumi/stacks/webserver/demo.json
```
This addresses pulumi/pulumi#628.
This change switches from child lists to parent pointers, in the
way resource ancestries are represented. This cleans up a fair bit
of the old parenting logic, including all notion of ambient parent
scopes (and will notably address pulumi/pulumi#435).
This lets us show a more parent/child display in the output when
doing planning and updating. For instance, here is an update of
a lambda's text, which is logically part of a cloud timer:
* cloud:timer:Timer: (same)
[urn=urn:pulumi:malta::lm-cloud:☁️timer:Timer::lm-cts-malta-job-CleanSnapshots]
* cloud:function:Function: (same)
[urn=urn:pulumi:malta::lm-cloud:☁️function:Function::lm-cts-malta-job-CleanSnapshots]
* aws:serverless:Function: (same)
[urn=urn:pulumi:malta::lm-cloud::aws:serverless:Function::lm-cts-malta-job-CleanSnapshots]
~ aws:lambda/function:Function: (modify)
[id=lm-cts-malta-job-CleanSnapshots-fee4f3bf41280741]
[urn=urn:pulumi:malta::lm-cloud::aws:lambda/function:Function::lm-cts-malta-job-CleanSnapshots]
- code : archive(assets:2092f44) {
// etc etc etc
Note that we still get walls of text, but this will be actually
quite nice when combined with pulumi/pulumi#454.
I've also suppressed printing properties that didn't change during
updates when --detailed was not passed, and also suppressed empty
strings and zero-length arrays (since TF uses these as defaults in
many places and it just makes creation and deletion quite verbose).
Note that this is a far cry from everything we can possibly do
here as part of pulumi/pulumi#340 (and even pulumi/pulumi#417).
But it's a good start towards taming some of our output spew.
Adds support for top-level exports in the main script of a Pulumi Program to be captured as stack-level output properties.
This create a new `pulumi:pulumi:Stack` component as the root of the resource tree in all Pulumi programs. That resources has properties for each top-level export in the Node.js script.
Running `pulumi stack` will display the current value of these outputs.
Because the Pulumi.yaml file demarcates the boundary used when
uploading a program to the Pulumi.com service at the moment, we
have trouble when a Pulumi program uses "up and over" references.
For instance, our customer wants to build a Dockerfile located
in some relative path, such as `../../elsewhere/`.
To support this, we will allow the Pulumi.yaml file to live
somewhere other than the main Pulumi entrypoint. For example,
it can live at the root of the repo, while the Pulumi program
lives in, say, `infra/`:
Pulumi.yaml:
name: as-before
main: infra/
This fixespulumi/pulumi#575. Further work can be done here to
provide even more flexibility; see pulumi/pulumi#574.
This change formats log messages the same way that Node.js does
in its console.log/error APIs. This ensures, for example, that
errors have their stack printed if present, and switches over to
just printing the error directly rather than manually toStringing it.
Adds OpenTracing in the Pulumi engine and plugin + langhost subprocesses.
We currently create a single root span for any `Enging.plan` operation - which is a single `preview`, `update`, `destroy`, etc.
The only sub-spans we currently create are at gRPC boundaries, both on the client and server sides and on both the langhost and provider plugin interfaces.
We could extend this to include spans for any other semantically meaningful sections of compute inside the engine, though initial examples show we get pretty good granularity of coverage by focusing on the gRPC boundaries.
In the future, this should be easily extensible to HTTP boundaries and to track other bulky I/O like datastore read/writes once we hook up to the PPC and Pulumi Cloud.
We expose a `--trace <endpoint>` option to enable tracing on the CLI, which we will aim to thread through to subprocesses.
We currently support sending tracing data to a Zipkin-compatible endpoint. This has been validated with both Zipkin and Jaeger UIs.
We do not yet have any tracing inside the TypeScript side of the JS langhost RPC interface. There is not yet automatic gRPC OpenTracing instrumentation (though it looks like it's in progress now) - so we would need to manually create meaningful spans on that side of the interface.
This change remembers that we failed due to an uncaught exception,
and defers the process.exit(1) until we actually reach the process's
exit event. This ensures that we drain the message queue before
exiting, which ensures that outbound messages actually reach their
destination.
As part of fixing the exit bug recently, we accidentally made errors
lead to zero exit codes. As a result, the Pulumi CLI thought the
prgoram exited ordinarily, and proceeded to do its usual planning and
deployment, rather than terminating abruptly.
This is a byproduct of how Node's process.uncaughtException handler
works. It hijacks and replaces all usual error logic, including the
process.exit part. This change simply adds back the non-zero exit.
I also added a test (and fixed one other that began failing
afterwards), so that we can prevent regressions down the road.
The `nodejs` language support is implemented as two programs: one that
manages the initial connection to the engine and provides the language
serivce itself, and another that the language service invokes in order
to run a `nodejs` Pulumi program. The latter is responsible for running
the user's program and communicating its resource requests to the
engine. Currently, `run` effectively assumes that the user's program
will run synchronously from start to finish, and will disconnect from
the engine once the user's program has completed. This assumption breaks
if the user's program requires multiple turns of the event loop to
finish its root resource requests. For example, the following program
would fail to create its second resource because the engine will be
disconnected once it reaches its `await`:
```
(async () => {
let a = new Resource();
await somePromise();
let = new Resource();
})();
```
These changes fix this issue by disconnecting from the engine during
process shutdown rather than after the user's program has finished its
first turn through the event loop.
This change adds functions, `pulumi.getProject()` and `pulumi.getStack()`,
to fetch the names of the project and stack, respectively. These can be
handy in generating names, specializing areas of the code, etc.
This fixespulumi/pulumi#429.
A dynamic resource is a resource whose provider is implemented alongside
the resource itself. This provider may close over and use orther
resources in the implementation of its CRUD operations. The provider
itself must be stateless, as each CRUD operation for a particular
dynamic resource type may use an independent instance of the provider.
Changes to the definition of a resource's provider result in replacement
of the resource itself (rather than a simple update), as this allows the
old provider definition to delete the old resource and the new provider
definition to create an appropriate replacement.
This change adds environment variable fallbacks for configuration
variables, such that you can either set them explicitly, as a specific
variable PULUMI_CONFIG_<K>, or an entire JSON serialized bag via
PULUMI_CONFIG.
This is convenient when simply invoking programs at the command line,
via node, e.g.
PULUMI_CONFIG_AWS_CONFIG_REGION=us-west-2 node bin/index.js
Our language host also now uses this to communicate config when invoking
a Run RPC, rather than at the command line. This fixespulumi/pulumi#336.
This resource provider accepts a single configuration parameter, `testing:provider:module`, that is the path to a Javascript module that implements CRUD operations for a set of resource types. This allows e.g. a test case to provide its own implementation of these operations that may succeed or fail in interesting ways.
Fixes#338.
This exposes the existing runtime logging functionality in a way meant
for 3rd-parties to consume. This can be useful if we want to introduce
debug logging, warnings, or other things, that fit nicely with the
Pulumi CLI and overall developer workflow.
This change improves our output formatting by generally adding
fewer prefixes. As shown in pulumi/pulumi#359, we were being
excessively verbose in many places, including prefixing every
console.out with "langhost[nodejs].stdout: ", displaying full
stack traces for simple errors like missing configuration, etc.
Overall, this change includes the following:
* Don't prefix stdout and stderr output from the program, other
than the standard "info:" prefix. I experimented with various
schemes here, but they all felt gratuitous. Simply emitting
the output seems fine, especially as it's closer to what would
happen if you just ran the program under node.
* Do NOT make writes to stderr fail the plan/deploy. Previously
we assumed that any console.errors, for instance, meant that
the overall program should fail. This simply isn't how stderr
is treated generally and meant you couldn't use certain
logging techniques and libraries, among other things.
* Do make sure that stderr writes in the program end up going to
stderr in the Pulumi CLI output, however, so that redirection
works as it should. This required a new Infoerr log level.
* Make a small fix to the planning logic so we don't attempt to
print the summary if an error occurs.
* Finally, add a new error type, RunError, that when thrown and
uncaught does not result in a full stack trace being printed.
Anyone can use this, however, we currently use it for config
errors so that we can terminate with a pretty error message,
rather than the monstrosity shown in pulumi/pulumi#359.
This change flips the polarity on parallelism: rather than having a
--serialize flag, we will have a --parallel=P flag, and by default
we will shut off parallelism. We aren't benefiting from it at the
moment (until we implement pulumi/pulumi-fabric#106), and there are
more hidden dependencies in places like AWS Lambdas and Permissions
than I had realized. We may revisit the default, but this allows
us to bite off the messiness of dependsOn only when we benefit from
it. And in any case, the --parallel=P capability will be useful.
This change adds an optiona dependsOn parameter to Resource constructors,
to "force" a fake dependency between resources. We have an extremely strong
desire to resort to using this only in unusual cases -- and instead rely
on the natural dependency DAG based on properties -- but experience in other
resource provisioning frameworks tells us that we're likely to need this in
the general case. Indeed, we've already encountered the need in AWS's
API Gateway resources... and I suspect we'll run into more especially as we
tackle non-serverless resources like EC2 Instances, where "ambient"
dependencies are far more commonplace.
This also makes parallelism the default mode of operation, and we have a
new --serialize flag that can be used to suppress this default behavior.
Full disclosure: I expect this to become more Make-like, i.e. -j 8, where
you can specify the precise width of parallelism, when we tackle
pulumi/pulumi-fabric#106. I also think there's a good chance we will flip
the default, so that serial execution is the default, so that developers
who don't benefit from the parallelism don't need to worry about dependsOn
in awkward ways. This tends to be the way most tools (like Make) operate.
This fixespulumi/pulumi-fabric#335.
There's a fair bit of clean up in here, but the meat is:
* Allocate the language runtime gRPC client connection on the
goroutine that will use it; this eliminates race conditions.
* The biggie: there *appears* to be a bug in gRPC's implementation
on Linux, where it doesn't implement WaitForReady properly. The
behavior I'm observing is that RPC calls will not retry as they
are supposed to, but will instead spuriously fail during the RPC
startup. To work around this, I've added manual retry logic in
the shared plugin creation function so that we won't even try
to use the client connection until it is in a well-known state.
pulumi/pulumi-fabric#337 tracks getting to the bottom of this and,
ideally, removing the work around.
The other minor things are:
* Separate run.js into its own module, so it doesn't include
index.js and do a bunch of random stuff it shouldn't be doing.
* Allow run.js to be invoked without a --monitor. This makes
testing just the run part of invocation easier (including
config, which turned out to be super useful as I was debugging).
* Tidy up some messages.
This change closes the gRPC client connections, as they keep the
Node.js message loop alive on Linux (but, strangely, not Mac;
regardless, a good thing to do anyway...)
This change adds the ability to perform runtime logging, including
debug logging, that wires up to the Pulumi Fabric engine in the usual
ways. Most stdout/stderr will automatically go to the right place,
but this lets us add some debug tracing in the implementation of the
runtime itself (and should come in handy in other places, like perhaps
the Pulumi Framework and even low-level end-user code).
The organization of packages underneath lib/ breaks the easy consumption
of submodules, a la
import {FileAsset} from "@pulumi/pulumi-fabric/asset";
We will go back to having everything hanging off the module root directory.
This makes a few tweaks to get the integration tests passing:
* Add `runtime: nodejs` to the minimal example's `Lumi.yaml` file.
* Remove usage of `@lumi/lumirt { printf }` and just use `console.log`.
* Remove calls to `lumijs` in the integration test framework and
the minimal example's package.json. Instead, we just run
`yarn run build`, which itself internally just invokes `tsc`.
* Add package validation logic and eliminate the pkg/compiler/metadata
library, in favor of the simpler code in pkg/engine.
* Simplify the Node.js langhost plugin CLI, and simply take an
argument rather than requiring required and optional --flags.
* Use a default path of "." if the program path isn't provided. This
is a legal scenario if you've passed a pwd and just want to load
the package's default module ("./index.js" or whatever main says).
* Add an executable script, lumi-langhost-nodejs, that fires up the
`bin/cmd/langhost/index.js` file to serve the Node.js language plugin.
This change introduces the notion of a "dry run" into the property
serialization logic, since this controls whether we wait for dependent
linked property values to arrive or not. It also changes the test
harness to run all tests both ways: once in planning mode (when properties
will show up as "unknown" and the second time in deployment mode (when
properties will have settled to their final values).
This is the initial step towards redefining Lumi as a library that runs
atop vanilla Node.js/V8, rather than as its own runtime.
This change is woefully incomplete but this includes some of the more
stable pieces of my current work-in-progress.
The new structure is that within the sdk/ directory we will have a client
library per language. This client library contains the object model for
Lumi (resources, properties, assets, config, etc), in addition to the
"language runtime host" components required to interoperate with the
Lumi resource monitor. This resource monitor is effectively what we call
"Lumi" today, in that it's the thing orchestrating plans and deployments.
Inside the sdk/ directory, you will find nodejs/, the Node.js client
library, alongside proto/, the definitions for RPC interop between the
different pieces of the system. This includes existing RPC definitions
for resource providers, etc., in addition to the new ones for hosting
different language runtimes from within Lumi.
These new interfaces are surprisingly simple. There is effectively a
bidirectional RPC channel between the Lumi resource monitor, represented
by the lumirpc.ResourceMonitor interface, and each language runtime,
represented by the lumirpc.LanguageRuntime interface.
The overall orchestration goes as follows:
1) Lumi decides it needs to run a program written in language X, so
it dynamically loads the language runtime plugin for language X.
2) Lumi passes that runtime a loopback address to its ResourceMonitor
service, while language X will publish a connection back to its
LanguageRuntime service, which Lumi will talk to.
3) Lumi then invokes LanguageRuntime.Run, passing information like
the desired working directory, program name, arguments, and optional
configuration variables to make available to the program.
4) The language X runtime receives this, unpacks it and sets up the
necessary context, and then invokes the program. The program then
calls into Lumi object model abstractions that internally communicate
back to Lumi using the ResourceMonitor interface.
5) The key here is ResourceMonitor.NewResource, which Lumi uses to
serialize state about newly allocated resources. Lumi receives these
and registers them as part of the plan, doing the usual diffing, etc.,
to decide how to proceed. This interface is perhaps one of the
most subtle parts of the new design, as it necessitates the use of
promises internally to allow parallel evaluation of the resource plan,
letting dataflow determine the available concurrency.
6) The program exits, and Lumi continues on its merry way. If the program
fails, the RunResponse will include information about the failure.
Due to (5), all properties on resources are now instances of a new
Property<T> type. A Property<T> is just a thin wrapper over a T, but it
encodes the special properties of Lumi resource properties. Namely, it
is possible to create one out of a T, other Property<T>, Promise<T>, or
to freshly allocate one. In all cases, the Property<T> does not "settle"
until its final state is known. This cannot occur before the deployment
actually completes, and so in general it's not safe to depend on concrete
resolutions of values (unlike ordinary Promise<T>s which are usually
expected to resolve). As a result, all derived computations are meant to
use the `then` function (as in `someValue.then(v => v+x)`).
Although this change includes tests that may be run in isolation to test
the various RPC interactions, we are nowhere near finished. The remaining
work primarily boils down to three things:
1) Wiring all of this up to the Lumi code.
2) Fixing the handful of known loose ends required to make this work,
primarily around the serialization of properties (waiting on
unresolved ones, serializing assets properly, etc).
3) Implementing lambda closure serialization as a native extension.
This ongoing work is part of pulumi/pulumi-fabric#311.