Thse changes make a subtle but critical adjustment to the process the
Pulumi engine uses to determine whether or not a difference exists
between a resource's actual and desired states, and adjusts the way this
difference is calculated and displayed accordingly.
Today, the Pulumi engine get the first chance to decide whether or not
there is a difference between a resource's actual and desired states. It
does this by comparing the current set of inputs for a resource (i.e.
the inputs from the running Pulumi program) with the last set of inputs
used to update the resource. If there is no difference between the old
and new inputs, the engine decides that no change is necessary without
consulting the resource's provider. Only if there are changes does the
engine consult the resource's provider for more information about the
difference. This can be problematic for a number of reasons:
- Not all providers do input-input comparison; some do input-state
comparison
- Not all providers are able to update the last deployed set of inputs
when performing a refresh
- Some providers--either intentionally or due to bugs--may see changes
in resources whose inputs have not changed
All of these situations are confusing at the very least, and the first
is problematic with respect to correctness. Furthermore, the display
code only renders diffs it observes rather than rendering the diffs
observed by the provider, which can obscure the actual changes detected
at runtime.
These changes address both of these issues:
- Rather than comparing the current inputs against the last inputs
before calling a resource provider's Diff function, the engine calls
the Diff function in all cases.
- Providers may now return a list of properties that differ between the
requested and actual state and the way in which they differ. This
information will then be used by the CLI to render the diff
appropriately. A provider may also indicate that a particular diff is
between old and new inputs rather than old state and new inputs.
Fixes#2453.
Previously, when the CLI wanted to install a plugin, it used a special
method, `DownloadPlugin` on the `httpstate` backend to actually fetch
the tarball that had the plugin. The reason for this is largely tied
to history, at one point during a closed beta, we required presenting
an API key to download plugins (as a way to enforce folks outside the
beta could not download them) and because of that it was natural to
bake that functionality into the part of the code that interfaced with
the rest of the API from the Pulumi Service.
The downside here is that it means we need to host all the plugins on
`api.pulumi.com` which prevents community folks from being able to
easily write resource providers, since they have to manually manage
the process of downloading a provider to a machine and getting it on
the `$PATH` or putting it in the plugin cache.
To make this easier, we add a `--server` argument you can pass to
`pulumi plugin install` to control the URL that it attempts to fetch
the tarball from. We still have perscriptive guidence on how the
tarball must be
named (`pulumi-[<type>]-[<provider-name>]-vX.Y.Z.tar.gz`) but the base
URL can now be configured.
Folks publishing packages can use install scripts to run `pulumi
plugin install` passing a custom `--server` argument, if needed.
There are two improvements we can make to provide a nicer end to end
story here:
- We can augment the GetRequiredPlugins method on the language
provider to also return information about an optional server to use
when downloading the provider.
- We can pass information about a server to download plugins from as
part of a resource registration or creation of a first class
provider.
These help out in cases where for one reason or another where `pulumi
plugin install` doesn't get run before an update takes place and would
allow us to either do the right thing ahead of time or provide better
error messages with the correct `--server` argument. But, for now,
this unblocks a majority of the cases we care about and provides a
path forward for folks that want to develop and host their own
resource providers.
`pulumi query` is designed, essentially, as a souped-up `exec`. We
execute a query program, and add a few convenience constructs (e.g., the
default providers that give you access to things like `getStack`).
Early in the design process, we decided to not re-use the `up`/update
path, both to minimize risk to update operations, and to simplify the
implementation.
This commit will add this "parallel query universe" into the engine
package. In particular, this includes:
* `QuerySource`, which executes the language provider running the query
program, and providing it with some simple constructs, such as the
default provider, which provides access to `getStack`. This is much
like a very simplified `EvalSource`, though notably without any of the
planning/step execution machinery.
* `queryResmon`, which disallows all resource operations, except the
`Invoke` that retrieves the resource outputs of some stack's last
snapshot. This is much like a simplified `resmon`, but without any of
the provider resolution, and without and support for resource
operations generally.
* Various static functions that pull together miscellaneous things
needed to execute a query program. Notably, this includes gathering
language plugins.
* Load default providers deterministically
This commit adds a new algorithm for deriving a list of default
providers from the set of plugins reported from the language host and
from the snapshot. If the language host reports a set of plugins,
default providers are sourced directly from that set, otherwise default
providers are sourced from the full set of plugins, including ones from
the snapshot.
When multiple versions of the same provider are requested, the newest
version of that provider is always select as the default provider.
* Add CHANGELOG.md entry
* Skip the language host's plugins if it reports no resource plugins
* CR feedback
* CR: Log when skipping non resource plugin
* Install missing plugins on startup
This commit addresses the problem of missing plugins by scanning the
snapshot and language host on startup for the list of required plugins
and, if there are any plugins that are required but not installed,
installs them. The mechanism by which plugins are installed is exactly
the same as 'pulumi plugin install'.
The installation of missing plugins is best-effort and, if it fails,
will not fail the update.
This commit addresses pulumi/pulumi-azure#200, where users using Pulumi
in CI often found themselves missing plugins.
* Add CHANGELOG
* Skip downloading plugins if no client provided
* Reduce excessive test output
* Update Gopkg.lock
* Update pkg/engine/destroy.go
Co-Authored-By: swgillespie <sean@pulumi.com>
* CR: make pluginSet a newtype
* CR: Assign loop induction var to local var
This implements the new algorithm for deciding which resources must be
deleted due to a delete-before-replace operation.
We need to compute the set of resources that may be replaced by a
change to the resource under consideration. We do this by taking the
complete set of transitive dependents on the resource under
consideration and removing any resources that would not be replaced by
changes to their dependencies. We determine whether or not a resource
may be replaced by substituting unknowns for input properties that may
change due to deletion of the resources their value depends on and
calling the resource provider's Diff method.
This is perhaps clearer when described by example. Consider the
following dependency graph:
A
__|__
B C
| _|_
D E F
In this graph, all of B, C, D, E, and F transitively depend on A. It may
be the case, however, that changes to the specific properties of any of
those resources R that would occur if a resource on the path to A were
deleted and recreated may not cause R to be replaced. For example, the
edge from B to A may be a simple dependsOn edge such that a change to
B does not actually influence any of B's input properties. In that case,
neither B nor D would need to be deleted before A could be deleted.
In order to make the above algorithm a reality, the resource monitor
interface has been updated to include a map that associates an input
property key with the list of resources that input property depends on.
Older clients of the resource monitor will leave this map empty, in
which case all input properties will be treated as depending on all
dependencies of the resource. This is probably overly conservative, but
it is less conservative than what we currently implement, and is
certainly correct.
We run the same suite of changes that we did on gometalinter. This
ended up catching a few new issues, some of which were addressed and
some of which were baselined.
Whenever an update fails partially, it gives the engine back some state
bag of outputs that should be persisted to the snapshot. When saving
this state, we shouldn't save the inputs that triggered the update that
failed, since the resource that exists will never have been updated
successfully with those inputs.
Instead of saving the new inputs on partial failed updates, this commit
saves the old inputs and the new outputs. The new outputs will likely
need to be refreshed to be useful, but the old inputs will be correct
from the perpsective of the Pulumi program that generated the last
successful update.
Fixespulumi/pulumi#2011
This commit reverts most of #1853 and replaces it with functionally
identical logic, using the notion of status message-specific sinks.
In other words, where the original commit implemented ephemeral status
messages by adding an `isStatus` parameter to most of the logging
methdos in pulumi/pulumi, this implements ephemeral status messages as a
parallel logging sink, which emits _only_ ephemeral status messages.
The original commit message in that PR was:
> Allow log events to be marked "status" events
>
> This commit will introduce a field, IsStatus to LogRequest. A "status"
> logging event will be displayed in the Info column of the main
> display, but will not be printed out at the end, when resource
> operations complete.
>
> For example, for complex resource initialization, we'd like to display
> a series of intermediate results: [1/4] Service object created, for
> example. We'd like these to appear in the Info column, but not at the
> end, where they are not helpful to the user.
Replace the Source-based implementation of refresh with a phase that
runs as the first part of plan execution and rewrites the snapshot in-memory.
In order to fit neatly within the existing framework for resource operations,
these changes introduce a new kind of step, RefreshStep, to represent
refreshes. RefreshSteps operate similar to ReadSteps but do not imply that
the resource being read is not managed by Pulumi.
In addition to the refresh reimplementation, these changes incorporate those
from #1394 to run refresh in the integration test framework.
Fixes#1598.
Fixespulumi/pulumi-terraform#165.
Contributes to #1449.
* Show a better error message when decrypting fails
It is most often the case that failing to decrypt a secret implies that
the secret was transferred from one stack to another via copying the
configuration. This commit introduces a better error message for this
case and instructs users to explicitly re-encrypt their encrypted keys
in the context of the new stack.
* Spelling
* CR: Grammar fixes
The belief is that this hides some complexity that we shouldn't be
exposing in the default case.
In order to filter these events from both the diff/progress display
and the resource change summary, we perform this filtering in
`pkg/engine`.
Fixes#1733.
### First-Class Providers
These changes implement support for first-class providers. First-class
providers are provider plugins that are exposed as resources via the
Pulumi programming model so that they may be explicitly and multiply
instantiated. Each instance of a provider resource may be configured
differently, and configuration parameters may be source from the
outputs of other resources.
### Provider Plugin Changes
In order to accommodate the need to verify and diff provider
configuration and configure providers without complete configuration
information, these changes adjust the high-level provider plugin
interface. Two new methods for validating a provider's configuration
and diffing changes to the same have been added (`CheckConfig` and
`DiffConfig`, respectively), and the type of the configuration bag
accepted by `Configure` has been changed to a `PropertyMap`.
These changes have not yet been reflected in the provider plugin gRPC
interface. We will do this in a set of follow-up changes. Until then,
these methods are implemented by adapters:
- `CheckConfig` validates that all configuration parameters are string
or unknown properties. This is necessary because existing plugins
only accept string-typed configuration values.
- `DiffConfig` either returns "never replace" if all configuration
values are known or "must replace" if any configuration value is
unknown. The justification for this behavior is given
[here](https://github.com/pulumi/pulumi/pull/1695/files#diff-a6cd5c7f337665f5bb22e92ca5f07537R106)
- `Configure` converts the config bag to a legacy config map and
configures the provider plugin if all config values are known. If any
config value is unknown, the underlying plugin is not configured and
the provider may only perform `Check`, `Read`, and `Invoke`, all of
which return empty results. We justify this behavior becuase it is
only possible during a preview and provides the best experience we
can manage with the existing gRPC interface.
### Resource Model Changes
Providers are now exposed as resources that participate in a stack's
dependency graph. Like other resources, they are explicitly created,
may have multiple instances, and may have dependencies on other
resources. Providers are referred to using provider references, which
are a combination of the provider's URN and its ID. This design
addresses the need during a preview to refer to providers that have not
yet been physically created and therefore have no ID.
All custom resources that are not themselves providers must specify a
single provider via a provider reference. The named provider will be
used to manage that resource's CRUD operations. If a resource's
provider reference changes, the resource must be replaced. Though its
URN is not present in the resource's dependency list, the provider
should be treated as a dependency of the resource when topologically
sorting the dependency graph.
Finally, `Invoke` operations must now specify a provider to use for the
invocation via a provider reference.
### Engine Changes
First-class providers support requires a few changes to the engine:
- The engine must have some way to map from provider references to
provider plugins. It must be possible to add providers from a stack's
checkpoint to this map and to register new/updated providers during
the execution of a plan in response to CRUD operations on provider
resources.
- In order to support updating existing stacks using existing Pulumi
programs that may not explicitly instantiate providers, the engine
must be able to manage the "default" providers for each package
referenced by a checkpoint or Pulumi program. The configuration for
a "default" provider is taken from the stack's configuration data.
The former need is addressed by adding a provider registry type that is
responsible for managing all of the plugins required by a plan. In
addition to loading plugins froma checkpoint and providing the ability
to map from a provider reference to a provider plugin, this type serves
as the provider plugin for providers themselves (i.e. it is the
"provider provider").
The latter need is solved via two relatively self-contained changes to
plan setup and the eval source.
During plan setup, the old checkpoint is scanned for custom resources
that do not have a provider reference in order to compute the set of
packages that require a default provider. Once this set has been
computed, the required default provider definitions are conjured and
prepended to the checkpoint's resource list. Each resource that
requires a default provider is then updated to refer to the default
provider for its package.
While an eval source is running, each custom resource registration,
resource read, and invoke that does not name a provider is trapped
before being returned by the source iterator. If no default provider
for the appropriate package has been registered, the eval source
synthesizes an appropriate registration, waits for it to complete, and
records the registered provider's reference. This reference is injected
into the original request, which is then processed as usual. If a
default provider was already registered, the recorded reference is
used and no new registration occurs.
### SDK Changes
These changes only expose first-class providers from the Node.JS SDK.
- A new abstract class, `ProviderResource`, can be subclassed and used
to instantiate first-class providers.
- A new field in `ResourceOptions`, `provider`, can be used to supply
a particular provider instance to manage a `CustomResource`'s CRUD
operations.
- A new type, `InvokeOptions`, can be used to specify options that
control the behavior of a call to `pulumi.runtime.invoke`. This type
includes a `provider` field that is analogous to
`ResourceOptions.provider`.
* Execute chains of steps in parallel
Fixespulumi/pulumi#1624. Since register resource steps are known to be
ready to execute the moment the engine sees them, we can effectively
parallelize all incoming step chains. This commit adds the machinery
necessary to do so - namely a step executor and a plan executor.
* Remove dead code
* CR: use atomic.Value to be explicit about what values are atomically loaded and stored
* CR: Initialize atomics to 'false'
* Add locks around data structures in event callbacks
* CR: Add DegreeOfParallelism method on Options and add comment on select in Execute
* CR: Use context.Context for cancellation instead of cancel.Source
* CR: improve cancellation
* Rebase against master: execute read steps in parallel
* Please gometalinter
* CR: Inline a few methods in stepExecutor
* CR: Feedback and bug fixes
1. Simplify step_executor.go by 'bubbling' up errors as far as possible
and reporting diagnostics and cancellation in one place
2. Fix a bug where the CLI claimed that a plan was cancelled even if it
wasn't (it just has an error)
* Comments
* CR: Add comment around problematic select, move workers.Add outside of goroutine, return instead of break
This commit adds CLI support for resource providers to provide partial
state upon failure. For resource providers that model resource
operations across multiple API calls, the Provider RPC interface can now
accomodate saving bags of state for resource operations that failed.
This is a common pattern for Terraform-backed providers that try to do
post-creation steps on resource as part of Create or Update resource
operations.
This changes two things:
1) Eliminates the fact that we had two kinds of previews in our engine.
2) Always initialize the plugin.Events, to ensure that all plugin loads
are persisted no matter the update type (update, refresh, destroy),
and skip initializing it when dryRun == true, since we won't save them.
In pulumi/pulumi#1356, we observed that we can fail during a destroy
because we attempt to load the language plugin, which now eagerly looks
for the @pulumi/pulumi package.
This is also blocking ingestion of the latest engine bits into the PPC.
It turns out that for destroy (and refresh), we have no need for the
language plugin. So, let's skip loading it when appropriate.
This changes the CLI interface in a few ways:
* `pulumi preview` is back! The alternative of saying
`pulumi update --preview` just felt awkward, and it's a common
operation to want to perform. Let's just make it work.
* There are two flags consistent across all update commands,
`update`, `refresh`, and `destroy`:
- `--skip-preview` will skip the preview step. Note that this
does *not* skip the prompt to confirm that you'd like to proceed.
Indeed, it will still prompt, with a little warning text about
the fact that the preview has been skipped.
* `--yes` will auto-approve the updates.
This lands us in a simpler and more intuitive spot for common scenarios.
I found the flag --force to be a strange name for skipping a preview,
since that name is usually reserved for operations that might be harmful
and yet you're coercing a tool to do it anyway, knowing there's a chance
you're going to shoot yourself in the foot.
I also found that what I almost always want in the situation where
--force was being used is to actually just run a preview and have the
confirmation auto-accepted. Going straight to --force isn't the right
thing in a CI scenario, where you actually want to run a preview first,
just to ensure there aren't any issues, before doing the update.
In a sense, there are four options here:
1. Run a preview, ask for confirmation, then do an update (the default).
2. Run a preview, auto-accept, and then do an update (the CI scenario).
3. Just run a preview with neither a confirmation nor an update (dry run).
4. Just do an update, without performing a preview beforehand (rare).
This change enables all four workflows in our CLI.
Rather than have an explosion of flags, we have a single flag,
--preview, which can specify the mode that we're operating in. The
following are the values which correlate to the above four modes:
1. "": default (no --preview specified)
2. "auto": auto-accept preview confirmation
3. "only": only run a preview, don't confirm or update
4. "skip": skip the preview altogether
As part of this change, I redid a bit of how the preview modes
were specified. Rather than booleans, which had some illegal
combinations, this change introduces a new enum type. Furthermore,
because the engine is wholly ignorant of these flags -- and only the
backend understands them -- it was confusing to me that
engine.UpdateOptions stored this flag, especially given that all
interesting engine options _also_ accepted a dryRun boolean. As of
this change, the backend.PreviewBehavior controls the preview options.
* Refactor the SnapshotManager interface
Lift snapshot management out of the engine by delegating it to the
SnapshotManager implementation in pkg/backend.
* Add a event interface for plugin loads and use that interface to record plugins in the snapshot
* Remove dead code
* Add comments to Events
* Add a number of tests for SnapshotManager
* CR feedback: use a successful bit on 'End' instead of having a separate 'Abort' API
* CR feedback
* CR feedback: register plugins one-at-a-time instead of the entire state at once
* Re-introduce interface for snapshot management
Snapshot management was done through the Update interface; this commit
splits it into a separate interface
* Put the SnapshotManager instance onto the engine context
* Remove SnapshotManager from planContext and updateActions now that it can be accessed by engine Context
Do not fire a "resource outputs" display event for component resources
after their initial registration. Instead, defer this event until the
component's `RegisterResourceOutputs` call arrives.
hese changes plumb basic support for cancellation through the engine.
Two types of cancellation are supported for all engine operations:
- Cancellation, which waits for the operation to drive itself to a safe
point before the operation returns, and
- Termination, which does not wait for the operation to drive itself
to a safe opint for the operation returns.
When updating local or managed stacks, a single ^C triggers cancellation
of any running operation; a second ^C will trigger termination.
Fixes#513, #1077.
* Lift snapshot management out of the engine
This PR is a prerequisite for parallelism by addressing a major problem
that the engine has to deal with when performing parallel resource
construction: parallel mutation of the global snapshot. This PR adds
a `SnapshotManager` type that is responsible for maintaining and
persisting the current resource snapshot. It serializes all reads and
writes to the global snapshot and persists the snapshot to persistent
storage upon every write.
As a side-effect of this, the core engine no longer needs to know about
snapshot management at all; all snapshot operations can be handled as
callbacks on deployment events. This will greatly simplify the
parallelization of the core engine.
Worth noting is that the core engine will still need to be able to read
the current snapshot, since it is interested in the dependency graphs
contained within. The full implications of that are out of scope of this
PR.
Remove dead code, Steps no longer need a reference to the plan iterator that created them
Fixing various issues that arise when bringing up pulumi-aws
Line length broke the build
Code review: remove dead field, fix yaml name error
Rebase against master, provide implementation of StackPersister for cloud backend
Code review feedback: comments on MutationStatus, style in snapshot.go
Code review feedback: move SnapshotManager to pkg/backend, change engine to use an interface SnapshotManager
Code review feedback: use a channel for synchronization
Add a comment and a new test
* Maintain two checkpoints, an immutable base and a mutable delta, and
periodically merge the two to produce snapshots
* Add a lot of tests - covers all of the non-error paths of BeginMutation and End
* Fix a test resource provider
* Add a few tests, fix a few issues
* Rebase against master, fixed merge
This change includes a bunch of refactorings I made in prep for
doing refresh (first, the command, see pulumi/pulumi#1081):
* The primary change is to change the way the engine's core update
functionality works with respect to deploy.Source. This is the
way we can plug in new sources of resource information during
planning (and, soon, diffing). The way I intend to model refresh
is by having a new kind of source, deploy.RefreshSource, which
will let us do virtually everything about an update/diff the same
way with refreshes, which avoid otherwise duplicative effort.
This includes changing the planOptions (nee deployOptions) to
take a new SourceFunc callback, which is responsible for creating
a source specific to the kind of plan being requested.
Preview, Update, and Destroy now are primarily differentiated by
the kind of deploy.Source that they return, rather than sprinkling
things like `if Destroying` throughout. This tidies up some logic
and, more importantly, gives us precisely the refresh hook we need.
* Originally, we used the deploy.NullSource for Destroy operations.
This simply returns nothing, which is how Destroy works. For some
reason, we were no longer doing this, and instead had some
`if Destroying` cases sprinkled throughout the deploy.EvalSource.
I think this is a vestige of some old way we did configuration, at
least judging by a comment, which is apparently no longer relevant.
* Move diff and diff-printing logic within the engine into its own
pkg/engine/diff.go file, to prepare for upcoming work.
* I keep noticing benign diffs anytime I regenerate protobufs. I
suspect this is because we're also on different versions. I changed
generate.sh to also dump the version into grpc_version.txt. At
least we can understand where the diffs are coming from, decide
whether to take them (i.e., a newer version), and ensure that as
a team we are monotonically increasing, and not going backwards.
* I also tidied up some tiny things I noticed while in there, like
comments, incorrect types, lint suppressions, and so on.