* Plumbing the custom timeouts from the engine to the providers
* Plumbing the CustomTimeouts through to the engine and adding test to show this
* Change the provider proto to include individual timeouts
* Plumbing the CustomTimeouts from the engine through to the Provider RPC interface
* Change how the CustomTimeouts are sent across RPC
These errors were spotted in testing. We can now see that the timeout
information is arriving in the RegisterResourceRequest
```
req=&pulumirpc.RegisterResourceRequest{
Type: "aws:s3/bucket:Bucket",
Name: "my-bucket",
Parent: "urn:pulumi:dev::aws-vpc::pulumi:pulumi:Stack::aws-vpc-dev",
Custom: true,
Object: &structpb.Struct{},
Protect: false,
Dependencies: nil,
Provider: "",
PropertyDependencies: {},
DeleteBeforeReplace: false,
Version: "",
IgnoreChanges: nil,
AcceptSecrets: true,
AdditionalSecretOutputs: nil,
Aliases: nil,
CustomTimeouts: &pulumirpc.RegisterResourceRequest_CustomTimeouts{
Create: 300,
Update: 400,
Delete: 500,
XXX_NoUnkeyedLiteral: struct {}{},
XXX_unrecognized: nil,
XXX_sizecache: 0,
},
XXX_NoUnkeyedLiteral: struct {}{},
XXX_unrecognized: nil,
XXX_sizecache: 0,
}
```
* Changing the design to use strings
* CHANGELOG entry to include the CustomTimeouts work
* Changing custom timeouts to be passed around the engine as converted value
We don't want to pass around strings - the user can provide it but we want
to make the engine aware of the timeout in seconds as a float64
A resource can be imported by setting the `import` property in the
resource options bag when instantiating a resource. In order to
successfully import a resource, its desired configuration (i.e. its
inputs) must not differ from its actual configuration (i.e. its state)
as calculated by the resource's provider.
There are a few interesting state transitions hiding here when importing
a resource:
1. No prior resource exists in the checkpoint file. In this case, the
resource is simply imported.
2. An external resource exists in the checkpoint file. In this case, the
resource is imported and the old external state is discarded.
3. A non-external resource exists in the checkpoint file and its ID is
different from the ID to import. In this case, the new resource is
imported and the old resource is deleted.
4. A non-external resource exists in the checkpoint file, but the ID is
the same as the ID to import. In this case, the import ID is ignored
and the resource is treated as it would be in all cases except for
changes that would replace the resource. In that case, the step
generator issues an error that indicates that the import ID should be
removed: were we to move forward with the replace, the new state of
the stack would fall under case (3), which is almost certainly not
what the user intends.
Fixes#1662.
Recent changes to default provider semantics and the addition of
resource aliases allow a resource's provider reference to change even if
the resource itself is considered to have no diffs. `mustWrite` did not
expect this scenario, and indeed asserted against it. These changes
update `mustWrite` to detect such changes and require that the
checkpoint be written if and when they occur.
Fixes#2804.
Adds a new resource option `aliases` which can be used to rename a resource. When making a breaking change to the name or type of a resource or component, the old name can be added to the list of `aliases` for a resource to ensure that existing resources will be migrated to the new name instead of being deleted and replaced with the new named resource.
There are two key places this change is implemented.
The first is the step generator in the engine. When computing whether there is an old version of a registered resource, we now take into account the aliases specified on the registered resource. That is, we first look up the resource by its new URN in the old state, and then by any aliases provided (in order). This can allow the resource to be matched as a (potential) update to an existing resource with a different URN.
The second is the core `Resource` constructor in the JavaScript (and soon Python) SDKs. This change ensures that when a parent resource is aliased, that all children implicitly inherit corresponding aliases. It is similar to how many other resource options are "inherited" implicitly from the parent.
Four specific scenarios are explicitly tested as part of this PR:
1. Renaming a resource
2. Adopting a resource into a component (as the owner of both component and consumption codebases)
3. Renaming a component instance (as the owner of the consumption codebase without changes to the component)
4. Changing the type of a component (as the owner of the component codebase without changes to the consumption codebase)
4. Combining (1) and (3) to make both changes to a resource at the same time
When creating a new stack using the local backend, the default
checkpoint has no deployment. That means there's a nil snapshot
created, which means our strategy of using the base snapshot's secrets
manager was not going to work. Trying to do so would result in a panic
because the baseSnapshot is nil in this case.
Using the secrets manager we are going to use to persist the snapshot
is a better idea anyhow, as that's what's actually going to be burned
into the deployment when we serialize the snapshot, so let's use that
instead.
We have many cases where we want to do the following:
deployment -> snapshot -> process snapshot -> deployment
We now retain information in the snapshot about the secrets manager
that was used to construct it, so in these round trip cases, we can
re-use the existing manager.
In general, a "delete" in Pulumi is destroying an actual physical
resource. In the case of a read resource, however, the delete is
merely removing the resource from the stack; the physical resource
is not affected. These changes attempt to clarify this situation by
using the term "discard" rather than "delete".
Fixes#2015.
This implements the new algorithm for deciding which resources must be
deleted due to a delete-before-replace operation.
We need to compute the set of resources that may be replaced by a
change to the resource under consideration. We do this by taking the
complete set of transitive dependents on the resource under
consideration and removing any resources that would not be replaced by
changes to their dependencies. We determine whether or not a resource
may be replaced by substituting unknowns for input properties that may
change due to deletion of the resources their value depends on and
calling the resource provider's Diff method.
This is perhaps clearer when described by example. Consider the
following dependency graph:
A
__|__
B C
| _|_
D E F
In this graph, all of B, C, D, E, and F transitively depend on A. It may
be the case, however, that changes to the specific properties of any of
those resources R that would occur if a resource on the path to A were
deleted and recreated may not cause R to be replaced. For example, the
edge from B to A may be a simple dependsOn edge such that a change to
B does not actually influence any of B's input properties. In that case,
neither B nor D would need to be deleted before A could be deleted.
In order to make the above algorithm a reality, the resource monitor
interface has been updated to include a map that associates an input
property key with the list of resources that input property depends on.
Older clients of the resource monitor will leave this map empty, in
which case all input properties will be treated as depending on all
dependencies of the resource. This is probably overly conservative, but
it is less conservative than what we currently implement, and is
certainly correct.
It is possible for the inputs of a "same" resource to have changed even
if the contents of the input bags are different if the resource's
provider deems the physical change to be semantically irrelevant.
- Create all refresh steps before issuing any. This is important as the
state update loop expects all steps to exist.
- Check for cancellation later in the refresher.
This also fixes races in the SnapshotManager and the test journal that
could cause panics during cancellation.
Replace the Source-based implementation of refresh with a phase that
runs as the first part of plan execution and rewrites the snapshot in-memory.
In order to fit neatly within the existing framework for resource operations,
these changes introduce a new kind of step, RefreshStep, to represent
refreshes. RefreshSteps operate similar to ReadSteps but do not imply that
the resource being read is not managed by Pulumi.
In addition to the refresh reimplementation, these changes incorporate those
from #1394 to run refresh in the integration test framework.
Fixes#1598.
Fixespulumi/pulumi-terraform#165.
Contributes to #1449.
A checkpoint write is unnecessary if it does not change the semantics of
the data currently stored in the checkpoint. We currently perform
unnecessary checkpoint writes in two cases:
- Same steps where no aspect of the resource's state has changed
- Replace steps, which exist solely for display purposes
The former case is particularly bothersome, as it is rather common to
run updates--especially in CI--that consist largely/entirely of these
same steps.
These changes eliminate the checkpoint writes we perform in these two
cases. Some care is needed to ensure that we continue to write the
checkpoint in the case of same steps that do represent meaningful
changes (e.g. changes to a resource's output properties or
dependencies).
Fixes#1769.
* Add a list of in-flight operations to the deployment
This commit augments 'DeploymentV2' with a list of operations that are
currently in flight. This information is used by the engine to keep
track of whether or not a particular deployment is in a valid state.
The SnapshotManager is responsible for inserting and removing operations
from the in-flight operation list. When the engine registers an intent
to perform an operation, SnapshotManager inserts an Operation into this
list and saves it to the snapshot. When an operation completes, the
SnapshotManager removes it from the snapshot. From this, the engine can
infer that if it ever sees a deployment with pending operations, the
Pulumi CLI must have crashed or otherwise abnormally terminated before
seeing whether or not an operation completed successfully.
To remedy this state, this commit also adds code to 'pulumi stack
import' that clears all pending operations from a deployment, as well as
code to plan generation that will reject any deployments that have
pending operations present.
At the CLI level, if we see that we are in a state where pending
operations were in-flight when the engine died, we'll issue a
human-friendly error message that indicates which resources are in a bad
state and how to recover their stack.
* CR: Multi-line string literals, renaming in-flight -> pending
* CR: Add enum to apitype for operation type, also name status -> type for clarity
* Fix the yaml type
* Fix missed renames
* Add implementation for lifecycle_test.go
* Rebase against master
* Serialize SourceEvents coming from the refresh source
The engine requires that a source event coming from a source be "ready
to execute" at the moment that it is sent to the engine. Since the
refresh source sent all goal states eagerly through its source iterator,
the engine assumed that it was legal to execute them all in parallel and
did so. This is a problem for the snapshot, since the snapshot expects
to be in an order that is a legal topological ordering of the dependency
DAG.
This PR fixes the issue by sending refresh source events one-at-a-time
through the refresh source iterator, only unblocking to send the next
step as soon as the previous step completes.
* Fix deadlock in refresh test
* Fix an issue where the engine "completed" steps too early
By signalling that a step is done before committing the step's results
to the snapshot, the engine was left with a race where dependent
resources could find themselves completely executed and committed before
a resource that they depend on has been committed.
Fixespulumi/pulumi#1726
* Fix an issue with Replace steps at the end of a plan
If the last step that was executed successfully was a Replace, we could
end up in a situation where we unintentionally left the snapshot
invalid.
* Add a test
* CR: pass context.Context as first parameter to Iterate
* CR: null->nil
* Protobuf changes to record dependencies for read resources
* Add a number of tests for read resources, especially around replacement
* Place read resources in the snapshot with "external" bit set
Fixespulumi/pulumi#1521. This commit introduces two new step ops: Read
and ReadReplacement. The engine generates Read and ReadReplacement steps
when servicing ReadResource RPC calls from the language host.
* Fix an omission of OpReadReplace from the step list
* Rebase against master
* Transition to use V2 Resources by default
* Add a semantic "relinquish" operation to the engine
If the engine observes that a resource is read and also that the
resource exists in the snapshot as a non-external resource, it will not
delete the resource if the IDs of the old and new resources match.
* Typo fix
* CR: add missing comments, DeserializeDeployment -> DeserializeDeploymentV2, ID check
The PluginEvents will now try to register loaded plugins which,
during a refresh preview, will result in attempting to save mutations
when a token is missing. This change mirrors the changes made to
destroy which avoid it panicing similarly, by simply leaving
PluginEvents unset. Also adds a bit of tracing that was helpful to
me as I debugged through the underlying issues.
Fixes#1377.
* Refactor the SnapshotManager interface
Lift snapshot management out of the engine by delegating it to the
SnapshotManager implementation in pkg/backend.
* Add a event interface for plugin loads and use that interface to record plugins in the snapshot
* Remove dead code
* Add comments to Events
* Add a number of tests for SnapshotManager
* CR feedback: use a successful bit on 'End' instead of having a separate 'Abort' API
* CR feedback
* CR feedback: register plugins one-at-a-time instead of the entire state at once
* Lift snapshot management out of the engine
This PR is a prerequisite for parallelism by addressing a major problem
that the engine has to deal with when performing parallel resource
construction: parallel mutation of the global snapshot. This PR adds
a `SnapshotManager` type that is responsible for maintaining and
persisting the current resource snapshot. It serializes all reads and
writes to the global snapshot and persists the snapshot to persistent
storage upon every write.
As a side-effect of this, the core engine no longer needs to know about
snapshot management at all; all snapshot operations can be handled as
callbacks on deployment events. This will greatly simplify the
parallelization of the core engine.
Worth noting is that the core engine will still need to be able to read
the current snapshot, since it is interested in the dependency graphs
contained within. The full implications of that are out of scope of this
PR.
Remove dead code, Steps no longer need a reference to the plan iterator that created them
Fixing various issues that arise when bringing up pulumi-aws
Line length broke the build
Code review: remove dead field, fix yaml name error
Rebase against master, provide implementation of StackPersister for cloud backend
Code review feedback: comments on MutationStatus, style in snapshot.go
Code review feedback: move SnapshotManager to pkg/backend, change engine to use an interface SnapshotManager
Code review feedback: use a channel for synchronization
Add a comment and a new test
* Maintain two checkpoints, an immutable base and a mutable delta, and
periodically merge the two to produce snapshots
* Add a lot of tests - covers all of the non-error paths of BeginMutation and End
* Fix a test resource provider
* Add a few tests, fix a few issues
* Rebase against master, fixed merge