We need to access the bound property values for a given stack, especially
during code-generation. This information was present for services before,
however not for stacks constructed via other means (e.g., the top-most one).
This change adds a PropertyValues bag plus a corresponding BoundPropertyValues
to the ast.Stack type.
This implements support for arbitrary service types on properties,
not just the weakly typed "service". For example, in the AWS stacks,
the aws/ec2/route type requires a routeTable, among other things:
name: aws/ec2/route
properties:
routeTable:
type: aws/ec2/routeTable
This not only binds the definition of such properties, but also the
callsites of those creating stacks and supplying values for them.
This includes checking for concrete, instantiated, and even base
types, so that, for instance, if a custom stack derived from
aws/ec2/routeTable using the base property, in the above example
it could be supplied as a legal value for the routeTable property.
A stack property can refer to other stack types. For example:
properties:
gateway:
type: aws/ec2/internetGateway
...
In such cases, we need to validate the property during binding,
in addition to binding it to an actual type so that we can later
validate callers who are constructing instances of this stack
and providing property values that we must typecheck.
Note that this binding is subtly different than existing stack
type binding. All the name validation, resolution, and so forth
are the same. However, notice that in this case we are not actually
supplying any property setters. That is, internetGateway is not
an "expanded" type, in that we have not processed any of its templates.
An analogy might help: this is sort of akin referring to an
uninstantiated generic type in a traditional programming language,
versus its instantiated form. In this case, certain properties aren't
available to us, however we can still use it for type identity, etc.
This change properly transforms literal AST nodes during code-gen.
This includes emitting CloudFormation !Refs where appropriate, for
intra-stack references (capability types).
This change adds a handful of property binding tests.
It also fixes:
* AsName should assert IsName.
* Enumerate properties stably, so that it is deterministic.
* Do not issue errors about unrecognized properties for the special
`mu/extension` type. It's entire purpose in life is to offer an
entirely custom set of properties, which the provider is meant to
validate.
* Default to an empty map if properties are missing.
* Add a "/" to the end of the namespace from the workspace, if present.
And rearranges some code:
* Rename the LiteralX types to XLiteral; e.g., StringLiteral instead of
LiteralString. I kept typing XLiteral erroneously.
* Eliminate the Mu prefix on all of the predefined type and service
functions and types. It's superfluous and reads nicer this way.
* Swap the order of "expected" vs. "got" in the error message about
incorrect property types. It used to say "got %v, expected %v"; I
personally find that it is more helpful if it says "expected %v,
got %v". YMMV.
This change permits a workspace to specify a namespace, which is just a name
part that is trimmed off the front of directories when probing for inter-
workspace dependencies. For example, if our namespace is aws/, normally we'd
need to organize our namespace into directories like:
<root>
| aws/
| | dynamodb/
| | ec2/
| | s3/
... and so on ...
If we instead specify a namespace
namespace: aws
Then we can instead organize our project workspace as follows:
<root>
| dynamodb/
| ec2/
| s3/
... and so on ...
This is an initial pass at property binding. For all stack instantiations,
we must verify that the set of properties supplied are correct. We also must
remember the bound property information so that code-generation has all of
the information it needs to generate correct code (including capability refs).
This entails:
* Ensuring required properties are provided.
* Expanding missing properties that have Default values.
* Type-checking that supplied properties are of the right type.
* Expanding property values into AST literal nodes.
To do this requires a third AST pass in the semantic analysis part of the
compiler. In the 1st pass, dependencies aren't even known yet; in the 2nd
pass, dependencies have not yet been bound; therefore, we need a 3rd pass,
which can depend on the full binding information for the transitive closure
of AST nodes and dependencies to have been populated with types.
There are a few loose ends in here:
* We don't yet validate top-level stack properties.
* We don't yet validate top-level stack base type properties.
* We don't yet support complex schema property types.
* We don't yet support even "simple" complex property types, like `[ string ]`.
* We don't yet support strongly typed capability property types (just `service`).
That said, I am going to turn to writing a few tests for the basic cases, and then
resume to finishing this afterwards (tracked by marapongo/mu#25).
This changes the way binding dependencies works slightly, to ensure that
the full transitive closure of dependencies is bound appropriately before
hitting code-generation. Namely, now binder.PrepareStack returns a list
of unresolved dependency Refs; the compiler is responsible for turning this
into a map from Ref to the loaded diag.Document, before calling BindStack;
then, BindStack instantiates these as necessary (template expansion, etc),
returning an array of unbound *ast.Stacks that the compiler must then bind.
This change introduces the notion of "perturbing" properties. Changing
one of these impacts the live service, possibly leading to downtime. As
such, we will likely encourage blue/green deployments of them just to be
safe. Note that this is really just a placeholder so I can keep track of
metadata as we go, since AWS CF has a similar notion to this.
I'm not in love with the name. I considered `interrupts`, however,
I must admit I liked that `readonly` and `perturbs` are symmetric in
the number of characters (meaning stuff lines up nicely...)
This change detects the target cloud earlier on in the compilation process.
Prior to this change, we didn't know this information until the backend code-generation.
Clearly we need to know this at least by then, however, templates can specialize on this
information, so we actually need it sooner. This change moves it into the frontend part.
Note that to support this we now eliminate the ability to specify target clusters in
the Mufile alone. That "feels" right to me anyway, since Mufiles are supposed to be
agnostic to their deployment environment, other than template specialization. Instead,
this information can come from the CLI and/or the workspace settings file.
This change adds the notion of readonly properties to stacks. Although these
*can* be "changed", doing so implies recreation of the resources all over again.
As a result, all dependents must be recreated, in a cascading manner.
Most properties in CF templates are auto-mapped by the aws/cf extension
provider. However, sometimes we want to inject extra properties that are
outside of that auto-mapping (like our convenient shortcut for supplying
name to mean adding a "Name=v" tag). And sometimes we want to skip auto-
mapping for certain properties (like our capability-based approach to
passing service references, versus string IDs).
This change adds a super simple initial whack at a basic cluster topology
comprised of VPC, subnet, internet gateway, attachments, and route tables.
This is actually written in Mu itself, and I am committing this early, since
there are quite a few features required before we can actually make progress
getting this up and running.
This change performs template expansion both for root stack documents in
addition to the transitive closure of dependencies. There are many ongoing
design and implementation questions about how this should actually work;
please see marapongo/mu#7 for a discussion of them.
The only two AST nodes that track any semblance of location right now
are ast.Workspace and ast.Stack. This is simply because, using the standard
JSON and YAML parsers, we aren't given any information about the resulting
unmarshaled node locations. To fix that, we'll need to crack open the parsers
and get our hands dirty. In the meantime, we can crudely implement diag.Diagable
on ast.Workspace and ast.Stack, however, to simply return their diag.Documents.
This change completes the implementation of dependency and type binding.
The top-level change here is that, during the first semantic analysis AST walk,
we gather up all unknown dependencies. Then the compiler resolves them, caching
the lookups to ensure that we don't load the same stack twice. Finally, during
the second and final semantic analysis AST walk, we populate the bound nodes
by looking up what the compiler resolved for us.
This changes the logic when parsing RefParts so that we can actually
round-trip them faithfully when required. To do this, by default we
preserve blank ""s in the RefPart fields that are legally omitted from
the Ref string. Then, there is a Defaults() method that can populate
the missing fields with the defaults if desired.
This change implements dependency versions, including semantic analysis, per the
checkin 83030685c3.
There's quite a bit in here but at a top-level this parses and validates dependency
references of the form
[[proto://]base.url]namespace/.../name[@version]
and verifies that the components are correct, as well as binding them to symbols.
These references can appear in two places at the moment:
* Service types.
* Cluster dependencies.
As part of this change, a number of supporting changes have been made:
* Parse Workspaces using a full-blown parser, parser analysis, and semantic analysis.
This allows us to share logic around the validation of common AST types. This also
moves some of the logic around loading workspace.yaml files back to the parser, where
it can be unified with the way we load Mu.yaml files.
* New ast.Version and ast.VersionSpec types. The former represents a precise version
-- either a specific semantic version or a short or long Git SHA hash -- and the
latter represents a range -- either a Version, "latest", or a semantic range.
* New ast.Ref and ast.RefParts types. The former is an unparsed string that is
thought to contain a Ref, while the latter is a validated Ref that has been parsed
into its components (Proto, Base, Name, and Version).
* Added some type assertions to ensure certain structs implement certain interfaces,
to speed up finding errors. (And remove the coercions that zero-fill vtbl slots.)
* Be consistent about prefixing error types with Error or Warning.
* Organize the core compiler driver's logic into three methods, FE, sema, and BE.
* A bunch of tests for some of the above ... more to come in an upcoming change.
This change adds support for Workspaces, a convenient way of sharing settings
among many Stacks, like default cluster targets, configuration settings, and the
like, which are not meant to be distributed as part of the Stack itself.
The following things are included in this checkin:
* At workspace initialization time, detect and parse the .mu/workspace.yaml
file. This is pretty rudimentary right now and contains just the default
cluster targets. The results are stored in a new ast.Workspace type.
* Rename "target" to "cluster". This impacts many things, including ast.Target
being changed to ast.Cluster, and all related fields, the command line --target
being changed to --cluster, various internal helper functions, and so on. This
helps to reinforce the desired mental model.
* Eliminate the ast.Metadata type. Instead, the metadata moves directly onto
the Stack. This reflects the decision to make Stacks "the thing" that is
distributed, versioned, and is the granularity of dependency.
* During cluster targeting, add the workspace settings into the probing logic.
We still search in the same order: CLI > Stack > Workspace.
Also add some convenience helper methods to deal with names, including
IsName and AsName, which validate that names obey the intended regular expressions.
This change includes logic to resolve dependencies declared by stacks. The design
is described in https://github.com/marapongo/mu/blob/master/docs/deps.md.
In summary, each stack may declare dependencies, which are name/semver pairs. A
new structure has been introduced, ast.Ref, to distinguish between ast.Names and
dependency names. An ast.Ref includes a protocol, base part, and a name part (the
latter being an ast.Name); for example, in "https://hub.mu.com/mu/container/",
"https://" is the protocol, "hub.mu.com/" is the base, and "mu/container" is the
name. This is used to resolve URL-like names to package manager-like artifacts.
The dependency resolution phase happens after parsing, but before semantic analysis.
This is because dependencies are "source-like" in that we must load and parse all
dependency metadata files. We stick the full transitive closure of dependencies
into a map attached to the compiler to avoid loading dependencies multiple times.
Note that, although dependencies prohibit cycles, this forms a DAG, meaning multiple
inbound edges to a single stack may come from multiple places.
From there, we rely on ordinary visitation to deal with dependencies further.
This includes inserting symbol entries into the symbol table, mapping names to the
loaded stacks, during the first phase of binding so that they may be found
subsequently when typechecking during the second phase and beyond.
This change moves the workspace and Mufile detection logic out of the compiler
package and into the workspace one.
This also sketches out the overall workspace structure. A workspace is "delimited"
by the presence of a .mu/ directory anywhere in the parent ancestry. Inside of that
directory we have an optional .mu/clusters.yaml (or .json) file containing cluster
settings shared among the whole workspace. We also have an optional .mu/stacks/
directory that contains dependencies used during package management.
The notion of a "global" workspace will also be present, which is essentially just
a .mu/ directory in your home, ~/.mu/, that has an equivalent structure, but can be
shared among all workspaces on the same machine.
The more I live with the current system, the more I prefer "properties" to
"parameters" for stacks and services. Although it is true that these things
are essentially construction-time arguments, they manifest more like properties
in the way they are used; in fact, if you think of the world in terms of primary
constructors, the distinction is pretty subtle anyway.
For example, when creating a new service, we say the following:
services:
private:
some/service:
a: 0
b: true
c: foo
This looks like a, b, and c are properties of the type some/service. If, on
the other hand, we kept calling these parameters, then you'd arguably prefer to
see the following:
services:
private:
some/service:
arguments:
a: 0
b: true
c: foo
This is a more imperative than declarative view of the world, which I dislike
(especially because it is more verbose).
Time will tell whether this is the right decision or not ...
During unmarshaling, the default behavior of the stock Golang JSON marshaler,
and consequently the YAML one we used which mimics its behavior, is to toss away
unrecognized properties. This isn't what we want for two reasons:
First, we want to issue errors/warnings on unrecognized fields to aid in diagnostics;
we will set aside some extensible section for 3rd parties to use. This is not
addressed in this change, however.
Second, and more pertinent, is that we need to retain unrecognized fields for certain
types like services, which are extensible by default.
Until golang/go#6213 is addressed -- imminent, it seems -- we will have to do a
somewhat hacky workaround to this problem. This change contains what I consider to
be the "least bad" in that we won't introduce a lot of performance overhead, and
just have to deal with the slight annoyance of the ast.Services node type containing
both Public/Private *and* PublicUntyped/PrivateUntyped fields alongside one another.
The marshaler dumps property bags into the *Untyped fields, and the parsetree analyzer
expands them out into a structured ast.Service type. Subsequent passes can then
ignore the *Untyped fields altogether.
Note that this would cause some marshaling funkiness if we ever wanted to remarshal
the mutated ASTs back into JSON/YAML. Since we don't do that right now, however, I've
not made any attempt to keep the two pairs in synch. Post-parsetree analyzer, we
literally just forget about the *Untyped guys.
This change rearranges the last checkin a little bit. Rather than storing
shadow BoundPublic/BoundPrivate maps, we will store the *ast.Stack directly on
the ast.Service node itself. This helps with context-free manipulation (e.g.,
you don't need access to the parent map just to interact with the node), and
simplifies the backend code quite a bit (again, less context to pass).
This is another change of mostly placeholders.
In general, there will be three kinds of types handled by code-generation:
* Mu primitives will be expanded into AWS goo in a very specialized way, to
accomplish the desired Mu semantics for those abstractions.
* AWS-specific extension types (mu/extension) will be recognized, so that we
can create special AWS resources like S3 buckets, DynamoDB tables, etc.
* Anything else is interpreted as a reference to another stack that will be
instantiated at deployment time (basically through template expansion).
This change does rearrange two noteworthy things in the core compiler, however:
first, it creates a place for bound nodes in the public and private service
references, so that the backend can access the raw stack types behind them; and
second, it moves the predefined types underneath their own package to avoid cycles.
This change introduces the notion of "Stack subclassing" in two ways:
1. A Stack may declare that it subclasses another one using the base property:
name: mystack
base: other/stack
.. as before ..
2. A Stack may declare that it is abstract; in other words, that it is meant
solely for subclassing, and cannot be compiled and deployed independently:
name: mystack
abstract: true
.. as before ..
Note that non-abstract Stacks are required to declare at least one Service,
whether that is public, private, or both.
This change includes a few steps towards AWS backend code-generation:
* Add a BoundDependencies property to ast.Stack to remember the *ast.Stack
objects bound during Stack binding.
* Make a few CloudFormation properties optional (cfOutput Export/Condition).
* Rename clouds.ArchMap, clouds.ArchNames, schedulers.ArchMap, and
schedulers.ArchNames to clouds.Values, clouds.Names, schedulers.Values,
and schedulers.Names, respectively. This reads much nicer to my eyes.
* Create a new anonymous ast.Target for deployments if no specific target
was specified; this is to support quick-and-easy "one off" deployments,
as will be common when doing local development.
* Sketch out more of the AWS Cloud implementation. We actually map the
Mu Services into CloudFormation Resources; well, kinda sorta, since we
don't actually have Service-specific logic in here yet, however all of
the structure and scaffolding is now here.
We previously used stable enumeration of the various AST maps in the core
visitor, however we now need stable enumeration in more places (like the AWS
backend I am working on). This change refactors this logic to expose a set
of core ast.StableX routines that stably enumerate maps, and then simply uses
them in place of the existing visitor logic. (Missing generics right now...)
This change implements most of the cloud target and architecture detection
logic, along with associated verification and a bunch of new error messages.
There are two settings for picking a cloud destination:
* Architecture: this specifies the combination of cloud (e.g., AWS, GCP, etc)
plus scheduler (e.g., none, Swarm, ECS, etc).
* Target: a named, preconfigured entity that includes both an Architecture and
an assortment of extra default configuration options.
The general idea here is that you can preconfigure a set of Targets for
named environments like "prod", "stage", etc. Those can either exist in a
single Mufile, or the Mucluster file if they are shared amongst multiple
Mufiles. This can be specified at the command line as such:
$ mu build --target=stage
Furthermore, a given environment may be annointed the default, so that
$ mu build
selects that environment without needing to say so explicitly.
It is also possible to specify an architecture at the command line for
scenarios where you aren't intending to target an existing named environment.
This is good for "anonymous" testing scenarios or even just running locally:
$ mu build --arch=aws
$ mu build --arch=aws:ecs
$ mu build --arch=local:kubernetes
$ .. and so on ..
This change does little more than plumb these settings around, verify them,
etc., however it sets us up to actually start dispating to the right backend.
Instead of:
name: mystack
public:
someservice
private:
someotherservice
we want it to be:
name: mystack
services:
public:
someservice
private
someotherservice
I had always intended it to be this way, but coded up the ASTs wrong.
Neither the YAML nor JSON decoders appreciate having pointers in the AST
structures. This is unfortunate because we end up mutating them later on.
Perhaps we will need separate parse trees and ASTs after all ...
This change lays some groundwork that registers symbols when doing semantic
analysis of the resulting AST. For now, that just entails detecting duplicate
services by way of symbol registration.
Note that we've also split binding into two phases to account for the fact
that intra-stack dependencies are wholly legal.
This change introduces a check during parse-tree analysis that dependencies
are valid, along with some tests. Note that this could technically happen later
during semantic analysis and I will likely move it so that we can get better
diagnostics (more errors before failing). I've also cleaned up and unified some
of the logic by introducing the general notion of a Visitor interface, which the
parse tree analyzer, binder, and analyzers to come will all implement.
This change begins to lay the groundwork for doing semantic analysis and
lowering to the cloud target's representation. In particular:
* Split the mu/schema package. There is now mu/ast which contains the
core types and mu/encoding which concerns itself with JSON and YAML
serialization.
* Notably I am *not* yet introducing a second AST form. Instead, we will
keep the parse tree and AST unified for the time being. I envision very
little difference between them -- at least for now -- and so this keeps
things simpler, at the expense of two downsides: 1) the trees will be
mutable (which turns out to be a good thing for performance), and 2) some
fields will need to be ignored during de/serialization. We can always
revisit this later when and if the need to split them arises.
* Add a binder phase. It is currently a no-op.