This change performs template expansion both for root stack documents and for
the transitive closure of their dependencies. There are many ongoing
design and implementation questions about how this should actually work;
please see marapongo/mu#7 for a discussion of them.
The only two AST nodes that track any semblance of location right now
are ast.Workspace and ast.Stack. This is simply because, using the standard
JSON and YAML parsers, we aren't given any information about the resulting
unmarshaled node locations. To fix that, we'll need to crack open the parsers
and get our hands dirty. In the meantime, we can crudely implement diag.Diagable
on ast.Workspace and ast.Stack by simply returning their diag.Documents.
This change completes the implementation of dependency and type binding.
The top-level change here is that, during the first semantic analysis AST walk,
we gather up all unknown dependencies. Then the compiler resolves them, caching
the lookups to ensure that we don't load the same stack twice. Finally, during
the second and final semantic analysis AST walk, we populate the bound nodes
by looking up what the compiler resolved for us.
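As a rough illustration of the caching step, here is a minimal sketch. The
Stack, Loader, and Resolver names are hypothetical stand-ins rather than the
compiler's actual types; the only point is that a repeated reference is served
from the cache instead of being loaded again.

```go
package main

import "fmt"

// Stack is a hypothetical stand-in for *ast.Stack.
type Stack struct {
	Name string
}

// Loader is an assumed hook that fetches and parses a stack by reference.
type Loader func(ref string) (*Stack, error)

// Resolver caches lookups so that each stack is loaded at most once.
type Resolver struct {
	load  Loader
	cache map[string]*Stack
}

// NewResolver creates a resolver with an empty cache.
func NewResolver(load Loader) *Resolver {
	return &Resolver{load: load, cache: make(map[string]*Stack)}
}

// Resolve returns the cached stack for ref, loading it on first use.
func (r *Resolver) Resolve(ref string) (*Stack, error) {
	if s, ok := r.cache[ref]; ok {
		return s, nil
	}
	s, err := r.load(ref)
	if err != nil {
		return nil, err
	}
	r.cache[ref] = s
	return s, nil
}

func main() {
	loads := 0
	r := NewResolver(func(ref string) (*Stack, error) {
		loads++
		return &Stack{Name: ref}, nil
	})
	// Pretend the first walk gathered these references; "a/b" appears twice.
	for _, ref := range []string{"a/b", "c/d", "a/b"} {
		if _, err := r.Resolve(ref); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println("loads:", loads) // the duplicate reference hits the cache
}
```

The second walk would then call Resolve again for each reference and attach
the returned stack to the bound node.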
This change implements dependency versions, including semantic analysis, per the
checkin 83030685c3.
There's quite a bit in here, but at the top level this parses and validates dependency
references of the form
[[proto://]base.url]namespace/.../name[@version]
and verifies that the components are correct, as well as binding them to symbols.
These references can appear in two places at the moment:
* Service types.
* Cluster dependencies.
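To make the reference form concrete, here is a small parsing sketch. It is not
the real ast.Ref parser: the RefParts fields mirror the component names
mentioned in this change, but ParseRef itself, and in particular the rule that
a leading segment containing a dot is treated as the base URL, are assumptions
made for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// RefParts holds the components of a parsed reference (illustrative only).
type RefParts struct {
	Proto, Base, Name, Version string
}

// ParseRef splits a reference of the form
// [[proto://]base.url]namespace/.../name[@version] into its parts.
func ParseRef(s string) (RefParts, error) {
	var p RefParts
	rest := s
	if i := strings.Index(rest, "://"); i >= 0 {
		p.Proto, rest = rest[:i+3], rest[i+3:]
	}
	if i := strings.LastIndex(rest, "@"); i >= 0 {
		p.Version, rest = rest[i+1:], rest[:i]
		if p.Version == "" {
			return p, fmt.Errorf("ref %q: empty version after '@'", s)
		}
	}
	// Assumption: a leading segment that contains a dot is the base URL.
	if i := strings.Index(rest, "/"); i >= 0 && strings.Contains(rest[:i], ".") {
		p.Base, rest = rest[:i+1], rest[i+1:]
	}
	if p.Proto != "" && p.Base == "" {
		return p, fmt.Errorf("ref %q: protocol given without a base URL", s)
	}
	// Tolerate a single trailing slash, then require non-empty name segments.
	rest = strings.TrimSuffix(rest, "/")
	for _, seg := range strings.Split(rest, "/") {
		if seg == "" {
			return p, fmt.Errorf("ref %q: empty name segment", s)
		}
	}
	p.Name = rest
	return p, nil
}

func main() {
	for _, s := range []string{"mu/container", "https://hub.mu.com/mu/container@1.0.0"} {
		p, err := ParseRef(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%+v\n", p)
	}
}
```

Validation of the version string itself (semantic version, Git SHA, "latest",
or a range) is deliberately left out of this sketch.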
As part of this change, a number of supporting changes have been made:
* Parse Workspaces using a full-blown parser, parser analysis, and semantic analysis.
This allows us to share logic around the validation of common AST types. This also
moves some of the logic around loading workspace.yaml files back to the parser, where
it can be unified with the way we load Mu.yaml files.
* New ast.Version and ast.VersionSpec types. The former represents a precise version
-- either a specific semantic version or a short or long Git SHA hash -- and the
latter represents a range -- either a Version, "latest", or a semantic range.
* New ast.Ref and ast.RefParts types. The former is an unparsed string that is
thought to contain a Ref, while the latter is a validated Ref that has been parsed
into its components (Proto, Base, Name, and Version).
* Added some type assertions to ensure certain structs implement certain interfaces,
to speed up finding errors. (And remove the coercions that zero-fill vtbl slots.)
* Be consistent about prefixing error types with Error or Warning.
* Organize the core compiler driver's logic into three methods, FE, sema, and BE.
* A bunch of tests for some of the above ... more to come in an upcoming change.
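For readers unfamiliar with the type assertion idiom mentioned above, this is
roughly what it looks like in Go. The Diagable interface and its Where method
here are hypothetical stand-ins (the real diag.Diagable signature may differ);
the blank-identifier assignment is the part that matters.

```go
package main

import "fmt"

// Diagable is a hypothetical stand-in for an interface such as diag.Diagable.
type Diagable interface {
	Where() string
}

// Stack is a simplified stand-in for ast.Stack.
type Stack struct {
	Doc string
}

// Where reports the document that the stack was loaded from.
func (s *Stack) Where() string { return s.Doc }

// Compile-time assertion: the build fails here if *Stack ever stops
// implementing Diagable, instead of failing later at some use site.
var _ Diagable = (*Stack)(nil)

func main() {
	var d Diagable = &Stack{Doc: "Mu.yaml"}
	fmt.Println(d.Where())
}
```

The assertion costs nothing at runtime, since the value is discarded.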
This change adds support for Workspaces, a convenient way of sharing settings
among many Stacks, like default cluster targets, configuration settings, and the
like, which are not meant to be distributed as part of the Stack itself.
The following things are included in this checkin:
* At workspace initialization time, detect and parse the .mu/workspace.yaml
file. This is pretty rudimentary right now and contains just the default
cluster targets. The results are stored in a new ast.Workspace type.
* Rename "target" to "cluster". This impacts many things, including ast.Target
being changed to ast.Cluster, and all related fields, the command line --target
being changed to --cluster, various internal helper functions, and so on. This
helps to reinforce the desired mental model.
* Eliminate the ast.Metadata type. Instead, the metadata moves directly onto
the Stack. This reflects the decision to make Stacks "the thing" that is
distributed, versioned, and is the granularity of dependency.
* During cluster targeting, add the workspace settings into the probing logic.
We still search in the same order: CLI > Stack > Workspace.
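The search order can be sketched as a simple first-non-empty lookup. The
PickCluster helper below is hypothetical and only illustrates the precedence;
the actual probing logic has more to consider than a cluster name.

```go
package main

import "fmt"

// PickCluster returns the first non-empty cluster name, probing in the
// order CLI > Stack > Workspace. The boolean is false if none is set.
func PickCluster(cli, stack, workspace string) (string, bool) {
	for _, name := range []string{cli, stack, workspace} {
		if name != "" {
			return name, true
		}
	}
	return "", false
}

func main() {
	// No --cluster flag was given, so the Stack's setting wins over
	// the workspace default.
	name, ok := PickCluster("", "stage", "prod")
	fmt.Println(name, ok)
}
```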
This change includes logic to resolve dependencies declared by stacks. The design
is described in https://github.com/marapongo/mu/blob/master/docs/deps.md.
In summary, each stack may declare dependencies, which are name/semver pairs. A
new structure has been introduced, ast.Ref, to distinguish between ast.Names and
dependency names. An ast.Ref includes a protocol, base part, and a name part (the
latter being an ast.Name); for example, in "https://hub.mu.com/mu/container/",
"https://" is the protocol, "hub.mu.com/" is the base, and "mu/container" is the
name. This is used to resolve URL-like names to package manager-like artifacts.
The dependency resolution phase happens after parsing, but before semantic analysis.
This is because dependencies are "source-like" in that we must load and parse all
dependency metadata files. We stick the full transitive closure of dependencies
into a map attached to the compiler to avoid loading dependencies multiple times.
Note that, although cycles are prohibited, the dependencies still form a DAG, so a
single stack may have multiple inbound edges from different dependents.
From there, we rely on ordinary visitation to deal with dependencies further.
This includes inserting symbol entries into the symbol table, mapping names to the
loaded stacks, during the first phase of binding so that they may be found
subsequently when typechecking during the second phase and beyond.
This change moves the workspace and Mufile detection logic out of the compiler
package and into the workspace one.
This also sketches out the overall workspace structure. A workspace is "delimited"
by the presence of a .mu/ directory anywhere in the parent ancestry. Inside of that
directory we have an optional .mu/clusters.yaml (or .json) file containing cluster
settings shared among the whole workspace. We also have an optional .mu/stacks/
directory that contains dependencies used during package management.
The notion of a "global" workspace will also be present, which is essentially just
a .mu/ directory in your home, ~/.mu/, that has an equivalent structure, but can be
shared among all workspaces on the same machine.
The more I live with the current system, the more I prefer "properties" to
"parameters" for stacks and services. Although it is true that these things
are essentially construction-time arguments, they manifest more like properties
in the way they are used; in fact, if you think of the world in terms of primary
constructors, the distinction is pretty subtle anyway.
For example, when creating a new service, we say the following:
    services:
      private:
        some/service:
          a: 0
          b: true
          c: foo
This looks like a, b, and c are properties of the type some/service. If, on
the other hand, we kept calling these parameters, then you'd arguably prefer to
see the following:
    services:
      private:
        some/service:
          arguments:
            a: 0
            b: true
            c: foo
This is a more imperative than declarative view of the world, which I dislike
(especially because it is more verbose).
Time will tell whether this is the right decision or not ...
During unmarshaling, the default behavior of the stock Golang JSON unmarshaler,
and consequently of the YAML one we use, which mimics its behavior, is to toss away
unrecognized properties. This isn't what we want for two reasons:
First, we want to issue errors/warnings on unrecognized fields to aid in diagnostics;
we will set aside some extensible section for 3rd parties to use. This is not
addressed in this change, however.
Second, and more pertinent, is that we need to retain unrecognized fields for certain
types like services, which are extensible by default.
Until golang/go#6213 is addressed -- imminent, it seems -- we will have to do a
somewhat hacky workaround to this problem. This change contains what I consider to
be the "least bad" in that we won't introduce a lot of performance overhead, and
just have to deal with the slight annoyance of the ast.Services node type containing
both Public/Private *and* PublicUntyped/PrivateUntyped fields alongside one another.
The marshaler dumps property bags into the *Untyped fields, and the parsetree analyzer
expands them out into a structured ast.Service type. Subsequent passes can then
ignore the *Untyped fields altogether.
Note that this would cause some marshaling funkiness if we ever wanted to remarshal
the mutated ASTs back into JSON/YAML. Since we don't do that right now, however, I've
not made any attempt to keep the two pairs in sync. Post-parsetree analyzer, we
literally just forget about the *Untyped guys.
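The shape of the workaround can be sketched with encoding/json. The field
names follow the description above, but the Service layout, the expand helper,
and ParseServices are simplified assumptions, not the actual ast package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Service is a simplified stand-in for the structured ast.Service type.
type Service struct {
	Type  string
	Props map[string]interface{}
}

// Services carries both the structured maps and the raw property bags.
// The decoder only ever sees the *Untyped fields.
type Services struct {
	Public         map[string]Service                `json:"-"`
	Private        map[string]Service                `json:"-"`
	PublicUntyped  map[string]map[string]interface{} `json:"public"`
	PrivateUntyped map[string]map[string]interface{} `json:"private"`
}

// expand turns raw property bags into structured services, keeping every
// property, including ones no schema knows about.
func expand(untyped map[string]map[string]interface{}) map[string]Service {
	out := make(map[string]Service, len(untyped))
	for name, bag := range untyped {
		out[name] = Service{Type: name, Props: bag}
	}
	return out
}

// ParseServices decodes the document and then plays the role of the
// parse-tree analyzer by filling in the structured fields.
func ParseServices(data []byte) (*Services, error) {
	var s Services
	if err := json.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	s.Public = expand(s.PublicUntyped)
	s.Private = expand(s.PrivateUntyped)
	return &s, nil
}

func main() {
	doc := []byte(`{"private": {"some/service": {"a": 0, "b": true, "c": "foo"}}}`)
	s, err := ParseServices(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s.Private["some/service"].Props["c"])
}
```

Because the structured fields are tagged `json:"-"`, re-marshaling this struct
would emit only the untyped bags, which is the funkiness noted above.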
This change rearranges the last checkin a little bit. Rather than storing
shadow BoundPublic/BoundPrivate maps, we will store the *ast.Stack directly on
the ast.Service node itself. This helps with context-free manipulation (e.g.,
you don't need access to the parent map just to interact with the node), and
simplifies the backend code quite a bit (again, less context to pass).
This is another change of mostly placeholders.
In general, there will be three kinds of types handled by code-generation:
* Mu primitives will be expanded into AWS goo in a very specialized way, to
accomplish the desired Mu semantics for those abstractions.
* AWS-specific extension types (mu/extension) will be recognized, so that we
can create special AWS resources like S3 buckets, DynamoDB tables, etc.
* Anything else is interpreted as a reference to another stack that will be
instantiated at deployment time (basically through template expansion).
This change does rearrange two noteworthy things in the core compiler, however:
first, it creates a place for bound nodes in the public and private service
references, so that the backend can access the raw stack types behind them; and
second, it moves the predefined types underneath their own package to avoid cycles.
This change introduces the notion of "Stack subclassing" in two ways:
1. A Stack may declare that it subclasses another one using the base property:
       name: mystack
       base: other/stack
       .. as before ..
2. A Stack may declare that it is abstract; in other words, that it is meant
solely for subclassing, and cannot be compiled and deployed independently:
       name: mystack
       abstract: true
       .. as before ..
Note that non-abstract Stacks are required to declare at least one Service,
whether that is public, private, or both.
This change includes a few steps towards AWS backend code-generation:
* Add a BoundDependencies property to ast.Stack to remember the *ast.Stack
objects bound during Stack binding.
* Make a few CloudFormation properties optional (cfOutput Export/Condition).
* Rename clouds.ArchMap, clouds.ArchNames, schedulers.ArchMap, and
schedulers.ArchNames to clouds.Values, clouds.Names, schedulers.Values,
and schedulers.Names, respectively. This reads much nicer to my eyes.
* Create a new anonymous ast.Target for deployments if no specific target
was specified; this is to support quick-and-easy "one off" deployments,
as will be common when doing local development.
* Sketch out more of the AWS Cloud implementation. We actually map the
Mu Services into CloudFormation Resources; well, kinda sorta, since we
don't actually have Service-specific logic in here yet, however all of
the structure and scaffolding is now here.
This change implements most of the cloud target and architecture detection
logic, along with associated verification and a bunch of new error messages.
There are two settings for picking a cloud destination:
* Architecture: this specifies the combination of cloud (e.g., AWS, GCP, etc)
plus scheduler (e.g., none, Swarm, ECS, etc).
* Target: a named, preconfigured entity that includes both an Architecture and
an assortment of extra default configuration options.
The general idea here is that you can preconfigure a set of Targets for
named environments like "prod", "stage", etc. Those can either exist in a
single Mufile, or the Mucluster file if they are shared amongst multiple
Mufiles. This can be specified at the command line as such:
$ mu build --target=stage
Furthermore, a given environment may be anointed the default, so that
$ mu build
selects that environment without needing to say so explicitly.
It is also possible to specify an architecture at the command line for
scenarios where you aren't intending to target an existing named environment.
This is good for "anonymous" testing scenarios or even just running locally:
$ mu build --arch=aws
$ mu build --arch=aws:ecs
$ mu build --arch=local:kubernetes
$ .. and so on ..
This change does little more than plumb these settings around, verify them,
etc.; however, it sets us up to actually start dispatching to the right backend.
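Splitting an --arch value into its cloud and scheduler parts might look like
the sketch below. Arch and ParseArch are hypothetical names, and the sketch
does not check the parts against the lists of known clouds and schedulers,
which the real verification would need to do.

```go
package main

import (
	"fmt"
	"strings"
)

// Arch is a cloud plus an optional scheduler (illustrative type).
type Arch struct {
	Cloud     string
	Scheduler string // empty means no scheduler was specified
}

// ParseArch parses a value of the form "cloud" or "cloud:scheduler".
func ParseArch(s string) (Arch, error) {
	parts := strings.SplitN(s, ":", 2)
	a := Arch{Cloud: parts[0]}
	if a.Cloud == "" {
		return Arch{}, fmt.Errorf("architecture %q: missing cloud", s)
	}
	if len(parts) == 2 {
		if parts[1] == "" {
			return Arch{}, fmt.Errorf("architecture %q: missing scheduler after ':'", s)
		}
		a.Scheduler = parts[1]
	}
	return a, nil
}

func main() {
	for _, s := range []string{"aws", "aws:ecs", "local:kubernetes"} {
		a, err := ParseArch(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("cloud=%s scheduler=%q\n", a.Cloud, a.Scheduler)
	}
}
```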
Instead of:
    name: mystack
    public:
      someservice
    private:
      someotherservice
we want it to be:
    name: mystack
    services:
      public:
        someservice
      private:
        someotherservice
I had always intended it to be this way, but coded up the ASTs wrong.
Neither the YAML nor JSON decoders appreciate having pointers in the AST
structures. This is unfortunate because we end up mutating them later on.
Perhaps we will need separate parse trees and ASTs after all ...
This change lays some groundwork that registers symbols when doing semantic
analysis of the resulting AST. For now, that just entails detecting duplicate
services by way of symbol registration.
Note that we've also split binding into two phases to account for the fact
that intra-stack dependencies are wholly legal.
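The reason for two phases is easiest to see in a sketch: if names are
registered first, a service may refer to another service declared later in the
same stack. The Service type and Bind function here are hypothetical and much
simpler than the real binder.

```go
package main

import "fmt"

// Service is a simplified stand-in with a name and intra-stack references.
type Service struct {
	Name      string
	DependsOn []string
}

// Bind registers every service name in phase one (reporting duplicates),
// then resolves references in phase two, so forward references are legal.
func Bind(services []Service) []error {
	var errs []error
	table := make(map[string]*Service)

	// Phase 1: symbol registration.
	for i := range services {
		s := &services[i]
		if _, dup := table[s.Name]; dup {
			errs = append(errs, fmt.Errorf("duplicate service %q", s.Name))
			continue
		}
		table[s.Name] = s
	}

	// Phase 2: look up each reference against the completed table.
	for _, s := range services {
		for _, dep := range s.DependsOn {
			if _, ok := table[dep]; !ok {
				errs = append(errs, fmt.Errorf("service %q: unknown dependency %q", s.Name, dep))
			}
		}
	}
	return errs
}

func main() {
	// "web" refers to "db", which is declared after it.
	errs := Bind([]Service{
		{Name: "web", DependsOn: []string{"db"}},
		{Name: "db"},
	})
	fmt.Println("errors:", len(errs))
}
```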
This change introduces a check during parse-tree analysis that dependencies
are valid, along with some tests. Note that this could technically happen later
during semantic analysis and I will likely move it so that we can get better
diagnostics (more errors before failing). I've also cleaned up and unified some
of the logic by introducing the general notion of a Visitor interface, which the
parse tree analyzer, binder, and analyzers to come will all implement.
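A Visitor interface in this spirit could look like the following sketch. The
method names, the Walk driver, and the trimmed-down Stack and Service types
are assumptions for illustration, not the interface as checked in.

```go
package main

import "fmt"

// Service and Stack are trimmed-down stand-ins for the AST node types.
type Service struct {
	Name string
}

type Stack struct {
	Name     string
	Services []Service
}

// Visitor is implemented by each pass that wants to walk the tree.
type Visitor interface {
	VisitStack(stack *Stack)
	VisitService(parent *Stack, svc *Service)
}

// Walk drives a visitor over a stack and then over each of its services.
func Walk(v Visitor, stack *Stack) {
	v.VisitStack(stack)
	for i := range stack.Services {
		v.VisitService(stack, &stack.Services[i])
	}
}

// counter is a trivial pass that counts the nodes it sees.
type counter struct {
	stacks, services int
}

func (c *counter) VisitStack(*Stack)            { c.stacks++ }
func (c *counter) VisitService(*Stack, *Service) { c.services++ }

// Compile-time check that counter satisfies Visitor.
var _ Visitor = (*counter)(nil)

func main() {
	c := &counter{}
	Walk(c, &Stack{Name: "mystack", Services: []Service{{Name: "a"}, {Name: "b"}}})
	fmt.Println(c.stacks, c.services)
}
```

Each pass (parse tree analyzer, binder, and so on) then only supplies the
per-node logic and shares the traversal.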
This change begins to lay the groundwork for doing semantic analysis and
lowering to the cloud target's representation. In particular:
* Split the mu/schema package. There is now mu/ast which contains the
core types and mu/encoding which concerns itself with JSON and YAML
serialization.
* Notably I am *not* yet introducing a second AST form. Instead, we will
keep the parse tree and AST unified for the time being. I envision very
little difference between them -- at least for now -- and so this keeps
things simpler, at the expense of two downsides: 1) the trees will be
mutable (which turns out to be a good thing for performance), and 2) some
fields will need to be ignored during de/serialization. We can always
revisit this later when and if the need to split them arises.
* Add a binder phase. It is currently a no-op.