This change overhauls the core of how types are used by the entire
compiler. In particular, we now have an ast.Type, and have begun
using its use where appropriate. An ast.Type is a union representing
precisely one of the possible sources of types in the system:
* Primitive type: any, bool, number, string, or service.
* Stack type: a resolved reference to an actual concrete stack.
* Schema type: a resolved reference to an actual concrete schema.
* Unresolved reference: a textual reference that hasn't yet been
resolved to a concrete artifact.
* Uninstantiated reference: a reference that has been resolved to
an uninstantiated stack, but hasn't been bound to a concrete
result yet. Right now, this can point to a stack, however
eventually we would imagine this supporting inter-stack schema
references also.
* Decorated type: either an array or a map; in the array case, there
is a single inner element type; in the map case, there are two,
the keys and values; in all cases, the type recurses to any of the
possibilities listed here.
All of the relevant AST nodes have been overhauled accordingly.
In addition to this, we now have an ast.Schema type. It is loosely
modeled on JSON Schema in its capabilities (http://json-schema.org/).
Although we parse and perform some visitation and binding of these,
there are mostly placeholders left in the code for the interesting
aspects, such as registering symbols, resolving dependencies, and
typechecking usage of schema types.
This is part of the ongoing work behind marapongo/mu#9.
This change leverages intrinsics in place of the predefined types.
It remains to be seen if we can reach 100% on this, however I am hopeful.
It's also nice that the system will be built "out of itself" with this
approach; in other words, each of the types is simply a Mufile that can
use conditional targeting as appropriate for the given cloud providers.
If we find that this isn't enough, we can always bring back the concept.
This change stops using the short-hand "!Ref" YAML syntax. The Golang
marshaler encodes it with quotes and, apparently, has no way to suppress
this behavior; this isn't surprising, since the YAML parser we're using
admits it doesn't support this aspect of the YAML spec fully. But that's
okay, the long-hand syntax works just fine, and has the added benefit
that we don't need to special case the logig for JSON versus YAML.
I think things have gotten a little out of hand with the way mu/x/cf
auto-maps properties. In the beginning, it looked like everything could
be trivially auto-mapped, and I wanted to avoid the verbosity of mapping
each property by hand (since you can easily fat finger a name, mess up
capitalization, forget one, etc). But then we began mapping service
references using proper CloudFormation !Refs, which meant suppressing
some of the auto-mappings, etc., etc. This led to properties, extraProperties,
skipProperties, renamedProperties, and so on... Pretty confusing IMHO.
I just took a step back and decided to eliminate auto-mapping. Instead,
you get two options: properties just lists a set of property name mappings,
and extraProperties lets you do template magic to map thing instead if you
wish to take matters into your own hands. The result isn't too verbose
and has a lot less magic going on so it's easier to understand.
This change eliminates the special type mu/extension in favor of extensible
intrinsic types. This subsumes the previous functionality while also fixing
a number of warts with the old model.
In particular, the old mu/extension approach deferred property binding until
very late in the compiler. In fact, too late. The backend provider for an
extension simply received an untyped bag of stuff, which it then had to
deal with. Unfortunately, some operations in the binder are inaccessible
at this point because doing so would cause a cycle. Furthermore, some
pertinent information is gone at this point, like the scopes and symtables.
The canonical example where we need this is binding services names to the
services themselves; e.g., the AWS CloudFormation "DependsOn" property should
resolve to the actual service names, not the string values. In the limit,
this requires full binding information.
There were a few solutions I considered, including ones that've required
less code motion, however this one feels the most elegant.
Now we permit types to be marked as "intrinsic." Binding to these names
is done exactly as ordinary name binding, unlike the special mu/extension
provider name. In fact, just about everything except code-generation for
these types is the same as ordinary types. This is perfect for the use case
at hand, which is binding properties.
After this change, for example, "DependsOn" is expanded to real service
names precisely as we need.
As part of this change, I added support for three new basic schema types:
* ast.StringList ("string[]"): a list of strings.
* ast.StringMap ("map[string]any"): a map of strings to anys.
* ast.ServiceList ("service[]"): a list of service references.
Obviously we need to revisit this and add a more complete set. This work
is already tracked by marapongo/mu#9.
At the end of the day, it's likely I will replace all hard-coded predefined
types with intrinsic types, for similar reasons to the above.
We need to access the bound property values for a given stack, especially
during code-generation. This information was present for services before,
however not for stacks constructed via other means (e.g., the top-most one).
This change adds a PropertyValues bag plus a corresponding BoundPropertyValues
to the ast.Stack type.
This implements support for arbitrary service types on properties,
not just the weakly typed "service". For example, in the AWS stacks,
the aws/ec2/route type requires a routeTable, among other things:
name: aws/ec2/route
properties:
routeTable:
type: aws/ec2/routeTable
This not only binds the definition of such properties, but also the
callsites of those creating stacks and supplying values for them.
This includes checking for concrete, instantiated, and even base
types, so that, for instance, if a custom stack derived from
aws/ec2/routeTable using the base property, in the above example
it could be supplied as a legal value for the routeTable property.
This change properly transforms literal AST nodes during code-gen.
This includes emitting CloudFormation !Refs where appropriate, for
intra-stack references (capability types).
This change adds a handful of property binding tests.
It also fixes:
* AsName should assert IsName.
* Enumerate properties stably, so that it is deterministic.
* Do not issue errors about unrecognized properties for the special
`mu/extension` type. It's entire purpose in life is to offer an
entirely custom set of properties, which the provider is meant to
validate.
* Default to an empty map if properties are missing.
* Add a "/" to the end of the namespace from the workspace, if present.
And rearranges some code:
* Rename the LiteralX types to XLiteral; e.g., StringLiteral instead of
LiteralString. I kept typing XLiteral erroneously.
* Eliminate the Mu prefix on all of the predefined type and service
functions and types. It's superfluous and reads nicer this way.
* Swap the order of "expected" vs. "got" in the error message about
incorrect property types. It used to say "got %v, expected %v"; I
personally find that it is more helpful if it says "expected %v,
got %v". YMMV.
This is an initial pass at property binding. For all stack instantiations,
we must verify that the set of properties supplied are correct. We also must
remember the bound property information so that code-generation has all of
the information it needs to generate correct code (including capability refs).
This entails:
* Ensuring required properties are provided.
* Expanding missing properties that have Default values.
* Type-checking that supplied properties are of the right type.
* Expanding property values into AST literal nodes.
To do this requires a third AST pass in the semantic analysis part of the
compiler. In the 1st pass, dependencies aren't even known yet; in the 2nd
pass, dependencies have not yet been bound; therefore, we need a 3rd pass,
which can depend on the full binding information for the transitive closure
of AST nodes and dependencies to have been populated with types.
There are a few loose ends in here:
* We don't yet validate top-level stack properties.
* We don't yet validate top-level stack base type properties.
* We don't yet support complex schema property types.
* We don't yet support even "simple" complex property types, like `[ string ]`.
* We don't yet support strongly typed capability property types (just `service`).
That said, I am going to turn to writing a few tests for the basic cases, and then
resume to finishing this afterwards (tracked by marapongo/mu#25).
The prior code could miss arrays of strings during conversion because
the arrays created by the various marshalers are weakly typed. In other
words, even though they contain strings, the array type is []interface{}.
This change introduces the encoding.ArrayOfStrings function to perform
this conversion, first by checking for []string and returning that directly
where possible, and second, if that fails, checking each element and copying.
This change moves parse-tree analysis into the Parse* functions, so that
any callers doing parsing don't need to do this as a multi-step activity.
(We had neglected to do the parse-tree analysis phase during dependency
resolution, for example, meaning services were left untyped.)
In some cases, we actually want to suppress auto-mapping for all or
most of the properties. In those cases, it's easier to specify those
that we *do* want rather than the ones we *do not* want. Now with
properties, skipProperties, and extraProperties, we have all the
necessary flexibility to control auto-mapping for CF templates.
Most properties in CF templates are auto-mapped by the aws/cf extension
provider. However, sometimes we want to inject extra properties that are
outside of that auto-mapping (like our convenient shortcut for supplying
name to mean adding a "Name=v" tag). And sometimes we want to skip auto-
mapping for certain properties (like our capability-based approach to
passing service references, versus string IDs).
This change performs template expansion both for root stack documents in
addition to the transitive closure of dependencies. There are many ongoing
design and implementation questions about how this should actually work;
please see marapongo/mu#7 for a discussion of them.
For now, we can simply auto-map the Mu properties to CF properties,
eliminating the need to manually map them in the templates. Eventually
we'll want more sophistication here to control various aspects of the CF
templates, but this eliminates a lot of tedious manual work in the meantime.
The only two AST nodes that track any semblance of location right now
are ast.Workspace and ast.Stack. This is simply because, using the standard
JSON and YAML parsers, we aren't given any information about the resulting
unmarshaled node locations. To fix that, we'll need to crack open the parsers
and get our hands dirty. In the meantime, we can crudely implement diag.Diagable
on ast.Workspace and ast.Stack, however, to simply return their diag.Documents.
This change adds a new Diagable interface from which you can obtain
a diagnostic's location information (Document and Location). A new
At function replaces WithDocument, et al., and will be used soon to
permit all arbitrary AST nodes to report back their position.
This change completes the implementation of dependency and type binding.
The top-level change here is that, during the first semantic analysis AST walk,
we gather up all unknown dependencies. Then the compiler resolves them, caching
the lookups to ensure that we don't load the same stack twice. Finally, during
the second and final semantic analysis AST walk, we populate the bound nodes
by looking up what the compiler resolved for us.
This change implements dependency versions, including semantic analysis, per the
checkin 83030685c3.
There's quite a bit in here but at a top-level this parses and validates dependency
references of the form
[[proto://]base.url]namespace/.../name[@version]
and verifies that the components are correct, as well as binding them to symbols.
These references can appear in two places at the moment:
* Service types.
* Cluster dependencies.
As part of this change, a number of supporting changes have been made:
* Parse Workspaces using a full-blown parser, parser analysis, and semantic analysis.
This allows us to share logic around the validation of common AST types. This also
moves some of the logic around loading workspace.yaml files back to the parser, where
it can be unified with the way we load Mu.yaml files.
* New ast.Version and ast.VersionSpec types. The former represents a precise version
-- either a specific semantic version or a short or long Git SHA hash -- and the
latter represents a range -- either a Version, "latest", or a semantic range.
* New ast.Ref and ast.RefParts types. The former is an unparsed string that is
thought to contain a Ref, while the latter is a validated Ref that has been parsed
into its components (Proto, Base, Name, and Version).
* Added some type assertions to ensure certain structs implement certain interfaces,
to speed up finding errors. (And remove the coercions that zero-fill vtbl slots.)
* Be consistent about prefixing error types with Error or Warning.
* Organize the core compiler driver's logic into three methods, FE, sema, and BE.
* A bunch of tests for some of the above ... more to come in an upcoming change.
Right now, the AWS ECS scheduler simply passes through to the underlying
AWS cloud provider. However, now we have the necessary hooks to start
incrementally recognizing stack types and emitting specialized code for
them (e.g., starting with mu/container).
This change adds code-generation for Stack references other than the built-in types.
This permits you to bind to a dependency and have it flow all the way through to the
code-generation phases. It still most likely bottoms out on something that fails,
however for pure AWS resources like aws/s3/bucket everything now works.
This change adds support for Workspaces, a convenient way of sharing settings
among many Stacks, like default cluster targets, configuration settings, and the
like, which are not meant to be distributed as part of the Stack itself.
The following things are included in this checkin:
* At workspace initialization time, detect and parse the .mu/workspace.yaml
file. This is pretty rudimentary right now and contains just the default
cluster targets. The results are stored in a new ast.Workspace type.
* Rename "target" to "cluster". This impacts many things, including ast.Target
being changed to ast.Cluster, and all related fields, the command line --target
being changed to --cluster, various internal helper functions, and so on. This
helps to reinforce the desired mental model.
* Eliminate the ast.Metadata type. Instead, the metadata moves directly onto
the Stack. This reflects the decision to make Stacks "the thing" that is
distributed, versioned, and is the granularity of dependency.
* During cluster targeting, add the workspace settings into the probing logic.
We still search in the same order: CLI > Stack > Workspace.
This change mostly replaces explicit if/then/glog.Fatalf calls with
util.Assert calls. In addition, it adds a companion util.Fail family
of methods that does the same thing as a failed assertion, except that
it is unconditional.
If the first rune is unprintable, then we don't want to go ahead and force
capitalization on the next character. (Unlike any other non-first rune,
where of course we do.) In the case of a first rune, we want to let the
current default based on the pascal parameter take charge.
This uses normal AWS resource naming conventions during stack template
creation. Part of this is just a "best practice" thing, however, part of it
is also that we generate illegal names if Mu stacks have illegal characters
like /, -, and so on.
AWS uses capitalized property names for its markup, so we should
be looking for "Type" and "Properties", not "type" and "properties"
when validating that a aws/cf is formatted correctly.
This change eliminates the diag.Sink field, Diag, on the Compiland struct.
Instead, we should provide it at backend provider construction time. This
is consistent with how other phases of the compiler work and also ensures
the backends can properly implement the core.Phase interface.
This change implements the aws/cf extension provider, so that AWS resources
may be described and encapsulated inside of other stacks. Each aws/cf instantiation
requires just two fields -- type and properties -- corresponding to the equivalent
AWS resource object. The result is simply plugged in as an AWS resource, after
Mu templates have been expanded, permitting stack properties, etc. to be used.
This change rearranges the last checkin a little bit. Rather than storing
shadow BoundPublic/BoundPrivate maps, we will store the *ast.Stack directly on
the ast.Service node itself. This helps with context-free manipulation (e.g.,
you don't need access to the parent map just to interact with the node), and
simplifies the backend code quite a bit (again, less context to pass).
This is another change of mostly placeholders.
In general, there will be three kinds of types handled by code-generation:
* Mu primitives will be expanded into AWS goo in a very specialized way, to
accomplish the desired Mu semantics for those abstractions.
* AWS-specific extension types (mu/extension) will be recognized, so that we
can create special AWS resources like S3 buckets, DynamoDB tables, etc.
* Anything else is interpreted as a reference to another stack that will be
instantiated at deployment time (basically through template expansion).
This change does rearrange two noteworthy things in the core compiler, however:
first, it creates a place for bound nodes in the public and private service
references, so that the backend can access the raw stack types behind them; and
second, it moves the predefined types underneath their own package to avoid cycles.
This change rejiggers a few things so that we can more clearly introduce
a boundary between front- and back-end compiler phases, including sharing more,
like a diagnostics sink. Future extensions will include backend code-generation
options.