Jot down some thoughts on MuGL

joeduffy 2017-02-21 16:06:41 -08:00
parent 85ba692832
commit 7cc72260d1
2 changed files with 174 additions and 0 deletions

docs/design/mugl.md Normal file

@ -0,0 +1,173 @@
# Mu Graph Language (MuGL)
In several cases, Mu creates and operates on object graphs. Sometimes these object graphs are very general purpose, and
at other times they are limited to subsets (such as resource-only DAGs). These graphs are produced when evaluating
a MuPackage, when determining the resource graph that represents a deployment activity, and so on. Whenever such a
graph must be persisted, Mu serializes it using the Mu Graph Language (MuGL) format. This document specifies MuGL.
## Overall Structure
The overall structure of the MuGL file format is straightforward: it consists of a linear list of objects, keyed by a
so-called *moniker*, which is a unique identifier within a single MuGL file. Each object contains a simple key/value bag
of properties, possibly deeply nested for complex structures. These objects may reference each other using their monikers.
For example, this directed graph

          B
        ^   \
       /     v
     A         D -> E
       \     ^
        v   /
          C

might be serialized as the following MuGL:

    {
        "vertices": {
            "e": {
            },
            "d": {
                "children": [
                    { "#ref": "e" }
                ]
            },
            "c": {
                "children": [
                    { "#ref": "d" }
                ]
            },
            "b": {
                "children": [
                    { "#ref": "d" }
                ]
            },
            "a": {
                "children": [
                    { "#ref": "b" },
                    { "#ref": "c" }
                ]
            }
        }
    }

If the graph being serialized is a DAG, the objects appear in linear dependency order, like the output of a
topological sort, as in this example. This ensures that deserialization can be done entirely in a single pass.

Any other fields are legal as peers to the top-level `vertices` field, as is common with snapshots (e.g., to track the
source MuPackage and arguments). The schema for vertices is similarly open-ended, except that `#ref` objects resolve to
their corresponding object counterparts upon deserialization into a runtime graph representation.
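
The dependency ordering described above can be sketched in Python. This is only an illustration under stated assumptions, not Mu's serializer: the function name `to_mugl` and the adjacency-dict input shape are hypothetical, and only the `vertices`, `children`, and `#ref` names come from the format itself.

```python
import json


def to_mugl(graph):
    """Serialize a DAG given as {moniker: [child monikers]} into a MuGL-style dict.

    Vertices are emitted dependencies-first (depth-first post-order), so every
    "#ref" points at a vertex that already appears earlier in the output.
    """
    vertices = {}        # insertion order is the serialization order
    in_progress = set()  # monikers on the current DFS stack, for cycle detection

    def visit(moniker):
        if moniker in vertices:
            return
        if moniker in in_progress:
            raise ValueError(f"cycle detected at {moniker!r}; not a DAG")
        in_progress.add(moniker)
        children = graph.get(moniker, [])
        for child in children:
            visit(child)
        in_progress.discard(moniker)
        props = {}
        if children:
            props["children"] = [{"#ref": c} for c in children]
        vertices[moniker] = props

    for moniker in graph:
        visit(moniker)
    return {"vertices": vertices}


# The A/B/C/D/E graph from the example above (hypothetical input encoding).
example = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
doc = to_mugl(example)
print(json.dumps(doc, indent=4))
```

A DAG usually has more than one valid topological order, so this sketch may list `b` and `c` in a different order than the hand-written example while still keeping every dependency ahead of its dependents.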
The `#ref` name is chosen to reduce the likelihood of conflicts with real property names. A MuGL file can override
this choice with the special property `ref` in the front matter; for example, this uses `@@r`:

    {
        "ref": "@@r",
        "vertices": {
            ...,
            "a": {
                "children": [
                    { "@@r": "b" },
                    { "@@r": "c" }
                ]
            }
        }
    }

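A single-pass reader that honors the `ref` override might look like the following Python sketch. The function name `load_mugl` is hypothetical, and the rule that a reference is an object with exactly one key (the reference name) is an assumption made for this example; the document only says that `#ref` objects resolve to their counterparts.

```python
import json


def load_mugl(text):
    """Deserialize MuGL text into {moniker: object}, resolving references in one pass.

    Relies on the DAG ordering rule: every referenced vertex is defined earlier.
    """
    doc = json.loads(text)
    ref_key = doc.get("ref", "#ref")  # front-matter override, else the default name
    objects = {}

    def resolve(value):
        if isinstance(value, dict):
            # Assumption: a reference is an object whose only key is the ref name.
            if len(value) == 1 and ref_key in value:
                return objects[value[ref_key]]  # KeyError signals a forward reference
            return {k: resolve(v) for k, v in value.items()}
        if isinstance(value, list):
            return [resolve(v) for v in value]
        return value

    for moniker, props in doc["vertices"].items():
        objects[moniker] = resolve(props)
    return objects


text = """
{
    "ref": "@@r",
    "vertices": {
        "b": {},
        "c": {},
        "a": { "children": [ { "@@r": "b" }, { "@@r": "c" } ] }
    }
}
"""
graph = load_mugl(text)
# Resolved references are the very same runtime objects, not copies.
print(graph["a"]["children"][0] is graph["b"])
```

Because references resolve to the objects already built, a vertex shared by several parents is represented once in memory.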
## Resource Snapshots
Although MuGL is general purpose, it is used for one very specific area of the Mu system: *resource snapshots*. Each
snapshot captures a complete end-to-end view of an environment's resources and their state. These snapshots are used to
version infrastructure, to compare existing infrastructure to a set of changes, and ultimately, to deploy changes.
A snapshot's schema is identical to that shown above for general MuGL graphs, with these caveats:
* The source MuPackage and arguments, if any, are encoded in the MuGL's header section.
* All snapshot graphs are DAGs.
* Every object is a resource; data objects are serialized as regular JSON (and hence must be acyclic).
* All resource objects have a fixed schema.
* All resource monikers are "stable" (see below).
Each resource has a type token (in [the usual Mu sense](mupack.md)), an optional ID assigned by its provider, an
optional list of moniker aliases, and a bag of properties which, themselves, are just JSON objects with optional edges
inside. Any edges within a resource's properties connect it to dependency resources; because snapshots are DAGs, all
dependency resource definitions will lexically precede the dependent resource within the MuGL file.
For example, imagine a resource snapshot involving a VPC, Subnet, SecurityGroup, and EC2 Instance:

    VPC <- Subnet
     ^        ^
      \        \
       \      Instance
        \        |
         \       v
          SecurityGroup

Assuming it was created from a `my/cluster` MuPackage, we might expect to find the following MuGL snapshot file:

    {
        "package": "my/cluster:*",
        "vertices": {
            "VPC": {
                "id": "vpc-30629859",
                "type": "aws:ec2/vpc:VPC",
                "properties": {
                    "cidrBlock": "172.31.0.0/16"
                }
            },
            "Subnet": {
                "id": "subnet-925087fb",
                "type": "aws:ec2/subnet:Subnet",
                "properties": {
                    "cidrBlock": "172.31.0.0/16",
                    "vpcId": { "#ref": "VPC" }
                }
            },
            "SecurityGroup": {
                "id": "sg-151cd67c",
                "type": "aws:ec2/securityGroup:SecurityGroup",
                "properties": {
                    "name": "SSH",
                    "groupDescription": "Enable SSH access",
                    "securityGroupIngress": [
                        {
                            "cidrIp": "0.0.0.0/0",
                            "fromPort": 22,
                            "ipProtocol": "tcp",
                            "toPort": 22
                        }
                    ],
                    "vpc": { "#ref": "VPC" }
                }
            },
            "Instance": {
                "id": "i-0cd6974f17a414343",
                "type": "aws:ec2/instance:Instance",
                "properties": {
                    "imageId": "ami-f6035893",
                    "instanceType": "t2.micro",
                    "securityGroupIds": [
                        { "#ref": "SecurityGroup" }
                    ],
                    "subnetId": { "#ref": "Subnet" }
                }
            }
        }
    }

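The rule that dependencies lexically precede their dependents can be checked mechanically. The Python sketch below is one possible check, not part of Mu: the helper names `find_refs` and `check_snapshot_order` are hypothetical, it assumes the default `#ref` name, and it treats a reference as an object with exactly one key.

```python
def find_refs(value, ref_key="#ref"):
    """Yield every moniker referenced anywhere inside a JSON-like value."""
    if isinstance(value, dict):
        if len(value) == 1 and ref_key in value:
            yield value[ref_key]
        else:
            for v in value.values():
                yield from find_refs(v, ref_key)
    elif isinstance(value, list):
        for v in value:
            yield from find_refs(v, ref_key)


def check_snapshot_order(snapshot):
    """Return (resource, dependency) pairs whose dependency is not defined earlier."""
    seen = set()
    problems = []
    for moniker, resource in snapshot["vertices"].items():
        for dep in find_refs(resource.get("properties", {})):
            if dep not in seen:
                problems.append((moniker, dep))
        seen.add(moniker)
    return problems


# A trimmed-down version of the snapshot above.
snapshot = {
    "package": "my/cluster:*",
    "vertices": {
        "VPC": {
            "type": "aws:ec2/vpc:VPC",
            "properties": {"cidrBlock": "172.31.0.0/16"},
        },
        "Subnet": {
            "type": "aws:ec2/subnet:Subnet",
            "properties": {"vpcId": {"#ref": "VPC"}},
        },
        "Instance": {
            "type": "aws:ec2/instance:Instance",
            "properties": {"subnetId": {"#ref": "Subnet"}},
        },
    },
}
print(check_snapshot_order(snapshot))  # an empty list means the order is valid
```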
### Resource Monikers
A goal of snapshots is that they are diffable and that resources in one graph may be easily compared to like-resources
in another graph. Therefore, we desire some amount of stability to the monikers chosen for resource objects.
The algorithm for generating monikers is likely to evolve over time as we gain experience with them. For now, they
encode the path from root to resource vertex within the original MuGL graph from which the resources were extracted.
It is possible there are multiple paths to the same resource, in which case the shortest one is chosen as the primary
moniker; if several paths are equally short, the lexicographically first one is chosen. In either case, the
alias monikers are available on the resource definition in case they help resolve comparison ambiguities.
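
The selection rule can be illustrated with a short Python sketch. Several details here are assumptions, since the document does not fix them: paths are modeled as tuples of vertex names, "shortest" is taken to mean fewest path segments, the `/`-joined text encoding is invented for the example, and `choose_moniker` is a hypothetical name.

```python
def choose_moniker(paths):
    """Pick a primary moniker and its aliases from candidate root-to-resource paths.

    Each path is a tuple of vertex names. The shortest path wins; ties are broken
    by lexicographic order. The remaining paths become aliases.
    """
    if not paths:
        raise ValueError("a resource must be reachable by at least one path")
    ordered = sorted(set(paths), key=lambda p: (len(p), p))
    to_text = "/".join  # hypothetical text encoding of a path
    return to_text(ordered[0]), [to_text(p) for p in ordered[1:]]


primary, aliases = choose_moniker([
    ("cluster", "network", "subnet", "vpc"),
    ("cluster", "vpc"),
    ("cluster", "sg", "vpc"),
    ("cluster", "db", "vpc"),
])
print(primary)
print(aliases)
```

Sorting on the pair (length, path) applies both criteria in a single step and also gives the aliases a deterministic order, which helps keep snapshots diffable.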


@ -137,6 +137,7 @@ More details are left to the respective design documents. Here are some key one
* [**Languages**](design/languages.md): An overview of Mu's three languages: MetaMus, MuPack/MuIL, and MuGL.
* [**MuPack/MuIL**](design/mupack.md): A detailed description of Mu's packaging and computation formats.
* [**MuGL**](design/mugl.md): An overview of the MuGL file format and how Mu uses graphs to do deployments.
* [**Stacks**](design/stacks.md): An overview of how stacks are represented using the above fundamentals.
* [**Dependencies**](design/deps.md): An overview of how package management and dependency management works.
* [**Clouds**](design/clouds.md): A description of how Mu abstractions map to different cloud providers.