482 commits

Author SHA1 Message Date
joeduffy
5f0a1e970b Refactor binding logic
This changes a few things around binding logic, as part of eliminating
all of the legacy logic and weaving it into the new codebase:

* Give Scopes access to the Context object.  Related, add a TryRegister
  method to Scope that is like RequireRegister, except that instead of
  fail-fast upon encountering a duplicate entry, it will issue an error.

* Move all typecheck visitation functions out of the big honkin' switch
  and into their own member functions.  As this stuff gets more complex,
  having everything in one routine was starting to irk my sensibilities.

* Validate that packages have names.

* Store both the package symbol, plus the canonicalized URL used to
  resolve it, in the package map.  This will help us verify that versions
  match for multiple package references resolving to the same symbol.

* Add nice inquiry methods to the Class symbol (Sealed, Abstract, Record,
  Interface) that simplify accessing the modifiers on the underlying node.
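
A minimal sketch of the difference between the two registration styles described in the first bullet. The Symbol and Context shapes, the Errors field, and the bool return value are assumptions for illustration only; just the method names RequireRegister and TryRegister come from the message above.

```go
package main

import "fmt"

// Symbol is a hypothetical stand-in for a bound compiler symbol.
type Symbol struct{ Name string }

// Context is a hypothetical reduction of the shared Context; it only
// collects error messages here.
type Context struct{ Errors []string }

// Scope maps names to symbols and holds a reference to the Context.
type Scope struct {
	ctx     *Context
	symbols map[string]*Symbol
}

// NewScope creates an empty scope that reports errors to ctx.
func NewScope(ctx *Context) *Scope {
	return &Scope{ctx: ctx, symbols: make(map[string]*Symbol)}
}

// RequireRegister fails fast (panics) when the name is already taken.
func (s *Scope) RequireRegister(sym *Symbol) {
	if _, exists := s.symbols[sym.Name]; exists {
		panic(fmt.Sprintf("duplicate symbol: %s", sym.Name))
	}
	s.symbols[sym.Name] = sym
}

// TryRegister records an error on the Context when the name is already
// taken, and reports whether the registration succeeded.
func (s *Scope) TryRegister(sym *Symbol) bool {
	if _, exists := s.symbols[sym.Name]; exists {
		s.ctx.Errors = append(s.ctx.Errors,
			fmt.Sprintf("symbol already declared: %s", sym.Name))
		return false
	}
	s.symbols[sym.Name] = sym
	return true
}

func main() {
	ctx := &Context{}
	scope := NewScope(ctx)
	scope.RequireRegister(&Symbol{Name: "x"})
	ok := scope.TryRegister(&Symbol{Name: "x"}) // duplicate: error, no panic
	fmt.Println(ok, len(ctx.Errors))
}
```

In this sketch the fail-fast variant suits internal invariants, while the error-issuing variant suits duplicates that can come from user input.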
2017-01-23 10:58:45 -08:00
joeduffy
25efb734da Move compiler options to pkg/compiler/core
The options structure will be shared between multiple passes of
compilation, including evaluation and graph generation.  Therefore,
it must not be in the pkg/compile package, else we would create
package cycles.  Now that the options structure is barebones --
and, in particular, no more "backend" settings pollute it -- this
refactoring actually works.
2017-01-23 08:10:49 -08:00
joeduffy
d7e04dc42e Add a pointer type; make load location expressions produce it
This change adds a pointer type, so that we can turn load location
expressions into proper l-values.
2017-01-23 08:00:09 -08:00
joeduffy
1e1a70345e Record and validate labeled statements and jumps 2017-01-23 07:25:05 -08:00
joeduffy
b724857ae1 Properly enforce accessibility
This change fixes a few mistakes in the old way of checking accessibility,
and adds support for protected class visibility.
2017-01-22 10:09:55 -08:00
joeduffy
7c5241978e Perform member lookup and binding 2017-01-22 09:45:58 -08:00
joeduffy
d907e358ad Assert that all expressions have types
This is a bit of paranoia; however, an invariant of the typechecker
is that, after the final pass, all expression nodes are assigned a
type.  This assertion checks that this is true.
2017-01-21 14:39:20 -08:00
joeduffy
9fe2f28e7b Add walker cases for token nodes 2017-01-21 14:39:05 -08:00
joeduffy
bd231ff624 Implement fmt.Stringer on token types 2017-01-21 12:25:59 -08:00
joeduffy
6209b65a36 Serialize return types as proper AST tokens 2017-01-21 12:14:25 -08:00
joeduffy
bbdc18652c Add missing kind specifiers
The prior checkin failed to place kind specifiers on the token AST nodes.
I'm surprised TypeScript let this fly.
2017-01-21 12:08:44 -08:00
joeduffy
4dc082a2df Use 1st class token AST nodes
Instead of serializing simple token strings into the AST -- in place of things
like type references, module references, export references, etc. -- we now use
1st class AST nodes.  This ensures that source context flows with the tokens
as we bind them, etc., and also cleans up a few inconsistencies (like using an
ast.Identifier for NewExpression -- clearly wrong since the resulting
MuIL is meant to contain fully bound semantic references).
2017-01-21 11:48:58 -08:00
joeduffy
3518dce92f Test token manipulation
This change includes some tests for token parsing and conversions.  It
also fixes a bug where we treated Type tokens like ClassMembers, when
we ought to have been treating them like ModuleMembers.
2017-01-21 11:08:25 -08:00
joeduffy
02d7ba2417 Add type conversion tests 2017-01-21 10:20:47 -08:00
joeduffy
a58cd7e2a0 Begin typechecking
This change performs typechecking during binding.  This is less about
typechecking per se -- since higher level languages will have presumably
given us well-typed IL -- and more about preparing the AST so that we
can evaluate the fully bound nodes to produce a MuGL graph.  It also
serves as a "verifier" for the incoming MuIL, however.

This is clearly incomplete, as the dozens of TODOs will make obvious.
But it's a clean checkpoint that does enough interesting typechecking
that I am landing it now.
2017-01-21 09:08:35 -08:00
joeduffy
74980ce339 Use stable map enumeration for member walking 2017-01-20 17:34:11 -08:00
joeduffy
c182d71c08 Begin binding function bodies
This change begins to bind function bodies.  This must be done as a
second pass over the AST, because dependencies between modules, and
even intra-module dependencies, might refer to top-level symbols like
types, variables, and functions, and so must be established first.

At the moment, the only node kind we handle is ast.Block, which
merely pushes and pops lexical scopes; however, the next step is to
implement the AST node-specific visitation logic for all statement
and expression nodes.

I've also rearranged how Scopes work to be a little easier to use.
The Scope type now remembers the **Scope slot in which it is rooted,
so that we can simply call Push and Pop on Scopes and have the right
thing happen.
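
One way the "rooted slot" idea could look, as a rough sketch. The field names, the NewRootScope constructor, and the string-valued symbol table are hypothetical; only the **Scope slot and the Push and Pop operations are taken from the description above.

```go
package main

import "fmt"

// Scope is a lexical scope that remembers the slot holding the "current"
// scope, so that Push and Pop can update that slot directly.
type Scope struct {
	slot   **Scope           // where the current scope pointer lives
	parent *Scope            // enclosing scope, nil for the root
	names  map[string]string // hypothetical symbol table: name -> meaning
}

// NewRootScope creates a parentless scope and installs it in slot.
func NewRootScope(slot **Scope) *Scope {
	s := &Scope{slot: slot, names: map[string]string{}}
	*slot = s
	return s
}

// Push creates a child scope and makes it the current scope.
func (s *Scope) Push() *Scope {
	child := &Scope{slot: s.slot, parent: s, names: map[string]string{}}
	*s.slot = child
	return child
}

// Pop restores the parent as the current scope.
func (s *Scope) Pop() { *s.slot = s.parent }

// Lookup searches this scope and then each enclosing scope.
func (s *Scope) Lookup(name string) (string, bool) {
	for sc := s; sc != nil; sc = sc.parent {
		if v, ok := sc.names[name]; ok {
			return v, true
		}
	}
	return "", false
}

func main() {
	var current *Scope // the slot, e.g. a field on the binder
	root := NewRootScope(&current)
	root.names["x"] = "outer"

	inner := current.Push() // entering a block
	inner.names["x"] = "inner"
	v, _ := current.Lookup("x")
	fmt.Println(v)

	inner.Pop() // leaving the block
	v, _ = current.Lookup("x")
	fmt.Println(v)
}
```

Because each scope carries its slot, callers do not need to pass the binder around just to update the current scope pointer.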
2017-01-20 14:04:13 -08:00
joeduffy
3f8a317724 Use symbol factories
This introduces symbol factory methods to make creating them
less error prone.  In particular, we hadn't been wiring up parents
properly (since they came in after the initial symbol shape).
Now with the factory methods, we'll be forced to visit creation
sites whenever adding new required elements to symbol types.
2017-01-20 13:02:51 -08:00
joeduffy
4babba157f Use module tokens in the import list 2017-01-20 12:06:08 -08:00
joeduffy
2c0266c9e4 Clean up package URL logic
This change rearranges the old way we dealt with URLs.  In the old system,
virtually every reference to an element, including types, was fully qualified
with a possible URL-like reference.  (The old pkg/tokens/Ref type.)  In the
new model, only dependency references are URL-like.  All maps and references
within the MuPack/MuIL format are token and name based, using the new
pkg/tokens/Token and pkg/tokens/Name family of related types.

As such, this change renames Ref to PackageURLString, and RefParts to
PackageURL.  (The convenient name is given to the thing with "more" structure,
since we prefer to deal with structured types and not strings.)  It moves
out of the pkg/tokens package and into pkg/pack, since it is exclusively
there to support package resolution.  Similarly, the Version, VersionSpec,
and related types move out of pkg/tokens and into pkg/pack.

This change cleans up the various binder, package, and workspace logic.
Most of these changes are a natural fallout of this overall restructuring,
although in a few places we remained sloppy about the difference between
Token, Name, and URL.  Now the type system supports these distinctions and
forces us to be more methodical about any conversions that take place.
2017-01-20 11:46:36 -08:00
joeduffy
bf33605195 Rearrange the library code
This rearranges the library code:

* sdk/... goes away.

* What used to be sdk/javascript/ is now lib/mu/, an actual MuPackage
  that provides the base abstractions for all other MuPackages to use.

* lib/aws is the @mu/aws MuPackage that exposes all AWS resources.

* lib/mux is the @mu/x MuPackage that provides cross-cloud abstractions.
  A lot of what used to be in lib/mu goes here.  In particular, autoscaler,
  func, ..., all the "general purpose" abstractions, really.
2017-01-20 10:30:43 -08:00
joeduffy
729af81e44 Move all cloud switching to mu/x MuPackage
In the old system, the core runtime/toolset understood that we are targeting
specific cloud providers at a very deep level.  In fact, the whole code-generation
phase of the compiler was based on it.

In the new system, this difference is less of a "special" concern, and more of
a general one of mapping MuIL objects to resource providers, and letting *them*
gather up any configuration they need in a more general purpose way.

Therefore, most of this stuff can go.  I've merged in a small amount of it to
the mu/x MuPackage, since that has to switch on cloud IaaS and CaaS providers in
order to decide what kind of resources to provision.  For example, it has a
mu.x.Cluster stack type that itself provisions a lot of the barebone essential
resources, like a virtual private cloud and its associated networking components.

I suspect *some* knowledge of this will surface again as we implement more
runtime presence (discovery, etc).  But for the time being, it's a distraction
from getting the core model running.  I've retained some of the old AWS code in the
new pkg/resource/providers/aws package, in case I want to reuse some of it when
implementing our first AWS resource providers.  (Although we won't be using
CloudFormation, some of the name generation code might be useful.)  So, the
ships aren't completely burned to the ground, but they are certainly on 🔥.
2017-01-20 09:46:59 -08:00
joeduffy
59aa09008c Strip trailing newlines from MuJS diagnostics
As I do local development, I noticed errant newlines in the error
messages coming from TypeScript.  That's because its formatting appends
newlines, whereas ours does not (and requires the code printing the
diagnostic to add one).  To make these uniform, we will strip newlines,
if they exist, from the TypeScript preformatted diagnostics.
2017-01-20 09:37:55 -08:00
joeduffy
7b0dc8ee8d Overhaul names versus tokens
I was sloppy in my use of names versus tokens in the original AST.
Now that we're actually binding things to concrete symbols, etc., we
need to be more precise.  In particular, names are just identifiers
that must be "interpreted" in a given lexical context for them to
make any sense; whereas, tokens stand alone and can be resolved without
context other than the set of imported packages, modules, and overall
module structure.  As such, names are much simpler than tokens.

As explained in the comments, tokens.Names are simple identifiers:

    Name = [A-Za-z_][A-Za-z0-9_]*

and tokens.QNames are fully qualified identifiers delimited by "/":

    QName = [ <Name> "/" ]* <Name>

The legal grammar for a token depends on the subset of symbols that
token is meant to represent.  However, the most general case, that
accepts all specializations of tokens, is roughly as follows:

    Token       = <Name> |
                  <PackageName>
                    [ ":" <ModuleName>
                        [ "/" <ModuleMemberName>
                            [ "." <ClassMemberName> ]
                        ]
                    ]

where:

    PackageName         = <QName>
    ModuleName          = <QName>
    ModuleMemberName    = <Name>
    ClassMemberName     = <Name>

Please refer to the comments in pkg/tokens/tokens.go for more details.
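
As an illustration only, the Name and QName productions above can be checked with regular expressions along these lines. The helper names IsName and IsQName are assumptions for this sketch and may not match what pkg/tokens actually exposes.

```go
package main

import (
	"fmt"
	"regexp"
)

// namePattern transcribes: Name = [A-Za-z_][A-Za-z0-9_]*
const namePattern = `[A-Za-z_][A-Za-z0-9_]*`

var (
	// A Name must match the whole string.
	nameRegexp = regexp.MustCompile(`^` + namePattern + `$`)
	// QName = [ <Name> "/" ]* <Name>
	qnameRegexp = regexp.MustCompile(`^(` + namePattern + `/)*` + namePattern + `$`)
)

// IsName reports whether s is a simple identifier.
func IsName(s string) bool { return nameRegexp.MatchString(s) }

// IsQName reports whether s is one or more Names delimited by "/".
func IsQName(s string) bool { return qnameRegexp.MatchString(s) }

func main() {
	for _, s := range []string{"aws", "mu/x", "9lives", "a//b"} {
		fmt.Println(s, IsName(s), IsQName(s))
	}
}
```

Every Name is also a QName under this grammar, which is why the qualified pattern ends with a bare Name.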
2017-01-19 17:57:20 -08:00
joeduffy
35fddddb78 Bind packages and modules
This change implements a significant amount of the top-level package
and module binding logic, including module and class members.  It also
begins whittling away at the legacy binder logic (which I expect will
disappear entirely in the next checkin).

The scope abstraction has been rewritten in terms of the new tokens
and symbols layers.  Each scope has a symbol table that associates
names with bound symbols, which can be used during lookup.  This
accomplishes lexical scoping of the symbol names, by pushing and
popping at the appropriate times.  I envision all name resolution to
happen during this single binding pass so that we needn't reconstruct
lexical scoping more than once.

Note that we need to do two passes at the top-level, however.  We
must first bind module-level member names to their symbols *before*
we bind any method bodies, otherwise legal intra-module references
might turn up empty-handed during this binding pass.
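
A toy sketch of why the two top-level passes matter. The Function type and Bind function are hypothetical and far simpler than the real binder; the point is only that all member names are declared before any body is bound.

```go
package main

import "fmt"

// Function is a hypothetical module member whose body refers to other
// members by name.
type Function struct {
	Name  string
	Calls []string
}

// Bind declares all member names first, then binds bodies, and returns
// one error message per unresolved reference.
func Bind(members []Function) []string {
	// Pass 1: bind every module-level member name to its symbol.
	symbols := make(map[string]*Function, len(members))
	for i := range members {
		symbols[members[i].Name] = &members[i]
	}

	// Pass 2: bind bodies; forward references now resolve.
	var errs []string
	for _, fn := range members {
		for _, callee := range fn.Calls {
			if _, ok := symbols[callee]; !ok {
				errs = append(errs,
					fmt.Sprintf("%s: unresolved reference to %s", fn.Name, callee))
			}
		}
	}
	return errs
}

func main() {
	members := []Function{
		{Name: "main", Calls: []string{"helper"}}, // forward reference
		{Name: "helper"},
	}
	fmt.Println(len(Bind(members)))
}
```

With a single pass, the reference from main to helper would be looked up before helper had been declared.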

There is also a type table that associates types with ast.Nodes.
This is how we avoid needing a complete shadow tree of nodes, and/or
avoid needing to mutate the nodes in place.  Every node with a type
gets an entry in the type table.  For example, variable declarations,
expressions, and so on, each get an entry.  This ensures that we can
access type symbols throughout the subsequent passes without needing
to reconstruct scopes or emulate lexical scoping (as described above).

This is a work in progress, so there are a number of important TODOs
in there associated with symbol table management and body binding.
2017-01-19 13:37:47 -08:00
joeduffy
5aff453cc1 Add rudimentary symbol abstractions
This massages the symbol layer to reflect more closely what we need.

There is a Symbol interface.  It is an interface because it's polymorphic
and we'll need to switch on type tests throughout the code a fair bit.

In addition to the Symbol interface, there are three other interfaces:

* ModuleMember, for any Module member symbols.

* ClassMember, for any Class member symbols.

* Type, to permit polymorphic treatment of Classes and built-in types.

There are concrete symbols for Module, ModuleProperty, ModuleMethod,
Class, ClassProperty, and ClassMethod.  These map directly to the
corresponding AST abstractions and simply permit us to annotate those
AST nodes with some semantic information and, more importantly, to
inject them into the symbol table as we perform binding/typechecking.

Class implements the Type abstraction.

There is also a primitive node with four constant types, AnyType,
BoolType, NumberType, and StringType, each of which is registered in
an exported Primitives map, keyed by their name/keyword/token.  These
of course implement the Type abstraction.

Finally, there are ArrayType and MapType symbols, which also implement
Type.  They wrap other types as keys/elements.
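
A rough sketch of how such a layering might look. The method sets, the isType marker, the Describe helper, and the map keys ("any", "bool", "number", "string") are all assumptions for illustration; only the type and interface names come from the description above.

```go
package main

import "fmt"

// Symbol is the root interface for anything the binder can name.
type Symbol interface {
	Name() string
}

// Type permits polymorphic treatment of classes and built-in types.
type Type interface {
	Symbol
	isType() // marker method (hypothetical)
}

// PrimitiveType is a built-in type such as number or string.
type PrimitiveType struct{ name string }

func (p *PrimitiveType) Name() string { return p.name }
func (p *PrimitiveType) isType()      {}

var (
	AnyType    = &PrimitiveType{"any"}
	BoolType   = &PrimitiveType{"bool"}
	NumberType = &PrimitiveType{"number"}
	StringType = &PrimitiveType{"string"}
)

// Primitives is keyed by each primitive's name.
var Primitives = map[string]Type{
	AnyType.Name():    AnyType,
	BoolType.Name():   BoolType,
	NumberType.Name(): NumberType,
	StringType.Name(): StringType,
}

// ArrayType wraps an element type.
type ArrayType struct{ Element Type }

func (a *ArrayType) Name() string { return a.Element.Name() + "[]" }
func (a *ArrayType) isType()      {}

// MapType wraps key and element types.
type MapType struct{ Key, Element Type }

func (m *MapType) Name() string { return "map[" + m.Key.Name() + "]" + m.Element.Name() }
func (m *MapType) isType()      {}

// Class is a user-defined type.
type Class struct{ name string }

func (c *Class) Name() string { return c.name }
func (c *Class) isType()      {}

// Describe switches on type tests, as the message anticipates doing
// "throughout the code a fair bit".
func Describe(s Symbol) string {
	switch t := s.(type) {
	case *ArrayType:
		return "array of " + t.Element.Name()
	case *MapType:
		return "map from " + t.Key.Name() + " to " + t.Element.Name()
	case Type:
		return "type " + t.Name()
	default:
		return "symbol " + s.Name()
	}
}

func main() {
	fmt.Println(Describe(&ArrayType{Element: StringType}))
	fmt.Println(Describe(&MapType{Key: StringType, Element: NumberType}))
	fmt.Println(Describe(&Class{name: "Cluster"}))
}
```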

I'm peeling off this part from my gigantic pending change, since this is
mostly standalone, and ideally leads to more independent chunks.
2017-01-19 12:02:49 -08:00
joeduffy
4874b2f7c6 Actually perform compilations from mu compile 2017-01-18 15:52:26 -08:00
joeduffy
25632886c8 Begin overhauling semantic phases
This change further merges the new AST and MuPack/MuIL formats and
abstractions into the core of the compiler.  A good amount of the old
code is gone now; I decided against ripping it all out in one fell
swoop so that I can methodically check that we are preserving all
relevant decisions and/or functionality we had in the old model.

The changes are too numerous to outline in this commit message;
however, here are the noteworthy ones:

    * Split up the notion of symbols and tokens, resulting in:

        - pkg/symbols for true compiler symbols (bound nodes)
        - pkg/tokens for name-based tokens, identifiers, constants

    * Several packages move underneath pkg/compiler:

        - pkg/ast becomes pkg/compiler/ast
        - pkg/errors becomes pkg/compiler/errors
        - pkg/symbols becomes pkg/compiler/symbols

    * pkg/ast/... becomes pkg/compiler/legacy/ast/...

    * pkg/pack/ast becomes pkg/compiler/ast.

    * pkg/options goes away, merged back into pkg/compiler.

    * All binding functionality moves underneath a dedicated
      package, pkg/compiler/binder.  The legacy.go file contains
      cruft that will eventually go away, while the other files
      represent a halfway point between new and old, but are
      expected to stay roughly in the current shape.

    * All parsing functionality is moved underneath a new
      pkg/compiler/metadata namespace, and we adopt new terminology
      "metadata reading" since real parsing happens in the MetaMu
      compilers.  Hence, Parser has become metadata.Reader.

    * In general, phases of the compiler no longer share access to
      the actual compiler.Compiler object.  Instead, shared state is
      moved to the core.Context object underneath pkg/compiler/core.

    * Dependency resolution during binding has been rewritten to
      the new model, including stashing bound package symbols in the
      context object, and detecting import cycles.

    * Compiler construction does not take a workspace object.  Instead,
      creation of a workspace is entirely hidden inside of the compiler's
      constructor logic.

    * There are three Compile* functions on the Compiler interface, to
      support different styles of invoking compilation: Compile() auto-
      detects a Mu package, based on the workspace; CompilePath(string)
      loads the target as a Mu package and compiles it, regardless of
      the workspace settings; and, CompilePackage(*pack.Package) will
      compile a pre-loaded package AST, again regardless of workspace.

    * Delete the _fe, _sema, and parsetree phases.  They are no longer
      relevant and the functionality is largely subsumed by the above.

...and so very much more.  I'm surprised I ever got this to compile again!
2017-01-18 12:18:37 -08:00
joeduffy
259135c15d Add a visitation API
This change introduces a new visitation API to the new MuIL AST.
The ast.Walk API takes an ast.Visitor implementation and walks the
tree in depth-first order, invoking the visitor along the way.

The visitor gets to choose whether to continue visitation (by returning
a non-nil visitor object), or to stop it (by returning nil).  The
visitation will proceed with that returned visitor, so that a visitor
can "swap out" the visitor used for child nodes if needed.

At the end, the PostVisit function is called, for any clean up logic.

Finally, the ast.Inspector type is available as a simple way of consing
up visitors simply using a function that returns a bool indicating
whether visitation should continue.
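
A sketch of what this kind of visitation API might look like, loosely modeled on the standard library's go/ast package. The Node, Leaf, and Block types are hypothetical, and two details are assumptions: PostVisit is called on the visitor returned by Visit after the children have been walked, and it is skipped when Visit returns nil.

```go
package main

import "fmt"

// Node is a hypothetical AST node that exposes its children.
type Node interface{ Children() []Node }

// Leaf has no children; Block contains other nodes.
type Leaf struct{ Label string }
type Block struct {
	Label string
	Body  []Node
}

func (l *Leaf) Children() []Node  { return nil }
func (b *Block) Children() []Node { return b.Body }

// Visitor is invoked for each node. Returning nil from Visit stops
// visitation below that node; returning a visitor continues with it.
type Visitor interface {
	Visit(n Node) Visitor
	PostVisit(n Node)
}

// Walk traverses the tree in depth-first order.
func Walk(v Visitor, n Node) {
	w := v.Visit(n)
	if w == nil {
		return // the visitor chose to stop here
	}
	for _, child := range n.Children() {
		Walk(w, child) // children use the returned visitor
	}
	w.PostVisit(n) // clean-up hook after the subtree
}

// Inspector adapts a plain function into a Visitor; the bool result
// indicates whether visitation should continue into the children.
type Inspector func(Node) bool

func (f Inspector) Visit(n Node) Visitor {
	if f(n) {
		return f
	}
	return nil
}

func (f Inspector) PostVisit(Node) {}

func main() {
	tree := &Block{Label: "root", Body: []Node{
		&Leaf{Label: "a"},
		&Block{Label: "skip", Body: []Node{&Leaf{Label: "b"}}},
		&Leaf{Label: "c"},
	}}
	Walk(Inspector(func(n Node) bool {
		switch t := n.(type) {
		case *Leaf:
			fmt.Println("leaf", t.Label)
		case *Block:
			return t.Label != "skip" // prune this subtree
		}
		return true
	}), tree)
}
```

Returning a different visitor from Visit is what lets a visitor "swap out" the logic used for child nodes.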
2017-01-18 07:56:53 -08:00
joeduffy
f4f9aee28a Add a log statement for unimplemented type nodes 2017-01-17 18:09:43 -08:00
joeduffy
2964bf6ad0 Implement diag.Diagable on MuIL AST nodes
This ensures that source context information flows automatically from
MuIL AST nodes to the various diag-related functions.
2017-01-17 18:01:11 -08:00
joeduffy
3ff9e83f63 Delete predefined types
This code doesn't make sense any longer; all "predefined types" will
simply be expressed as MuPack/MuIL abstractions.
2017-01-17 17:50:23 -08:00
joeduffy
0706472a1c Split pkg/ast; merge symbol code into pkg/symbols
This change helps move us one step closer to eliminating the old metadata-
based AST goo, and replacing it with MuPack/MuIL AST and symbol information.
In particular, all name/token "symbol" code -- things like identifiers,
package/member references, and version specs -- move out of the pkg/ast
package and into the top-level pkg/symbols package, alongside the existing
MuPack/MuIL symbol token types.
2017-01-17 17:41:28 -08:00
joeduffy
01658d04bb Begin merging MuPackage/MuIL into the compiler
This is the first change of many to merge the MuPack/MuIL formats
into the heart of the "compiler".

In fact, the entire meaning of the compiler has changed, from
something that took metadata and produced CloudFormation, into
something that takes MuPack/MuIL as input, and produces a MuGL
graph as output.  Although this process is distinctly different,
there are several aspects we can reuse, like workspace management,
dependency resolution, and some amount of name binding and symbol
resolution, just as a few examples.

An overview of the compilation process is available as a comment
inside of the compiler.Compile function, although it is currently
unimplemented.

The relationship between Workspace and Compiler has been semi-
inverted, such that all Compiler instances require a Workspace
object.  This is more natural anyway and moves some of the detection
logic "outside" of the Compiler.  Similarly, Options has moved to
a top-level package, so that Workspace and Compiler may share
access to it without causing package import cycles.

Finally, all that templating crap is gone.  This alone is cause
for mass celebration!
2017-01-17 17:04:15 -08:00
joeduffy
bbb60799f8 Add a Require family of functions to pkg/util/contract 2017-01-17 15:58:11 -08:00
joeduffy
0260aae0d4 Add the start of a pkg/graph package
This change introduces a pkg/graph package, which is, for now, very
barebones.  This is where the MuGL data types and functions will go.
2017-01-17 15:57:24 -08:00
joeduffy
33f67f16ae Remove the index entrypoint
For now, an index entrypoint in the MuPackage metadata isn't
required (and in fact, gets rejected as an unrecognized field).
2017-01-17 15:07:11 -08:00
joeduffy
bc376f8f8d Move pkg/pack/symbols to pkg/symbols 2017-01-17 15:06:53 -08:00
joeduffy
7ea5331f7f Merge pkg/pack/encoding into pkg/encoding 2017-01-17 14:58:45 -08:00
joeduffy
3a5217d722 Fix truncated output...again
The prior workaround to avoid truncated pending stdout writes, it
turns out, doesn't actually work.  (Piping output leads to more buffering
and asynchrony, and turned up additional problems.)  Digging through some
GitHub issues led me to these "best practices":

    https://nodejs.org/api/process.html#process_process_exit_code

    The reason this is problematic is because writes to process.stdout in
    Node.js are sometimes non-blocking and may occur over multiple ticks of
    the Node.js event loop. Calling process.exit(), however, forces the
    process to exit before those additional writes to stdout can be performed.

    Rather than calling process.exit() directly, the code should set the
    process.exitCode and allow the process to exit naturally by avoiding
    scheduling any additional work for the event loop.

This change adopts this part of the best practices, by simply setting
exitCode upon normal termination and letting the event loop quiesce.

Note that I am still not obeying the other part of the guidance:

    If it is necessary to terminate the Node.js process due to an error
    condition, throwing an uncaught error and allowing the process to
    terminate accordingly is safer than calling process.exit().

Somewhat confusingly, writes to process.stderr do not suffer from these
same problems, because writes to stderr are synchronous.  We prefer to
tear down the process gracefully, without an unhandled exception, and
we are okay losing some stdout writes as a result, given that all error-
related ones will have gone to stderr.
2017-01-17 14:54:15 -08:00
joeduffy
901c1cc6cf Add scaffolding for mu apply, compile, and plan
This adds scaffolding but no real functionality yet, as part of
marapongo/mu#41.  I am landing this now because I need to take a
not-so-brief detour to gut and overhaul the core of the existing
compiler (parsing, semantic analysis, binding, code-gen, etc),
including merging the new pkg/pack contents back into the primary
top-level namespaces (like pkg/ast and pkg/encoding).

After that, I can begin driving the compiler to achieve the
desired effects of mu compile, first and foremost, and then plan
and apply later on.
2017-01-17 14:40:55 -08:00
joeduffy
853a8f5197 Add support for YAML project files to MuJS
This change now recognizes Mu.yaml files, in addition to Mu.json,
from the MuJS compiler.  Not the most important thing in the world,
however all of our project files are in YAML and it's less work to
implement this than to convert them all to JSON ...
2017-01-17 12:09:57 -08:00
joeduffy
037f117303 Start a MuJS design document
This is admittedly very light at the moment, however, I had it on my
enlistment and want to checkpoint *something*, however minimal it is.
2017-01-17 11:42:49 -08:00
joeduffy
3db75444fc Clarify some language around package/module naming 2017-01-17 11:41:12 -08:00
joeduffy
15b043c0c1 Add a missing range check 2017-01-17 10:00:14 -08:00
joeduffy
c576e7cae4 Print the imports in mu describe 2017-01-17 09:55:58 -08:00
joeduffy
6769107c66 Track module imports
This change tracks the set of imported modules in the ast.Module
structure.  Although we can in principle gather up all imports simply
by looking through the fully qualified names, that's slightly hokey;
and furthermore, to properly initialize all modules, we need to know
in which order to do it (in case there are dependencies).  I briefly
considered leaving it up to MetaMu compilers to inject the module
initialization calls explicitly -- for infinite flexibility and perhaps
greater compatibility with the source languages -- however, I'd much
prefer that all Mu code use a consistent module initialization story.
Therefore, MetaMus declare the module imports, in order, and we will
evaluate the initializers accordingly.
2017-01-17 09:50:32 -08:00
joeduffy
dbc17656f9 Emit more types
This change emits more types.  In particular:

* Previously, only primitive types got emitted, yielding "any" for any
  custom types.  Now we emit custom types, including fully qualified
  module names for type references resolving to imported modules.

* Prior to this change, we erroneously used the type node on the function
  declaration itself as an approximation for return type.  To get the
  true return type, we need to dig through a few nodes, including the
  Declaration and Signature.  This change now properly emits return types.

This doesn't close out marapongo/mu#46; however, we are getting close.
2017-01-17 09:34:38 -08:00
joeduffy
cca8619351 Fix output truncation issue 2017-01-16 15:18:57 -08:00
joeduffy
a7b4d482a4 Mark ast.Export nodes as "public" in MuJS
By default, exports in MuJS should be available outside of the package.
This change adds the public modifier to them.
2017-01-16 15:07:14 -08:00