Commit graph

11 commits

Author SHA1 Message Date
joeduffy
8bdc81a4e1 Test function type token parsing
This change adds tests for basic function type token parsing.  It also
fixes two bugs around eating tokens and remainders.
2017-01-23 17:20:47 -08:00
joeduffy
4e78f10280 Add some decorated token tests
This isn't comprehensive yet; however, it caught two bugs:

1. parseNextType should operate on "rest" in most cases, not "tok".

2. We must eat the "]" map separator before moving on to the element type.
2017-01-23 15:14:04 -08:00
joeduffy
049cc7dbc8 Implement decorated token parsing and binding
Part of the token grammar permits so-called "decorated" types.  These
are tokens that are pointer, array, map, or function types.  For example:

* `*any`: a pointer to anything.
* `[]string`: an array of primitive strings.
* `map[string]number`: a map from strings to numbers.
* `(string,string)bool`: a function with two string parameters and a
      boolean return type.
* `[]aws:s3/Bucket`: an array of objects whose class is `Bucket` from
      the package `aws` and its module `s3`.

This change introduces this notion into the parsing and handling of
type tokens.  In particular, it uses recursive parsing to handle complex
nested structures, and the binder.bindTypeToken routine has been updated
to call out to these as needed, in order to produce the correct symbol.
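
As a minimal sketch of the recursive approach described above (not the actual
pkg/tokens code), the following parses the five example tokens.  The Kind and
Type types, ParseType, and the shape of parseNextType here are hypothetical;
the sketch also assumes that a plain token runs up to the next ",", ")" or "]"
delimiter and that a function's return type is optional.

```go
package main

import (
	"fmt"
	"strings"
)

// Kind and Type are hypothetical stand-ins, not the types in pkg/tokens.
type Kind string

const (
	Plain    Kind = "plain"
	Pointer  Kind = "pointer"
	Array    Kind = "array"
	Map      Kind = "map"
	Function Kind = "function"
)

type Type struct {
	Kind   Kind
	Name   string  // Plain: the undecorated token text
	Key    *Type   // Map: key type
	Elem   *Type   // Pointer, Array, Map: element type
	Params []*Type // Function: parameter types
	Return *Type   // Function: nil when there is no return type
}

// parseNextType parses one type from the front of tok and returns it along
// with the unconsumed remainder, so callers always continue from "rest".
func parseNextType(tok string) (*Type, string, error) {
	switch {
	case strings.HasPrefix(tok, "*"):
		elem, rest, err := parseNextType(tok[1:])
		if err != nil {
			return nil, "", err
		}
		return &Type{Kind: Pointer, Elem: elem}, rest, nil
	case strings.HasPrefix(tok, "[]"):
		elem, rest, err := parseNextType(tok[2:])
		if err != nil {
			return nil, "", err
		}
		return &Type{Kind: Array, Elem: elem}, rest, nil
	case strings.HasPrefix(tok, "map["):
		key, rest, err := parseNextType(tok[len("map["):])
		if err != nil {
			return nil, "", err
		}
		if !strings.HasPrefix(rest, "]") {
			return nil, "", fmt.Errorf("expected ']' after map key, got %q", rest)
		}
		// Eat the "]" separator before moving on to the element type.
		elem, rest, err := parseNextType(rest[1:])
		if err != nil {
			return nil, "", err
		}
		return &Type{Kind: Map, Key: key, Elem: elem}, rest, nil
	case strings.HasPrefix(tok, "("):
		fn, rest := &Type{Kind: Function}, tok[1:]
		for !strings.HasPrefix(rest, ")") {
			if len(fn.Params) > 0 {
				if !strings.HasPrefix(rest, ",") {
					return nil, "", fmt.Errorf("expected ',' or ')', got %q", rest)
				}
				rest = rest[1:] // eat the "," between parameters
			}
			param, r, err := parseNextType(rest)
			if err != nil {
				return nil, "", err
			}
			fn.Params, rest = append(fn.Params, param), r
		}
		rest = rest[1:] // eat the ")"
		if rest != "" && strings.IndexByte(",)]", rest[0]) < 0 {
			ret, r, err := parseNextType(rest)
			if err != nil {
				return nil, "", err
			}
			fn.Return, rest = ret, r
		}
		return fn, rest, nil
	}
	// Plain token: everything up to the next delimiter (an assumption).
	end := strings.IndexAny(tok, ",)]")
	if end < 0 {
		end = len(tok)
	}
	if end == 0 {
		return nil, "", fmt.Errorf("expected a type at %q", tok)
	}
	return &Type{Kind: Plain, Name: tok[:end]}, tok[end:], nil
}

// ParseType parses a whole token and rejects trailing input.
func ParseType(tok string) (*Type, error) {
	t, rest, err := parseNextType(tok)
	if err == nil && rest != "" {
		return nil, fmt.Errorf("unexpected trailing text %q", rest)
	}
	return t, err
}

func main() {
	for _, tok := range []string{"*any", "[]string", "map[string]number",
		"(string,string)bool", "[]aws:s3/Bucket"} {
		if t, err := ParseType(tok); err != nil {
			fmt.Println(tok, "error:", err)
		} else {
			fmt.Println(tok, "=>", t.Kind)
		}
	}
}
```

Returning the remainder from every call is what lets nested forms such as
`map[string][]aws:s3/Bucket` compose without any lookahead bookkeeping.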
2017-01-23 14:48:55 -08:00
joeduffy
b724857ae1 Properly enforce accessibility
This change fixes a few mistakes in the old way of checking accessibility,
and adds support for protected class visibility.
2017-01-22 10:09:55 -08:00
joeduffy
7c5241978e Perform member lookup and binding
2017-01-22 09:45:58 -08:00
joeduffy
bd231ff624 Implement fmt.Stringer on token types
2017-01-21 12:25:59 -08:00
joeduffy
3518dce92f Test token manipulation
This change includes some tests for token parsing and conversions.  It
also fixes a bug where we treated Type tokens like ClassMembers, when
we ought to have been treating them like ModuleMembers.
2017-01-21 11:08:25 -08:00
joeduffy
a58cd7e2a0 Begin typechecking
This change performs typechecking during binding.  This is less about
typechecking per se -- since higher level languages will have presumably
given us well-typed IL -- and more about preparing the AST so that we
can evaluate the fully bound nodes to produce a MuGL graph.  It also
serves as a "verifier" for the incoming MuIL, however.

This is clearly incomplete, as the dozens of TODOs will make obvious.
But it's a clean checkpoint that does enough interesting typechecking
that I am landing it now.
2017-01-21 09:08:35 -08:00
joeduffy
2c0266c9e4 Clean up package URL logic
This change rearranges the old way we dealt with URLs.  In the old system,
virtually every reference to an element, including types, was fully qualified
with a possible URL-like reference.  (The old pkg/tokens/Ref type.)  In the
new model, only dependency references are URL-like.  All maps and references
within the MuPack/MuIL format are token and name based, using the new
pkg/tokens/Token and pkg/tokens/Name family of related types.

As such, this change renames Ref to PackageURLString, and RefParts to
PackageURL.  (The convenient name is given to the thing with "more" structure,
since we prefer to deal with structured types and not strings.)  It moves
out of the pkg/tokens package and into pkg/pack, since it is exclusively
there to support package resolution.  Similarly, the Version, VersionSpec,
and related types move out of pkg/tokens and into pkg/pack.

This change cleans up the various binder, package, and workspace logic.
Most of these changes are a natural fallout of this overall restructuring,
although in a few places we remained sloppy about the difference between
Token, Name, and URL.  Now the type system supports these distinctions and
forces us to be more methodical about any conversions that take place.
2017-01-20 11:46:36 -08:00
joeduffy
7b0dc8ee8d Overhaul names versus tokens
I was sloppy in my use of names versus tokens in the original AST.
Now that we're actually binding things to concrete symbols, etc., we
need to be more precise.  In particular, names are just identifiers
that must be "interpreted" in a given lexical context for them to
make any sense; whereas, tokens stand alone and can be resolved without
context other than the set of imported packages, modules, and overall
module structure.  As such, names are much simpler than tokens.

As explained in the comments, tokens.Names are simple identifiers:

    Name = [A-Za-z_][A-Za-z0-9_]*

and tokens.QNames are fully qualified identifiers delimited by "/":

    QName = [ <Name> "/" ]* <Name>

The legal grammar for a token depends on the subset of symbols that
token is meant to represent.  However, the most general case, that
accepts all specializations of tokens, is roughly as follows:

    Token       = <Name> |
                  <PackageName>
                    [ ":" <ModuleName>
                        [ "/" <ModuleMemberName>
                            [ "." <ClassMemberName> ]
                        ]
                    ]

where:

    PackageName         = <QName>
    ModuleName          = <QName>
    ModuleMemberName    = <Name>
    ClassMemberName     = <Name>

Please refer to the comments in pkg/tokens/tokens.go for more details.
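
To make the grammar concrete, here is a rough sketch of Name and QName
validation plus a splitter for the module-member form of a token.  This is not
the code in pkg/tokens/tokens.go: IsName, IsQName, Parts, and
SplitModuleMember are hypothetical names, and the sketch assumes that the
first ":" ends the package name and the last "/" separates the module name
from the member name.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// nameRegexp mirrors the grammar: Name = [A-Za-z_][A-Za-z0-9_]*
var nameRegexp = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]*$`)

// IsName reports whether s is a simple identifier.
func IsName(s string) bool { return nameRegexp.MatchString(s) }

// IsQName reports whether s is one or more Names delimited by "/".
func IsQName(s string) bool {
	for _, part := range strings.Split(s, "/") {
		if !IsName(part) {
			return false
		}
	}
	return true
}

// Parts is a hypothetical holder for the pieces of a module-member token.
type Parts struct {
	Package, Module, Member string
}

// SplitModuleMember splits "<PackageName>:<ModuleName>/<ModuleMemberName>".
// Assumption: because module names are QNames and may contain "/", the
// member name is taken to be whatever follows the last "/".
func SplitModuleMember(tok string) (Parts, error) {
	colon := strings.Index(tok, ":")
	if colon < 0 {
		return Parts{}, fmt.Errorf("%q has no ':' package delimiter", tok)
	}
	pkg, rest := tok[:colon], tok[colon+1:]
	slash := strings.LastIndex(rest, "/")
	if slash < 0 {
		return Parts{}, fmt.Errorf("%q has no '/' member delimiter", tok)
	}
	mod, member := rest[:slash], rest[slash+1:]
	if !IsQName(pkg) || !IsQName(mod) || !IsName(member) {
		return Parts{}, fmt.Errorf("%q contains an illegal name", tok)
	}
	return Parts{Package: pkg, Module: mod, Member: member}, nil
}

func main() {
	for _, tok := range []string{"aws:s3/Bucket", "acme/tools:net/http/Server", "aws:s3"} {
		parts, err := SplitModuleMember(tok)
		fmt.Printf("%s => %+v (err: %v)\n", tok, parts, err)
	}
}
```

Note that a bare `aws:s3` is rejected here only because this sketch targets
module members; a module token of that shape is legal under the general grammar.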
2017-01-19 17:57:20 -08:00
joeduffy
25632886c8 Begin overhauling semantic phases
This change further merges the new AST and MuPack/MuIL formats and
abstractions into the core of the compiler.  A good amount of the old
code is gone now; I decided against ripping it all out in one fell
swoop so that I can methodically check that we are preserving all
relevant decisions and/or functionality we had in the old model.

The changes are too numerous to outline in this commit message;
however, here are the noteworthy ones:

    * Split up the notion of symbols and tokens, resulting in:

        - pkg/symbols for true compiler symbols (bound nodes)
        - pkg/tokens for name-based tokens, identifiers, constants

    * Several packages move underneath pkg/compiler:

        - pkg/ast becomes pkg/compiler/ast
        - pkg/errors becomes pkg/compiler/errors
        - pkg/symbols becomes pkg/compiler/symbols

    * pkg/ast/... becomes pkg/compiler/legacy/ast/...

    * pkg/pack/ast becomes pkg/compiler/ast.

    * pkg/options goes away, merged back into pkg/compiler.

    * All binding functionality moves underneath a dedicated
      package, pkg/compiler/binder.  The legacy.go file contains
      cruft that will eventually go away, while the other files
      represent a halfway point between new and old, but are
      expected to stay roughly in the current shape.

    * All parsing functionality is moved underneath a new
      pkg/compiler/metadata namespace, and we adopt new terminology
      "metadata reading" since real parsing happens in the MetaMu
      compilers.  Hence, Parser has become metadata.Reader.

    * In general phases of the compiler no longer share access to
      the actual compiler.Compiler object.  Instead, shared state is
      moved to the core.Context object underneath pkg/compiler/core.

    * Dependency resolution during binding has been rewritten to
      the new model, including stashing bound package symbols in the
      context object, and detecting import cycles.

    * Compiler construction does not take a workspace object.  Instead,
      creation of a workspace is entirely hidden inside of the compiler's
      constructor logic.

    * There are three Compile* functions on the Compiler interface, to
      support different styles of invoking compilation: Compile() auto-
      detects a Mu package, based on the workspace; CompilePath(string)
      loads the target as a Mu package and compiles it, regardless of
      the workspace settings; and, CompilePackage(*pack.Package) will
      compile a pre-loaded package AST, again regardless of workspace.

    * Delete the _fe, _sema, and parsetree phases.  They are no longer
      relevant and the functionality is largely subsumed by the above.
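
The import cycle detection mentioned in the list above could be sketched
roughly as follows.  This is an illustration only, not the binder's actual
logic: findCycle and the deps map (package name to the names it imports) are
hypothetical, and the technique shown is an ordinary depth-first search that
tracks which packages are still in progress.

```go
package main

import "fmt"

// state tracks how far the search has progressed for a package.
type state int

const (
	unvisited state = iota
	visiting        // on the current import chain
	done            // fully explored, known to be cycle-free
)

// findCycle walks the imports reachable from root and returns the first
// import chain that loops back on itself, or nil if there is none.
// deps is a hypothetical map from a package name to the names it imports.
func findCycle(deps map[string][]string, root string) []string {
	status := map[string]state{}
	var chain []string

	var visit func(pkg string) []string
	visit = func(pkg string) []string {
		if status[pkg] == done {
			return nil
		}
		if status[pkg] == visiting {
			// pkg is already on the chain, so the chain from its first
			// occurrence back around to pkg is the cycle.
			for i, p := range chain {
				if p == pkg {
					return append(append([]string{}, chain[i:]...), pkg)
				}
			}
		}
		status[pkg] = visiting
		chain = append(chain, pkg)
		for _, dep := range deps[pkg] {
			if cycle := visit(dep); cycle != nil {
				return cycle
			}
		}
		chain = chain[:len(chain)-1]
		status[pkg] = done
		return nil
	}
	return visit(root)
}

func main() {
	deps := map[string][]string{
		"app":  {"aws", "util"},
		"aws":  {"util"},
		"util": {"app"}, // closes the loop back to app
	}
	fmt.Println(findCycle(deps, "app"))
}
```

Marking a package as done once its imports are explored keeps shared
dependencies (a diamond) from being reported as cycles or re-walked.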

...and so very much more.  I'm surprised I ever got this to compile again!
2017-01-18 12:18:37 -08:00