This change adds a --dot option to the eval command, which will simply
output the MuGL graph using the DOT language. This allows you to use
tools like Graphviz to inspect the resulting graph, including using the
`dot` command to generate images (like PNGs and whatnot).

For example, the simple MuGL program:

    class C extends mu.Resource {...}
    class B extends mu.Resource {...}
    class A extends mu.Resource {
        private b: B;
        private c: C;
        constructor() {
            this.b = new B();
            this.c = new C();
        }
    }
    let a = new A();

Results in the following DOT file, from `mu eval --dot`:

    strict digraph {
        Resource0 [label="A"];
        Resource0 -> {Resource1 Resource2}
        Resource1 [label="B"];
        Resource2 [label="C"];
    }

Eventually the auto-generated ResourceN identifiers will go away in
favor of using true object monikers (marapongo/mu#76).
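
As a rough illustration of the kind of output involved, here is a minimal Go sketch that prints a small resource graph in DOT. The `node` type and the `toDOT` function are hypothetical names invented for this example, not the actual `mu` implementation, and the label quoting relies on Go's `%q`, which only matches DOT quoting for simple labels.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical stand-in for a MuGL vertex: a label plus outgoing edges.
type node struct {
	label string
	deps  []int // indices of the nodes this node points to
}

// toDOT renders nodes as a strict digraph, naming each vertex ResourceN
// after its index in the slice.
func toDOT(nodes []node) string {
	var sb strings.Builder
	sb.WriteString("strict digraph {\n")
	for i, n := range nodes {
		fmt.Fprintf(&sb, "    Resource%d [label=%q];\n", i, n.label)
		if len(n.deps) > 0 {
			targets := make([]string, len(n.deps))
			for j, d := range n.deps {
				targets[j] = fmt.Sprintf("Resource%d", d)
			}
			fmt.Fprintf(&sb, "    Resource%d -> {%s}\n", i, strings.Join(targets, " "))
		}
	}
	sb.WriteString("}\n")
	return sb.String()
}
```

Calling `toDOT` with a node labeled "A" that depends on nodes "B" and "C" should reproduce the shape of the DOT file shown above.
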
Glog doesn't print the stack traces for all goroutines when
--logtostderr is enabled, the way it normally does. This makes
debugging more complex in some cases, so we now dump them manually.
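
One plausible way to dump every goroutine's stack by hand in Go is `runtime.Stack` with its second argument set to true. The sketch below is only an assumption about the approach; the helper name `allStacks` and the buffer sizes are made up for illustration.

```go
package main

import "runtime"

// allStacks returns the stack traces of all goroutines. It grows the buffer
// until runtime.Stack no longer fills it completely, since a full buffer
// means the output may have been truncated.
func allStacks() string {
	buf := make([]byte, 64*1024)
	for {
		n := runtime.Stack(buf, true) // true: include every goroutine
		if n < len(buf) {
			return string(buf[:n])
		}
		buf = make([]byte, 2*len(buf))
	}
}
```

The returned string can then be written to stderr or passed to the logger just before exiting.
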
This change renames the old Error type to Exception, which is more consistent
with our AST and other nodes anyway, and introduces a new Error type ("<error>")
to use when something fails during type-checking or binding.
The old way led to errors like:

    error: MU504: tags.ts:32:18: Symbol 'Tag:push' not found
    error: MU522: tags.ts:32:8: Cannot invoke a non-function; 'any' is not a function

This is because of cascading errors during type-checking; the symbol-not-found
error means we cannot produce the right type for the function invocation that
consumes it. But the 'any' part is weird. Instead, this new change produces:

    error: MU504: tags.ts:32:18: Symbol 'Tag:push' not found
    error: MU522: tags.ts:32:8: Cannot invoke a non-function; '<error>' is not a function

It's slightly better. And furthermore, it gives us a leg to stand on someday
should we decide to get smarter about detecting cascades and avoiding issuing
the secondary error messages (we can just check for the Error type).
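
To make the possible future cascade check concrete, here is a minimal Go sketch. It is not what the compiler does today; `Type`, `ErrorType`, and `invokeDiag` are hypothetical names, and the only idea taken from the text is "check for the Error type before issuing a secondary message".

```go
package main

// Type is a hypothetical, minimal type representation.
type Type struct{ Name string }

// ErrorType is a sentinel assigned to expressions whose type could not be
// computed because an earlier error was already reported.
var ErrorType = &Type{Name: "<error>"}

// invokeDiag returns the diagnostic text for invoking a value of type t,
// or "" if no diagnostic should be issued.
func invokeDiag(t *Type, isFunc bool) string {
	if t == ErrorType {
		return "" // an earlier error already explains this; don't cascade
	}
	if !isFunc {
		return "Cannot invoke a non-function; '" + t.Name + "' is not a function"
	}
	return ""
}
```

With a check like this, the MU522 line in the example above would be suppressed and only the MU504 error would remain.
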
This change fixes a few more phasing issues in the compiler. Namely,
it now splits all passes into three distinct phases:
1. Declarations: simply populating names.
2. Definitions: chasing down any references to other names from those
   declared entities. For instance, base classes, other modules, etc.
3. Bodies: fully type-checking everything else, which will depend
   upon both declarations and definitions being fully present.
This changes a few things with dependency probing:
1) Probe for Mupack files, not Mufiles.
2) Substitute defaults in the PackageURL before probing.
3) Trace the full search paths when an import fails to resolve.
This will help diagnose dependency resolution issues.
This adds support for sugared "*" semvers, which is a shortcut for
">=0.0.0" (in other words, any version matches). This is a minor part
of marapongo/mu#18. I'm just doing it now as a convenience for dev
scenarios, as we start to do real packages and dependencies between them.
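
The desugaring itself is tiny. A minimal sketch, assuming it happens as a string rewrite before the range is parsed (the function name `desugarVersion` is hypothetical):

```go
package main

// desugarVersion expands the "*" shorthand into its explicit range form
// and returns every other version spec unchanged.
func desugarVersion(spec string) string {
	if spec == "*" {
		return ">=0.0.0"
	}
	return spec
}
```
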
This change permits object literals with the type of `any`; although
they will typically get coerced to record/interface types prior to use,
there is no reason to ban `any`. In fact, it is the idiomatic way of
encoding "objects as bags of properties" that is common to dynamic
languages like JavaScript, Python, etc.
This change adds support for loading from export locations. All we
need to do is keep chasing the referent pointer until we bottom out
on an actual method, property, etc.
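
The pointer chasing can be sketched as a simple loop. The `Symbol` type and `resolve` function below are hypothetical, and the cycle guard is an extra assumption that the text above does not mention.

```go
package main

// Symbol is a hypothetical symbol; an export has a non-nil Referent.
type Symbol struct {
	Name     string
	Referent *Symbol // nil for a concrete method, property, etc.
}

// resolve follows Referent links until it reaches a symbol that is not an
// export. It returns nil if the chain loops back on itself.
func resolve(s *Symbol) *Symbol {
	seen := map[*Symbol]bool{}
	for s != nil && s.Referent != nil {
		if seen[s] {
			return nil // cycle among exports; nothing to bottom out on
		}
		seen[s] = true
		s = s.Referent
	}
	return s
}
```
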
This change fixes a whole host of issues with our current token binding
logic. There are two primary aspects of this change:
First, the prior token syntax was ambiguous, due to our choice of
delimiter characters. For instance, "/" could be used as a module
member delimiter and was also a valid character for sub-modules.
The result is that we could not look at a token and know for certain
which kind it is. There was also some annoyance with "." being the
delimiter for class members in addition to being the leading character
for special names like ".this", ".super", and ".ctor". Now, we just use
":" as the delimiter character for everything. The result is unambiguous.
Second, the simplistic token table lookup really doesn't work. This is
for three reasons: 1) decorated types like arrays, maps, pointers, and
functions shouldn't need token lookup in the classical sense; 2) largely
because of decorated naming, the mapping of token pieces to symbolic
information isn't straightforward and requires parsing; 3) default modules
need to be expanded and the old method only worked for simple cases and,
in particular, would not work when combined with decorated names.
Previously we asserted that typemap entries are never nil. It turns
out we represent "void" as the absence of a type in MuIL, and so we need
to permit these for constructors, etc.
This change further rearranges the phasing of binding to account for
the fact that class definitions may freely reference exports, which won't
be known until after binding class names. Therefore, we bind class names
first and then the definitions (function signatures and variables).
This brings the AWS MuPackage's number of verification errors down to 84
from 136.
The old method of specifying a default module for a package was using
a bit on the module definition itself. The new method is to specify the
module name as part of the package definition itself.
The new approach makes more sense for a couple of reasons. 1) It doesn't
make sense to have more than one "default" for a given package, and yet
the old model didn't prevent this; the new one prevents it by construction.
2) The defaultness of a module is really an aspect of the package, not the
module, which is generally unaware of the package containing it.
The other reason is that I'm auditing the code for nondeterministic map
enumerations, and this came up, which simply pushed me over the edge.
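
For context on the audit: Go does not specify map iteration order, so any output produced by ranging over a map can differ from run to run. A common remedy, sketched here with a hypothetical `sortedKeys` helper, is to collect and sort the keys first.

```go
package main

import "sort"

// sortedKeys returns the keys of m in ascending order so that callers can
// iterate deterministically instead of relying on map iteration order.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}
```
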
The prior module binding logic was naive. It turns out that, because of inter-module
references within the same package, the order in which we bind modules and their members
within a package matters. This leads to a multi-pass model, more like Java, C#, or JavaScript,
versus a classical single-pass C/C++ model (which requires forward declarations).
To do this, we bind things in this order within a package:
* First, we add an entry for each module. It is largely empty, but at least names resolve.
* Second, we bind all imports. Since all modules are known, all inter-module tokens resolve.
* Next, we can bind class symbols. Note that we must do this prior to module properties,
  methods, and exports, since they might be referenced and need to exist first. As before,
  we do not yet bind function bodies, since those can reference anything.
* Now we bind all module "members", meaning any module scoped properties and methods. Again,
  we do not bind function bodies just yet. This is just a symbolic binding step.
* Next, we can finally bind exports. Because exports may reference any members, all of the
  above must have been done first. But all possibly exported symbols are now resolved.
* Finally, we can bind function bodies.
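
The essential property of the ordering above is that each pass finishes over every module in the package before the next pass starts. A minimal Go sketch of that shape follows; the `phase` type and `bindPackage` function are hypothetical, and the real binder certainly carries far more state.

```go
package main

// phase is one pass that is applied to every module in a package.
type phase struct {
	name string
	run  func(module string) // may be nil in this sketch
}

// bindPackage runs each phase to completion over all modules before moving
// on to the next phase, and returns the visit order for inspection.
func bindPackage(modules []string, phases []phase) []string {
	var trace []string
	for _, p := range phases {
		for _, m := range modules {
			if p.run != nil {
				p.run(m)
			}
			trace = append(trace, p.name+":"+m)
		}
	}
	return trace
}
```

Passing phases named after the bullets above (module entries, imports, classes, members, exports, bodies) would visit all modules for one step before any module sees the next step, which is what lets inter-module references resolve.
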
This adds a `mu verify` command that simply runs the verification
pass against a MuPackage and its MuIL. This is handy for compiler
authors to verify that the right stuff is getting emitted.
This is pretty worthless, but will help me debug some issues locally.
Eventually we want MuGL to be fully serializable, including the option
to emit DOT files.
This change actually invokes the OnVariableAssign interpreter hook
at the right places. (Renamed from OnAssignProperty, as it will now
handle all variable assignments, and not just properties.) This
requires tracking a bit more information about l-values so that we
can accurately convey the target object and symbol associated with
the assignment (resulting in the new "location" struct type).
This change lowers the information collected about resource allocations
and dependencies into the MuGL graph representation.
As part of this, we've moved pkg/compiler/eval out into its own top-level
package, pkg/eval, and split up its innards into a smaller sub-package,
pkg/eval/rt, that contains the Object and Pointer abstractions. This
permits the graph generation logic to use it without introducing cycles.
This change refactors the interpreter hooks into a first class interface
with many relevant event handlers (including enter/leave functions for
packages, modules, and functions -- something necessary to generate object
monikers). It also includes a rudimentary start for tracking actual object
allocations and their dependencies, a step towards creating a MuGL graph.
This change adds a new Allocator type that will handle all object allocations
that occur during evaluation. This allows us to hook into interesting lifecycle
events that are required in order to create a MuGL resource graph.
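
A minimal sketch of the idea, assuming a single allocation callback: the `Object`, `Allocator`, `NewAllocator`, and `New` names here are illustrative only and are not taken from the actual package.

```go
package main

// Object is a hypothetical runtime object, identified only by its type name.
type Object struct{ Type string }

// Allocator funnels every allocation through one place so that interested
// parties can observe it.
type Allocator struct {
	onNew func(*Object) // called after each allocation; may be nil
}

// NewAllocator creates an allocator that reports each new object to onNew.
func NewAllocator(onNew func(*Object)) *Allocator {
	return &Allocator{onNew: onNew}
}

// New allocates an object of the given type and fires the hook.
func (a *Allocator) New(typ string) *Object {
	o := &Object{Type: typ}
	if a.onNew != nil {
		a.onNew(o)
	}
	return o
}
```

A graph builder could register itself as the hook and record one vertex per allocated resource.
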
This change mirrors the same change we made for local variable scope
addressing. GetPropertyPointer is all about getting the property's
address, but its name could be confused as meaning the property's value
is a pointer, so we will call this method GetPropertyAddr instead.
This change dumps the evaluation state after evaluation completes, at
log-level 5. This includes which modules and classes were initialized,
in addition to the values for all global variables.
In addition to this, we rename a few things:
* Rename Object's Data field to Value.
* Rename the Object.T() methods to Object.TValue(). This more clearly
  indicates what they are doing (i.e., fetching the value from the object)
  and also avoids object.String() conflicting with fmt.Stringer's String().
* Rename Reference to Pointer, so it's consistent with everything else.
* Rename the GetValueReference/InitValueReference/etc. family of methods
  to GetValueAddr/InitValueAddr/etc., since this reflects what they are
  actually doing: manipulating a variable slot's address.
* Assert that module/class initializer functions returned nil.
* Evaluate the LHS as an l-value for binary assignment operators.
* Also consider two nulls to be equal, always.
This change implements unary operator evaluation, including prefix/postfix
assignment operators. Doing this required implementing l-values properly
in the interpreter; namely, load location and dereference, when used in an
l-value position, result in a pointer to the location, while otherwise they
(implicitly, in load location's case) dereference the location and yield a value.
This change uses an enum rather than a set of bools for unwind's inner
state, makes its fields private, and adds accessor methods. This makes
it easier to protect invariants and rule out wonky states like being a
break and a throw simultaneously. Now that Unwind is part of the public
API (vs being private as it used to be), this is really a requirement
(and was obviously a good idea regardless). Thanks to Eric for CR feedback.
This change adds some fail statements into places that aren't yet
implemented during binding and evaluation so I can quickly catch
stuff that isn't working as expected yet.