Lesson 7.3

Building the Resolver

DFS, the dependency graph, cycle detection, and topological sort. How src/module/module.go works.

build go 15 min read

The output: a Graph.

The resolver's output is a Graph struct — the complete picture of all reachable modules:

type Module struct {
    Path     string          // absolute file path (canonical key)
    AST      *syntax.Program // parsed AST
    Exports  map[string]bool // names declared via export
    DepPaths []string        // absolute paths of direct dependencies
}

type Graph struct {
    Entry   string             // absolute path of entry module
    Modules map[string]*Module // path -> Module (the full set)
    Order   []string           // topological order: deps first, entry last
}

Graph.Order is the key output. It tells the type checker and code generator exactly what order to process modules. Walking Order left-to-right guarantees every dependency is already processed when you reach the modules that depend on it.

Absolute paths are used as canonical keys throughout. Two use statements with different relative strings but resolving to the same file produce the same key — so the module is only processed once.

The algorithm: depth-first search.

module.Build(entryPath) runs a DFS starting from the entry file. DFS naturally produces topological order as a side effect: a node is "done" only after all its dependencies are done, so appending to the order list on completion gives deps-before-dependents.

The DFS uses two sets to track state:

gray

Currently being visited.

A module is gray while its subtree is being explored. If the DFS encounters a gray node, there is a cycle — the graph has a path from that node back to itself.

black

Fully processed.

A module is black when its entire subtree has been visited, all its dependencies are in the graph, and it has been appended to Graph.Order. Revisiting a black node is a no-op.

gray := make(map[string]bool)   // currently on the DFS stack
black := make(map[string]bool)  // fully processed

var visit func(path string) error
visit = func(path string) error {
    if black[path] { return nil }  // already done, skip
    if gray[path]  { /* cycle! */ } // back edge = cycle

    gray[path] = true
    // ... parse, recurse into imports ...
    delete(gray, path)
    black[path] = true
    g.Order = append(g.Order, path) // post-order = deps before dependents
    return nil
}

The append to g.Order happens at the very end of visit, after all recursive calls return. This is post-order traversal — guaranteeing deps-before-dependents.

Cycle detection with readable error messages.

When a gray node is re-encountered, there is a cycle. A bare "circular import" error is unhelpful — the developer needs to know which files form the cycle.

The resolver maintains a chain slice that tracks the current DFS path. When a cycle is detected, it slices chain from the first occurrence of the repeated path and formats the cycle:

// chain = ["main.monk", "a.monk", "b.monk"] when b.monk imports a.monk:
// cycle message: "circular import: a.monk -> b.monk -> a.monk"

cycle := append(chain[start:], path)  // the loop
names := make([]string, len(cycle))
for i, p := range cycle {
    names[i] = filepath.Base(p)        // just the filename, not the full path
}
return fmt.Errorf("circular import: %s", strings.Join(names, " -> "))

The error message shows exactly which files form the cycle, in the order they were imported. The developer can immediately see which import to break.

The chain is maintained by appending before the recursive call and slicing after (via defer). This keeps the slice in sync with the call stack without any additional bookkeeping.

Collecting exports from the AST.

After parsing a file, the resolver walks its AST to collect exported names. This happens in collectExportNames — a simple one-pass AST walk:

for _, stmt := range prog.Stmts {
    es, ok := stmt.(*syntax.ExportStmt)
    if !ok { continue }

    switch inner := es.Stmt.(type) {
    case *syntax.VarDeclStmt:
        exports[inner.Name] = true    // export let add = ...
    case *syntax.TypeDeclStmt:
        exports[inner.Name] = true    // export type Point = ...
    case *syntax.ExprStmt:
        // bare "export name" — inner expression is an IdentExpr
        if id, ok := inner.Expr.(*syntax.IdentExpr); ok {
            exports[id.Name] = true
        }
    }
}

The result is stored in Module.Exports. The resolver then validates every non-wildcard import against this map immediately, before any type-checking:

if !use.Star {
    for _, name := range use.Names {
        if !dep.Exports[name] {
            return fmt.Errorf("%s:%d: module %q does not export %q",
                filepath.Base(path), use.Line, use.Source, name)
        }
    }
}

This catches misspelled export names and missing exports before the expensive type-checking pass, with an error that includes the file name and line number.

Path resolution.

ResolvePath(source, importerPath) turns a module source string into an absolute file path:

ResolvePath("./utils", "/project/main.monk")
    // -> "/project/utils.monk"

ResolvePath("./lib/math", "/project/main.monk")
    // -> "/project/lib/math.monk"

ResolvePath("./utils.monk", "/project/main.monk")
    // -> "/project/utils.monk"  (extension already present, no double-add)

The function enforces three rules:

1

The source must start with . — no bare names or absolute paths.

2

The .monk extension is added if missing.

3

The file must exist. If it doesn't, a specific error is returned immediately — not a Go "file not found" message, but a Monk error with the module name.

How the rest of the pipeline uses the graph.

types.CheckModules and codegen.GenerateModules both receive the same *Graph and walk Graph.Order left-to-right:

// In types.CheckModules:
for _, path := range graph.Order {
    mod := graph.Modules[path]
    // check mod.AST with bindings from previously-processed modules
    // add exported names to the cross-module scope
}

// In codegen.GenerateModules:
for _, path := range graph.Order {
    mod := graph.Modules[path]
    // emit C for mod.AST
    // exported functions/vars are already emitted when dependents reference them
}

Because dependencies always appear earlier in Order, every name referenced by a module is already in the type map (or already emitted as C) by the time that module is processed. No forward-declaration problem, no two-pass needed.

Key takeaways

1

Post-order DFS naturally produces a topological sort. Deps-before-dependents falls out of the algorithm without a separate sorting step.

2

Gray/black coloring detects cycles in one pass with no extra memory beyond the two maps and the chain slice.

3

Export validation happens at graph-build time — before the type checker runs. Misspelled import names fail fast.

4

Absolute paths as keys mean two different relative strings pointing to the same file are deduplicated automatically.