[GSoC 2022] Ham: Final Report

Blog post by dominicm on Sat, 2022-09-10 10:42

Hello everyone. Thank you for having me the past few months; it’s been a busy, fun ride. This is the final report for Ham, a replacement for the Jam build system.

I’d like to thank Stephan Aßmus for taking the time to mentor me, and the rest of the Haiku community for being responsive and receptive to Ham’s development.

You can find the Ham repo on GitHub, as well as a project board for current issues. If you have any questions, I’m always available on the forum/email/IRC, so give me a ping.

Commits

Original architecture:

  • First GSoC commit
  • Last GSoC commit
  • All changes
    • 79 commits; 19,123 additions; 10,679 deletions (these are inflated due to formatting/renaming, see below)
  • Code changes; this comparison misses some genuine additions, but is past most of the formatting/other bulk changes so it’s easier to see the work done
    • 47 commits; 2,548 additions; 695 deletions

New architecture:

Summary

This project is a continuation of Ingo Weinhold’s original Ham project, whose development paused in 2013. Over the course of the GSoC period, I completed the evaluation portion of Ham, which converts a Jam build system into a series of commands to run. Some minor bugs remain in the command execution code and still block the Haiku build; these will be worked out following the GSoC period. I also wrote a Ham language specification. It’s purposely detailed, but I plan to make a simpler tutorial sequence for those learning the language.

However, the C++ ecosystem has changed significantly since work began on Ham 12 years ago. Namely, there are better parsing tools available, C++17/20 added APIs that are critically important for Ham, and Ninja provides impressive build capabilities for projects that compile down to it. After discussing it with my mentor Stephan, we decided to lay the groundwork for porting to a new architecture that brings more user-facing benefits, including but not limited to:

  • Source-level errors
  • Advanced debugging output such as:
    • Parsing traces and an AST graph
    • Graph of target dependencies
  • Exact header scanning
  • Better incremental build times
  • Optional whitespace and other language improvements

I’ve documented why a second version was created, the goals for the project, and why certain libraries were chosen in the docs/development/decisions folder.

So far the parsing is complete, and the first couple of evaluation classes have been ported over. Work on this portion will likely take some time, but the APIs are similar to the original architecture, so it’s not being written from scratch.

You can find the original architecture in the legacy branch, and the new one in main. I’ll be merging bug fixes into legacy until main is released, but feature requests will be focused on the new architecture.

Work Report

Original Architecture: 2022/6/13 - 2022/7/26

Work on the original architecture focused primarily on implementing action modifiers and built-in rules. Notable work includes:

  • Choosing behavior for underspecified modifiers (notably together and piecemeal)
  • Creating an efficient algorithm to determine command size for the piecemeal modifier without resorting to brute force (which is how Jam implements it)
  • Many bug fixes, ranging from evaluation logic to memory usage
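As a rough illustration of the piecemeal idea, the sketch below splits a list of sources into chunks whose command length stays under a limit in one linear pass, instead of repeatedly expanding the command and retrying. This is not Ham's actual algorithm: `chunkPiecemeal`, `baseLength`, and `maxLength` are hypothetical names, and the cost model (each source adds its length plus one separating space) is an assumption made for the example.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not Ham's implementation: split `sources` into
// consecutive chunks so that each command stays within `maxLength` characters.
// Assumed cost model: `baseLength` characters for the fixed part of the
// command, plus each source's length and one separating space.
std::vector<std::vector<std::string>> chunkPiecemeal(
    const std::vector<std::string>& sources,
    std::size_t baseLength,
    std::size_t maxLength)
{
    std::vector<std::vector<std::string>> chunks;
    std::vector<std::string> current;
    std::size_t length = baseLength;

    for (const std::string& source : sources) {
        const std::size_t cost = source.size() + 1;  // argument plus a space

        // Close the current chunk if this source would overflow it.
        if (!current.empty() && length + cost > maxLength) {
            chunks.push_back(std::move(current));
            current.clear();
            length = baseLength;
        }

        // A source that exceeds the limit on its own still gets a chunk here;
        // a real build tool would more likely report an error instead.
        current.push_back(source);
        length += cost;
    }

    if (!current.empty())
        chunks.push_back(std::move(current));
    return chunks;
}
```

For example, calling `chunkPiecemeal({"a.c", "b.c", "c.c"}, 3, 13)` puts the first two sources in one chunk and the third in another. Tracking a running length keeps the work proportional to the number of sources.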

For a more detailed breakdown of work done, see the pull requests done during this time.

New Architecture: 2022/7/27 - 2022/9/5

The new architecture uses C++17’s string_view and C++20’s ranges APIs for efficient manipulation of string lists (important, as that is Ham’s primary data structure), a more flexible parsing system, and a bigger focus on unit tests rather than just integration tests. It also uses safe APIs whenever possible, greatly reducing the chance of memory errors.

Documentation

Testing

  • Extensively unit tested all implemented parsing/evaluation functionality
  • Mocked evaluation classes to isolate unit tests
  • Created helper functions to expressively test ASTs (example of failing test below)
-------------------------------------------------------------------------------
Variable replacers
-------------------------------------------------------------------------------
/home/dominic/src/haiku/ham/tests/parse/Variables.cpp:125
...............................................................................

/home/dominic/src/haiku/ham/tests/parse/Variables.cpp:133: FAILED:
  REQUIRE( checkParse(decompose(parse("$(X:G=grist)"), {0}), T<Variable>( {....
with expansion:
  false
with messages:
  Variable[2]{$(X:G=grist)}
   Identifier[1]{X}
   VariableReplacer[2]{G=grist}
    VariableSelector[0]{G}
    Leaf[1]{grist} != Leaf[?]{gr}

===============================================================================
test cases: 132 | 131 passed | 1 failed
assertions: 342 | 341 passed | 1 failed

Parsing

  • Mostly optional whitespace
  • Parses variable expressions at parse-time instead of dynamically at run-time
  • Retains source information
  • Customized errors (no obscure syntax error messages)

Evaluation

  • Input sanity checking
  • Detailed source-level error messages
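As a minimal sketch of what a source-level error message involves, the function below turns a byte offset into a line and column and prints the offending line with a caret underneath. The output format and the name `formatError` are assumptions made for this example and do not reflect Ham's actual messages.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: render `message` so that it points at byte `offset`
// inside `source`. Columns are counted in bytes, so tabs and multi-byte
// characters before the offset will shift the caret.
std::string formatError(std::string_view source, std::size_t offset,
    std::string_view message)
{
    if (offset > source.size())
        offset = source.size();

    // Find the 1-based line number and where that line starts.
    std::size_t line = 1;
    std::size_t lineStart = 0;
    for (std::size_t i = 0; i < offset; ++i) {
        if (source[i] == '\n') {
            ++line;
            lineStart = i + 1;
        }
    }

    std::size_t lineEnd = source.find('\n', lineStart);
    if (lineEnd == std::string_view::npos)
        lineEnd = source.size();
    const std::size_t column = offset - lineStart + 1;

    std::string out = std::to_string(line) + ":" + std::to_string(column)
        + ": error: ";
    out += message;
    out += '\n';
    out += source.substr(lineStart, lineEnd - lineStart);
    out += '\n';
    out.append(column - 1, ' ');  // indent the caret under the offending byte
    out += "^\n";
    return out;
}
```

This only works if the parser keeps offsets around for each node, which is why retaining source information during parsing matters for the evaluation stage too.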

Parting Words

Thank you all for the amazing few months I’ve had to work on this project. I plan to continue maintaining Ham, quickly reach a stable build of the original architecture, and eventually ship a full release of the new architecture. Ham has been a chance to breathe new life into a great build system, and I look forward to making it useful for Haiku and other projects.