This page discusses the SPARC version of Larceny. Some decisions will likely be revisited in the context of a planned port to the PowerPC architecture.
The SPARC version of Larceny is written in three languages: Scheme, C, and SPARC assembly language. Additionally, there are some tens of lines of MacScheme Assembly Language. As a general rule, as much of the system as is practical is written in Scheme. Of the remainder, everything that is not performance-critical is written in C, and the performance-critical parts are written in assembly language. ("Practical" and "performance-critical" are explained below.) In version 0.25, the line counts are:
    Scheme code
        Libraries            8200 lines
        Evaluator/REPL       1600 lines
    C code                   4500 lines
    SPARC assembly code      4200 lines

These counts include comments, whitespace, and so on; there are about 2800 assembly language instructions overall and we are working on reducing this to make the system easier to port. More than 2/3 of the assembly language code is in the generic arithmetic implementation.
What is "practical" to write in Scheme is usually a trade-off between desired performance and implementational convenience. For example, bignum arithmetic is written entirely in Scheme at this time, and is quite slow, which is acceptable because we know how to tune this over time, and that tuning will be portable (see the section on Generic Arithmetic, below).
What is "performance-critical" is mostly a statistical matter but also a policy decision: what kinds of programs will Larceny run? Examples of performance-critical operations are storage allocation, certain kinds of generic arithmetic, tag checking, and procedure calls. In many cases we generate in-line code, but call-outs to subroutines are required for non-trivial write-barrier checks, for all arithmetic other than the case in which both operands are fixnums and the operation does not overflow, for most storage allocation, and for a host of other services. These call-outs use calling conventions that are designed to be very fast. If the operation itself is heavyweight (flushing the stack cache for call-with-current-continuation, say), then the call-out is little more than a trampoline into C code. But lightweight tasks like storage allocation are coded in assembly language.
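The split between in-line code and call-outs for arithmetic can be sketched in C. This is only an illustration of the idea, not Larceny's generated code: the tagging scheme (low two bits zero for a fixnum), the names `fixnum_add` and `generic_add_slow`, and the placeholder result are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t word;

/* Assumption for this sketch: a fixnum is a word whose low two bits are 00. */
#define FIXNUM_TAG_MASK 3

static int is_fixnum(word w) { return (w & FIXNUM_TAG_MASK) == 0; }

static word make_fixnum(intptr_t n) {
    /* The value must fit once two bits are given up to the tag. */
    assert(n >= INTPTR_MIN / 4 && n <= INTPTR_MAX / 4);
    return n * 4;
}

/* Counts call-outs so the fast/slow split is observable. */
static int slow_path_calls = 0;

/* Hypothetical stand-in for the out-of-line generic arithmetic routine
   (bignums, flonums, and so on). It returns a placeholder here. */
static word generic_add_slow(word a, word b) {
    (void)a; (void)b;
    slow_path_calls++;
    return 0;
}

/* In-line case: both operands are fixnums and the sum does not overflow.
   Adding two tagged fixnums yields the correctly tagged sum directly.
   Everything else goes to the call-out. */
static word fixnum_add(word a, word b) {
    if (is_fixnum(a) && is_fixnum(b)) {
        int overflow = (b > 0 && a > INTPTR_MAX - b) ||
                       (b < 0 && a < INTPTR_MIN - b);
        if (!overflow)
            return a + b;
    }
    return generic_add_slow(a, b);
}
```

In this sketch the common case costs a tag test, an overflow test, and one add; only the uncommon cases pay for a subroutine call.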
Some of the roots point to Scheme procedures that are used by the run-time system.
There are three advantages to having such a table. First, getting a global variable at run-time is inexpensive: a register-plus-constant-offset load. Second, every global in the system is held in this table and is always available through it, so it's straightforward to know what global variables there are. Third, since all roots are in the table, the garbage collector is made simpler, as it only needs to look for roots in one place.
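A minimal C sketch of such a table follows, assuming a flat array of words indexed by compile-time constants. The slot names, `get_global`, `set_global`, and `visit_roots` are hypothetical and chosen only for illustration; the sketch also treats every slot as a potential root, which may not match the real layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word;

/* Hypothetical slot names; the actual contents of the table are not shown here. */
enum global_index {
    G_HEAP_TOP,
    G_HEAP_LIMIT,
    G_EXCEPTION_HANDLER,
    G_COUNT              /* number of slots; must stay last */
};

/* The table itself. On a RISC machine a dedicated register would hold its
   base address, so each access is one load at a constant offset. */
static word globals[G_COUNT];

static word get_global(enum global_index i) {
    assert(i < G_COUNT);
    return globals[i];
}

static void set_global(enum global_index i, word v) {
    assert(i < G_COUNT);
    globals[i] = v;
}

/* Callback type for a collector that wants to see every root slot. */
typedef void (*root_visitor)(word *slot, void *ctx);

/* Because every root lives in the table, root scanning is a single loop. */
static void visit_roots(root_visitor visit, void *ctx) {
    for (size_t i = 0; i < G_COUNT; i++)
        visit(&globals[i], ctx);
}

/* Example visitor: counts the slots it is shown. */
static void count_root(word *slot, void *ctx) {
    (void)slot;
    ++*(size_t *)ctx;
}
```

The enum doubles as the inventory of globals: adding a variable means adding one constant, and `G_COUNT` keeps the scan loop in step with it.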
The cost of using one register for the table is high on some register-poor systems (the ubiquitous Intel processors, for example) but is bearable on a RISC system. The table has other uses, too; see the section on