User guide
CHAPTER 9. THE DESIGN OF CINTCODE
The registers A and B are used for expression evaluation, and C is used in byte
subscription. P and G are pointers to the current stack frame and the global vector,
respectively. ST is used as a status register in the Cintpos version of Cintcode, and PC
points to the first byte of the next Cintcode instruction to execute. Count is a register
used by the debugger. While it is positive, Count is decremented on each instruction
execution, raising an exception (code 3) on reaching zero. When negative, it causes a
second (faster) interpreter to be used.
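The behaviour of Count described above can be sketched roughly as follows. This is only an illustrative model, not the interpreter's actual code: the function name `tick` and the returned tuple are hypothetical, and the treatment of a Count that is already zero (handled here as an immediate exception) is an assumption, since the text does not cover that case.

```python
EXC_COUNT_ZERO = 3  # exception code given in the text


def tick(count):
    """Model one instruction step.

    Returns (new_count, exception_code_or_None, use_fast_interpreter).
    """
    if count < 0:
        # Negative Count: nothing is counted and the second (faster)
        # interpreter would be selected.
        return count, None, True
    if count > 0:
        # Positive Count: decremented once per executed instruction.
        count -= 1
    if count == 0:
        # Reaching zero raises exception code 3.
        # Assumption: a Count that starts at zero is treated the same way.
        return 0, EXC_COUNT_ZERO, False
    return count, None, False
```

Under this model, a debugger that wants to stop after k instructions would set Count to k and resume; a negative value trades that facility for speed.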
Cintcode encodes the most commonly occurring operations as single byte instructions,
using multi-byte instructions for rarer operations. The first byte of an instruction
is the function code. Operands of size 1, 2 or 4 bytes immediately follow some function
bytes. The two instructions used to implement switches have inline data following the
function byte. Cintcode modules also contain static data for strings, integers, tables
and global initialisation data.
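A minimal sketch of decoding this layout is given below. The function codes in `OPERAND_SIZE` are hypothetical values invented for the example, and the operands are assumed to be unsigned and little-endian; neither detail is stated in the text above. The two switch instructions, whose inline data has a different shape, are not modelled.

```python
import struct

# Hypothetical function codes mapped to operand size in bytes (0, 1, 2 or 4).
OPERAND_SIZE = {0x10: 0, 0x20: 1, 0x21: 2, 0x22: 4}

# struct formats for unsigned little-endian operands (an assumption).
_FORMATS = {1: "<B", 2: "<H", 4: "<I"}


def decode(code, pc):
    """Decode one instruction at byte offset pc.

    Returns (function_code, operand_or_None, next_pc).
    """
    fn = code[pc]              # the first byte is the function code
    size = OPERAND_SIZE[fn]    # how many operand bytes follow it
    pc += 1
    if size == 0:
        return fn, None, pc
    (operand,) = struct.unpack_from(_FORMATS[size], code, pc)
    return fn, operand, pc + size
```

Because the function byte alone determines how many operand bytes follow, a decoder can step through a code stream by repeatedly calling `decode` with the returned `next_pc`.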
9.1 Designing for Compactness
To obtain a compact encoding, information theory suggests that each function code
should occur with approximately equal frequency. The self compilation of the BCPL
compiler, as shown in figure 4.2, was the main benchmark test used to generate
frequency information, and a summary of how often various operations are used during this
test is given in table 9.1. This data was produced using the tallying feature controlled
by the stats command, described on page 125.
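The information-theoretic point can be illustrated with a small entropy calculation. The sketch below is not taken from the tallying data; the frequency lists in the usage note are made up purely to show that a uniform distribution over the codes carries the most information per byte.

```python
import math


def entropy_bits(freqs):
    """Shannon entropy, in bits per symbol, of a list of occurrence counts."""
    total = sum(freqs)
    return -sum((f / total) * math.log2(f / total) for f in freqs if f > 0)
```

With 256 function codes used equally often, `entropy_bits([1] * 256)` gives 8 bits, so every bit of the function byte is doing useful work. A heavily skewed distribution such as `[253, 1, 1, 1]` yields far less than the 2 bits achieved by `[64, 64, 64, 64]`, which is why frequently used operations are given more of the code space.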
The statistics from different programs vary greatly, so while encoding the common
operations really compactly, there is graceful degradation for the rarer cases ensuring
that even unusual programs are handled reasonably well. There are, for instance,
several one byte instructions for loading small integers, while larger integers are handled
using 2, 3 and 5 byte instructions. The intention is that small changes in a source
program should cause small changes in the size of the corresponding compiled
code.
Having several variant instructions for the same basic operation does not greatly
complicate the compiler. For example, the four variants of the AP instruction, which adds
a local variable into register A, are dealt with by the following code fragment taken from
the codegenerator.
    TEST 3<=n<=12 THEN gen(f_ap0 + n)
    ELSE TEST 0<=n<=255
         THEN genb(f_ap, n)
         ELSE TEST 0<=n<=#xFFFF
              THEN genh(f_aph, n)
              ELSE genw(f_apw, n)
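To make the effect on code size concrete, the same selection logic can be modelled as below. This is a sketch only: the opcode values are hypothetical, the operand byte order is assumed to be little-endian, and the helper name `encode_ap` is invented for the example. The range tests mirror the BCPL fragment.

```python
# Hypothetical opcode values, chosen only so the variants do not collide.
F_AP0, F_AP, F_APH, F_APW = 0x80, 0x90, 0x91, 0x92


def encode_ap(n):
    """Return the bytes of the shortest AP variant for local variable n."""
    if 3 <= n <= 12:
        # One byte: the variable number is folded into the function code.
        return bytes([F_AP0 + n])
    if 0 <= n <= 0xFF:
        # Two bytes: function code plus a 1-byte operand.
        return bytes([F_AP, n])
    if 0 <= n <= 0xFFFF:
        # Three bytes: function code plus a 2-byte operand.
        return bytes([F_APH]) + n.to_bytes(2, "little")
    # Five bytes: function code plus a 4-byte operand (masked to 32 bits).
    return bytes([F_APW]) + (n & 0xFFFFFFFF).to_bytes(4, "little")
```

The encoded length grows as 1, 2, 3 and then 5 bytes as `n` leaves each range, matching the graceful degradation described earlier for integer loads.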
It is clear from table 9.1 that accessing variables and constants requires special care,
and that conditional jumps, addition, calls and indirection are also important. Since
access to local variables accounts for about a quarter of the operations performed, about
this proportion of codes were allocated to instructions concerned with local variables.
Local variables are allocated words in the stack starting at position 3 relative to the P