
Small C

I’ve begun looking more deeply into the “Small C” compiler that Bill mentioned. It looks very cool, yet also wildly inefficient, which is about what I expected. With some more effort, I think I could get a working C compiler for BMOW that generated not-too-awful code, which would be great for porting programs from other systems, or something like getting an open-source chess program running. But for cases where the size and speed of the code are important, Small C isn’t going to replace hand-written assembly.

There are several versions of Small C floating around on the internet. Small C Plus appears to be the most fully-featured, with support for structs and floating point math. Unfortunately it looks difficult to port, being targeted at some ancient Amstrad CP/M computer. Maybe I’ll take a shot at it someday.

Small C v2.1 by James E. Hendrix seems to be the “official” version, and is the one discussed in the Small C handbook that I bought. I had a lot of difficulty getting it to compile and run with a modern compiler (Visual C++ 2005), though. It seems to assume that uninitialized memory will contain zero, and uses some non-standard library functions that only exist if you compile Small C with itself. I did eventually manage to get it to build and run, and compile a simple program, but it looks awkward to port.

That leaves Small C for UNIX, which looks to be the most feature-poor, yet the easiest to port. I was able to get it to compile and run under Visual C++ 2005 in a few minutes. All the machine-specific code generation routines have already been pulled out into a separate file, so in theory it should be easy to retarget to the BMOW instruction set.

I was surprised at how small the compiler’s source code is. I expected all kinds of code related to context-free grammars and whatnot. Instead it’s only about 3000 lines of code, which is pretty manageable. As far as I can tell, it makes only one forward pass through the source, with a lot of lines like:

if (*ptr == '[' )
    do_array();
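
To give a feel for that single-pass style, here is a minimal sketch of a translator that looks at the next character, decides what it is, and emits output immediately, without ever building a parse tree. It is not taken from the Small C source: the function names, the tiny expression language (integers joined by + and -), and the load/add/sub mnemonics are all made up for illustration.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical one-pass translator for expressions like "12+3-4".
   The output mnemonics are invented; they are not BMOW instructions. */
static const char *ptr;   /* next unread source character */
static char out[256];     /* emitted pseudo-assembly text */

/* Append one line of pseudo-assembly to the output buffer. */
static void emit(const char *op, int value)
{
    char line[32];
    snprintf(line, sizeof line, "%s %d\n", op, value);
    strncat(out, line, sizeof out - strlen(out) - 1);
}

/* Consume a run of decimal digits and return its value. */
static int number(void)
{
    int n = 0;
    while (isdigit((unsigned char)*ptr))
        n = n * 10 + (*ptr++ - '0');
    return n;
}

/* Walk the source once, left to right, emitting code as each
   operator is recognized. */
static const char *compile_expr(const char *src)
{
    ptr = src;
    out[0] = '\0';
    emit("load", number());
    for (;;) {
        if (*ptr == '+') {
            ptr++;
            emit("add", number());
        } else if (*ptr == '-') {
            ptr++;
            emit("sub", number());
        } else {
            break;
        }
    }
    return out;
}
```

Calling compile_expr("12+3-4") fills the buffer with one line per operation, in the order the characters were read. The real compiler handles far more syntax, but the shape of the control flow is similar: test the current character, call a handler, move on.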

There’s a lot that Small C doesn’t have, of course. There are no structs, floats, or bools, and no function pointers. All functions are assumed to return int. You can’t initialize a variable when you declare it (like int x=4;). And function headers must be written in ancient K&R-style C, where the types of the formal parameters are given separately from their names, in the space before the function’s first open curly brace. But in exchange for these limitations, you get a compiler that’s simple enough to understand and retarget without going crazy.

The compiler assumes that the CPU has two general registers, and a stack. It also assumes that each register is large enough to hold an int, which is also the size of a pointer. The minimum entry requirements are therefore two 16-bit registers and a stack pointer, which is already a problem for BMOW. I can treat two 8-bit registers as a single 16-bit register by having the compiler emit assembly to manipulate them individually, but BMOW only has three 8-bit registers, which makes one and a half 16-bit registers. I would need to add a fourth 8-bit register somehow if I wanted to meet Small C’s assumptions about the hardware. A work-around is to use a fixed memory location as the missing register, but that’s ugly and slow.
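
As a rough illustration of what "manipulating them individually" means, the sketch below models a 16-bit register as two separate bytes and adds a 16-bit constant the way an 8-bit ALU has to: low bytes first, then high bytes plus the carry. The type and function names are hypothetical, and this is a C model of the idea rather than anything the compiler emits.

```c
#include <stdint.h>

/* Hypothetical model of a 16-bit register built from two 8-bit registers. */
typedef struct {
    uint8_t hi;  /* high byte */
    uint8_t lo;  /* low byte */
} reg16;

/* Add a 16-bit constant one byte at a time: add the low bytes,
   then add the high bytes along with the carry out of the low add. */
static reg16 reg16_add(reg16 r, uint16_t k)
{
    unsigned low = (unsigned)r.lo + (k & 0xFFu);
    unsigned carry = low >> 8;                      /* 0 or 1 */
    unsigned high = (unsigned)r.hi + (unsigned)(k >> 8) + carry;
    r.lo = (uint8_t)low;
    r.hi = (uint8_t)high;                           /* wraps past 0xFFFF */
    return r;
}

/* Combine the two halves back into a single 16-bit value. */
static uint16_t reg16_value(reg16 r)
{
    return (uint16_t)((r.hi << 8) | r.lo);
}
```

Every 16-bit operation the compiler wants to perform turns into at least two 8-bit operations plus carry handling, which is a large part of why the generated code grows so quickly.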

Small C also assumes that the registers all support the same operations for most purposes. This isn’t the case for BMOW, where many operations like addition are only supported for the accumulator register. To add a number to another register, it’s necessary to save the accumulator somewhere, copy the other register to the accumulator, perform the addition, copy the result back, and restore the original accumulator value. That’s not the kind of code you want your compiler to be generating very often.
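
The shuffle described above can be sketched in C as a toy model. The struct, its field names, and the scratch location are assumptions made for illustration; only the sequence of steps comes from the description.

```c
#include <stdint.h>

/* Toy model of a CPU where only the accumulator can be added to. */
typedef struct {
    uint8_t a;        /* accumulator: the only register the ALU adds into */
    uint8_t x;        /* another register, with no direct add */
    uint8_t scratch;  /* hypothetical memory location used as a save slot */
} cpu_model;

/* Add k to X by routing the value through the accumulator. */
static void add_to_x(cpu_model *cpu, uint8_t k)
{
    cpu->scratch = cpu->a;           /* save the accumulator somewhere */
    cpu->a = cpu->x;                 /* copy the other register into it */
    cpu->a = (uint8_t)(cpu->a + k);  /* perform the addition */
    cpu->x = cpu->a;                 /* copy the result back */
    cpu->a = cpu->scratch;           /* restore the original accumulator */
}
```

Five steps for one addition, and on real hardware each of those steps is at least one instruction.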

The biggest mismatch between Small C and BMOW is address pointers. BMOW has no such concept for registers: you can’t store a pointer in a register (or a pair of registers) and then dereference it. Pointers stored in memory can be dereferenced, and there are enough flexible addressing modes, like stack-relative and X-indexed, that in most cases your program shouldn’t need to explicitly dereference a pointer. In contrast, Small C assumes that either of its registers can be treated as a pointer and dereferenced with ease. Any time a value must be loaded from memory, the compiler emits code to first load the address into the primary register, and then load the data using that address. On BMOW, this results in a block of horribly convoluted assembly to accomplish the same thing.

An example would probably help. Consider the following C program:

func()
{
    int g;
    g = 0xDEAD;
    return g;
}

This could easily be collapsed to a single line that returns 0xDEAD without the need for a local variable, but ignore that for the moment. So this program must:

  • Allocate stack space for a 2-byte local variable
  • Store 0xDEAD in the stack location of the local variable
  • Retrieve the current value of the local variable from the appropriate stack location (yes it will always be 0xDEAD), and store it in the primary register
  • Return

I’ve defined the “primary register” as the Y and A registers, with Y holding the high byte and A the low byte. So some hand-written assembly for func() might be:

func:
    ; reserve space for local variable "g"
    pha ; push anything on the stack, value doesn't matter, only the change in the stack pointer
    pha

    ; store 0xDEAD in g
    lda #$DE
    sta 2,s ; store DE at stack + 2
    lda #$AD
    sta 1,s ; store AD at stack + 1

    ; retrieve g into primary register
    lda 2,s ; get high byte from stack + 2
    tay ; transfer it to the Y register
    lda 1,s ; get low byte from stack + 1

    plp ; free the space on the stack used by the local variable
    plp
    rts ; return

That’s not too bad. Compare that with the code generated by my BMOW back-end for the Small C compiler. In addition to YA as the primary register, it also uses X and a fixed memory location (cc_regw) as a secondary register, and another fixed memory location (cc_regtemp) as scratch space. It also makes use of two subroutines to put (cc_pint) and get (cc_gint) an int to/from the primary register, similar to the original 8080 back-end for Small C. cc_gint assumes the address is already in the primary register, and overwrites it with the value at that address. cc_pint assumes the address is in the secondary register, and the value to store is in the primary register.

cc_regw = $4FFF
cc_regtemp = $4FFD

cc_gint:
    sta cc_regtemp
    sty cc_regtemp+1
    phx
    ldx #1
    lda (cc_regtemp),x
    tay
    dex
    lda (cc_regtemp),x
    plx
    rts

cc_pint:
    stx cc_regtemp
    phx
    ldx cc_regw
    stx cc_regtemp+1
    pha
    ldx #1
    tya
    sta (cc_regtemp),x
    pla
    dex
    sta (cc_regtemp),x
    plx
    rts

;func()
func:
;{
;    int g;
    pha
    pha
;    g=0xDEAD;
    stx cc_regtemp
    tsr
    ldx cc_regtemp
    clc
    adc    #<1
    bcc +
    iny
+   pha
    tya
    clc
    adc    #>1
    tay
    pla
    phy
    pha
    lda    #<57005
    ldy    #>57005
    plx
    stx cc_regtemp
    plx
    stx cc_regw
    ldx cc_regtemp
    jsr cc_pint
;    return g;
    stx cc_regtemp
    tsr
    ldx cc_regtemp
    clc
    adc    #<1
    bcc +
    iny
+   pha
    tya
    clc
    adc    #>1
    tay
    pla
    jsr cc_gint
    jmp cc1
;}
cc1:
    plp
    plp
    rts

The tsr instruction transfers the stack pointer to the X, Y, and A registers. X gets the bank byte of the stack pointer, which we don’t need, and since we also don’t want to destroy the value in X, the code saves X to cc_regtemp before the tsr and reloads it afterward.
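
For anyone who finds the listing hard to follow, here is a rough C model of what cc_gint, cc_pint, and the generated func() boil down to. It assumes a flat 64 KB memory array and ignores the bank byte entirely, and the model_ names are invented for this sketch. The byte order (low byte at the address, high byte at address + 1) follows the assembly above.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_SIZE 0x10000u   /* assumed flat 64 KB address space, no banks */

static uint8_t mem[MEM_SIZE];

/* Model of cc_gint: the primary register holds an address on entry and
   is overwritten with the 16-bit value stored at that address. */
static uint16_t model_gint(uint16_t primary)
{
    assert(primary < MEM_SIZE - 1);   /* both bytes must be in range */
    uint8_t high = mem[primary + 1];
    uint8_t low = mem[primary];
    return (uint16_t)((high << 8) | low);
}

/* Model of cc_pint: the secondary register holds the address and the
   primary register holds the 16-bit value to store there. */
static void model_pint(uint16_t secondary, uint16_t primary)
{
    assert(secondary < MEM_SIZE - 1);
    mem[secondary + 1] = (uint8_t)(primary >> 8);
    mem[secondary] = (uint8_t)(primary & 0xFF);
}

/* Model of the generated func(): reserve two bytes on the stack, store
   0xDEAD through a computed address, load it back, release the space. */
static uint16_t model_func(uint16_t sp)
{
    sp -= 2;                                /* the two pha instructions */
    model_pint((uint16_t)(sp + 1), 0xDEAD); /* g = 0xDEAD */
    uint16_t result = model_gint((uint16_t)(sp + 1)); /* return g */
    sp += 2;                                /* the two plp instructions */
    (void)sp;                               /* caller's sp is restored */
    return result;
}
```

Seen this way, the generated code is doing very little. Most of the instructions in the listing are spent computing stack pointer + 1 as a 16-bit value and shuffling it between registers and fixed memory locations so that the helper routines can dereference it.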

It all works, but… yuck.


4 Comments so far

  1. Gregg C Levine - June 16th, 2008 6:34 am

    Nice going so far.
    Incidentally I have the original Small-C compiler that an adventurous pair translated from CP/M-80 to MS-DOS here.

    About its only limits are in the list of things it just can’t properly process. For that, I suspect you’d need to write a macro in assembler to fill in the blanks between the Small-C code.

    If you want, and do have an appropriate machine for it I can create a zip file of the disk contents and make it available for you.

  2. Steve - June 20th, 2008 7:05 pm

    Is the source you have something different than what’s available here: http://www.cpm.z80.de/small_c.html ?

    As I described in the entry body, I have been able to get Small C to work successfully on BMOW: it’s just inefficient and limited in functionality. I might be able to address the inefficiency somewhat with a peephole optimizer, or other larger changes to the compiler to suit BMOW. But the limited functionality is an inherent attribute of Small C: none of the versions ever supported structs, for example, as far as I can tell. If the version you have is something different, then I’d certainly be curious to check it out.

  3. Gregg C Levine - June 21st, 2008 12:44 pm

    Hello!
    Gee! I missed that one. Thanks!

    Mine is one of the Ron Cain ideas appropriately updated to accommodate my styles. It’s certainly one of them on that list, but after so many updates on my time, and using my hardware I have officially lost track of who’s who. Incidentally my version started off as the v1.1 item by Ron Cain and translated to the wonky world of MSDOS by those guys that I described before.

    Now when you’re ready, I’ll have the file made up and give you a pointer.

  4. Scott Walters - May 29th, 2009 2:13 am

    There’s a C compiler called cc65 that’s a long descendant of Small C that’s more complete and smarter about optimizing. It’s self hosting and there’s also a Unix port for cross compilation to 6502 based targets and it will save its intermediate assembly, useful in this case.

    -scott
