Say hello to Nibbler, the 4-bit homemade CPU! Ever since I built BMOW1, people have written to me asking how to make their own homebrew computers. BMOW is a complex design that can be difficult to comprehend, so I decided it was time to create a minimal CPU that’s easy to understand, easy to build, but still capable of running interesting programs. Ideas for Nibbler began percolating in my brain, and after a few weeks of pencil sketches and hand simulation, it’s finally ready to share. And if you’ve forgotten, a nibble is half a byte or 4 bits, so the name fits the CPU.
Some of you may be thinking: “4-bit CPU? BORING!” I agree that many of the 4-bit CPU designs on the web aren’t very exciting, but the problem isn’t their 4-bitness. It’s the shortcomings of the computer that surrounds the CPU. Most designs are limited to 256 nibbles of memory, which just isn’t enough to fit a program that does anything very interesting. I/O is often limited to basic LEDs and switches, further reducing the scope of what’s possible.
My goals for Nibbler are:
- Only use commonly available 7400 series chips and RAM/ROM. No programmable logic or other goodies.
- Keep the total number of chips as few as possible.
- Employ a simple, straightforward design that’s easy to understand.
- Maintain a clean logical separation between the CPU and the computer surrounding it.
- Run interesting, interactive programs involving several I/O devices.
- NOT: Be the most powerful CPU, or the easiest to write programs for.
The architecture of Nibbler is shown above. The CPU core is just eleven 7400 series chips, plus the clock crystal. RAM and ROM add two more chips, and peripheral I/O in “the computer” adds three more, for a total of sixteen chips overall. Compared to BMOW’s 65 chips and multiple clocks, that’s very lightweight.
Instruction opcodes are 4 bits wide, which allows for 16 possible types of instructions. All instructions require exactly two clock cycles to execute. During the first clock cycle, called phase 0, the instruction opcode and operand are retrieved from memory and stored in a register called Fetch. The second clock cycle, called phase 1, performs the calculation or operation needed to execute the instruction.
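As a rough illustration of the two-phase cycle, here is a minimal Python sketch. It is not the actual Nibbler design: the opcode numbers, the placement of the opcode in the high nibble of the fetched byte, and the choice to increment the program counter during phase 0 are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical opcode numbers, for illustration only.
OP_LOAD_IMMEDIATE = 0x1  # A <- operand
OP_ADD_IMMEDIATE = 0x2   # A <- (A + operand) mod 16


@dataclass
class TwoPhaseCore:
    program: bytes  # one byte per address: opcode in the high nibble (assumed)
    pc: int = 0     # program counter
    fetch: int = 0  # the Fetch register
    phase: int = 0  # 0 = fetch, 1 = execute
    a: int = 0      # 4-bit accumulator

    def clock(self) -> None:
        """Advance the model by one clock cycle."""
        if self.phase == 0:
            # Phase 0: latch opcode and operand into Fetch.
            self.fetch = self.program[self.pc]
            self.pc = (self.pc + 1) & 0xFFF  # assumed: PC advances here
        else:
            # Phase 1: carry out the operation held in Fetch.
            opcode, operand = self.fetch >> 4, self.fetch & 0xF
            if opcode == OP_LOAD_IMMEDIATE:
                self.a = operand
            elif opcode == OP_ADD_IMMEDIATE:
                self.a = (self.a + operand) & 0xF
        self.phase ^= 1  # alternate between phase 0 and phase 1


core = TwoPhaseCore(program=bytes([0x13, 0x24]))  # load 3, then add 4
for _ in range(4):  # two instructions, two clocks each
    core.clock()
print(core.a, core.pc)
```

Every instruction takes exactly two calls to `clock()`, which is the property the fixed two-cycle timing gives the real hardware.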
A pair of microcode ROMs is used to generate the sixteen internal control signals needed to load, enable, and increment the other chips in the CPU at the appropriate times. The microcode ROM address is formed from the instruction opcode, the phase, and the CPU carry and equal flags. Each microcode ROM outputs a different group of eight of the sixteen total control signals.
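The address formation can be sketched in a few lines of Python. Four opcode bits, one phase bit, and two flag bits give a 7-bit address, so each ROM needs only 128 entries. The bit ordering below, and which ROM supplies the upper eight control signals, are assumptions for illustration and not taken from the schematic.

```python
def microcode_address(opcode: int, phase: int, carry: int, equal: int) -> int:
    """Pack the inputs into a 7-bit address.

    Assumed layout (most to least significant):
    opcode[3:0], phase, carry, equal.
    """
    return ((opcode & 0xF) << 3) | ((phase & 1) << 2) | ((carry & 1) << 1) | (equal & 1)


def control_signals(rom_low: bytes, rom_high: bytes, address: int) -> int:
    """Combine the two 8-bit ROM outputs into one 16-bit control word."""
    return (rom_high[address] << 8) | rom_low[address]


# Two empty 128-entry ROMs, with one entry filled in as a placeholder.
rom_low = bytearray(128)
rom_high = bytearray(128)
addr = microcode_address(opcode=0b1010, phase=1, carry=0, equal=1)
rom_low[addr], rom_high[addr] = 0x34, 0x12
print(addr, hex(control_signals(rom_low, rom_high, addr)))
```

Because the flags feed the address directly, a conditional instruction is just two or more ROM entries for the same opcode that happen to contain different control words.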
A load-store design is used, with all arithmetic and logical computation results being stored into the single 4-bit accumulator register named “A”. Data can be moved between A and memory locations in RAM, but otherwise all the CPU instructions operate only on A. This greatly simplifies the hardware requirements, at the cost of some decrease in flexibility when writing programs.
In contrast to most modern CPUs, the Nibbler design uses a Harvard Architecture. That means programs and data are stored in separate address spaces, and travel on separate busses. The data bus is 4 bits wide, as one should expect for a 4-bit CPU. The program bus is 8 bits wide: 4 bits for the instruction opcode, and 4 bits for an immediate operand.
Program and data addresses are both 12 bits wide, resulting in total addressable storage of 4096 bytes for programs and 4096 nibbles for data. A 12 bit program counter holds the current instruction address. Since instruction opcodes are 4 bits wide, that makes instructions involving absolute memory addresses 4 + 12 = 16 bits in size, or two program bytes.
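To make the arithmetic concrete, here is a small Python sketch that packs a 4-bit opcode and a 12-bit address into two program bytes. The exact layout is an assumption: it places the opcode in the high nibble of the first byte, followed by the top four address bits, with the remaining eight address bits in the second byte.

```python
ADDRESS_BITS = 12
ADDRESS_SPACE = 1 << ADDRESS_BITS  # 4096 locations


def encode_absolute(opcode: int, address: int) -> bytes:
    """Pack opcode (4 bits) and address (12 bits) into two bytes (assumed layout)."""
    if not 0 <= opcode < 16:
        raise ValueError("opcode must fit in 4 bits")
    if not 0 <= address < ADDRESS_SPACE:
        raise ValueError("address must fit in 12 bits")
    word = (opcode << ADDRESS_BITS) | address
    return word.to_bytes(2, "big")


def decode_absolute(pair: bytes) -> tuple:
    """Recover (opcode, address) from two program bytes."""
    word = int.from_bytes(pair, "big")
    return word >> ADDRESS_BITS, word & (ADDRESS_SPACE - 1)


encoded = encode_absolute(0x9, 0xABC)
print(encoded.hex(), decode_absolute(encoded))
```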
Nibbler is notable for a few things it does NOT have. There’s no address decoder, because no address space is shared between multiple chips. Program ROM occupies all of the program address space, and RAM occupies all of the data address space. As you’ll see later, I/O peripherals aren’t memory-mapped, but instead use port-specific IN and OUT instructions to transfer data.
Nibbler also lacks any address registers, which means it can’t support any form of indirect addressing, nor a hardware-controlled stack. All memory references must use absolute addresses. That’s a significant limitation, but it’s in keeping with the project’s K.I.S.S. design goals. With the use of jump tables and dedicated memory locations, a simple call/return mechanism can be implemented without a true stack.
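One way such a stackless call/return scheme might work is sketched below in Python. This is a guess at the general idea, not Nibbler code: the RAM address, the labels, and the table contents are all hypothetical. Before jumping to the subroutine, each caller stores a small ID for its call site in a dedicated RAM location, and the subroutine "returns" by looking that ID up in a fixed jump table.

```python
RETURN_SLOT = 0x010  # hypothetical RAM address reserved for the call-site ID

# Jump table fixed when the program is written: call-site ID -> resume label.
RETURN_TABLE = {0: "after_first_call", 1: "after_second_call"}


def run() -> list:
    """Trace the labels visited by a program that calls one subroutine twice."""
    ram = [0] * 4096
    trace = []
    label = "start"
    while label != "halt":
        trace.append(label)
        if label == "start":
            ram[RETURN_SLOT] = 0  # record which call site this is
            label = "subroutine"  # plain absolute jump
        elif label == "after_first_call":
            ram[RETURN_SLOT] = 1
            label = "subroutine"
        elif label == "after_second_call":
            label = "halt"
        elif label == "subroutine":
            # The subroutine body would go here.
            label = RETURN_TABLE[ram[RETURN_SLOT]]  # return via the jump table
    return trace


print(run())
```

On hardware without indirect jumps, the table lookup would presumably become a chain of compare-and-jump instructions, each with an absolute target. The scheme also does not support recursion, since a nested call to the same subroutine would overwrite the stored ID.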
Up to sixteen distinct I/O devices can be supported by the CPU, but the planned I/O devices require just one IN port and two OUT ports. The computer’s input comes from four momentary pushbuttons, arranged in a left/right/select/back cross configuration, and connected to the IN port. Output utilizes one of the two OUT ports, and includes the obligatory LEDs used for debugging, as well as a piezo speaker for software-controlled sound, and a two-line character-based LCD display.
The specific 7400 logic family and chips to be used aren’t yet finalized, but back-of-the-envelope calculations suggest the CPU should support a clock speed of just over 4 MHz. The longest path is for a write to RAM during phase 1: clock-to-Q delay for the Fetch register, plus propagation delay for the microcode ROMs, ALU, and bus driver, plus data setup time for the RAM. At two clock cycles per instruction, 4 MHz operation would result in 2 MIPS, which equals or exceeds BMOW’s performance.
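The estimate follows a simple pattern: add up the delays along the longest path to get the minimum clock period, invert it to get the maximum frequency, then divide by two cycles per instruction. The Python sketch below shows that pattern. The delay figures are placeholders chosen only so the total lands near the post's estimate. They are not datasheet values for any particular chip.

```python
def max_clock_mhz(path_delays_ns: dict) -> float:
    """Maximum clock frequency in MHz for a critical path given in nanoseconds."""
    period_ns = sum(path_delays_ns.values())
    return 1000.0 / period_ns


def mips(clock_mhz: float, cycles_per_instruction: int = 2) -> float:
    """Instruction throughput for a fixed cycles-per-instruction design."""
    return clock_mhz / cycles_per_instruction


# PLACEHOLDER delays, not taken from any datasheet.
ram_write_path_ns = {
    "fetch_register_clock_to_q": 20,
    "microcode_rom": 100,
    "alu": 60,
    "bus_driver": 20,
    "ram_data_setup": 40,
}

clock = max_clock_mhz(ram_write_path_ns)
print(round(clock, 2), "MHz,", round(mips(clock), 2), "MIPS")
```

At exactly 4 MHz the clock period is 250 ns, so the whole RAM write path has to fit inside that window.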
I’ll write more about the instruction set and programming model next time. Until then, if you have any comments or questions, I’d love to hear them!