Tiny CPU is a custom “small CPU” design intended for implementation in a CPLD. Such soft CPU cores typically target an FPGA or large CPLD, but the target device for Tiny CPU is a small Altera CPLD with limited logic resources. This constrains the CPU to a minimal set of features in order to fit. It is an 8-bit CPU with only two registers, and a 10-bit address space. The instruction set is a subset of the 6502 instruction set, with modifications to reflect the smaller address space and number of registers.
Download the Tiny CPU file archive, including the assembler and Verilog source files.
The project is split into two halves, originally imagined as separate chips, but now combined into one. The core CPU module is called Tiny CPU, while a companion module called Tiny Device implements address decoding, bank switching, and peripheral I/O. As a pair, Tiny CPU and Tiny Device are intended to be combined to make a working single-board computer, using only a CPLD and an external SRAM and ROM.
The original target device was Altera’s EPM7128, a 128 macrocell CPLD based on Altera’s older 5V technology. A single macrocell consists of one flip-flop plus some combinatorial logic, and can compute a one-bit result from 1 to 10 inputs, where the result is expressed as a sum-of-products of the inputs. An 8-bit register requires at least 8 macrocells, and structures like counters, adders, and muxes consume many more, so fitting a full-fledged CPU into 128 macrocells is a challenge. Tiny CPU was planned to occupy one EPM7128, with Tiny Device in a second identical CPLD. Verilog source for both designs was written and simulated, and both successfully fit into the target device, but no hardware was ever built using this design. See the link below to download the source.
After a long break, development resumed with a new plan, this time using a single Altera Max II EPM570 CPLD instead of the two EPM7128s. The Max II is a more modern device using a different internal technology, and Altera states its logic capacity is equivalent to roughly 440 macrocells. It’s also a 3.3V device, so the SRAM, ROM, and other components from the original design were all migrated to 3.3V as well. Construction of a Tiny CPU demonstration computer using a custom PCB and this hardware is currently in progress.
|0x||SUB abs||SUB imm||SUB abs,X||ADD abs||ADD imm||ADD abs,X||CMP abs||CMP imm||CMP abs,X||NOR abs||NOR imm||NOR abs,X|
|1x||LDA abs||LDA imm||LDA abs,X||STA abs||STA imm||STA abs,X||LDX abs||LDX imm||CPX abs||CPX imm||STX abs|
|2x||BEQ abs||BCC abs||BCS abs|
|3x||PLA||PLX||RETURN||PHA||PHX||JMP abs||CALL abs||INX||DEX|
|imm||immediate||LDA #$1F||operand is literal byte $1F|
|abs||absolute||LDA $1FF||operand is contents of address $1FF|
|abs,X||absolute, X-indexed||LDA $1FF,X||operand is contents of address formed by adding $1FF to the value in the X register|
|impl||implied||INX||operand is implied by the instruction|
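The addressing modes in the table above can be sketched as a small operand-fetch helper. This is only an illustration, not the Verilog implementation: the function name `fetch_operand`, the mode strings, and the choice to wrap the indexed address to 10 bits are all assumptions, since the source does not say what happens when `abs,X` runs past $3FF.

```python
ADDRESS_MASK = 0x3FF  # 10-bit address space

def fetch_operand(mode: str, operand: int, x: int, memory: bytearray) -> int:
    """Return the 8-bit operand value for one addressing mode (hypothetical helper)."""
    if mode == "imm":
        # Immediate: the operand is the literal byte itself.
        return operand & 0xFF
    if mode == "abs":
        # Absolute: the operand is the contents of the given address.
        return memory[operand & ADDRESS_MASK]
    if mode == "abs,X":
        # Absolute, X-indexed: add X to the base address.
        # Wrapping to 10 bits is an assumption, not taken from the source.
        return memory[(operand + x) & ADDRESS_MASK]
    raise ValueError(f"unsupported addressing mode: {mode}")

memory = bytearray(1024)
memory[0x1FF] = 0xAB
memory[0x201] = 0xCD
print(hex(fetch_operand("abs", 0x1FF, 0, memory)))
print(hex(fetch_operand("abs,X", 0x1FF, 2, memory)))
```

Implied-mode instructions such as INX take no operand, so they never reach a helper like this.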
The instruction’s opcode is packed into the most significant six bits of a program byte. Instructions with no operands (implied addressing) require only a single program byte. Address operands are 10 bits, formed from the least significant two bits of the first program byte, and all eight bits of the second program byte. Immediate operands are 8 bits, taken from the second program byte.
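As a rough sketch of the encoding just described, the following packs and unpacks a two-byte instruction. The function names are made up for illustration, and the opcode number used in the example is hypothetical, because the column positions of the opcode table are not recoverable here. Leaving the low two bits of the first byte at zero for immediate operands is also an assumption.

```python
def encode_abs(opcode: int, address: int) -> bytes:
    """Pack a 6-bit opcode and a 10-bit address into two program bytes."""
    if not 0 <= opcode < 64:
        raise ValueError("opcode must fit in 6 bits")
    if not 0 <= address < 1024:
        raise ValueError("address must fit in 10 bits")
    first = (opcode << 2) | (address >> 8)  # opcode + top 2 address bits
    second = address & 0xFF                 # low 8 address bits
    return bytes([first, second])

def encode_imm(opcode: int, value: int) -> bytes:
    """Pack a 6-bit opcode and an 8-bit immediate (low 2 bits of byte 1 assumed 0)."""
    if not 0 <= opcode < 64:
        raise ValueError("opcode must fit in 6 bits")
    return bytes([opcode << 2, value & 0xFF])

def decode(first: int, second: int) -> tuple[int, int]:
    """Split two program bytes back into (opcode, 10-bit address)."""
    opcode = first >> 2
    address = ((first & 0x03) << 8) | second
    return opcode, address

EXAMPLE_OPCODE = 0x10  # hypothetical opcode number, for illustration only
encoded = encode_abs(EXAMPLE_OPCODE, 0x1FF)
print(encoded.hex())
print(decode(encoded[0], encoded[1]))
```

For an immediate instruction the second byte is read as the 8-bit value directly, so only the `opcode` half of `decode` applies.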
|PC||program counter (10 bit)|
|SP||stack pointer (6 bit)|
|A||accumulator (8 bit)|
|X||index register (8 bit)|
|SR||status register [carry, zero] (2 bit)|
|Stack||LIFO, top down, 64 entries, $3C0 – $3FF|
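A 6-bit stack pointer covering $3C0 to $3FF could behave roughly as in the sketch below. The source gives only the size, location, and top-down direction. The reset value of $3F and the 6502-style "write, then decrement" push order are assumptions, and the class and method names are hypothetical.

```python
STACK_BASE = 0x3C0  # first address of the 64-entry stack region
SP_MASK = 0x3F      # stack pointer is 6 bits wide

class Stack:
    """Minimal model of a top-down, 64-entry stack (illustrative only)."""

    def __init__(self, memory: bytearray):
        self.memory = memory
        self.sp = 0x3F  # assumed reset value: stack empty, pointing at $3FF

    def address(self) -> int:
        # The 6-bit pointer selects one byte within $3C0-$3FF.
        return STACK_BASE | self.sp

    def push(self, value: int) -> None:
        # Assumed order: store at the current slot, then move down.
        self.memory[self.address()] = value & 0xFF
        self.sp = (self.sp - 1) & SP_MASK

    def pull(self) -> int:
        # Reverse of push: move up, then read.
        self.sp = (self.sp + 1) & SP_MASK
        return self.memory[self.address()]

memory = bytearray(1024)
stack = Stack(memory)
stack.push(0x12)
stack.push(0x34)
print(hex(stack.pull()), hex(stack.pull()))
```

Because the pointer is masked to 6 bits, a 65th push in this model wraps around and overwrites the oldest entry rather than spilling outside the stack region.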
Tiny Device implements bank switching, address decoding, a PS/2 keyboard interface, serial input and output, a parallel LCD driver, tick counter, clock division, a general-purpose parallel port, and an I/O status register. More details about Tiny Device’s functions can be found in the Tiny Device introductory post. The bank switching mechanism is described here.
With its 10-bit address bus, the CPU sees 1K of memory. This is divided into two 512-byte blocks. Block 1 contains the stack, I/O ports, and a scratch RAM area. It is the “common” block, and is always present in the CPU’s address space no matter what is happening with bank switching. In contrast, block 0 is a swappable memory area, and can be mapped to any bank in physical memory.
Physical memory is 64K, and is divided equally between ROM and RAM. The 64K physical memory space is partitioned into 128 banks of 512 bytes each. Any bank can be mapped into block 0. Bank 127 is always mapped into block 1, the common block.
The bank select register is part of the memory-mapped I/O ports in common memory. To swap a bank, the CPU only needs to write the new bank number to the appropriate address.
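The mapping from a 10-bit CPU address to a 16-bit physical address can be sketched as follows. The function and constant names are hypothetical, and masking the bank number to 7 bits is an assumption about how out-of-range writes would behave. The block and bank sizes come from the description above.

```python
BLOCK_SIZE = 512    # bytes per block and per bank
COMMON_BANK = 127   # bank permanently mapped into block 1

def physical_address(cpu_address: int, bank_select: int) -> int:
    """Translate a 10-bit CPU address into a 64K physical address."""
    if not 0 <= cpu_address < 1024:
        raise ValueError("CPU address must fit in 10 bits")
    offset = cpu_address & 0x1FF  # position within the 512-byte block
    if cpu_address & 0x200:
        # Block 1: always the common bank, regardless of bank_select.
        bank = COMMON_BANK
    else:
        # Block 0: whichever bank the bank select register names.
        bank = bank_select & 0x7F  # 128 banks; masking is an assumption
    return bank * BLOCK_SIZE + offset

print(hex(physical_address(0x000, 0)))  # start of bank 0
print(hex(physical_address(0x1FF, 2)))  # last byte of bank 2
print(hex(physical_address(0x3FF, 5)))  # common block ignores bank_select
```

In this model the bank number simply supplies the upper seven bits of the physical address, which is why ROM and RAM banks need no special handling.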
A few benefits of this bank switching design are:
- Upon reset, bank 0 is mapped to block 0. That puts 512 bytes of ROM, 440 bytes of RAM, the I/O ports, and the stack all in the CPU’s address space. That’s plenty for many small programs, and means they won’t have to bother about bank switching at all.
- Larger programs (lots of program code) can be accommodated by bank switching code segments in/out of block 0, all operating on common data in block 1.
- Programs operating on large data structures can copy some bank-switching helper code to block 1, then swap additional RAM banks in/out of block 0.
- Arguments can be passed on the stack to ROM helper routines in other banks, because the stack is in common memory.
- All of ROM is addressable, with no holes. This makes storing images, audio samples, and other data in ROM much easier.
- There is no difference in handling between ROM and RAM banks. A program running entirely from RAM works just like one whose code is in ROM.
A custom-designed circuit board holds the Altera Max II EPM570T, 512KB Flash ROM, and 32KB SRAM that form the heart of the computer. A 1.8-inch color TFT on a breakout board serves as the display. A piezo speaker and two LEDs provide opportunities for simple I/O. Headers for JTAG, serial, and a PS/2 keyboard enable connections to other devices or a PC. Because the serial interface and keyboard operate at 5V, a 74LVC08 is used to level shift to 3.3V for communication with the other components. A 20-pin expansion header exposes unused Max II pins to provide additional I/O opportunities.