Archive for the 'Plus Too' Category
Mouse Freeze Debugging

Last October, Plus Too first booted successfully into the Macintosh Finder. Ever since then, it’s exhibited an intermittent mouse freezing bug. The FPGA Mac replica runs normally for a few minutes, during which the mouse works and it’s possible to exercise menus, run programs, and do everything else you’d expect from a working Mac. But after a few minutes of activity, the mouse pointer invariably freezes in one spot, and the computer seems to halt. The bug appears to be related to mouse movements: faster, more frequent mouse movements cause the problem to appear sooner. If the mouse isn’t moved, Plus Too will happily run for hours without problems.
In October I was already tired from the work needed to get Plus Too to that point, and had no desire to chase the mouse freeze problem further at that time. The project sat idle while I turned my attention to other things, and saw no further progress until this week. That’s when I decided it was finally time to track down the cause of the mouse freeze bug.
Mouse Interrupts
Macintosh mouse handling requires two different interrupts in order to work correctly. When the user moves the mouse, the SCC triggers a level 2 CPU interrupt to read the new position data. The interrupt handler adjusts a low memory global variable called MTemp to set the new on-screen mouse pointer position. Then every 1/60th of a second during the VBLANK interval (video retrace), the VIA triggers a level 1 CPU interrupt. The VBLANK interrupt handler erases the on-screen mouse pointer from its old position, and redraws it at the new position indicated by MTemp.
When the Plus Too mouse froze, I found that the level 2 SCC interrupt was still getting called normally, and MTemp was being adjusted correctly. However, the level 1 cursor VBL task was not getting called, so the mouse pointer was never redrawn at the new position. Further investigation showed that no other VBL tasks were getting called either. In fact, no level 1 VIA interrupts of any type were being processed. At first I thought this might be a problem with the Verilog code that implements my VIA replica, but I found that the VIA was still asserting its IRQ line, and the CPU was simply ignoring it. Why?
According to the CPU status register, when a mouse freeze occurs, the CPU is permanently stuck with its current interrupt priority level at 1, instead of its normal value of 0. Because interrupts equal to or below the current IPL will be ignored, no VIA interrupts are ever processed, so the mouse VBL task never gets called. Level 2 SCC interrupts can still pre-empt the CPU, so MTemp gets updated correctly, but when the level 2 interrupt handler completes it returns to whatever the CPU was previously doing at level 1.
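As a rough illustration of the masking rule described above, here's a small Python model of how the 68000 decides whether to service a pending interrupt. The function names and the example status register values are made up for this sketch; the only facts it relies on are that the interrupt mask lives in bits 10-8 of the status register and that level 7 can't be masked.

```python
def ipl_from_sr(sr: int) -> int:
    """Extract the 3-bit interrupt priority mask (bits 10-8) from a 68000 status register."""
    return (sr >> 8) & 0b111


def interrupt_accepted(request_level: int, sr: int) -> bool:
    """Return True if a pending request at request_level would be serviced.

    Level 0 means no request. Level 7 is non-maskable on the 68000; every
    other level must be strictly greater than the current mask.
    """
    if request_level == 0:
        return False
    return request_level == 7 or request_level > ipl_from_sr(sr)


# Hypothetical status register values: supervisor bit set, mask 0 or 1.
NORMAL_SR = 0x2000  # mask = 0, the usual state
STUCK_SR = 0x2100   # mask = 1, the state seen during a mouse freeze

if __name__ == "__main__":
    for name, sr in (("normal", NORMAL_SR), ("stuck", STUCK_SR)):
        print(name,
              "VIA (level 1):", interrupt_accepted(1, sr),
              "SCC (level 2):", interrupt_accepted(2, sr))
```

With the mask stuck at 1, the model ignores the level 1 VIA request while still accepting the level 2 SCC request, which matches the symptoms: MTemp keeps updating but the pointer is never redrawn.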
Stuck at Interrupt Priority Level 1
So how might the CPU get stuck at IPL 1? How does it get to IPL 1 in the first place? The normal way IPL 1 is reached is during a level 1 VIA interrupt handler, when the CPU sets the IPL automatically. These handlers normally do some processing and then return, which automatically restores the IPL to 0. This means one way the CPU could get stuck at IPL 1 would be if a level 1 interrupt handler went into an infinite loop and never returned. Looking at the level 1 interrupt handlers in the Mac Plus ROM, there are:
- One Second timer – From inspecting the code, this is a trivial handler, and will always return.
- VBLANK – This handler explicitly sets the IPL back to 0, so it can be pre-empted by other level 1 interrupts.
- Timer 1 and keyboard – I haven’t implemented these interrupts in the VIA yet, so they can never occur.
- Timer 2 – This is the only VIA interrupt yet implemented whose ROM handler might conceivably fail to return.
- System handlers – After booting the Mac, the system software might install new VIA interrupt handlers or patch the ones in ROM, creating additional opportunities for handlers that don’t return. Unfortunately I have no good way to test that further.
In addition to a non-returning level 1 interrupt handler, the other way the CPU could get stuck at IPL 1 is if some code explicitly sets the IPL to 1. From looking at a disassembly of the ROM code, several routines definitely do this when modifying global lists: VInstall, PostEvent, OSEventAvail, FlushEvents. The Sony floppy driver also explicitly sets the IPL to 1 in at least two cases. There are also many examples in the ROM code where the IPL is set using a value passed in a register or on the stack, where I can’t say for certain what value it’s being set to. And as before, the system software loaded from disk might contain additional code that directly manipulates the IPL, which I wouldn’t see in the ROM disassembly.
The best way to determine what’s happening would be to wait until the mouse freezes, then pause the CPU when it’s stuck at IPL 1, and look at what code it’s executing. I’ve attempted to do just that, but I lack good tools for software debugging (as opposed to debugging the Verilog hardware model), and I haven’t been able to learn anything very useful. Whenever I interrupt the CPU, it’s either executing some system code in RAM that was loaded from disk, or some fairly innocuous piece of ROM code like the trap dispatcher. I haven’t been able to determine any higher level purpose to the code that suggests what it’s trying to do or why it never exits IPL 1.
Finding a Fix
One path might be to add MacsBug to my system disk image, then invoke it when the mouse freeze occurs, and examine the stack trace and disassembly in an attempt to learn more. MacsBug requires the use of a keyboard, though, and I haven’t yet implemented the keyboard hardware. Even if the keyboard worked, I’m reluctant to start into debugging random pieces of system software that I know nothing about, but maybe that’s unavoidable.
Another possibility is to determine what was the most recent time the IPL was changed from 0 to 1. That might not be enough information to solve the problem, but it would be a start. I might be able to find that info using Altera’s Signal Tap logic analyzer, or maybe I could modify the Verilog machine model to keep track of the IPL changes for me.
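To make the idea concrete, here's a minimal Python sketch of the kind of post-processing that could be run on a captured trace, whether it comes from Signal Tap or from extra logic in the Verilog model. It assumes the capture can be exported as a sequence of (program counter, status register) samples in execution order; that export format, the function name, and the addresses in the example are all hypothetical.

```python
from typing import Iterable, Optional, Tuple


def last_ipl_raise(trace: Iterable[Tuple[int, int]],
                   low: int = 0, high: int = 1) -> Optional[Tuple[int, int]]:
    """Find the most recent sample where the interrupt mask went from low to high.

    trace yields (pc, sr) pairs in execution order. Returns (sample_index, pc)
    for the last such transition, or None if it never happened.
    """
    last = None
    prev_ipl = None
    for index, (pc, sr) in enumerate(trace):
        ipl = (sr >> 8) & 0b111  # interrupt mask is bits 10-8 of SR
        if prev_ipl == low and ipl == high:
            last = (index, pc)
        prev_ipl = ipl
    return last


if __name__ == "__main__":
    # Made-up trace: the mask rises to 1 twice and only comes back down once.
    example = [
        (0x400100, 0x2000),
        (0x400200, 0x2100),  # first 0 -> 1
        (0x400204, 0x2000),
        (0x001234, 0x2100),  # second 0 -> 1, never restored
        (0x001238, 0x2100),
    ]
    print(last_ipl_raise(example))
```

The address reported for the last 0-to-1 transition would at least narrow down which handler or routine raised the mask before the freeze.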
My hunch is that some piece of code is going into an infinite loop while trying to access a piece of hardware I haven’t yet implemented, like VIA timer 1, the keyboard, serial port, sound hardware, or PRAM. If all else fails, I could just keep adding more hardware to my Verilog model, and see if the mouse freeze problem disappears at some point. One intriguing clue is that the mouse problem is much more difficult to reproduce when the General control panel is in the foreground. This control panel sets the date and time, sound volume, and other settings that are stored in PRAM. With PRAM not yet implemented, the control panel behaves oddly, and the system time never advances beyond 12:00:00 AM. Perhaps the General control panel is constantly attempting to read or write PRAM, which somehow affects the likelihood of the mouse freeze bug occurring? It’s little more than a wild guess, but PRAM is as good a place as any to start implementing more hardware.
One thing I can’t explain is why frequent rapid mouse movements appear to cause the freeze problem, since my investigations suggest the frozen mouse pointer is merely a symptom of VIA interrupts not getting processed, rather than a cause of anything. Since mouse movements generate a level 2 SCC interrupt, maybe there’s a bug in my design that occurs when a level 2 interrupt pre-empts a level 1 interrupt under certain conditions, or when both interrupts are triggered at the same time. There are some bugs in my mouse implementation as well, which appear to cause a backlog of mouse updates in some situations. I’d assumed these were unrelated to the freezing problem, but maybe I should try getting to the bottom of those first. I wish I had a clearer idea of how to proceed, instead of just clutching at straws!
Plus Too Interrupt Bug
Mark McDougall (tcdev) discovered what looks like a serious bug in the way Plus Too handles interrupts. It appears my design causes the 68000 CPU to use the wrong interrupt handler vector! How it could work at all under those circumstances isn’t clear, since I would expect it to crash the moment an interrupt is first triggered, but I fixed the bug anyway. I had hoped it would eliminate the mysterious freeze-ups I’ve been getting with Plus Too after a few minutes of active mouse movements in the Finder, but sadly it didn’t appear to make any difference.
Vectored Interrupts
Here’s what’s happening. Plus Too (and the Macintosh it replicates) uses vectored interrupts. When an interrupt is triggered, the 68000 responds with an interrupt acknowledge cycle. It sets the 24-bit address bus to all 1s, except for A3-A1, which are set to the level of the interrupt being acknowledged. There is no A0 output from the CPU, since it uses upper/lower byte strobes instead. So to acknowledge a level 1 interrupt (the VIA), the CPU would set the address bus to:
1111 1111 1111 1111 1111 001x
with x being the invisible A0 bit. The memory interface (in this case, my Plus Too design) is supposed to respond by placing an interrupt vector number on the 16-bit data bus. The CPU then multiplies the vector number by 4 internally, in order to get the memory address of the interrupt vector. It then uses that vector to find the location of the interrupt handler routine to execute.
In the case of the Macintosh, the external interrupt handlers begin with vector number $18, which when multiplied by 4 is the vector found at memory address $60. The level 1 VIA interrupt vector is number $19 found at $64, and so on. So to select the proper vector, the memory interface should respond to interrupt acknowledge cycles with $18 + the interrupt level.
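The arithmetic above is easy to check with a few lines of Python. This is only a sketch of the intended behavior as described here, not the Plus Too Verilog; the function names are invented, and the acknowledge address in the example assumes A0 reads as 0.

```python
EXTERNAL_VECTOR_BASE = 0x18  # first external interrupt vector number, per the text


def iack_level(address: int) -> int:
    """Recover the interrupt level from A3-A1 of an interrupt acknowledge address."""
    return (address >> 1) & 0b111


def vector_number(level: int) -> int:
    """Vector number the memory interface should return for a given level."""
    return EXTERNAL_VECTOR_BASE + level


def vector_address(number: int) -> int:
    """Memory address of a vector: the CPU multiplies the vector number by 4."""
    return number * 4


if __name__ == "__main__":
    iack = 0xFFFFF2  # all ones except A3-A1 = 001, with A0 taken as 0
    level = iack_level(iack)
    number = vector_number(level)
    print(f"level {level} -> vector ${number:X} at address ${vector_address(number):X}")
```

For the level 1 acknowledge address this gives vector $19 at address $64, matching the values quoted above.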
A Missing Bit
That’s what I intended to do, but somewhere during development, my Plus Too code lost an address bit. The relevant piece of Verilog code looked something like:
input [1:0] addrLo; // A2-A1
output [15:0] dataOut;
...
assign dataOut = { 13'h3, addrLo }; // use A3-A1 to construct an interrupt number offset from $18
Oops. That code doesn’t do what the comment says. It ignores A3, meaning that interrupt levels 4-7 will never be handled properly. These correspond to the programmer’s debug switch on the Mac. Worse, it generates interrupt numbers that are offset from $C, not $18. So for interrupt level 1 (the VIA), it will generate a response of interrupt number $D, which (multiplied by 4) is at memory address $34.
According to my docs, $34 is an unassigned/reserved vector. In fact, all the vectors from $30 to $5C are reserved or unassigned. So how does that work at all? Why doesn’t it crash the moment a VIA interrupt is first triggered? Is it possible that the reserved vector entry just happens to contain the right value somehow? That seems very unlikely.
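Here's a small Python model of the Verilog concatenation, handy for checking which vector each interrupt level actually lands on. It models only the { 13'h3, addrLo } expression for a 2-bit and a 3-bit addrLo; the helper names are made up for illustration.

```python
def concat_response(addr_bits: int, width: int) -> int:
    """Model the Verilog expression { 13'h3, addrLo } for an addrLo that is `width` bits wide.

    The constant 3 ends up shifted left by the width of addrLo, so a 2-bit
    addrLo gives a base of $C and a 3-bit addrLo gives a base of $18.
    """
    mask = (1 << width) - 1
    return (0x3 << width) | (addr_bits & mask)


def buggy_vector(level: int) -> int:
    """Vector number with addrLo = A2-A1 only (A3 is dropped)."""
    return concat_response(level, 2)


def fixed_vector(level: int) -> int:
    """Vector number with addrLo = A3-A1."""
    return concat_response(level, 3)


if __name__ == "__main__":
    for level in range(1, 8):
        b, f = buggy_vector(level), fixed_vector(level)
        print(f"level {level}: buggy ${b:X} (address ${b * 4:X}), "
              f"fixed ${f:X} (address ${f * 4:X})")
```

Running it shows the two effects side by side: the base moves from $18 down to $C, and levels 4-7 alias onto the same vectors as levels 0-3 because A3 never reaches the expression.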
Fixed?
The fix is pretty simple: addrLo should be three bits instead of two, and contain A3-A1. I made this change, and Plus Too behaves no differently than before as far as I can tell. It still mostly works, but exhibits frequent freeze-ups after a few minutes of use, which seem to be related to mouse movements somehow. Maybe the two problems are totally unrelated, but I’d hoped the interrupt vector problem might explain the freeze-ups.
I still can’t explain how Plus Too ever worked before, with external interrupt numbers given the wrong offset.
A Working Hardware Replica of the Mac Plus

Plus Too is a home-made replica of a classic Macintosh computer using an FPGA. The project reached a major milestone yesterday, booting to the Finder for the first time, and running several programs. Since then I’ve been getting many inquiries, and because not everyone’s been following the project since its beginning, I’ve created a Plus Too project summary page to document the progress so far and my plans for the next version. If you’re new to Plus Too, please begin by reading the project summary page.
The screenshot above shows a MacWrite document opened with Plus Too. Because there’s no keyboard support yet, the document was created on another computer and added to the Plus Too boot disk image. Below the Macintosh screen region, hardware debugging information is displayed in green. This debug overlay is possible because a pixel-doubled 512 x 342 Mac image conveniently leaves some extra vertical space on a 1024 x 768 VGA display. From left to right, the debugging information shows the current state of the CPU address bus, data bus in, data bus out, address strobes, previous address, and breakpoint address. A poor-man’s breakpoint system is implemented by setting a breakpoint address with panel switches. When the address bus matches the breakpoint address, the CPU’s memory transfer acknowledge signal is withheld, effectively pausing the CPU.
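The breakpoint trick can be modeled in a few lines of Python. This is a behavioral sketch only, not the actual Verilog: the function names, the `enabled` switch, and the addresses in the example are assumptions made for illustration.

```python
from typing import Iterable, List


def transfer_acknowledged(address_bus: int, bp_address: int, enabled: bool = True) -> bool:
    """Return True if the memory transfer acknowledge should be given to the CPU.

    The acknowledge is withheld while the address bus matches the breakpoint
    address, which stalls the CPU in the middle of that bus cycle.
    """
    return not (enabled and address_bus == bp_address)


def run_until_break(bus_addresses: Iterable[int], bp_address: int) -> List[int]:
    """Return the bus cycles that complete before the CPU stalls at the breakpoint."""
    completed = []
    for address in bus_addresses:
        if not transfer_acknowledged(address, bp_address):
            break  # the CPU waits here until the breakpoint is changed or disabled
        completed.append(address)
    return completed


if __name__ == "__main__":
    accesses = [0x400000, 0x400002, 0x400004, 0x400006]  # made-up ROM fetches
    print([hex(a) for a in run_until_break(accesses, bp_address=0x400004)])
```

Because the stall happens on any bus cycle to the matching address, this style of breakpoint triggers on data reads and writes as well as instruction fetches.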

The current system is implemented entirely with an unmodified Altera DE1 FPGA development board. The next version will use a custom-designed circuit board instead of the DE1 kit. The revised Plus Too will use a real 68000 CPU, and will add a microcontroller to manage the floppy disk SD card interface. It will also add the physical connectors necessary to use a real Mac Plus mouse and keyboard if desired.
What Works
The current system recreates a computer similar to a Mac 512Ke, with 512K of RAM and no SCSI. It boots from a System 6.0.8 floppy disk image stored in ROM. The disk image is pre-encoded into a series of virtual tracks and sectors, with the proper low-level layout, header, footer, checksum, and GCR disk byte format. This encoding is performed offline, using a custom-made utility program. Applications can be launched from the disk and run normally.
What Doesn’t Work
The disk is read-only, and there’s no keyboard, sound, SCSI, serial ports, real-time clock, or parameter RAM. The planned SD card interface for loading disk images hasn’t yet been built. There are some obvious stability problems, and the system tends to freeze up if the mouse is moved too rapidly. Disk I/O seems strangely slow, slower even than on a real Mac 512Ke or Mac Plus. There’s a long, long way to go before this project could be considered “done”, but it’s an exciting start!
Plus Too – Hello World!
Plus Too works! Holy cow, it really works. Hot damn!



I didn’t want to tackle SD card loading yet, so the GCR-pre-encoded 800K disk image resides in ROM, just above the Macintosh ROM image. The floppy drive module uses the video module’s memory access time slot during hblank periods to load disk data, transparently to the CPU.
Plus Too ran for about five minutes while I took these photos, then it locked up. Not bad for the first boot.
Now, to celebrate with a cold beer!
Happy Mac

Things are starting to warm up now. I’ve successfully booted Plus Too as far as the Happy Mac startup icon! It’s not booting all the way into the Finder yet, but most of the tricky business with the IWM, floppy, and disk encoding schemes has been proven to work. Hooray!
If you can tolerate some shaky-cam video, here’s a movie that demonstrates Plus Too’s current capabilities. It shows booting to the question-mark disk screen, moving the mouse, inserting a blank disk, ejecting a disk, inserting a fragment of a System 3.3 startup disk, and the Happy Mac.
I’ve divided all of Plus Too’s disk-related functions into three parts: IWM, drive, and disk. The IWM is the floppy disk controller chip in the classic Macintosh, and my model of the IWM is finished and working. The Plus Too drive model replicates the brains of a 3.5 inch Sony floppy drive, which has sixteen 1-bit status and control registers. The drive model is mostly done, but there are still a few functions related to disk swapping and disk writing that are incomplete. The disk model replicates the GCR encoded data format of a 3.5 inch Macintosh disk, and is where more work is needed in order to boot to the Finder.
Plus Too is intended to load 400K/800K disk images from an SD card, perform on-the-fly GCR encoding for each sector, and pass the result to the drive model and IWM. That part isn’t working yet, so I took some shortcuts in the test shown in the video. The GCR encoding was done offline with a Windows PC, using a custom program I wrote. Then five sectors of the encoded data were stored in a block ROM inside the FPGA itself. Five sectors isn’t much, but it’s all I had space for, and it’s enough for the Mac to recognize a boot disk and show the Happy Mac icon. Testing the boot sequence this way enabled me to confirm that the GCR encoding algorithm is correct, and that the IWM and drive models are working, even before the SD card interface and on-the-fly encoding module are ready.
Next Steps
The next logical step is to implement an SD card reader interface, so I can load encoded data from the card instead of from the limited FPGA memory. Once that’s done, I should be able to boot all the way to a working Finder. For a read-only system I technically don’t need to do any more than that, but doing the encoding on the fly instead of with an offline tool would be much nicer. To support disk writes, on-the-fly encoding (and decoding) will be a necessity. The encoding and decoding algorithm is somewhat complex, and I’m unsure whether to attempt to design a Verilog state machine to do it, or incorporate a simple microcontroller core (maybe even Tiny CPU) and do it with a conventional program instead.
Other Concerns
There are all kinds of timing problems and glitches hiding just beneath the surface, and I’m worried. Every now and then I’ll make a change that causes Plus Too to exhibit broken behavior or fail to boot, even some innocuous change that definitely doesn’t affect the logic. Just today I made a change that caused an unexplained boot failure, and in the latest version I get random mouse droppings when the mouse is in a certain area of the screen.
Usually if I rearrange some modules or make some other superficial change, the problem will disappear, but that’s a very scary situation. There’s no doubt I need to master the Altera timing constraints editor to sort it all out, but my earlier attempts to make sense of it were dismal failures. Unfortunately, it doesn’t seem to be possible to translate a statement like “external signals D15-D0 must be valid no more than 50ns after the clock edge” into a simple constraint that I can enter somewhere. The whole system seems geared toward me writing custom Tcl scripts, which so far I’ve refused to do. Reading through the documentation, my eyes quickly glaze over and I wonder again why this all has to be so complicated.
A few concerns remain in the drive subsystem as well. With my current test setup, I always return the same five sectors of data, regardless of what track or side is actually being accessed. It’s possible there’s some hidden complexity there that I’ll need to address, or that I’ll incorrectly map disk image data to the wrong sectors, or that the method I’m using to determine what track and sector is being accessed isn’t even valid. This is a fairly small detail, though, and I’m optimistic I’ll be able to extend the current model to support all tracks and sides without major problems.
Crazy Disk Encoding Schemes
Wow. I expected the details of the Macintosh floppy data encoding to be a bit complex, but this is worse than I expected. I think I finally understand it well enough to duplicate it, but I can’t explain why it does what it does. Maybe whoever extended Woz’s code from the Apple II was just in a bad mood.
I’ve been focusing my attention on how a single sector’s data is represented on the disk. Most of it is fairly easy to understand, once you’ve found a reference. Each sector consists of an address block and a data block. Between the blocks are $FF sync bytes. The address block begins with a specific header ($D5AA96, a sort of secret password for old Apple II hackers), then five encoded bytes containing the disk format, track number, sector number, and a checksum, and ends with a specific trailer sequence. The data block begins with a different header, then the sector number, the encoded sector data, and a trailer sequence.
It’s the “encoded data” step where things start to get tricky. Logical data bytes must be encoded into disk data bytes before being written to disk. This is due to physical limitations of the magnetic disk media: bytes with too many consecutive zero bits cannot be stored reliably. Of the 256 possible byte values, only 64 (or is it 67?) values can be stored on disk reliably, so the Mac encodes six bits of logical data at a time into one of the 64 “safe” disk byte values, in a process called 6-and-2 GCR encoding. There’s a 64-entry table in the Macintosh ROM for converting six bits of logical data into the corresponding disk byte, which was often called a nibble (even though it’s not 4 bits). When reading a sector, the process is applied in reverse.
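As a sketch of the table lookup step on its own, here's a short Python example that maps 6-bit values to disk bytes and back. The table below is the 6-and-2 table as it's commonly published for Apple disk formats, not one read out of the ROM for this project, so treat its contents as an assumption to verify. The function names are invented for the example.

```python
# 6-and-2 GCR table as commonly published for Apple disk formats.
# Assumption: not extracted from the Mac ROM here; verify before relying on it.
GCR_TABLE = bytes([
    0x96, 0x97, 0x9A, 0x9B, 0x9D, 0x9E, 0x9F, 0xA6,
    0xA7, 0xAB, 0xAC, 0xAD, 0xAE, 0xAF, 0xB2, 0xB3,
    0xB4, 0xB5, 0xB6, 0xB7, 0xB9, 0xBA, 0xBB, 0xBC,
    0xBD, 0xBE, 0xBF, 0xCB, 0xCD, 0xCE, 0xCF, 0xD3,
    0xD6, 0xD7, 0xD9, 0xDA, 0xDB, 0xDC, 0xDD, 0xDE,
    0xDF, 0xE5, 0xE6, 0xE7, 0xE9, 0xEA, 0xEB, 0xEC,
    0xED, 0xEE, 0xEF, 0xF2, 0xF3, 0xF4, 0xF5, 0xF6,
    0xF7, 0xF9, 0xFA, 0xFB, 0xFC, 0xFD, 0xFE, 0xFF,
])

# Reverse lookup used when reading: disk byte -> 6-bit value.
GCR_REVERSE = {disk_byte: value for value, disk_byte in enumerate(GCR_TABLE)}


def gcr_encode(six_bit_values):
    """Map each 6-bit value (0-63) to its disk byte."""
    for value in six_bit_values:
        if not 0 <= value < 64:
            raise ValueError(f"not a 6-bit value: {value}")
    return bytes(GCR_TABLE[value] for value in six_bit_values)


def gcr_decode(disk_bytes):
    """Map disk bytes back to 6-bit values, rejecting bytes outside the table."""
    try:
        return [GCR_REVERSE[b] for b in disk_bytes]
    except KeyError as exc:
        raise ValueError(f"invalid disk byte: {exc.args[0]:#04x}") from exc


if __name__ == "__main__":
    encoded = gcr_encode([0, 1, 62, 63])
    print(encoded.hex(), gcr_decode(encoded))
```

Note that every table entry has its high bit set, and that the header bytes $D5 and $AA are deliberately absent, so they can't be mistaken for data.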
All of this I more-or-less already knew before I began. I expected to find a routine somewhere in ROM that grabs three logical data bytes at a time (24 bits), and shifts out six bits at a time, using the GCR lookup table to produce four disk bytes. Once I found the disk sector write routine, it became clear it does more than that. My first hint was this French page about Apple II DOS 3.3, whose low-level disk format is very similar to the Mac’s. According to this page, data values aren’t used directly as indexes into the GCR table. Instead, each data byte is XOR’d with the previous byte, and the result is used as the index into the GCR table. Why? This is where I fail, because I have no idea why. It seems somehow related to checksumming the data, but it would be easier to use the data values as direct GCR table indexes, and then use the sum of all data values as a checksum.
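Here's a minimal Python sketch of that running XOR. It follows the description above, where each value is XOR'd with the previous one before the table lookup. The trailing checksum value (the last data value written on its own, so that a reader's running XOR ends at zero) is my assumption about how such a chain can double as a checksum, not something taken from the French page.

```python
def xor_chain_encode(values):
    """XOR each value with the previous one; append the last value as a checksum.

    The outputs are what would be used as indexes into the GCR table.
    The trailing checksum is an assumption made for this sketch.
    """
    out = []
    prev = 0
    for value in values:
        out.append(value ^ prev)
        prev = value
    out.append(prev)  # checksum: last value XOR 0
    return out


def xor_chain_decode(indexes):
    """Undo xor_chain_encode and verify that the running XOR ends at zero."""
    if not indexes:
        raise ValueError("nothing to decode")
    values = []
    prev = 0
    for index in indexes:
        prev ^= index
        values.append(prev)
    if values.pop() != 0:  # the final entry is the checksum position
        raise ValueError("checksum mismatch")
    return values


if __name__ == "__main__":
    encoded = xor_chain_encode([1, 2, 3])
    print(encoded, xor_chain_decode(encoded))
```

Under that assumption, a single corrupted value anywhere in the chain leaves a nonzero residue at the end, which is one possible reason for chaining the XOR instead of indexing the table directly.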
An unexplained running XOR-based index is strange, but I could live with that if it were the only unexplained part. Unfortunately it seems that either the French page is incomplete, or else the Mac encoding method is more complex than Apple DOS 3.3 encoding. I’ve stared at the 68000 assembly code in the ROM routine for quite a while, as well as C re-implementations from MESS and from Ben Herrenschmidt, trying to grasp some kind of high-level purpose in it, but it just seems arbitrary to me.
Instead of XOR-ing each value with the previous one, it XOR’s each value with the sum of all previous values back to the beginning with a stride of three. For example, the 10th value is XOR’d with the sum of the 9th, 6th, 3rd, and 0th values. To facilitate this, three running sums are maintained for the values on the 0th, 1st, and 2nd stride. But wait, it’s more complicated than that. After every 3 logical bytes, the sum for stride 2 is rotated left one bit position, and the bit that’s rotated out is added into the 0th stride sum, and any overflow there is added into the 1st stride sum, whose overflow is added to the second. Or something like that. It’s all a little crazy.
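To pin down at least the first half of that description, here's a deliberately simplified Python sketch. It implements only the three stride sums and the XOR, exactly as worded above, and leaves out the rotate and carry propagation entirely. It is not the ROM algorithm and won't produce Mac-compatible data; the function names are invented for illustration.

```python
def stride_xor_encode(data):
    """XOR each byte with the running sum of the stride holding the previous byte.

    Simplified: omits the rotate-left and carry steps of the real routine.
    """
    sums = [0, 0, 0]  # running 8-bit sums for strides 0, 1, 2
    out = []
    for i, byte in enumerate(data):
        out.append(byte ^ sums[(i - 1) % 3])
        sums[i % 3] = (sums[i % 3] + byte) & 0xFF
    return out


def stride_xor_decode(encoded):
    """Invert stride_xor_encode by rebuilding the same sums from decoded bytes."""
    sums = [0, 0, 0]
    out = []
    for i, value in enumerate(encoded):
        byte = value ^ sums[(i - 1) % 3]
        out.append(byte)
        sums[i % 3] = (sums[i % 3] + byte) & 0xFF
    return out


if __name__ == "__main__":
    original = list(range(1, 13))
    encoded = stride_xor_encode(original)
    print(encoded)
    print(stride_xor_decode(encoded) == original)
```

In this model the value at index 10 is XOR'd with the sum of the values at indexes 9, 6, 3, and 0, matching the example in the text. Because the sums are built from the logical bytes, the decoder can rebuild them as it goes.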