
Archive for October, 2013

Fixing 30 Year Old Apple ROM Bugs

After nearly a year of inactivity, I’ve started work on Floppy Emu again! One of my first priorities was compatibility with Macs that have a 400K floppy drive – the original Mac 128K, and the Mac 512K (not the 512Ke). Floppy Emu emulates a 400K/800K external floppy drive, and it works fine with 400K disk images, so I originally assumed it would have no problems on those old 400K-based machines. Wrong! Reports trickled in of mysterious Sad Mac errors and other problems when using Floppy Emu with those oldest Mac models. After ignoring the problem for months, I finally got ahold of a Mac 512K so I could investigate things firsthand.

Some brief experimentation showed that Floppy Emu was at least partly working with the 512K. When I “inserted” a disk image of a non-bootable disk, the Mac rejected it and showed the X’d disk icon. But when I inserted a bootable 400K system disk image, the Mac chewed away for a moment, then died with a Sad Mac error code 0F0004. So it was clear the Mac 512K could recognize the difference between a bootable and a non-bootable disk, but was failing to actually boot when using Floppy Emu. The same disk image and Emu hardware booted fine on a Mac Plus, so the problem looked like an unknown incompatibility between the Mac 512K and Floppy Emu.

The Sad Mac – such a cute way for a computer to die. Much friendlier than a blue screen of death, but just as fatal.

From past experience, I knew the Mac 128K and 512K used a different version of the Apple ROM than found in the 512Ke and Mac Plus. The 512Ke/Plus ROM added support for 800K floppy drives. But as long as only 400K disk images are used, I couldn’t see any reason Floppy Emu shouldn’t work on 128K/512K Macs with the old ROMs. After all, how would the Mac even know that Floppy Emu wasn’t a 400K drive? The real 400K and 800K drives are virtually identical, with the same connector, same internal registers, etc. There are two differences: one is a single-sided drive and the other is double-sided, and the Mac directly controls the speed of a 400K drive with a PWM signal, while an 800K drive ignores the PWM signal and self-regulates its speed.

I hunted the internet for details on 30-year-old boot errors, and found two explanations for error 0F0004. One said “Voltage too Low, adjust voltage to +5.0v.” and another said “Division by Zero”. How could there be two such radically different meanings for the same error? But things started to fall into place after I found this Apple Tech Note, which said that 0F0004 was a result of using an 800K external disk drive on the Mac 128K/512K with the old ROMs. So somehow the Mac was still identifying Floppy Emu as an 800K disk drive, which caused it to die. But how did it know?

 

ROM Diving

When all else fails, it’s time to look at the source code. In this case that meant disassembling the ROM from the 128K/512K to find out what the floppy driver is doing. I’ve done this a few times before now, but it’s still a major pain. Even with a 68K disassembly tool, and substituting symbolic names for all the Mac memory-mapped hardware, it’s still an opaque mess of assembly language code that doesn’t yield its secrets easily. It’s hard enough just to locate the relevant floppy routines, let alone understand the fine details of how they work. But after a day of poking and prodding, I found some code that looked very suspicious:

P_Sony_MakeSpdTbl:
1E82   285F                  Move.L    (A7)+, A4
1E84   343C 0080             Move      $80, D2		; set PWM value to $80
1E88   615C                  Bsr       P50		; measure TACH speed, get speed1 result in D4
1E8A   6B56                  BMI       L309
1E8C   2604                  Move.L    D4, D3		; copy result to D3
1E8E   343C 0100             Move      $100, D2		; set PWM value to $100
1E92   6152                  Bsr       P50		; measure TACH speed, get speed2 result in D4
1E94   6B4C                  BMI       L309
1E96   2A04                  Move.L    D4, D5		; copy result to D5
1E98   9A83                  Sub.L     D3, D5		; D5 = difference between speed1 and speed2
1E9A   E38B                  LsL.L     $1, D3
1E9C   7C04                  MoveQ.L   $4, D6
1E9E   4BFA FFC8             Lea.L     DT19, A5
1EA2   6100 FCA2             Bsr       Sony_SetupSonyVars
1EA6   47F1 101A             Lea.L     $1A(A1,D1.W), A3
1EAA   7400       L304:      MoveQ.L   $0, D2
1EAC   341D                  Move      (A5)+, D2
1EAE   2E02                  Move.L    D2, D7
1EB0   D45D                  Add       (A5)+, D2
1EB2   E24A                  LsR       $1, D2
1EB4   D484       L305:      Add.L     D4, D2
1EB6   9483                  Sub.L     D3, D2
1EB8   6A02                  BPL       L306
1EBA   7400                  MoveQ.L   $0, D2
1EBC   EF8A       L306:      LsL.L     $7, D2
1EBE   6702                  BEQ       L307
1EC0   84C5                  DivU      D5, D2		; divide D2 by (speed2 - speed1)

Comments were written by me, after analyzing the code. This particular routine does some kind of calibration of the floppy drive – it varies the PWM signal, then measures the resulting drive speed as indicated by a value called TACH. I think it’s trying to establish a linear relationship between PWM and TACH, since that relationship may vary slightly between real 400K drives. There’s a lot going on in this routine, and I’ve truncated it to only show the first 27 instructions. But notice it contains a DivU instruction? There aren’t many places that division is used in the original Mac ROM, so that’s significant.

Looking deeper, the routine makes two drive speed measurements, then does some math to compute a value in D2, then finally divides D2 by the difference between the two speed measurements. But what happens if the two speed measurements were equal? Division by zero! Hello, 30 year old ROM bug.

On a 400K drive that’s controlled by the Mac’s PWM signal, the speed measurements will always have different results, because the PWM is different during each measurement. But on an 800K drive which self-regulates its speed, and on Floppy Emu which has a totally fake speed, the PWM changes will have no effect. That means both speed measurements will get the same result, and the Mac will crash with a division by zero error when it calls this ROM routine. Getting two different speed measurements was probably a safe assumption in 1983/1984 when the code was written, but it still would have been nice to do some defensive programming and add a zero check there, to handle the case of a broken drive or broken assumptions.
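To make the failure concrete, here is a minimal Python sketch of the logic described above. It is a model, not a translation of the ROM: the function names, the speed units, and the numerator value are all made up for illustration, and `measure_tach` is a hypothetical stand-in for the ROM's P50 measurement routine. Only the two PWM values ($80 and $100) and the divide-by-the-difference step come from the disassembly.

```python
def measure_tach(pwm, drive_obeys_pwm):
    """Hypothetical stand-in for P50: return a speed reading for a PWM value.

    The units are arbitrary, not real TACH counts. A drive that obeys PWM
    (a 400K drive) reports a speed that depends on it; a self-regulating
    drive (800K, or an emulator with a fixed fake TACH) does not.
    """
    base_speed = 1000
    return base_speed + pwm if drive_obeys_pwm else base_speed


def calibration_divisor(drive_obeys_pwm):
    """Difference between the two speed readings, as the ROM computes in D5."""
    speed1 = measure_tach(0x80, drive_obeys_pwm)
    speed2 = measure_tach(0x100, drive_obeys_pwm)
    return speed2 - speed1


def calibrate_unguarded(drive_obeys_pwm, numerator=12800):
    """Mimics the ROM's behavior: divide without checking the divisor."""
    return numerator // calibration_divisor(drive_obeys_pwm)


def calibrate_guarded(drive_obeys_pwm, numerator=12800):
    """The defensive version: report failure instead of dividing by zero."""
    divisor = calibration_divisor(drive_obeys_pwm)
    if divisor == 0:
        return None  # the drive did not respond to PWM, so calibration is impossible
    return numerator // divisor
```

With `drive_obeys_pwm=False`, `calibrate_unguarded` raises `ZeroDivisionError`, the Python analogue of the 68000's zero-divide exception, while `calibrate_guarded` returns `None` and lets the caller fall back to something saner.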

 

Fixing It

Once I understood the cause of the 0F0004 error, the question was how to modify Floppy Emu to avoid it. The TACH speed signal that Floppy Emu generates is obviously fake, since there are no moving parts. It calculates how fast the drive motor should be spinning, given which track is being accessed, and creates a series of pulses on TACH at the appropriate rate. To avoid the division by zero crash, the TACH rate needs to vary, so that two successive measurements see different TACH speeds.
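As a rough illustration of the "track in, TACH rate out" calculation, here is a small Python sketch. The numbers are assumptions, not taken from this article or from the Floppy Emu firmware: the five zone speeds are approximate, commonly cited figures for Sony variable-speed drives, and the 60 pulses per revolution is likewise an assumed value. The function names are hypothetical.

```python
# Approximate spindle speed for each 16-track zone, in RPM.
# ASSUMPTION: commonly cited round figures, not values from the article.
ZONE_RPM = [394, 429, 472, 525, 590]

# ASSUMPTION: number of TACH pulses the drive emits per disk revolution.
TACH_PULSES_PER_REV = 60


def nominal_rpm(track):
    """Return the assumed nominal spindle speed for a track number 0..79."""
    if not 0 <= track <= 79:
        raise ValueError("track must be in 0..79")
    return ZONE_RPM[track // 16]


def tach_hz(track):
    """Return the TACH pulse rate in Hz that matches the nominal speed."""
    return nominal_rpm(track) * TACH_PULSES_PER_REV / 60.0
```

Under these assumptions the TACH frequency in Hz happens to equal the speed in RPM, and it only changes when the head crosses into a new zone. For any one track it is constant, which is exactly the property that trips the ROM's calibration routine.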

One solution would be to use the PWM signal from the Mac, since that’s its purpose. By analyzing the PWM duty cycle, the Floppy Emu hardware could infer how fast the Mac wanted the drive to spin, and generate an appropriate TACH to match. Unfortunately, the hardware doesn’t even have the PWM pin connected. And if it did, it’s not certain that it could do the necessary duty cycle and TACH calculations fast enough, or efficiently enough to fit in the remaining logic space.

My solution was to constantly flutter the drive speed TACH signal. The flutter rate must be fast enough that two successive measurements will see different rates, but not so fast that two successive measurements will span the entire flutter cycle and so see the same rate. The flutter amplitude must be large enough for the speed measurements to be different, but not so large that the measured speed falls outside the valid range for the current track being accessed. With a little experimenting, I settled on a flutter cycle period of 640 ms and a flutter amplitude of about 0.25%.
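The trade-off in flutter rate can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the 640 ms period and 0.25% figure come from the text, but the triangle waveform, the reading of 0.25% as a peak deviation from nominal, and the function names are all guesses made for illustration. It also samples the rate at an instant, whereas a real measurement averages over some window.

```python
FLUTTER_PERIOD_MS = 640.0   # from the article
FLUTTER_AMPLITUDE = 0.0025  # 0.25%, ASSUMED to mean peak deviation from nominal


def flutter_factor(t_ms):
    """Return the speed multiplier at time t_ms.

    ASSUMPTION: a triangle wave, +0.25% at the start of each cycle and
    -0.25% at the midpoint. The real hardware may use a different shape.
    """
    phase = (t_ms % FLUTTER_PERIOD_MS) / FLUTTER_PERIOD_MS  # 0.0 up to 1.0
    triangle = 4.0 * abs(phase - 0.5) - 1.0                 # +1 at phase 0, -1 at 0.5
    return 1.0 + FLUTTER_AMPLITUDE * triangle


def fluttered_rate(nominal_hz, t_ms):
    """Return the TACH rate at time t_ms for a given nominal rate."""
    return nominal_hz * flutter_factor(t_ms)
```

Two samples taken a quarter cycle apart (0 ms and 160 ms) give different rates, so the difference the ROM divides by is nonzero. Two samples taken exactly one full cycle apart (0 ms and 640 ms) give identical rates, which is the "too fast" failure case described above.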

And it works! The image above shows the Mac 512K running System 0.97, Finder 1.0, booted from Floppy Emu. Those fonts sure are weird.

 

A Bit of History

When Macintosh external 800K floppy drives first became available, in 1985/1986, owners of the Mac 128K and 512K faced the same problem I did here, only they couldn’t modify the drive’s TACH behavior to work around the ROM bug. Instead, Apple released a system patch called HD20 which fixed the bug and added 800K drive support. But using it was a pain: you had to boot from a 400K floppy in the internal drive first, which contained the HD20 patch, and then you could mount an 800K floppy in the external drive. Booting from an 800K drive wasn’t possible. It wasn’t a very nice solution.

If that ROM routine’s author had added a zero check, this wouldn’t have been necessary. Mac 128K/512K owners could have booted directly from an 800K floppy in the external drive, loading the HD20 init in the process. Everything would have been great. Instead, that divide by zero bug doomed them all to a miserable 800K experience.

When Apple and Sony were developing the 800K external drive, they must have known this was a problem, and they could have used the solution I did to flutter the TACH speed. In 1985 they couldn’t just drop a 25-cent microcontroller into the drive to synthesize TACH, but they could have added a simple RC circuit to inject some AC “noise” into the TACH signal at the appropriate amplitude and period, achieving the same result. Everything would have been great. But they didn’t, and all those 128K/512K owners were forced to endure the 400K floppy boot-swap dance forever.
