
Reverse Engineering the HD20

hd20-disk-unreadable

Finally after nine months, a progress update on HD20 emulation! Back in February I wrote about my efforts to reverse engineer the Apple HD20, an external 20 MB hard drive for the Macintosh that was introduced way back in 1985. This was before the introduction of SCSI support, so the HD20 connected via the Mac’s floppy port. With luck, that might make it possible to emulate an HD20 using the hardware I previously developed for Floppy Emu, the Macintosh floppy drive emulator. 20 MB (or more?) of external storage for a Mac Plus or other vintage machine, stored on an SD memory card, with no dependencies on SCSI or aging rotational disks! But doing it would require understanding how the HD20 worked well enough to emulate it.

 
A Little History

The HD20 has a famously strange reputation. Despite sharing the same DB-19 connector as a floppy drive, it used an entirely different communication protocol. When SCSI support was introduced in 1986, external hard drive solutions quickly adopted the new standard, so the HD20 was the only hard drive to ever use this protocol. The protocol was never documented anywhere publicly, and eventually became an old, obscure mystery. Emulating the HD20 was virtually impossible.

Then about a year ago, a couple of ancient Apple internal documents about “directly connected disks” surfaced on the internet. The term “HD20” didn’t appear anywhere in them, but it was clear that’s what they were about. Two docs dated March 1985 and May 1985 outlined the DCD communication protocol. They described a state-based command-and-response system, in which data was transferred in groups of 7 logical bytes encoded into 8 physical bytes. But the docs conflicted with each other in many details, and were silent on other critical points. And as I later discovered, both docs conflicted with tests performed on a real Macintosh system. Still, it was enough to get started. In my February experiments, I got as far as fooling the Mac into believing an HD20 was connected, and receiving/decoding a drive status query. But I was never able to send back a reply, and I eventually lost interest.

 
New Preparations

codewarrior

Thanks to a slow trickle of people who kept encouraging me to look at it, I recently dusted off the HD20 project again. I could receive a drive status query, but how could I send a reply? What kind of handshaking was needed to tell the Mac a reply was ready? How big was the reply, and what data did it contain? What checksum method did it use? Armed with the DCD docs and some educated guesses, I programmed the Emu hardware to send a drive status response. Then to test it, I wrote a Macintosh program to read arbitrary sectors from disk ID #2, the first connected DCD.

Side note: it’s not a simple thing to write new software for a 30-year-old computer system! In order to create my test program, I used an old copy of Metrowerks Codewarrior, running under emulation in Basilisk II. The finished program was saved to Basilisk’s virtual hard disk image. Then I mounted a virtual floppy disk image in Basilisk, and copied the program to it. After quitting Basilisk, I copied the floppy disk image file to an SD card, put the SD card in a second Floppy Emu (the first one having been turned into an HD20 Emu), booted my Mac Plus with the Floppy Emu attached, mounted the floppy image, and copied the program file to the Plus’s external SCSI disk. Oof!

The Mac really doesn’t like to boot if it thinks there’s a sort-of-working HD20 attached. It either hangs for long periods, or complains endlessly about disk errors. If I used the SCSI drive and managed to boot into the Finder, the Finder would freeze forever while trying to communicate with the emulated HD20, so I could never actually launch my test program. Doh! My solution was to configure the Mac to run the test program immediately at boot-up, instead of running the Finder. With the not-really-an-HD20 disconnected, this can be done by highlighting the test program, then selecting the Finder’s Special -> Set Startup menu.

 
Crawling towards Two-Way Communication

At last I had something to test, and of course it didn’t work. The test program reported error -17, which according to Inside Macintosh means “driver can’t respond to this control call.” I could tell my hardware was receiving the drive status command, and trying to send a reply, but beyond that I was blind. Did the Mac receive the reply, but reject it as misformatted? Was it the wrong size, or encoded incorrectly? Maybe the handshaking wasn’t working, and the Mac wasn’t even seeing any response at all? No matter what I did or what I tried, all I got was error -17, with no further clues to help me troubleshoot.

From another Apple document, I knew there were supposed to be HD20-specific error codes, and I could see places in the Mac ROM disassembly where those errors were generated. They are:

; New HD20 error codes

wrtHsLw         equ     $10             ; HSHK low before starting
wrtHSLwTO       equ     $11             ; Time out waiting for HSHK to go low
wrtHSHighTO     equ     $13             ; Time out waiting for HSHK to go high
rdHsHi          equ     $20             ; HSHK high before starting
rdSyncTO        equ     $21             ; Time out waiting for sync ($AA) byte
rdGroupTO       equ     $22             ; Time out waiting for group
rdHoffSyncTO    equ     $24             ; Time out waiting for sync after holdoff
rdHsHiTO        equ     $25             ; Time out waiting for HSHK high
rdChksumErr     equ     $26             ; Checksum error on response packet
invalidResp     equ     $30             ; First byte in response packet was wrong
sqncNumErr      equ     $31             ; Sequence number in response packet was wrong
dNumberErr      equ     $32             ; Drive number in response packet was wrong
noResp          equ     $40             ; No response packet ever received

Those are some great low-level error codes covering handshaking, checksums, and other details. Why was I only getting error -17, instead of the more detailed codes? With no useful feedback, I continued pounding away at the problem blindly for far longer than I should have, but got nowhere.

Something was generating that -17, and I was determined to find out what. On the early Macs, most of the operating system code is stored in ROM, and this includes the HD20 I/O routines. It’s possible to pull those ROMs, dump the data from them, and run it through a 68000 disassembler. Other people have done this long ago. But the result is only useful to a point – it’s just raw assembly language code, without any symbolic names or comments or any other context to help understand what it’s doing. Just thousands of lines like these:

SubA.L    A4, A4
SubA.L    A5, A5
MoveQ.L   $1, D3
MoveQ.L   $0, D6
MoveQ.L   $0, D7
Move      $64, $1C0(A1)
Move      $4650, $1C2(A1)
Cmp.B     $3, $19C(A1)
BNE       L4809
Move.L    $14C, D7
Move      $A, $1C0(A1)
Move      $2710, $1C2(A1)
Lea.L     $1C4(A1), A4

Ugh. But with slow and painful effort, it’s eventually possible to make at least partial sense of it all. For example, looking backwards from this code, I can see that A1 was previously loaded from memory location $000134, which I know from other Mac programming resources is a pointer to a structure containing drive state, called SonyVars. So all those references like $19C(A1) in the code above are offsets into this SonyVars structure. And from yet another Apple internal document, I learned that offset $19C in SonyVars is the DCD command number. So here’s some code that (in part) checks whether the command number is 3, the drive status command, and branches away somewhere else if it’s not. If it is command 3, it stores the curious number $14C (332 decimal) in another register, which after more tedious analysis turns out to be the size of the expected drive status reply, not counting some extra padding and modulo-7 business that I discovered after still more analysis.

If you feel dizzy, I’ll pause for a moment if you want to lean over and vomit.

Here are the DCD-specific SonyVars offsets and constants that I learned:

sonyVarEnd      equ     $128
; (4) Direct-connect driver locals.

TagSize         equ     20                      ; 20 bytes tags/block
dcdLclLth       equ     28                      ; (use fields through DriveMisc)
drive3          equ     sonyVarEnd              ; first DCD                     
drive4          equ     drive3+dcdLclLth        ; second DCD
drive5          equ     drive4+dcdLclLth        ; third DCD
drive6          equ     drive5+dcdLclLth        ; fourth DCD
stsRtnAddr      equ     drive6+dcdLclLth        ; DCD status call return address
dcdCmd          equ     stsRtnAddr+4            ; command byte to DCD
response        equ     dcdCmd                  ; response byte (command+$80)
seqNum          equ     response+1              ; mb sequence number (sys commands only)
status          equ     seqNum+1                ; returned status bytes
startBlock      equ     status                  ; starting block # (in commands)
driveOut        equ     startBlock+3            ; we send drive number in this field
driveIn         equ     status+2                ; low 6 bits
tagBytes        equ     status+4                ; tag bytes get stuffed here temporarily
devType         equ     tagBytes                ; first 20 bytes of status are written
devManufctr     equ     devType+2               ;       in the 20-byte tag buffer
devChar         equ     devManufctr+2
devBlks         equ     devChar                 ; low 3 bytes of this longword
devSpares       equ     devChar+4
devBadBlks      equ     devSpares+2
devMisc         equ     devBadBlks+2            ; 8 bytes misc
bufSize         equ     tagBytes+tagSize        ; number of bytes/block
lastStatus      equ     bufSize+4               ; last status returned
lastResult      equ     lastStatus+4            ; error type
dcdFlags        equ     lastResult+1            ; flag of whether we've done a reset
chkTime         equ     dcdFlags+1              ; 100usec check ready count
maxTime         equ     chkTime+2               ; maximum number of checks
stsBuffer       equ     maxTime+2               ; 512 bytes of status
devReserve      equ     stsBuffer
devIcon         equ     devReserve+44
devFiller       equ     devIcon+256             ; 32 bytes for optional name
diskVarLth      equ     devFiller+32
stsBufSize      equ     diskVarLth-stsBuffer

; device characteristics byte:
devChEject      equ     4               ; ejectable if 1
devChWP         equ     3               ; write protected if 1
devChIcon       equ     2               ; icon available if 1

; (4) Direct-connect drive constants.
blkSize         equ     512             ; 512 bytes data/block
syncByte        equ     $AA             ; Sync byte for start of transmission

noError         equ     0               ; resultTypes
nonZerStat      equ     1
comErr          equ     2

Applying these offsets and constants to the ROM disassembly, and doing many hours of analysis, I was finally able to construct a halfway decently commented version of the HD20 I/O routines. It’s not pretty, but it’s enough to resolve the conflicts and unanswered questions left by the DCD documents. Take a look, if you dare: macintosh-HD20-io-routines.asm
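As a sanity check, the equates above can be replayed as plain arithmetic and compared against the literal offsets in the disassembly fragment shown earlier. The following is only a sketch: the names are copied from the listing, the enum form and the check_offsets helper are my own, and it assumes nothing beyond the values in the listing.

```c
#include <assert.h>

/* Offsets transcribed from the equates listing; each value is derived
 * from the previous one exactly as the listing does it. */
enum {
    sonyVarEnd = 0x128,
    tagSize    = 20,
    dcdLclLth  = 28,
    drive3     = sonyVarEnd,
    drive6     = drive3 + 3 * dcdLclLth,   /* drive4 and drive5 skipped */
    stsRtnAddr = drive6 + dcdLclLth,
    dcdCmd     = stsRtnAddr + 4,
    seqNum     = dcdCmd + 1,
    status     = seqNum + 1,
    tagBytes   = status + 4,
    bufSize    = tagBytes + tagSize,
    lastStatus = bufSize + 4,
    lastResult = lastStatus + 4,
    dcdFlags   = lastResult + 1,
    chkTime    = dcdFlags + 1,
    maxTime    = chkTime + 2,
    stsBuffer  = maxTime + 2,
    devIcon    = stsBuffer + 44,
    devFiller  = devIcon + 256,
    diskVarLth = devFiller + 32,
    stsBufSize = diskVarLth - stsBuffer
};

/* Hypothetical helper: aborts if the computed offsets disagree with the
 * literal values that appear in the ROM disassembly. */
void check_offsets(void)
{
    assert(dcdCmd     == 0x19C);  /* Cmp.B  $3, $19C(A1)    */
    assert(chkTime    == 0x1C0);  /* Move   $64, $1C0(A1)   */
    assert(maxTime    == 0x1C2);  /* Move   $4650, $1C2(A1) */
    assert(stsBuffer  == 0x1C4);  /* Lea.L  $1C4(A1), A4    */
    assert(stsBufSize == 0x14C);  /* Move.L $14C, D7 (332)  */
}
```

Calling check_offsets() from a test program passes silently, which suggests the listing and the disassembly agree: $1C0 and $1C2 look like chkTime and maxTime, $1C4 is stsBuffer, and $14C is the size of the status buffer.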

 
Checksum Error

This ROM analysis finally revealed one crucial fact: the “missing” HD20 error codes were packed into the most significant byte of the 4-byte value lastStatus, which is set by the disk I/O call. My test program had been displaying this value all along, in decimal format. I had seen lastStatus values like 637534208, but just assumed it was some uninitialized garbage value. But wait, 637534208 in hex is $26000000, and from the table above, $26 is the error code for “checksum error on response packet”. HOLY SHIT! This was my eureka moment, my Rosetta Stone, when I was finally able to get meaningful feedback instead of those blasted error -17’s. I could make changes in the emulator routines, and see changes in the error codes reported by the Mac, and start to do real troubleshooting. Yes!
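The decoding step is simple enough to show in a few lines. This is a minimal sketch, assuming lastStatus is read as an unsigned 32-bit value; the function names are made up for illustration, and the code-to-name mapping is copied from the error table earlier in this post.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The HD20 error code lives in the most significant byte of lastStatus. */
uint8_t hd20_error_code(uint32_t last_status)
{
    return (uint8_t)(last_status >> 24);
}

/* Map a code to its symbolic name from the error table in this post.
 * Anything not in that table is reported as "unknown". */
const char *hd20_error_name(uint8_t code)
{
    switch (code) {
    case 0x10: return "wrtHsLw";
    case 0x11: return "wrtHSLwTO";
    case 0x13: return "wrtHSHighTO";
    case 0x20: return "rdHsHi";
    case 0x21: return "rdSyncTO";
    case 0x22: return "rdGroupTO";
    case 0x24: return "rdHoffSyncTO";
    case 0x25: return "rdHsHiTO";
    case 0x26: return "rdChksumErr";
    case 0x30: return "invalidResp";
    case 0x31: return "sqncNumErr";
    case 0x32: return "dNumberErr";
    case 0x40: return "noResp";
    default:   return "unknown";
    }
}
```

Feeding it the decimal value 637534208 gives code $26 and the name rdChksumErr. Printing lastStatus in hex in the first place would have saved a lot of time.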

A checksum error could be caused by many things. I might have the wrong checksum algorithm, or be putting the checksum value in the wrong spot, or encoding the whole message improperly, or any number of other mistakes. Without being able to see the data as it was received by the Macintosh, it was hard to say what was wrong. I was about to get out my oscilloscope and logic analyzer, but then I took another look at the ROM disassembly. After a drive status response, even if there was a checksum error, the response data should be available at SonyVars+$19C. I already had MacsBug installed on my Plus, so I hit the interrupt switch, and used the dm command to display the region of memory. Ta-dah! There was my response data, seemingly received perfectly:

hd20-macsbug

The first byte was $83, which was the command number plus $80. Afterwards followed a pile of other fields and flags, some of which I filled with sequences of consecutive numbers so I could recognize them in the debugger. But there were a few oddities, like the 14-byte break between 07 and 08 in the sequence on the second and third lines. At first I thought this was a bug in my sending code, but it turns out that the Mac actually stores it this way intentionally. I don’t know why, but from examining the ROM disassembly, it’s clear that after the first 26 bytes are received, it jumps the buffer pointer to a new address and stores the remainder of the data there.
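One possible reading of the equates listed earlier: the first 26 bytes are the 6-byte header (response, seqNum, status) plus the 20 tag bytes, which ends exactly at bufSize, and the 14 skipped bytes are bufSize through maxTime, with the rest landing in stsBuffer. That is an inference from the listing, not something the docs state. Under that assumption, the mapping from a received byte's position to its SonyVars offset can be sketched like this (the function name and constants are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

#define DCD_CMD_OFFSET    0x19C  /* where the first response byte is stored  */
#define STS_BUFFER_OFFSET 0x1C4  /* where storage resumes after the jump     */
#define FIRST_CHUNK_LEN   26     /* bytes stored before the pointer is moved */

/* Given the index of a decoded response byte (0 = the $83 byte), return
 * the SonyVars offset where the ROM appears to store it. */
size_t response_byte_offset(size_t index)
{
    if (index < FIRST_CHUNK_LEN)
        return DCD_CMD_OFFSET + index;
    return STS_BUFFER_OFFSET + (index - FIRST_CHUNK_LEN);
}
```

Byte 25 lands at $1B5 and byte 26 at $1C4, leaving $1B6 through $1C3 untouched, which matches the 14-byte break visible in the memory dump.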

The other odd thing about this memory dump completely escaped my notice at first. In the sequence ending on the 22nd line, notice how the last two bytes are FCFC? The expected continuation of consecutive values should be FCFD. This was a clue whose meaning I didn’t discover until later.

Unfortunately, the ROM routines don’t actually store the checksum byte itself, so I couldn’t use MacsBug to examine it and see why it was wrong. It was that FCFC value that finally led me to the answer. For reasons I still don’t understand, it appears that the last byte in a transmission I send from the Emu hardware to the Mac isn’t received correctly. The last byte contains the least significant bits of the preceding seven, which include the checksum byte. By appending an extra dummy byte onto the end of the transmission, the LSB byte was now received correctly, and the last seven bytes before it could be correctly reconstructed.
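To see why a damaged final byte corrupts the data before it, here is a rough sketch of one group of the 7-to-8 encoding. The bit layout is an assumption for illustration only: each data byte is sent as its upper seven bits with the high bit forced to 1, and the low bit of data byte i is placed in bit i of the trailing byte. The real ordering of bits within the trailing byte may differ, and the function names are invented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Encode 7 logical bytes into 8 physical bytes (ASSUMED bit layout:
 * LSB of in[i] goes to bit i of out[7]; every sent byte has bit 7 set). */
void encode_group(const uint8_t in[7], uint8_t out[8])
{
    uint8_t lsbs = 0x80;
    for (int i = 0; i < 7; i++) {
        out[i] = (uint8_t)(0x80 | (in[i] >> 1));   /* upper 7 bits */
        lsbs |= (uint8_t)((in[i] & 1) << i);       /* collect low bit */
    }
    out[7] = lsbs;
}

/* Inverse of encode_group under the same assumed layout. */
void decode_group(const uint8_t in[8], uint8_t out[7])
{
    for (int i = 0; i < 7; i++)
        out[i] = (uint8_t)(((in[i] & 0x7F) << 1) | ((in[7] >> i) & 1));
}
```

With this layout, encoding the run F7 F8 F9 FA FB FC FD and then clearing a single bit of the trailing byte before decoding turns the final FD into FC while leaving the other six bytes intact, which is the same kind of symptom as the FCFC pair in the dump. Since the checksum is also one of the last seven bytes, it depends on that trailing byte in the same way.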

Once I made this fix, the test program started reporting something new: error -19, read error. And the Emu received a new command after #3 drive status: command #0, read block. The Mac had accepted my drive status reply, and was continuing on to perform a read request! Two-way communication at last! Of course I hadn’t implemented a handler for block read requests yet; that was next. But it was time to take a break, and celebrate my progress to this point.

Here is the correctly-formatted drive status structure:

#define DEVICE_CHAR_DISK_IN_PLACE 0x02
#define DEVICE_CHAR_ICON_INCLUDED 0x04
#define DEVICE_CHAR_WRITE_PROTECTED 0x08
#define DEVICE_CHAR_EJECTABLE 0x10
#define DEVICE_CHAR_WRITABLE 0x20
#define DEVICE_CHAR_READABLE 0x40
#define DEVICE_CHAR_MOUNTABLE 0x80

struct DriveStatus
{
	uint16_t deviceType;
	uint16_t deviceManufacturer; // Apple = 1
	uint8_t  deviceChars; // characteristics
	uint8_t  numBlocks[3]; // 3 bytes number of blocks on device
	uint16_t numSpares;
	uint16_t badBlocks;
	uint8_t  reserved[52]; 
	uint8_t  icon[256];
	uint8_t  padding[16]; 
};

The structure is 336 bytes. A valid drive status response is:

$83 - command number plus $80
$00 - pad
$00 - status high byte, zero means no error
$00 - status low byte, zero means no error
$00 - pad
$00 - pad
DriveStatus struct
checksum - choose this so the sum of all bytes (including this one) is 0 modulo 256

This is 343 bytes. Then the response must be encoded using the 7-to-8 encoding method described in the DCD doc, and in my posting from last February. This results in 49 groups of 8 encoded bytes each that are actually sent to the Mac.
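Putting the pieces above together, assembling the unencoded response might look something like the sketch below. It assumes the DriveStatus struct has already been serialized into a flat 336-byte array in the right byte order; the function and macro names are invented for illustration, and neither the 7-to-8 encoding nor the extra dummy byte mentioned earlier is included.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HEADER_LEN       6
#define DRIVE_STATUS_LEN 336
#define RESPONSE_LEN     (HEADER_LEN + DRIVE_STATUS_LEN + 1)  /* 343 */

/* Choose the checksum so that the sum of all bytes, including the
 * checksum itself, is 0 modulo 256. */
uint8_t dcd_checksum(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + data[i]);
    return (uint8_t)(0u - sum);
}

/* Fill 'out' with header, status payload, and checksum.
 * Returns the number of bytes written (always RESPONSE_LEN). */
size_t build_status_response(const uint8_t drive_status[DRIVE_STATUS_LEN],
                             uint8_t out[RESPONSE_LEN])
{
    memset(out, 0, HEADER_LEN);   /* pads and status bytes are all zero */
    out[0] = 0x83;                /* command number 3 plus $80 */
    memcpy(out + HEADER_LEN, drive_status, DRIVE_STATUS_LEN);
    out[RESPONSE_LEN - 1] = dcd_checksum(out, RESPONSE_LEN - 1);
    return RESPONSE_LEN;
}
```

The result is 343 bytes, which divides evenly into 49 groups of 7, so no partial group is left over when the packet is handed to the encoder.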

 
Checksum Non-Sequitur

Story time about checksum errors: my first computer was an Atari 800, which my family bought when I was 12. Day 1 when it arrived from the store, I attempted to load my very first program from cassette tape. The Atari reported “ERROR 143”. Confused, I consulted the printed manual to learn that error 143 meant “SERIAL BUS DATA FRAME CHECKSUM ERROR”. This was the full explanation, and the only help provided for the error. At the time, those words made as little sense to me as “BYTE VECTOR DIRECTION BUFFER INTERRUPT” might have, and I nearly threw the machine out the window. Fortunately it worked the second time I tried it, and my future in computer technology was assured.

 
Whee!

To prove that the drive status response was really working, I whipped up an ugly BMOW icon, and stuffed its bytes into the icon field of the DriveStatus struct sent from the Emu. Now when I try to mount the emulated HD20 in the Finder, I see the message shown in the photo at the top of this post. Woohoo, that is one ugly icon! But it comes from a successfully received drive status reply, so that makes it awesome.

There are still plenty of other issues to resolve before I can get full HD20 emulation working, not the least of which is actually implementing the read and write commands. Beyond that, here’s a strange one – if you move the mouse during a data transmission, it fails! The Mac ROM routines poll the SCC and VIA chips during HD20 transfers, and if there’s a pending interrupt, it puts the HD20 into a holdoff state so it can service the interrupt. I’m a little hazy on the details of how that works, and I haven’t yet tried to implement the holdoff logic. So for now if you move the mouse or do anything else to generate an interrupt, the transfer fails.

The bigger issues may be unrelated to the HD20 code itself. My current prototype uses the Floppy Emu hardware, but it replaces the Floppy Emu software rather than adding to it. In particular, I don’t think there are enough logic resources in the Emu’s CPLD chip to handle both floppy and HD20 emulation. This would mean you’d have to flash new firmware every time you wanted to switch between emulation types – not exactly a great user experience.

It’s also unclear which Macintosh models could make use of HD20 emulation. The Mac 512Ke and Mac Plus definitely have HD20 support in their ROMs. The Mac 512K can use an HD20 if you first boot it with a System file containing the HD20 Init – but that would require having a working floppy drive and floppy disk, or a second Floppy Emu to serve as the boot disk. According to mac512k.com, the Mac SE, Classic, IIci, and Portable also have HD20 support in ROM, but the SE/30, II, IIx, IIcx, IIsi, IIfx, and LC don’t. I assume that means anything newer than those machines doesn’t have HD20 support in ROM either. It’s not clear if newer machines could make use of the HD20 System Init, but if they could, they’d be subject to the same requirement of having a working floppy drive & disk or a second Floppy Emu for booting.

Even if HD20 emulation only proves useful to owners of the 512Ke, Plus, SE, Classic, Portable, and IIci, that’s still a lot of people! If any Floppy Emu owners with an SE/30, LC, or II-series machine other than the IIci would be willing to help test this, please let me know.


6 Comments so far

  1. Charles Phillips November 22nd, 2014 6:26 pm

    Holi Moli! 🙂

  2. Charles Phillips November 22nd, 2014 6:52 pm

    I have a real HD-20, and (2) – floppy emu’s, I will test anything you want man.

  3. Anonymous November 22nd, 2014 10:19 pm

    * Macintosh 128k – init from floppy emu
    * Macintosh 512k – init from floppy emu
    * Macintosh 512Ke
    * Macintosh Plus
    * Macintosh SE Dual 800k
    * Macintosh SE-FDHD ?
    * Macintosh Classic
    * Macintosh Classic II
    * Macintosh IIci
    * Macintosh Portable
    * Macintosh SE/30
    * Macintosh II
    * Macintosh IIx
    * Macintosh IIcx
    * Macintosh IIsi
    * Macintosh IIfx

  4. Charles November 22nd, 2014 10:21 pm

    whoops sorry about that … its getting late… — Charles
    i am going through my HD20 thread i forgot about a year ago. This one had verification, testing
    that confirmed the HD20 worked with more machines than what was originally thought of.
    https://68kmla.org/forums/index.php?/topic/21594-macintosh-hard-disk-20/?hl=hd20

  5. Charles November 25th, 2014 1:24 pm

    Ck very cool, sounds like you are making large amounts of progress with this man!

    So yes we can add the
    -Macintosh LC
    and the
    -Macintosh IIsi
    and the
    -Macintosh Classic II
    To the HD20 “DCD” native support list
    -And All supported machines that can accept Dougg3 ROM / IIsi ROM
    Except the IIfx, will have HD20 support! IIx – IIcx – SE/30

    WE could add the Macintosh LC-II because the Macintosh LC-II does have Native HD20 support in ROM.!!!

    The Macintosh LC has a second Floppy port near the scsi port, which natively boots the HD20.
    The Macintosh LC-II has something there, soldered port holes. but it is also missing
    like 4 pins from a normal floppy port. But maybe we could still solder some header to it
    and maybe it will still run a HD20.

    From : BMOW
    ” If someone’s got a real HD20, please do Get Info on the disk and tell me what it says for “where”.”

    -Yes i can try that and tell you what it says.

    Keep up the good work man!!!

  6. Yuhong Bao November 27th, 2014 12:42 pm

    @Charles: If this is true, I am guessing that it was killed from the $178 ROM, but restored in the $37A and $67C ROMs.
