Testing For Short Circuits

I’m still working on the design for an automated Yellowstone tester. The general idea is to use an STM32 microcontroller that’s connected to all the Yellowstone board’s I/O signals, then drive the inputs in various combinations and verify the outputs. This should catch a lot of potential problems, including open circuits, missing components, and many instances of defective components. But what about short circuits on the Yellowstone board that’s being tested? These aren’t so simple to detect, and they can potentially damage the tester.

The trouble with short circuits is that they can’t really be tested by working with digital signals alone – they force the whole problem into the analog domain. Depending on what’s shorted, and the effective resistance of the short, the behavior can vary greatly. A direct +5V to ground short might blow a fuse, or prevent the tester from even initializing if it shares a common supply. Other shorts might occur between a signal and any supply voltage (+5, -5, +12, -12), or between two supply voltages, or between two signals. The shorted signals might be driven from the tester, or from a chip on the Yellowstone board, or one of each. A short might cause immediate failure somewhere, or it might just cause a signal’s voltage to get pulled down or up. Depending on the change in voltage, it might not be enough to flip a bit and create a detectable error. Lots of components can also likely survive a short circuit for a while, but they’ll get hot, and will probably fail later. How can the tester detect these?

Assuming there’s a fuse, what type should it be? A replaceable fuse will be annoying if boards with shorts are at all common, since the fuse will need replacement often while testing a batch of boards. A poly fuse is resettable, but it can take about an hour to cool down enough to reset, rendering the tester unusable during that time. Neither option seems great.

If there’s a fuse, what should be fuse-protected? A fuse on the board’s 5V supply won’t do anything if there’s a short between an input signal and GND, because the current will be sourced by the tester and not by the Yellowstone board. Nor will it do anything for shorts on the other supply voltages. A short between signal and supply or between two signals may or may not be protected by a fuse, depending on where the current is ultimately sourced.

What max current rating is appropriate for a fuse? The “safe” current level could vary widely depending on what supply or signal is considered, and the present state of all the I/O signals. Then there’s the 3.3V regulator on the Yellowstone board, which has its own overcurrent protection mechanism. If there’s a short between 3.3V signals, the regulator will limit the current, and a fuse or other mechanism on the tester may not even realize that a short circuit occurred.

My head is spinning just trying to map out the possible short circuit cases, and how to detect and prevent them. At the same time, I’m trying to design a tester that strikes a good compromise between complexity and thoroughness. Maybe I could design a tester with a dozen different fuses and current-monitoring ICs and IR thermal sensors, and it would be great at detecting short circuits, but it would be too much complexity and effort for a small-scale hobbyist product. My goals are to avoid damaging the tester, and to detect a large majority of common board failures including short circuits, while recognizing that it will never be 100 percent perfect.

16 Comments so far

  1. Tom Davies - September 1st, 2021 2:34 pm

    I have zero experience doing this sort of testing, but anyway 🙂 could you check for short circuits with an initial test which just connects the power supplies and checks that the current is within expected limits?

  2. Steve - September 1st, 2021 4:50 pm

    I should clarify that when a Yellowstone board is connected to the tester, the -5V / -12V / +12V “supplies” are really just regular 5V I/O signals from the tester, which can be read and written to help check for shorts. The only true voltage supply is +5V.

    Since there are no voltages anywhere outside of 0-5V, I think the only change necessary to prevent possible damage to the tester is to put a current limiting resistor on each tester output. Something like 220 ohms should do it. I feel better already knowing the tester won’t get cooked by a bad Yellowstone board.
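
    As a rough check on that resistor value, Ohm’s law gives the worst-case current when a tester output driven high is shorted directly to ground. This is only a minimal sketch, assuming an ideal 5V output and a dead short; the function names are made up for illustration.

```python
# Worst-case current through a series protection resistor when a tester
# output driven to 5 V is shorted straight to ground on the board.
SUPPLY_V = 5.0        # tester I/O high level
SERIES_OHMS = 220.0   # resistor value suggested in the comment


def short_circuit_current(volts: float, ohms: float) -> float:
    """Ohm's law: current in amps through the resistor into a dead short."""
    return volts / ohms


def resistor_power(volts: float, ohms: float) -> float:
    """Power in watts dissipated in the resistor while the short persists."""
    return volts * volts / ohms


if __name__ == "__main__":
    amps = short_circuit_current(SUPPLY_V, SERIES_OHMS)
    watts = resistor_power(SUPPLY_V, SERIES_OHMS)
    print(f"{amps * 1000:.1f} mA, {watts * 1000:.0f} mW")
```

    With 220 ohms that works out to roughly 23 mA through the short and about 114 mW dissipated in the resistor.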

    That still leaves the question of detecting short circuits. I probably need to hope that any short will cause a detectable error in functionality, because measuring the difference in current due to a short seems more problematic the more I think about it. If the current through a short is sourced from a tester output, then it’s the tester’s total supply current that will change, not the Yellowstone board’s. Even if current is sourced from a chip on the Yellowstone board, it will probably only be a difference of like 25 mA out of a total supply current of a few hundred mA. That’s not really fuse territory. There will probably be enough board-to-board variation in normal supply current that even having an exact current measurement won’t be enough to reliably detect short circuits.

    Given the tester design, the only type of short that will cause really large currents is a short between +5V and GND. Any other short will involve at least one I/O pin on a chip somewhere, so the current will be limited by the capacity of the pin’s I/O buffer (at least until it burns out). A +5V to GND short will probably be pretty obvious, and whatever 5V regulator powers the tester board won’t actually reach 5V, so the tester may not even run.

    A slightly weaker +5V to GND short is worth considering, say with an effective resistance of 10 ohms, creating a short circuit current of 500 mA. The 5V regulator can probably handle that, so the tester will run, and the tests might even pass. There are all sorts of current monitor and e-fuse ICs that can limit the current to a preset value, like the STEF05, but they don’t actually shut off the load when the current limit is reached – they turn down the voltage instead. I’d prefer something that simply shuts off the load and signals a permanent fault condition.

    Another option might be a high-side current sensor like ZXCT1107. I could use an STM32 analog input with the output of the ZXCT1107 to measure the load current, and turn off the load if the current is too high. That might be OK, but it would be a software-driven process. The tester would have to poll the load current from time to time, and if there were a brief surge of current due to a transient short-circuit when two shorted signals had opposite voltages, the tester might miss it unless it checked the current after every single I/O change.
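
    The polling concern can be illustrated with a small simulation. The current values, the limit, and the function name below are invented for illustration; the point is only that sampling every few steps can skip over a brief surge that sampling after every I/O change would catch.

```python
# Simulated supply current (mA) measured after each I/O change.
# Step 3 is a brief surge where two shorted signals are driven to
# opposite levels. All numbers are invented for illustration.
CURRENT_TRACE_MA = [210, 212, 209, 265, 211, 210, 208, 211]
LIMIT_MA = 240


def steps_over_limit(trace, limit, poll_every):
    """Return the step indices that were sampled and exceeded the limit."""
    return [i for i in range(0, len(trace), poll_every) if trace[i] > limit]


if __name__ == "__main__":
    # Polling every 4th step samples steps 0 and 4 and misses the surge.
    print(steps_over_limit(CURRENT_TRACE_MA, LIMIT_MA, poll_every=4))
    # Sampling after every I/O change catches it at step 3.
    print(steps_over_limit(CURRENT_TRACE_MA, LIMIT_MA, poll_every=1))
```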

  3. Chris Combs - September 1st, 2021 6:47 pm

    The Bourns SMD polyfuses I use in each of my projects typically reset in 0.5-5 seconds, FWIW. Current draws ranging from 250mA to 4A most of the time.

  4. Chris Combs - September 1st, 2021 6:53 pm

    P.S. in the one crappy test fixture I have made so far, I used high-side switching (P FET) through a polyfuse to supply the target, w/ fuses chosen to have hold current just above worst case draw for the target. Then had a sense line to an analog pin through a 10k (and divider because of a voltage difference). Probably some failure modes I didn’t anticipate, but it was pretty amazing to see lots of subtle weirdnesses shaken out by minor voltage variations in the target.

  5. Steve - September 1st, 2021 7:41 pm

    I think polyfuses can be hard to characterize. From what I’ve read, it will become conductive again in a few seconds, but won’t return to its original resistance value for hours. Or maybe never. Depending on the details of the circuit, that might not be an issue.

    What is your analog pin measuring, the voltage or the current? I’m gravitating towards something similar, but instead of a PFET and a polyfuse, using a MAX4789 or related device. It’s a high side load switch with an integrated auto-shutoff if the current limit is exceeded. Though maybe your PFET + polyfuse is simpler and better. Combine this with an STM32 analog pin to measure the actual voltage of the +5V supply, in case it sags.

    I may combine this with a current sensor like the ZXCT1107 so I can actually measure the current. The absolute current may not be a reliable indicator of much, but if I can measure the change in current before and after modifying I/O states, that might be useful somehow.

  6. Tux2000 - September 2nd, 2021 3:24 am

    A lesson learned from several commercial testers: Don’t supply tester hardware and device under test (DUT) from the same source, period.

    In your case: Use one supply for the tester, and another one for the DUT. That way, even a full short on the DUT won’t prevent your tester from working. The obvious next thing to do is to measure the DUT supply voltage. With a hard short, you can be sure it won’t be +5 V (more like a few mV).

    Now, let’s make that setup smarter. Replace the generic +5V supply for the DUT with a lab supply that supports current limiting, and adjust the current limit to just above the peak current of a “golden sample” (i.e. a known-good unit). Now, any short on the DUT will make the DUT supply reduce the voltage. You can measure that, stop the test, and report an error.
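
    The effect of the current limit on the measured voltage can be sketched with an idealized model. This assumes the board behaves like a simple resistor and the supply is a perfect constant-voltage / constant-current source, neither of which is exactly true. The 300 mA golden-sample current, the 350 mA limit, and the 5 percent sag threshold are made-up numbers.

```python
def parallel(r1, r2):
    """Equivalent resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def supply_output_voltage(set_volts, limit_amps, load_ohms):
    """Idealized CV/CC supply: hold set_volts unless the load would draw
    more than limit_amps, in which case the voltage drops until the
    current equals the limit."""
    if load_ohms <= 0:
        return 0.0
    return min(set_volts, limit_amps * load_ohms)


def dut_supply_ok(measured_volts, nominal_volts=5.0, max_sag_fraction=0.05):
    """Pass only if the DUT supply has sagged by no more than the allowed fraction."""
    return measured_volts >= nominal_volts * (1.0 - max_sag_fraction)


if __name__ == "__main__":
    golden_load = 5.0 / 0.3                    # hypothetical: good board draws 300 mA
    shorted_load = parallel(golden_load, 10.0)  # same board plus a 10 ohm short
    for name, load in (("good", golden_load), ("shorted", shorted_load)):
        volts = supply_output_voltage(5.0, 0.35, load)
        print(name, round(volts, 3), dut_supply_ok(volts))
```

    In this model the good board stays at 5 V, while the board with a 10 ohm short pulls the supply down to about 2.2 V, which a simple voltage measurement would flag.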

    Costs: A simple analog-controlled switch mode supply 0..30 V / 0..3 A sold under various names (LN-3003, MPJA 9616PS, CSM30003SM) costs about 40 €.

    Of course, you could also implement current limiting in your tester hardware, just copy that part from any DIY current-limited lab supply.

    A current sensor may be a good idea, some even come with an integrated ADC that can be read out via I²C (INA220 and similar ones).

    The measured current will tell you about missing parts and broken connections. You know the expected current for the DUT from the golden sample, if the DUT takes far less current, something is wrong.

  7. Chris Combs - September 2nd, 2021 4:16 am

    oh, a lab power supply makes a ton of sense. shoulda done that.

    I was just measuring the voltage, since it would sag quickly with any shorts.

    I haven’t run into that kind of residual PTC resistance difference being a noticeable problem, but am also not an EE, so caveat.

  8. Steven Stallion - September 2nd, 2021 9:22 am

    FWIW, I usually add both short and open circuit testing to my test fixtures. For short circuits, I’ve had the best luck by instrumenting a power supply and taking measurements while functional tests are executing on the fixture. If current or voltage falls out of a predefined range, I fail the board. As far as protecting equipment goes, you could potentially ramp the amount of current made available to the board to catch the problem before you blow an onboard PTC fuse as a last resort. For example, if your fuse blows at 1A and you supply 100mA to the board, there will be a certain amount of voltage droop that should be acceptable and in range. If voltage sags toward 0V, there’s a pretty good chance a short is present and you can fail the test before getting close to blowing a fuse. It takes a bit of fiddling, but if you know your passive current draw, that might be a good place to start.

    HTH,
    Steve

  9. Steve - September 2nd, 2021 10:20 am

    OK, this is beginning to gel. There are several useful ideas here:

    – Ensure that a hard short on the Yellowstone board doesn’t prevent the tester from running. Several ways to do this, including separate power supplies, fuses, and current monitors / switches like MAX4995B.

    – Measure the 5V supply to the Yellowstone board, and fail the test if it ever falls much below 5V. Could use something like an INA220, or just an analog input on the microcontroller.

    – Measure the current, and fail the test if it’s too high or too low (compared to a known-good reference board) for whatever operation is happening at that moment. Or measure the change in current before and after an operation. Could use the INA220 again, or a simple current sense monitor like ZXCT1107 combined with an analog input on the microcontroller.
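
    The current checks in the last item might look something like the sketch below. The step names, reference currents, tolerance, and slack are all hypothetical placeholders; real values would have to be recorded from a known-good board.

```python
# Reference currents (mA) per test step, as they might be recorded from a
# known-good board. Step names and values are hypothetical.
REFERENCE_MA = {"idle": 180.0, "rom_read": 205.0, "drive_enable": 230.0}
TOLERANCE = 0.15  # assumed allowance for board-to-board variation


def current_in_window(step, measured_ma, reference=REFERENCE_MA, tolerance=TOLERANCE):
    """True if the measured current is within +/- tolerance of the reference."""
    expected = reference[step]
    return abs(measured_ma - expected) <= tolerance * expected


def delta_in_window(before_ma, after_ma, expected_delta_ma, slack_ma=5.0):
    """True if the change in current across one operation matches the
    change seen on the reference board, within slack_ma."""
    return abs((after_ma - before_ma) - expected_delta_ma) <= slack_ma


if __name__ == "__main__":
    print(current_in_window("idle", 185.0))     # close to the reference
    print(current_in_window("idle", 120.0))     # far too low, something is missing
    print(delta_in_window(180.0, 230.0, 25.0))  # jump is twice what was expected
```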

    When measuring the current, I think it needs to be the whole system current (tester plus Yellowstone) rather than just the Yellowstone 5V supply current. Otherwise it won’t be able to detect problems where a Yellowstone input (tester output) with logical HIGH value is shorted to GND somewhere, because the current through the short isn’t sourced from the Yellowstone 5V supply.

    I’m not sure that soft current limiting is really necessary, or even desired, if something like the MAX4995B already provides a hard shutdown beyond some maximum current. On the other hand, if there’s a current limiter (whether in the bench supply or an IC on the tester board), then the MAX4995B isn’t really needed. I guess the question is if there’s an over-current situation with the Yellowstone board, do you want to reduce the voltage while keeping the current constant, or do you want to turn it completely off? Completely off (MAX4995B) sounds better to me but either way could probably work.

  10. Steven Stallion - September 2nd, 2021 11:29 am

    Personally I’d just shut the supply off. For a test fixture, you’re really looking for a PASS/FAIL – if a board fails my guess is you’d set it aside for further investigation using different tools.

  11. Tux2000 - September 3rd, 2021 9:52 am

    Again, don’t supply tester and DUT from the same supply. It just causes a lot of unnecessary trouble.

    The initial current measurement should be done with the tester as passive as possible, i.e. most tester pins switched to input or high impedance.

    Also, you overestimate the current sink and source capabilities of common gates. Unlike old microcontrollers like the AVRs, which can sink and source 20 mA or more, expect something like 1 mA from common gates. I don’t know all ARM-based microcontrollers, but the SAM series can sink and source about 2 mA per pin. You probably won’t see that difference in the current measurement at all.

    To detect shorts on signal lines, use the read-back function of the microcontroller or port expander. Microcontrollers and port expanders often have registers for setting the direction and the level of a pin, and completely independent from that, a register that reads the actual logic level of the pin. So if you find that you write a 1 and read back a 0, or vice versa, you found a short.

    And while we are at it, make sure to test for cross-signal shorts. For any pair of adjacent signal pins, test all four signal combinations (00, 01, 10, 11) and read back to check for shorts.
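
    A sketch of that adjacent-pair test, run against a simulated port rather than real hardware. The SimulatedPins class is a hypothetical stand-in for the microcontroller or port expander registers, and it assumes that when two shorted pins are driven to opposite levels, the low level wins on read-back. Real drivers may behave differently.

```python
from itertools import product


class SimulatedPins:
    """Hypothetical stand-in for port read-back registers.
    Assumption: shorted pins read back the lowest level driven onto them."""

    def __init__(self, count, shorted_pairs=()):
        self.driven = [None] * count  # None means high impedance
        self.shorted = [frozenset(pair) for pair in shorted_pairs]

    def release_all(self):
        self.driven = [None] * len(self.driven)

    def write(self, pin, level):
        self.driven[pin] = level

    def read(self, pin):
        group = {pin}
        for pair in self.shorted:
            if pin in pair:
                group |= pair
        levels = [self.driven[p] for p in group if self.driven[p] is not None]
        return min(levels) if levels else 0


def find_adjacent_shorts(pins, count):
    """Drive each adjacent pair through 00, 01, 10, 11 and compare read-back."""
    faults = []
    for a in range(count - 1):
        b = a + 1
        for level_a, level_b in product((0, 1), repeat=2):
            pins.release_all()
            pins.write(a, level_a)
            pins.write(b, level_b)
            if (pins.read(a), pins.read(b)) != (level_a, level_b):
                faults.append((a, b, level_a, level_b))
    return faults


if __name__ == "__main__":
    board = SimulatedPins(6, shorted_pairs=[(2, 3)])
    print(find_adjacent_shorts(board, 6))
```

    Only the 01 and 10 combinations expose the short in this model, which is why all four combinations are driven.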

    Or, even better: Don’t test for shorts on signal lines at all. It is a tester, not a diagnostics tool.

    Ensure you test all functions of the DUT. Shorts on signal lines will make some tests fail, so you don’t have to explicitly test for shorts on signal lines. Make sure you log expected and actual state in case of any failed test. Most likely, that is sufficient to diagnose many faults that you didn’t even think of yet.

    To make your tester a little bit more robust, you may want to add some “angst resistors” on the signal lines that limit short circuit currents. 10 to 100 ohm should be sufficient, depending on the signal. But, from experience, CMOS outputs can usually survive being shorted to GND or VCC for a while.

  12. Steve - September 3rd, 2021 11:13 am

    > Unlike old microcontrollers like the AVRs, which can sink and source 20 mA or more, expect something like 1 mA from common gates.

    I think it will be higher than that. The 74LVC245 buffers on the Yellowstone card are rated 24 mA source or sink while still maintaining valid logic levels, or 50 mA absolute max. The STM32 in the tester and the MCP23S17 port expander both claim 8 mA while maintaining valid logic levels, or 25 mA absolute max. So I would expect short-circuit currents on signals to be in that range or higher. Maybe I should intentionally short some sacrificial components and measure the current. If the current is only a couple of mA then I agree it’ll be difficult to detect through measuring the supply current.

    > If you find that you write a 1 and read back a 0, or vice versa, you found a short.

    I agree there. What about the case where you write a 1 and read back a 1, or write 0 and read back 0 – does that mean there’s no short? I don’t think so, there could be a short but your output drive might be stronger than whatever you’re shorted to. So this method might fail to detect many kinds of shorts.
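
    To put rough numbers on the drive-strength point, two outputs fighting each other can be modeled as a voltage divider. This is a simplified sketch: treating each output as an ideal source behind a fixed resistance is an approximation, and the 25 ohm and 245 ohm resistances and the 1.5 V / 3.5 V input thresholds are assumed values, not taken from any datasheet.

```python
def contended_voltage(v_high, r_high, v_low, r_low):
    """Node voltage when one driver pulls toward v_high through r_high
    while another pulls toward v_low through r_low."""
    return (v_high * r_low + v_low * r_high) / (r_high + r_low)


def logic_level(volts, v_il=1.5, v_ih=3.5):
    """Return 0 or 1 for a valid level, or None if in the undefined band.
    Thresholds are assumed values for a 5 V CMOS input."""
    if volts <= v_il:
        return 0
    if volts >= v_ih:
        return 1
    return None


if __name__ == "__main__":
    # Strong driver writes 1, weaker shorted driver pulls low: the 1 reads
    # back correctly, so the short goes unnoticed.
    strong_high = contended_voltage(5.0, 25.0, 0.0, 245.0)
    print(round(strong_high, 2), logic_level(strong_high))
    # Weak driver writes 1, stronger shorted driver pulls low: the 1 reads
    # back as 0, so the short is detected.
    weak_high = contended_voltage(5.0, 245.0, 0.0, 25.0)
    print(round(weak_high, 2), logic_level(weak_high))
```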

    > Shorts on signal lines will make some tests fail, so you don’t have to explicitly test for shorts on signal lines.

    This is the question I’m wrestling with. Most shorts involving signal lines should cause a test to fail somewhere, so maybe I don’t need to explicitly check for them, but what’s the harm in checking anyway? If there’s a mild short, it might not cause an immediate test failure, but it could cause erratic behavior or premature component failure later. A test that measures the supply current might catch this, but a simple functional test might not. There are also some I/Os on the drive interface that are designed to be shorted, due to Apple’s complex design for multiple drive types. It would be helpful to know if this happens unexpectedly, even if it doesn’t cause a test failure.

  13. Steven Stallion - September 3rd, 2021 11:41 am

    It might be worth mentioning that shorts can occur anywhere in the design, not just supply nets. Some of the more difficult to detect issues I’ve encountered were shorts between I/O pins routed to components that didn’t normally interact with each other. I would caution against assuming that shorts on signal traces will make a test fail outright – it all comes down to how much coverage you can provide.

    Out of curiosity, are you having ECT done on your PCBs? At the very least, this can cut down on the number of potential shorts before assembly.

  14. Tux2000 - September 4th, 2021 4:10 am

    “What about the case where you write a 1 and read back a 1, or write 0 and read back 0 – does that mean there’s no short? I don’t think so”

    Of course not, that’s why the short-circuit test must write test patterns on all lines and read back the results. Common test patterns are all-1 (111111), all-0 (000000), wandering-1 (100000, 010000, 001000, …), wandering-0 (011111, 101111, …), checkers (010101 and 101010). Basically the same as with a RAM test in the old days. The RAM tests commonly also wrote the RAM address into the RAM to detect broken address decoding, but that’s not relevant here.
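
    The patterns listed above are easy to generate programmatically. A minimal sketch follows, with each pattern represented as an integer bit mask; the function names are just illustrative.

```python
def all_ones(width):
    return [(1 << width) - 1]


def all_zeros(width):
    return [0]


def walking_one(width):
    """A single 1 moving from the most significant bit to the least."""
    return [1 << (width - 1 - i) for i in range(width)]


def walking_zero(width):
    """A single 0 moving through a field of 1s."""
    mask = (1 << width) - 1
    return [mask ^ pattern for pattern in walking_one(width)]


def checkerboard(width):
    """Alternating bits, then the inverse."""
    mask = (1 << width) - 1
    even_bits = sum(1 << i for i in range(0, width, 2))
    return [even_bits, mask ^ even_bits]


def all_patterns(width):
    return (all_ones(width) + all_zeros(width) + walking_one(width)
            + walking_zero(width) + checkerboard(width))


if __name__ == "__main__":
    for pattern in all_patterns(6):
        print(format(pattern, "06b"))
```

    For six lines this produces 16 patterns in total, so the extra test time grows only linearly with the number of signals.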

    “This is the question I’m wrestling with. Most shorts involving signal lines should cause a test to fail somewhere, so maybe I don’t need to explicitly check for them, but what’s the harm in checking anyway?”

    Each single test run takes more time, so you can test fewer boards in the same time. I don’t know, it might not be relevant in your case, but at work, we like to have fast test runs. Simply because testing is boring, and paying trained people to wait (in total) several hours per workday for tests to finish drives the payroll department insane.

    “If there’s a mild short, it might not cause an immediate test failure, but it could cause erratic behavior or premature component failure later. A test that measures the supply current might catch this, but a simple functional test might not.”

    If there is a “mild short”, it won’t draw a lot of current, so you are poking around in the noise floor. From experience with professionally assembled PCBs, “mild shorts” are rare. You get gravestones, misplaced parts, rotated parts (especially diodes and LEDs), sometimes a wrong part in the pick-and-place machine, bad soldering due to the wrong soldering temperature profile, but shorts are either absent or hard shorts.

    “There are also some I/Os on the drive interface that are designed to be shorted, due to Apple’s complex design for multiple drive types. It would be helpful to know if this happens unexpectedly, even if it doesn’t cause a test failure.”

    If unexpectedly shorting out signals isn’t caught by your tests, your tests are incomplete.

    I might repeat myself: You want two different tests.

    One is a functional test on a known-good PCB to test all software and configuration changes, basically anything you put in programmable devices (Microcontroller, Flash ROM, FPGA, CPLD, GAL, PAL, you name it). That’s tested during development, and especially as part of the transition from development to production. It does NOT happen during production.

    The other test covers the elementary functions of the entire device, including mechanics (if any). In production, you don’t put each controller into an Apple and run it through 5000 combinations of real and virtual drives, and 200 games and applications. In production, you test that the FPGA can read all signals and write all signals, that you can read and write RAM, read and write the bus and floppy lines, the dip switches, and so on. You know that when all electrical tests are ok, the software will just work. That’s why it was intensively tested in the transition from development to production.

    (Of course, if you want, you can pick a small sample of the production, say every 1000th board, and repeat the full development tests to ensure nothing was overlooked.)

    And here is another trick from work: You don’t have to test the board with the final software programmed in. You can use a special testing software that is explicitly designed to help testing. At work, our testing software on the DUT is mostly dumb. All smart functions, all user interaction, all alarms, monitoring, error handling, recovery functions are ripped out. There is a control channel, typically a three-wire UART connection (RX/TX/GND), from tester to DUT, and the software of the tester instructs the DUT to do some simple jobs, like toggling pins, generating PWM or DAC signals, or reading out ADC counts. We basically set the DUT in a kind of zombie mode.

    That should also be possible with the Yellowstone. In fact, Yellowstone is much simpler than most devices at work in that it has only digital signals, no nasty analog signals.

    Choose a few control lines, perhaps on spare pins or the JTAG connector, implement a simple chain of shift registers on that pin, and make the FPGA behave like a stupid, multi-pin port expander. (SPI might be a good idea for a protocol.) Allow read back of any pin, setting direction, level, and pull resistors from the tester. That way, the tester can generate and read back signals to/from almost any other component on the board, the bus connector, and the drive connectors. Even RAM tests should be possible.

    (There is even software designed to do exactly that via JTAG. There are consulting companies that specialize in analysing your circuit and generating test sequences that exercise all of your circuit just by using JTAG. Pretty impressive, but also pretty expensive. Only one of our customers uses such a tester, and for really expensive DUTs.)

    After that, reprogram the FPGA with the “real” software and do a simple “are you there” test (like reading the software version or some status register with predictable content) to ensure that the software was programmed ok.

  15. Tux2000 - September 4th, 2021 4:44 am

    To continue a long post:

    At work, we develop products that need some more attention to detail than a toaster or a radio. Most products are quite harmless, but some have the potential to harm people when something goes wrong. In the worst case for some products, people could die when the product fails. Not indirectly, like a broken toaster burning down your house, or your partner clubbing you to death with a radio, but directly, e.g. by allowing lethal currents to flow through your body, or by not supplying you with sufficient amounts of oxygen. Luckily, most of our products are in the “dangerous only when being violently thrown at you” category.

    Nevertheless, we are required to do a lot of paperwork and planning for our products. And that is actually a good thing, even if you only develop in the “radio and toaster” category, where you are allowed to omit most of the paperwork.

    It helps to have a written spec for the product, and more detailed specs for hardware, software, and mechanics. From those specs, you can easily derive test specs for hardware, software, and mechanics. We have done that in requirement tracking software for about five years. Before that, we used Excel, and boy, that sucked! It may work for a one-man-show, but it does not scale to a team of more than two people.

    What you end with is a bunch of tests, each covering one or more requirements. The tracking software has a reporting feature that allows finding requirements without tests. So when you have reached a state where each requirement has at least one associated test, you can be sure that you will test the whole product.
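
    The “requirements without tests” report does not need special software at small scale. A minimal sketch, with entirely hypothetical requirement IDs and test names:

```python
# Hypothetical requirement IDs and the tests that claim to cover them.
REQUIREMENTS = {"REQ-1", "REQ-2", "REQ-3", "REQ-4", "REQ-5"}
TESTS = {
    "supply_voltage_test": {"REQ-1"},
    "idle_current_test": {"REQ-2"},
    "signal_short_test": {"REQ-3", "REQ-4"},
}


def untested_requirements(requirements, tests):
    """Return the requirements that no test covers, sorted for stable output."""
    covered = set().union(*tests.values())
    return sorted(requirements - covered)


if __name__ == "__main__":
    print(untested_requirements(REQUIREMENTS, TESTS))
```

    Here REQ-5 is reported as uncovered, which is the cue to write another test before calling the tester complete.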

    Another feature of the software is that it allows tracking the execution of tests, and documenting the test execution. Having at least one successful test execution for all tests is required for transitioning hardware, software, and mechanics from development to production.

    So many words for such a simple idea:

    List all of your requirements, make sure you have a working test for each requirement, and then make your PCB tester execute all of the non-software tests. That should give you a pretty complete test of the product.

    Oh, some of our hardware requirements may read like this:
    – “The device should be supplied with 24 V ± 5 %”
    – “The device supply current should not exceed 1.5 A.”

    The actual tests may be harder:
    – “Test that the device works down to 12 V and up to 36 V”
    – “Measure the peak current when XY happens. Verify that it is below 1.3 A”
    – “Test that the supply current at idle is between 85 and 115 mA.”
    – “Test that the supply current with a non-programmed / erased microcontroller does not exceed 60 mA”

    Guess where the mA in the last two tests come from. Hint: Golden Sample.

    And of course: “Test that there are shorts between the following signal lines: …”

  16. Tux2000 - September 4th, 2021 4:45 am

    D’oh!

    “Test that there are *NO* shorts between the following signal lines: …”