Configuration fault recovery

- Yannick Lamarre
  

Tue, May 16, 2017 8:15 PM

Hi all, I've been thinking about this problem for a while and shared it with a few colleagues, but no one has yet come up with an answer. In some configurations, an FPGA can end up with two different drivers connected to the same line internally. A practical example would be two BUFGs driving the same line on a Spartan-6. If those two drivers drive different values in a CMOS process, they connect both rails together through a low-impedance path. Obviously, this will damage the chip. Now the question is: how long can it stay in this state before it breaks? An easier starter question: what is likely to break first, and how? The follow-up to all of this: can we design a current-limiter/cut-off circuit fast enough to prevent destruction of the chip?

Regards, Yannick Lamarre

- BobH
  

Tue, May 16, 2017 10:04 PM

I don't think that the tool chain will let you do that. There are several steps that should be able to catch it and error out. This is assuming that you are using a "mature" tool chain.

Try manually instantiating two drivers to the same clock line and run it through the tools. It may disconnect one for you or it may just refuse to complete. If it automagically disconnects one for you, it may take some real digging in the log files to find it, but I think it will just error out.

BobH

- Yannick Lamarre
  

Wed, May 17, 2017 3:40 PM


Hi Bob, you are skipping the mental exercise here. What if cosmic rays toggle the configuration bits so that this scenario happens? That is highly possible in space, which is why there is a market for SEU controllers/monitors and the like. Now, back to the drawing board.

- BobH
  

Wed, May 17, 2017 5:56 PM

You are correct, I was assuming it was a design flaw.

To your original question, I suspect that a rail-to-rail short through a couple of FETs would be very hard to detect in a generalized way from the current signature. When a large circuit like a major clock distribution changes state, you will get a significant current spike, probably not unlike what you would see at the beginning of the short-circuit situation. With the short circuit, that current will persist until something craters (unless the drivers have some kind of foldback current limiting). That seems like it might be detectable, until you consider what would happen if a bunch of relatively static GPIO signals driving external loads (maybe optocouplers at 20 mA each) transition from off to on simultaneously. The destruct current for the clock driver is probably less than the normal current signature in this case.

You MIGHT be able to make a current signature analysis work in highly specific cases, but false tripping would be a serious problem.

The power supply decoupling capacitors are going to make detecting fast current spikes difficult externally. You might be able to monitor the voltage drop across the power supply bond wires in the package or internal distribution system to estimate current flow without adding sense resistance as a way to sense current after the decoupling caps.

In a previous job, I worked on hot-swap power controllers. These chips were supposed to deal with the inrush current of charging the bulk capacitance on a board as it switches on, but shut down if the current got too high or the inrush persisted too long. The only way we could prevent false tripping was to set the thresholds and delays a lot higher than you would expect. When they work, they work well. You can short out a 100 Amp 12 Volt rail with a pair of pliers, and it will switch off before the power supply trips on over-current and shuts the whole cabinet down.
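To make the threshold-versus-delay trade-off concrete, here is a minimal sketch of a persistence-based trip rule. It is not how any particular hot-swap controller works; `trip_index`, the 5 A limit, the sample counts and the two current traces are all made-up values for illustration.

```python
def trip_index(samples_a, limit_a, max_over_samples):
    """Return the index of the sample at which the breaker trips, or None.

    Hypothetical rule: trip once the current has stayed above limit_a
    for more than max_over_samples consecutive samples.
    """
    over = 0
    for i, amps in enumerate(samples_a):
        if amps > limit_a:
            over += 1
            if over > max_over_samples:
                return i
        else:
            over = 0  # current fell back, restart the persistence count
    return None


# Invented traces (amps per sample): a normal inrush that settles,
# and a short that never goes away.
inrush = [0.0, 8.0, 8.0, 8.0, 1.0, 1.0, 1.0, 1.0]
short = [0.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0]

inrush_trip = trip_index(inrush, limit_a=5.0, max_over_samples=3)
short_trip = trip_index(short, limit_a=5.0, max_over_samples=3)
tight_trip = trip_index(inrush, limit_a=5.0, max_over_samples=1)

print("inrush, generous delay:", inrush_trip)
print("short, generous delay:", short_trip)
print("inrush, tight delay:", tight_trip)
```

With the generous delay the inrush is ignored and the short is caught, but only after the delay has elapsed. Tightening the delay makes the same harmless inrush trip, which is the false-tripping problem described above.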

I think detecting configuration changes would be better done through redundant LUTs or some similar method. You might even be able to implement that in an existing FPGA via the tool chain. This would not be the standard vendor-type tool chain, but a specialized one. Developing this tool would be a good PhD project for someone.
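As a rough illustration of what "redundant LUTs" could mean, here is a software model of three copies of one LUT with a majority vote and a disagreement flag (the usual triple-modular-redundancy idea, which is my substitution, not something specified above). The names `lut_output` and `vote`, and the 2-input AND truth table, are hypothetical and only serve the example.

```python
def lut_output(truth_table, inputs):
    """Evaluate a LUT modelled as an integer bitmask.

    inputs is a tuple of 0/1 bits; input 0 is the least significant
    address bit into the truth table.
    """
    index = 0
    for bit_pos, bit in enumerate(inputs):
        index |= (bit & 1) << bit_pos
    return (truth_table >> index) & 1


def vote(copies, inputs):
    """Majority-vote three LUT copies; also report whether they disagree."""
    outs = [lut_output(table, inputs) for table in copies]
    majority = 1 if sum(outs) >= 2 else 0
    disagree = len(set(outs)) > 1
    return majority, disagree


AND2 = 0b1000  # 2-input AND: only address 3 (both inputs high) reads 1
healthy = [AND2, AND2, AND2]
upset = [AND2, AND2 ^ 0b1000, AND2]  # one copy with a flipped config bit

healthy_result = vote(healthy, (1, 1))
upset_result = vote(upset, (1, 1))
masked_result = vote(upset, (0, 1))

print(healthy_result, upset_result, masked_result)
```

Note that the flipped bit is only flagged when the inputs actually address it, so a scheme like this detects upsets in logic contents as they are exercised. It says nothing about routing bits, which is where the two-driver short in the original question would come from.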

This is pretty much speculation on my part, and I am not going to claim to be an expert on high rel stuff.

Good Luck, Bob

- Theo Markettos
  

Wed, May 17, 2017 10:42 PM

Assuming you managed to defeat all the protections and turn on both transistors, I don't think it will be that bad. The transistors are sized such that they can achieve a suitable slew rate on the capacitance they have to drive. It might be a long wire, but on-chip the capacitance will be fairly small (guess: a single pF or less).

Applying a simple T=RC with 1 pF and a time constant of 1 ns, the resistor is 1K. Short two of those in series and you have 2K across the power rail. If the rail is 1.2 V, that's 600 uA, or 720 uW.
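The same back-of-envelope estimate as a small script, in case anyone wants to plug in other guesses. The function name `short_circuit_estimate` is just for illustration, and the inputs are the guessed values above (1 ns, 1 pF, 1.2 V), not measured figures for any device.

```python
def short_circuit_estimate(tau_s, cap_f, vdd_v, drivers_in_series=2):
    """Estimate driver resistance, short-circuit current and power.

    Assumes each driver behaves like a resistor R = tau / C, and that
    the fault puts drivers_in_series of them across the supply rail.
    """
    r_driver = tau_s / cap_f                # from T = RC
    r_total = drivers_in_series * r_driver  # two drivers fighting each other
    current = vdd_v / r_total               # Ohm's law
    power = vdd_v * current                 # dissipated across both drivers
    return r_driver, r_total, current, power


# Guessed values from the post: 1 ns time constant, 1 pF load, 1.2 V rail.
r_driver, r_total, current, power = short_circuit_estimate(1e-9, 1e-12, 1.2)

print(f"R per driver: {r_driver:.0f} ohm")
print(f"R across rail: {r_total:.0f} ohm")
print(f"Current: {current * 1e6:.0f} uA")
print(f"Power: {power * 1e6:.0f} uW")
```

Since the current scales as 1/R, a driver ten times stronger than this guess would still only put the fault in the milliamp range.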

I don't think anything is going to cook with that. Maybe it would be bad if you managed to short a thousand of them, but it would take some effort to procure the cosmic rays.

Theo