Configuration fault recovery

Hi all,
I've been thinking about this problem for a while and shared it with a few colleagues, but no one has come up with an answer yet.
With certain bitstreams, an FPGA can be configured so that two different drivers are connected to the same line internally. A practical example would be two BUFGs driving the same line on a Spartan6.
If those two drivers drive different values in a CMOS process, the two rails are connected together through a low-impedance path. Obviously, this will damage the chip.
Now the question is: How long can it stay in this state before it breaks?
An easier starter question: What is likely to break first and how?
The follow-up to all of this: can we design a current-limiter/cut-off circuit fast enough to prevent destruction of the chip?

Regards,
Yannick Lamarre

Re: Configuration fault recovery
On 05/16/2017 01:15 PM, Yannick Lamarre wrote:

I don't think that the tool chain will let you do that. There are  
several steps that should be able to catch it and error out. This is  
assuming that you are using a "mature" tool chain.

Try manually instantiating two drivers to the same clock line and run it  
through the tools. It may disconnect one for you or it may just refuse  
to complete. If it automagically disconnects one for you, it may take  
some real digging in the log files to find it, but I think it will just  
error out.

BobH



Re: Configuration fault recovery
On Tuesday, May 16, 2017 at 5:59:27 PM UTC-4, BobH wrote:

Hi Bob,
You are skipping the mental exercise here. What if cosmic rays toggle the configuration bits so that the scenario happens? Highly possible in space. This is why there is a market for SEU controllers/monitors and the like. Now, back to the drawing board.

Re: Configuration fault recovery
On 05/17/2017 08:40 AM, Yannick Lamarre wrote:
You are correct, I was assuming it was a design flaw.

To your original question, I suspect that a rail-to-rail short through a  
couple of FETs would be very hard to detect in a generalized way from  
the current signature. When a large circuit like a major clock  
distribution changes state, you will get a significant current spike,  
probably not unlike what you would see at the beginning of the short  
circuit situation. With the short circuit, that current will persist  
until something craters (unless the drivers had some kind of foldback  
current limiting). That seems like it might be detectable, until you  
consider what would happen if something like a bunch of relatively  
static GPIO signals driving external loads (maybe optocouplers @ 20 mA  
each) transition from off to on simultaneously. The destruct current for  
the clock driver is probably less than the normal current signature in  
this case.

You MIGHT be able to make a current signature analysis work in highly  
specific cases, but false tripping would be a serious problem.

The power supply decoupling capacitors are going to make detecting fast  
current spikes difficult externally. You might be able to monitor the  
voltage drop across the power supply bond wires in the package or  
internal distribution system to estimate current flow without adding  
sense resistance as a way to sense current after the decoupling caps.

In a previous job, I worked on hot swap power controllers. These chips  
were supposed to deal with the inrush current of charging the bulk  
capacitance on a board as it switches on, but shut down if the current  
got too high or the inrush persisted too long. The only way we could  
prevent false tripping was to set the thresholds and delays a lot higher  
than you would expect. When they work, they work well. You can short out  
a 100 Amp 12 Volt rail with a pair of pliers, and it will switch off  
before the power supply trips on over-current and shuts the whole cabinet down.
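
A minimal Python sketch of the threshold-plus-delay behaviour described above, to show why a short inrush is ignored while a persistent fault trips. The function name, the sample values, and the idea of counting consecutive samples are all assumptions for illustration; a real hot swap controller does this in analog or dedicated logic, not in software.

```python
def trip_index(samples_amps, threshold_amps, max_over_samples):
    """Return the index of the sample at which the supply is cut, or None.

    The controller trips only when the current has stayed above
    threshold_amps for max_over_samples consecutive samples, so a brief
    inrush spike is tolerated. All names and numbers are hypothetical.
    """
    over_count = 0
    for i, amps in enumerate(samples_amps):
        if amps > threshold_amps:
            over_count += 1
            if over_count >= max_over_samples:
                return i
        else:
            # Current dropped back under the limit: restart the timer.
            over_count = 0
    return None


if __name__ == "__main__":
    inrush = [1, 30, 30, 30, 2, 1]   # brief spike while bulk caps charge
    short = [1] + [100] * 10         # fault current that never decays
    print(trip_index(inrush, threshold_amps=20, max_over_samples=5))
    print(trip_index(short, threshold_amps=20, max_over_samples=5))
```

Raising `threshold_amps` or `max_over_samples` reduces false trips but also lengthens the time a real fault is allowed to persist, which is the trade-off mentioned above.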

I think detecting configuration changes would be better done through  
redundant LUTs or some similar method. You might even be able to  
implement that in an existing FPGA via the tool chain. This would not be  
the standard vendor type tool chain, but a specialized one. Developing  
this tool would be a good PhD project for someone.
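
As a rough behavioural illustration of the redundant-LUT idea, here is a Python sketch that models three copies of a 4-input LUT as 16-bit truth tables, flips one bit in one copy, and compares the outputs. This is an assumption about what "redundant LUTs" could look like (triple redundancy with a majority vote), not a description of any vendor tool, and it only models LUT contents, not the routing bits that would cause driver contention.

```python
def lut_read(table, address):
    """Read one output bit of a LUT modelled as an integer truth table."""
    return (table >> address) & 1


def vote(copies, address):
    """Majority-vote three LUT copies and flag any disagreement.

    Returns (majority_bit, mismatch). A mismatch suggests that one copy's
    configuration has been upset. Hypothetical model for illustration.
    """
    outputs = [lut_read(table, address) for table in copies]
    majority = 1 if sum(outputs) >= 2 else 0
    mismatch = len(set(outputs)) > 1
    return majority, mismatch


if __name__ == "__main__":
    golden = 0x8000               # 4-input AND: only address 15 reads 1
    upset = golden ^ (1 << 3)     # simulate one flipped configuration bit
    copies = [golden, upset, golden]
    print(vote(copies, 3))        # address touched by the flipped bit
    print(vote(copies, 15))       # address not affected by the upset
```

Note that in this model the upset is only noticed when the affected address is actually read, so a scheme like this would detect a corrupted entry lazily rather than immediately.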

This is pretty much speculation on my part, and I am not going to claim  
to be an expert on high rel stuff.

Good Luck,
Bob



Re: Configuration fault recovery

Assuming you managed to defeat all the protections and turn on both
transistors, I don't think it will be that bad.
The transistors are sized such that they can achieve a suitable slew on the
capacitance they will have to deal with.  It might be a long wire, but
on-chip the capacitance will be fairly small (guess: single pF or less)

Applying a simple T = RC with C = 1 pF and a time constant of 1 ns, each driver
looks like a 1 kΩ resistor.  Put two of those in series and you have 2 kΩ across
the power rail.  If the rail is 1.2 V, that's 600 µA, or 720 µW.

I don't think anything is going to cook with that.
Maybe it would be bad if you managed to short a thousand of them, but
it would take some effort to procure the cosmic rays.
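
The back-of-the-envelope estimate above can be reproduced with a few lines of Python. The function name and the `n_shorts` parameter are made up for this sketch, and the 1 pF, 1 ns, and 1.2 V figures are the guesses from this post, not measured values.

```python
def contention_estimate(c_farads, tau_seconds, vdd_volts, n_shorts=1):
    """Estimate current and power when two drivers fight each other.

    Each driver is modelled as a resistor R = tau / C (from T = RC), and a
    contention path is two such resistors in series across the rail.
    Returns (r_driver_ohms, total_current_amps, total_power_watts).
    """
    r_driver = tau_seconds / c_farads
    r_path = 2 * r_driver
    current = vdd_volts / r_path
    power = vdd_volts * current
    return r_driver, current * n_shorts, power * n_shorts


if __name__ == "__main__":
    r, i, p = contention_estimate(1e-12, 1e-9, 1.2)
    print(f"one short:  R={r:.0f} ohm  I={i * 1e6:.0f} uA  P={p * 1e6:.0f} uW")
    _, i_k, p_k = contention_estimate(1e-12, 1e-9, 1.2, n_shorts=1000)
    print(f"1000 shorts: I={i_k:.2f} A  P={p_k:.2f} W")
```

With a thousand simultaneous shorts, the same model gives roughly 0.6 A and 0.72 W, which is where the "maybe it would be bad" remark comes from.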

Theo
