FPGAs starting with incorrect bitstream !?

Question

Hi,

Until recently I believed in good faith that all decent FPGAs have bitstream integrity checks and do not start if there are configuration loading errors. This seems not to be the case, at least for Xilinx Virtex2 FPGAs.

I have a design and an FPGA evaluation system where I constantly see bitstreams that start but behave erratically. The only explanation I can see is that there were errors during download, yet impact (JTAG download) does not report an error and the FPGA starts as if everything were OK. After power off and reconfiguration the error is gone.

1) From Xilinx answers: if the prog_b pin is pulsed during JTAG download, the FPGA configuration sync is lost, which results in garbage being loaded into the FPGA and the FPGA starting with that garbage, with no errors reported during configuration. My system has a button and a pullup resistor on the prog pin; nobody is pushing it during download.

2) Xilinx Virtex2 FPGAs have a new feature called AutoCRC, which is more reliable than the CRC used in older FPGAs. The normal CRC check (RCRC command and write to CRC register) is still being used unless it is a debug bitstream! Good god, but why does impact generate bitstreams with the CRC value fixed at 0x5F57 for all Virtex2/p/s3 devices?? The whole point of a CRC is that it is not constant but calculated?

OK, the AutoCRC is written, but the AutoCRC should only operate on frame data? How are other config writes protected if the normal CRC check seems to be bypassed???

Antti

PS 0x0000DEFC !!! For those who do not know the meaning of 0xDEFC: it is the DEFault Crc value written to the CRC register when the CRC check is disabled. When the CRC check is enabled the CRC is 0x5F57, but the meaning of that, sorry, I cannot decode! It must be a magical value that matches any good CRC value (a calculated value!).

PPS Xilinx: where is the algorithm for AutoCRC???

Mike Treseler · Accepted Answer

Check your clock and reset. Consider simulation before synthesis. There are many possible sources of erratic behavior after download.

I would expect a fixed crc sum for a good packet. The packet generator should add the proper suffix word (FCS) to make this happen.

Good luck.

-- Mike Treseler
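
The "fixed crc sum for a good packet" idea can be illustrated with a generic CRC. This is only a minimal sketch, assuming the ordinary CRC-32 from Python's zlib and a little-endian suffix word; it is not the Virtex2 configuration CRC or AutoCRC, whose algorithms are not given in this thread. The helper name append_fcs is made up for the example.

```python
import struct
import zlib


def append_fcs(payload: bytes) -> bytes:
    """Append the CRC-32 of payload as a little-endian suffix word (FCS)."""
    return payload + struct.pack("<I", zlib.crc32(payload))


# Two unrelated payloads, each with its own calculated suffix word.
frame_a = append_fcs(b"first payload")
frame_b = append_fcs(b"a completely different payload")

# Running the CRC over payload + FCS leaves the same residue for any
# intact frame, so a checker only needs to compare against one constant.
residue_a = zlib.crc32(frame_a)
residue_b = zlib.crc32(frame_b)

# Flip one bit to model a transmission error; the residue no longer matches.
corrupted = bytearray(frame_a)
corrupted[3] ^= 0x01
residue_bad = zlib.crc32(bytes(corrupted))

print(hex(residue_a), hex(residue_b), hex(residue_bad))
```

So a constant value at the end of a stream does not by itself mean nothing was calculated. Whether that is what 0x5F57 represents in a Virtex2 bitstream is not established here.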

Bob Perlman · Answer

Does this mean it wasn't simulated?

No, but it remains to be seen whether that's the problem. If you haven't simulated, start there.

Bob Perlman
Cambrian Design Works

Antti Lukats · Answer

This erratic behaviour only happens with a known good, working bitstream, and only on some downloads. The whole system (a 1M gate system with a MicroBlaze) is working, but the soft-core microcontroller sees some hard-wired registers return random data (not the pre-programmed constant). The bad register value is consistent for one download attempt and persists after hardware reset as well.

Say you have a hardwired register that should always read as 0xAA, but on some download attempts it reads, let's say, 0xE1 every time you do a hardware reset. The next download is OK again.

It is no fun to simulate a full 1M gate design with MicroBlaze! And you cannot simulate a badly configured FPGA anyway, can you?

Hm, but checking clock and reset, that is maybe a good thing to do. The system has 2 clock domains running from 2 different external clock inputs, and 3 DCMs, so the reset of the system is not simple. And yes, the register that returns bad data is in a different clock domain from the system SoC.

But the fixed checksum doesn't seem possible, as there are 2 checksum locations:

1) AutoCRC after the frame data: this is calculated and OK
2) normal CRC: this is fixed to 5F57

There is no way the AutoCRC is the correct CRC for the previous data and also fixes the next CRC to a constant value!!

Thanks, Mike

Antti

Bob Perlman · Answer

I don't know what this means. Are you getting erratic behavior in 1 out of 100 JTAG downloads? Or 100% of JTAG downloads?

Resets do not reset everything. They do not, for example, re-initialize block RAM. If you are depending on the initial contents of a block RAM for proper operation, and your circuit occasionally stomps on block RAM shortly after start-up, your circuit may not work until you reconfigure.

As I said in my previous post, you haven't proved that configuration is the problem. And I'm not, repeat, NOT, suggesting that you somehow simulate the configuration process. But it would be interesting to know if there's a way resources like block RAMs could be corrupted shortly after you come out of reset, perhaps due to problems with interfaces between mutually asynchronous clock domains.

I can't rule out the possibility that you are occasionally loading a corrupted bitstream, but it seems very unlikely. Doctors have a saying: when you hear hoofbeats, think horses, not zebras. If I had a design that I didn't simulate, and configuration seemed to complete successfully, I'd start looking somewhere other than configuration for my problem.

Good luck, Bob Perlman Cambrian Design Works

Jim Granville · Answer

Have you tried read-back of the FPGA in these cases? This could be a candidate for an overnight run of continual downloads/readbacks. Is this specific to a single device, or common to multiple FPGAs?

-jg
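
An overnight run like this could be driven by a small script. The following is only a sketch: download and readback_matches are hypothetical callables that you would have to implement around whatever JTAG tool you use (nothing here is an impact API), and the stub lambdas at the bottom exist purely so the example runs on its own.

```python
from collections import Counter
from typing import Callable


def soak_test(download: Callable[[], bool],
              readback_matches: Callable[[], bool],
              attempts: int) -> Counter:
    """Repeat download + readback and tally the outcome of every attempt."""
    tally = Counter()
    for _ in range(attempts):
        if not download():
            tally["download_error"] += 1      # the tool itself reported a failure
        elif not readback_matches():
            tally["silent_mismatch"] += 1     # tool reported success, readback disagrees
        else:
            tally["ok"] += 1
    return tally


# Stub hardware for illustration only: every download "succeeds" and
# the 100th readback disagrees with the bitstream.
fake_readbacks = iter([True] * 99 + [False])
print(soak_test(lambda: True, lambda: next(fake_readbacks), attempts=100))
```

A non-zero silent_mismatch count on real hardware would point at configuration. A zero count with the erratic behaviour still present would point elsewhere.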

Antti Lukats · Answer

Yes, it means that the 1M gate design with 32K of application code for MicroBlaze has not been simulated. All the custom IP cores connected to the MicroBlaze have of course been simulated.

Dear Bob,

I have a bitstream that always starts OK when loaded from configuration memory, and starts with erratic behaviour in 1 of 100 JTAG configuration attempts (even when the JTAG configuration did not show any error during download). When the bitstream starts badly it also behaves badly after reset; only a full reconfiguration makes the system work again. So I assume it is possible that the CRC check is not sufficient in Virtex2 devices and that they sometimes do start after a bad download.

You suggested that this erratic behaviour of starting badly when loading over JTAG could be found by running simulations?! Well, I really can't see how any simulation model could take into account errors that happened during download. Or what was it that I could possibly find in simulation?

Antti

Ray Andraka · Answer

I'm coming a little late to this conversation, but perhaps this has not been considered. I sincerely doubt it is a configuration problem. Much more likely, you are not coming out of reset at the end of configuration cleanly. The global reset must be considered asynchronous to the clock. Most likely, you are occasionally getting a situation where one or more flip flops are seeing the end of the configuration reset a clock cycle before or after other flip flops in a critical area of your design. Simulation usually won't catch this, so you need to do a careful examination of the start up of your design. I can't tell you the number of designs I've seen that make this common mistake, even from FPGA board vendors with much experience that really should know better.

Check the state machines in your design. The resets for them should come from a flip-flop in the design that feeds all the reset inputs to the state machine. You can't depend on global reset going away on all flip-flops during the same clock cycle.
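
The failure mode described here can be shown with a toy model. This is a minimal sketch in Python, assuming a hypothetical 3-bit one-hot ring counter (not taken from Antti's design) in which each flip-flop may leave reset on a different clock edge. The function name run_ring and the release_cycle parameter are invented for the example.

```python
def run_ring(release_cycle, cycles=6):
    """Simulate a 3-bit one-hot ring counter.

    release_cycle[i] is the first clock edge at which flip-flop i is no
    longer held in reset. The reset values are 1, 0, 0.
    """
    reset_value = [1, 0, 0]
    state = list(reset_value)
    history = []
    for edge in range(cycles):
        nxt = []
        for i in range(3):
            if edge < release_cycle[i]:
                nxt.append(reset_value[i])   # still held in reset
            else:
                nxt.append(state[i - 1])     # shift in the previous stage
        state = nxt
        history.append(tuple(state))
    return history


clean = run_ring([1, 1, 1])    # every flip-flop released on the same edge
skewed = run_ring([2, 1, 1])   # flip-flop 0 leaves reset one edge late

print("clean :", clean)
print("skewed:", skewed)
```

In the clean run exactly one bit is set on every edge. In the skewed run a second bit gets loaded while flip-flop 0 is still held, and the counter then circulates two hot bits indefinitely. Driving all three resets from one flip-flop clocked in the same domain corresponds to giving them the same release edge.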

--Ray Andraka, P.E. President, the Andraka Consulting Group, Inc.

401/884-7930 Fax 401/884-7950 email snipped-for-privacy@andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759

Squirrel · Answer

If it is a V2, and you only experience problems when downloading from JTAG... Keep in mind that the V2 supports partial reconfiguration. As such, the JTAG bit-banger from IMPACT doesn't invoke the global clear when re-configuring, so BRAMs will not initialize, and FFs _may_ not. I think I saw a Xilinx solution record with information on how to invoke the global initialization through JTAG manually... the Chipscope tool does this currently. Or, you can manually short the PROG pin to ground.

By the way, this is documented in the V2 design guide.

-S

Jim Granville · Answer

If I read this right, you are saying that read-back does show the error, and that the error persists over many read-backs until re-config? That does sound like a config-write error. Have you tried multiple devices (ideally with differing date codes)? If this persists across device/date code boundaries, I would say it shows a serious blind spot. In general, any device programming includes a verify step, but on FPGA devices skipping verify has probably become the norm, for 'saving time' reasons. If the CRC is not sufficiently reliable, then that would make config something of a lottery. [Just maybe they do not CRC the whole bitstream?]
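
A verify step of this kind boils down to a masked comparison of the readback data against the expected image. This is only a sketch under stated assumptions: the three byte strings are taken to be already aligned and of equal length, and a 1 bit in compare_mask means "compare this bit", which is a convention chosen for this example and may be the opposite of what a real mask file uses. The function name verify_readback is hypothetical.

```python
def verify_readback(expected: bytes, readback: bytes, compare_mask: bytes) -> list:
    """Return the byte offsets where readback differs from expected.

    Only bits set to 1 in compare_mask are checked (assumed polarity);
    bits that legitimately change at run time should be masked out.
    """
    if not (len(expected) == len(readback) == len(compare_mask)):
        raise ValueError("all three images must have the same length")
    return [
        offset
        for offset, (e, r, m) in enumerate(zip(expected, readback, compare_mask))
        if (e ^ r) & m
    ]


# Made-up data: byte 1 has a real mismatch, byte 2 differs only in masked-out bits.
expected = bytes([0xAA, 0x55, 0xFF])
readback = bytes([0xAA, 0x54, 0x0F])
mask = bytes([0xFF, 0xFF, 0x0F])
print(verify_readback(expected, readback, mask))
```

An empty list means the compared bits all match. Any offsets returned are candidates for a genuine configuration error.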

Perhaps someone from Xilinx could clarify more what AutoCRC is, and does ?

-jg

Bob Perlman · Answer

Good question. Antti, when you say that "readback" is consistent, are you referring to the MicroBlaze's readback of that one register, or are you saying that you are seeing an error when you perform a bitstream readback?

Bob Perlman Cambrian Design Works
