Details the latest readouts for actual single event upsets for Virtex 4, and Spartan 3.
The improvements (6 times fewer upsets for Virtex 4, and 2.4 times fewer upsets for Spartan 3 as compared to Virtex II) show our commitment to making this a non-issue for our customers.
Make sure you demand from your ASIC/ASSP/FPGA vendor reports on their SEU susceptibility! This is not an area where you should take anything for granted. Reducing SEU susceptibility is not something the foundry builds in to their process; it is something that takes time and effort on the part of IC designers to implement, and more time and effort to verify.
For a vendor to claim "oh, we do what Xilinx does..." is completely insufficient. We certainly are not telling anyone how we did this (6X improvement): just that we have done this.
The only other company that we are aware of with a "Rosetta"-like program (work with foundry on process; design, simulation, and analysis by IC designers; measurements in neutron and proton beams; actual atmospheric testing in multiple locations at multiple altitudes) is Cypress.
Imitation is the sincerest form of flattery.
Additionally, Cypress has a good book on the effects of cosmic ray neutrons and protons on memories.
For more details, contact your FAE. They have materials accessible to them for customer presentations on this issue.
The level of activity on this board is a good indicator of the number of engineers that are designing with our chips at any given time. This group is also invaluable for us to get feedback on how well we are treating our customers. I would prefer that feedback be directed to our hotline; but, for those who are not getting the service they feel they deserve, I have also encouraged people to contact me (or Peter) directly.
Many people read this board like a stockbroker reads the morning news: it is where folks go to see "what's happening."
Can you post the big URL (as well) for posterity, in case tinyurl ever goes away?
Could you clarify something for me please? I have read the Rosetta stuff that I could find, and I'm still not sure: The experiment is testing the configuration latches only, not the flipflops I use in my designs - is that correct? Can you point me at the document I need to read more carefully :-)
TRW Conekt - Consultancy in Engineering, Knowledge and Technology
September Issue, IEEE Transactions on Device and Materials Reliability has the Rosetta story.
For those who have IEEE library usernames and passwords, or those who belong to this group, you may find the article online:
Go to Journals & Magazines, and to this transactions, and "view forthcoming articles."
Since we wrote this, IEEE owns the copyrights, and we can no longer distribute the paper.
If you wish to have a presentation on Rosetta, our FAEs have powerpoint slide shows on the subject they can present (under NDA).
Given your affiliation (TRW), I imagine you know who your Aerospace/Defense Xilinx FAE is, and can contact him regarding this subject.
The Aerospace/Defense Group has further requirements beyond those of the commercial group (heavy ions, total dose, etc.). These are addressed in other reports and publications. Xilinx has a radiation effects industry consortium with more than a dozen members who are actively working on the use, use models, performance, and mitigation of radiation effects. Please ask about this consortium.
We also have agreements with groups in the EU on the same subject, directed from our research arm in Ireland (to facilitate better communications).
Rick Katz of NASA (who posts here often) has proceedings from conferences he can direct you to with even more information that we have presented.
An example here is the MAPLD conference, 2005:
To answer your specific question: what about the D FF in the CLB? What is its Failure Rate in Time compared to the configuration bits?
There are 200,000 DFF in our largest part (XC4VLX200). The failure rate of these is 0.2 FIT (i.e. 1 FIT/Mb). That is 0.2 failures in 1 billion hours for the largest device (at sea level). The DFF is a very large, well loaded structure as it is programmable to be: a latch, a D flip flop, asynchronous reset, synchronous reset, with other options as well for load, preset, and capture.
Compared to the 6.1 FIT/million bits of configuration memory (as of today's readout) for the mean time to a functional failure, times the number of config bits (in Mbit), the DFF upset rate is many times less.
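For anyone who wants to redo the arithmetic above, here is a minimal sketch in Python. The 200,000 DFF count, the 1 FIT/Mb DFF rate, and the 6.1 FIT/Mb configuration rate are the figures quoted in this post; the configuration bit count is a made-up placeholder (the real number for a given device is not stated here), and the function names are purely illustrative.

```python
# FIT = failures per 1e9 device-hours. Rates in this thread are quoted
# per Mbit, taken here as 1e6 bits.
HOURS_PER_FIT_UNIT = 1e9


def device_fit(bits: int, fit_per_mbit: float) -> float:
    """Scale a per-Mbit FIT rate to a given number of storage bits."""
    return bits / 1e6 * fit_per_mbit


def mean_hours_between_upsets(fit: float) -> float:
    """Mean time between upsets, in hours, for a given FIT rate."""
    return HOURS_PER_FIT_UNIT / fit


# Figures quoted in the post: 200,000 DFF at 1 FIT/Mb.
dff_fit = device_fit(200_000, 1.0)

# ASSUMPTION: placeholder bit count, NOT the real XC4VLX200 figure.
example_config_bits = 10_000_000
config_fit = device_fit(example_config_bits, 6.1)

print(f"DFF:    {dff_fit:.2f} FIT")
print(f"Config: {config_fit:.1f} FIT (placeholder bit count)")
print(f"Config/DFF ratio: {config_fit / dff_fit:.0f}x")
```

Substitute the actual configuration bit count from the device documentation to get a meaningful ratio; the point is only that the per-bit rate and the bit count both work against the configuration memory relative to the DFFs.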
We also are recording the upset rate in the BRAM.
In Virtex 4, we have FRAME_ECC for detecting and correcting configuration bit errors, and BRAM_ECC for detecting and correcting BRAM errors (hard IP built into every device).
Regardless, for the highest level of reliability, we suggest using our XTMR tool, which automatically converts your design to a TMR version optimized for the best reliability using the FPGA resources (the XTMR tool understands how to TMR a design in our FPGA -- something that is not obvious how to do). In addition to XTMR, we also suggest use of the FRAME_ and BRAM_ ECC features so that you are actively "scrubbing" configuration bits so they get fixed if they flip, and the same for the BRAM. The above basically describes how our FPGAs are being used now in aerospace applications.
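For readers unfamiliar with TMR, the underlying idea is a 2-of-3 majority vote across three copies of the same logic. The Python below is only a software sketch of that voting principle; it is not how the XTMR tool is implemented, and the helper names are invented for illustration.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit follows at least two inputs."""
    return (a & b) | (b & c) | (a & c)


def flip_bit(word: int, bit: int) -> int:
    """Model a single event upset by inverting one bit of a word."""
    return word ^ (1 << bit)


golden = 0b1011_0010

# One of the three copies takes an upset: the vote masks it.
upset_copy = flip_bit(golden, 3)
voted = majority_vote(golden, upset_copy, golden)
print(voted == golden)

# Two copies upset in the SAME bit defeat the vote, which is why
# scrubbing (repairing flipped bits before a second one accumulates)
# is suggested alongside TMR.
voted_bad = majority_vote(upset_copy, upset_copy, golden)
print(voted_bad == golden)
```

The first comparison holds and the second does not, which is the whole argument for combining voting with active scrubbing.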
Well, we are an automotive only company now, so I have no direct Aero/Defense links most of the time, but we always get to the right people when we have questions to ask :-)
OK, that's good to hear. Presumably the DFF rate is what I need to compare with an ASIC (as they don't have configuration latches)?
That'll be interesting!
Our application is automotive, therefore will be something like Spartan-3E. We will have to use cleverness to avoid spending too much extra money on silicon - I don't think we can TMR the whole lot... we'll be speaking to your experts!
Anyway, that's design work for later on... Thanks for the info.
Mart > Well, we are an automotive only company now, so I have no direct
OK. Didn't know that. Congratulations. And for autos, we are VERY serious about SEUs, as we understand that no one wants their anti-lock brake system, or their collision avoidance system to suddenly freeze up!
This is a good reason to check on the Spartan 3E, which has added features, as well as being even more than twice as "hard" as the Spartan 3 (more things were done to improve its already good SEU hardness). It will take another six months to tell just how good 3E is, but it is being assembled now.
True, they have no configuration memory. But they do have SRAM blocks, and these can be very bad. From a foundry report, for a standard cell SRAM block, they listed 5,000 FIT/Mb as the failure rate. Compare that to our V4 rate (below) for our BRAM.
In a 90nm ASIC, the DFF is a standard cell: the smallest, fastest possible, with practically no loading. Now we also use standard cell ASIC synthesis for blocks of IP on our chip, so we also know its SEU hardness.
From a 90nm standard cell library, a M/S D FF was chosen:
Qcrit = 6.3 fC
Area affected is different (different layout)
FIT/Mb = 191
598 years between upsets (for 1 million flip-flops)
Compare that with our DFF, and our DFF is 191 times BETTER (less likely to upset).
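The "598 years" and "191 times" figures follow directly from the FIT definition, and a short Python sketch makes the conversion explicit. It assumes 1 Mb means 1e6 flip-flops and a 365-day year; the function name is illustrative only.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def years_between_upsets(fit_per_mbit: float, bits: int = 1_000_000) -> float:
    """Mean years between upsets for `bits` storage elements.

    FIT is failures per 1e9 hours, so the mean time between upsets in
    hours is 1e9 divided by the total FIT of the population.
    """
    total_fit = fit_per_mbit * bits / 1e6
    return 1e9 / total_fit / HOURS_PER_YEAR


asic_years = years_between_upsets(191)  # 90nm standard cell M/S D FF
fpga_years = years_between_upsets(1)    # DFF rate quoted earlier in the thread

print(f"ASIC DFF: about {asic_years:.0f} years per upset per million FFs")
print(f"FPGA DFF: about {fpga_years:.0f} years per upset per million FFs")
print(f"Ratio:    about {fpga_years / asic_years:.0f}x")
```

With 191 FIT/Mb the first figure comes out just under 598 years, matching the number quoted above, and the ratio is simply 191 / 1.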
Seriously, you MUST ask your ASIC vendor to justify their attitude that "they do not have a problem" as they most obviously DO have a problem! After all, we KNOW, as we, too, use standard cells, and we are forced to do things to our blocks to minimize the effects of neutron strikes. For example, we TMR some critical logic that is standard cell based.
Right now, their (ASIC and ASSP vendors) attitude is to do nothing at all (because they are "better", which of course, they are no longer).
Yes, it is. 22 FIT/Mb, or 16 times better than 0.15u. Probably the most dramatic improvement in Virtex 4. Compared to a 90nm ASIC 5,000 FIT/Mb, I seriously suggest you have no choice but to use our FPGAs.
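As a quick sanity check on the BRAM numbers, the ratios can be worked out from the three figures quoted here (22 FIT/Mb, "16 times better", and the foundry's 5,000 FIT/Mb). The implied 0.15u rate below is derived from those figures, not a separately published number, and the variable names are illustrative.

```python
v4_bram_fit_per_mb = 22.0        # Virtex 4 BRAM, as quoted
improvement_over_015u = 16.0     # "16 times better than 0.15u", as quoted
asic_sram_fit_per_mb = 5000.0    # foundry standard cell SRAM, as quoted

# Derived values (arithmetic only, not measured data).
implied_015u_fit_per_mb = v4_bram_fit_per_mb * improvement_over_015u
asic_vs_v4_ratio = asic_sram_fit_per_mb / v4_bram_fit_per_mb

print(f"Implied 0.15u BRAM rate: {implied_015u_fit_per_mb:.0f} FIT/Mb")
print(f"ASIC SRAM vs V4 BRAM:    about {asic_vs_v4_ratio:.0f}x")
```

So the quoted ASIC SRAM block upsets roughly 227 times as often per Mbit as the Virtex 4 BRAM, before any ECC is applied.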
Don't have a number for Spartan 3, or 3E BRAM, yet. Too few BRAM bits in those parts, so it takes forever to get any statistically significant data. We do have LANSCE beam test results, so we know we are better than we were in 0.15u, we just don't know by how much yet like we do for Virtex 4.
By the way, all data (in these postings) is the mean for the 95% confidence level. The variation is +/- 20% based on all of the error factors (we and others have discovered). The list of error factors is quite long, but we have worked hard to get this as accurate as we possibly can.
Talk with your FAE. TMR ONLY the critical element(s). The XTMR tool allows you to pick and choose your level of TMR (there are options for IO, voting, etc. that you may choose to use that apply to your application), and by block, which ones get TMR'd. The tool may have been developed for the aerospace/military market, but as any auto engineer knows, under the hood of a car is a far more hostile environment than a satellite in earth orbit, or on a battlefield (unless you are under the hood of a car on a battlefield).
"Upon transferring copyright to IEEE, authors and/or their companies have the right to post their IEEE-copyrighted material on their own servers without permission, provided that the server displays a prominent notice alerting readers to their obligations with respect to copyrighted material and that the posted work includes an IEEE copyright notice."
As Xilinx employees are authors and co-authors of quite a number of papers, a collection - maybe on the XUP homepage - would be nice.
Why does it not seem right that you as an author reserve at least a few rights when giving most of your rights on a paper away for free to a commercial publisher?
I think it is ridiculous that publishers ever demanded exclusive rights without compensation. If you want to print conference proceedings you can do that with nonexclusive rights.
And I am not the only one. A few years ago a study was published showing that papers that are available online for free are cited twice as often as other papers. As a result more and more authors refused to give away exclusive rights. Universities made it their policy that the right to publish on the university homepage must stay with the university.
Publishers that did not want to play that game risked losing conference proceedings contracts. Most publishers now allow the author to keep the right to publish on the author's web server. This is a minimalistic compromise on the side of the authors.
O'Reilly goes a large step further:
I suggest that every scientific author at least try to retain the rights to his work. Once your paper is accepted by a conference committee, you get the copyright transfer form from the publisher with a few weeks' delay. Just refuse to sign it and send them a note that you are putting the paper under a Creative Commons license. I doubt that the publisher is going to explain to the conference chair that they are not going to print your paper because they cannot get exclusive rights. After all, you are offering them a full set of rights for free.