(I had already emailed this to Austin in response to an email which he sent me, but I have just noticed that he posted the email to Usenet as well, so for the benefit of those who did not see my private response, I post it now.)
Austin,
I trust that you are sincere, but you would not be the first person working in aerospace who is mistaken, who is utterly convinced that he is not mistaken, and whose confidence is understandably bolstered by many genuine, positive experiences of overcoming would-be faults from radiation.
Generalization is a problem.
Radiation is just a detail. Even without radiation, you can not prove absolute safety. Can you prove at the 100% confidence level that your finite upper bound on metastability is valid?
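The "finite upper bound on metastability" in question is usually derived from the standard exponential synchronizer model, in which the mean time between failures grows exponentially with the resolution time allowed but never becomes infinite. A minimal sketch of that model; the constants tau and T0 and the frequencies are purely illustrative assumptions, not figures for any real device:

```python
import math

def metastability_mtbf(t_resolve, tau=25e-12, t0=0.5, f_clk=100e6, f_data=1e6):
    """Standard exponential model of synchronizer MTBF, in seconds:
    MTBF = exp(t_resolve / tau) / (T0 * f_clk * f_data).
    tau and T0 are process-dependent constants; the defaults here are
    illustrative assumptions only."""
    return math.exp(t_resolve / tau) / (t0 * f_clk * f_data)

# Each extra nanosecond of resolution time multiplies the MTBF
# enormously, but for any finite t_resolve the MTBF stays finite:
# the model never predicts a zero failure probability.
```

Note that the model itself is a curve fitted to measurements; whether its extrapolation holds at the 100% confidence level is exactly the question posed above.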
99.999999999% confidence from empirical measurements is inadequate if you want to claim that a problem is impossible. Can you disprove the claim of quantum mechanics that any component has an infinitesimal (i.e. > 0%, and therefore necessary to cover in a claim that something is perfect) probability of being spontaneously teleported to some galaxy we never heard of? Was hysteresis of unknown parameters overlooked in the curve fitting used in SPICE? Is the physics of deep submicron processes understood well enough?
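To make the gap between high empirical confidence and a claim of impossibility concrete: even a very long failure-free test only bounds the failure probability away from zero; it never establishes zero. A minimal sketch of the standard frequentist bound; the trial count is an illustrative assumption:

```python
def failure_rate_upper_bound(trials_without_failure, confidence=0.95):
    """Upper bound on the per-trial failure probability p, given that
    `trials_without_failure` independent trials all succeeded.
    Solves (1 - p)**n = 1 - confidence for p."""
    n = trials_without_failure
    return 1.0 - (1.0 - confidence) ** (1.0 / n)

# A billion clean trials bound p below roughly 3e-9 at 95% confidence.
# No finite number of clean trials ever bounds p at exactly zero.
```

The bound shrinks as trials accumulate, but the only way to reach p = 0 is an infinite test, which is the author's point about empirical confidence versus claims of perfection.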
As others have done, I congratulate you and Xilinx for many fine posts to newsgroups. I thank you for responding, though I would have been content with a response on comp.arch.fpga (if you used email to avoid publicly embarrassing me, thank you). However, I am still displeased that Xilinx did not answer a challenge I made in a similar discussion, in a thread to which Anne L. Atkins and Dr. John Williams also posted in 2007 (or maybe 2006). (I compose these posts and emails at home; as trying to find a way to pay for food is a major objective while I have limited networked time, it is not worth my while to give you an exact reference, which you can easily search for yourself.)
Austin emailed:
|------------------------------------------------------------------|
|"It is a question of completeness.                                |
|                                                                  |
|Logically going through every bit, is 100% functionally complete."|
|------------------------------------------------------------------|
Logic is theoretical, whereas the devices are actually subjected to physics. A VHDL simulator can not replace SPICE for electromagnetic compatibility issues; SPICE can not replace empirical experience; and extrapolating empirical experience to untried conditions can work, but it can also fail.
Similar points had been admitted in the book Thomas Kropf (editor), "Formal Hardware Verification: Methods and Systems in Comparison", Springer, 1997; in the final sentence of Section 5.3 of the book He Jifeng, C. A. R. Hoare, Jonathan Bowen, "Provably Correct Systems: Modelling of Communication Languages and Design of Optimized Compilers", 1994; in Section 12.1, "What Are Formal Methods?", of the book Jim Woodcock and Martin Loomes, "Software Engineering Mathematics: Formal Methods Demystified", 1988; and on Page 181 (though oddly enough, almost the opposite was argued on Page 180) of the book Fenton and Hill, "Systems Construction and Analysis: A Mathematical and Logical Framework", 1993. Furthermore, Dr. Fleuriot (who had been involved in collision-detection issues for aeronautics) of the University of Edinburgh said to me in a personal conversation on January 24th, 2008: "[..] there's no such thing as one hundred per cent guarantees [..]".
In an even more impressive triumph of missing the point than Fenton's and Hill's Pages 180 and 181, Zerksis D. Umrigar, Vijay Pitchumani, "Formal Verification of a Real-Time Hardware Design", Design Automation Conference 1983 contains: "[..] If there are no errors, inconsistencies or ambiguities in the specifications, and no errors in the correctness proof, then a successful proof enables one to be totally confident that the design will function as desired. [..]"
|---------------------------------------------------------------------|
|"Sitting in a proton beam is "waiting for Godot" -- how long must you|
|wait to check enough bits to achieve the required coverage?"         |
|---------------------------------------------------------------------|
True. (Though actually there are somewhat usable techniques for aiming at desired locations in a device.)
An even more important problem with a radiation source than the one you have raised is whether it is even similar enough to what will bombard the device in the field. This is similar to I.Q. tests: their goal is to measure intelligence, but they can not do so; instead they measure one's ability to do well in those tests. Though intelligent people are more likely to do well in those tests, someone who has been practising them will get improved marks without actually becoming more intelligent.
A paper in which it is shown that one radiation source can not be relied upon to be a perfect proxy for another is Jamie S. Laird, Toshio Hirao, Shinobu Onoda, Hisayoshi Itoh, and Allan Johnston, "Comparison of Above Bandgap Laser and MeV Ion Induced Single Event Transients in High-Speed Si Photonic Devices", "IEEE Transactions on Nuclear Science", December 2006. A minor discrepancy would probably not be important, but in one device it could make all the difference. Do not make unjustified generalizations.
Even if the relevance of the radiation is not in doubt, it can be very difficult to make measurements, as mentioned in Thomas L. Turflinger, "Single-Event Effects in Analog and Mixed-Signal Integrated Circuits", "IEEE Transactions on Nuclear Science", April 1996.
|---------------------------------------------------------------------|
|"It becomes a matter of "too many dollars to keep the lights on."    |
|(Beam testing is horribly power hungry, and very expensive, eg TSL is|
|$250K for a session, not including the airplane tickets, hotel rooms,|
|people, rental cars...)."                                            |
|---------------------------------------------------------------------|
Omnisys cut costs by using a source in a hospital. As mentioned above, that might not always be good enough; in that case it was.
Anyhow, in a field in which spending $2000-$10000 for four megabytes of radhard memory is not a problem, testing with radiation is not a needless luxury.
|-------------------------------------------------------------------------|
|"Additional system testing in a beam is highly desired, but the goals are|
|not for functional completeness, but to cover whatever might have been   |
|missed by flipping 100%, one by one, every configuration bit.            |
|                                                                         |
|XTMR Tool(tm) software can not be broken by a single radiative event,    |
|nor by a single bit flip (as verified by NASA, JPL, CERN, etc....)."     |
|-------------------------------------------------------------------------|
Would that be the same NASA which failed to pay attention to established schedulability analysis techniques for a rover for Mars, and which lost a probe intended for Venus in 1973 as a result of being satisfied with a decimal point instead of a comma?
Prof. William H. Sanders boasted on April 27th, 2006 at 12:04 that his group had convinced NASA JPL that it had solved NASA JPL's supposedly insoluble fault-tolerant spaceborne computer problem, posed in 1992. He showed his supposed solution; as it was not perfect and he did not seem inclined to admit this without being forced to, I challenged him, and at 12:28 he admitted that it was not perfectly solved because of "[..] the classic problem in fault-tolerant computing of who checks the checker?"
Scott Hensley of NASA said on June 4th, 2007 that his Europa TopoMapper proposal has still not been approved after fifteen years, partially due to the much worse Jovian radiation. If NASA is convinced that the techniques you use are sufficient, then why is this proposal still not approved? (I recently noticed that the European Space Agency is planning a mission to Jupiter. I do not know whether this is similar to the French space agency's example of ignoring common lore and sending doomed hardware into space, or whether the European Space Agency has actually overcome a serious obstacle.)
Would that be the same European center for nuclear research which is partially responsible for the paper Agostinelli, et al., "GEANT4---a simulation toolkit", "Nuclear Instruments and Methods in Physics Research A", 506 (2003), in which it is claimed on Page 252: "[..] It has been created exploiting [..] object-oriented technology [..]", despite being distributed with functions containing copied-and-pasted statements instead of common statements isolated in a shared function?
That is the same European center for nuclear research whose papers did not predict that physical effects would be observed at particular times of day and not at others, due to the systematic influence of a locomotive on particles' trajectories, until they realized that they should consult a railway timetable in order to determine when a train would not be around to disrupt an experiment. I doubt that XTMR Tool(TM) was as much help in that case as you might have thought.
|-------------------------------------------------------------------------|
|"Our flow triplicates the voters, so that every feedback path gets a full|
|TMR. A failure in a voter is "voted" out by the other two voters."       |
|-------------------------------------------------------------------------|
TMR can help a lot. It does. It does not work for everything. Your marketing is similar to many inadequate MAPLD papers.
If the probability of an upset for any gate is equal to the probability of an upset for any other gate, then winning_result
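Under the equal-upset-probability assumption just stated, the standard analysis of a two-of-three majority vote goes as follows: the triple returns a wrong result only when at least two of the three copies are upset, so a per-copy upset probability p becomes 3p^2(1-p) + p^3 for the triple, which is far smaller than p for small p, but never zero. A minimal sketch; the function names are mine, not Xilinx's:

```python
def tmr_upset_probability(p):
    """Probability that a two-of-three majority vote over three
    independent copies, each upset with probability p, yields the
    wrong value: exactly two copies upset, or all three upset."""
    return 3 * p**2 * (1 - p) + p**3

def majority(a, b, c):
    """Bitwise two-of-three majority voter."""
    return (a & b) | (a & c) | (b & c)

# TMR reduces, but does not eliminate, the upset probability:
# for p = 1e-6 the triple fails with probability about 3e-12,
# which is tiny but still strictly greater than zero.
```

A failure in the voter itself is not covered by this expression, which is why the quoted flow triplicates the voters as well; that pushes the residual probability down further, but by the same algebra it remains strictly positive, which is the crux of the disagreement above.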