Virtex-II vs Stratix

I'm glad I'm not the only one who's given up on mySupport. I think it is seriously broken. A number of times, I had to argue with whoever's on the other end that I really was experiencing a Quartus bug. Often, the response I got was a cut-and-paste of the Quartus error message, as if I wouldn't have looked at it myself already.

I'm very thankful there are knowledgeable Altera people because this is the first place I go for Q&A now.

-- Pete

Peter Sommerfeld

SD, Responding to your points . . .

  1. I/O was not taken into account in the benchmarks that we published. Altera has run numerous experiments to measure the impact of I/O constraints on fMAX performance. We found that these constraints typically change the relative performance advantage (or disadvantage) by less than 5%, a relatively minimal impact. We left them out of this analysis for two reasons: first, we do not have I/O constraints for all of our designs (not all customers provide them to us), and second, they complicate Altera's internal goal of automating this benchmark process so that it is repeatable with each new software release or new silicon architecture.

  2. We add designs to our benchmark suite as we get them (after they go through a rigorous process of optimizing the HDL code for both Altera and competitive architectures and verifying the results to ensure that any outlying results are legitimate).

  3. We do not use any manual placement for these benchmarks. The reason, again, is the need for a methodology that is both automated and repeatable. Critical path information is a critical tool for improving place & route / fitting algorithms, so we regularly analyze this information, though not in the context of evaluating improvement through manual placement.

Dave Greenfield Altera Product Marketing

Dave Greenfield

What if I want to change the code in an existing board? That seems like a reasonably common case. I'd expect you would want to cover it. (But maybe I missed the big picture that started this discussion.)

If your current set of tests doesn't have that info, you might be able to run place/route once, capture the pin locations, and then run p/r again with that pinout. (Or perhaps scan it manually and clean things up if they are too ugly. Or get a PCB guy to check it...)

--
The suespammers.org mail server is located in California.  So are all my
other mailboxes.  Please do not send unsolicited bulk e-mail or unsolicited
commercial e-mail to my suespammers.org address or any of my other addresses.
These are my opinions, not necessarily my employer's.  I hate spam.
Hal Murray

Hal,

[My response is in the context of internal benchmarking for the purposes of Quartus QoR (Quality of Results) improvement -- that is, comparing Quartus to Quartus. Competitive benchmarking is much more complicated due to the varying capabilities and behaviours of different CAD tools and device architectures, and I'm not familiar with the exact details of this process, except to say that there is a lot of thought that goes into the settings used.]

We typically benchmark our Quartus p&r internally with randomly selected pin placements. This mimics the situation you point out above, where I/O locations are selected and locked down before the design has been fully implemented. In reality, a pinout isn't completely randomly placed, since there will be some correlation of pin placement and logical function (memory buses together, etc.), but in the absence of user-supplied pinouts, it's a good pessimistic way to ensure that the CAD tool does a good job of optimization in the presence of arbitrary pin assignments. An alternative approach (as you suggest) is to use an optimized pin placement that we lock down and then run another placement run with a different random seed.
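The random-pinout idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Altera's actual flow: the signal names, the pin names, and the `random_pinout` helper are all hypothetical, and a real flow would pass the result to the fitter as location constraints.

```python
import random

def random_pinout(signals, pins, seed):
    """Assign each signal to a distinct pin, chosen pseudo-randomly.

    The same seed always yields the same pinout, which keeps a benchmark
    run repeatable. All names here are hypothetical.
    """
    if len(signals) > len(pins):
        raise ValueError("not enough pins for the given signals")
    rng = random.Random(seed)                 # private generator, no global state
    chosen = rng.sample(pins, len(signals))   # distinct pins, random order
    return dict(zip(signals, chosen))

# Hypothetical design I/O and device pins
signals = ["clk", "reset_n", "data[0]", "data[1]", "data[2]", "data[3]"]
pins = ["PIN_%d" % n for n in range(1, 21)]

locked = random_pinout(signals, pins, seed=1)
for signal, pin in locked.items():
    print(signal, "->", pin)
```

Keeping the pinout fixed while varying only the fitter's own seed would correspond to the "lock down, then re-run" alternative mentioned in the same paragraph.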

We also run tests on Quartus where pins are free to move, and ensure that yes, it actually does a better job optimizing Fmax, Tco, and Tsu as a result. We will sometimes arbitrarily set various I/O standards for the I/Os to make sure we can handle complicated I/O placement cases. And we run p&r experiments with both I/O constraints and core Fmax constraints, or just some of each, to make sure Quartus behaves well under different constraint conditions. I bet half our office heat comes from the 100s of dedicated CPUs crunching away on various QoR sweeps 24 hours a day!

As David points out, there are a lot of problems with getting real customer pinouts and/or I/O constraints. Besides the primary problem that we often don't receive these with the design, we are also using many designs that weren't originally targeted at the device we're testing. For example, during Stratix II development many of the designs we were using for benchmarking were originally targeted at APEX, Stratix, various Xilinx parts, and even some ASICs.

Regards,

Paul Leventis Altera Corp.

Paul Leventis (at home)

Dave,

If I have offended anyone, I apologize.

However, I do not appreciate the mis-quote of this newsgroup in your slides at the presentation. Odd how my comment was distorted and then made it into the presentation. Makes me wonder.

But, all of that aside, you will be happy to know that I no longer will comment on software, or software performance in this forum. We have someone who is now tasked with that subject. I am not an expert in that field (as I so amply demonstrated by opening my mouth and letting everyone know): all I can do is repeat what I hear from our customers, and our own engineers. They can do a much better job.

The questions I posed are legit, however. As well as comparing 90nm to 90nm ('Our latest announced yet to be shipped chip is better than your 2 year old 130nm technology chip...')

I will continue to monitor the group for the questions that I can shed some light on: signal integrity, IO modeling, IC Design, etc.

Now you have a (mis) quote from me in your slides on a subject that I have publicly stated I am not an expert in.

Austin

Austin Lesea

Dave,

Thanks for your response. If I may address some of these points one last time...

  1. I understand that you don't have constraints for all these designs, but for the designs you ran the benchmarks on, wouldn't it be more thorough to include the I/O timing for the critical path as well? Since you already have the data, it shouldn't be much more effort. Would it be possible to at least show an average Tsu/Tco change on the critical paths for the benchmark designs? I'm not disputing your claims of a 5% difference, but without that data, I'm only getting numbers for the middle slice of the path.

  2. Could you provide the approximate average age of these designs? Also could you comment on whether you think some of the discrepancy in the benchmarking results is due to tool/architecture tuning to these designs? If the designs were used during Altera's tool/architecture development, then they should (and hopefully would) favor an Altera implementation.

  3. Sounds reasonable enough :)

SD

SD

Rajeev, although I can't disclose the detailed roadmap here, I can tell you that demand for DSP Builder continues to increase rapidly and we plan to have multiple product releases this year. These product releases will fix issues found by users and add new features to improve the capabilities and ease of use of this tool.

Brian Jentz Altera DSP Product Marketing

Brian Jentz

Hi Austin,

To be fair, we do have comparisons posted of Cyclone vs. Spartan-3 (our 1.5 year old 130 nm chip vs. your recently released 90 nm one). Despite the process disadvantage and its age, Cyclone offers 70% better performance. See [link] for details.

It's worth pointing out that 90 nm is not really buying a huge amount of performance as compared to 130 nm -- nothing like the good old days of process scaling. The static power problem posed by

Paul Leventis at home

Paul,

Hi.

Yes, S3 is 90nm, but it addresses a completely different market than the Virtex line. S3 was not intended to be 'faster', as the low cost high volume market doesn't care about fast -- they just want to use FPGAs instead of ASICs.

As for 90nm 'not buying a huge performance improvement', that remains to be seen when we announce V4 and publish its specifications. Intel has made performance gains, so has TI. It can be done.

We did not intend to get any improvement in performance at all in S3 (over Virtex II at 150 nm) but just crash through the $$$ barrier for logic per dollar, which is totally different than what we are doing for V4.

So to compare our 90nm S3 product to your 130nm one, is also not relevant: cost is the issue, not speed. Cost wins in this market.

The one thing that I don't understand, is why do you not have SRL16s?

The architecture of having these SRL16s is such a huge advantage (for us), I am amazed you haven't provided a competitive answer to them. Now you have reached parity with the ALM (which is all that we see in our benchmarks), what happened to putting in SRLs?

Austin

Austin Lesea

Hi Austin,

The Intels of the world don't have the 500 million transistor problem that PLD vendors do -- and the CPU vendors can afford massive thermal solutions. Don't get me wrong, 90 nm does buy performance (we see it too -- the 50% performance gain in Stratix II vs. Stratix isn't all from cute architecture improvements). It just doesn't come automatically like it once did, and the power/performance trade-offs are very different.


We too did not architect Cyclone with speed as the target -- our goal was to produce a very low-cost product, a goal which we exceeded. We were careful to examine the speed-vs-area trade-offs of all decisions and didn't needlessly sacrifice performance. It's pretty easy to make a low-cost chip -- making a low-cost chip that is also zippy is a much more difficult problem.

And speed can be used to reduce cost. If speed is unimportant, why sell faster speed grades of S3? For a user who needs to buy a fast speed grade of S3 (if they can get their hands on one...), their design will often meet performance requirements in the slowest Cyclone device. And if you can get a huge performance win, you can reduce the number of LEs required (reducing degree of parallelism for given throughput, etc.) potentially reducing the size of the device required, which is another cost lever.
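The "speed as a cost lever" argument can be made concrete with a little arithmetic. The sketch below assumes a simple model in which each parallel datapath processes one sample per clock; the throughput target and the Fmax values are made-up numbers for illustration, not figures for any real device.

```python
import math

def lanes_needed(throughput_msps, fmax_mhz):
    """Parallel datapaths needed if each one handles one sample per clock."""
    return math.ceil(throughput_msps / fmax_mhz)

target = 800  # hypothetical throughput requirement, in Msamples/s

# Hypothetical achievable clock rates for the same design
for fmax in (100, 150, 200):
    print(fmax, "MHz ->", lanes_needed(target, fmax), "lanes")
```

Under this model, doubling the achievable clock rate halves the number of replicated datapaths, and with it roughly the logic needed for that part of the design.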

I guess we'll have to wait until Cyclone II is released to get a true apples-to-apples comparison of cost vs. performance in the same process. But we'd really have to work hard at cutting performance relative to Cyclone in order for that comparison to not look good for us...

Obviously we have considered them. But the "huge advantage" isn't that clear to me (though Ray has some interesting applications). If all you have are 18K RAMs, then yes, some sort of small memory would be handy. But how much die size are you paying for SRL16s? Clearly SRL16 consumes some appreciable amount of die space, otherwise you wouldn't have invested the effort required to design and lay out two different slices in Spartan-3. Would that die size be better spent on small discrete memories (such as our M512 blocks)? That all depends on what you think the types of designs implemented in FPGAs look like, how much memory they have, what that memory is doing, how it is organized, whether shift registers are related or not, etc.

The continued presence of M512 blocks in Stratix II suggests that we think that a mix of small, medium, and large memories, aggressive packing of logic with registers into LEs, and SW/IP support for packing shift registers into RAMs provides the most cost-effective solution.

We see a 54% advantage ALM vs. LC, you see a 0% advantage... What does it matter? No one will believe either of us -- all I can do is to (a) say that we believe Stratix II is 25% more efficient than Stratix (which we definitely know how to measure), (b) point to our respective whitepapers on the topic ([link], [link], Xilinx WP209), and (c) wait five years or so to see what the marketplace has to say.

Regards,

Paul Leventis Altera Corp.

Paul Leventis (at home)

Change of topic, but I'm curious.

What fraction of P&R runs have the IO pins locked down?

I'd guess many/most, but maybe my view is warped.

I generally assume that the pin assignment is a cooperative project between the PCB designer and the FPGA designer. Clocks and high speed signals get more attention, but sometimes moving a signal to a different pin helps the board and sometimes it helps the FPGA.

At some point the pin assignment gets fixed and the board gets built. But the FPGA design continues to evolve: sometimes minor bug fixes or easy enhancements, sometimes major upgrades. But all using the initial I/O pin assignment.

Hal Murray

Paul,

Excepting the performance claims, we seem to be in "violent agreement" yet again. It surprises me when people are amazed when we agree on something. In large part we agree on a vast number of issues.

One item you failed to mention is that our die size is a lot smaller than yours in comparable technologies. That also has something to do with trade-offs made in S3. As for cutting the number of SRL16s in half, even Ray wasn't able to use 100% of them, and the area penalty while small, adds up as it is replicated everywhere. Cost cost cost. That is all about S3. (As well as it has to have margin margin margin) as no one works for free.

As for the faster speed grade, we wanted to have a one speed grade family for S3, but there are some customers who basically are trying to replace a VII with the S3. The product was not intended to do that (as pulling an "Osborne" is not good for business!). But, as we had yield to a faster grade, and we can charge for it if people want it, why not get the money?

Poor Adam, I knew him, and traveled and lectured with him a long time ago. Funny his failure is now a standard business school study case. He is but one example of why we should all strive to be humble....


Good luck.

Austin

Austin Lesea

--

--Ray Andraka, P.E. President, the Andraka Consulting Group, Inc.

401/884-7930 Fax 401/884-7950 email snipped-for-privacy@andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759

Ray Andraka

[Austin & Paul, thanks for the civilized and interesting discussion.]

The Computer History Museum recently had an "Osborne Odyssey" evening which gave a bit more nuanced view into the story. The impression I got from the evening was one of a somewhat poorly organized company and Osborne not really being qualified for running the business.

The new machines, while ready to ship, were held back because floppy disks could occasionally get stuck in the drive (requiring pliers to get them out).

Another interesting tidbit: Osborne was quite arrogant and would readily turn down interested resellers for "not being professional enough". All while the major competitor Kaypro would sell as many or as few as anyone would like.

Tommy

Tommy Thorn

I thought I might go for a green and not red apple...

We have been discussing at work the need for subscriptions and who we pay.

Has anybody done comparisons re: timing and 'compactness' for Synplify vs. XST in small (

Simon Peacock

He came to visit us at Inmos (for a big stash of pounds) in 79/80 to tell us what he thought we should be doing in micro architecture given his knowledge of 8080s and so on.

I think everyone concluded what a waste of money, nice fella but essentially telling us to do the same as Intel, Moto. Kind of brought in as a celebrity. Well he was a Brit so we were curious about him. He certainly had solid views on architecture; he was wrong, and later some might say so were we (or maybe not).

regards

johnjakson_usa_com

john jakson

I used to see him wandering the streets of Berkeley, staring at his feet. I never spoke to him, but he seemed a sad and broken guy.

Pete Fraser

All,

He was a real character. That was part of the whole mystique: he was actually Australian, but for some reason the Brits loved him (must be the accent). I know because I was supposed to have a day off from lecturing, and I was flown from Paris to London to "replace" Adam when he did not show from Stockholm for an appointed 3 day course folks had paid $750 a day to attend.

I started my talk with "anyone who paid for Adam, I am not he. I expect you to at least sit and listen to me until lunchtime before I will entertain giving anyone's money back. Offer me that one courtesy."

At lunchtime, I had one person ask for their money back, and I was glad they did (as I would have thrown the a**hole out if he had not).

As for why I was in Paris, and why he got 'stuck' in Stockholm, I will not say here. But it is one hell of a story if you ever find me and remember to ask me!

Adam was a fine gentleman with a sharp mind, and a gift for writing and explaining.

A quote from one of my brothers is appropriate here: "if I didn't have any idiosyncrasies, I'd have no personality at all!"

The world is lessened for the loss of folks like Adam.

Austin

Austin Lesea

I am coming late to the party as usual. I would like to say that I prefer not to see marketing espoused here, pure or otherwise. Marketing is of little value and is counterproductive when it gets in the way of seeing what the real issues and facts are. It also tends to create a lot of spurious responses that clutter up the group. That aside, I am pleased at Xilinx's approach to this newsgroup and the assignment of specific representatives to deal with specific areas of expertise.

It seems that many companies have historically considered the newsgroups either a place to market or nothing but trouble, and have stayed away from posting any "official" comments. I am pleased to see both Xilinx and Altera taking this forum seriously and addressing it at a business level rather than just unofficially.

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX
rickman

SD,

  1. To respond to your concerns, our benchmarking team ran a second set of experiments to compare Stratix to Virtex-II Pro in which the circuits are given I/O constraints in addition to Fmax constraints. The results showed a decrease in the absolute Fmax produced for both families of between 5% and 6%, but a negligible change (less than 0.5%) in the relative comparison. So, our results as presented in the Net Seminar remain valid both with and without I/O constraints.

  2. Design age varies greatly, though in general the larger designs tend to be newer than the smaller designs. Most of the large designs (>40K LEs) are less than 1 year old. Most of the small & mid density designs are 1-3 years old. To the extent that we look for data points that are "out-lying" and fix them (as they are often representative of broader issues), there is some tuning of our software around these designs. I think this likely contributes to the discrepancy in results, though I would speculate that it contributes much less than the methodology differences.
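To see why a 5% to 6% drop in absolute Fmax on both families barely moves the relative comparison, here is a small worked example. The baseline Fmax values and the exact drop percentages are hypothetical, chosen only to fall in the range quoted above, and reading "change in the relative comparison" as a shift in percentage points is an assumption of this sketch.

```python
def relative_advantage(fmax_a, fmax_b):
    """Fractional Fmax advantage of family A over family B."""
    return fmax_a / fmax_b - 1.0

def with_drop(fmax, drop):
    """Fmax after losing the given fraction to I/O constraints."""
    return fmax * (1.0 - drop)

# Hypothetical numbers, not published benchmark data
base_a, base_b = 200.0, 170.0    # MHz without I/O constraints
drop_a, drop_b = 0.055, 0.058    # fractional Fmax loss with I/O constraints

before = relative_advantage(base_a, base_b)
after = relative_advantage(with_drop(base_a, drop_a), with_drop(base_b, drop_b))
shift = after - before

print("advantage without I/O constraints: %.2f%%" % (100 * before))
print("advantage with I/O constraints:    %.2f%%" % (100 * after))
print("shift in percentage points:        %.2f" % (100 * shift))
```

Because the relative advantage depends only on the ratio of the two Fmax values, equal percentage drops cancel exactly, and nearly equal drops leave only a small residual shift.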

Dave Greenfield Altera Product Marketing

Dave Greenfield
