Fastest ISE Compile PC?

Question

Has anyone recently done any benchmarking of Windows PCs for Xilinx ISE compiles?

Is ISE multithreaded? Can it use multiple processors (or cores)? Do big CPU caches help?

Regards,
Marc
P4-3GHz HT, 2GB DDR2-533 RAM

Derek Simmons · Accepted Answer

I'm not sure whether Quartus II or ISE is multithreaded, but I wasn't impressed with the first-generation dual-core systems. My experience comes from a dual-core system my company provided me with for development work.

When my system wasn't living up to my expectations I did a little research. The MS Windows performance meter and some third-party tools showed little CPU activity most of the time. When there was activity, it was at about 85% or better, occasionally pegged. I did some other poking around on my system and discovered that they had cheaped out on the graphics card and hard drives.
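One rough way to repeat this kind of check from a script is to compare the CPU time a tool run consumes with its wall-clock time. The sketch below is only an illustration: the helper name `cpu_vs_wall` is made up for this example, and it relies on `os.times()`, whose child-process fields are reported as zero on Windows, so it is only meaningful as written on Unix-like systems.

```python
import os
import subprocess
import sys
import time


def cpu_vs_wall(cmd):
    """Run cmd to completion and return (wall_seconds, child_cpu_seconds).

    Hypothetical helper for illustration. The CPU figure is the user plus
    system time charged to child processes, which os.times() only reports
    on Unix-like systems (the fields are zero on Windows).
    """
    before = os.times()
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    wall = time.perf_counter() - start
    after = os.times()
    cpu = (after.children_user - before.children_user) + (
        after.children_system - before.children_system
    )
    return wall, cpu


if __name__ == "__main__":
    # Stand-in workload: a short CPU-bound Python one-liner.
    wall, cpu = cpu_vs_wall([sys.executable, "-c", "sum(range(10**7))"])
    print(f"wall {wall:.2f}s, child CPU {cpu:.2f}s")
```

If the CPU time is close to the wall-clock time, the run is CPU bound on one core. If it is well below, the process spent much of its time waiting, for example on disk.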

My advice, which is part experience and part gut feeling: compare the prices of the Extreme and dual-core chips, and if the difference is negligible, pick the one with the faster front-side bus (FSB). The second item I'd look into is a caching SATA controller that supports mirroring, plus some really fast hard drives. Avoid striping the drives, as there is a performance hit, but try mirroring. From my observations, the development tools I use are mostly memory and hard-drive bound. When you compile and place-and-route (PAR) your design a fast CPU is beneficial, but the tools are also working with a lot of supporting files and storing and retrieving information from memory.

In the past, most users have reported the biggest benefits from more and faster memory.

Derek

Tommy Thorn · Answer

Marc, why didn't you try to google for the answer? This question is asked every other week. I only reply because I have some new numbers.

marc_ely wrote:

> Has anyone recently done any benchmarking of Windows PCs for Xilinx ISE compiles?

No, but I have for Quartus, which is very similar.

> Is ISE multithreaded?

No, and not for a while to come.

> Can it use multiple processors (or cores)?

Nope.

> Do big CPU caches help?

Oh yeah, but once you have that, core frequency is all that matters.

I recently went from an Athlon 64 2.0 GHz/1 MiB L2$ to an E6600 Core 2 Duo 2.4 GHz/4 MiB L2$. For my benchmark, the time for Synth/P&R went from 12m34/33m40 to ~6m/~15m, thus more than double the P&R performance. When overclocked to 3.3 GHz the result scaled to 5m54/11m12, thus 3X the P&R performance. Other experiments confirm that it scales linearly with frequency (assuming memory scales equally).

I have expensive memory, but from my experiments the benchmark results showed very little sensitivity to memory bandwidth and latency.

The 4 MiB Core 2 Duo is a very fast chip for FPGA work, probably the fastest x86 available, but it's still not fast enough to reduce the compilation times to an...
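The speedups quoted above can be checked with a few lines of arithmetic. This is just a sketch over the timings given in the post; the helper names `to_seconds` and `speedup` are made up for illustration, and the stock E6600 figures are the approximate (~) values from the post.

```python
def to_seconds(stamp):
    """Convert a time written like '12m34' or '6m' into seconds."""
    minutes, _, seconds = stamp.partition("m")
    return int(minutes) * 60 + int(seconds or 0)


def speedup(old, new):
    """How many times faster the new run is than the old one."""
    return to_seconds(old) / to_seconds(new)


# (synthesis, place and route) timings as quoted in the post.
athlon = ("12m34", "33m40")   # Athlon 64, 2.0 GHz
e6600 = ("6m", "15m")         # E6600 at 2.4 GHz, approximate values
e6600_oc = ("5m54", "11m12")  # E6600 overclocked to 3.3 GHz

if __name__ == "__main__":
    for label, run in (("E6600 2.4 GHz", e6600), ("E6600 3.3 GHz", e6600_oc)):
        synth = speedup(athlon[0], run[0])
        par = speedup(athlon[1], run[1])
        print(f"{label}: synth {synth:.2f}x, P&R {par:.2f}x vs Athlon")

    # Compare the overclocking gain with the clock ratio itself.
    print(f"clock ratio 3.3/2.4: {3.3 / 2.4:.3f}")
    print(f"P&R gain from overclock: {speedup(e6600[1], e6600_oc[1]):.3f}")
    print(f"synth gain from overclock: {speedup(e6600[0], e6600_oc[0]):.3f}")
```

On these figures P&R improves by roughly 2.2x at stock speed and 3.0x overclocked. The overclock alone gives about 1.34x on P&R against a clock ratio of 1.375, while the synthesis time barely moves.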

marc_ely · Answer

Hi Tommy

Thanks for the info. Yes, I found your posts about 2 mins after I sent one out (after searching and finding nothing current). That's the problem with info on the web... it's often out of date, and finding the right stuff can be like looking for a needle in a haystack.

I think I will go for a Core 2 Duo with 4MB.

Marc

mk · Answer

Could you give us some info on what the disk subsystems look like for each machine (IDE or SATA, what speed, any RAID, etc.)?

What do you think explains the lack of change in synthesis time when the C2D went from 2.4GHz to 3.3GHz?

Tommy Thorn · Answer

I could, but it would be misleading, as it's completely irrelevant to the posted numbers. The benchmark is operating almost exclusively out of the buffer cache, and even then it's not reading that much data.

That said, for everything else disk latency matters a lot, so I used a single 150 GB SATA Raptor (10,000 RPM) in the new box. The old box had a quiet, average-speed Samsung PATA drive (7,200 RPM).

My measurements were too informal. There is a change, just not as substantial. I'd need to study this closer to understand what's going on.

Tommy

Michael Schöberl · Answer

My system has arrived and I did a quick benchmark:

my lab-system: P4, 2.6 GHz, 2GBytes RAM
another system: P4, 3 GHz, 2GBytes RAM
my new machine: Core 2 Duo, E6700, 2GBytes RAM with Asus P5LD2 Deluxe

a full run with ISE 6.3 (from synthesize to bitgen) with a recent design takes:

my lab-system: 30 minutes
another system: 28 minutes
my new machine: 14 minutes

I would say it is worth the money and I guess we'll buy some more of those machines ...

bye, Michael

marc_ely · Answer

I took the plunge and built up a 2nd PC using a Core2Duo.

Here are the specs:
Old PC: P4 3GHz HT, 2GB DDR2-533 RAM, Gigabyte GA81915 mobo, stock cooler
New PC: Core2Duo E6600, 2GB DDR2-800 RAM, ASUS P5B Mobo, ArcticFreezer7 cooler

Using a Spartan3 design running clean from scratch in ISE 8.2.3i:
Old PC: 82mins
New PC: 35mins
New PC (overclocked to 3.2GHz): 25mins

I'm really pleased with the Core2Duo and would recommend it.

Marc

JJ · Answer

While the CoreDuo looks like the thing right now, on the disk side I'd be interested to know if the new IDE Flash drives that go up to 32GB are any use as a replacement for high RPM drives.

The only reviews I have seen (Toms IIRC) show that they obviously have much lower latency but not yet much throughput, around 30MBytes/sec, though at least the ms delays should now be us delays. At this stage I wouldn't be concerned about wearout, as I expect these things to get replaced sooner or later. Prices seem to be falling on Flash much faster than on DRAM now, and the throughput is bound to get closer to PATA max rates.

just a thought
John Jakson

pbdelete · Answer

So the conclusion is that dual cores (multiprocessing) benefit Xilinx ISE substantially?

Thomas Entner · Answer

No, cache size matters... As far as I know, neither ISE nor Quartus uses the second core, but both benefit from the huge cache.

Thomas

JJ · Answer

Not just regular L2 cache but the TLB or address cache matters even more, I suspect, but it is harder to characterize and explain. When the data set is still beyond even the bigger combined cache of a Dual, the increase in associative ways of the bigger TLB kicks in to reduce the incidence of the OS having to refill MMU page tables, which can blow ns cache hits into several 100ns accesses for a full cache miss.

I ran a test on an older 2GHz Athlon XP2400 and a 2.6GHz D805 for a loop that just randomly accesses ints from a 512MB array, using a mask to control the variability of address from 256 ints to the 128M max, and for each case run the loop 1M times.

I believe this represents the worst possible behaviour of any CAD application that must traverse huge graphs or trees that can not fit cache but easily fit DRAM.

The D805 generally runs 30% faster, as the clock suggests, while the tests are entirely cache bound, but the Athlon has 256K of L2 with 256 ways in the TLB. The 805 has 1MB of L2 in each core and I expect...
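The kind of masked random-access loop described above might look roughly like the sketch below. The original test code was not posted, so everything here is an assumption: the function names, the linear congruential generator used to pick indices, and the small default array size. Python's interpreter overhead also hides much of the cache and TLB effect, so a compiled version would be needed for serious measurements.

```python
import time
from array import array


def random_access_time(table, mask, iterations=1_000_000):
    """Time pseudo-random reads confined to table[0..mask].

    Returns (seconds_per_access, checksum). Indices come from a 32-bit
    linear congruential generator (an assumption, not the original code).
    """
    index = 1
    total = 0
    start = time.perf_counter()
    for _ in range(iterations):
        index = (index * 1664525 + 1013904223) & 0xFFFFFFFF
        # Drop the weak low LCG bits, then limit the address range.
        total += table[(index >> 4) & mask]
    elapsed = time.perf_counter() - start
    return elapsed / iterations, total


def sweep(n_ints=1 << 20, iterations=100_000):
    """Measure access time for spans of 256 ints up to n_ints.

    n_ints must be a power of two so that span - 1 is a valid bit mask.
    The post used 128M ints (512MB); the default here is far smaller.
    """
    table = array("i", [1]) * n_ints
    results = []
    span = 256
    while span <= n_ints:
        per_access, _ = random_access_time(table, span - 1, iterations)
        results.append((span, per_access))
        span *= 4
    return results


if __name__ == "__main__":
    for span, per_access in sweep():
        print(f"{span:>10} ints: {per_access * 1e9:8.1f} ns per access")
```

Small spans stay inside the caches, while large spans force accesses out to DRAM, which is where differences in cache size and TLB reach would be expected to show up.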
