Need to speed up Stratix compiles.

We're currently running a 3 GHz Pentium with 2 GB memory under Windows 2000.

We hope to speed things up by 15-20% by going to AMD X86-64 and/or Linux.

Has anybody tried this? Any feedback?

Pete Fraser

No, but I'd like to know the results on the AMD 64.

I sped up my 3.2GHz P4 compiles by 20% by making sure my memory was running at dual-channel 400MHz. It turned out it had been running at dual-channel 333MHz.

I had to actually downgrade my memory slightly, because my motherboard saw CAS 2 DDR and dropped to 333. With CAS 2.5 it was confident enough to go to 400.

YMMV, Ken

Kenneth Land

I went from a 2.5GHz Pentium to a 3GHz Xeon and got a very consistent 33% speed increase in Stratix compiles and SOPC Builder generation. I suspect the increased cache size is the most critical thing, since clock rate increased by only 20%, but I'm only speculating. Both machines ran XP and RAM was 1GB in both machines.
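A rough sketch of that reasoning in Python. It assumes the 33% figure means 1.33x throughput and that compile speed would otherwise scale linearly with clock rate; both are assumptions for illustration, not measurements from the post.

```python
# Figures quoted in the post.
old_clock_ghz = 2.5
new_clock_ghz = 3.0

# Assumption: "33% speed increase" is read as 1.33x throughput.
observed_speedup = 1.33

# Gain expected from clock rate alone, if speed scaled linearly with it.
clock_ratio = new_clock_ghz / old_clock_ghz

# Whatever is left over must come from something else (cache, memory, ...).
residual_factor = observed_speedup / clock_ratio

print(f"clock alone: {clock_ratio:.2f}x, left over: {residual_factor:.2f}x")
```

Under those assumptions, roughly an extra 1.1x is unexplained by clock rate, which is consistent with the cache guess but does not prove it.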

I'm curious to hear how your compiles improve with AMD/Linux.

-- Pete

Peter Sommerfeld

I can't tell you what to expect from an AMD64 with today's software, but I expect this will be the platform of choice for the next couple of years. It may not be the best investment at the moment, but I expect by the end of the year, much of the software will be optimized for 64 bit operation and you will see over half the new engineering workstations running an AMD64 processor.

IIRC, AMD is producing a low-cost version of the AMD64. I expect sales will take off very quickly. Once these start showing up on software developers' desks, we will see them optimizing for it.

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX
rickman

The server versions of Win2000 can handle up to 32GB of RAM, if that's the main limiting factor. Can get pricey, though :o((

Unless there's a version of the compiler specifically for a 64-bit architecture, you're unlikely to see any real speed gain to justify the cost, and even if there is, I doubt the gains would be all that impressive.

64-bit CPUs really only come into their own in applications that need to access very large virtual address spaces (>4GB). Mostly, that's server-type apps. The need for 64-bit arithmetic is likely very small in this case.

Can the compiler multi-thread? If so, a mobo with a couple of HT Xeons (4 logical CPUs) will give you all the extra horsepower you'll need. If not, a dual-processor system would still perform a lot better, since one CPU can work flat-out on the compile while the other handles the OS and other background tasks.

All this assumes that the compiler's performance is, in fact, CPU or memory bound as you imply. Are you actually certain that this is indeed the case? Might a faster disk system help?
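One way to check that question is to compare wall-clock time with the CPU time the compile actually consumed. The sketch below is only an illustration: `measure` is a made-up helper name, the command in the example is a placeholder rather than a real compile invocation, and the child CPU counters from `os.times()` are only filled in on Unix-like systems.

```python
import os
import subprocess
import sys
import time


def measure(cmd):
    """Run cmd to completion and return (wall_seconds, child_cpu_seconds)."""
    cpu_before = os.times()
    wall_before = time.perf_counter()
    subprocess.run(cmd, check=True)
    wall = time.perf_counter() - wall_before
    cpu_after = os.times()
    # children_user/children_system cover finished child processes.
    # They are only populated on Unix-like systems; on Windows they stay 0.
    cpu = ((cpu_after.children_user - cpu_before.children_user)
           + (cpu_after.children_system - cpu_before.children_system))
    return wall, cpu


if __name__ == "__main__":
    # Placeholder workload: a child Python process that mostly sleeps,
    # standing in for the real compile command (not shown here).
    wall, cpu = measure([sys.executable, "-c", "import time; time.sleep(0.5)"])
    print(f"wall {wall:.2f} s, child CPU {cpu:.2f} s, ratio {cpu / wall:.2f}")
```

A ratio close to 1 on a single-threaded tool suggests the job is CPU or memory bound; a much lower ratio suggests it spends its time waiting, for example on disk.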

--
  Max
Max

Quartus II 3.0 does not run on X86-64.

The fix should be trivial since it's just the driver script which does not recognize the architecture. If it tried to run some X86 code rather than checking uname it would work. I dunno about 4.0 though. Anybody tried?
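To illustrate the kind of check being described, here is a minimal Python sketch. It is not the actual Quartus driver script; the function name and the list of machine strings are assumptions made up for the example.

```python
import platform

# Hypothetical list of `uname -m` style strings for machines that can
# execute 32-bit x86 binaries. x86_64/amd64 are included because those
# processors also run x86 code.
X86_COMPATIBLE = {"i386", "i486", "i586", "i686", "x86_64", "amd64"}


def can_run_x86_binaries(machine=None):
    """Return True if the reported machine type can run 32-bit x86 code."""
    machine = (machine or platform.machine()).lower()
    return machine in X86_COMPATIBLE


if __name__ == "__main__":
    print(platform.machine(), can_run_x86_binaries())
```

A script that only accepts the i?86 strings would reject an otherwise capable x86_64 machine, which matches the failure described above.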

Petter

--
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Petter Gustad

Seems to be a common misconception that 64 bits just increases the amount of addressable memory. More important for most applications is that twice the data is moved or operated on per clock cycle.

Ken

Kenneth Land

On the disk speed issue I have one data point. I upgraded my 1GHz PIII-M laptop drive from a slow 4200 RPM to the fastest 7200 RPM available (for laptops) and my Nios system build went from about 16 min. to about 15 min. Not worth the pain and expense of swapping the drive.

On memory, I upgraded the memory in my 3.2 GHz P4 from 512MB to 1GB and there was no noticeable difference until I set the memory from 333MHz to 400MHz dual channel. Then my system build went from 5 min. to 4 min., a 20% reduction.
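The two data points can be put on the same footing with a little arithmetic. The helper names below are made up for illustration, and the build times are the approximate figures quoted in the post.

```python
def time_saved_pct(before_min, after_min):
    """Percentage reduction in build time."""
    return 100.0 * (before_min - after_min) / before_min


def speedup(before_min, after_min):
    """Throughput ratio: how many times faster the build runs."""
    return before_min / after_min


# Laptop disk upgrade: about 16 min -> about 15 min.
print(time_saved_pct(16, 15), speedup(16, 15))

# Memory set to dual-channel 400MHz: 5 min -> 4 min.
print(time_saved_pct(5, 4), speedup(5, 4))
```

Note that a 20% reduction in build time is a 1.25x speedup, so the two ways of quoting a gain are not interchangeable.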

Ken

Kenneth Land

Hi Max,

Provided the peak memory consumption of Quartus for the compilation in question is less than the amount of physical memory in the system, increasing the amount of memory will not help compile time. For non-trivial designs, a Quartus compile will be most heavily influenced by CPU speed, and then by memory sub-system speed -- disk speed will have little influence.

CAD tools process a lot of data. I don't know if a Xeon (bigger cache) is much faster than a normal P4 (smaller cache), but I wouldn't be surprised if this were the case for the same reason that a Xeon processor is supposedly better for server applications -- bigger cache helps applications whose data set doesn't fit into the cache.

Regards,

Paul Leventis Altera Corp.

Paul Leventis (at home)


64-bitness _is_ mostly about addressable memory -- it is rare that 64-bit integers help reduce run-time. Please see my previous postings on the topic and some of the replies to them.

Regards,

Paul Leventis Altera Corp.

Paul Leventis (at home)


I think the OP was referring to the wider datapaths. I don't know the cycle-level details of the AMD or Intel 64-bit parts, but an obvious and simple speed gain can come from a wider HW fetch (even when running < 64-bit opcodes), followed by a simple check of whether the next opcode or data value is in that block.

This works in systems where the CPU must wait for slower downstream memories, and even the smaller single-chip microcontrollers are starting to do this, e.g. the Philips ARM uC has a 128-bit fetch. Clearly, random code or data will not be helped, but a large percentage of code will be sequential. I'm not sure the AMD/Intel offerings match the SIMD (single instruction, multiple data) capabilities of other cores, but even without that, some HW gains would be expected.

-jg

Jim Granville

Hi Jim,

Yes, wider memory interfaces/cache data lines can help, but as you say, this is independent of op-code size. If I recall correctly, AMD and Intel processors already fetch 64-bit blocks, but this may have been increased. The latest m/b chipsets for both families of processors use dual-channel DDR (128-bits wide) and so I would not be surprised if they've increased the size of fetches.
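For a sense of scale, the theoretical peak bandwidth of a 128-bit dual-channel DDR interface can be worked out as below. This is a back-of-envelope sketch: it treats the 333 and 400 figures quoted earlier in the thread as effective transfer rates in millions of transfers per second, and the function name is made up for illustration. Sustained bandwidth is lower in practice.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, bus_width_bits):
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


# Dual-channel DDR is 2 x 64 bits = 128 bits wide.
ddr333 = peak_bandwidth_gb_s(333, 128)
ddr400 = peak_bandwidth_gb_s(400, 128)

print(f"DDR333 dual channel: {ddr333:.2f} GB/s")
print(f"DDR400 dual channel: {ddr400:.2f} GB/s")
print(f"ratio: {ddr400 / ddr333:.2f}x")
```

The ratio of about 1.2x in peak memory bandwidth is in the same range as the build-time gain reported for the 333 to 400 change, though that alone does not show the compile is bandwidth limited.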

As vendors introduce 64-bit capable processors (such as Opteron), they often also enhance various aspects of the CPU architecture in ways that help both 32- and 64-bit code. And while the 64-bitness of x86-64 may not matter much for speed, the doubling of the register files etc. could result in faster performance.

It's every computer engineer's dream to be a processor architect, isn't it? :-)

Regards,

- Paul

Paul Leventis (at home)

The only common misconception is that swapping for a 64-bit processor in a desktop PC will lead to a large performance increase. It doesn't. (Other than any gain from a higher clock speed, of course.)

Like to make a guess as to the extra overhead in a 64-bit version of current OSs, btw?

Data is only data if it's meaningful. The use of 64-bit arithmetic variables is comparatively rare in most applications. Certain scientific and CAD packages do make heavy use of 64-bit floats, but I doubt that's the case here (and high-end processors tend to use 80-bit data paths around the FPU anyway). There's not a lot to be gained from accessing memory in 64-bit chunks if you're only interested in 32 of them (there is an effect on cache hits with vectors, but it's not measurably worthwhile in practice).

There will be some effect on prefetch, but it depends on the state of the L1 and L2 caches and the instruction pipeline(s) themselves. Tests I've seen suggest an increase of memory bandwidth efficiency of only around 1-2% at best.

If you want a 64-bitter to really earn its corn, use it in something like a database server with 64GB of RAM and a multi-TB disk farm. Give the poor thing something *meaningful* to do with the extra 32 bits. You'd still need 64-bit software though.

--
  Max
Max

Not in a low-spec machine like that, no. The options in a laptop are limited, and there's no way to increase the disk controller bandwidth. But the effect on a powerful workstation of installing a RAID with a high-bandwidth controller and drives such as U-320 SCSI can have a dramatic impact. As always though, it depends on the application.

That doesn't mean a lot. You only need to add more memory if you're running out of it ;o)

--
  Max
Max

Or running synthesis, place & route, static timing analysis etc. on an ASIC design requiring 6GB RAM.

Petter

Petter Gustad

We can all speculate about the relative merits of processor enhancements, but these machines are very complex and the only real way to tell what helps is to try it. Since we are not all ancient Greeks philosophizing in our armchairs, it would be a good idea to pick a design and to run it on a few different workstations, hopefully including an AMD64.
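A comparison like that could be scripted along the following lines. This is only a sketch: `benchmark` is a hypothetical helper, and the command in the example is a trivial placeholder to be replaced with the real compile command for the chosen design.

```python
import platform
import statistics
import subprocess
import sys
import time


def benchmark(cmd, runs=3):
    """Time `runs` executions of cmd; return a record for comparing machines."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        timings.append(time.perf_counter() - start)
    return {
        "machine": platform.machine(),   # e.g. the CPU architecture string
        "system": platform.system(),     # e.g. the OS name
        "runs": runs,
        "median_seconds": statistics.median(timings),
        "min_seconds": min(timings),
    }


if __name__ == "__main__":
    # Placeholder command; substitute the actual compile invocation.
    print(benchmark([sys.executable, "-c", "pass"], runs=3))
```

Running the same design and the same script on each workstation gives records that can be compared directly, with the median smoothing out run-to-run noise.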

I have always been surprised that the FPGA vendors don't put some effort into evaluating platforms and releasing the results. I know this can be a bit of a can of worms, but every time I look at buying a new machine, the first question I research is how fast it will run the FPGA design software. Then I am often trying to speculate on my own since I don't have much info to go on.

I seem to recall that there at least used to be some available info on how much memory was needed to optimize run time as a function of part size. But I haven't seen new info on that in quite a while.

--

Rick "rickman" Collins

rickman

I had assumed that had happened already. Silly me.

Perhaps we'll just buy an AMD machine and see what it does, but I thought somebody might have tried that already.

Anybody know how solid the Quartus II 4.0 Linux port is? I can't get an answer out of Altera.

Pete Fraser

I suspected that might be the case, but I wasn't quite sure. I'm more used to programming language tools that use library files extensively, where a fast disk system (or a big ramdisk) can give very worthwhile speed gains.

Is there any possibility of making Quartus multi-threaded? That strikes me as the most likely way to get a dramatic performance increase, though I know it's not always easy to achieve with heuristic apps.

While the extra cache is important in itself, much of the performance gain of the Xeon is also due to the greater degree of parallelism and deeper prefetch lookahead, thus making better use of memory bandwidth throughout.

--
  Max
Max

I would like to see synthesis and place and route tools I could run on a cluster of cheap PCs. I would be happy with less than linear speedups, e.g. using a 16-node cluster to get an 8x speedup.
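To get a feel for what an 8x speedup on 16 nodes would demand, Amdahl's law gives a quick estimate. It is brought in here purely as an illustrative model, not something the post itself relies on, and it ignores inter-node communication cost, so real clusters would do worse. The function names are made up for the example.

```python
def amdahl_speedup(parallel_fraction, nodes):
    """Speedup predicted by Amdahl's law for a given parallel fraction."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / nodes)


def required_parallel_fraction(target_speedup, nodes):
    """Solve 1 / ((1 - p) + p / n) = s for p."""
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / nodes)


p = required_parallel_fraction(8, 16)
print(f"parallel fraction needed: {p:.4f}")
print(f"check: {amdahl_speedup(p, 16):.2f}x on 16 nodes")
```

Under this simplified model, about 93% of the run time would have to parallelize perfectly across the nodes to reach 8x.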

Petter

Petter Gustad

I doubt you'd get anywhere near. Trying to implement those algorithms efficiently on the sort of loosely-coupled architecture you propose would be nigh-on impossible. It's not easy on a single SMP box, but it's doable.

A quad Xeon (8 x CPU) box would cost less than four single decent-spec machines anyway.

--
  Max
Max
