CORDIC in a land of built-in multipliers

Since I learned the algorithm in college, I've always wanted to implement it on an ASIC or FPGA... And never had a place where it was the right fit. As some of the other posters have noted, gates and multipliers are becoming almost "free" in many applications. Block RAMs are also sometimes available to just store trig. tables.

I think I remember coding a primitive CORDIC algorithm in VHDL some time in the 90s. From what I recall, it wasn't used anywhere; I just wanted to see how to do it.

--Mark

Reply to
Mark Curry

Are the ASICs you're using based on LUTs, or are they a sea-of-gates where you define the interconnect?

--
www.wescottdesign.com
Reply to
Tim Wescott

Interesting question.

In the early era (80's through early 90's), if a LUT was needed in an ASIC, it was a layout-based design consisting of a mask-programmed ROM.

Certain other logic functions (notably, ALU's and parallel multipliers) would also be layout-based. More random logic would be based on routing together standard cells.

(At the time, a "gate array" or "sea of gates" approach suggested the cells were very small, on the order of a gate or two as opposed to cells for D-flops, adder sections, etc.; and typically that only a few metal layers were customized.)

In the modern era, lookup tables are usually synthesized from gate logic along with the rest of the random logic, and physically, the only large blocks in the digital part of the design are RAM's. Of course the converters and other analog blocks are separate blocks.

The exception is large CPU chips where the ALU's, memory switches, and so forth are still layout-based designs.

If you have a photomicrograph of a chip, you can usually see what they are doing and how they've partitioned it.

Steve

Reply to
Steve Pope


Ray Andraka used to be a frequent comp.arch.fpga poster. On his website he lists several FPGA projects that use CORDIC; however, I notice that these all seem to have older dates, e.g. from the Xilinx Virtex-E era.

formatting link

Allan

Reply to
Allan Herriman

He was very old school, having developed libraries of hierarchical schematics for all manner of complex functions, with full specification of the placement of each logic function. So he was not happy with the idea of changing over to HDL, which Xilinx told him would eventually be the only supported entry method. Once he gave it a serious try and discovered he could do the exact same thing with structural code, he was happily convinced HDL was the way to go. I expect he has a full HDL library of CORDIC functions.

--

Rick C
Reply to
rickman

Alright. By the way, Steve, do you follow polar codes? Looks like they have comparable performance to LDPC now:

formatting link

Gene

Reply to
Evgeny Filatov

> CORDIC is used for items like sine, cosine, tangent, square root, etc.

Used it for pixel-rate polar to rectangular (needed both angle & radius). Vendor's CORDIC IP worked fine. It used a ~12 stage pipeline.
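(A minimal software sketch of the CORDIC mode that yields both angle and radius, i.e. vectoring mode. This is a toy floating-point model for clarity; a real pipeline like the ~12-stage IP above would use fixed-point shifts and fold the constant gain into the datapath. The iteration count and gain handling here are illustrative assumptions, not the vendor IP's implementation.)

```python
import math

def cordic_vectoring(x, y, iterations=16):
    """Rotate (x, y) toward the x-axis in shift-add steps, accumulating
    the rotation angle along the way. Returns (magnitude, angle)."""
    angle = 0.0
    for i in range(iterations):
        d = -1.0 if y >= 0 else 1.0      # rotate so y is driven toward 0
        x, y, angle = (x - d * y * 2.0**-i,
                       y + d * x * 2.0**-i,
                       angle - d * math.atan(2.0**-i))
    # Each pseudo-rotation stretches the vector; divide out the total gain
    gain = math.prod(math.sqrt(1 + 2.0**(-2 * i)) for i in range(iterations))
    return x / gain, angle
```

With 16 iterations the result is good to roughly 2^-16 of a radian, which is in line with first-sine/cosine-quadrant inputs; extra quadrant-folding logic handles the rest in hardware.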

Reply to
jim.brakefield

Same here. CORDIC is something that is cool, but does not seem useful any longer. It takes too many cycles and is only useful if you don't care about throughput, but in those applications, you are probably using a processor, not an FPGA. You keep the latency but increase throughput by pipelining, but then it's bigger than an alternative solution. If you can, you use a blockRAM lookup table for trig functions. Otherwise you'd probably use embedded multipliers and a Taylor series (with Horner's Rule). Or a hybrid, such as a Farrow function, which can be as simple as interpolating between lookup elements. The Farrow idea works really well in sine lookups since the derivative of sine is a shifted sine, so you can use the same lookup table. That's what I've used.
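(The same-table trick above can be sketched like this, as a toy Python model. The table size and the phase-in-turns convention are arbitrary assumptions; the point is that the derivative term is read from the *same* sine table, a quarter cycle ahead, since cos is a shifted sine.)

```python
import math

TABLE_BITS = 8
N = 1 << TABLE_BITS
# One full cycle of sine; in hardware this would be a blockRAM
SINE = [math.sin(2 * math.pi * i / N) for i in range(N)]

def sin_interp(phase):
    """phase in [0, 1) turns. First-order correction between table
    entries, using the derivative read from the same table shifted
    by a quarter cycle (cos = shifted sin)."""
    pos = phase * N
    i = int(pos) % N
    frac = pos - int(pos)
    deriv = SINE[(i + N // 4) % N] * 2 * math.pi / N  # d/di of sin(2*pi*i/N)
    return SINE[i] + frac * deriv
```

With a 256-entry table the linear correction gets you a few more bits of accuracy than the raw lookup alone, at the cost of one multiply.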

For arctan(y/x), which is needed in phase recovery, I've used 2-dimensional lookup tables, where the input is a concatenated {x,y}. You don't need a lot of precision. There are also good approximations for things like sqrt(x^2+y^2).
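(A toy model of the concatenated-address idea. The 5-bit unsigned widths here are purely illustrative; a real design would quantize signed x and y, size the blockRAM to match, and rely on quadrant symmetry to keep the table small.)

```python
import math

XBITS = YBITS = 5  # 5-bit samples -> 2^(5+5) = 1024-entry ROM

# Precompute atan2 for every {x, y} address pair (the ROM contents)
ATAN_ROM = [math.atan2(y, x) if (x or y) else 0.0
            for x in range(1 << XBITS)
            for y in range(1 << YBITS)]

def atan2_lut(x, y):
    """x, y are unsigned XBITS/YBITS-wide samples; the concatenated
    {x, y} word addresses the ROM directly, no arithmetic at runtime."""
    addr = (x << YBITS) | y
    return ATAN_ROM[addr]
```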

A lot of ideas persist for a long time in textbooks and practice when they are no longer useful. I do error correction now and all the textbooks still show, for example, how to build a Reed-Solomon encoder that does 1 symbol per cycle. If I wanted to do something that slowly, I'd probably do it in software. Sure a lot easier. FPGAs are only useful if you are doing things really fast.

Reply to
Kevin Neilson

True, but there are probably simpler ways. You can use a 2D lookup in a blockRAM. And an approximation that can be reasonably precise is

mag = A*max(I,Q) + B*min(I,Q)

where A,B are constants. (You can find suggestions for these on the web.)
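(As a sketch, with one commonly cited coefficient pair for this "alpha max plus beta min" trick. These particular constants are one published choice that roughly minimizes peak error at about 4%; other A,B trade-offs exist, as noted above.)

```python
# One commonly cited coefficient pair (assumption: several others
# are in circulation, tuned for peak vs. mean error)
A, B = 0.960434, 0.397825

def mag_approx(i, q):
    """Approximate sqrt(i^2 + q^2) with two multiplies and a compare."""
    i, q = abs(i), abs(q)
    return A * max(i, q) + B * min(i, q)
```

In fixed-point hardware A and B are often rounded to values that reduce to shifts and adds, trading a little accuracy for zero multipliers.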

Reply to
Kevin Neilson
