fastest FPGA

From my own experience:

A 2-D example using fixed length SRLs that comes to my mind is a 90 degree pixel rotation.

If you have a 16x16 array of vectors that arrive in the order

A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 Aa Ab Ac Ad Ae Af B0 B1 B2 B3 B4 ... C0 C1 ... . . . P0 P1 P2 P3 ...

And want to send them back out rotated 90 degrees so the order is

A0 B0 C0 D0 E0 F0 G0 H0 I0 J0 K0 L0 M0 N0 O0 P0 A1 B1 C1 D1 E1 ... A2 B2 ... . . . Af Bf Cf Df ...

You can do this fully pipelined for 16x16 blocks, without intermediate load/unload cycles, using 30 shift registers and a 16-bit barrel shifter. The

Reply to
John_H
Just a nitpick but wouldn't this be a transpose? You'd need to invert in X or Y to get a 90 degree rotation.

-Dave

--
David Ashley                http://www.xdr.com/dash
Embedded linux, device drivers, system architecture
Reply to
David Ashley

If Sally comes into the room followed by Barbara then Sheila and finally Carol but exiting the room is four pairs of shoes followed by four nicely folded outfits followed by a basket of lingerie and finally four unclad women racing after their departed belongings, is it just a transposition? Things got very rearranged in the process.

In the example above the A values enter first followed by the B values and so on. When they exit the rotator scheme, they exit as the zero label values followed by the 1 label values and so on. The transpose is a 90 degree rotation of 16x16 blocks within a 256 element grid. To get this to run continuously with simple registers would require 384 registers. When the resource usage can be nearly quartered, isn't it something to consider?

The issue at hand was data reordering. The rotation is a simple reorder but in a way that isn't easy to parallelize at high speeds without throwing a huge number of resources at the problem when the information is available in a serial fashion.

Reply to
John_H

Transpose - it's a term from linear algebra, at least that's what I'm thinking of. A[i][j] becomes A[j][i] for all valid i and j.

I don't know if we're discussing the same thing. The way your data goes from input to output is a transpose, not a rotation. I'm just complaining about the terminology. BTW I'm not making this up :).
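To make the terminology point concrete, here is a small Python sketch comparing a transpose with the two 90 degree rotations on a 2x2 grid. The function names are made up for illustration and are not from the thread. It suggests that a rotation is a transpose followed by a flip in one axis.

```python
def transpose(m):
    """A[i][j] becomes A[j][i]."""
    return [list(col) for col in zip(*m)]

def rotate_cw(m):
    """Rotate 90 degrees clockwise: flip the rows top-to-bottom, then transpose."""
    return [list(col) for col in zip(*m[::-1])]

def rotate_ccw(m):
    """Rotate 90 degrees counter-clockwise: transpose, then flip top-to-bottom."""
    return transpose(m)[::-1]

grid = [[1, 2],
        [3, 4]]

print(transpose(grid))    # [[1, 3], [2, 4]]
print(rotate_cw(grid))    # [[3, 1], [4, 2]]
print(rotate_ccw(grid))   # [[2, 4], [1, 3]]

# A clockwise rotation is the transpose with every row reversed (a flip in X).
flipped = [row[::-1] for row in transpose(grid)]
print(flipped == rotate_cw(grid))   # True
```

Either way the storage problem is the same: the flip only changes which end of a row or column is read first.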

I don't really follow how the circuit works. I mean, before you can output P0 you would have had to read in every single row from A to O, that's a lot of data you need to store. Perhaps on the order of a 384 element shift register?

-Dave

Reply to
David Ashley

The rotate/transpose uses an input SRL "triangle": the SRL delays increase from the earliest bit to leave in each word (0-length SRL or direct connect) to the latest bit to leave (15-length SRL). The barrel shifter transposes the input SRL outputs to an output SRL triangle. The earliest bit to leave from the first word goes directly from the input to the longest output delay (15-length SRL) so it will match up with the shortest output delay (0-length SRL or direct connect) that takes the last word's earliest bit directly; the first bit from the 16th word shows up at the same time as the first bit of the first word. The latency from the start of the 16x16 square to the start of the output is 15 clocks plus any pipeline stages (such as in the barrel shifter). When one block ejects from the mechanism, the next block loads.
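One way to check the scheme described above is a small behavioural model in Python. This is only a sketch under my own assumptions: `DelayLine` and `corner_turn` are hypothetical names, each SRL is modelled as a plain FIFO, one 16-element word enters per clock, and the barrel shifter is modelled as an index calculation (in hardware this might need a fixed lane reversal in addition to the rotate).

```python
from collections import deque

N = 16  # a block is N words of N elements (16x16 in the thread)

class DelayLine:
    """Fixed-length shift register; length 0 acts as a direct connection."""
    def __init__(self, length):
        self.regs = deque([None] * length)

    def step(self, value):
        if not self.regs:
            return value
        self.regs.append(value)
        return self.regs.popleft()

def corner_turn(words):
    """Stream words (one length-N list per clock) through both triangles.

    Returns one output word per clock; entries are None until valid.
    """
    in_tri = [DelayLine(i) for i in range(N)]           # lane i delayed by i
    out_tri = [DelayLine(N - 1 - j) for j in range(N)]  # lane j delayed by 15-j
    outputs = []
    for t, word in enumerate(words):
        skewed = [in_tri[i].step(word[i]) for i in range(N)]
        # Routing stage: the element on input lane i belongs to word (t - i),
        # so output lane j takes its data from input lane (t - j) mod N.
        routed = [skewed[(t - j) % N] for j in range(N)]
        outputs.append([out_tri[j].step(routed[j]) for j in range(N)])
    return outputs

# Two back-to-back blocks; each element is tagged (block, word, bit).
blocks = [[[(b, w, i) for i in range(N)] for w in range(N)] for b in range(2)]
stream = [word for blk in blocks for word in blk] + [[None] * N] * (N - 1)
out = corner_turn(stream)

print(out[15][:3])   # first valid output word starts (0, 0, 0), (0, 1, 0), (0, 2, 0)
```

In this model there are 15 input and 15 output delay lines of non-zero length, which lines up with the 30 shift registers mentioned earlier in the thread, and the first transposed word appears 15 clocks after the first input word. The second block streams in directly behind the first with no gap.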

A "simple" transpose or rotate that maintains the pipeline would require a large number of parallel-in, serial-out shift registers, which must be implemented as discrete registers: 384 registers for the selective load/global shift approach.

The SRL-based mechanism takes fewer than 100 LUTs to accomplish the same goal with the same speed capability.

SRLs are a win for a transpose or rotate where the function size is almost 1/4 of a more traditional approach.

- John_H

Reply to
John_H

I've been seeing this "SRL" term used a lot, what does it mean?

:) I just did a google search for "srl xilinx" and got some useful info, and so I created a Wikipedia page on it since one didn't seem to exist.

formatting link

Anyone reading this feel free to expand on it.

-Dave

Reply to
David Ashley

How do you get the wikipedia corrected? I clicked the link on the SRL page to the FPGA page and then on through to the partial re-configuration page at...

formatting link

This page says "In current versions of software, Xilinx supports partial reconfiguration on Spartan 3...". I am pretty certain that this is not supported in Spartan 3. I have requested that this be supported in Spartan 3 since they came out and I still have not seen it appear.

Am I wrong, or is the wikipedia wrong?

Reply to
rickman

Yes, but since the input is serial and we only take one output at a time, the SRL16s let us collapse the shift register into LUT resources, giving a 16:1 savings. Since the data is input in row raster form, it can naturally be done by shifting each row into a series of SRL16s. Then the readout is down columns, so you read one sample out of each row's shift register, advancing the shift register after each read.
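A rough Python model of this row-buffer approach, for a single 16x16 block. The names are made up for illustration, each SRL16 is modelled as a simple FIFO rather than an addressable shift register, and the sketch loads the whole block before reading, so it does not model overlapping the load of one block with the readout of the previous one.

```python
from collections import deque

N = 16
ROWS = "ABCDEFGHIJKLMNOP"

def row_buffer_transpose(samples):
    """Take N*N samples in row-raster order and return them in column order."""
    srl = [deque() for _ in range(N)]     # one 16-deep shift register per row
    for n, s in enumerate(samples):
        srl[n // N].append(s)             # shift each row into its own register
    out = []
    for _ in range(N):                    # one pass per output column
        for r in range(N):
            out.append(srl[r].popleft())  # read the oldest sample, then advance
    return out

# Label the samples as in the first post: A0 A1 ... Af B0 B1 ... Pf
block = [f"{r}{c:x}" for r in ROWS for c in range(N)]
result = row_buffer_transpose(block)

print(" ".join(result[:N]))   # A0 B0 C0 D0 E0 F0 G0 H0 I0 J0 K0 L0 M0 N0 O0 P0
```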

Reply to
Ray Andraka

You can correct the wikipedia article yourself, click on the "Edit this page" tab on the top, make your edits, then save. You might want to create an account if you don't have one already.

There is no guarantee wikipedia is correct. Witness the fact that I only learned what an SRL was last night, and here I am writing an "encyclopedia" article on the subject :).

The thing was, I went to wikipedia first to find SRL, and nothing came up. So I went to google, found a page, then went back and updated wikipedia, in case other people turn there first. That's what wikipedia's all about.

-Dave

Reply to
David Ashley
