# Transforming vector position to binary value

Hi @all,

I have a 32-bit vector (31 downto 0).

After some calculations the vector could look like this:

a) 0000 0000 0000 0000 0000 0000 1000 0000

b) 0000 0000 1000 0000 0000 0100 0000 0000

How can I transform the position(s) of the one(s) in the vector into the corresponding binary value?

For example, in a) the '1' is at position 7 of the vector, so the binary value would be b111.

Is there a simple and fast way to do this transformation?

Thanks a lot.

Best regards

How do you create a "binary value" representing the word that has more than one bit set?

My favourite onehot-to-binary recoder is:

    variable onehot: std_logic_vector(LOTS downto 0);
    variable binary: std_logic_vector(FEWER downto 0);
    ...
    binary := (others => '0');
    for i in onehot'range loop
      binary := binary or std_logic_vector(to_unsigned(i, binary'length));
    end loop;

Verilog version available at extra cost :-)

This optimises well (to a collection of wide OR gates) in every synthesis tool I've tried. But it will fail horribly if more than one bit of the input "onehot" word is set. What do you expect to happen, in case (b)?

Jonathan Bromley, Consultant

If I understand you correctly, you are asking for a priority encoder. It also resolves cases with two or more 1s, by priority.

"Jonathan Bromley" wrote

Whoops. That should have been...

    binary := (others => '0');
    for i in onehot'range loop
      if onehot(i) = '1' then
        binary := binary or std_logic_vector(to_unsigned(i, binary'length));
      end if;
    end loop;

It's usually a good idea to make the output somewhat dependent on the input ;-) Sorry.
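For anyone who wants to drop the corrected loop straight into a simulator, here is a minimal self-contained sketch. The package name `onehot_pkg` and function name `onehot_to_binary` are made up for illustration, and the widths are fixed at 32 in and 5 out to match the original question.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

package onehot_pkg is
  -- Hypothetical helper: returns the index of the single set bit.
  function onehot_to_binary(onehot : std_logic_vector(31 downto 0))
    return std_logic_vector;
end package onehot_pkg;

package body onehot_pkg is
  function onehot_to_binary(onehot : std_logic_vector(31 downto 0))
    return std_logic_vector is
    variable binary : std_logic_vector(4 downto 0) := (others => '0');
  begin
    for i in onehot'range loop
      if onehot(i) = '1' then
        -- OR in the 5-bit index of every set bit
        binary := binary or std_logic_vector(to_unsigned(i, binary'length));
      end if;
    end loop;
    return binary;
  end function onehot_to_binary;
end package body onehot_pkg;
```

For vector a), only bit 7 is set, so the result is "00111". For vector b), bits 23 and 10 are set, so the function ORs "10111" with "01010" and returns "11111", which is not the index of any set bit. That is the failure mode mentioned above for inputs that are not truly one-hot.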

Jonathan Bromley, Consultant

Dear Mr Bromley,

here is the background:

The 32-bit vector can have up to 4 locations set to '1', for example

0000 1000 0000 0000 0101 0000 0000 0100

The ones indicate that a data packet has been found at the corresponding write-address positions of a CAM. In a second step I have to perform a second check in a separate RAM structure. For this purpose I want to derive four addresses from the 32-bit HIT vector. In the example shown, that means I have to create the following addresses (for the second check step):

1. b00010
2. b01100
3. b01110
4. b11011

These addresses will then be assigned sequentially to the RAM wraddress port. So is there a way to get these four addresses out of the 32-bit vector within one clock cycle?

Thanks a lot.

Best Regards Andres V.

That is a VERY different problem from the one you first posted!

I'm sure it is possible, but it is certainly difficult and I cannot see a clean solution right now. However...

If the four RAM writes are to be sequential, why not obtain the four addresses sequentially? That would be much simpler. Use a priority encoder (as Valentin suggested) to locate the first 1 bit. Then, while you are writing to the chosen RAM location, clear that first bit to 0 and use the same priority encoder again to find the second 1 bit. Keep going like this until you have cleared the whole 32-bit input word to zero.

A 32-to-5-bit priority encoder is not trivial, but it should be OK unless your clock speed is very fast.
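A rough sketch of that sequential scheme follows, not a tuned implementation. The entity name `hit_scanner` and all port names are invented for illustration, and it assumes the lowest-numbered set bit has priority (the thread does not say which end should win). It uses `p and (p - 1)` to clear the lowest set bit.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity hit_scanner is
  port (
    clk   : in  std_logic;
    load  : in  std_logic;                      -- capture a new HIT vector
    hits  : in  std_logic_vector(31 downto 0);
    addr  : out std_logic_vector(4 downto 0);   -- index of the lowest pending 1
    valid : out std_logic                       -- '1' while addr is meaningful
  );
end entity hit_scanner;

architecture rtl of hit_scanner is
  signal pending : std_logic_vector(31 downto 0) := (others => '0');
begin

  -- Priority encoder: the lowest-numbered set bit wins (assumed priority).
  encode : process (pending)
    variable idx : unsigned(4 downto 0);
    variable hit : std_logic;
  begin
    idx := (others => '0');
    hit := '0';
    for i in pending'high downto pending'low loop
      if pending(i) = '1' then
        idx := to_unsigned(i, idx'length);  -- a lower i overrides a higher one
        hit := '1';
      end if;
    end loop;
    addr  <= std_logic_vector(idx);
    valid <= hit;
  end process encode;

  -- On each clock, clear the bit that was just reported.
  scan : process (clk)
    variable p : unsigned(31 downto 0);
  begin
    if rising_edge(clk) then
      if load = '1' then
        pending <= hits;
      else
        p := unsigned(pending);
        -- p and (p - 1) clears the lowest set bit; an all-zero vector stays zero
        pending <= std_logic_vector(p and (p - 1));
      end if;
    end if;
  end process scan;

end architecture rtl;
```

With the example vector 0000 1000 0000 0000 0101 0000 0000 0100 loaded, `addr` steps through 2, 12, 14 and 27 on successive clocks, and then `valid` drops to '0'.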

Jonathan Bromley, Consultant

I missed the original postings but the sequential detection sounds like a great approach from what I see above.

The 32-to-5-bit priority encoder is something I worked with just recently, targeting the Virtex-II architecture with Synplify Verilog. The speeds I was able to get were greater than 300 MHz. If you're working with these tools or just want to see the approach, feel free to reply-email directly. I could probably (suggest ways to) push the speed even further with some tricks I have left.

- John_H

As I posted a while ago, the Virtex-II BlockRAM is surprisingly efficient as a priority encoder. Use one BlockRAM as a dual-ported 4K x 4 ROM. Feed 12 inputs as the address to one port, and the next 12 bits as the address to the other port. That gives you two sets of 4 outputs defining the priority-encoded position in the two 12-bit inputs separately. No problem with multiple 1s! It should work at 200+ MHz, but note that the BlockROM is a synchronous device: it operates on a clock edge! The remaining 8 inputs and the combining of the eight BlockROM outputs can be done the conventional way, but might also be done in another BlockROM (adding a second clock of latency!).

Peter Alfke, Xilinx Applications

Dear Mr Alfke,

thank you for your proposal. Do you have an application note that explains how the ROMs are connected? How many ROMs are used in total? Do I understand correctly that four ROMs are used? By "one port" do you mean the write address port, and by "the other port" the read address port?

Best regards A. Vazquez G&D System Development

Andres, let me explain:

There is no app note, but it is really quite simple.

Any Virtex BlockRAM can be loaded with data contained in the configuration bitstream. If you then never write again, you have a BlockROM.

Obviously, a 4K x 4 ROM can detect anything on its 12 address lines, and describe it as a 4-bit output. This is all well-known.

The trick that makes this solution so efficient is the use of the other port of the same BlockROM. The two ports are completely independent. There is no read port or write port. That's just a typical use when implementing FIFOs. You can use both ports to write, or - as in this case - you use both ports to read. Since you use the same priority encoding for both sets of 12 inputs, you can use the common ROM storage, and thus handle 24 inputs, giving you two sets of 4 outputs.

It's the dual-ported nature of the ROM, and using the same encoding for both sets of 12 address inputs, that makes this so efficient.

The rest of the logic, combining the two sets of 4-bit outputs, and handling the additional 8 inputs ( for a 32-bit encoder), will be conventional.
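As a sketch of the idea, the ROM can be written as an inferred table with two registered reads. The entity name `prio_rom_2x12`, the port names, the "1111" code for "no bit set" and the choice that the lowest set bit wins are all assumptions made for illustration. Whether a given synthesis tool maps this onto a single dual-ported BlockRAM is also an assumption to check in the tool's report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity prio_rom_2x12 is
  port (
    clk    : in  std_logic;
    in_a   : in  std_logic_vector(11 downto 0);  -- first 12 inputs
    in_b   : in  std_logic_vector(11 downto 0);  -- next 12 inputs
    code_a : out std_logic_vector(3 downto 0);
    code_b : out std_logic_vector(3 downto 0)
  );
end entity prio_rom_2x12;

architecture rtl of prio_rom_2x12 is
  type table_t is array (0 to 4095) of std_logic_vector(3 downto 0);

  -- Build the 4K x 4 table at elaboration time:
  -- each entry holds the index of the lowest set bit of its own address.
  function init_table return table_t is
    variable tbl  : table_t;
    variable bits : unsigned(11 downto 0);
  begin
    for a in 0 to 4095 loop
      bits   := to_unsigned(a, 12);
      tbl(a) := "1111";                 -- assumed code for "no bit set"
      for i in 11 downto 0 loop
        if bits(i) = '1' then
          tbl(a) := std_logic_vector(to_unsigned(i, 4));
        end if;
      end loop;
    end loop;
    return tbl;
  end function init_table;

  constant PRIO_TABLE : table_t := init_table;
begin
  -- Two independent registered reads of the same table, one per port.
  process (clk)
  begin
    if rising_edge(clk) then
      code_a <= PRIO_TABLE(to_integer(unsigned(in_a)));
      code_b <= PRIO_TABLE(to_integer(unsigned(in_b)));
    end if;
  end process;
end architecture rtl;
```

Because both ports use the same encoding, one table serves both 12-bit groups. The two 4-bit codes, plus the remaining 8 inputs, still have to be combined by ordinary logic to form the final 5-bit result.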

I like using BlockRAMs for unconventional applications, especially since that relieves the interconnect structure, and - if you have more BlockRAMs than you need - it's actually free (not just efficient) :-)

Peter Alfke, Xilinx Applications

What's the clock frequency for this design? At what rate are the 32 bit words generated?

Can you take 32 clocks to produce the desired results? If not, and if the frequency of operation is low enough, you could multiply the clock by 32 and get your results within a clock or two.

If none of the above works, you could probably pipeline a solution. During each stage a prioritized decoder would pick off the first "1" and give you a result. At the same time, that "1" would be masked off so that it won't appear in the next stage. Repeat four times and you have your solution. It will probably take four to eight clocks depending on details (don't have my thinking cap fully on).
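One possible shape for such a pipeline is sketched below, purely as an illustration. The entity name `hit_pipeline`, the packed 20-bit `addrs` output and the lowest-bit-first priority are assumptions, not something specified in this thread. Each stage records the index of the lowest remaining '1' and passes the vector on with that bit masked off. The four results come out together four clocks after the input, and a new vector can enter every clock.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity hit_pipeline is
  port (
    clk   : in  std_logic;
    hits  : in  std_logic_vector(31 downto 0);
    addrs : out std_logic_vector(19 downto 0);  -- slot k in bits 5k+4 downto 5k
    valid : out std_logic_vector(3 downto 0)    -- valid(k) = '1' if slot k was filled
  );
end entity hit_pipeline;

architecture rtl of hit_pipeline is
  type pend_pipe_t  is array (1 to 4) of unsigned(31 downto 0);
  type addr_pipe_t  is array (1 to 4) of std_logic_vector(19 downto 0);
  type valid_pipe_t is array (1 to 4) of std_logic_vector(3 downto 0);
  signal pend_q  : pend_pipe_t  := (others => (others => '0'));
  signal addr_q  : addr_pipe_t  := (others => (others => '0'));
  signal valid_q : valid_pipe_t := (others => (others => '0'));
begin
  process (clk)
    variable src : unsigned(31 downto 0);
    variable a   : std_logic_vector(19 downto 0);
    variable v   : std_logic_vector(3 downto 0);
  begin
    if rising_edge(clk) then
      for s in 1 to 4 loop
        -- Stage 1 reads the input; later stages read the previous stage's registers.
        if s = 1 then
          src := unsigned(hits);
          a   := (others => '0');
          v   := (others => '0');
        else
          src := pend_q(s - 1);
          a   := addr_q(s - 1);
          v   := valid_q(s - 1);
        end if;
        -- Pick off the lowest remaining 1 (descending scan, last match wins).
        for i in 31 downto 0 loop
          if src(i) = '1' then
            a(5*s - 1 downto 5*s - 5) := std_logic_vector(to_unsigned(i, 5));
            v(s - 1) := '1';
          end if;
        end loop;
        addr_q(s)  <= a;
        valid_q(s) <= v;
        pend_q(s)  <= src and (src - 1);  -- mask that 1 off for the next stage
      end loop;
    end if;
  end process;

  addrs <= addr_q(4);
  valid <= valid_q(4);
end architecture rtl;
```

Each stage contains a full 32-bit priority encoder, so this trades area for throughput compared with reusing one encoder sequentially.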

Martin Euredjian
...

Is there an app note or techxclusives article listing unconventional uses of BlockRAM's and multipliers? That could be useful to trigger some creative thinking.

Martin Euredjian
Don't be silly; if it's in a Xilinx appnote, it's _ipso facto_ conventional :-)

Jonathan Bromley, Consultant

The 234 character URL to the TechXclusives article is digested down to this:

The title of the article is "Using Leftover Multipliers and Block RAM in Your Design"

- John_H

Those hidden gems in the Xilinx website. It's kind of nice not being able to find anything with ease sometimes 'cause every so often you run across something interesting by sheer luck and it's like discovering a whole new world. :-)

Martin Euredjian
Martin,

When I first joined Xilinx, I spent hours going through the external web site. I was amazed at just how much "stuff" there was out there. Often I read a question here, and point people immediately to the online answer. I do have to remember that it is not as easy for others to find these gems (hidden or not) as it is for me.

My favorite pitch to a prospective new customer is to ask them to go to our website, and place a query for some kind of information (e.g. "Signal Integrity"), and then do the same for the competition. Not only do we have many more "hits" on just about any topic you can think of, the quality of the information is vastly superior. The only problem we seem to have is how to make it easier to get at all of it (which, as Peter points out, we are trying constantly to improve).

Aust> > BTW, the search engine in the upper left corner of the xilinx.com

Having occasionally done web work myself, I wholly appreciate how complex the problem can be. The quality of information on the Xilinx site is definitely up there.

...now, if you could only switch from eight menus per page (or whatever it is) to a more "linear" environment, that would be great.

There's a little gem out there in the Web design world called "Typo3"

