Andres, let me explain:
There is no app note, but it is really quite simple.
Any Virtex BlockRAM can be loaded with data contained in the configuration bitstream. If you then never write again, you have a BlockROM.
Obviously, a 4K x 4 ROM can recognize any pattern on its 12 address lines and describe it as a 4-bit output. This is all well-known.
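To make the ROM contents concrete, here is a rough sketch of how the 4096 words could be generated for a 12-input priority encoder. The priority direction (highest-numbered input wins) and the use of code 15 for "no input active" are assumptions made for illustration, not something fixed by the hardware, and the names are hypothetical.

```python
# Sketch: contents of a 4K x 4 ROM acting as a 12-input priority encoder.
# Assumptions (for illustration only): the highest-numbered input wins, and
# the otherwise unused code 15 means "no input active".

NONE_ACTIVE = 15  # hypothetical "nothing set" code


def priority_code(addr: int) -> int:
    """Return the 4-bit word stored at a 12-bit address."""
    if addr == 0:
        return NONE_ACTIVE
    return addr.bit_length() - 1  # index of the highest set bit, 0..11


# One 4-bit word per address: 4096 words in total.
ROM = [priority_code(addr) for addr in range(1 << 12)]
```

A table like this would then be turned into the initialization data that goes into the configuration bitstream; how that is done depends on the tool flow.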
The trick that makes this solution so efficient is the use of the other port of the same BlockROM. The two ports are completely independent. Neither one is inherently a read port or a write port; that split is just the typical arrangement when implementing FIFOs. You can use both ports to write, or - as in this case - you use both ports to read. Since you use the same priority encoding for both sets of 12 inputs, you can use the common ROM storage, and thus handle 24 inputs, giving you two sets of 4 outputs.
It's the dual-ported nature of the ROM, and using the same encoding for both sets of 12 address inputs, that makes this so efficient.
The rest of the logic, combining the two sets of 4-bit outputs and handling the additional 8 inputs (for a 32-bit encoder), is conventional.
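As a behavioral sketch only, the whole 32-bit encoder might be modeled like this. The split of the inputs (bits 0..11 on one port, 12..23 on the other, 24..31 in ordinary logic), the priority direction, and the "none active" code 15 are all assumptions chosen for the example, and `encode32` is a hypothetical name.

```python
# Behavioral model of a 32-input priority encoder built around one
# dual-ported 4K x 4 ROM. Assumptions (illustrative only): highest-numbered
# input wins, and ROM code 15 means "no input active" in that group of 12.

NONE_ACTIVE = 15

# Shared ROM contents: index of the highest set address bit, or NONE_ACTIVE.
ROM = [a.bit_length() - 1 if a else NONE_ACTIVE for a in range(1 << 12)]


def encode32(x: int):
    """Return the index of the highest set bit of a 32-bit value, or None."""
    port_a = ROM[x & 0xFFF]          # inputs 0..11, read through one port
    port_b = ROM[(x >> 12) & 0xFFF]  # inputs 12..23, read through the other
    top8 = (x >> 24) & 0xFF          # inputs 24..31, conventional logic

    # Conventional combining logic: the highest group with a hit wins.
    if top8:
        return 24 + top8.bit_length() - 1
    if port_b != NONE_ACTIVE:
        return 12 + port_b
    if port_a != NONE_ACTIVE:
        return port_a
    return None
```

Both lookups index the same table, which is the point: one set of stored contents serves two groups of 12 inputs.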
I like using BlockRAMs for unconventional applications, especially since that relieves the interconnect structure, and - if you have more BlockRAMs than you need - it's actually free (not just efficient) :-)
Peter Alfke, Xilinx Applications