I have a core that generates 400+ outputs in parallel at once. Each output can take only 3 possible values: A, B, or C (so each can be encoded in 2 bits). I want to write these 400+ outputs to off-chip RAM as quickly as possible, without slowing down the core or stalling its execution. Can anyone suggest a smart way to implement this?
The core processes a 16-bit input word.
I am using a Virtex-II Pro with a single bank of off-chip SRAM.
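To make the data volume concrete, here is a rough software model of the 2-bit packing described above. It is only a sketch under assumptions that are not part of my actual setup: exactly 400 outputs per result set, a 32-bit SRAM data bus, and the code assignment A=0, B=1, C=2. The function names are hypothetical.

```python
# Assumptions (illustrative only): 400 outputs per result set,
# a 32-bit SRAM data bus, and the encoding A=00, B=01, C=10.
CODES = {"A": 0b00, "B": 0b01, "C": 0b10}
BUS_WIDTH = 32  # assumed data bus width in bits


def pack_outputs(outputs, bus_width=BUS_WIDTH):
    """Pack 2-bit symbols into bus-width words, first symbol in the low bits."""
    per_word = bus_width // 2
    words = []
    for start in range(0, len(outputs), per_word):
        word = 0
        for i, sym in enumerate(outputs[start:start + per_word]):
            word |= CODES[sym] << (2 * i)
        words.append(word)
    return words


def unpack_outputs(words, count, bus_width=BUS_WIDTH):
    """Recover the first `count` symbols from the packed words."""
    names = {code: sym for sym, code in CODES.items()}
    per_word = bus_width // 2
    out = []
    for word in words:
        for i in range(per_word):
            if len(out) == count:
                return out
            out.append(names[(word >> (2 * i)) & 0b11])
    return out


if __name__ == "__main__":
    outputs = ["A", "B", "C"] * 133 + ["A"]  # 400 example outputs
    words = pack_outputs(outputs)
    print(len(outputs) * 2, "bits ->", len(words), "bus writes")
```

Under these assumptions, one result set is 800 bits, i.e. 25 writes on a 32-bit bus, so the SRAM would have to absorb that many writes for every result set the core produces.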
Many thanks :)