How to design this datapath unit for DSP using VHDL/Verilog?

- walala
  
posted

Sat, Aug 30, 2003 5:07 AM

Dear all,

I want to design an arithmetic datapath unit for digital signal processing using VHDL and/or Verilog.

The input is 5 elements (arriving either sequentially or in parallel), each 8 bits wide. Each of these 5 inputs has to be multiplied by a predefined constant matrix (10x10, floating-point values scaled and rounded to integers). The output is a 10x10 matrix that is the sum of these five matrices, with each element 12 bits wide. So for each element of the matrix I could have one MAC unit. The internal computation will be 16 bits.

Hence, for each group of 5 inputs x1, x2, x3, x4, x5, the output matrix is

Y = x1*C1 + x2*C2 + x3*C3 + x4*C4 + x5*C5, where Y, C1, C2, C3, C4, C5 are 10x10 matrices.
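To have something to check an HDL implementation against, the sum above can be written as a small software reference model. This is only a sketch in Python, not HDL, and it rests on assumptions the description does not settle: the inputs are treated as unsigned 8-bit values, the coefficients are made-up placeholders, and the 12-bit output is assumed to be the 16-bit sum with its 4 least significant bits dropped.

```python
# Reference model for Y = x1*C1 + ... + x5*C5 (software model, not HDL).
# Assumptions (not stated in the post): unsigned 8-bit inputs, small signed
# integer coefficients, 12-bit output = 16-bit sum shifted right by 4.

N = 10   # matrices are N x N
K = 5    # number of inputs per output matrix


def make_placeholder_coeffs():
    """Hypothetical constants C[k][r][c] in [-12, 12]; real values come from the design."""
    return [[[((k + 1) * (r - c)) % 25 - 12 for c in range(N)]
             for r in range(N)] for k in range(K)]


def reference_y(x, C):
    """Full-precision Y[r][c] = sum over k of x[k] * C[k][r][c]."""
    assert len(x) == K and all(0 <= v <= 255 for v in x)
    return [[sum(x[k] * C[k][r][c] for k in range(K)) for c in range(N)]
            for r in range(N)]


def fits_signed(value, bits):
    """True if value is representable as a two's-complement number of that width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def to_12_bits(acc16):
    """Assumed output scaling: arithmetic shift right by 4 (16 -> 12 bits)."""
    return acc16 >> 4


C = make_placeholder_coeffs()
Y = reference_y([1, 2, 3, 4, 5], C)          # one group of 5 inputs
Y12 = [[to_12_bits(v) for v in row] for row in Y]
```

With these placeholder coefficients the worst-case sum is 255 * 12 * 5 = 15300, which fits in 16 signed bits. With real coefficients the same `fits_signed` check would show whether a 16-bit accumulator is wide enough.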

If I put a MAC at each element, I get a purely parallel architecture, but that needs 100 16-bit MAC units, which would consume too many resources.

I am considering a parallel-serial architecture that outputs one row at a time, i.e. 10 x 12 bits per step, so the output comes out row by row.
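One way to read this row-by-row idea is 10 MAC units, one per column, reused for every row. The sketch below models that schedule in Python (a software model, not HDL). The coefficient values and the choice of one input element per clock cycle are assumptions made for illustration only.

```python
# Row-serial schedule: 10 MAC units (one per column) reused for every row.
# Software model only; the coefficient values are hypothetical placeholders.

N, K = 10, 5    # 10x10 output, 5 inputs per output matrix

# Placeholder constants C[k][r][c]; a real design would read these from ROM.
C = [[[(k + r + 2 * c) % 7 - 3 for c in range(N)] for r in range(N)]
     for k in range(K)]


def row_serial(x):
    """Yield (cycle, row_index, row_values) as each row completes."""
    cycle = 0
    for r in range(N):                 # one row at a time
        acc = [0] * N                  # the 10 accumulators, cleared per row
        for k in range(K):             # assumed: one input element per clock cycle
            for c in range(N):         # these 10 MACs run in parallel in hardware
                acc[c] += x[k] * C[k][r][c]
            cycle += 1
        yield cycle, r, acc


x = [17, 200, 3, 96, 255]              # example 8-bit inputs
rows = list(row_serial(x))             # 10 rows, in output order
```

Under this schedule one output matrix takes N * K = 50 clock cycles with 10 MACs instead of 100, so a new group of 5 inputs can only be accepted every 50 cycles. Whether that is fast enough depends on the required input rate.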

I also need to streamline the datapath operation. Since the 5-element inputs arrive as a non-stop stream, the output will also stream non-stop. So once a row has been output, its storage can be reused to compute and hold the results for the next 5 input elements.

I am OK with the thinking so far, but going further leaves me confused: how do I do the sequential timing control (how do I decide what to do in which cycle)? Do I need pipelining? How should I design the architecture?
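To make the "what happens in which cycle" question concrete, one common pattern is a single free-running counter whose value is decoded into control signals. The sketch below models that in Python (not HDL) for a 10-MAC, row-at-a-time schedule with one input element per cycle. The signal names `load_x`, `acc_clear`, and `row_valid` are hypothetical, and the schedule itself is an assumption rather than a settled design.

```python
# Counter-based sequencing for a row-serial MAC array (software model, not HDL).
# Signal names (load_x, acc_clear, row_valid) are hypothetical.

N, K = 10, 5                  # rows per matrix, inputs per matrix
PERIOD = N * K                # 50 cycles per output matrix


def control(cycle):
    """Decode a free-running cycle count into per-cycle control signals."""
    phase = cycle % PERIOD
    k_sel = phase % K         # which input / coefficient matrix feeds the MACs
    row_sel = phase // K      # which coefficient row is addressed
    return {
        "k_sel": k_sel,
        "row_sel": row_sel,
        "load_x": phase == 0,          # capture the next group of 5 inputs
        "acc_clear": k_sel == 0,       # new row: load the product instead of adding
        "row_valid": k_sel == K - 1,   # row sum is complete after this cycle
    }


# Two full output matrices back to back, i.e. non-stop streaming.
schedule = [control(t) for t in range(2 * PERIOD)]
```

In HDL this would typically become one counter register plus a few comparators. If a pipeline register is added after the multipliers, `acc_clear` and `row_valid` would each need to be delayed by the same number of cycles.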

Finally, how do I code this? Are there any examples of this?

Please help me!

Thanks a lot,

-Walala

