HELP with Asynch RAM

PLEASE HELP:

In my design I have a bank of memory made up of 64 reg [31:0]'s. They get synthesized into latches and kill my timing because of a huge 64 x 32 sensitivity list (see below).

How else can I synthesize a 64X32 block of Memory with Asynch Read, Asynch Write, and where the Data_In shows up immediately on the Data_Out?

Can I do it with Xilinx CoreGen Modules?

// This part handles the read for the 64 32-bit registers which store the data.
// The rd_address is a 1-hot, 64-bit reg.

always @ (rd_address or
          q0 or q1 or q2 or q3 or q4 or q5 or q6 or q7 or q8 or q9 or
          qa or qb or qc or qd or qe or qf or
          q10 or q11 or q12 or q13 or q14 or q15 or q16 or q17 or q18 or q19 or
          q1a or q1b or q1c or q1d or q1e or q1f or
          q20 or q21 or q22 or q23 or q24 or q25 or q26 or q27 or q28 or q29 or
          q2a or q2b or q2c or q2d or q2e or q2f or
          q30 or q31 or q32 or q33 or q34 or q35 or q36 or q37 or q38 or q39 or
          q3a or q3b or q3c or q3d or q3e or q3f) begin
  case (1'b1) // synopsys full_case parallel_case
    rd_address[0]:  DO = q0;    rd_address[1]:  DO = q1;
    rd_address[2]:  DO = q2;    rd_address[3]:  DO = q3;
    rd_address[4]:  DO = q4;    rd_address[5]:  DO = q5;
    rd_address[6]:  DO = q6;    rd_address[7]:  DO = q7;
    rd_address[8]:  DO = q8;    rd_address[9]:  DO = q9;
    rd_address[10]: DO = qa;    rd_address[11]: DO = qb;
    rd_address[12]: DO = qc;    rd_address[13]: DO = qd;
    rd_address[14]: DO = qe;    rd_address[15]: DO = qf;
    rd_address[16]: DO = q10;   rd_address[17]: DO = q11;
    rd_address[18]: DO = q12;   rd_address[19]: DO = q13;
    rd_address[20]: DO = q14;   rd_address[21]: DO = q15;
    rd_address[22]: DO = q16;   rd_address[23]: DO = q17;
    rd_address[24]: DO = q18;   rd_address[25]: DO = q19;
    rd_address[26]: DO = q1a;   rd_address[27]: DO = q1b;
    rd_address[28]: DO = q1c;   rd_address[29]: DO = q1d;
    rd_address[30]: DO = q1e;   rd_address[31]: DO = q1f;
    rd_address[32]: DO = q20;   rd_address[33]: DO = q21;
    rd_address[34]: DO = q22;   rd_address[35]: DO = q23;
    rd_address[36]: DO = q24;   rd_address[37]: DO = q25;
    rd_address[38]: DO = q26;   rd_address[39]: DO = q27;
    rd_address[40]: DO = q28;   rd_address[41]: DO = q29;
    rd_address[42]: DO = q2a;   rd_address[43]: DO = q2b;
    rd_address[44]: DO = q2c;   rd_address[45]: DO = q2d;
    rd_address[46]: DO = q2e;   rd_address[47]: DO = q2f;
    rd_address[48]: DO = q30;   rd_address[49]: DO = q31;
    rd_address[50]: DO = q32;   rd_address[51]: DO = q33;
    rd_address[52]: DO = q34;   rd_address[53]: DO = q35;
    rd_address[54]: DO = q36;   rd_address[55]: DO = q37;
    rd_address[56]: DO = q38;   rd_address[57]: DO = q39;
    rd_address[58]: DO = q3a;   rd_address[59]: DO = q3b;
    rd_address[60]: DO = q3c;   rd_address[61]: DO = q3d;
    rd_address[62]: DO = q3e;   rd_address[63]: DO = q3f;
  endcase
end

Reply to
Frank

Hi

Two of the most overused and abused directives in Verilog models are "//synopsys full_case parallel_case". The popular myth surrounding "full_case parallel_case" is that these directives always make designs smaller, faster, and latch-free. This is false! Indeed, the "full_case parallel_case" switches frequently make designs larger and slower, and they can obscure the fact that latches have been inferred. These switches can also change the functionality of a design, causing a mismatch between pre-synthesis and post-synthesis simulation which, if not discovered during gate-level simulation, will cause an ASIC to be taped out with design problems.

It is generally bad coding practice to give the synthesis tool different information about the functionality of a design than is given to the simulator.

Whenever either "full_case" or "parallel_case" directives are added to the Verilog source code, more information is potentially being given about the design to the synthesis tool than is being given to the simulator

In general, do not use "full_case parallel_case" directives with any Verilog case statements.
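A latch-free alternative is to make the case statement complete yourself instead of telling the synthesis tool that it is. Here is a minimal sketch, with made-up signal names, of the same one-hot mux style without the directives: the default assignment before the case drives the output on every path, so no latch is inferred and the simulator and synthesis tool see identical behavior.

// Hypothetical latch-free one-hot mux, no full_case/parallel_case.
// The default assignment covers every sel pattern the case items
// miss, so pre- and post-synthesis simulation match.
module onehot_mux (
  input  [3:0]  sel,                // one-hot select
  input  [31:0] d0, d1, d2, d3,
  output reg [31:0] DO
);
  always @ (sel or d0 or d1 or d2 or d3) begin
    DO = 32'b0;                     // default drives DO on every path
    case (1'b1)
      sel[0]: DO = d0;
      sel[1]: DO = d1;
      sel[2]: DO = d2;
      sel[3]: DO = d3;
    endcase
  end
endmodule

Without parallel_case the case items resolve in priority order if sel is ever not one-hot, and that priority behavior is the same in simulation and in the synthesized gates.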

ARH

Reply to
ARH


Additionally, Xilinx Block Memory does not support asynchronous read and write modes. If you insist on asynchronous modes, you are automatically hurting the timing of your design. If meeting timing is your objective, ideally you would map to Block Memory, and to do that you must change to synchronous timing. Asynchronous timing, especially when used carelessly, will always end up in bad timing scores.
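For reference, here is a minimal sketch of a 64 x 32 memory with a registered read, the style that synthesis tools can map onto Block RAM (the module and port names are my own; check your tool's coding templates for the exact form it recognizes):

// Hypothetical 64 x 32 RAM, synchronous write and registered read,
// suitable for Block RAM inference.
module ram_64x32_sync (
  input             clk,
  input             we,
  input      [5:0]  addr,
  input      [31:0] di,
  output reg [31:0] dout
);
  reg [31:0] mem [0:63];

  always @ (posedge clk) begin
    if (we)
      mem[addr] <= di;
    dout <= mem[addr];   // read is registered: one cycle of latency
  end
endmodule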

If you do want an asynchronous read, then the best you can do is code it to use distributed memory, which gives you an asynchronous read port (the write port is still synchronous). If you want the tools to infer the memory, refer to the coding templates for the synthesis tool you are using and you will get the correct inference; a sketch follows below. As you have coded it above, you get a latch-based implementation.
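As an illustration, a minimal sketch of the usual distributed-RAM inference style (names are my own, and the write-through mux on the output is one way to get the "Data_In shows up immediately on Data_Out" behavior Frank asked for; it is an addition of mine, not part of the standard template):

// Hypothetical 64 x 32 RAM, synchronous write and asynchronous read,
// the style synthesis tools map onto distributed (LUT-based) RAM.
module ram_64x32_dist (
  input         clk,
  input         we,
  input  [5:0]  addr,
  input  [31:0] di,
  output [31:0] dout
);
  reg [31:0] mem [0:63];

  always @ (posedge clk)
    if (we)
      mem[addr] <= di;

  // Asynchronous read; the write-through mux forwards di to dout
  // during a write so the new data appears on the output at once.
  assign dout = we ? di : mem[addr];
endmodule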

Thanks,
Duth

Reply to
Duth
