Simulation vs Synthesis

So I have a partly-complete design for a 6502 CPU, it's simulating just fine for the implemented opcodes, but when I run synthesis, I get a whole load of "Sequential element (\newSPData_reg[23] ) is unused and will be removed from module execute.", one for each bit in the register, in fact.

I know the logic is *trying* to use this register, I can see the values in the register changing during simulation runs, but I can't for the life of me see why it would be removed - the 'execute' module is basically a case statement, with one of the cases explicitly setting the value of the 'newSPData' register.

Again, in the simulation, I see the case being executed, and the values changing. I guess what I'm looking for is any tips on how to tackle the problem ("The Knowledge", if you will), I've already tried the 'trace through the logic for the case that should trigger the case in question, and see if anything jumps out at me'. I remain un-jumped-out-at [sigh].

I'm happy to send the design if anyone wants to have a look, but it's a chunk of Verilog code, so didn't want to paste it here...

Cheers Simon.

Simon

Usually logic is removed because the result is not used anywhere. You can design and simulate a design only to see the synthesizer remove the entire thing if it has no outputs that make it to an I/O pin.

So where are the outputs of your register used? Do they actually connect?
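A minimal sketch of the effect, with made-up module and signal names (nothing here is taken from Simon's design): the register `scratch` below changes on every clock in a simulator, but its value never reaches a port, so a synthesizer is free to delete it.

```verilog
// Hypothetical example: all names are illustrative only.
module unused_reg_demo (
    input  wire       clk,
    input  wire [7:0] din,
    output reg  [7:0] dout
);
    reg [7:0] scratch;              // written every clock, never read

    always @(posedge clk) begin
        scratch <= din + 8'd1;      // visible in a simulation waveform
        dout    <= din;             // the only logic that reaches a port
    end
endmodule
```

Simulation shows `scratch` toggling because the simulator evaluates every assignment; synthesis only keeps what can influence an output.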

--

Rick
rickman

Hi Simon,

Before you spend a lot of time analysing the synthesis messages I would suggest you run a gate level simulation. If that fails then start looking at the gate level and warning/note messages. Synthesis tools are quite clever these days and it is not unusual for the synthesis tools to add/remove or merge registers.

Good luck,

Hans

HT-Lab

Doesn't the 6502 have a 16-bit or smaller stack pointer? If so, maybe you've declared it bigger than it should be, and the synthesizer is trimming it to what's used?
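To illustrate the trimming case, here is a small sketch under assumed names (the 16-bit width and the `push` strobe are hypothetical, not from the original design). Only the low byte of `sp` is ever read, so the upper eight flip-flops would typically be reported as unused and removed, one message per bit.

```verilog
// Hypothetical example: a stack pointer declared wider than it is used.
module wide_sp_demo (
    input  wire       clk,
    input  wire       rst,
    input  wire       push,
    output wire [7:0] sp_low
);
    reg [15:0] sp;                  // declared 16 bits, only 8 are ever read

    always @(posedge clk) begin
        if (rst)       sp <= 16'h00FF;
        else if (push) sp <= sp - 16'd1;
    end

    assign sp_low = sp[7:0];        // sp[15:8] never reaches a port
endmodule
```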

--
www.wescottdesign.com
Tim Wescott

That's why we have github.com

Probably back before you were born, somebody created an optimizing compiler that gave an incredible benchmark time, until they realized that the benchmark did a huge amount of work to create a result but then never bothered to use it anywhere.

All the code was optimised away.

John Eaton

jt_eaton

Simon - just ignore the message and move on. Really. Synthesis optimizations are quite advanced these days - both combinatorial and across registers.

Some sort of optimization that may not be obvious to you may have combined your register bit with another, leaving this one "unused". It's ok. Trust the tool, and just move on.
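For what it's worth, the simplest form of that kind of merge looks something like the sketch below (names are invented for illustration). Registers `a` and `b` always hold the same value, so a tool may keep one flip-flop, drive both outputs from it, and report the other as removed.

```verilog
// Hypothetical example: two registers that are logically equivalent.
module merge_demo (
    input  wire clk,
    input  wire d,
    output wire q1,
    output wire q2
);
    reg a, b;

    always @(posedge clk) begin
        a <= d;
        b <= d;                     // same D input as 'a'
    end

    assign q1 = a;
    assign q2 = b;
endmodule
```

The behaviour at the outputs is unchanged, which is why this kind of message can safely be ignored.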

Regards,

Mark

Mark Curry


Actually, this may be it. I had tried to counter this by exporting the data bus (both input and output) in the top-level test-bench module, but thinking about it, the registers it's removing are from code that exercises the BRK instruction, which only affects the stack-pointer and program-counter, both of which are internal to the CPU in the design as it stands, and the BRK instruction is currently the only thing to manipulate the stack pointer (I'm going alphabetically through the instruction list, and I've only got as far as EOR :)

Sounds like a likely candidate for what the problem is. Presumably once other things start to also reference the stack pointer (eg: PLA - pull accumulator from stack, where A is also linked to the data-bus via other instructions) it will sort itself out.

If this isn't the case, is there any sort of annotati>

If it were just the bits I know are be>

I see - this dovetails into what Rick is saying above - assuming I'm right that things won't be able to be optimized away once all the instructions are present.

I was just concerned that I was doing all this work, and at the end of the day I'd have a "cpu" that didn't do very much :)

Thanks all, again :)

Simon

Simon

(snip)

(snip)

I believe so, but I haven't tried it.

Sometimes I only want to know if a design will fit into a certain FPGA, and not actually do anything with it. In that case, I would not want to optimize things out.
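As a rough sketch of what such an annotation can look like: Verilog-2001 attributes are written as `(* ... *)` in front of a declaration, but the attribute names themselves are tool-specific. The `dont_touch` name below is an assumption based on Xilinx Vivado; other synthesizers use different names, so check the documentation for your tool. The module and signal names are invented.

```verilog
// Hypothetical example: asking the tool to keep a register with no readers.
// The attribute name is tool-specific (assumed here: Vivado's dont_touch).
module keep_demo (
    input  wire       clk,
    input  wire [7:0] din,
    output reg  [7:0] dout
);
    (* dont_touch = "true" *) reg [7:0] scratch;   // would otherwise be removed

    always @(posedge clk) begin
        scratch <= din + 8'd1;      // no path to any output
        dout    <= din;
    end
endmodule
```

This is handy for fit or resource estimates, but it only hides the message; it does not make the kept logic do anything useful.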

Most often, though, when something goes away, the optimizer is right and my design is wrong. It only takes a small mistake to propagate through and remove a lot of logic.

I have had, at least once, all logic removed!

-- glen

glen herrmannsfeldt

jt_eaton wrote: (snip)

There are stories like that, but one I remember is where the code was not optimized away, but all done at compile time. Well, that means almost all optimized away.

It was a Fortran benchmark that did complicated calculations using statement functions. Statement functions in Fortran are one line functions, used similar to the way #define is used in C, though at the time it was not usual for them to be expanded inline.

The IBM OS/360 Fortran H compiler, from the 1960's, did expand them inline, and also did constant expression evaluation, unlike many other compilers. The resulting code printed out the constant.

In the case of FPGAs, it might optimize down to a constant output, with no logic left.

-- glen

glen herrmannsfeldt

(snip)

Since he did the simulation, that is probably what he should do.

Often enough, I test my designs in an FPGA, and optimizing out means the logic is wrong.

One I remember was video display logic where the logic was wrong on the video output. Maybe an enable for a tristate output.

The result then recursively eliminates logic from that point back, which was everything except the sync generator.

I think the messages are generated in the order that the constant signals are found, so start with the message that comes first. It will say that some signal is constant, often a flip-flop that has a constant output. Find out why.

Yes, combined registers are okay.

-- glen

glen herrmannsfeldt

I don't think you are grasping the situation. If the output of the register isn't connected to anything, you have no need for it, so it is removed. It does not matter whether any other instructions use this register; if *any* one part of the design uses this register output, it won't be removed... unless that part of the design is also removed first.

If the output of the register isn't being used by the design, why do you care if it remains? It can't impact any result the hardware can calculate.

I think you still need to look at the code and figure out why the registers don't drive any inputs.

Rick

--

Rick
rickman

Maybe I'm not. What I was trying to say is that:

- The execute module does write to 'newSPData' during execution of BRK, viz:

    newSPData[`NW:0]       = pcPlusOne[`ND:`W];
    newSPData[`W*2-1:`W]   = pcPlusOne[`NW:0];
    newSPData[`W*3-1:`W*2] = ps | 8'h10;

- The 'newSPData' register is declared in the execute module's port list

    module execute
        (
        ...
        output reg [`W*3-1:0] newSPData,    // Bytes to stuff onto stack
        output reg [1:0]      numSPBytes    // Number of bytes to stuff onto stack
        );

- The 6502 module does link through to these ports:

    execute execute_inst
        (
        ...
        .newSPData(newSPData),
        .numSPBytes(numSPBytes)
        );

- The 6502 module does use the 'wire' vars that link through to the execute registers...

    ...
    if ((action & `UPDATE_SP) == `UPDATE_SP)
        begin
        if (numSPBytes == 1)
            begin
            stack[SP]

Simon

If nothing is using the stack output, there is a decent chance that it is getting optimized out, then there is no user for newSP's output and it will get optimized out. Check and see if the stack registers are getting optimized out.

If you have brought the stack out to a top level output (pin) it should not get optimized out.

A mistake that I have made, is to mis-spell the wire connection and then there is no user for the outputs. The easiest way to check that is to inspect the simulation at the inputs to the next stage that uses the data and make sure that they are wiggling as you expect and not showing undefined as they would for an undriven wire. The second easiest way to check that is to eyeball the naming for this problem.

Good Luck, BobH

BobH

Ok, I think you understand what I am saying. If "stack" is being optimized away, I would expect to see that also be in the warning messages. Is it? If not, I can only assume that is not the problem.

I don't know that you need to actually have instructions in your design that utilize the stack data. As long as there is a data path from "stack" to other logic and the control signals are driven from logic that is not optimized away it should remain.

I agree with the others that if you believe this problem is because your design is not complete, move on. I don't think it will be any harder to find with a completed design than with a partial design, possibly the opposite.

I will also say however, that unit testing can be very useful if you aren't designing it on the fly. If you have decomposed your modules with full specification of what they do and all the ins and outs, you should be able to write a test bench for each module.
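A per-module test bench does not need to be elaborate. The sketch below is only an illustration: `sp_unit` is a hypothetical stand-in module (not part of Simon's design), and its reset value of 8'hFF is an arbitrary choice made so the checks have something definite to compare against.

```verilog
`timescale 1ns/1ps

// Hypothetical unit under test: an 8-bit stack pointer with push/pull strobes.
module sp_unit (
    input  wire       clk,
    input  wire       rst,
    input  wire       push,
    input  wire       pull,
    output reg  [7:0] sp
);
    always @(posedge clk) begin
        if (rst)       sp <= 8'hFF;
        else if (push) sp <= sp - 8'd1;
        else if (pull) sp <= sp + 8'd1;
    end
endmodule

// Self-checking test bench: reports a failure if sp is not what we expect.
module sp_unit_tb;
    reg clk = 1'b0, rst = 1'b1, push = 1'b0, pull = 1'b0;
    wire [7:0] sp;

    sp_unit dut (.clk(clk), .rst(rst), .push(push), .pull(pull), .sp(sp));

    always #5 clk = ~clk;           // 10 ns clock period

    initial begin
        @(posedge clk); #1 rst = 1'b0;              // release reset
        push = 1'b1; @(posedge clk); #1 push = 1'b0;
        if (sp !== 8'hFE) $display("FAIL: after push, sp = %h", sp);
        pull = 1'b1; @(posedge clk); #1 pull = 1'b0;
        if (sp !== 8'hFF) $display("FAIL: after pull, sp = %h", sp);
        $display("sp_unit_tb done");
        $finish;
    end
endmodule
```

Because the checks look only at the module's ports, the same bench can later be rerun against a gate-level netlist of the module.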

--

Rick
rickman

If you make a spelling error, won't that be flagged because that signal hasn't been declared?
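Not necessarily. In plain Verilog, an undeclared name used in a port connection quietly becomes an implicit 1-bit wire, so a typo can compile without an error. The `default_nettype none` directive turns that into a compile error. A small sketch, with invented module and signal names:

```verilog
`default_nettype none   // from here on, undeclared identifiers are errors

module producer (
    input  wire       clk,
    input  wire       rst,
    output reg  [7:0] data
);
    always @(posedge clk) begin
        if (rst) data <= 8'd0;
        else     data <= data + 8'd1;
    end
endmodule

module consumer_top (
    input  wire       clk,
    input  wire       rst,
    output wire [7:0] result
);
    wire [7:0] stack_data;

    // If this connection were misspelled as .data(stack_dta), default Verilog
    // rules would create an implicit 1-bit wire called stack_dta and leave
    // stack_data undriven. With `default_nettype none it fails to compile.
    producer p (.clk(clk), .rst(rst), .data(stack_data));

    assign result = stack_data;
endmodule

`default_nettype wire   // restore the default for files compiled afterwards
```

Note that with the directive in force, every port needs an explicit `wire` or `reg`.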

--

Rick
rickman

So, oddly enough, there's no mention of 'stack' in the synthesis report (CTRL-F doesn't find anything either), even though it's declared (as registers) alongside the zero-page file in an identical fashion:

    ////////////////////////////////////////////////////////////////////////////
    // Set up zero-page as register-based for speed reasons
    ////////////////////////////////////////////////////////////////////////////
    reg [`NW:0] zp[0:255];      // Zero-page

    ////////////////////////////////////////////////////////////////////////////
    // Same for stack
    ////////////////////////////////////////////////////////////////////////////
    reg [`NW:0] stack[0:255];   // Stack-page

'zp' gets a lot of mentions (mainly that it's too sparse to go into blockRAM) but nary a hint of 'stack' anywhere to be seen.

Looking at the summaries, there are 268 8-bit registers declared, which is only sufficient for either 'stack' or 'zp', but not both together (unless it's cherry-picking the used ones from both declarations of course).

Curiouser and curiouser, quoth the raven^W^W^W said Alice...

No I haven't, it's self-contained within the '6502' module, but I could try doing that tonight.

Yep, I've done that too :)

Simon

Just to follow up, it definitely is because it's being optimised away. If I add a port which links to a byte of the stack register space, and link it to the top-level test bench...

    module cpu_6502
        (
        ...
        output reg [`NW:0] stackff
        );

    ////////////////////////////////////////////////////////////////////////////
    // Set up the stack as a register array
    ////////////////////////////////////////////////////////////////////////////
    reg [`NW:0] stack[0:255];   // Stack-page

    always @ (posedge(clk))
        stackff

Simon

It's been at least five years since I've actually done FPGA work, but I always took unexpected optimizations of this sort to mean that I didn't have my head screwed on straight, and I needed to figure out what I was doing wrong.

Most of the time, I was right.

--

Tim Wescott 
Wescott Design Services 
http://www.wescottdesign.com
Tim Wescott

I am not clear what you are trying to do with the stack here. Do you have a relatively complete CPU implemented?

I would expect something like:

    module CPU6502
    (
      output wire [ 7:0] data_out,
      output wire [15:0] address_out,
      input  wire [ 7:0] data_in,

      output wire        write_enable,
      output wire        read_enable,

      input  wire        irq_in,
      input  wire        nmi_in,
      input  wire        clk,
      input  wire        rstn
    );

I would expect that you would have an 8 bit stack pointer that would get muxed onto the address bus, possibly with offsets from the instruction stream. The newSP value would go into the stack pointer when you are updating the stack.
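Roughly, and only as a sketch: the control signal names (`use_stack`, `sp_load`) and the reset value below are hypothetical, chosen for illustration. The one fixed fact is that the 6502 stack lives in page 1 ($0100-$01FF), so the 8-bit stack pointer supplies the low address byte.

```verilog
// Hypothetical sketch of an 8-bit stack pointer muxed onto the address bus.
module addr_mux_sketch (
    input  wire        clk,
    input  wire        rst,
    input  wire        use_stack,   // assumed control: stack access this cycle
    input  wire        sp_load,     // assumed control: update the stack pointer
    input  wire [7:0]  new_sp,
    input  wire [15:0] pc,
    output wire [15:0] address_out
);
    reg [7:0] sp;

    always @(posedge clk) begin
        if (rst)          sp <= 8'hFF;   // arbitrary reset value for the sketch
        else if (sp_load) sp <= new_sp;
    end

    // Stack accesses go to page 1; everything else uses the program counter.
    assign address_out = use_stack ? {8'h01, sp} : pc;
endmodule
```

With the stack in external RAM like this, the stack pointer always has a path to an output pin through the address bus, so it cannot be optimized away.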

RAM would get hung on the address and data buses with block decode logic to decode the upper address bits into a chip select for the RAM and peripherals. Since FPGAs don't do tri-state buses, there will be a read data in mux to select the data source from the addressed bus target for reads.

sort of like:

    module mcu
    (
      output uart_txd,
      input  uart_rxd,
      input  clk,
      input  rstn
    );

      wire [15:0] address;
      wire [ 7:0] data_out;
      wire [ 7:0] ram_data, rom_data, uart_data;
      reg  [ 7:0] data_in;
      wire        ram_block_sel, rom_block_sel, uart_block_sel;
      wire        write_enable, read_enable;
      wire        irq;                        // UART interrupt request to the CPU

      CPU6502 cpu
      (
        .data_out     (data_out),
        .data_in      (data_in),
        .address_out  (address),
        .write_enable (write_enable),
        .read_enable  (read_enable),
        .irq_in       (irq),
        .nmi_in       (1'b0),
        .clk          (clk),
        .rstn         (rstn)
      );

      RAM_1Kx8 ram
      (
        .address_in   (address[9:0]),
        .data_in      (data_out),
        .data_out     (ram_data),
        .write_enable (write_enable),
        .chip_sel     (ram_block_sel)
      );

      ROM_1Kx8 rom
      (
        .address_in   (address[9:0]),
        .data_out     (rom_data),
        .read_enable  (read_enable),
        .chip_sel     (rom_block_sel)
      );

      UART uart
      (
        .txd          (uart_txd),
        .rxd          (uart_rxd),
        .reg_select   (address[1:0]),         // 2 bits of address
        .data_in      (data_out),
        .data_out     (uart_data),
        .write_enable (write_enable),
        .read_enable  (read_enable),
        .irq_out      (irq),
        .clk          (clk),
        .rstn         (rstn)
      );

      // address block decode
      assign rom_block_sel  = address[15:13] == 3'b111;   // top address
      assign ram_block_sel  = address[15:13] == 3'b000;   // bottom address
      assign uart_block_sel = address[15:13] == 3'b001;

      // read data path mux
      always @( * )
      begin
        case (address[15:13])
          3'b000:  data_in = ram_data;
          3'b001:  data_in = uart_data;
          3'b111:  data_in = rom_data;
          default: data_in = 8'h0;
        endcase
      end

    endmodule

BobH

Yeah, that makes sense. There is a chain of

input --> A --> B --> C --> output

where A, B and C are registered values. Each of them may have many other internal signals combining to produce the value in the signal and may be used in other logic internally. If none of the destinations reach an output or if the inputs are optimized so they do not depend on anything, but rather are constants (like A and '0' which will always produce a '0' result) then that logic will be optimized away. This can remove an entire chain, or perhaps just B and C or any other combination.
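A sketch of the constant case, using invented names: `enable` stands in for a control signal that is never driven to 1 (for example because the decode logic for it does not exist yet). The first register can then never load, it stays at its initial value, and the constant ripples down the chain.

```verilog
// Hypothetical example: a stuck-low enable makes a whole register chain constant.
module const_chain_demo (
    input  wire       clk,
    input  wire [7:0] din,
    output reg  [7:0] dout
);
    wire      enable  = 1'b0;       // stand-in for an unimplemented control signal
    reg [7:0] stage_a = 8'h00;
    reg [7:0] stage_b = 8'h00;

    always @(posedge clk) begin
        if (enable) stage_a <= din; // never loads: stage_a stays at 8'h00
        stage_b <= stage_a;         // so stage_b is constant as well
        dout    <= stage_b;         // and dout collapses to a constant
    end
endmodule
```

In a report, the first warning would normally be about `stage_a`; the messages for `stage_b` and `dout` are consequences of it.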

In Simon's case this may well be due to inputs which are not driven because the instruction decode logic is not implemented. This can either be causing logic to be optimized because it is constant, or to be optimized because the output is never gated into the next register. Unless he provides the full code we can't debug this.

I always work from block diagrams which help me to "see" my data flow which makes it easy to see which control points need to be driven. I expect Simon's problem is in the instruction decode logic not driving a signal, but it is hard to tell. The fact that a register is changing value in the simulator means that it is being given variable values, but does not mean the output is being used for anything. I'm very unclear why he has a 32 bit register in an 8 bit processor.

--

Rick
rickman
