Wishbone interface, FPGA newbie and advice

Hi,

I am just starting out with respect to utilising FPGAs within my designs. Up to this point in time I've mostly been involved with microcontroller based designs, but I'm seeing where FPGAs can help solve various types of tasks I find myself in and hence wanting to learn how to use them properly.

I've bought a small development board and have gradually implemented various designs for making beeps and sirens on a speaker, and decoders such as a binary to 7-segment decoder, etc. I term them "simple" designs since they are designs that can be "completely wrong" but at the same time still "work" (i.e. yes, it does what I set out to do, but the design techniques might not be the best, or it wouldn't work if I had a little more clock skew, etc.).

I am at the stage where I think I am almost confident enough to design a FPGA into a hobby project of my own with enough confidence that I'll be able to develop the right VHDL source to make it "tick"...

To this end I've started attempting to simulate/develop a 68HC11 to Wishbone interface (the idea being to graft an FPGA onto the databus of an existing HC11 product) and I'd like some advice on what I've got so far. Not knowing much about the data buses typically utilised within FPGA designs, Wishbone looked like a good idea, especially when sites such as

formatting link
appear to support it for their freely available cores.

My VHDL source for a Wishbone master which has an HC11 data bus interface on the other end is shown below. It's a slightly "confused" master, in that it's not as separated out as the Wishbone specification details, i.e. it's got elements of an InterCon and SysCon module thrown in as well... Sorry about the poor coding standard (the Wishbone/HC11 bus signals are named using different conventions etc), I just thought I'd "throw it out there" rather than worrying about tidying it up first...

entity hc11_wb_master is
  port (
    -- Wishbone bus interface
    WB_CLK_O : out std_logic;  -- Clock
    WB_RST_O : out std_logic;  -- Reset
    WB_ADR_O : out std_logic_vector(1 downto 0);
    WB_DAT_I : in  std_logic_vector(7 downto 0);
    WB_DAT_O : out std_logic_vector(7 downto 0);
    WB_WE_O  : out std_logic;  -- Write enable
    WB_STB_O : out std_logic;  -- Strobe
    -- HC11 bus interface
    e_clk : in    std_logic;
    reset : in    std_logic;
    cs    : in    std_logic;  -- chip select for FPGA (HC11's CS2)
    rw    : in    std_logic;  -- read/write* strobe
    addr  : in    std_logic_vector(10 downto 0);
    data  : inout std_logic_vector(7 downto 0)
  );
end hc11_wb_master;

architecture Behavioral of hc11_wb_master is
  signal write_cycle : std_logic;
begin
  WB_CLK_O

Reply to
Christopher Fairbairn

Hi!

I've done a similar project before: I have an async master to WB interface. It's (as far as I can tell) working but it's not an easy task to do.

There are some problems with this design:

- You don't handle error and retry requests from the WB side and don't generate WB_CYC_O.

- There's no wait-state generation. You don't detect any wait-state requests from the WB side and don't generate wait-states for your async master (HC11). That can cause problems if you communicate with slow devices (for example a FIFO which generates waits when it is full).

- More importantly, your logic is all wrong. The WB bus is synchronous and has this feature: a cycle starts when the master asserts WB_CYC_O (which you don't generate to begin with) and ends when the target asserts WB_ACK_I (or WB_ERR_I or WB_RTY_I). After that, at the next clock, if WB_CYC_O (together with WB_STB_O) remains active, a new cycle starts. So your logic can generate multiple writes or reads to/from the same location depending on the timing. That can cause serious problems with transmit or receive FIFOs even in your case of the serial controller. (If not, then that's a bug in the serial controller...) Even more problematic is that this 'feature', combined with the lack of proper wait-state handling, can cause invalid data to be written to, and invalid data to be read from, any location that is not zero-wait-state.
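To illustrate the last point, here is a minimal sketch of a master that issues exactly one Wishbone cycle per request and drops CYC/STB as soon as the slave terminates it. The entity name, `start_i` and `busy_o` are hypothetical and not part of either posted design; the sketch also treats a retry (RTY) simply as an end of cycle rather than actually retrying.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical single-access master: one request in, one bus cycle out.
entity wb_single_cycle is
  port (
    clk_i   : in  std_logic;
    rst_i   : in  std_logic;
    start_i : in  std_logic;  -- assumed: one-clock-wide request pulse
    busy_o  : out std_logic;  -- high while a cycle is outstanding
    cyc_o   : out std_logic;
    stb_o   : out std_logic;
    ack_i   : in  std_logic;
    err_i   : in  std_logic;
    rty_i   : in  std_logic
  );
end entity wb_single_cycle;

architecture rtl of wb_single_cycle is
  signal active : std_logic := '0';
begin
  cyc_o  <= active;
  stb_o  <= active;
  busy_o <= active;

  process (clk_i)
  begin
    if rising_edge(clk_i) then
      if rst_i = '1' then
        active <= '0';
      elsif active = '1' then
        -- Any termination (ACK, ERR or RTY) ends the cycle. Dropping
        -- CYC/STB on this edge keeps a second access from starting
        -- on the following clock.
        if (ack_i or err_i or rty_i) = '1' then
          active <= '0';
        end if;
      elsif start_i = '1' then
        active <= '1';
      end if;
    end if;
  end process;
end architecture rtl;
```

Requests that arrive while `busy_o` is high are ignored here; a real bridge would have to hold them off with wait states on the async side.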

I'm sorry, I've been down that road too, so I know :-(...

I don't think it is an issue.

Tri-state seems to be OK.

Setup-hold times are device (and place-and-route) specific, so I can't answer that without knowing more about your target architecture. The FPGA and uP datasheet should answer most of your questions. In general FPGAs are much faster than an HC11 so you might have setup problems on the HC11 side but others should work fine.

I'll paste my circuit here for reference. Note that it does not use the CPU's clock to sync up the WB part, so it can run at much higher clock speeds (in my case 70MHz). That can help meet the timing (with proper wait-state handling of course) but can cause all kinds of metastability issues, which I'm not sure I've addressed properly either. Please note that I'm not a professional either; I'm not claiming my design to be nice or flawless. It at least worked in real HW... Any comments are welcome.

Andras Tantos

================================================================

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity wb_async_master is
  generic (
    dat_width: positive := 16;
    adr_width: positive := 20;
    ab_rd_delay: positive := 1
  );
  port (
    wb_clk_i: in std_logic;
    wb_rst_i: in std_logic := '0';

    -- interface to wb slave devices
    wb_adr_o: out std_logic_vector (adr_width-1 downto 0);
    wb_sel_o: out std_logic_vector ((dat_width/8)-1 downto 0);
    wb_dat_i: in std_logic_vector (dat_width-1 downto 0);
    wb_dat_o: out std_logic_vector (dat_width-1 downto 0);
    wb_cyc_o: out std_logic;
    wb_ack_i: in std_logic;
    wb_err_i: in std_logic := '-';
    wb_rty_i: in std_logic := '-';
    wb_we_o: out std_logic;
    wb_stb_o: out std_logic;

    -- interface to the asynchronous master device
    ab_dat: inout std_logic_vector (dat_width-1 downto 0) := (others => 'Z');
    ab_adr: in std_logic_vector (adr_width-1 downto 0) := (others => 'U');
    ab_rd_n: in std_logic := '1';
    ab_wr_n: in std_logic := '1';
    ab_ce_n: in std_logic := '1';
    ab_byteen_n: in std_logic_vector ((dat_width/8)-1 downto 0);
    ab_wait_n: out std_logic; -- wait-state request 'open-drain' output
    ab_waiths: out std_logic -- handshake-type totem-pole output
  );
end wb_async_master;

architecture xilinx of wb_async_master is
  constant ab_wr_delay: positive := 2;
  -- delay lines for rd/wr edge detection
  signal rd_delay_rst: std_logic;
  signal rd_delay: std_logic_vector(ab_rd_delay downto 0);
  signal wr_delay: std_logic_vector(ab_wr_delay downto 0);
  -- one-cycle long pulses upon rd/wr edges
  signal ab_wr_pulse: std_logic;
  signal ab_rd_pulse: std_logic;
  -- one-cycle long pulse to latch address for writes
  signal ab_wr_latch_pulse: std_logic;
  -- WB data input register
  signal wb_dat_reg: std_logic_vector (dat_width-1 downto 0);
  -- internal copies of WB signals for feedback
  signal wb_cyc_l: std_logic;
  signal wb_we_l: std_logic;
  -- Comb. logic for active cycles
  signal ab_rd: std_logic;
  signal ab_wr: std_logic;
  signal ab_active: std_logic;
  -- internal copies of wait signals for feedback
  signal ab_wait_n_rst: std_logic;
  signal ab_wait_n_l: std_logic;
  signal ab_waiths_l: std_logic;
  signal ab_wait_n_l_delayed: std_logic;
  signal ab_waiths_l_delayed: std_logic;
  -- active when WB slave terminates the cycle (for any reason)
  signal wb_ack: std_logic;
  -- signals a scheduled or commencing posted write
  signal write_in_progress: std_logic;
begin
  ab_rd

Reply to
Andras Tantos

Consider synchronizing the interface to a faster fpga clock and generate your own synchronous read and write strobes in just the right places.

They aren't.

Looks reasonable for a first cut. You have to compare your sim waveforms to the HC11 and Wishbone data sheets. Consider making an oe signal to drive data Z between cycles.
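A minimal sketch of the oe idea, assuming an active-high chip select and that the FPGA should only drive the bus during the second half of a read cycle (E high). The entity name and `reg_val` are hypothetical; the polarity of `cs` depends on how the HC11 chip select is programmed.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical read port: drives the HC11 data bus only while selected.
entity hc11_read_port is
  port (
    e_clk   : in    std_logic;
    cs      : in    std_logic;  -- assumed active high
    rw      : in    std_logic;  -- '1' = read, '0' = write
    reg_val : in    std_logic_vector(7 downto 0);  -- value to return
    data    : inout std_logic_vector(7 downto 0)
  );
end entity hc11_read_port;

architecture rtl of hc11_read_port is
  signal oe : std_logic;
begin
  -- Output enable: selected, read cycle, second half of the bus cycle.
  oe   <= cs and rw and e_clk;
  -- Release the bus (high impedance) whenever oe is inactive.
  data <= reg_val when oe = '1' else (others => 'Z');
end architecture rtl;
```

The enable is purely combinational here, so the bus turn-off delay after E falls still has to be checked against the HC11 data sheet.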

I don't know the interfaces, but WB_STB_O One thing this project has highlighted is the need for me to learn more

It's not too late.

You already know how to run the simulator, so get a copy of Ashenden's guide to VHDL as a language reference, and get busy. Consider adopting a synchronous testbench style:

formatting link

Use vhdl procedures to do this. Using an intermediate text file makes it harder, not easier.

-- Mike Treseler

Reply to
Mike Treseler

The reason why I started with the idea of using the HC11 microcontroller's E-clock to clock the Wishbone interface was that I imagined it would bring simplicity, as I wouldn't need to deal with the interfacing between the two different clock domains (i.e. the CPU's crystal derived clock and the FPGA's clock).

In my circuit I don't have to worry about the HC11 going to a low power state (where the e-clock signal would stop) as I'm not intending to use such features.

Having said that, it appears that things might be more flexible if I decouple that requirement, use a faster clock within the FPGA and deal with the fact that the two buses are now asynchronous to each other.

Ok, well if I'm reading the HC11's waveforms correctly, at least on that bus it's slightly different.

It has a single R/W* signal. A logic zero on this signal indicates that the present bus cycle is a write operation, and it can be held low for consecutive bus cycles in cases where double-bytes are being written. The R/W* is specified as being valid whenever the ADDRESS bus contains a valid address (which is almost the entire bus cycle), so it's more a "write request" than a "write strobe".

With the programmable chip-selects there is a choice on when they are asserted. Programming a register with the correct value will mean that the chipselect will be asserted as soon as the address is placed upon the bus. Changing that value can change the length of the chipselect strobe to make it only occur for the second half of the bus cycle (when e-clock is high meaning device should place data onto bus).

I know 100% that I'm not understanding basic bus interfacing at the moment. I entered this project thinking (for the HC11 at least) that I could basically be on the look out for a particular clock edge and then simply look at the Read/Write signal and the chipselect to see if the transaction was meant for me...

And I think it's almost like that... for example, the HC11 datasheet indicates that on the rising edge of its e-clock (a 4th of its crystal oscillator speed) the read/write strobe, the address and the chipselect (if programmed correctly) will all be valid and hence I could latch them into the FPGA...

But that's where I get stuck and my knowledge starts to run out. For a read operation (HC11 reading a register from the FPGA) I think I'm fine... I can detect the signals such as the read/write strobe on the rising edge of the clock and output my data, then as soon as the e-clock goes low I can tristate the bus again (and this makes sense, as the HC11 reference manual has the following sentence in it: "The E-clock can be used to enable external devices to drive data onto the data bus during the second half of the bus cycle (E clock high)"). So I could have something like

hc11_data Looks reasonable for a first cut.

Reading the various replies to my initial posting and looking at the various datasheets etc has made me appreciate exactly how little I actually know about this... or even digital logic in general when it comes to sequential designs.

So at least in the mean time I've decided to concentrate on an even smaller subcomponent of the desired goal.

Instead of attempting to simulate the entire HC11 to Wishbone bus interface I'm going to concentrate on getting a simple HC11 bus interface designed and simulated properly, i.e. attempting to basically simulate a 74HC series 8-bit latch hanging off the HC11's databus... something I can do well with "real" hardware :-)
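A minimal sketch of such a latch, assuming an active-high chip select, an active-high asynchronous reset, and that write data is captured on the falling edge of E (as an edge-triggered 74HC-style register clocked from E would do). The entity name and the `q` port are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit output register on the HC11 data bus.
entity hc11_latch is
  port (
    e_clk : in  std_logic;
    reset : in  std_logic;  -- assumed active high, asynchronous
    cs    : in  std_logic;  -- assumed active high
    rw    : in  std_logic;  -- '0' = write
    data  : in  std_logic_vector(7 downto 0);
    q     : out std_logic_vector(7 downto 0)
  );
end entity hc11_latch;

architecture rtl of hc11_latch is
begin
  process (e_clk, reset)
  begin
    if reset = '1' then
      q <= (others => '0');
    elsif falling_edge(e_clk) then
      -- Capture the byte only when this device is selected for a write.
      if cs = '1' and rw = '0' then
        q <= data;
      end if;
    end if;
  end process;
end architecture rtl;
```

Whether `cs`, `rw` and `data` are held long enough after E falls is exactly the kind of thing to check against the timing diagrams in the reference manual.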

I think at the moment I have enough issues with properly reading the waveform timing diagrams in the HC11 reference manual without thinking about the complete design, especially when you throw in considerations such as how I'm going to deal with a slave wanting to extend a bus cycle... something which isn't as simple as asserting a signal on the HC11's bus.

Once I get that far, then I can start worrying about interpreting the waveforms in the wishbone spec and dealing with the "translation" of signals between the two busses.

Thank you for this reference. The example testbench in that thread was sort of what I was aiming for when I talked about reading stimulus from a file. At present my test bench is physically hardcoded to perform the individual bus cycles, i.e. I have something along the lines of

cs

Reply to
Christopher Fairbairn

At the moment the three initial wishbone slaves I'm intending to interface to are all pretty primitive and have their ACK_O simply tied to their STB_I input as allowed by the wishbone spec if they don't require any waitstates.
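For reference, a minimal sketch of what such a zero-wait-state slave might look like: a single 8-bit register with ACK_O wired straight to STB_I. The entity name and port list are hypothetical and not taken from any of the cores mentioned.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical zero-wait-state Wishbone slave: one 8-bit register.
entity wb_reg8 is
  port (
    clk_i : in  std_logic;
    rst_i : in  std_logic;
    stb_i : in  std_logic;
    we_i  : in  std_logic;
    dat_i : in  std_logic_vector(7 downto 0);
    dat_o : out std_logic_vector(7 downto 0);
    ack_o : out std_logic
  );
end entity wb_reg8;

architecture rtl of wb_reg8 is
  signal reg : std_logic_vector(7 downto 0) := (others => '0');
begin
  -- No wait states: acknowledge in the same clock the strobe is seen.
  ack_o <= stb_i;
  dat_o <= reg;

  process (clk_i)
  begin
    if rising_edge(clk_i) then
      if rst_i = '1' then
        reg <= (others => '0');
      elsif stb_i = '1' and we_i = '1' then
        reg <= dat_i;
      end if;
    end if;
  end process;
end architecture rtl;
```

With this style of slave, a strobe held for several clocks simply repeats the same write. That is harmless for a plain register but not for something like a FIFO.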

As such I shouldn't need wait states, but it is something I've been conscious of and keeping in the back of my mind, knowing full well that Murphy's law will present a nice Wishbone slave that I desire to use at some stage in the future which will require wait states... I'm just not exactly sure how to deal with it, considering I can't stall the HC11 bus.

As part of my research and investigations today I discovered an asynchronous wishbone master on

formatting link
which appears to be developed by yourself
formatting link

Is this correct? It appears that the version on the opencores website is a lot simpler than the version you presented in your posting. Reading through the source for the one on opencores.org has cleared a lot of things up for me and "turned a couple of lightbulbs on" in my mind. What are the main differences between the two? I'm having difficulty following the one in the newsgroup posting, while I can pretty much follow the one on

formatting link

The testbench support code in the

formatting link
project has also helped me out. The use of a VHDL function to wrap up the inner workings of a bus cycle should stop me from duplicating all those lines of code in my testbench for every read/write I perform on the bus...
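A rough sketch of what such a wrapper could look like for an HC11 write cycle. It is written as a procedure rather than a function, since VHDL functions cannot contain wait statements. The package name, parameter names and the 10 ns hold delay are hypothetical, and the chip select is assumed to be active high.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package hc11_tb_pkg is
  -- Hypothetical testbench helper: performs one HC11-style write cycle.
  procedure hc11_write (
    constant a   : in  std_logic_vector(10 downto 0);
    constant d   : in  std_logic_vector(7 downto 0);
    signal e_clk : in  std_logic;
    signal cs    : out std_logic;
    signal rw    : out std_logic;
    signal addr  : out std_logic_vector(10 downto 0);
    signal data  : out std_logic_vector(7 downto 0));
end package hc11_tb_pkg;

package body hc11_tb_pkg is
  procedure hc11_write (
    constant a   : in  std_logic_vector(10 downto 0);
    constant d   : in  std_logic_vector(7 downto 0);
    signal e_clk : in  std_logic;
    signal cs    : out std_logic;
    signal rw    : out std_logic;
    signal addr  : out std_logic_vector(10 downto 0);
    signal data  : out std_logic_vector(7 downto 0)) is
  begin
    wait until falling_edge(e_clk);  -- bus cycle begins with E low
    addr <= a;
    rw   <= '0';                     -- write
    cs   <= '1';                     -- assumed active high
    wait until rising_edge(e_clk);
    data <= d;                       -- write data during E high
    wait until falling_edge(e_clk);  -- device under test captures here
    wait for 10 ns;                  -- assumed hold time, not from the data sheet
    cs   <= '0';
    rw   <= '1';
    data <= (others => 'Z');         -- release the bus
  end procedure hc11_write;
end package body hc11_tb_pkg;
```

With `use work.hc11_tb_pkg.all;` in the testbench, a stimulus process without a sensitivity list could then call something like `hc11_write("00000000001", x"55", e_clk, cs, rw, addr, data);` once per bus cycle.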

Thanks, Christopher Fairbairn.

Reply to
Christopher Fairbairn

Hi!

As long as you are aware of the limitations, it's OK.

Yeah, I know ;-).

The difference is that the one I've shown here actually works. I didn't have time to update the cores on OpenCores, but as I started using them I found a lot of problems, and in the case of this core, for example, I had to completely re-write the whole thing. The old one pretty much goes along the lines of your implementation, and as such has the same fundamental problems.

I know the presented core is a bit confusing, so here are some of the ideas I've used:

You have to make sure you generate exactly one WB cycle for each async bus cycle. That requirement makes the handling of read and write cycles different. For a write cycle, you have to make sure you use the right data from the async bus in the write cycle on the WB bus. In the general case, the write data is valid at the rising edge of the (negated) control signals of the async bus. This means that you have to delay the write cycle on the WB side until the write has finished on the async side. For reads, however, you have to start the WB cycle in parallel with the async cycle to make sure you have the right data available at the end of the cycle. Also, in the general case, at the beginning of the read cycle you might not have a valid address on the address bus, so you might have to wait some time before starting the read operation. All in all, you need delayed writes and parallel reads.
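A stripped-down sketch of the edge detection this implies, not the actual core: both strobes are passed through a short shift register in the WB clock domain, a read pulse is produced at the start (falling edge) of `ab_rd_n`, and a write pulse at the end (rising edge) of `ab_wr_n`. The entity name, pulse names and shift register depth are assumptions, and capturing the address and write data is not shown.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical edge detector for active-low async read/write strobes.
entity ab_edge_pulses is
  port (
    clk      : in  std_logic;  -- WB-side clock, assumed much faster than the bus
    ab_rd_n  : in  std_logic;
    ab_wr_n  : in  std_logic;
    rd_pulse : out std_logic;  -- one clock wide, at the start of a read
    wr_pulse : out std_logic   -- one clock wide, after the end of a write
  );
end entity ab_edge_pulses;

architecture rtl of ab_edge_pulses is
  -- Bit 0 is the newest sample, bit 2 the oldest; idle level is '1'.
  signal rd_s : std_logic_vector(2 downto 0) := (others => '1');
  signal wr_s : std_logic_vector(2 downto 0) := (others => '1');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      rd_s <= rd_s(1 downto 0) & ab_rd_n;
      wr_s <= wr_s(1 downto 0) & ab_wr_n;
    end if;
  end process;

  -- Only the two older (already synchronized) samples are compared.
  rd_pulse <= '1' when rd_s(2 downto 1) = "10" else '0';  -- falling edge
  wr_pulse <= '1' when wr_s(2 downto 1) = "01" else '0';  -- rising edge
end architecture rtl;
```

`rd_pulse` would start the parallel WB read, while `wr_pulse` would start the delayed WB write once the async write has completed.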

This requirement makes the interface quite complicated, because writes happen after the fact. What if the async side initiates another cycle while you're performing that delayed write? You'll have to wait until the current WB bus activity ends, and start the operation only afterwards. The way you handle this wait is, again, a bit different for reads and writes, but this fact alone makes correct async-side wait-state generation a must.

Another thing to consider is what happens if the async master does not honor your wait states and ends the cycle prematurely. That's an error, of course, but you at least have to recover from it somehow.

And finally I've added two different types of wait-state generation: one is a handshake type, as used for example in EPP printer port communication, and the other is the normal open-collector type wait signal used in most uC buses.

Just a side note: I had test-benches for the old core, and it looked OK to me. The moment I added it to real HW, problems started. Creating good test-benches is really hard. You can test only for what you've thought about, and chances are you got those things right in the design. Real life, however, tests your design in its own way, and at the end of the day that's the test that should pass, not (only) your test-bench.

Regards, Andras Tantos

Reply to
Andras Tantos
