Using an FPGA to drive the 80386 CPU on a real motherboard

I've read perhaps 30% of the NT as an adult, and yes, I have considered my "path". I don't believe in anything supernatural, and therefore am not religious. There is plenty of good advice in the NT, especially the teachings of Jesus (much less so with Paul, who goes directly against Jesus in a lot of his philosophy). There is no need to introduce a "god" in order to appreciate that "love thy neighbour" is a good basis for how to live your life.

I have asked these sorts of questions, and considered them critically, in many aspects. And herein lies a key difference between us - you have not considered the answers, or viewed them critically. Somewhere along the line, you have picked up the idea that you have found /the/ answer in a particular interpretation of Christianity. Now you no longer look for answers, or take time to think over questions properly - when faced with a question, you assume you already know /the/ answer, and then you change the question to fit that answer.

I believe in seeking answers, and searching for "truth" - but I do so in the world around me, and the people around me, rather than in imagined ideas or books written by ignorant people from a different time and a different culture, aimed at controlling their fellow tribesmen.

I have seen what your religious ideas have done to you (as far as you have written and acted in Usenet groups). No, thank you very much, it does not appeal in the slightest.

I /have/ seen people who have become happier people, or "better" people, as a result of religious beliefs - but that is because their religion fills a social or psychological gap in their lives. Like most people, I don't have such gaps that need filling with a religion. (Like everyone else, I do have gaps or flaws - just not ones that would be fixed by a religion.)

I will never be a high-level basketball player, because I am short and round rather than tall and thin. Barring psychological illness or brain damage, I will never be religious (at least, not in the way you are) because I have an open mind and think rationally and critically - I will not accept any supernatural concepts such as "god" without /real/ proof. That runs contrary to the "leap of faith" required for believing in a god, or gods, with no proof - merely a trust in what other people write or say about them.

Reply to
David Brown

It is bad enough that one person is polluting the group with totally off topic discussions. Do we have to make it a group effort?

Why don't you exchange private email rather than public posts? Then the rest of us don't need to see what is essentially a private conversation.

--

Rick C
Reply to
rickman

Fair enough - sorry about that. I don't usually reply much to Rick's non-technical posts.

I don't plan on taking this any further (in the group, or in email).

Reply to
David Brown

I point you to the One who can guide you, David. His name is Jesus. He can instruct you properly on the things you think you know today. I challenge you to answer Him.

Best regards, Rick C. Hodgin

Reply to
Rick C. Hodgin

I consider 100MHz to be fast, 10MHz to be slow. On a clock-independent level, I consider logic 2 LUT deep to be fast and logic more than 4 LUTs deep to be slow.

Bad wording - I meant "parameterize" in the sense of "use language features of Verilog", assuming one uses Verilog. :)

For the most part, I will acknowledge and defer - you have more experience than I do. But I'd say that "width" does relate to "speed" and "complexity". The thesis I'm pushing is that wide enough bit chunks require adding extra gates to the sides, and at some point require additional gates in depth to integrate/consume the wider output.

Illustration: let's assume we're making a comparator, but using only LUTs (so no carry chain/mux magic). If we compare two 2-bit numbers, we can use a single LUT4 to produce an output, with the whole construct being 1 gate deep. If we increase the width to 3 or 4 bits, we now need two LUT4s to compare the numbers. But those two gates have two outputs. To further "compress" it to a single output, we need a third LUT4 *in series* with the other two. So the whole thing is now 2x slower and 3x bigger than it used to be. Now, taking into account the users of the comparator's output, we can optimize. If the output used to go into a LUT with one spare input, we don't have to add the extra series gate; we can route the extra side gate to the spare input. But that increases the complexity, may require a (buggy) optimizing synthesizer/place&route, and may tie you down if you ever need to change something that would affect the "spare" input that got pressed into optimization.
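To sketch the scaling argument above in Verilog (module and signal names are mine, and a real synthesizer may pack things differently): a 2-bit equality compare has 4 inputs total, so it fits in one LUT4 at depth 1, while the 4-bit version needs two LUT4s for the halves plus a third in series to combine them:

```verilog
// 2-bit equality compare: 4 inputs total, fits a single LUT4, depth 1.
module eq2 (input [1:0] a, input [1:0] b, output eq);
  assign eq = (a == b);
endmodule

// 4-bit equality compare: each 2-bit half fits one LUT4, and a third
// LUT in series ANDs the partial results -- 3 LUTs, depth 2.
module eq4 (input [3:0] a, input [3:0] b, output eq);
  wire eq_lo = (a[1:0] == b[1:0]);  // LUT4 #1
  wire eq_hi = (a[3:2] == b[3:2]);  // LUT4 #2
  assign eq = eq_lo & eq_hi;        // LUT4 #3, in series with the others
endmodule
```

The "spare input" optimization mentioned above corresponds to folding that final AND into a downstream LUT that has an unused input, which saves the extra level of depth at the cost of coupling the two pieces of logic together.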

This would be much easier if I knew for sure what the standard terminology was. Self-taught and all that..

The way I usually do things, I have batches of Verilog that get synthesized, placed and routed in one go. I don't floorplan. With that setup, a change in one module can have nasty consequences somewhere unrelated. I had this happen when I added a larger (16-bit) counter in one part that made the whole thing run too slow. The solution was chopping the counter into a series of smaller (4x4-bit) counters and manually carrying the carry between them.
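A minimal sketch of that counter-chopping trick (module and signal names are mine; the original code may well have looked different): each 4-bit slice only increments when all the slices below it are about to wrap, so no single piece of arithmetic spans the full 16 bits and the P&R tool is free to place the slices apart from each other:

```verilog
// 16-bit counter chopped into four 4-bit slices with explicit
// carries between them. Behaves like a plain 16-bit up-counter.
module counter16_sliced (input clk, input rst, output [15:0] count);
  reg [3:0] s0, s1, s2, s3;

  // A slice carries when it and every slice below it are at 4'hF.
  wire c0 = (s0 == 4'hF);
  wire c1 = c0 & (s1 == 4'hF);
  wire c2 = c1 & (s2 == 4'hF);

  always @(posedge clk) begin
    if (rst) begin
      {s3, s2, s1, s0} <= 16'h0000;
    end else begin
      s0 <= s0 + 1'b1;
      if (c0) s1 <= s1 + 1'b1;
      if (c1) s2 <= s2 + 1'b1;
      if (c2) s3 <= s3 + 1'b1;
    end
  end

  assign count = {s3, s2, s1, s0};
endmodule
```

If the combinational carry chain between slices is still too slow, the carries can be registered instead, at the cost of the slices no longer forming a plain binary count.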

I infer the problem was that the big counter was taking up prime real estate in the chip that the (big and clunky) home-grown CPU needed. Chopping it up allowed the P&R tool to spread the counter over a wider area, allowing the CPU bits to be closer together and run fast. The chip was over 95% full at the time.

I remember that increasing a register from 2 bits to 3 bits killed the maximum frequency the circuit could run at. It became at least 2x slower. If I remember correctly, further increasing it to 4 bits fixed most of the problems. It was bizarre.

If I look hard enough, I may even be able to find the code. I seem to remember it was in a routing module of some kind. Or some other glue module that would connect the CPU core to the SRAM. That was on a MachXO2.

I am aware of lowRISC, and of Milkymist, of some x86 (soft) chip, and of Ben Nanonote and of Novena (which doesn't fully fit the bill).

I was originally going to use Milkymist, with maybe some peripherals swapped, but I didn't like the way they implemented the DDR controller (there wasn't a lot of it). Furthermore, lm32 - the system's CPU - had problems. Its load/store module would block the instruction pipeline and didn't look like it could be easily converted into a non-blocking implementation. Also, on a cache miss, the CPU's cache would block everything, fill the entire cache line (which could be several words long) and release the block. I felt like it would be easier to take a simpler CPU and extend it with the requisite functionality.

A warning: the following paragraph lists the *exact* causes of my frustration with lowRISC, and may cause you to get frustrated in turn, especially if you have a stake in the project. :)

I took a look (back in October or November 2015) at lowRISC but two (well, three) things put me off. First, the code is hard to find. Just now, as I was fact-checking my post, I had to click dozens of times on the website to see a link to GitHub. And, once there, it still took another dozen clicks and some fudging to find at least some of the actual Verilog. I still don't know where to find the code for the CPU, for example. Milkymist had no such problems. Neither did the repository for the Nyuzi CPU, or the AEMB/AEMB2 repo. The second problem was that the website mostly lists a lot of ... well ... fluff, but is remarkably short on meaty details. How about a list of the peripherals? It doesn't even have a page on Wikipedia! The wiki magic could have saved it.

The third problem it shares with OpenRISC: the benchmarks show it running at about 40MHz or less in the rough class of FPGAs I was likely to implement it in. Meanwhile, Lattice was promising 85MHz for its lm32. The final nail in the coffin for me and OpenRISC was this document:

formatting link
I am aware they list OpenRISC as running at 185MHz. It is also the place where I discovered AEMB.

Reply to
Aleksandar Kuktin

It's not so much an issue of terminology, but of technology. A comparator in an FPGA would use a carry chain. Yes, this results in a delay that increases linearly with data width, but in general the delay is so short that for any data size up to 64 bits it won't significantly impact the speed. So unless you are going for very large data paths, this is not a major factor in your CPU speed.
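For contrast with the LUT-only illustration earlier in the thread: written behaviourally, a comparator is a one-liner, and synthesis tools will typically map it onto the dedicated carry chain rather than a LUT tree (a sketch; the module name and parameter are mine):

```verilog
// A parameterized magnitude/equality comparator. Synthesis tools
// generally implement the relational operators using the FPGA's
// dedicated carry chain, so delay grows only slowly with WIDTH.
module cmp #(parameter WIDTH = 32) (
  input  [WIDTH-1:0] a,
  input  [WIDTH-1:0] b,
  output             lt,
  output             eq
);
  assign lt = (a < b);
  assign eq = (a == b);
endmodule
```

This is why hand-building comparators out of explicit LUT4s is rarely worth it: the carry chain keeps even a 64-bit compare fast, as noted above.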

All the bits in a register run in parallel, so length doesn't directly impact the speed. The only factor of register length that would impact speed is the length of the routing that connects the registers. You would need to look at the timing report to see what was causing your routing delays. Trying to analyze it a priori really isn't practical.

In any given design there can always be issues where a small change in design causes a huge change in results. This is due to the chaotic behavior of the tools when a design starts to push the density or speed of the device.

--

Rick C
Reply to
rickman
