Lack of bit field instructions in the x86 instruction set because of patents?


Intel has tried to burn all records that there ever was a 432. It came out at about the same time as the 286. Its performance sucked: N 286s would always outperform N 432s, and do so on far less power.

It is unfortunate that we are stuck going down the x86 path but that is the one where everything seems to have gone.

Reply to
MooseFET

IIRC the i432 was actually a chipset comprising three different ICs, whereas the 286 was a single chip.

It is weird that virtual memory and memory protection have been available since the 286, yet neither OSes nor applications used them, because they go against the flat-memory programming paradigm.

Vladimir Vassilevsky DSP and Mixed Signal Design Consultant


Reply to
Vladimir Vassilevsky

Yeah, I was doing a bit too much hand waving there (and "100 times" is certainly too large). What I meant was that *latency* to memory can be upwards of 100 CPU cycles these days, when all the fancy caching and predictive-fetching algorithms haven't already moved the RAM data closer to the CPU core by the time it's needed. E.g., you can still write a program that makes a 3 GHz Pentium perform no faster than a 300 MHz Pentium by executing "worst case" memory access patterns... although I agree that in real-world applications that doesn't happen.
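A rough sketch of such a worst-case pattern is a pointer chase through a random permutation: every load address depends on the result of the previous load, so no prefetcher can help. (In pure Python the interpreter overhead hides most of the gap, but the dependent-load pattern itself is what produces the 3 GHz-acts-like-300 MHz effect in compiled code; the array size below is an arbitrary illustration.)

```python
import random
import time

N = 1 << 18  # enough elements to spill well beyond typical caches

# Sequential pass: addresses are predictable, so hardware prefetchers
# can stream the data in ahead of the loads.
data = list(range(N))
t0 = time.perf_counter()
total = sum(data)
t_seq = time.perf_counter() - t0

# Pointer chase: nxt[i] says where to go next along one big random
# cycle, so each load address depends on the previous load and the
# prefetcher gets no warning.  This is the "worst case" pattern.
order = list(range(N))
random.shuffle(order)
nxt = [0] * N
for a, b in zip(order, order[1:] + order[:1]):
    nxt[a] = b

t0 = time.perf_counter()
i, hops = order[0], 0
while True:
    i = nxt[i]
    hops += 1
    if i == order[0]:
        break
t_chase = time.perf_counter() - t0

print(total, hops, f"seq {t_seq:.4f}s  chase {t_chase:.4f}s")
```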

---Joel

Reply to
Joel Koltner

Segmentation was used quite extensively if you actually had a 286, e.g. XMS (DOS) or 286 mode (Windows 3.1). But the 8086 didn't have it, and the 386 offered a 32-bit flat address space, with memory protection available on pages, so everything used that rather than segmentation.

But the 286 had such a short reign. Initially, most applications retained 8086 compatibility, so any use of 286-specific features tended to be isolated. By the time the 8086 really started dying off as a viable platform, the 386 was out. So the 286 basically got leap-frogged.
Reply to
Nobody

Weren't there a year or two of 286-based PCs? The XT?

...Jim Thompson

--
| James E.Thompson, P.E.                           |    mens     |
| Analog Innovations, Inc.                         |     et      |
| Analog/Mixed-Signal ASICs and Discrete Systems    |    manus    |
| Phoenix, Arizona  85048    Skype: Contacts Only  |             |
| Voice:(480)460-2350  Fax: Available upon request |  Brass Rat  |
| E-mail Icon at http://www.analog-innovations.com |    1962     |
             
 I love to cook with wine     Sometimes I even put it in the food
Reply to
Jim Thompson

snip

ISTM that most of the arguments against the x86 come down to "aesthetics". While I agree that it is ugly, it has shown itself to be capable of very high performance and extensibility (new instructions, wider addressability, new addressing modes, etc.). You can do low power, though perhaps not as low as a different 64-bit architecture could, but I suspect the difference would be modest.

So other than aesthetics, what is wrong with x86?

--
  - Stephen Fuld
(e-mail address disguised to prevent spam)
Reply to
Stephen Fuld

snip

I don't know for how long, but there certainly were some, both by IBM and Compaq, and probably others.

No, the XT was the 8088 with a hard disk. The 286 based models were the AT.

--
  - Stephen Fuld
(e-mail address disguised to prevent spam)
Reply to
Stephen Fuld

There used to be XT 286 models as well: the same XT architecture, but using the 286 CPU.

Vladimir Vassilevsky DSP and Mixed Signal Design Consultant


Reply to
Vladimir Vassilevsky

The x86 is not too bad. Actually it has its roots in the i8080. I can only imagine if, say, the PIC16 architecture had been selected as the base for the PC. How would you like a dual-core 4 GHz PIC?

  1. The x86 instruction set, with its variable instruction length, is very inconvenient for pipelining and speculative execution. That results in the overcomplication of modern CPU hardware.
  2. The non-orthogonal register set makes code optimization an intractable problem.
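The variable-length problem in point 1 can be sketched with a toy model (the lengths below are made up, not real x86 encodings): with a fixed instruction width, the start of instruction k is computable independently, so a wide decoder can attack many instructions in parallel; with variable lengths, finding where instruction k even starts requires walking every earlier instruction.

```python
# Toy illustration: why variable-length encodings complicate wide decode.

def offset_fixed(k, width=4):
    """Byte offset of instruction k with a fixed instruction width.
    Computable in isolation -- decoders can work on many k in parallel."""
    return k * width

def offset_variable(lengths, k):
    """Byte offset of instruction k given variable lengths.
    Inherently sequential: each start depends on all previous lengths."""
    off = 0
    for length in lengths[:k]:
        off += length
    return off

lengths = [1, 3, 2, 6, 1, 4]        # hypothetical lengths in bytes
print(offset_fixed(3))              # 12
print(offset_variable(lengths, 3))  # 6 = 1 + 3 + 2
```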

Vladimir Vassilevsky DSP and Mixed Signal Design Consultant


Reply to
Vladimir Vassilevsky

Not at all: 16-bit x86 code has so many registers statically allocated that a compiler pretty much has all those decisions fixed up front.

I.e. with SI/DI for source/destination, CX as shift count/loop count, DX:AX as the accumulator, BX for any needed table lookup or indexing, and BP for stack frames, there's nothing left to do. :-)
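Those conventions are baked into the instruction set itself. A toy model of the 8086 `REP MOVSB` string copy shows it: SI is *the* source pointer, DI *the* destination, CX *the* count, and the instruction gives you no choice of registers.

```python
def rep_movsb(mem, si, di, cx):
    """Copy CX bytes from mem[si...] to mem[di...], like REP MOVSB with
    the direction flag clear (pointers increment).  The register roles
    are fixed by the instruction, not chosen by the compiler."""
    while cx:
        mem[di] = mem[si]
        si += 1
        di += 1
        cx -= 1
    return mem, si, di, cx  # CX ends at 0, SI/DI point past the copy

mem = bytearray(b"hello...........")
mem, si, di, cx = rep_movsb(mem, 0, 8, 5)
print(bytes(mem))  # b'hello...hello...'
```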

Terje

--
- 
"almost all programming can be viewed as an exercise in caching"
Reply to
Terje Mathisen

Unless you want to do excessive stuff like loops within loops, or accessing more than one data structure at a time.

Nobody needs that, right?

John

Reply to
John Larkin

Then you just go to PUSH / POP semantics for nesting.

Some might view x86 as short on registers (I don't anymore). It makes up for it with _blazing_ fast L1 cache and convenient [EBP+..] addressing for parms & locals.

-- Robert

Reply to
Robert Redelmeier


Right. Optimization is easy. ;-)

RISC has both (more registers and blazingly fast cache).

Reply to
krw

It's barbaric. Intel just applied massive amounts of cmos process technology and illegal business tactics to a stupid architecture.

Interestingly, all their attempts at better architectures have been expensive failures. i432, i960, Itanic, ARM.

John

Reply to
John Larkin

Obviously not.

More seriously, a few years ago the fastest possible code for doing vector math on an AMD CPU would process everything in three passes:

1) Load a cache-sized block of data by reading one byte from each cache line, using integer loads.

2) Process that block with fp operations, writing the results to a fixed (cache-resident) buffer.

3) Copy the temp results to the target array using MMX Non-Temporal moves, avoiding any cache pollution caused by the otherwise needed read-for-ownership memory accesses.

One of the (but not the most important) reasons this was so fast was that each individual loop would actually fit nicely within the 7-8 available registers!
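Python can't express cache-line touches or MMX non-temporal stores, but the skeleton of that three-pass scheme looks something like this (the block size and the fp operation here are made up for illustration):

```python
# Structural sketch of the three-pass scheme described above.  In the
# real code, pass 1 touched one byte per 64-byte cache line with integer
# loads, and pass 3 used non-temporal stores to bypass the cache.

BLOCK = 1024  # elements per "cache-sized" block (illustrative)

def process(src, dst):
    tmp = [0.0] * BLOCK                 # fixed, cache-resident scratch
    for base in range(0, len(src), BLOCK):
        n = min(BLOCK, len(src) - base)
        block = src[base:base + n]      # pass 1: pull the block in
        for i in range(n):              # pass 2: fp work into the buffer
            tmp[i] = block[i] * 2.0 + 1.0
        dst[base:base + n] = tmp[:n]    # pass 3: copy out (NT stores in asm)

src = [float(i) for i in range(3000)]
dst = [0.0] * 3000
process(src, dst)
print(dst[:3])  # [1.0, 3.0, 5.0]
```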

Terje

--
- 
"almost all programming can be viewed as an exercise in caching"
Reply to
Terje Mathisen

There is some pretty good microarchitecture in there too. Don't forget, Intel isn't the only one to apply some pretty impressive lipstick to the x86 pig; some have done it better than Intel.

Because they can.

Itanic was part of your "illegal business tactics" (and "stupid architecture" ;-).

Reply to
krw

That makes naive code generation easy, but it also makes optimisation really hard. Optimisation means using all of the registers, not just the "right" ones. Highly optimised code often uses -fomit-frame-pointer to allow EBP to be used as a general-purpose register. Needless to say, that makes accessing parameters and local variables rather ugly.

Reply to
Nobody

Aesthetics is the wrong term, as it implies something without any impact upon functionality.

The x86's architectural ugliness means that a great deal of inefficiency is involved in getting the current levels of performance. A RISC chip with comparable performance would require far less silicon and far less power.

Reply to
Nobody

Not much. ESP is then used as a frame pointer. The instructions get a couple of bytes longer and the offsets slightly less readable.

"Premature optimization is the root of all evil" [Knuth]. Also, optimization is not what it used to be. The cost of register spills has gone down while the cost of mispredicted branches has gone _way_ up. Processors have not gotten uniformly faster.

-- Robert

Reply to
Robert Redelmeier

...hence the reason you don't see traditional x86 CPUs in cell phones, PDAs, etc...

Reply to
Joel Koltner
