Intel 386 et al.

- Hul Tytus
posted
4 years ago

Wed, Sep 11, 2019 9:06 PM

Anyone know of a good handbook describing the machine code of Intel's 386's? A page describing each instruction, along with some text about the various sequences caused by interrupts, descriptions of the various tables, "descriptors" and special-purpose registers, and the effects of "protected mode", is the hope. An older text that didn't describe the 64-bit versions would be OK, maybe preferable. I've seen Hummel's book, but it's more of a lengthy volume than a handbook. Any others that aren't so windy?

Hul

- Paul Rubin
posted
4 years ago

Wed, Sep 11, 2019 9:27 PM

I don't know if it's exactly the format you wanted, but I liked "Programming the 80386" by John Crawford and Patrick Gelsinger, who were involved with the 386's design. It did a good job of explaining how memory mapping, the protected mode segment registers, call gates for crossing privilege domains etc. all worked. I still don't understand why today's OS's don't use those features. They would also allow application programs to be set up like miniature OS's with protected memory regions, for things like in-memory databases.

- Herbert Kleebauer
posted
4 years ago

Thu, Sep 12, 2019 7:14 AM

formatting link

- David Brown
posted
4 years ago

Thu, Sep 12, 2019 8:37 AM

I presume this is for some sort of history project?

It is a /long/ time since I have read details of the 386 - you are talking about a processor that was outdated over 25 years ago.

However, if my memory and understanding are correct, many of these advanced protection features were overly complex and extremely slow.

In the days of the 386, there were four main classes of operating systems for it. One was DOS - still popular. Since MS at the time had close to zero concern for security or reliability, it used very little of the protection features or memory mapping abilities. It did not even use 32-bit modes very much (32-bit DOS extenders were made by third parties). Then there was early Windows. Again, security and protection were not a concern for MS, though they used a couple of features to get multi-tasking of DOS programs. Then there were *nix type systems. These simply did not need the call gates and other bits and pieces for security - all they needed was a distinction between user mode and kernel mode, and a way to switch between them. And they didn't need any "virtual" modes or other complications that the 386 provided to let you use old binaries on newer protected systems - you just compiled your *nix code anew for the new system.

The complexities of the call gates and other features of the 386 were concepts from a bygone era by the time the 386 came out, and were never of use in the kinds of systems that ran on the 386. And they have no use now either - modern protection rings and hardware virtualisation are massively more efficient, as well as being simpler and more flexible.

- Stefan Reuther
posted
4 years ago

Thu, Sep 12, 2019 4:42 PM

On 11.09.2019 at 23:06, Hul Tytus wrote:

If you're really looking for just the 386 and not the gazillions of extensions to the Intel 32-bit architecture: in the 90's, we were passing around a file called "386intel.txt", titled "INTEL 80386 PROGRAMMER'S REFERENCE MANUAL 1986", which explains pretty much everything about the 386.

A short search turns up this link for a copy:

formatting link

Set your text encoding to cp437, like in the old days.
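For anyone reading such a file with a script rather than a viewer, the same encoding advice can be applied in Python, whose standard codecs include "cp437". This is only a minimal sketch: the helper names `decode_cp437` and `read_cp437_text` are made up for illustration, and the file path is whatever local copy you happen to have.

```python
from pathlib import Path


def decode_cp437(raw: bytes) -> str:
    """Decode bytes stored in IBM PC code page 437 into a Python string."""
    return raw.decode("cp437")


def read_cp437_text(path: str) -> str:
    """Read a whole text file as code page 437 (path is caller-supplied)."""
    return Path(path).read_text(encoding="cp437")


# Plain ASCII is unchanged; bytes above 0x7F map to the old DOS
# line-drawing and accented characters instead of being misread as
# Latin-1 or UTF-8.
box_corner_bytes = bytes([0xC9, 0xCD, 0xBB])
box_corner_text = decode_cp437(box_corner_bytes)
```

Decoding with the wrong code page would not fail, it would just turn table borders and diagrams into unrelated accented letters, which is why the encoding matters for a manual full of box drawings.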

Stefan

- Hul Tytus
posted
4 years ago

Thu, Sep 12, 2019 8:28 PM

Thanks Paul, I'll take a look at it.

Hul

- Hul Tytus
posted
4 years ago

Thu, Sep 12, 2019 8:33 PM

Thanks Stefan, that should be easy to look at.

Hul

- Mat Nieuwenhoven
posted
4 years ago

Fri, Sep 13, 2019 9:37 AM

32-bit OS/2 (and descendants like eCS or ArcaOS) definitely use the call gates. This caused problems with some virtual machines which did not expect that. I don't know anything about the speed differences between call gates and more modern mechanisms. I don't think there were many compilers that supported segments in 32-bit mode (maybe Watcom), so 32-bit is always flat mode. Segments gave you some protection too.

Mat Nieuwenhoven

- David Brown
posted
4 years ago

Fri, Sep 13, 2019 10:31 AM

OS/2 was written by IBM (at least, those low-level bits were done by IBM; some of the other bits were done by MS). The IBM people came from a background with bigger systems - mainframes - and liked this kind of powerful hardware feature. So it doesn't surprise me that they used it.

I think that is correct, in the days of the 386 at least. But I don't know details there. My PC programming at that time was targeting 16-bit Windows or DOS (although I used OS/2 as the OS).

- George Neuner
posted
4 years ago

Mon, Sep 16, 2019 6:57 PM

That's true ... validating segment descriptors and segment limits when loading the segment selector took > 1000 cycles. This was so onerous that the i486 and later included a small cache of validated descriptors. But the cache never was large enough to help programs that needed to use many segments - IIRC, it held only 6 entries - and as time went on it shrank to just 2 entries.

Another problem with segments was there were too few of them available: 8K local (per process) segments is not really enough for fine grain object protection, and 8K global segments is not a whole lot when you consider all the uses the operating system might have.
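The 8K figure follows from the selector layout: a 16-bit selector carries a 13-bit table index, one table-indicator bit (global vs. local table) and a 2-bit requested privilege level. A small sketch of that split, with `decode_selector` and `MAX_DESCRIPTORS_PER_TABLE` as illustrative names rather than anything from the thread:

```python
# Selector layout: bits 15..3 = index, bit 2 = table indicator, bits 1..0 = RPL.
INDEX_BITS = 13
MAX_DESCRIPTORS_PER_TABLE = 1 << INDEX_BITS  # 8192, the "8K" limit


def decode_selector(selector: int) -> dict:
    """Split a 16-bit protected-mode selector into its three fields."""
    selector &= 0xFFFF
    return {
        "index": selector >> 3,                       # entry within the table
        "table": "LDT" if selector & 0x4 else "GDT",  # local or global table
        "rpl": selector & 0x3,                        # requested privilege level
    }


fields = decode_selector(0x002B)
```

With only 13 index bits per table, there is no way to name more than 8192 global and 8192 local descriptors, however much memory the descriptor tables could otherwise occupy.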

But the biggest problem was that the segment selector was a visible component of addressing. This may have been acceptable on the 8086, but segments there were purely for addressing and had no protection dimension. It became a debate issue starting with the i286, but that chip had so many other issues that segment visibility was lost in the noise.

When it was announced that the 80386 would include transparent paging, many people hoped that its segmentation behavior would be rethought. Segment advocates hoped that the conflation of segment selection with addressing would be abandoned, that segments would become protection domains only, that segments would be able to be defined and used more dynamically, and that using them would be made much faster.

[I'm not taking any positions on this, I'm just recalling things I saw in media and in Usenet discussions at the time.]

Agreed, the i286/i386 ring mechanism was complicated to use, but that was because the protected segment mechanism itself was complicated to use. And again, with only 8K global segments, there weren't enough segments available to protect OS services using call gates unless you seriously restricted the number of service entry points [there were a few OSes that did].

YMMV, George

- Robert Wessel
posted
4 years ago

Tue, Sep 17, 2019 6:01 AM

Descriptor loading was slow, but I'm pretty sure it wasn't *that* slow. It's hard to imagine what it could be doing for that long. I remember times more like 40 clocks on a 286.

- upsidedown
posted
4 years ago

Tue, Sep 17, 2019 6:40 AM

Sounds more like the iAPX 432, in which some instructions were really slow.

- George Neuner
posted
4 years ago

Tue, Sep 17, 2019 9:51 PM

Protected mode segment switching on the i286 was fairly quick, but the i386 (and later) behaved very differently.

Loading a segment register in protected mode could result in a long sequence if the descriptor was not already in the descriptor cache:

- read the descriptor from memory into the cache
- validate the descriptor contents

And on the i386 and later:

- set the "Accessed" bit in the descriptor
- write back the modified descriptor to memory

The i386 additionally performed an unnecessary limit check on the current offset value, but did not throw any faults when the check was done as part of the descriptor load - it just wasted additional cycles.

Validating the descriptor was done in microcode and could take hundreds of cycles. The i286 did not define or check many of the descriptor's control bits, whereas the i386 and later defined and checked all the bits.

The i286 did /not/ modify and write back the descriptor - the "Accessed" bit was defined for the i286 but was not set by the hardware [if the OS used it, it had to deal with it manually]. The i386 and later automatically set the bit on load and wrote back the descriptor to memory. [The i486 and later had data caches to absorb the write, but the i386 did not.]
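As a rough illustration of the read / validate / set "Accessed" / write back sequence described above, here is a toy model of a 386-style load of a code or data segment descriptor. It is a sketch under stated assumptions, not a model of the real microcode: `Descriptor`, `unpack_descriptor` and `load_segment` are hypothetical names, only the Present bit is validated (the hardware checks far more), Python exceptions stand in for processor faults, and nothing about timing is modeled.

```python
import struct
from dataclasses import dataclass

PRESENT = 0x80   # access byte, bit 7
ACCESSED = 0x01  # access byte, bit 0 (code/data segment descriptors)


@dataclass
class Descriptor:
    base: int    # 32-bit segment base address
    limit: int   # 20-bit segment limit
    access: int  # access byte: P, DPL, S, type
    flags: int   # upper nibble of byte 6: G, D/B, reserved, AVL


def unpack_descriptor(raw: bytes) -> Descriptor:
    """Reassemble the scattered base and limit fields of an 8-byte descriptor."""
    limit_lo, base_lo, base_mid, access, flags_limit, base_hi = struct.unpack(
        "<HHBBBB", raw
    )
    base = base_lo | (base_mid << 16) | (base_hi << 24)
    limit = limit_lo | ((flags_limit & 0x0F) << 16)
    return Descriptor(base, limit, access, flags_limit >> 4)


def load_segment(table: bytearray, selector: int) -> Descriptor:
    """Toy segment-register load: read, validate, mark accessed, write back."""
    offset = (selector >> 3) * 8           # selector index * descriptor size
    if offset + 8 > len(table):
        raise ValueError("selector index is beyond the table")
    desc = unpack_descriptor(bytes(table[offset:offset + 8]))
    if not desc.access & PRESENT:          # simplified validation step
        raise ValueError("segment not present")
    table[offset + 5] |= ACCESSED          # write back to the in-memory table
    desc.access |= ACCESSED
    return desc
```

The write in `load_segment` is the step that distinguishes the i386 behaviour from the i286 behaviour described here: the descriptor table in memory is modified as a side effect of what looks like a simple register load.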

The i286 and i386 had only one descriptor cache line per segment register, so they took the full descriptor load hit every time the register was modified. The caches became multi-way in later chips so as to (try to) keep already-validated descriptors available in case they were needed again.

I know that the slowness of protected mode segment switching has been discussed at length in the past - if not here, then in the arch or x86 forums. Unfortunately I can't easily locate an online reference for segment switch times. I figured Agner Fog would have something, but he doesn't seem to have benchmarked the system instructions. [Or if he has, I stupidly can't seem to find his results].

George

- Robert Wessel
posted
4 years ago

Wed, Sep 18, 2019 2:43 AM

The Intel 386 reference (copy available on Bitsavers) says 18 or 19 clocks for a MOV to a segment register. As usual, that would be quite optimistic and based on zero memory wait states, so real times would be rather longer than that.

The Intel 486 manual says 9 clocks.

Perhaps you're thinking of a task switch via a task gate, which could definitely be multiple hundreds of clocks.