ISA vs. patent/trademark

Well, what you describe is pretty much how Alpha does it. MIPS does it differently.

Alpha calls this instruction ldq_u (u for unaligned).

Alpha uses three instructions for that. Two extq instructions for shifting and masking r1 and r2, and an or instruction to combine the results. Overall an unaligned load looks like this on Alpha:

    lda   at,0(t0)
    ldq_u t9,0(at)
    ldq_u t10,7(at)
    extql t9,at,t9
    extqh t10,at,t10
    or    t9,t10,t3

The lda (for computing the effective address) could be optimized away in nearly all cases, but that effort was apparently not expended by gas. It is interesting that the offset for the second ldq_u is 7, not 8 (and the extqh must match that). My guess is that this is done so that you do not get an exception when you use this sequence for loading the last word of a page with an aligned address.
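As a rough illustration of why the sequence works, including the aligned case with offset 7, here is a small C model. It is only a sketch: the function names mirror the Alpha mnemonics, but the byte-array memory, the little-endian assembly of the quadword, and the helper load_unaligned are assumptions made for the example, not anything taken from gas or the Alpha manuals verbatim.

```c
#include <stdint.h>

/* Model of ldq_u: ignore the low three address bits and load the
   aligned quadword containing addr (little-endian byte order).
   mem is a hypothetical byte array standing in for memory. */
static uint64_t ldq_u(const uint8_t *mem, uint64_t addr)
{
    uint64_t base = addr & ~(uint64_t)7;
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | mem[base + i];
    return v;
}

/* Model of extql: shift right by 8 * (addr mod 8) bits. */
static uint64_t extql(uint64_t v, uint64_t addr)
{
    return v >> (8 * (addr & 7));
}

/* Model of extqh: shift left by (64 - 8 * (addr mod 8)) mod 64 bits.
   For an aligned address the shift count is 0. */
static uint64_t extqh(uint64_t v, uint64_t addr)
{
    return v << ((64 - 8 * (addr & 7)) & 63);
}

/* The five-instruction sequence (without the lda). */
static uint64_t load_unaligned(const uint8_t *mem, uint64_t addr)
{
    uint64_t lo = ldq_u(mem, addr);      /* ldq_u t9,0(at)   */
    uint64_t hi = ldq_u(mem, addr + 7);  /* ldq_u t10,7(at)  */
    return extql(lo, addr) | extqh(hi, addr);
}
```

With an aligned address, addr+7 still lies in the same quadword, so both loads fetch the same data, both extracts shift by zero, and the or is harmless. With offset 8 the second load would touch the next quadword, which may be on the next page.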

Hmm, this requires two instructions, which are just used for this purpose AFAIK: extqh and extql (ldq_u is also used for byte loads etc. on the Alpha). How much longer would the sequence be if we allowed only one 2-in-1-out special-purpose instruction, or none (but slightly more general-purpose shift-and-mask-byte instructions)?

I can see how to do it with one less instruction with two special-purpose instructions: extqh does not need to set the low-order byte (this can be covered by extql in every case), so it could store the low-order bits of the address there. Then extql could be modified to take the result of extqh instead of the address, and perform the merge. The sequence would look like:

    lda      at,0(t0)       #can be optimized away
    ldq_u    t9,0(at)
    ldq_u    t10,7(at)
    extqhx   t10,at,t10
    extqlor  t9,t10,t3

This probably would have required additional muxes in the data path, though.
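One possible reading of the proposed extqhx/extqlor pair, modelled in C. This is a sketch under assumptions: the instructions are hypothetical, and the exact encoding chosen here (the low three address bits stored in the low-order byte of the extqhx result) is just one way to realize the idea. The ldq_u model and the byte-array memory are likewise made up for the example.

```c
#include <stdint.h>

/* Model of ldq_u: load the aligned quadword containing addr
   (little-endian); mem is a hypothetical byte array. */
static uint64_t ldq_u(const uint8_t *mem, uint64_t addr)
{
    uint64_t base = addr & ~(uint64_t)7;
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | mem[base + i];
    return v;
}

/* Hypothetical extqhx: like extqh, but the low-order byte of the
   result carries the byte offset (addr mod 8) instead of data. */
static uint64_t extqhx(uint64_t v, uint64_t addr)
{
    uint64_t k = addr & 7;
    uint64_t hi = v << ((64 - 8 * k) & 63);
    return (hi & ~(uint64_t)0xFF) | k;
}

/* Hypothetical extqlor: take the offset from the low byte of x
   (the extqhx result), extract the low part from lo, and merge
   it with the high part already present in x. */
static uint64_t extqlor(uint64_t lo, uint64_t x)
{
    uint64_t k = x & 7;
    return (lo >> (8 * k)) | (x & ~(uint64_t)0xFF);
}

/* The shortened sequence: two loads, two special instructions. */
static uint64_t load_unaligned_short(const uint8_t *mem, uint64_t addr)
{
    uint64_t lo = ldq_u(mem, addr);
    uint64_t hi = ldq_u(mem, addr + 7);
    return extqlor(lo, extqhx(hi, addr));
}
```

In this model the low-order byte of the final result always comes from the extql part, which is why extqhx is free to reuse that byte for the offset.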

Followups set to comp.arch.

- anton

--
M. Anton Ertl                    Some things have to be seen to be believed
anton@mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html

ARM has an "interesting" way of handling unaligned word addresses: If the address is not word aligned, it uses the rounded-down address to load a word but then rotates the word such that the byte at the unaligned address is the LSB of the resulting word (this is for little-endian mode). The behaviour is probably a side-effect of the byte load instruction (which ANDs with 0xFF after the rotate). I once wrote a fast string copier that exploited this behaviour, but I don't think it makes unaligned word access any faster.
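The rotate-on-unaligned-load behaviour described here can be modelled in a few lines of C. This is a sketch only: the function names and the byte-array memory are assumptions for illustration, and it models little-endian mode as in the description.

```c
#include <stdint.h>

/* Model of the word load described above: fetch the word at the
   rounded-down address, then rotate right so that the byte at the
   unaligned address ends up in the least significant byte.
   mem is a hypothetical byte array standing in for memory. */
static uint32_t arm_ldr_rotated(const uint8_t *mem, uint32_t addr)
{
    uint32_t base = addr & ~3u;
    uint32_t w = (uint32_t)mem[base]
               | (uint32_t)mem[base + 1] << 8
               | (uint32_t)mem[base + 2] << 16
               | (uint32_t)mem[base + 3] << 24;
    unsigned rot = 8 * (addr & 3);
    /* Avoid a shift by 32, which is undefined in C. */
    return rot ? (w >> rot) | (w << (32 - rot)) : w;
}

/* A byte load then falls out as the rotated word ANDed with 0xFF. */
static uint8_t arm_ldrb(const uint8_t *mem, uint32_t addr)
{
    return (uint8_t)(arm_ldr_rotated(mem, addr) & 0xFF);
}
```

Note that the rotated word contains only bytes from the one aligned word, so by itself it does not give the four bytes starting at the unaligned address.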

Torben

Torben Ægidius Mogensen

Unless it's a TAD.

Or an A.

Although TAD (for Two's Complement Add) is hallowed by its appearance on the PDP-8, that isn't really a major alternative.

But there are two basic schools of thought on assembler mnemonics.

One is the IBM 704 school of thought, where every mnemonic is exactly three letters long.

The other is the IBM 360 school of thought, where the mnemonics are as short as possible to be distinct.

Of course, today's assemblers tend to draw inspiration from another computer, the PDP-11, and use various symbols preceding operands to indicate addressing modes.

John Savard

jsavard

Of course you can patent something new that eliminates the need for something old!

The patent is on the two instructions, and their use in eliminating the need for the usual hardware that would support unaligned access.

The use of the two instructions in question seemed like something that wasn't obvious when I first saw them. I think it passes the obviousness test.

However, the patent office has a much lower bar for obviousness than you or I would have. Even though I think this patent was reasonable, they certainly grant many others that I don't think they should.

Eric

Eric Smith


This certainly does make sense. There definitely is prior art for computers that do not support unaligned access to memory; the System/360 comes to mind.

And they can handle unaligned operands, but it takes at least four instructions:

A 5,ALIGNED

becomes, say

    LH  6,UALIGNED
    SLL 6,16
    IH  6,UALIGNED+2
    A   5,6

or even

    LC  6,UALIGNED
    SLL 6,24
    L   7,UALIGNED+1
    SR  7,8
    N   7,#X'00FFFFFF'
    O   6,7
    A   5,6

so if the MIPS speeds things up by having a "fetch left half of n bytes" followed by "fetch right half of N-n bytes, then perform the operation" instructions (as a RISC chip, it might not have the 'perform the operation' part) it has indeed done something new.
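The shift-and-merge idea behind the two S/360-style sequences above can be sketched in C as follows. This is an illustration under assumptions: memory is modelled as a big-endian byte array, all function names are invented for the example, and values are treated as unsigned, so the sketch sidesteps any sign-extension behaviour of the real instructions.

```c
#include <stdint.h>

/* Aligned big-endian word load; addr is assumed to be a multiple
   of 4. mem is a hypothetical byte array standing in for storage. */
static uint32_t load_word_be(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr]     << 24
         | (uint32_t)mem[addr + 1] << 16
         | (uint32_t)mem[addr + 2] << 8
         | (uint32_t)mem[addr + 3];
}

/* First sequence: the operand is halfword-aligned (addr mod 4 == 2).
   Load the upper halfword, shift it left 16, insert the lower one. */
static uint32_t load_two_halfwords(const uint8_t *mem, uint32_t addr)
{
    uint32_t hi = ((uint32_t)mem[addr] << 8 | mem[addr + 1]) << 16;
    uint32_t lo = (uint32_t)mem[addr + 2] << 8 | mem[addr + 3];
    return hi | lo;
}

/* Second sequence: the operand starts one byte before a word
   boundary (addr mod 4 == 3). Take one byte, shift it left 24,
   then merge in the top three bytes of the following word. */
static uint32_t load_byte_then_word(const uint8_t *mem, uint32_t addr)
{
    uint32_t hi = (uint32_t)mem[addr] << 24;
    uint32_t lo = load_word_be(mem, addr + 1) >> 8;
    lo &= 0x00FFFFFFu;  /* mask, as in the N instruction */
    return hi | lo;
}
```

Each helper only covers one alignment case, which is part of why the general unaligned case costs so many instructions without dedicated support.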

John Savard

jsavard

ICM was introduced with 370 ... insert character under mask

    ICM 6,B'1111',UNALIGNED
    ar  5,6

problem with LH was that it was arithmetic (not logical) and propagated the sign bit (and it required half-word alignment).
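For readers unfamiliar with ICM, here is a small C model of its effect on a 32-bit register. It is a sketch: the function name icm and the pointer-based storage operand are assumptions for illustration, and condition-code setting is not modelled.

```c
#include <stdint.h>

/* Model of insert characters under mask: for each of the four mask
   bits, left to right, a set bit means the next consecutive byte
   from storage replaces the corresponding byte of the register.
   Bytes whose mask bit is clear are left unchanged. */
static uint32_t icm(uint32_t reg, unsigned mask, const uint8_t *src)
{
    for (int i = 0; i < 4; i++) {
        if (mask & (8u >> i)) {
            int shift = 8 * (3 - i);
            reg = (reg & ~(0xFFu << shift)) | ((uint32_t)*src++ << shift);
        }
    }
    return reg;
}
```

With mask B'1111' all four register bytes are filled from four consecutive storage bytes, with no alignment requirement on the address, which is what makes the two-instruction sequence above work.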

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/


OK, but both preceded the ARM chip, and code compression for RISC is therefore not a new thing discovered by ARM Ltd. They base their Thumb patent on the claim that architectures have only been developed to increase performance, not to reduce code size, and that they discovered the need for code space reduction for RISC. I believe that the width of datapaths has been driven mostly by the need to increase address space. If you do not accept the ARM claim, then the Thumb patent becomes really weak.

--
Best Regards
Ulf Samuelsson                ulf@atmel.com
Atmel Nordic AB
Mail:  Box 2033, 174 02 Sundbyberg, Sweden
Visit:  Kavallerivägen 24, 174 58 Sundbyberg, Sweden
Phone +46 (8) 441 54 22     Fax +46 (8) 441 54 29
GSM    +46 (706) 22 44 57
