Arm registers

sci.electronics.design Arm Cortex assembly code

Does the Arm Cortex support both the "thumb" 16 bit machine codes and also the 32 bit codes? I had been under the impression that Cortex devices only provided the thumb codes but a little time going through "Arm v7-M Architecture Reference Manual" indicates otherwise. Whether or not that reference manual is suitable for Cortex information needs verification also. There may be a more pertinent text available. And another question: the thumb instructions show 8 available registers rather than 16. However, somewhere on the internet a site states that 16 registers are available & which group of 8 varies with the "processor state". Any suggestions on an Arm manual or text that defines "processor state"? The CPS (change processor state) machine code description in the "Arm v7-M Architecture Reference Manual" doesn't mention register groups.

Hul

Reply to
Hul Tytus

Thumb instructions can be 16 or 32 bits wide. The 32-bit Thumb instructions, sometimes called Thumb-2 instructions, are different from the 32-bit instructions of the non-Thumb (ARM) instruction set, which may be the source of the confusion. They weren't available in the first Arm processors with Thumb, which had only the basic 16-bit Thumb instructions.
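For an assembler or disassembler, the two widths can be told apart from the first halfword alone. The sketch below is my own illustration, not taken from the thread: the function name is hypothetical, and it relies on the rule in the Arm v7-M Architecture Reference Manual that a first halfword whose top five bits are 0b11101, 0b11110 or 0b11111 begins a 32-bit instruction.

```python
def thumb_instruction_size(first_halfword):
    """Return the instruction length in bytes, given its first 16-bit halfword.

    Assumption: ARMv7-M style Thumb encoding, where the top five bits
    0b11101, 0b11110 and 0b11111 mark the first half of a 32-bit instruction.
    """
    if not 0 <= first_halfword <= 0xFFFF:
        raise ValueError("halfword must fit in 16 bits")
    top5 = (first_halfword >> 11) & 0x1F
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


if __name__ == "__main__":
    print(thumb_instruction_size(0x1888))  # a narrow instruction (adds r0, r1, r2)
    print(thumb_instruction_size(0xF241))  # first half of a wide instruction
```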

The Arm v7-M processor has 16 32-bit registers, but the top 3 are special purpose (R13 = SP, R14 = LR, R15 = PC), so only R0-R12 are really general purpose.

Most of the 16-bit Thumb instructions can only access R0-R7 (though some use the special registers implicitly). To get at the full set of registers you often need to use a 32-bit Thumb instruction.
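To make the register limit concrete, here is a minimal sketch of two 16-bit encodings as I understand them from the architecture manual; the function names are hypothetical and the bit layouts should be checked against the manual before being relied on. The narrow three-register ADDS has only 3-bit register fields, while the narrow MOV (register) is one of the few 16-bit forms with an extra bit that reaches R8-R15.

```python
def encode_adds_reg(rd, rn, rm):
    """16-bit ADDS Rd, Rn, Rm: 0001100 Rm(3) Rn(3) Rd(3). Low registers only."""
    for reg in (rd, rn, rm):
        if not 0 <= reg <= 7:
            raise ValueError("narrow ADDS can only name R0-R7")
    return 0x1800 | (rm << 6) | (rn << 3) | rd


def encode_mov_reg(rd, rm):
    """16-bit MOV Rd, Rm: 01000110 D Rm(4) Rd(3). D is bit 3 of Rd."""
    for reg in (rd, rm):
        if not 0 <= reg <= 15:
            raise ValueError("register number must be 0-15")
    return 0x4600 | ((rd >> 3) << 7) | (rm << 3) | (rd & 7)


if __name__ == "__main__":
    print(hex(encode_adds_reg(0, 1, 2)))  # adds r0, r1, r2
    print(hex(encode_mov_reg(8, 0)))      # mov r8, r0
```

An instruction such as a three-register add involving R8 has no narrow form here, so an assembler would have to fall back to a 32-bit encoding.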

Reply to
Richard Damon

Richard - you've verified that the v7-m processors use both the 16 bit and the 32 bit instructions and for that I thank you. I'm guessing now the Cortex processors, not the v7-m types, are what I should be looking at. Do you know the title of Arm's Architectural Reference Manual for the Cortex versions?

Hul


Reply to
Hul Tytus

One thing to note: there are multiple 'Cortex' lines, the Cortex-A, the Cortex-R, and the Cortex-M.

Which one you are looking at will change the -M in the manual name.

Arm v7 updated the Thumb instruction set from the Thumb-1 set, which was 16 bits only, to include the Thumb-2 instructions (using unused codes in the Thumb-1 set).

Reply to
Richard Damon

You seem to be mixing up a range of different things here. (That is not surprising - ARM has /seriously/ confusing terminology here, with similar names for different things, different names for similar things, different numbers for instruction sets, architectures, and implementations.)

The answers you got from Richard are all correct - I am just wording things a little differently, to see if that helps your understanding.

In the beginning, Arm used a 32-bit fixed-size instruction set. It was fast, but not very compact. Most instructions had three registers (so you have "Rx = Ry + Rz"), with 16 registers - that adds up to 12 bits of instruction just for the registers.

When they started getting serious in the microcontroller market, this extra space was a cost - it meant code was bigger than for other microcontrollers, and you needed a bigger and more expensive flash. So they invented the "Thumb" mode. Here, instructions are 16-bit only and so much more compact. These were a subset of the ARM instructions - instructions only cover two registers, and for many operations these could only come from 8 registers (so only 6 bits of the instruction were needed for the registers). The cpu decoder expanded the Thumb instructions into 32-bit ARM instructions before executing them.
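The register-field arithmetic behind those two figures can be checked with a few lines. This is only an illustration of the counting argument; the function name is made up for the example.

```python
def register_field_bits(operand_count, register_count):
    """Bits needed to name `operand_count` registers out of `register_count`."""
    if operand_count < 0 or register_count < 1:
        raise ValueError("need a non-negative operand count and at least one register")
    bits_per_register = (register_count - 1).bit_length()
    return operand_count * bits_per_register


if __name__ == "__main__":
    # Classic ARM: three operands, 16 registers each.
    print(register_field_bits(3, 16))
    # Original Thumb: two operands, 8 registers each.
    print(register_field_bits(2, 8))
```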

Thumb gave more compact code, but slower code, and there were some things you simply could not do in Thumb mode. So the microcontrollers had to support both instruction sets, and you would switch sets in code (interrupts, for example, would be in ARM mode for speed). It was all a bit of a mess.

So ARM invented "Thumb-2". This was a set of 32-bit instructions that can be mixed along with the slightly re-named "Thumb-1" 16-bit instructions. Sometimes it is now all called "Thumb2", or just "Thumb" (since for the vast majority of ARM programmers, 16-bit only thumb was ancient history from before they started using the devices). 16-bit and 32-bit instructions are called "narrow" and "wide" forms. These new 32-bit additions to thumb meant you could get a mixed code that was compact, fast, and covered all needs (including full access to all registers).

So for the Cortex-M devices, Thumb2 is the only set needed - they don't support "old-style" 32-bit Arm instructions. You no longer need to worry about changing states or modes, you are always in "thumb2" mode.

Different Cortex-M processors support different subsets and extensions of this, so it is important that you inform your compiler of exactly which device you are using. Do that, and you don't really need to care about the details for the most part - the compiler does the work. But it can be interesting to know what is going on.

There are also Cortex-A devices for applications processors - basically, running Linux (including Android) or, occasionally, Windows. And there are Cortex-R devices for safety-critical systems. But I expect you are talking about Cortex-M devices here.

Reply to
David Brown

Thanks again Richard. I will assume that the "Arm v7-M Architecture Reference Manual" is the most current. The only similarly titled text on Arm's web site was for a 64 bit version.

Hul


Reply to
Hul Tytus

David - I'm in the process of putting together an assembler for the Arm 32 bit processors. There is no all-knowing compiler involved so I need to know what machine codes are valid for a given processor. Any suggestions on where that information can be found?

Hul


Reply to
Hul Tytus

Why? There are perfectly good Arm assemblers available freely. Is this just for fun, or do you have a good reason behind it?

Arm use what they call "Unified Assembly Language" to let people write assembly in a single consistent format that can be used to generate object code in the various instruction formats. That means a single UAL instruction might lead to multiple object code instructions if necessary (for example, loading a register with a constant might need multiple instructions depending on the constant value and the instruction set).
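As a rough illustration of the constant-loading case, the sketch below plans the instructions for one source-level "load this 32-bit constant". It is not how any particular assembler does it: the function name is hypothetical, it assumes a core that has MOVW and MOVT (as Arm v7-M does), and it ignores the shorter encodings and the literal-pool alternative that a real assembler would also consider.

```python
def plan_constant_load(rd, value):
    """Return the list of instructions that load `value` into register `rd`.

    Assumption: MOVW (set low 16 bits, clear the rest) and MOVT (set high
    16 bits) are available on the target.
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("constant must fit in 32 bits")
    low = value & 0xFFFF
    high = value >> 16
    plan = [f"movw {rd}, #0x{low:04x}"]
    if high != 0:
        # The upper half is non-zero, so one instruction is not enough.
        plan.append(f"movt {rd}, #0x{high:04x}")
    return plan


if __name__ == "__main__":
    for line in plan_constant_load("r0", 0x12345678):
        print(line)
```

A constant that fits in 16 bits comes back as a single instruction, while anything larger needs the pair, which is the one-to-many mapping described above.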

Add to that, there are /lots/ of different subsets of the instructions that are available on different devices. It is a big effort writing an assembler here.

Reply to
David Brown

The ARM Architecture Reference Manuals give the detailed encoding of every instruction. You may need to look at the documentation for a given processor class to see which instructions are legal for that given processor, as that varies by machine class/sub-class.
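One way an assembler might start organising this is a table from core name to architecture profile, with the per-profile instruction lists hung off the profile rather than the core. The mapping below covers only a few common Cortex-M cores and reflects my understanding of Arm's published documentation; the names `CORTEX_M_ARCHITECTURE` and `architecture_for` are made up for this sketch, and optional features (FPU, DSP extensions) that vary per part are deliberately left out.

```python
# Core name -> architecture profile (assumed from Arm's public documentation).
CORTEX_M_ARCHITECTURE = {
    "cortex-m0": "ARMv6-M",
    "cortex-m0plus": "ARMv6-M",
    "cortex-m1": "ARMv6-M",
    "cortex-m3": "ARMv7-M",
    "cortex-m4": "ARMv7E-M",
    "cortex-m7": "ARMv7E-M",
    "cortex-m23": "ARMv8-M Baseline",
    "cortex-m33": "ARMv8-M Mainline",
}


def architecture_for(core):
    """Look up the architecture profile for a core name such as 'Cortex-M0+'."""
    key = core.strip().lower().replace("+", "plus")
    try:
        return CORTEX_M_ARCHITECTURE[key]
    except KeyError:
        raise ValueError(f"unknown core: {core!r}") from None


if __name__ == "__main__":
    print(architecture_for("Cortex-M3"))
    print(architecture_for("Cortex-M0+"))
```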

It will be a bit of work to assemble the ...

Reply to
Richard Damon

Why on earth?

What is wrong with the GNU arm-none-eabi-as?

The arm-none-eabi-gcc has switches for different processor architectures.

ARM has an instruction reference card for Thumb2, with different processors marked on the card.

--

-TV
Reply to
Tauno Voipio

Yes, I've noticed a slight difficulty so far. Once again, I do appreciate your aid along those lines.

Hul


Reply to
Hul Tytus

David: writing an assembler isn't really a "big effort" although the documentation of the Arm and the heaps of machine codes it has moves in that direction. As for "why?", there are several reasons and fun is one. Thanks for the direction regarding the Arm device.

Hul


Reply to
Hul Tytus

Tauno - the GNU copyright alone provides reason for not using it. I would expect anyone using it, or considering doing so, to any major extent, whether government sponsored or private, to look closely at it.

Hul


Reply to
Hul Tytus

Arm has a machine-readable architecture description that's amenable to formal proof - describes all of the instructions and where all the bits go. There are pathways to take this and generate various bits of toolchain, if you would rather not do it manually:

formatting link

Theo

Reply to
Theo

You probably want the ARMv8-M architecture reference, unless you're dealing with older hardware:

formatting link

Theo

Reply to
Theo

If you are thinking about microcontrollers, then the most relevant document is probably the ARM v8-M Architecture Reference Manual, which I found under the file name DDI0553A_e_armv8m_arm.pdf

It contains the list of instructions with their encodings. v8 is a reasonably recent (maybe the newest) version of the ARM architecture. Since ARM is mostly adding instructions, it probably covers all the instructions that you want to support. ARM defined several subsets; for example, the Cortex-M3 Devices Generic User Guide (file name DUI0552A_cortex_m3_dgug.pdf) describes the Cortex-M3 and, in particular, provides a list of valid instructions.

Collecting the information that you need looks like a large and tedious job: you need to check the instruction descriptions to know whether an instruction is valid for a given subset (a given mnemonic may be valid, but a specific combination of arguments may be invalid). I am not aware of any document containing the needed data in summarized form. You may try a shortcut: generate a "program" that contains all variants of all instructions and see which ones the GNU assembler accepts for a given subset (model). Of course, this way you will repeat any bugs in the GNU assembler's tables. OTOH the tables in the GNU assembler are probably debugged at least as well as the ARM documents...
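A minimal sketch of that shortcut, under some assumptions of mine: the assembler is installed as arm-none-eabi-as, it is probed one candidate line at a time (so a rejected line cannot hide an accepted one), and only three-register forms are generated. The helper names are hypothetical; the -mcpu= option and the .syntax unified and .thumb directives are standard GNU assembler features.

```python
import itertools
import subprocess
import tempfile
from pathlib import Path


def candidate_lines(mnemonics, registers):
    """Generate every 'mnemonic rd, rn, rm' combination as a source line."""
    return [
        f"{mnemonic} {rd}, {rn}, {rm}"
        for mnemonic in mnemonics
        for rd, rn, rm in itertools.product(registers, repeat=3)
    ]


def candidate_source(lines):
    """Wrap candidate lines in the directives for unified-syntax Thumb code."""
    return "\n".join([".syntax unified", ".thumb", *lines]) + "\n"


def assembler_accepts(line, cpu="cortex-m3", assembler="arm-none-eabi-as"):
    """Return True if the (assumed installed) GNU assembler accepts `line`."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "probe.s"
        source.write_text(candidate_source([line]))
        result = subprocess.run(
            [assembler, f"-mcpu={cpu}", "-o", str(Path(tmp) / "probe.o"), str(source)],
            capture_output=True,
            text=True,
        )
    return result.returncode == 0


if __name__ == "__main__":
    for line in candidate_lines(["adds", "add"], ["r0", "r8"]):
        print(line, assembler_accepts(line, cpu="cortex-m0"))
```

Running one assembler process per candidate is slow for a full sweep; it is done here only to keep the accept/reject result unambiguous.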

--
                              Waldek Hebisch
Reply to
antispam

Your code is just data passed through the tools, the GNU copyright does not limit it.

You'll be safe as long as you are not going to modify the tools. Even then you have to publish the modifications, not the code processed using the tools.

For libraries, there is the GNU LGPL, which covers such cases and permits using them.

--
-TV 


On 17.4.2021 22:38 PM, Hul Tytus wrote: 
Reply to
Tauno Voipio

I have always felt that "fun" is a perfectly good reason for any programming project - you don't need more than that. But there is no doubt that as assemblers go, making one for ARM is complicated by all its variations and the fact that, unlike most assembly languages, there is not a one-to-one relationship between assembly instructions and machine code instructions.

But have fun with it anyway!

Reply to
David Brown


The copyright ownership has no particular relevance here; it is the license that is important. And the license lets you use the tools completely freely for any purpose - commercial, governmental, private, or whatever. It only places restrictions on what you can do with the source code to the assembler - you can modify that source as much as you like, but if you give people a copy of a binary version of the assembler, you need to give them a copy of the modified source too. The GPL does not affect your ARM code in any way.

The GNU toolchain - assembler, linker, compiler collection - is far and away the most used development toolchain in the world, and covers the widest range of targets. If the copyright or license caused some restriction in how it could be used or who could use it, someone would have noticed by now.

And if you want to be able to take the source code for an ARM assembler, modify it, and pass on (or sell) the binaries while keeping your changes secret, then you could look to the LLVM project which also includes an assembler and supports ARM as a target, and is all under a BSD/MIT style license.

Reply to
David Brown

I'm aiming at the 32 bit devices and looking at the v7 text.

Hul


Reply to
Hul Tytus
