Does the Arm Cortex support both the "thumb" 16 bit machine codes and also the 32 bit codes? I had been under the impression that Cortex devices only provided the thumb codes but a little time going through "Arm v7-M Architecture Reference Manual" indicates otherwise. Whether or not that reference manual is suitable for Cortex information needs verification also. There may be a more pertinent text available. And another question: the thumb instructions show 8 available registers rather than 16. However, somewhere on the internet a site states that 16 registers are available & which group of 8 varies with the "processor state". Any suggestions on an Arm manual or text that defines "processor state"? The CPS (change processor state) machine code description in the "Arm v7-M Architecture Reference Manual" doesn't mention register groups.
Thumb instructions can be 16 or 32 bits wide. The 32-bit Thumb instructions are different from the non-Thumb 32-bit ARM instructions, which may be the source of the confusion. They are sometimes called Thumb-2 instructions, as they weren't available in the first Arm processors with Thumb, which had only the basic 16-bit Thumb instructions.
The Arm v7-M processor has 16 32-bit registers, but the top 3 are special purpose (R13 = SP, R14 = LR, R15 = PC), so only R0-R12 are really general purpose.
Most of the 16-bit Thumb instructions can only access R0-R7 (though some use the special registers implicitly); to get at the full set of registers you often need to use a 32-bit Thumb instruction.
Richard - you've verified that the v7-m processors use both the 16 bit and the 32 bit instructions and for that I thank you. I'm guessing now the Cortex processors, not the v7-m types, are what I should be looking at. Do you know the title of Arm's Architectural Reference Manual for the Cortex versions?
You seem to be mixing up a range of different things here. (That is not surprising - ARM has /seriously/ confusing terminology here, with similar names for different things, different names for similar things, different numbers for instruction sets, architectures, and implementations.)
The answers you got from Richard are all correct - I am just wording things a little differently, to see if that helps you understand.
In the beginning, Arm used a 32-bit fixed-size instruction set. It was fast, but not very compact. Most instructions had three registers (so you have "Rx = Ry + Rz"), with 16 registers - that adds up to 12 bits of instruction just for the registers.
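To make the "12 bits just for the registers" point concrete, here is a minimal sketch that packs the classic 32-bit ARM encoding of `ADD Rd, Rn, Rm` (register operand, no shift, flags untouched, condition "always"). The function name is made up for illustration; the field layout should be checked against the Architecture Reference Manual before relying on it. Note that this is the old-style ARM encoding, which Cortex-M devices do not execute.

```python
def encode_arm_add_reg(rd: int, rn: int, rm: int, cond: int = 0xE) -> int:
    """Pack ADD Rd, Rn, Rm in the classic 32-bit ARM data-processing format.

    Hypothetical helper for illustration only. Three 4-bit register
    fields (Rn, Rd, Rm) take up 12 of the 32 bits.
    """
    for reg in (rd, rn, rm):
        if not 0 <= reg <= 15:
            raise ValueError("register number must be 0..15")
    add_opcode = 0b0100                 # data-processing opcode for ADD
    return (cond << 28) | (add_opcode << 21) | (rn << 16) | (rd << 12) | rm


# ADD r0, r1, r2
print(hex(encode_arm_add_reg(0, 1, 2)))  # 0xe0810002
```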
When they started getting serious in the microcontroller market, this extra space was a cost - it meant code was bigger than for other microcontrollers, and you needed a bigger and more expensive flash. So they invented the "Thumb" mode. Here, instructions are 16-bit only and so much more compact. These were a subset of the ARM instructions - instructions only cover two registers, and for many operations these could only come from 8 registers (so they only needed 6 bits of the instruction). The cpu decoder expanded the Thumb instructions into 32-bit ARM instructions before executing them.
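As a rough illustration of the two-register, 6-bit layout, the sketch below packs a few of the 16-bit Thumb data-processing instructions of the form `OP Rdn, Rm`. The helper name and the small opcode table are my own choices for the example (only three of the sixteen opcodes are listed); verify the encodings against the reference manual.

```python
# A few opcodes of the 16-bit Thumb "data processing" group (illustrative subset).
THUMB_DP_OPCODES = {"ands": 0b0000, "eors": 0b0001, "orrs": 0b1100}


def encode_thumb_dp(mnemonic: str, rdn: int, rm: int) -> int:
    """Pack a narrow Thumb instruction `mnemonic Rdn, Rm`.

    Hypothetical helper. Layout: 010000 | opcode(4) | Rm(3) | Rdn(3),
    so the two registers use only 6 bits and must be R0-R7.
    """
    for reg in (rdn, rm):
        if not 0 <= reg <= 7:
            raise ValueError("narrow encoding only reaches R0-R7")
    opcode = THUMB_DP_OPCODES[mnemonic]
    return (0b010000 << 10) | (opcode << 6) | (rm << 3) | rdn


print(hex(encode_thumb_dp("ands", 0, 1)))  # 0x4008
print(hex(encode_thumb_dp("orrs", 3, 4)))  # 0x4323
```

Asking for R8 or above raises an error here, which mirrors why the full register set usually needs a 32-bit encoding.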
Thumb gave more compact code, but slower code, and there were some things you simply could not do in Thumb mode. So the microcontrollers had to support both instruction sets, and you would switch sets in code (interrupts, for example, would be in ARM mode for speed). It was all a bit of a mess.
So ARM invented "Thumb-2". This was a set of 32-bit instructions that can be mixed along with the slightly re-named "Thumb-1" 16-bit instructions. Sometimes it is now all called "Thumb2", or just "Thumb" (since for the vast majority of ARM programmers, 16-bit only thumb was ancient history from before they started using the devices). 16-bit and 32-bit instructions are called "narrow" and "wide" forms. These new 32-bit additions to thumb meant you could get a mixed code that was compact, fast, and covered all needs (including full access to all registers).
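Because narrow and wide instructions are mixed in one stream, a decoder has to tell them apart from the first halfword. As far as I know, the rule in the ARMv7-M manual is that a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 starts a 32-bit instruction, and anything else is a complete 16-bit instruction. A minimal sketch (the function name is made up):

```python
def thumb_instruction_size(first_halfword: int) -> int:
    """Return the instruction length in bytes (2 or 4) from its first halfword.

    Hypothetical helper: checks bits [15:11] against the three
    prefixes that mark a wide (32-bit) Thumb instruction.
    """
    top5 = (first_halfword >> 11) & 0x1F
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


print(thumb_instruction_size(0x4008))  # narrow "ands r0, r1" -> 2
print(thumb_instruction_size(0xF04F))  # first half of a wide instruction -> 4
```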
So for the Cortex-M devices, Thumb2 is the only set needed - they don't support "old-style" 32-bit Arm instructions. You no longer need to worry about changing states or modes, you are always in "thumb2" mode.
Different Cortex-M processors support different subsets and extensions of this, so it is important that you inform your compiler of exactly which device you are using. Do that, and you don't really need to care about the details for the most part - the compiler does the work. But it can be interesting to know what is going on.
There are also Cortex-A devices for applications processors - basically, running Linux (including Android) or, occasionally, Windows. And there are Cortex-R devices for safety-critical systems. But I expect you are talking about Cortex-M devices here.
Thanks again Richard. I will assume that the "Arm v7-m Architecture Reference Manual" is most current. The only similarly titled text on Arm's web site was for a 64 bit version.
Hul
Richard Dam> I was looking in the Arm v7-M Architecture Reference Manual.
David - I'm in the process of putting together an assembler for the Arm 32 bit processors. There is no all-knowing compiler involved so I need to know what machine codes are valid for a given processor. Any suggestions on where that information can be found?
Why? There are perfectly good Arm assemblers available freely. Is this just for fun, or do you have a good reason behind it?
Arm use what they call "Unified Assembly Language" to let people write assembly in a single consistent format that can be used to generate object code in the various instruction formats. That means a single UAL instruction might lead to multiple object code instructions if necessary (for example, loading a register with a constant might need multiple instructions depending on the constant value and the instruction set).
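One place an assembler has to make that decision is the Thumb-2 "modified immediate" form used by wide instructions such as `MOV.W`. Only certain 32-bit constants fit it: a plain byte, a byte repeated as 0x00XY00XY, 0xXY00XY00 or 0xXYXYXYXY, or an 8-bit value with its top bit set rotated into position. The sketch below is my reading of that rule, not a verified implementation, and the function name is invented. A constant that fails the check needs another strategy (for example a 16-bit move immediate, an inverted move, a literal load, or a two-instruction sequence).

```python
def is_thumb2_modified_immediate(value: int) -> bool:
    """Return True if `value` fits the Thumb-2 modified-immediate form.

    Hypothetical helper based on my understanding of the ARMv7-M
    encoding rules; check against the reference manual.
    """
    value &= 0xFFFFFFFF
    if value <= 0xFF:                        # 0x000000XY
        return True
    low = value & 0xFF
    if value == (low << 16) | low:           # 0x00XY00XY
        return True
    high = (value >> 8) & 0xFF
    if value == (high << 24) | (high << 8):  # 0xXY00XY00
        return True
    if value == low * 0x01010101:            # 0xXYXYXYXY
        return True
    # An 8-bit value with bit 7 set, rotated right by 8..31, which is
    # the same as that byte shifted left by 1..24 with no wrap-around.
    for shift in range(1, 25):
        byte = value >> shift
        if 0x80 <= byte <= 0xFF and (byte << shift) == value:
            return True
    return False


print(is_thumb2_modified_immediate(0xFF000000))  # True
print(is_thumb2_modified_immediate(0x12345678))  # False
```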
Add to that there are /lots/ of different subsets of the instructions that are available on different devices. It is a big effort writing an assembler here.
The ARM Architecture Reference Manuals give the detailed encoding of every instruction. You may need to look at the documentation for a given processor class to see which instructions are legal for that given processor, as that varies by machine class/sub-class.
It will be a bit of work to assemble the ...
David: writing an assembler isn't really a "big effort" although the documentation of the Arm and the heaps of machine codes it has moves in that direction. As for "why?", there are several reasons and fun is one. Thanks for the direction regarding the Arm device.
Tauno - just the GNU copyright provides reason for not using it. I would expect anyone using it, or considering same, to a major extent, whether government sponsored or private, would look closely.
Arm has a machine-readable architecture description that's amenable to formal proof - describes all of the instructions and where all the bits go. There are pathways to take this and generate various bits of toolchain, if you would rather not do it manually:
If you are thinking about microcontrollers, then the most relevant document is probably the ARM v8-M Architecture Reference Manual, which I found under the file name DDI0553A_e_armv8m_arm.pdf
It contains a list of instructions with their encodings. v8 is a reasonably recent (maybe the newest) version of the ARM architecture. Since ARM is mostly adding instructions, it probably covers all the instructions that you want to support. ARM defined several subsets; for example, the Cortex-M3 Devices Generic User Guide (file name DUI0552A_cortex_m3_dgug.pdf) describes the Cortex-M3 and, in particular, provides a list of valid instructions.
Collecting the information that you need looks like a large and tedious job: you need to check the instruction descriptions to know if an instruction is valid for a given subset (a given mnemonic may be valid, but a specific combination of arguments may be invalid). I am not aware of any documents containing the needed data in synthetic form. You may try a shortcut: generate a "program" that contains all variants of all instructions and look at which ones the GNU assembler accepts for a given subset (model). Of course, in this way you will repeat any bugs in the tables in the GNU assembler. OTOH the tables in the GNU assembler are probably debugged at least as well as the ARM documents...
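A rough sketch of that shortcut, under several assumptions: the GNU assembler is installed under the name `arm-none-eabi-as`, `-mcpu=cortex-m3` selects the subset of interest, and each candidate is assembled on its own so one rejected line does not hide the others. The candidate generator here covers only a tiny illustrative sample; a real run would enumerate far more mnemonics and operand forms. All function names are invented for the example.

```python
import os
import subprocess
import tempfile


def candidate_lines():
    """Yield a small, illustrative set of instruction spellings to probe."""
    regs = ["r0", "r7", "r8", "r12"]
    for rd in regs:
        for rm in regs:
            yield f"adds {rd}, {rd}, {rm}"
            yield f"add.w {rd}, {rd}, {rm}"


def accepted_by_gas(line, cpu="cortex-m3", assembler="arm-none-eabi-as"):
    """Return True if the assembler accepts `line` for the given CPU.

    Assumes the named assembler is on PATH; raises FileNotFoundError
    if it is not.
    """
    source = ".syntax unified\n.thumb\n" + line + "\n"
    with tempfile.TemporaryDirectory() as tmp:
        src_path = os.path.join(tmp, "probe.s")
        obj_path = os.path.join(tmp, "probe.o")
        with open(src_path, "w") as handle:
            handle.write(source)
        result = subprocess.run(
            [assembler, f"-mcpu={cpu}", "-o", obj_path, src_path],
            capture_output=True,
            text=True,
        )
        return result.returncode == 0


def probe_all(cpu="cortex-m3"):
    """Map each candidate line to whether the assembler accepted it."""
    return {line: accepted_by_gas(line, cpu) for line in candidate_lines()}
```

Calling `probe_all("cortex-m0")` and `probe_all("cortex-m3")` and comparing the two results would show which spellings differ between the subsets, within the limits of whatever the assembler's own tables get right.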
Your code is just data passed through the tools, the GNU copyright does not limit it.
You'll be safe as long as you are not going to modify the tools. Even then you have to publish the modifications, not the code processed using the tools.
For libraries, the GNU LGPL covers such cases and permits their use.
I have always felt that "fun" is a perfectly good reason for any programming project - you don't need more than that. But there is no doubt that as assemblers go, making one for ARM is complicated by all its variations and the fact that unlike most assemblies, there is not a one-to-one relationship between assembly instructions and machine code instructions.
The copyright ownership has no particular relevance here, it is the license that is important. And the license lets you use the tools completely freely for any purpose - commercial, governmental, private, or whatever. It only places restrictions on what you can do with the source code to the assembler - you can modify that source as much as you like, but if you give people a copy of a binary version of the assembler, you need to give them a copy of the modified source too. The GPL does not affect your ARM code in any way.
The GNU toolchain - assembler, linker, compiler collection - are far and away the most used development toolchain in the world, and cover the widest range of targets. If the copyright or license caused some restriction in how they could be used or who could use them, someone would have noticed by now.
And if you want to be able to take the source code for an ARM assembler, modify it, and pass on (or sell) the binaries while keeping your changes secret, then you could look to the LLVM project which also includes an assembler and supports ARM as a target, and is all under a BSD/MIT style license.