Absolute addressing on the ARM

- Nils M Holm

Sun, Mar 16, 2014 8:22 PM

Hi and sorry about butting in out of nowhere.

I have a question about absolute addressing on ARMv6 processors as used in the Raspi. Recently I have written a back end for said processor and wondered about the best method for loading a value from an absolute address into a register when the absolute address cannot be known at compile time (i.e. cannot be placed in range for PC-relative addressing).

I came up with the following code to load a value from X:

        .data
    X:  .long 0
        /* arbitrary distance here */
        .data
    L1: .long X
        .text
        ldr r0,L1
        ldr r0,[r0]

which works fine.

Now someone told me that it might be possible to construct absolute addresses with MOV/MOVT and let the linker fix the gory stuff.

I doubt that because of the limitations the ARM seems to place on immediate values in MOV and MOVT. If I understand the manual correctly, immediate operands of MOV and friends must be 8-bit values that can be shifted to the left by up to eight bits.

Wouldn't this limitation make MOV/MOVT unsuitable for loading absolute addresses that cannot be known at compile time?
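For reference, the classic A32 data-processing immediate is an 8-bit value rotated right by an even amount (a 4-bit rotate field), which is somewhat more general than a plain left shift but still cannot express an arbitrary address. The following is a minimal Python sketch of that rule; the function name `arm_imm_encoding` is made up for illustration and is not part of any toolchain.

```python
def arm_imm_encoding(value):
    """Return the 12-bit rotate:imm8 field for an A32 immediate, or None.

    An immediate is encodable if it equals an 8-bit value rotated right
    by 2*rot bits, with rot in 0..15.
    """
    value &= 0xFFFFFFFF
    for rot in range(16):
        shift = 2 * rot
        # Rotating left by `shift` undoes a rotate right by `shift`.
        imm8 = ((value << shift) | (value >> (32 - shift))) & 0xFFFFFFFF
        if imm8 < 256:
            return (rot << 8) | imm8
    return None


print(arm_imm_encoding(0x5000))      # encodable: 0x05 rotated right by 20
print(arm_imm_encoding(0x44E35000))  # not encodable as a single MOV immediate
```

A back end can use such a check to decide between a single MOV and a fallback such as a literal load. Because the linker does not know the final address while the instruction is assembled, an address generally cannot rely on being encodable this way.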

Or am I missing something? Any hints would be welcome!

--
Nils M Holm  < n m h @ t 3 x . o r g >  www.t3x.org

- Tauno Voipio

Sun, Mar 16, 2014 9:23 PM

The normal way would be to have the address constant in the code section, at a place outside the code flow. This implies that the address can be resolved at link time at the latest. Otherwise, you need a pointer in the data (or .bss) section to be filled in at run time.

If you are using assembler, there is a literal syntax:

        ldr r0,=X       @ this will create a PC-relative access
        ldr r0,[r0]
        .....

        .pool           @ this has to be within the PC-relative addressing
                        @ range from the instruction, outside of code flow.

The .pool pseudo-op is not absolutely necessary, if the module is so small that the literal can be accessed at the end. The assembler will generate the literal pool anyway at the end, if there are any unresolved literals at the end of assembly.
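For a compiler back end that prefers to manage literals itself instead of relying on `ldr rX,=sym` and `.pool`, the same idea can be sketched in a few lines of Python. Everything here (the `LiteralPool` class, the `.LP` label prefix, the `load_value` helper) is hypothetical and only illustrates the technique: each address constant gets a local label, and the pool is written out after the function body, outside the code flow.

```python
class LiteralPool:
    """Collects address constants and emits them after the code using them."""

    def __init__(self):
        self.labels = {}   # symbol name -> local label of its pool slot
        self.next_id = 0   # keeps labels unique across several pools

    def load_address(self, reg, symbol):
        """Return assembly lines loading the address of `symbol` into `reg`."""
        if symbol not in self.labels:
            self.labels[symbol] = f".LP{self.next_id}"
            self.next_id += 1
        return [f"\tldr\t{reg}, {self.labels[symbol]}"]

    def flush(self):
        """Return the pending pool as assembly lines and start a new pool."""
        if not self.labels:
            return []
        lines = ["\t.align\t2"]  # word-align the pool
        lines += [f"{label}:\t.word\t{symbol}"
                  for symbol, label in self.labels.items()]
        self.labels = {}
        return lines


def load_value(pool, reg, symbol):
    """Load the word stored at `symbol`: fetch its address, then dereference."""
    return pool.load_address(reg, symbol) + [f"\tldr\t{reg}, [{reg}]"]


pool = LiteralPool()
body = load_value(pool, "r0", "X")
print("\n".join(body + ["\tbx\tlr"] + pool.flush()))
```

The pool has to stay within reach of the PC-relative load (roughly 4 KB in ARM state, as far as I know), so a back end emitting very long functions would need to flush the pool mid-function and branch around it.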

----

A different story is that the address is a virtual address: the address translation in the hardware will change it to the physical address in the way the kernel feels fit.

If you want to access a physical bus address, e.g. a peripheral register, you need to negotiate the addressing with the kernel. There is no way you can directly point to an absolute physical address from user-mode code without getting the kernel to map it for you.

HTH

--

Tauno Voipio

- Hans-Bernhard Bröker

Sun, Mar 16, 2014 10:14 PM

AFAICS, strictly speaking there's no such thing as absolute addressing on the ARM.

And that's not even particularly uncommon. When a RISC-like platform uses fixed-size instruction words (ARM does), and they're the same width as the CPU's address width (ARM nearly does), that usually means they can't do true absolute addressing --- an instruction just isn't wide enough to hold a complete address. Such architectures could only do zero-page absolute addressing, i.e. absolute addressing for a narrow subset of the address space: the "zero page". And since that subset may well be so small as to be useless, they sometimes won't even bother with it at all.

In a nutshell, all addressing on the ARM is relative. There's just immediate addressing, i.e. you can get data from inside the machine instruction, but no addresses.

- Simon Clubley

Mon, Mar 17, 2014 1:21 AM

MOV/MOVT absolute addressing works just fine. I've never used this construct myself (I tend to just use one of the ldr variants in my hand-written ARM assembly code); however, it does show up in generated code. This is the generated code for some bare-metal C code of mine for the Beaglebone Black:

[The code was compiled with optimisation turned on hence the duplicated source code in the objdump output.]

==========================================================================

80300290 <board_init_phase1>:
void board_init_phase1(void)
        {
/*
 * Disable watchdog. Stop sequence is on page 4202 of spruh73j.pdf
 */
        BBBB_WDT->WDT_WSPR = 0x0000aaaa;
80300290:       e3a03a05        mov     r3, #20480      ; 0x5000
80300294:       e30a2aaa        movw    r2, #43690      ; 0xaaaa
80300298:       e34434e3        movt    r3, #17635      ; 0x44e3
        while((BBBB_WDT->WDT_WWPS & 0x10) != 0)
8030029c:       e1a01003        mov     r1, r3
void board_init_phase1(void)
        {
/*
 * Disable watchdog. Stop sequence is on page 4202 of spruh73j.pdf
 */
        BBBB_WDT->WDT_WSPR = 0x0000aaaa;
803002a0:       e5832048        str     r2, [r3, #72]   ; 0x48

==========================================================================

Here's another more readable example:

==========================================================================

80300260 <default_interrupt_handler>:
 * We increment a counter here so a debugger can tell if this routine _is_
 * actually been called.
 */
void default_interrupt_handler(void)
        {
        default_interrupt_handler_count++;
80300260:       e3003920        movw    r3, #2336       ; 0x920
80300264:       e3483030        movt    r3, #32816      ; 0x8030
80300268:       e5932000        ldr     r2, [r3]
8030026c:       e2822001        add     r2, r2, #1
80300270:       e5832000        str     r2, [r3]
//      exec$abort();
        }

==========================================================================

The above would seem to suggest otherwise (although as I said, I have not tried using MOVT manually so there may be a limit I am missing).

If you look at the 32-bit opcode above, you should be able to see how the encoding of the 16-bit values is broken down.
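To make that breakdown concrete, here is a minimal Python sketch that rebuilds the opcodes from the objdump output above. It assumes the ARM-state layout `cond | 0011 0000 | imm4 | Rd | imm12` for MOVW and `cond | 0011 0100 | imm4 | Rd | imm12` for MOVT; the function names are my own and not taken from any tool.

```python
def _encode_mov16(op_bits, rd, imm16, cond=0xE):
    """Pack a 16-bit immediate as imm4:imm12 around the Rd field."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("immediate must fit in 16 bits")
    if not 0 <= rd <= 15:
        raise ValueError("register number must be 0..15")
    imm4 = imm16 >> 12      # top 4 bits go into bits 19..16
    imm12 = imm16 & 0xFFF   # low 12 bits go into bits 11..0
    return (cond << 28) | (op_bits << 20) | (imm4 << 16) | (rd << 12) | imm12


def encode_movw(rd, imm16):
    """MOVW Rd, #imm16: write the low halfword, clear the high halfword."""
    return _encode_mov16(0x30, rd, imm16)


def encode_movt(rd, imm16):
    """MOVT Rd, #imm16: write the high halfword, keep the low halfword."""
    return _encode_mov16(0x34, rd, imm16)


# Address 0x80300920 of default_interrupt_handler_count, as in the listing.
print(hex(encode_movw(3, 0x0920)))  # 0xe3003920
print(hex(encode_movt(3, 0x8030)))  # 0xe3483030
```

Since the full 16 bits are split into two plain bit fields, any halfword can be encoded, which is what lets the linker patch in an arbitrary address later.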

I hope the above output from objdump helps get you started. :-)

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP 
Microsoft: Bringing you 1980s technology to a 21st century world

- Tim Wescott

Mon, Mar 17, 2014 1:30 AM

Are you using assembly because you want to, or because you feel you have to?

I ask, because you mention "compile".

If you're doing things in C, the way to do it is to just declare your address as "extern". Then define the actual physical address in an assembler file or in the linker command file.

I make peripheral definitions that end up with a line in a header file that looks like

extern volatile SSomePeripheralOrAnotherRegs RALPH;

and a line in the linker command file that's something like

RALPH = 0x40039400;

(Note: of course I don't use silly names: "SomePeripheralOrAnother" would be too ambiguous).

--

Tim Wescott 
Wescott Design Services 
http://www.wescottdesign.com

- Dimiter_Popoff

Mon, Mar 17, 2014 3:57 AM

Does ARM do that zero page thing? Power does, lower 32k and top 32k of the 32-bit space.

It also does absolute jumps to I think it was 24 bit addresses (the lowest) - not so sure about the 24 bits though, it's been a while since I made use of it in vpa.

Well I would not go as far as "useless" but it certainly is avoidable to have it, two opcodes instead of one on the not so frequent accesses of that type. Well I suppose I could be talked into your "useless" too, won't take that much, but I still use it.

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI
------------------------------------------------------

- Tauno Voipio

Mon, Mar 17, 2014 7:32 AM

No. All addressing is register-indirect or register-relative. There are many different ways to get a constant (it can be an address) into a register. There are different ways to have a constant in the instruction code. The possible constants are different in the different instruction sets (32 bit, Thumb1 and Thumb2). For a general 32 bit pattern, there is the PC-relative addressing with the constant embedded into the code section outside of program execution flow.

The jumps (branch instructions) are relative. An absolute jump is made by loading a constant into register 15 (PC), or loading a constant into a register and using the bx instruction.

--

Tauno Voipio

- Nils M Holm

Mon, Mar 17, 2014 7:46 AM

Yes, this is exactly how I had understood things. In my example above, L1 really should be in the text section.

This sounds very useful, but unfortunately, the GNU assembler does not seem to support this syntax (or it uses a different one), so I will have to manage literals myself.

I currently do not have to deal with virtual addresses, but this is also interesting to know.

Anyway, thanks for your reply! It basically confirms what I have found out myself.

--
Nils M Holm  < n m h @ t 3 x . o r g >  www.t3x.org

- Nils M Holm

Mon, Mar 17, 2014 7:51 AM

Thank you for your examples! However, I tried MOVW/MOVT with the exact values you used, and got

Error: selected processor does not support `movw r3,#2336'
Error: selected processor does not support `movt r3,#32816'

I have seen that the Beaglebone Black is based on an ARMv7 core while the Raspi uses an ARMv6 core. Maybe MOV with 16-bit immediates is only supported on the ARMv7? The manual says that the ARMv6 supports only 8-bit immediates with a 3-bit shift.

--
Nils M Holm  < n m h @ t 3 x . o r g >  www.t3x.org

- Nils M Holm

Mon, Mar 17, 2014 8:00 AM

Maybe that part was not obvious in my original post. I have written a compiler back end for the ARMv6, and that back end emits assembly. So I am actually looking for a template for loading an arbitrary absolute address or large literal into a register.

I already have a template that works fine, but then someone suggested using MOV/MOVT instead of a literal pool. I think that MOV/MOVT will not work on the ARMv6, because it can only load eight-bit immediates with a 3-bit shift, so MOV/MOVT are not suitable for later fixup. So that was the most important part of my question:

Can MOV/MOVT be used on the ARM *v6* to load *any* 16-bit value?

The manual says no, my experiments with GAS say no, but maybe I have missed something. I am new to ASM programming.

Yes, my compiler compiles a subset of C89 to ARMv6 assembly.

--
Nils M Holm  < n m h @ t 3 x . o r g >  www.t3x.org

- Nils M Holm

Mon, Mar 17, 2014 8:03 AM

ARM, not ASM. I have 30+ years of experience with loads of processors.

--
Nils M Holm  < n m h @ t 3 x . o r g >  www.t3x.org

- Tauno Voipio

Mon, Mar 17, 2014 8:41 AM

On 17.3.14 09:46, Nils M Holm wrote:

I beg to have a different opinion. Below is an abbreviated listing of ARM7TDMI (AT91R40008) startup code, translated with GNU assembler.

---- clip clip ----

ARM GAS /tmp/ccBojrRY.s page 1

   1                # 1 "hwstart.S"
   6                        .globl  main            @ main program entry
  11                        .globl  __bss_start__   @ -> .bss area in RAM
  12                        .globl  __bss_length__  @ .bss area byte length
  14
  28
  33                @ Compatibility macros
  34                @ --------------------
  35
  36                #if defined(__thumb__)
  37                #define LSR(reg,cnt)  lsr reg,reg,cnt
  38                #define SUBS(reg,cnt) sub reg,cnt
  39                #else
  40                #define LSR(reg,cnt)  movs reg,reg,lsr cnt
  41                #define SUBS(reg,cnt) subs reg,reg,cnt
  42                #endif
  43
 129
 130 0000 1548      hwstrt: ldr     r0,=WD_OKEY             @ get watchdog write key
 131 0002 1649              ldr     r1,=at91wd              @ -> watchdog register bank
 132 0004 0860              str     r0,[r1,#WD_OMR]         @ disable watchdog
 133
 140
 141                @ Clear .bss
 142
 143 0010 0020              mov     r0,#0                   @ get a zero
 144 0012 134A              ldr     r2,=__bss_start__       @ -> bss area to zero out
 145 0014 134B              ldr     r3,=__bss_length__      @ bss area length
 146 0016 9B08              LSR(r3,#2)                      @ count in full words - any?
 147 0018 02D0              beq     zlpex                   @ no - skip clearing
 148
 149 001a 01C2      zloop:  stmia   r2!,{r0}                @ clear a word
 150 001c 013B              SUBS(r3,#1)                     @ bump count - done?
 151 001e FCD1              bne     zloop                   @ no - loop
 154                @ Call main program: main(0)
 155
 156                zlpex:  @mov    r0,#0                   @ no arguments (argc = 0)
 157 0020 FFF7FEFF          bl      main                    @ enter main program
 158
 159 0024 ECE7              b       hwstrt                  @ loop on return
 201                @ Startup data
 202                @ ------------
 203
 204 0056 00004023          .pool
 204      00000000
 204      00000000
 204      00000000
 204      0000
 205                        .end

---- clip clip ----

--

-TV

- Nils M Holm

Mon, Mar 17, 2014 8:48 AM

Oops, you are right! I accidentally tried it with MOV instead of LDR.

--
Nils M Holm  < n m h @ t 3 x . o r g >  www.t3x.org

- Wouter van Ooijen

Mon, Mar 17, 2014 9:06 AM

Strange, this is the syntax I use all the time. Do you use a separate asm file, or inline assembly in C? Which error message do you get?

Wouter van Ooijen

- David Brown

Mon, Mar 17, 2014 9:47 AM

It's a little off-topic for your question, but would you mind telling us why you are doing this, and what your aims are here? There are already several excellent C compilers for the ARM (gcc, llvm, Keil/ARM, IAR, GHS, CodeWarrior - and probably a few others that I've forgotten). The main ones here are highly optimising, support a range of ARM cores, and cover modern C and C++ standards (including C11 and C++11 in some cases). So I am very curious as to your goals in making a compiler for a subset of C89 - who will use it? Or is this just for fun or education?

Other than that, my advice here is that when you wonder about code generation, write a simple example function in C and compile it with gcc at different optimisation settings. Use the generated assembly as inspiration for your own code generator.

- Nils M Holm

Mon, Mar 17, 2014 10:21 AM

The original version of the compiler was for education and was hosted on and targeted at FreeBSD/386. Later I kept hacking it for fun and added back ends for the x86-64, 8086 and, lately, the ARMv6. I also added runtime support for Linux, various BSDs and DOS. Support for Windows and Darwin was added by contributors.

What I like about the compiler is that it is simple, easy to hack, and bootstraps in 4 seconds on a 700 MHz Raspi. Of course, its code generator is rather limited, and its code runs (on average) almost twice as long as code generated by GCC -O0, but it is sufficient for most stuff I do.

It is mostly intended to be studied, though. Quite a few people who had given up on other small C compilers told me that my code is quite easy to follow.

If you want to have a look, see:

formatting link

The README on the page summarizes the omissions from The C Programming Language, 2nd Ed.

In case you have a look, please bear in mind that I have never programmed in ARM assembly before, so the emitted code may be worse than dictated by the limitations of the compiler.

That is exactly what I have done. The only question that brought me here was if it was possible to load an absolute address using MOV/MOVT on the ARMv6. GCC and Clang do not do this, they use a literal pool instead.

--
Nils M Holm  < n m h @ t 3 x . o r g >  www.t3x.org

- David Brown

Mon, Mar 17, 2014 12:22 PM

Thanks - that puts things in a more complete context. "Real" compilers such as gcc are far from simple or understandable, even for people who have worked with them for years. llvm is a bit clearer and more structured, but it too is a huge project. So a limited small compiler for educational purposes seems like a good idea, even if it cannot be used for "real" programs.

I think - given your aims - the choice should be whatever is easiest and clearest to implement. You don't need to consider fast or small code, or compatibility with other tools.

- Simon Clubley

Mon, Mar 17, 2014 1:04 PM

It looks like you are correct.

I know it doesn't work on the ARMv5 series (ARM9 and friends) because the same type of code compiles to the traditional PC-relative ldr from a literal pool.

I don't have any ARMv6 boards so I have not been looking at code generated for the v6 MCUs but for some reason I thought this worked on the v6 as well as the v7 architecture.

Sorry to have wasted your time.

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP 
Microsoft: Bringing you 1980s technology to a 21st century world

- Nils M Holm

Mon, Mar 17, 2014 2:11 PM

Not at all! It is good to know that this works on the v7 but not on the earlier cores.

--
Nils M Holm  < n m h @ t 3 x . o r g >  www.t3x.org

- Tim Wescott

Mon, Mar 17, 2014 8:41 PM

I'm not familiar with ARMv6. Gnu compiles Arm Cortex M3 code to use movt and movw. It compiles Arm Cortex M0 code to use a load from a literal stored in the .text section after the function code.
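The per-architecture choice described in this thread could be sketched in a back end roughly as follows. This is only an illustration in Python: the function `emit_load_address` and the set `HAS_MOVW_MOVT` are hypothetical, and the list of architecture names is an assumption that is deliberately not exhaustive. As far as I know, MOVW/MOVT arrived with ARMv6T2, so plain ARMv6 cores and ARMv6-M parts like the Cortex-M0 fall back to a literal load. The `#:lower16:` and `#:upper16:` operators are GNU assembler syntax that leaves the two halves of the address for the linker to fill in.

```python
# Assumption: -march style names of targets that provide MOVW/MOVT.
# Illustrative only; extend or correct for the cores you actually target.
HAS_MOVW_MOVT = {"armv6t2", "armv7-a", "armv7-r", "armv7-m"}


def emit_load_address(reg, symbol, arch):
    """Return GNU-as lines that put the address of `symbol` into `reg`."""
    if arch in HAS_MOVW_MOVT:
        # Two instructions, no memory access; the linker resolves both halves.
        return [
            f"\tmovw\t{reg}, #:lower16:{symbol}",
            f"\tmovt\t{reg}, #:upper16:{symbol}",
        ]
    # Fallback: PC-relative load from a literal pool created by the assembler.
    return [f"\tldr\t{reg}, ={symbol}"]


print("\n".join(emit_load_address("r3", "counter", "armv7-a")))
print("\n".join(emit_load_address("r3", "counter", "armv6")))
```

With the fallback form, the assembler still needs a place for the literal within range, either at the end of the module or at an explicit `.pool`.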

--

Tim Wescott 
Wescott Design Services 
http://www.wescottdesign.com