arm-gcc: avoid calling other system functions

I'm trying to put all ISRs in RAM[*]. Unfortunately, depending on the code, arm-gcc could call some system functions that are located in Flash.

For example, if I use a switch statement, the compiler calls a routine from the system libraries (located in Flash) that uses a table lookup to implement the switch.

Is it possible to force arm-gcc to avoid calling system functions from ISRs, so that they can execute entirely from RAM?

[*] This is because I need to erase/write Flash memory; during these operations the Flash can't be read, so code in Flash can't be executed.
Reply to
pozz

By "system functions" do you mean gcc built-in functions from libgcc?

Or functions from /usr/lib/*whatever*?

If you're talking about libgcc, that's not what most people mean when they refer to "system libraries".

No, I don't think there's any way to prevent gcc from emitting calls to functions provided by libgcc. However, you should be able to put the ISR code in a separate file and link it statically with libgcc, and then locate the resulting object file's code in RAM. Alternatively, you could just manually tweak your linker script to place the specific libgcc functions used by your ISR functions in RAM.

Using ISRs to program flash seems pretty strange...

--
Grant Edwards               grant.b.edwards        Yow! I am a jelly donut. 
                                  at               I am a jelly donut. 
                              gmail.com
Reply to
Grant Edwards

On 17/05/2018 14:37, pozz wrote:

Something similar to this.

formatting link

Reply to
pozz

Yes, now I know those functions come from libgcc.

Is "built-in functions" a better name?

It seems complex.

Any help?

No, I need to serve interrupts *during* Flash erasing/writing. So the ISR code must be placed in RAM.

Reply to
pozz

Calling them "libgcc functions" is probably the best choice. Anybody familiar with gcc will know what you mean.

Embedded systems development usually is. :)

formatting link
formatting link
formatting link
formatting link

OK, that makes more sense.

--
Grant Edwards               grant.b.edwards        Yow! I feel like I am 
                                  at               sharing a ``CORN-DOG'' 
                              gmail.com            with NIKITA KHRUSCHEV ...
Reply to
Grant Edwards

There is, AFAIK, no way to force this. It is not normally a problem as ISRs should generally be as short and fast as possible.

For functions that must be run from ram, don't call other functions unless it is unavoidable. And when you do, mark those functions as being in the RAM section, or as __attribute__((always_inline)). This doesn't work for language support functions - you avoid these by avoiding using such language features (such as division or software floating point).
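As a rough illustration of the two options above, the sketch below marks an ISR for a RAM section and forces a small helper inline so that no call into Flash remains. The section name `.ramcode` is the one used elsewhere in this thread; `timer_isr`, `saturating_inc`, `tick_count` and `ticks` are hypothetical names, and the linker script and startup code are assumed to place and copy that section.

```c
#include <stdint.h>

/* On an ARM build the ISR is placed in the ".ramcode" section; the linker
   script and startup code are assumed to locate and copy that section.
   On any other (host) build the section attribute is simply left out. */
#if defined(__arm__)
#define RAMFUNC __attribute__((noinline, section(".ramcode")))
#else
#define RAMFUNC __attribute__((noinline))
#endif

/* Small helpers are forced inline so that no separate copy in Flash is called. */
#define FORCE_INLINE static inline __attribute__((always_inline))

static volatile uint32_t tick_count;   /* hypothetical counter updated by the ISR */

FORCE_INLINE uint32_t saturating_inc(uint32_t v)
{
    return (v == UINT32_MAX) ? v : v + 1u;
}

/* Hypothetical timer ISR: its body, including the inlined helper, lives in RAM. */
RAMFUNC void timer_isr(void)
{
    tick_count = saturating_inc(tick_count);
}

/* Read the counter from the main loop. */
uint32_t ticks(void)
{
    return tick_count;
}
```

It is still worth checking the disassembly or the map file, since the compiler may emit calls to libgcc helpers that these attributes do not control.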

Note that you can't use ISRs during flash operations unless the vector table is also moved to ram.
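Moving the vector table could look roughly like the sketch below. It assumes a core that implements the vector table offset register (SCB->VTOR at 0xE000ED08; it is optional on Cortex-M0+), and `relocate_vectors` is a hypothetical helper name. The register address is passed in as a pointer so the copy logic can be exercised off-target.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy n vector entries from src (the table in Flash) to dst (a buffer in
   RAM), then write the address of the copy to the vector table offset
   register. dst must be aligned as the core's reference manual requires,
   and on a real target a DSB/ISB barrier would normally follow the write. */
void relocate_vectors(uint32_t *dst, const uint32_t *src, size_t n,
                      volatile uint32_t *vtor)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];

    *vtor = (uint32_t)(uintptr_t)dst;
}
```

On the target, the last argument would be something like `(volatile uint32_t *)0xE000ED08`, and the call would happen before the first Flash erase or write.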

While you are erasing or writing flash, it is usually easiest to simply disable interrupts. If you really need an interrupt or two running, disable all but the critical ones and make sure these are short and fast.

Reply to
David Brown

Or "language support functions" or "language support library functions". These are functions that are there to make the language work, rather than part of the standard library.

Reply to
David Brown

Not always. Of course it depends on what you are trying to do.

Ok, thank you for the references.

Reply to
pozz

I always try to keep the ISR short, simple and fast. However, a simple switch statement (a different way to write a sequence of if statements) is compiled into assembly code that uses "libgcc functions".

And usually I don't call functions directly from my ISR (it's gcc that calls its libgcc functions).

Only a few times I need to call functions from inside the ISRs. It usually happens when I want to create a sufficiently generic low-level driver.

For example, consider a UART driver. Most of the time I push the received bytes into a FIFO buffer in the ISR. The main loop polls the FIFO buffer, pops the bytes one by one and processes them. However, I sometimes need to process bytes as soon as they are received, that is, in the ISR. For example, the frame may be addressed and I need to receive it only if it is directed to me (maybe because a frame addressed to another node could be much longer than the frame I can manage). In order to have a generic driver, the application calls a uart_set_callback(). In this case, the driver calls the "user" callback function when a byte is received, *during the ISR*.
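A minimal sketch of such a driver is shown below. Only `uart_set_callback()` is named in the post; `uart_rx_isr`, `uart_read`, `skip_padding`, the callback signature and the FIFO size are assumptions made for illustration. The index wrap uses a mask rather than `%`, since on cores without a hardware divider a modulo can turn into yet another libgcc call.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UART_FIFO_SIZE 16u   /* hypothetical size; must be a power of two */

/* Assumed callback type: return true if the byte was consumed in the ISR. */
typedef bool (*uart_rx_callback_t)(uint8_t byte);

static volatile uint8_t fifo[UART_FIFO_SIZE];
static volatile unsigned head, tail;
static uart_rx_callback_t rx_callback;

void uart_set_callback(uart_rx_callback_t cb)
{
    rx_callback = cb;
}

/* Called from the UART receive interrupt with the byte just received. */
void uart_rx_isr(uint8_t byte)
{
    if (rx_callback != NULL && rx_callback(byte))
        return;                                  /* handled inside the ISR */

    unsigned next = (head + 1u) & (UART_FIFO_SIZE - 1u);
    if (next != tail) {                          /* drop the byte if the FIFO is full */
        fifo[head] = byte;
        head = next;
    }
}

/* Called from the main loop; returns false when the FIFO is empty. */
bool uart_read(uint8_t *out)
{
    if (tail == head)
        return false;
    *out = fifo[tail];
    tail = (tail + 1u) & (UART_FIFO_SIZE - 1u);
    return true;
}

/* Hypothetical user callback: swallow 0xFF padding bytes in the ISR. */
bool skip_padding(uint8_t byte)
{
    return byte == 0xFFu;
}
```

If the ISR has to run from RAM, the user callback (and anything it calls) has to be placed in RAM as well.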

Another example is an ADC driver that I wrote some time ago. I had 20 analog signals that I needed to convert into 4 different states, depending on the ADC value and on the configuration of each signal.

switch (signal_conf[signal_idx]) {
case CONF1:
    if (adc_value < 100)
        signal_status[signal_idx] = ALARM;
    else if (adc_value < 200)
        signal_status[signal_idx] = IDLE;
    else if (adc_value < 300)
        signal_status[signal_idx] = SABOT;
    else if (adc_value < 400)
        signal_status[signal_idx] = SABOT+ALARM;
    break;
case CONF2:
    if (adc_value < 150)
        signal_status[signal_idx] = ALARM;
    else if (adc_value < 250)
        signal_status[signal_idx] = IDLE;
    else if (adc_value < 350)
        signal_status[signal_idx] = SABOT;
    break;
...
}

In order to have a sufficiently generic ADC driver, the application calls adc_set_callback() and the ADC ISR calls this "user callback" when a new sample is available.
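One way to sidestep the switch in a callback like this is a threshold table walked with a plain loop. The sketch below is only an illustration: the numeric values of `IDLE`, `ALARM` and `SABOT`, the `classify` function and the table layout are assumptions, and the thresholds are the ones from the snippet above. The table is deliberately not `const`, so that it ends up in `.data` (RAM) rather than in Flash.

```c
#include <stdint.h>

enum { IDLE = 0, ALARM = 1, SABOT = 2 };   /* assumed values; SABOT+ALARM == 3 */
enum { CONF1, CONF2, CONF_COUNT };

#define MAX_BANDS 4u

struct band {
    uint16_t below;    /* band applies when adc_value < below */
    uint8_t  status;
};

struct conf_bands {
    uint8_t     count;
    struct band bands[MAX_BANDS];
};

/* Not const on purpose: an initialised, writable table is placed in RAM. */
static struct conf_bands conf_table[CONF_COUNT] = {
    [CONF1] = { 4, { {100, ALARM}, {200, IDLE}, {300, SABOT}, {400, SABOT + ALARM} } },
    [CONF2] = { 3, { {150, ALARM}, {250, IDLE}, {350, SABOT} } },
};

/* Return the status of the first band containing adc_value, or the
   previous status when no band matches (as in the original snippet). */
uint8_t classify(uint8_t conf, uint16_t adc_value, uint8_t previous)
{
    if (conf >= CONF_COUNT)
        return previous;

    const struct conf_bands *c = &conf_table[conf];
    for (uint8_t i = 0; i < c->count; i++) {
        if (adc_value < c->bands[i].below)
            return c->bands[i].status;
    }
    return previous;
}
```

In the callback this would be used as `signal_status[signal_idx] = classify(signal_conf[signal_idx], adc_value, signal_status[signal_idx]);`.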

Or switch statements...

I know.

I'm trying to use the MCU Flash to save some non-volatile configuration parameters. The user can change the configuration whenever he wants. At that moment the Flash must be erased/written, yet the device should keep working as usual.

Reply to
pozz

Can you give a small example here? That sounds very strange. gcc can use a variety of tactics for switch statements, including jump tables, calculated jumps, sequences of conditional tests, and binary trees of conditional tests - and combinations of these. But I can't think of any situation where it would use a language support function (at least, not on an ARM).

(I've snipped most of the rest of the post, because it all sounds sensible enough. Just remember that if you set callback functions, these also need to be short, in ram, etc., according to need.)

You may be asking the impossible here. Flash writes are usually quick, and can often be squeezed in by disabling most interrupts. (DMA on things like UARTs can help give you more leeway here.) But erases take time. Many devices allow erase suspend, which is one possibility for dealing with things that /have/ to use flash while you are trying to do an erase. It can be messy and fiddly, and regular erase suspend can reduce the erase/write lifetime of the flash. And of course many devices have separate flash planes, letting you read from one plane while the other is being erased or written.

But if you have a single plane flash device, and need to erase (rather than just writing in a log structure), and need to run normally while erasing - there might not be a good solution at all. A small serial

Reply to
David Brown

Which ARM processor and which GCC version, which compilation switches?

I beg to differ. The following snippet of an interrupt handler is compiled (GCC 4.8.2) for a Cortex-M4 with the compilation switches

-Os -mthumb -mcpu=cortex-m4:

C snippet:

switch (rstate) {
case RXHUNT:
    break;

case RXB0:
    rcvb0(ch);
    break;

case RXB0Q:
    rcvb0q(ch);
    break;

case RXB1:
    rcvb1(ch);
    break;

case RXB1Q:
    rcvb1q(ch);
    break;

case RXDATA:
    rcvdata(ch);
    break;

case RXQUOT:
    rcvqdata(ch);
    break;

default:
    rstate = RIDLE;
    break;
}

Assembler listing (created with: -Wa,-adhlmns=$(@:.o=.lst)):

 129              .L19:
 130 0036 2378        ldrb    r3, [r4]        @ zero_extendqisi2
 131 0038 013B        subs    r3, r3, #1
 132 003a 062B        cmp     r3, #6
 133 003c 54D8        bhi     .L20
 134 003e DFE803F0    tbb     [pc, r3]
 135              .L21:
 136 0042 55          .byte   (.L17-.L21)/2
 137 0043 04          .byte   (.L22-.L21)/2
 138 0044 0A          .byte   (.L23-.L21)/2
 139 0045 13          .byte   (.L24-.L21)/2
 140 0046 19          .byte   (.L25-.L21)/2
 141 0047 24          .byte   (.L26-.L21)/2
 142 0048 44          .byte   (.L27-.L21)/2
 143 0049 00          .align  1
 144              .L22:
 145 004a 032D        cmp     r5, #3
 146 004c 16D0        beq     .L37
 147 004e 7D2D        cmp     r5, #125
 148 0050 07D1        bne     .L96
 149 0052 0323        movs    r3, #3
 150 0054 49E0        b       .L91
 151              .L23:
 152 0056 032D        cmp     r5, #3
 153 0058 10D0        beq     .L37
 154 005a 7D2D        cmp     r5, #125
 155 005c 0ED0        beq     .L37
 156 005e 85F01005    eor     r5, r5, #16
 157              .L96:
 158 0062 2574        strb    r5, [r4, #16]
 159 0064 0423        movs    r3, #4
 160 0066 40E0        b       .L91
 161              .L24:
 162 0068 032D        cmp     r5, #3
 163 006a 07D0        beq     .L37
 164 006c 7D2D        cmp     r5, #125
 165 006e 09D1        bne     .L95
 166 0070 0523        movs    r3, #5
 167 0072 3AE0        b       .L91
 168              .L25:
 169 0074 032D        cmp     r5, #3
 170 0076 01D0        beq     .L37
 171 0078 7D2D        cmp     r5, #125
 172 007a 01D1        bne     .L88
 173              .L37:
 174 007c 0123        movs    r3, #1
 175 007e 34E0        b       .L91
 176              .L88:
 177 0080 85F01005    eor     r5, r5, #16
 178              .L95:
 179 0084 6574        strb    r5, [r4, #17]
 180 0086 0623        movs    r3, #6
 181 0088 2FE0        b       .L91
 182              .L26:
 183 008a 032D        cmp     r5, #3
 184 008c 03D0        beq     .L39
 185 008e 7D2D        cmp     r5, #125
 186 0090 16D1        bne     .L97
 187 0092 0723        movs    r3, #7
 188 0094 29E0        b       .L91
 189              .L39:
 190 0096 6268        ldr     r2, [r4, #4]
 191 0098 3C4B        ldr     r3, .L98+4
 192 009a 032A        cmp     r2, #3
 193 009c 19D9        bls     .L43
 194 009e 5969        ldr     r1, [r3, #20]
 195 00a0 187C        ldrb    r0, [r3, #16]   @ zero_extendqisi2
 196 00a2 0022        movs    r2, #0
 197 00a4 0A60        str     r2, [r1]
 198 00a6 597C        ldrb    r1, [r3, #17]   @ zero_extendqisi2
 199 00a8 1A70        strb    r2, [r3]
 200 00aa 41EA0021    orr     r1, r1, r0, lsl #8
 201 00ae 1983        strh    r1, [r3, #24]   @ movhi
 202 00b0 374B        ldr     r3, .L98+8
 203 00b2 0921        movs    r1, #9
 204 00b4 5A62        str     r2, [r3, #36]
 205 00b6 0122        movs    r2, #1
 206 00b8 1A61        str     r2, [r3, #16]
 207 00ba 1960        str     r1, [r3]
 208 00bc DA60        str     r2, [r3, #12]
 209 00be 15E0        b       .L17
 210              .L97:
 211 00c0 6368        ldr     r3, [r4, #4]
 212 00c2 832B        cmp     r3, #131
 213 00c4 05D8        bhi     .L43
 214 00c6 2846        mov     r0, r5
 215 00c8 0BE0        b       .L92
 216              .L27:
 217 00ca 032D        cmp     r5, #3
 218 00cc 01D0        beq     .L43
 219 00ce 7D2D        cmp     r5, #125
 220 00d0 02D1        bne     .L90
 221              .L43:
 222 00d2 FFF7FEFF    bl      initrcv
 223 00d6 09E0        b       .L17
 224              .L90:
 225 00d8 6368        ldr     r3, [r4, #4]
 226 00da 832B        cmp     r3, #131
 227 00dc F9D8        bhi     .L43
 228 00de 85F01000    eor     r0, r5, #16
 229              .L92:
 230 00e2 FFF7FEFF    bl      putrcv
 231 00e6 01E0        b       .L17
 232              .L20:
 233 00e8 0023        movs    r3, #0
 234              .L91:
 235 00ea 2370        strb    r3, [r4]
 236              .L17:

There is not a single system or compiler support library call. Even the local functions have been inlined.

--

To make a piece of code reside in RAM, just declare it in
a section which is linked together with the .data section:

#define RAMCODE  __attribute__((section(".ramcode"))) 
#define LONGCALL __attribute__((long_call)) 
#define NOINLINE __attribute__((noinline)) 
#define RAMFUNC NOINLINE LONGCALL RAMCODE 

static RAMFUNC void my_ram_function(void) 
	{ 
	/* My RAM code here */ 
	; 
	} 

Snippet of associated linker script: 

	/*  Initialized data with ROM copy   */ 

	.data : 
		{ 
		_rwstart = . ; 
		*(.ramcode) 
		*(.data*) 
		. = ALIGN(4); 
		_rwend = . ; 
		} > ram AT > flash 

The .data initialization code run at startup will copy the code
to its proper place in RAM.
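For reference, the startup copy mentioned above is usually little more than a word-by-word loop. The sketch below is an assumption about what such code might look like, not the actual startup file: `copy_section` is a hypothetical helper, `_rwstart` and `_rwend` are the symbols from the linker script snippet, and `_rwload` is a hypothetical symbol for the load address in Flash (for example defined with `LOADADDR(.data)` in the linker script).

```c
#include <stdint.h>

/* Copy a section word by word from its load address in Flash (src) to its
   run address in RAM (dst up to, but not including, dst_end). */
void copy_section(uint32_t *dst, const uint32_t *dst_end, const uint32_t *src)
{
    while (dst < dst_end)
        *dst++ = *src++;
}

/* On the target, before main(), this might be called as:
 *
 *     extern uint32_t _rwstart, _rwend;   // from the linker script above
 *     extern uint32_t _rwload;            // hypothetical load-address symbol
 *     copy_section(&_rwstart, &_rwend, &_rwload);
 */
```

Because `.ramcode` is part of the same output section, the RAM functions are copied together with the initialised data.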
Reply to
Tauno Voipio

Read this:

formatting link

Reply to
pozz

That refreshes my memory! But I didn't get an example then, and haven't got one now. In what sort of switch situations are you seeing calls to language support functions?

Reply to
David Brown

[ snip ]

This all started back in 2009. For the stated minor savings in code space vs all the headaches it causes I think there should be a specific switch to override this. Maybe there is.

formatting link

--
Chisolm 
Republic of Texas
Reply to
Joe Chisolm

So we are talking about Thumb1 code - a mode that has been outdated for many years, and is almost certainly irrelevant to the OP's question? (Thumb1 is very different from Thumb2, used by the Cortex-M devices.)

And perhaps it only applies to PIC - position independent code, another rare feature in this kind of microcontroller? It is common to use "-Os" optimisation flag, so if that enables this type of switch implementation then it will turn up more. (Personally, I use -O2 rather than -Os on ARM - it rarely makes code much bigger, but often a good deal faster.)

Still, while I have not seen it myself in switch code, it is always possible for language support library calls to turn up in odd places when compiling code. On many targets, especially when optimising for space, they can turn up in things like function prologues or epilogues.

Reply to
David Brown

I don't know if my compiler generates Thumb1 or Thumb2 (or a mix of) code. Anyway I remember the libgcc function that is called is __gnu_thumb1_case_uqi.
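If the helper call comes from a table-based switch, one possible workaround is to write the dispatch as an if/else chain, as in the sketch below. The state names and `next_state` are hypothetical. GCC also has a `-fno-jump-tables` option; whether either approach removes the `__gnu_thumb1_case_uqi` call for a given compiler version and optimisation level is something to confirm in the disassembly or map file, since the optimiser is free to restructure the code.

```c
#include <stdint.h>

/* Hypothetical receiver states for illustration. */
enum rx_state { RX_IDLE, RX_ADDR, RX_LEN, RX_DATA, RX_CRC };

/* Next-state function written as a chain of comparisons instead of a
   switch, so the source does not ask for a table-driven dispatch. */
enum rx_state next_state(enum rx_state s)
{
    if (s == RX_IDLE)
        return RX_ADDR;
    else if (s == RX_ADDR)
        return RX_LEN;
    else if (s == RX_LEN)
        return RX_DATA;
    else if (s == RX_DATA)
        return RX_CRC;

    return RX_IDLE;    /* RX_CRC and any unexpected value restart the frame */
}
```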

Reply to
pozz

Since the Cortex families, there have been no ARM devices that support only Thumb-1 or mixtures of 32-bit ARM and Thumb-1. The Cortex-M are all Thumb-2. The Thumb-1 instructions were all 16-bit, while Thumb-2 supports those same 16-bit instructions plus a number of 32-bit instructions (eliminating the need for the 32-bit ARM instruction set).

There are, as far as I can see, some inconsistencies as to whether "Thumb-1" refers to the 16-bit instructions and "Thumb-2" refers to the 32-bit instructions, so that Cortex-M devices support Thumb-2 and Thumb-1, or whether "Thumb-1" refers to the instruction set consisting of only 16-bit instructions while "Thumb-2" refers to the instruction set consisting of a mix of 16-bit and 32-bit instructions (where the 16-bit instructions are, conveniently, the same as in "Thumb-1"). Adding to the confusion, the details of the instruction sets and the supported instructions vary by ARM architecture generation and by device.

Your compiler supports them all - though gcc 9 will deprecate some of the older ARMs. What is important to know is the type of core you have on the microcontroller, and make sure the compiler flags match it.
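One quick way to see what a given set of flags selects is to look at the compiler's predefined macros. The sketch below uses `__thumb2__`, `__thumb__` and `__arm__`, which GCC defines for ARM targets; `thumb_flavour` is a hypothetical helper name.

```c
/* Report which instruction set the current compiler flags select,
   based on GCC's predefined macros for ARM targets. */
const char *thumb_flavour(void)
{
#if defined(__thumb2__)
    return "Thumb-2";
#elif defined(__thumb__)
    return "Thumb-1 (16-bit Thumb)";
#elif defined(__arm__)
    return "ARM (32-bit instruction set)";
#else
    return "not an ARM target";
#endif
}
```

The same information can be dumped without writing any code by running the compiler with `-dM -E` on an empty input together with the usual `-mcpu=...` switch and searching the output for `thumb`.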

Reply to
David Brown

Your compiler command is missing the -mcpu=cortex-m4 (or -mcpu=cortex-m3) switch on the command line.

--

-TV
Reply to
Tauno Voipio

I'm using a Cortex-M0+ MCU, so I use the switch -mcpu=cortex-m0plus. I hope this is correct.

Reply to
pozz
