I am about to port an application from an AVR to an ARM. Most of the code is written in C but one particular driver is written in AVR assembler. I was wondering if a converter exists to translate AVR assembler to ARM assembler code. I find it easier to get to grips with a new assembler syntax by starting with an existing piece of code instead of writing it from scratch.
If you really want it in assembler, you can then look at the compiler output! But I expect C on the ARM will be faster than assembly on the AVR (unless you are deliberately running the ARM at a low clock speed).
On Mon, 01 Oct 2007 12:28:40 +0200, Meindert Sprang wrote:
I'm not sure that this kind of translator really exists. I already ported a custom driver (CompactFlash) from AVR to ARM. ARM assembler is much more complex than the AVR's, so I gave up on implementing it in ARM asm and did it in C.
I don't know. In my experience, some things are better written in assembler. In this case, the driver consists of an interrupt handler running at 19200 Hz on an 8 MHz AVR. On the ARM (running at 30 MHz), I want to have it run at 153,600 Hz simply because the handler (a software UART) needs to run at 38400 baud instead of 4800 baud. And this driver is written so compactly that I cannot believe a C compiler could do any better. A compiler can only make certain generic assumptions while I as a programmer know what the driver is supposed to do and I can for instance reserve certain registers to speed things up.
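To put rough numbers on the cycle budget being discussed, here is a small back-of-the-envelope sketch in C. The function names are made up for illustration, and the 4x oversampling factor is only inferred from the figures in the post (19200 Hz for 4800 baud), not stated by the poster.

```c
#include <assert.h>
#include <stdint.h>

/* Interrupt rate for a software UART that samples each bit
   `oversample` times (hypothetical helper, for illustration only). */
uint32_t isr_rate_hz(uint32_t baud, uint32_t oversample)
{
    return baud * oversample;
}

/* Clock cycles available between two consecutive interrupts.
   Integer division, so the result is rounded down. */
uint32_t cycles_per_interrupt(uint32_t cpu_hz, uint32_t isr_hz)
{
    assert(isr_hz > 0);
    return cpu_hz / isr_hz;
}
```

With the figures from the post, this works out to 416 cycles per interrupt on the 8 MHz AVR and 195 on the 30 MHz ARM, so the ARM handler has less than half the cycle budget per call.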
I wrote a 38400 baud software UART for an AVR with a 7.27 MHz clock, with interrupts at 153.6 kHz. With only 48 clock cycles between interrupts, it was worth writing the routine in assembly. But on an ARM using a fast interrupt, it should be practical to write the code in C if you're careful. Using assembly might save a little run time, since you can dedicate registers to specific data instead of loading them at each call to the interrupt. Perhaps your compiler will support allocation of globals directly to registers, which would save that step too.
Even if you decide to write the interrupt code in assembly, the original AVR code will be useless except perhaps as inspiration - you cannot translate assembly from one CPU to a completely different CPU and expect decent results.
Well, you may be right. But I think my advice stands: write it in C first and try it. Look at the assembly output and do some timings.
ARMs have the FIQ - I was able to get this running at >1 MHz, using C, on a ~40 MHz device. I don't know what ARM variant you are using, but the fastest is usually ARM mode (vs Thumb) running out of on-chip RAM.
As for being able to do better than the compiler - all I know is that gcc is better at ARM assembly than *I* am!
Seems to be a bad idea to run a S/W UART at 38,400 BAUD.
You can get at least 4 H/W UARTs with a SAM7A3 and at 30 MHz, it can run zero waitstates from flash. The SAM7 will have DMA support, so there should be very little overhead. You should be able to run this at Mbit speed without much hassle.
If you want to have a cheaper version, then you can use one of the SAM7S series chips, soon down to 16 kB code.
3 H/W USARTs with DMA support, and the last UART is best implemented using the Synchronous Serial Controller, which also has DMA support. One possible implementation is to let the transmitter run at one bit per clock, with manual insertion of start and stop bits, and then let the receiver work at a higher rate to allow multiple samples per bit. The RXD signal needs to be connected to an external interrupt, so that when you receive a character, the start bit triggers the external interrupt. The interrupt routine will disable the external interrupt and start a timer, which times out at the end of the stop bit.
The timer interrupt will analyze the received data to check for a false start bit, and if the start bit is OK, it will look at selected bits of the oversampled RXD. It will also reenable the external interrupt to prepare for the next character.
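The analysis step in the timer interrupt could look roughly like the sketch below. This is only an illustration under some assumptions of mine: an 8N1 frame, and samples already unpacked to one byte per sample (real SSC data would arrive packed into words). The function name `decode_8n1` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one 8N1 character from oversampled RXD levels.
   samples[0] is the first sample after the falling edge of the start
   bit; each element is 0 or 1. Returns the byte (0..255), or -1 on a
   false start bit, a missing stop bit, or an incomplete frame. */
int decode_8n1(const uint8_t *samples, size_t count, unsigned oversample)
{
    assert(oversample > 0);
    if (count < 10u * oversample)
        return -1;                      /* not a whole frame */

    size_t mid = oversample / 2;        /* centre sample of each bit */

    if (samples[mid] != 0)
        return -1;                      /* false start bit (glitch) */

    int value = 0;
    for (unsigned bit = 0; bit < 8; bit++) {
        /* Data bit n sits one bit time after the start bit. */
        if (samples[(bit + 1u) * oversample + mid])
            value |= 1 << bit;          /* UART data is LSB first */
    }

    if (samples[9u * oversample + mid] != 1)
        return -1;                      /* stop bit missing */

    return value;
}
```

Only one sample per bit is inspected here, the one nearest the bit centre. A majority vote over the middle samples would be a natural refinement if the line is noisy.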
Obviously, the SSC UART should run at a higher priority than the H/W UARTs.
Will you have yet another communication channel to multiplex the UART data (USB)?
I think you will have little luck finding an automatic translator. Your application works near the limits of the source platform, and is probably structured (and highly optimized) towards it. It may even depend on cycle-exact behaviour.
At best, the output of such a translator would be only a rough approximation of your final result.
It's probably quicker to split the problem into two parts. The core part (which detects edges and samples the bits) should be done in hand-written ARM assembler. The rest (which doesn't have any critical timing requirement) could be done in C, or with an automatic translator if you find one.
Note that often changes to the algorithm help improve the performance. For your application I see two options for performance improvements:
If you have enough edge detectors and timer interrupts, you don't have to oversample the RX channels. You can detect the start bits using the edge-detector, and then program timer-interrupts to catch the middle of each received bit. On ARM, there's just one (well, two) interrupt vector, so you can handle all timer interrupts at once and thus avoid adding up latency to an otherwise very slow worst case.
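For the edge-detector approach just described, the timer values for the bit centres are easy to compute. A minimal sketch, assuming a free-running timer clocked at `timer_hz` and an 8N1 frame; the function name and parameters are hypothetical and not tied to any particular ARM timer peripheral:

```c
#include <assert.h>
#include <stdint.h>

/* Timer ticks from the start-bit edge to the centre of data bit `bit`
   (0 = first data bit). The centre of data bit n lies n + 1.5 bit
   times after the edge. Computed in half-bit units and rounded to the
   nearest tick; 64-bit intermediate avoids overflow. */
uint32_t bit_centre_ticks(uint32_t timer_hz, uint32_t baud, unsigned bit)
{
    assert(baud > 0);
    uint64_t half_bits = 2u * (uint64_t)bit + 3u;   /* 3, 5, 7, ... */
    return (uint32_t)((half_bits * timer_hz + baud) / (2u * (uint64_t)baud));
}
```

Computing each sample point from the original edge, rather than adding one bit period at a time, keeps rounding errors from accumulating across the character.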
If you prefer the oversampling method, and have enough memory, there's no need to do the processing within the interrupt. You can just sample the pins to a circular memory buffer and do offline processing. The buffer should be large enough to hold at least one complete RX symbol. An EOR run reveals edges (start bit). Once found, the middles of each bit can be read at a fixed memory offset. Using this method, the time-critical ASM portion of your code is about