general questions regarding ARMs

Hi - I'm fairly certain I'm going to use an ARM in a project I'm
working on. I have a good deal of experience with Atmel AVRs, but I
have never used an ARM before. So I was wondering if you all could
answer a couple basic questions regarding ARMs:

1. programming: My understanding of programming an ARM is that you do
it with a JTAG cable. I was looking at this one, specifically:
http://olimex.com/dev/arm-jtag.html as it is fairly inexpensive. Is
this a good plan? Bad plan? I've also heard of boot-loading. Is this
possible on Atmel ARMs? If it is, does this only temporarily load code
into the chip, or permanently load it in the flash?

2. boot process: when an AVR turns on it begins by executing the first
line of code in the flash - the reset interrupt. Is this how ARMs work?
I noticed that the ARMs that I was looking at can only access their
flash in a single cycle at about half the clock speed - so how would
executing code work for this? I mean how would it run at double that
speed if it couldn't access the code? Does all code get loaded into the
RAM and executed there?

3. operating systems: in my experience with AVRs I've never had an
operating system running onboard. Is this what is done on ARMs
generally? I've noticed that there is ARM Linux - could that run on a
55 MHz Atmel ARM based on an ARM7TDMI with 64 KB of memory and 256 KB
of flash? Would this even make sense?

4. DMA: the chip I'm using boasts about its Peripheral DMA Controller
(PDC). It says that it greatly reduces the overhead on the processor.
But reading about it, I am having trouble understanding how DMA differs
from no DMA.

Sorry for asking so many questions - I'm just very new to the ARM
world. Thanks!

-Michael J. Noone


Re: general questions regarding ARMs

In the case you describe, it means execution speed is going to
be limited by the flash bandwidth.  If you pick one with a
cache, that can make a huge difference.   Many ARM setups run
code from RAM because it's faster.

You can do that if that's what you want to do.  It does tend to
speed things up.

Some do, some don't.  Depends on resources and requirements.

No.

By "memory" I presume you mean RAM?

Are you asking what DMA is?  

--
Grant Edwards                   grante             Yow!  .. I want to perform
                                  at               cranial activities with
Re: general questions regarding ARMs
First of all sorry for not replying inline - I haven't figured out how
to do that with Google Groups just yet, and I was forced to switch to
google groups due to an ISP change.

By memory I do indeed mean RAM.

I suppose yes - I'm asking what DMA is. My embedded experience is with
AVRs only - and I've never encountered DMA when working with them.
Reading through the datasheet about DMA I was having trouble figuring
out how it was different from how an AVR handled things and what
benefit it provided.

Thanks,

-Michael


Re: general questions regarding ARMs

Hmm. I've seen instructions on how to follow-up properly using
GG, but I don't have a link handy.

DMA allows you to set up a block transfer operation to transfer
a block of memory from one place to another.  You
configure the source address, destination address, and count by
writing to registers in the DMA controller.  Then you tell the
DMA controller to "go", and the data is transferred while the
CPU goes off and does other things. The DMA controller will use
bus cycles that are idle or it will pause the CPU and "steal"
bus cycles if it has to.

When the transfer is done, you can usually configure the
controller to cause an interrupt.

It's basically a memcpy() operation that's done by hardware, in
the background.  Typically the source and destination pointers
can be configured to auto-increment (typically for blocks of
memory) or not (typically used for peripheral I/O registers).

Let's say I have a buffer containing 4K of data that I want to
send out the UART.  I do something like this:

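DmaController.sourcePtr = buffer;           /* start of the 4K buffer */
DmaController.sourceMode = AutoIncrement;   /* step through the buffer */
DmaController.destPtr = &UART.txDataReg;    /* address of the transmit register */
DmaController.destMode = NoIncrement;       /* always write the same register */
DmaController.trigger = UARTtx;             /* one transfer per UART transmit request */
DmaController.commandReg = Start;           /* tell the controller to "go" */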

Then I go off and do whatever else I want to do while the DMA
controller sends the buffer full of data out the UART.

Similar for receiving data or for copying a block of memory
from one place to another in RAM.
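
A minimal host-side sketch of the same idea in C, in case it helps to see the mechanics. It only simulates what the hardware would do on each peripheral request; the `dma_channel` structure, its field names, and `dma_step` are all made up for illustration and do not match any real controller's registers. Note that it includes the transfer count, which the controller needs in order to know when to stop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical DMA channel model; the fields mirror the pseudocode
   above and are not the registers of any real controller. */
typedef enum { NO_INCREMENT, AUTO_INCREMENT } dma_mode;

typedef struct {
    const uint8_t *source_ptr;
    dma_mode       source_mode;
    uint8_t       *dest_ptr;
    dma_mode       dest_mode;
    size_t         count;   /* bytes left to transfer */
    int            done;    /* set when count hits zero; real hardware
                               would typically raise an interrupt here */
} dma_channel;

/* Move one byte, as the hardware would on each peripheral request.
   Returns 1 if a byte was moved, 0 if the channel is idle. */
int dma_step(dma_channel *ch)
{
    assert(ch != NULL);
    if (ch->count == 0)
        return 0;
    *ch->dest_ptr = *ch->source_ptr;
    if (ch->source_mode == AUTO_INCREMENT)
        ch->source_ptr++;
    if (ch->dest_mode == AUTO_INCREMENT)
        ch->dest_ptr++;
    if (--ch->count == 0)
        ch->done = 1;
    return 1;
}
```

With the source set to auto-increment and the destination fixed, repeated calls walk through the buffer while always writing the same "register", which is the UART transmit case. With both sides auto-incrementing it behaves like a memcpy().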

--
Grant Edwards                   grante             Yow!  I'm young... I'm
                                  at               HEALTHY... I can HIKE
Re: general questions regarding ARMs

  DmaController.count = 4096;
  
Oops, forgot a step.

--
Grant Edwards                   grante             Yow!  They don't hire
                                  at               PERSONAL PINHEADS,
Re: general questions regarding ARMs
Thank you Grant - that was a very clear explanation. That really clears
things up for me.

-Michael


Re: general questions regarding ARMs

I've never used it myself, but IIRC, you click on something called
"show options" which reveals a different "Reply" button, and use that
rather than the "Reply" button at the bottom of the article.

HTH,

                               -=Dave
--
Change is inevitable, progress is not.

Re: general questions regarding ARMs

Excellent - I had just assumed that reply was the same as at the bottom
of each post, which isn't inline. I like how clearly it's labeled.


Re: general questions regarding ARMs

Static or SDRAM?  Most ARMs come with kilobytes of static RAM, but up to
hundreds of megabytes of external SDRAM.  You can load into RAM over JTAG
or boot-load from external flash.

AVR memories are internal.  ARM memories are usually external.  DMA
means transferring from another external device without using the CPU.

Usually faster.



Re: general questions regarding ARMs


From the datasheet: "Internal High-speed SRAM, Single-cycle Access at
Maximum Speed". By external do you mean a separate chip?

Thanks,

Michael


Re: general questions regarding ARMs

Internal SRAM is usually 32K to 64K.

Yes, external SDRAM chips up to 256M; most ARMs have an SDRAM controller
only.  Some can stack dies, which can be argued to be internal or
external.


Re: general questions regarding ARMs

You click "Show Options" near the top, which expands to show... er...
options. One of those options is Reply - if you click it, you get a
quoted reply window.

The reply link at the bottom is semi-broken, it seems like it only
works properly once in a session.


Re: general questions regarding ARMs

You can program through a serial interface, JTAG, or a parallel
programmer. This is not ARM specific but more vendor specific and, most
of all, tool specific. Using the high-end tools will give you the most
headaches because they usually do not provide software for direct
download into flash (e.g. Green Hills and ARM). Compilers from IAR and
Keil, however, support many ARM devices from different vendors, and you
can directly download your code into the flash. AFAIK this is possible
with Atmel flash.

Yes
This depends on the memory (Flash) implementation. Wider memory
interfaces such as the Philips 128-bit wide flash provide much faster
execution than a 32-bit wide flash as implemented on the newer Atmel
SAM7S devices and even more so than on the older devices which only
have a 16-bit bus to the flash.
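
To put rough numbers on the width argument, here is a back-of-the-envelope sketch in C. It assumes a deliberately simplified model: each flash access delivers one bus-width of instruction bits, and branches, data accesses, and any prefetch buffer are ignored. The function name and the example figures (a 55 MHz core with flash usable at half that rate, taken from the original question) are illustrative only.

```c
#include <assert.h>

/* Upper bound on the sustained instruction fetch rate, in millions of
   instructions per second, under a simplified model: every flash
   access returns flash_width_bits of sequential code, and the core can
   never run faster than one instruction per CPU clock. */
double fetch_limit_mips(unsigned flash_width_bits,
                        unsigned insn_width_bits,
                        double flash_access_mhz,
                        double cpu_mhz)
{
    assert(insn_width_bits > 0);
    double insns_per_access = (double)flash_width_bits / insn_width_bits;
    double limit = insns_per_access * flash_access_mhz;
    return limit < cpu_mhz ? limit : cpu_mhz;
}
```

Under these assumptions, 32-bit ARM code on a 32-bit flash at 27.5 MHz tops out at 27.5 MIPS on a 55 MHz core, a 16-bit flash halves that again, and 16-bit Thumb code on the 32-bit flash could in principle keep the core fed. Real parts will differ, so treat this as an estimate and check the datasheet's wait-state table.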

In an embedded environment you usually execute from both memories. If
your flash is slower than the clock rate and narrow, you should limit
the code executed from there to non-realtime-critical code. You are
talking about Linux, then all your code has to be non-realtime-critical
;-)

No way! You need approx. 4 MB of RAM to run Linux, preferably 16 MB+.

Basically DMA does in one cycle (get some data from location a and
store it in location b) what otherwise needs a whole lot more cycles, as
the ARM architecture is a RISC architecture. That means all data
transfers go through registers; there are no memory-to-memory transfers
(that's what the DMA is doing).




Re: general questions regarding ARMs

Well I should mention I'm a college student (3rd year EE at UIUC) so I
was hoping to use some of the free tools out there. What options would
that leave me?

Right now I'm planning on using a SAM7X.

Oh - everything needs to be realtime for this. I thought there were
some special Linux kernels designed to run real-time code?

Any idea if that much RAM could be added to an Atmel ARM? I skimmed
through the datasheet to see if it had any sort of external memory
controller - and I didn't see one, but I might not be looking for the
right thing.

Got it. Thanks for your help!

-Michael


Re: general questions regarding ARMs

JTAG is a wonderful way to debug. I found that the "Angel debug
monitor" on the Atmel eval boards seemed to have some issues with IAR
embedded workbench.  With JTAG, loading and debugging is very
straightforward.  I got the example project for uC/OS-II from the
web and the book running quite easily, even though I don't have much
experience and the example code had some new features that were not
supported in the older OS source code from the book.  I made some
slight change to allow it to compile, then it seemed to run fine.

I suggest that you go to the IAR website and look at their kickstart
kits.  That includes the JTAG interface and development board.  The
free IAR tools are limited to 32kB code size, which will get you
started.  It seems to me that the GCC toolchain is only a realistic
option if you have a lot of patience.  I also believe that its code
efficiency is not the best relative to the specialist embedded
compilers.

best regards,
Johnny.



Re: general questions regarding ARMs

The JTAG interface is a backdoor into the processor chip at
the boundary between the processor core and on-chip peripherals.
On an AT91xxxxx chip this is the boundary between the ARM part
and Atmel part of the chip.

You can read and write the processor registers and make
the processor run single instructions or start the processor
running. The JTAG loaders use this to write a
Flash writing program and the desired Flash contents into
the system RAM and then use the writing program to copy
the contents into the Flash chip.
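
As a rough illustration of the second half of that process, here is a host-side C sketch of what a RAM-resident flash-writing stub does once the image has been staged in RAM. Everything in it is an assumption made for the example: the 256-byte page size, the function names, and the fact that "flash" is just an ordinary array. A real stub would issue the device's own erase and program commands instead of `memcpy`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FLASH_PAGE_SIZE 256u   /* hypothetical page size */

/* Stand-in for the device's page-program command. */
static void flash_write_page(uint8_t *flash, size_t page, const uint8_t *data)
{
    memcpy(flash + page * FLASH_PAGE_SIZE, data, FLASH_PAGE_SIZE);
}

/* Copy an image staged in RAM into flash one page at a time. A short
   final page is padded with 0xFF. Returns the number of pages written,
   or 0 if the image does not fit. */
size_t flash_load_image(uint8_t *flash, size_t flash_size,
                        const uint8_t *image, size_t image_size)
{
    uint8_t page_buf[FLASH_PAGE_SIZE];
    size_t pages = 0;

    assert(flash_size % FLASH_PAGE_SIZE == 0);
    if (image_size > flash_size)
        return 0;

    for (size_t off = 0; off < image_size; off += FLASH_PAGE_SIZE) {
        size_t n = image_size - off;
        if (n > FLASH_PAGE_SIZE)
            n = FLASH_PAGE_SIZE;
        memset(page_buf, 0xFF, sizeof page_buf);
        memcpy(page_buf, image + off, n);
        flash_write_page(flash, off / FLASH_PAGE_SIZE, page_buf);
        pages++;
    }
    return pages;
}
```

The JTAG side of the job is then just: write this routine and the image into RAM, point the program counter at the routine, and let the processor run.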

It's a bit more complicated with ARMs. The processor starts
by fetching the first instruction at address 0 which should
for this purpose be in permanent memory (e.g. Flash). On the
other hand, the processor exception vectors are in the lowest
8 fullwords (32 bytes) of memory, and it's desirable to have
them in writable memory.

The dilemma is solved by the EBI (External Bus Interface) which
locates the first chip select (CS0-) at address zero after a
hardware reset. The chip select registers can then be programmed
by the Flash code, and the new values taken into use with the
Cancel Remap EBI command. The change is a bit tricky, as it's
not desirable to lose the code memory during the switch.
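
For reference, the vector layout and the effect of the remap can be sketched in C as below. The vector offsets are the standard ARM7TDMI ones; the `bus_model` structure and `decode_low_address` are a made-up toy model of the address decoder, not the actual EBI register interface.

```c
#include <assert.h>
#include <stdint.h>

/* ARM7TDMI exception vector offsets from address 0:
   eight words, 32 bytes in total. */
enum {
    VEC_RESET          = 0x00,
    VEC_UNDEF          = 0x04,
    VEC_SWI            = 0x08,
    VEC_PREFETCH_ABORT = 0x0C,
    VEC_DATA_ABORT     = 0x10,
    VEC_RESERVED       = 0x14,
    VEC_IRQ            = 0x18,
    VEC_FIQ            = 0x1C
};

typedef enum { MEM_FLASH, MEM_RAM } mem_kind;

/* Hypothetical decoder state: 0 after hardware reset, 1 once the boot
   code has issued the remap command. */
typedef struct { int remapped; } bus_model;

/* Which memory answers an access to the vector area. */
mem_kind decode_low_address(const bus_model *bus, uint32_t addr)
{
    assert(addr < 0x20);
    return bus->remapped ? MEM_RAM : MEM_FLASH;
}
```

Before the remap, the reset vector is fetched from flash, so the chip can boot. After it, the same addresses land in RAM, so the vectors can be rewritten at run time. The boot code has to be executing from an address that stays valid across the switch, which is the tricky part mentioned above.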

The memory is too small even for a uClinux version. Full Linux
needs a memory management unit (MMU), and of the Atmel chips
only the AT91RM9200 has one.

The PDC makes it possible to run autonomous serial block
I/O on the Atmel chips. It's not a general DMA unit.

On the ARM7TDMI, much of the DMA-like functionality can
be implemented using the FIQ Fast interrupt.

Welcome - an ARM is not an impossible beast to tame.
Been there - done that.

HTH

--

Tauno Voipio
tauno voipio (at) iki fi


Re: general questions regarding ARMs
"On the ARM7TDMI, much of the DMA-like functionality can
be implemented using the FIQ Fast interrupt. "

Think so? The worst-case FIQ latency on the ARM7TDMIs I've worked on is
large enough to make "software DMA" pretty impractical, which is the
reason I thought manufacturers are offering PDCs and the like on these
devices.


Re: general questions regarding ARMs

Of course, it's not as fast as the real cycle-stealing DMA,
but it's fast enough for e.g. floppy data.

I'm using it on an AT91R40008 running at 4 MHz (yes, less than
10 % of rated, due to power limitations) and collecting data
from an A/D at hundreds of kHz. On another construction, FIQ
fits the bill in refreshing an LCD panel at 70 kHz.

--

Tauno Voipio
tauno voipio (at) iki fi


Re: general questions regarding ARMs
Being able to collect A/D at hundreds of kHz at 4 MHz using FIQ is
excellent (assuming you have time left over to do something useful
during the collection).

The ARM I'm currently working with (ADuC702x) has a 1 MHz 12-bit A/D and
a 50-cycle worst-case FIQ latency. At the top speed of 40 MHz that's
1.25 usec; at 4 MHz it would be 12.5 usec, nowhere near fast enough to
process an A/D at hundreds of kHz using FIQ like you are. I have to sit
there and poll the A/D to collect the data at that rate.
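
The arithmetic behind those figures, as a small C sketch. The function names are made up for the example, and the rate bound assumes every event costs the full cycle count and that nothing else runs in between.

```c
#include <assert.h>

/* Time taken by a given number of CPU cycles, in microseconds. */
double cycles_to_usec(unsigned cycles, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return cycles / clock_mhz;
}

/* Highest event rate, in kHz, that can be serviced if each event costs
   cycles_per_event CPU cycles and the CPU does nothing else. */
double max_event_rate_khz(unsigned cycles_per_event, double clock_mhz)
{
    assert(cycles_per_event > 0);
    return clock_mhz * 1000.0 / cycles_per_event;
}
```

With 50 cycles per interrupt, a 4 MHz core cannot keep up with more than 80 kHz even in the best case, so sampling at hundreds of kHz by interrupt is only possible if the per-sample cost is far below 50 cycles.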


Re: general questions regarding ARMs

It seems that your FIQ handler is in some way fishy
(or the base code keeps FIQ off for long times).

Here's my fiq handler for the data collection:
(It's written in GCC embedded assembly code)


    /*   FIQ bank register usage:
    
         r8,  -> destination buffer, -> AIC set request
         r9, pixel pair counter
         r10, -> I/O read port
         r11, -> PIO output reset port
         r12, capture control bit
         r13, work register
         r14, return link register      */
        
        ".code 32\n"
        
        "fiq1:\t"
        
        "ldr r13,[r10]\n\t"     /* get a pixel pair */
        "str r13,[r8],#4\n\t"   /* save it - bump pointer */
        
        "subs r9,r9,#1\n\t"     /* bump count - done? */
        "subnes pc,lr,#4\n\t"   /*  no - dismiss */
        
        "str r12,[r11]\n\t"     /* stop capture & FIQ */
        "ldr pc,fiq2\n"         /* continue to completion */
        "fiq2:\t.word fiq4\n"   /* -> completion handler */

        "fiq3:\n"               /* handler end label */

    /*   Fast interrupt completion in normal text segment   */
    /*   ------------------------------------------------   */

        "fiq4:\n\t"
        "ldr r8,fiq5\n\t"       /* -> AIC base */
        "mov r13,%2\n\t"        /* get AIC SWI bit */
        "str r13,[r8,%3]\n\t"   /* cause SWI */
        
        "subs pc,lr,#4\n"       /* dismiss the FIQ */

        "fiq5:\t"
        ".word at91aic\n\t"     /* -> AIC base */

------

The core handling is at the FIQ vector address (byte address 0x1c).

There are just 4 instructions in the fast loop: ldr for reading
the data, str for storing it, subs for counting and subnes for
dismissing the FIQ. The execution time is far under 50 clocks:
after the initial interrupt latency, it should use 5 clocks.
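
For readers who don't read ARM assembly, the logic of the fast path can be modelled in C roughly as follows. This is only a host-side model: the `capture_state` structure stands in for the banked FIQ registers listed in the comment block, its field names are invented here, and the completion step that raises a software interrupt through the AIC is reduced to a return value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the banked FIQ registers r8..r12. */
typedef struct {
    volatile const uint32_t *read_port;  /* r10: I/O read port */
    uint32_t                *dest;       /* r8:  destination buffer */
    uint32_t                 remaining;  /* r9:  pixel pair counter */
    volatile uint32_t       *stop_port;  /* r11: PIO output reset port */
    uint32_t                 stop_bit;   /* r12: capture control bit */
} capture_state;

/* One FIQ: read a sample, store it, count down. On the last sample,
   stop the capture. Returns 1 when the block is complete (where the
   assembly goes on to request a software interrupt), otherwise 0. */
int capture_fiq(capture_state *s)
{
    assert(s->remaining > 0);
    *s->dest++ = *s->read_port;      /* ldr + str with post-increment */
    if (--s->remaining != 0)         /* subs */
        return 0;                    /* subnes pc,lr,#4 */
    *s->stop_port = s->stop_bit;     /* stop capture & FIQ */
    return 1;
}
```

The assembly version gets away with so few instructions because FIQ mode has its own copies of r8 to r14, so the handler's working state survives between interrupts without any save or restore.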

--

Tauno Voipio
tauno voipio (at) iki fi

