A timer driver for Cortex-M0+... it rarely doesn't work

I don't know whether this issue is specific to Atmel SAMC21 Cortex-M0+
devices or not.

I wrote a simple timer driver: a 32-bit hw counter clocked at 875kHz
(14MHz/16) that triggers an interrupt on overflow (every 1h 21').  In
the interrupt I increment the 32-bit global variable _ticks_high.
The 64-bit number composed of _ticks_high (upper 32 bits) and the hw
counter (lower 32 bits) is my system tick.  This 64-bit software
counter, incremented at 875kHz, will never overflow during my lifetime,
so it's good :-)
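
As a rough sanity check of the figures above, the wrap period can be
worked out on a PC.  This is only a sketch: wrap_seconds is a
hypothetical helper, not part of the driver, and the only input taken
from the post is the 875kHz clock.

```c
#include <assert.h>
#include <stdint.h>

#define TIMER_FREQ 875000u   /* counter clock in Hz (14 MHz / 16), from the post */

/* Whole seconds before a counter of the given width wraps at hz ticks/s.
   wrap_seconds is a hypothetical helper used only for this check. */
static uint64_t wrap_seconds(unsigned bits, uint32_t hz) {
    assert(bits >= 1 && bits <= 64 && hz != 0);
    /* Largest value the counter can hold; the shift is only valid below 64. */
    uint64_t max = (bits == 64) ? UINT64_MAX : (((uint64_t)1 << bits) - 1);
    return max / hz;
}
```

wrap_seconds(32, TIMER_FREQ) gives 4908 s, i.e. about 1h 21' 48", which
matches the overflow period quoted above.  The 64-bit figure works out
to several hundred thousand years.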

In timer.h I have:
--- Start of timer.h ---
#include <stdint.h>
#include <io.h>  // Atmel specific

#define TIMER_FREQ    875000

typedef uint64_t Timer;
extern uint32_t _ticks_high;

#define volatileAccess(v) *((volatile typeof((v)) *) &(v))

static inline uint64_t ticks(void) {
   uint32_t h1 = volatileAccess(_ticks_high);
   TC0->COUNT32.CTRLBSET.reg =
        TC_CTRLBSET_CMD(TC_CTRLBSET_CMD_READSYNC_Val);
   uint32_t l1 = TC0->COUNT32.COUNT.reg;
   uint32_t h2 = volatileAccess(_ticks_high);
   if (h2 != h1) return ((uint64_t)h2 << 32) + 0;
   else          return ((uint64_t)h1 << 32) + l1;
}

static inline void TimerSet(Timer *tmr, uint64_t delay) {
   /* delay is in ms */
   *tmr = ticks() + delay * TIMER_FREQ / 1000;
}

static inline int TimerExpired(Timer *tmr) {
    return ticks() >= *tmr;
}
--- End of timer.h ---

In timer.c I have the ISR:
--- Start of timer.c ---
...
void TC0_Handler(void) {
   if (TC0->COUNT32.INTFLAG.reg & TC_INTFLAG_OVF) {
     ++_ticks_high;
     TC0->COUNT32.INTFLAG.reg = TC_INTFLAG_OVF;
   }
}
...
--- End of timer.c ---

The idea is simple (and stolen from a post that appeared on this
newsgroup).  In principle, the 64-bit software counter has to be
calculated with interrupts disabled, because if a timer interrupt fires
during the calculation, the resulting software counter could be wrong.
By reading _ticks_high before and after reading the hw counter, we can
avoid disabling interrupts.


Now I have some code that uses these timers. It's a simple state machine
that manages communication over a bus.
--- Start of bus.c ---
...
int bus_task(void) {
   switch(bus.state) {
     case BUS_IDLE:
       if (TimerExpired(&bus.tmr_answer)) {
         /* Send new request on the bus */
         ...
         TimerSet(&bus.tmr_answer, timeout_answer);
         bus.state = BUS_WAITING_ANSWER;
       }
       break;

     case BUS_WAITING_ANSWER:
       if (TimerExpired(&bus.tmr_answer)) {
         /* No reply */
         bus.state = BUS_IDLE;
         TimerSet(&bus.tmr_answer, 0);
       } else {
         if (reply_received() == true)    {
           /* Analyze the reply */
           bus.state = BUS_IDLE;
           TimerSet(&bus.tmr_answer, 0);
         }
       }
       break;
   }
   return 0;
}
...
--- End of bus.c ---

I don't think I need to explain the code in bus.c. The only thing worth
noting is that bus_task() is called continuously in the main loop.

99% of the time this code works well. Unfortunately I have seen some
strange events. Rarely (very rarely, about once a week) the bus seems
frozen for a while. After that it magically resumes normal activity.
One thing ties these strange events to the timer driver: the bus stall
lasts exactly 1h 21', the overflow period of the hw counter.

I suspect there's a problem in my low-level driver and sometimes, maybe
near the overflow, the code doesn't work as I expect.
Maybe TimerSet() sometimes stores a wrong value in the uint64_t timer,
perhaps a tick value that will only be reached at the next overflow of
the hw counter.

Do you see where the problem is?


Re: A timer driver for Cortex-M0+... it rarely doesn't work
[snip]

Does changing the optimisation level change the problem ?

Does a review of the generated code using objdump match what you
would expect ?

Have you tried placing some kind of debug marker in TC0_Handler()
(maybe turning on an LED) to see if TC0_Handler() is called without
the interrupt flag being set ?

This last one is in case there's some kind of rare timing issue
which causes the overflow handler to be called without TC_INTFLAG_OVF
being set yet when INTFLAG is examined.

Simon.

--  
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world

Re: A timer driver for Cortex-M0+... it rarely doesn't work
On 27/04/2017 01:24, Simon Clubley wrote:

As you can imagine, it's very difficult to run a test, because I have
to wait for a rare event that may only happen after a week.

Anyway, I think yes, disabling optimisation should solve it, but that's
not a solution.


Do you mean reading the output listing with the assembler instructions?
I'm not an expert in ARM assembler, but I tried to read it and it seems
correct.


No, it would be very strange. The address of TC0_Handler() is stored
only in the vector table, so it is called only when the TC0 peripheral
requests an interrupt.
Anyway, if the interrupt flag is not set, the if in TC0_Handler() means
_ticks_high isn't incremented.

There's an if, so the increment of _ticks_high wouldn't be done.


Re: A timer driver for Cortex-M0+... it rarely doesn't work


Change the timer clock so it overflows faster than every 1h 21'...

Bye Jack

Re: A timer driver for Cortex-M0+... it rarely doesn't work
First you need to make _ticks_high volatile.  It doesn't make any sense
to me why you'd cast it volatile in one execution context but not the  
other.  If it's shared between two execution contexts, the compiler  
needs to know that.

It sounds like you have a concurrent access problem that you need to  
protect against.  If it was me, I'd disable interrupts instead of your  
trick.

JJS

Re: A timer driver for Cortex-M0+... it rarely doesn't work
On 27/04/2017 02:17, John Speth wrote:

_ticks_high is accessed in TimerSet(), TimerExpired() and the ISR
TC0_Handler().  TimerSet() and TimerExpired() read _ticks_high by
calling the ticks() function.  So _ticks_high is accessed in only two
places: ticks() and TC0_Handler().

ticks() is called during normal background flow (not interrupt), so the
access to _ticks_high must be volatile.
TC0_Handler() is the ISR, so I think a volatile access to _ticks_high is
not necessary (the ISR can't be interrupted).


I liked this trick because it avoids disabling interrupts. TimerExpired()
is called often.

I found the post (from Wouter van Ooijen) where this trick is suggested:
https://groups.google.com/d/msg/comp.arch.embedded/9d8I5FFbmX4/6yL33UR92F8J

Don Y suggested other improvements in the same thread.

Re: A timer driver for Cortex-M0+... it rarely doesn't work
On 27/04/17 09:37, pozz wrote:

"volatile" does not mean that a variable is shared between two execution
contexts - it is neither necessary nor sufficient to make such sharing work.

To get shared accesses right, you need to be sure of what is accessed at
what times - it is the /accesses/ to the variable that need to be
volatile.  Marking a variable "volatile" is simply a shortcut for saying
that /all/ accesses to it must be volatile accesses.


Yes, the volatile accesses here are fine.  Omitting the volatile for
_ticks_high does not give any benefits or disadvantages, because you are
only doing a single read and write of the variable anyway - there is
very little room for the compiler to optimise.  The compiler can move
the non-volatile accesses to _ticks_high to after the clearing of the
overflow flag, maybe saving a cycle or two if it is a Cortex M7 that can
benefit from more sophisticated instruction scheduling.  But otherwise,
it does not matter one way or the other.



Re: A timer driver for Cortex-M0+... it rarely doesn't work

So true, David.  I noticed that a variable shared between two  
asynchronous execution contexts can benefit by volatile declaration.  I  
would instinctively declare it volatile because it's the right thing to  
do in this case.  It won't solve the OP's problem though.

JJS


Re: A timer driver for Cortex-M0+... it rarely doesn't work
On 27/04/17 19:24, John Speth wrote:

You say you agree with me - then it looks like you completely /disagree/.

"Instinctively declaring it volatile" is the /wrong/ thing to do when  
you share a variable between two contexts.  It is wrong, because it is  
often not needed, but hinders optimisation.  It is wrong, because it is  
often not enough to make it volatile.  And it is wrong, because  
"instinctively" suggests you make it volatile without thought, rather  
than properly considering the situation.


It is certainly the case that making a shared variable volatile can  
often be part of the solution - but no more than that.


Re: A timer driver for Cortex-M0+... it rarely doesn't work
On 27/04/17 00:55, pozz wrote:

Identifiers that begin with an underscore are reserved for file scope -
you are not supposed to use them for "extern" variables.  I would be
extremely surprised to find that a compiler treated them in any special


That looks familiar :-)

(I don't know what the CTRLBSET stuff is for - I am not familiar with
Atmel's chips here.)

If the low part of the counter rolls over and the interrupt has not run
(maybe interrupts are disabled, or you are already in a higher priority
interrupt, or interrupts take a few cycles to work through the system)
then this will be wrong - l1 will have rolled over, but h2 will not show
an updated value yet.

So if the high parts don't match, you need to re-read the low part and
re-check the high parts.  It is perhaps easiest to express in a loop:

static inline uint64_t ticks(void) {
    uint32_t h1 = volatileAccess(_ticks_high);
    while (true) {
        uint32_t l1 = TC0->COUNT32.COUNT.reg;
        uint32_t h2 = volatileAccess(_ticks_high);
        if (h1 == h2) return ((uint64_t) h2 << 32) | l1;
        h1 = h2;
    }
}

You can also reasonably note that this loop is not going to be re-run
more than once, unless you have interrupt functions that last for an
hour and a half, or something similarly bad - and then the failure of
the ticks() function is the least of your problems!  Then you can write:

static inline uint64_t ticks(void) {
    uint32_t h1 = volatileAccess(_ticks_high);
    uint32_t l1 = TC0->COUNT32.COUNT.reg;
    uint32_t h2 = volatileAccess(_ticks_high);
    if (h1 != h2) l1 = TC0->COUNT32.COUNT.reg;
    return ((uint64_t) h2 << 32) | l1;
}



When you want a boolean result, return "bool", not "int".  (This is not
your problem, of course - it's just good habit.)

An alternative method here is to just use the low 32-bit part (with
delays limited to 2^31 ticks), but be sure that you have considered
wraparound.  Keep everything in unsigned at first, to make overflows
work as modulo arithmetic:

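typedef uint32_t Timer32;

static uint32_t ticks32(void) {
    return TC0->COUNT32.COUNT.reg;
}

static inline void TimerSet32(Timer32 *tmr, uint32_t delay) {
    /* delay is in ms; widen to 64 bits so that delay * TIMER_FREQ
       cannot overflow before the division */
    *tmr = ticks32() + (uint32_t)((uint64_t)delay * TIMER_FREQ / 1000);
}

static inline bool TimerExpired32(Timer32 *tmr) {
    /* unsigned subtraction wraps modulo 2^32 */
    int32_t d = (int32_t)(ticks32() - *tmr);
    return (d >= 0);
}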

Note that doing the subtraction as unsigned, then converting to signed
int, gives you the wraparound behaviour you need.  Converting an
unsigned int to a signed int when the result is out of range is
implementation-defined behaviour in C, but compilers such as gcc
define it as modulo arithmetic:
<https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html>
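
The comparison described above can be tried on a PC by passing the
tick values in explicitly.  This is a minimal sketch: deadline_reached
is a hypothetical name, and the cast assumes the modulo conversion that
gcc documents.

```c
#include <stdbool.h>
#include <stdint.h>

/* True once `now` has reached or passed `deadline`, modulo 2^32.
   Only meaningful while the two are less than 2^31 ticks apart. */
static bool deadline_reached(uint32_t now, uint32_t deadline) {
    /* Unsigned subtraction wraps; the conversion to int32_t is
       implementation-defined in C and modulo 2^32 on gcc. */
    int32_t d = (int32_t)(now - deadline);
    return d >= 0;
}
```

With a deadline of 0xFFFFFFF0 and a counter that has since wrapped to
0x00000005, a plain `now >= deadline` would report "not expired",
while the subtraction-based test reports "expired".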





"Good artists copy, great artists steal", as someone famous once said.


It sounds like you are missing a tick interrupt - maybe you are somehow
blocking the overflow interrupt on occasion.  Maybe the CTRLBSET line is
the problem (since I don't know what it does!)  But it could be the
problem I noted above in ticks().

Diagnosing problems like this is a real pain - and demonstrating that
you have fixed them is even worse.  The key is to find some way to speed
up the hardware timer so that you can provoke the problem regularly -
then you can be sure when you have fixed it.  Are you able to make this
timer overflow at a lower bit count - say, 12 bits rather than 32 bits?
 If not, then try this for your hardware interrupt function:

void TC0_Handler(void) {
    if (TC0->COUNT32.INTFLAG.reg & TC_INTFLAG_OVF) {
        ++_ticks_high;
        TC0->COUNT32.INTFLAG.reg = TC_INTFLAG_OVF;
        TC0->COUNT32.COUNT.reg = 0xfffff000;
    }
}

That will force overflows and _ticks_high counts to run much faster.




Re: A timer driver for Cortex-M0+... it rarely doesn't work
On 27/04/17 10:01, David Brown wrote:

Looking over these again, I can see that they will have the same
potential problem.  They are fine for when the high part of the counter
is also in hardware - but /not/ if it is dependent on interrupts which
could be delayed.  If you have a situation like this:

_ticks_high is 1
timer reg is 0xffff'fff0
interrupts are disabled
timer reg rolls over, and is now 0x0000'0003
the ticks() function will return 0x0000'0001'0000'0003 instead of
0x0000'0002'0000'0003

You will need to check the hardware overflow flag to make this work.

static inline uint64_t ticks(void) {
    uint32_t h1 = volatileAccess(_ticks_high);
    uint32_t l1 = TC0->COUNT32.COUNT.reg;
    uint32_t h2 = volatileAccess(_ticks_high);
    if (h1 != h2) {
        // We just had an interrupt, so we know there is
        // no pending overflow
        l1 = TC0->COUNT32.COUNT.reg;
        // Or l1 = 0 for slightly faster code
        return ((uint64_t) h2 << 32) | l1;
    }
    if (TC0->COUNT32.INTFLAG.reg & TC_INTFLAG_OVF) {
        // There has been an overflow, and it was not handled
        l1 = TC0->COUNT32.COUNT.reg;
        return ((uint64_t) (h2 + 1) << 32) | l1;
    }
    // If there has been an overflow or an interrupt, it happened
    // after the first reads - so these are consistent and safe
    return ((uint64_t) h1 << 32) | l1;
}
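
To see the effect of the flag check without waiting for real hardware,
here is a small PC-side sketch of the scenario listed above.  struct
fake_timer, ticks_naive and ticks_checked are made-up names that only
mimic the registers; nothing here touches the Atmel headers, and no
real interrupt can fire in between the reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the timer peripheral plus the ISR's counter. */
struct fake_timer {
    uint32_t count;        /* hardware COUNT register */
    bool     ovf_pending;  /* overflow flag set, ISR has not run yet */
    uint32_t ticks_high;   /* software high word, bumped by the ISR */
};

/* Double-read scheme without the overflow flag check. */
static uint64_t ticks_naive(const struct fake_timer *t) {
    assert(t != NULL);
    uint32_t h1 = t->ticks_high;
    uint32_t l1 = t->count;
    uint32_t h2 = t->ticks_high;
    if (h1 != h2) l1 = t->count;
    return ((uint64_t)h2 << 32) | l1;
}

/* Same scheme, but a pending overflow counts as one more high tick. */
static uint64_t ticks_checked(const struct fake_timer *t) {
    assert(t != NULL);
    uint32_t h1 = t->ticks_high;
    uint32_t l1 = t->count;
    uint32_t h2 = t->ticks_high;
    if (h1 != h2) {
        /* ISR ran between the reads: no overflow can be pending. */
        return ((uint64_t)h2 << 32) | t->count;
    }
    if (t->ovf_pending) {
        /* Counter wrapped but the ISR is late: re-read and add one. */
        return ((uint64_t)(h2 + 1) << 32) | t->count;
    }
    return ((uint64_t)h1 << 32) | l1;
}
```

With count = 0x00000003, ovf_pending = true and ticks_high = 1, the
naive version yields 0x0000000100000003 and the checked version
0x0000000200000003, which are the two values in the example above.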

If possible, change to the 32 bit version :-)  Failing that, if your
chip supports chaining of timers in hardware, use that.


Thanks for this thread, Pozz - it's a good question, makes people think,
and can hopefully help other developers.


Re: A timer driver for Cortex-M0+... it rarely doesn't work
On 27/04/2017 10:27, David Brown wrote:
 >> [...]

Yes, but this situation is impossible in my case.  Who could disable
interrupts here?


Atmel SAM TCx peripherals are 16-bit counters/timers, but they can be
chained in pairs to form a 32-bit counter/timer. I already paired TC0
with TC1 to get a 32-bit hw counter. I can't chain TC0/TC1 with TC2/TC3
to get a 64-bit hardware counter/timer.



Yes, those problems are fascinating :-)


Re: A timer driver for Cortex-M0+... it rarely doesn't work
On 2017-04-27 pozz wrote in comp.arch.embedded:

Just out of curiosity, I had a look at the SAM C21 Family datasheet. It's
been a long time since I used Atmel ARM controllers (SAM7).

In the description of the TC, I see no fixed 16 bit width and coupling of
timers. Only that any TC channel can be configured in 8, 16 or 32 bit mode.
Am I looking at the wrong datasheet or section?

If the timers are indeed 8, 16 or 32 bit configurable, that could be a way
to speed up your testing. Just set your timer to 8 or 16 bit (and add some
code to set the other bits valid) and speed up overflows with a factor of
2^24 or 2^16.

--  
Stef    (remove caps, dashes and .invalid from e-mail address to reply by mail)

Beer -- it's not just for breakfast anymore.

Re: A timer driver for Cortex-M0+... it rarely doesn't work
On 01/05/2017 10:09, Stef wrote:

When you use a TC in 8- or 16-bit mode, you are using a single TC
peripheral. When you configure TC0 in 32-bit mode, you are automatically
using TC1 too, which works in "slave" mode:

    The counter mode is selected by the Mode bit group in the Control A
    register (CTRLA.MODE). By default, the counter is enabled in the
    16-bit counter resolution. Three counter resolutions are available:
    [...]
    - COUNT32: This mode is achieved by pairing two 16-bit TC
      peripherals. TC0 is paired with TC1, and TC2 is paired with TC3.
      TC4 does not support 32-bit resolution.
      [...]

IMHO this means TC is a 16-bit counter.


Oh yes, as I wrote in one of my previous posts, I did exactly this to
make the bug show up sooner.  I discovered it was due to the lack of a
sync wait loop after writing the read command to the CTRLB register.




Re: A timer driver for Cortex-M0+... it rarely doesn't work
On 2017-05-02 pozz wrote in comp.arch.embedded:

Yes, you are right, found it now. Earlier I didn't dive deep enough into
the 1000+ page datasheet to see this 'detail', sorry.


Ah, may have missed that, I started reading the thread a bit late and there
were a lot of posts. ;-)

Good you found the cause.


--  
Stef    (remove caps, dashes and .invalid from e-mail address to reply by mail)

He who has but four and spends five has no need for a wallet.

Re: A timer driver for Cortex-M0+... it rarely doesn't work
On 27/04/2017 10:01, David Brown wrote:

;-)


Atmel says you *need* to write the CTRLB register before reading the
COUNT register of the TC0 peripheral.  I don't know exactly why, but if
you don't, you can't read the correct COUNT value.


Yes, you are right and I thought about this possibility... but IMHO it's
impossible.

ticks() is called only by TimerSet() and TimerExpired(), and those two
functions are called only in normal background (not interrupt) code.
This means ticks() always runs at a lower priority than the TC0 ISR.
Moreover, I never disable interrupts in any part of the code.

I don't understand what you mean by "[...] or interrupts take a few
cycles to work through the system". Is it possible to read a rolled-over
hw counter value (0x00000003) and a not yet incremented _ticks_high?

    uint32_t l1 = TC0->COUNT32.COUNT.reg;
    uint32_t h2 = volatileAccess(_ticks_high);

The COUNT register is read before the second read of _ticks_high. If
COUNT has rolled over and the rolled value is in l1, the interrupt has
fired for sure... IMHO.  So h2 should contain the incremented value.



This doesn't solve the problem you described.

_ticks_high = 1
90    <- h1 = 1
91
...
99
0     <- read l1
1
2     <- h2 = 1 (interrupt has not fired, your assumption)
3

ticks() could return a value that is completely wrong.



Yes, I know this method and I used it many times in the past. However it
has two drawbacks:
- the maximum delay is 2^31 ticks (in my case, around 40 minutes, but I
   have delays of 1h)
- many times you need an additional "timer running/active" flag

The second point is more important. Suppose you want to switch on an LED
10 minutes after a button is pressed:

bool tmr_led_active;
Timer32 tmr_led;
void button_pressed_callback(void) {
    TimerSet32(&tmr_led, 10 * 60 * 1000);
    tmr_led_active = true;
}
void main_loop(void) {
   ...
   if (tmr_led_active && TimerExpired32(&tmr_led)) {
     switch_on_led();
     tmr_led_active = false;
   }
}

If you don't check the timer flag, the led will switch on at random times.



Yes, I will try your ideas.

Re: A timer driver for Cortex-M0+... it rarely doesn't work
On 27/04/17 11:38, pozz wrote:

You are sure you are not calling ticks() from within another interrupt
function?


There is always a delay between an overflow in the timer, and the actual
interrupt function being started.  Depending on details of the chip, it
is possible there will be several cpu cycles of the current instruction
stream executed before the interrupt function is run.  If I remember
rightly, the M3/M4 has a 3 stage pipeline (the M0+ has 2 stages).  When
the interrupt is registered by the core, it effectively means that a
"jump to interrupt vector" instruction is squeezed into the instruction
stream - but the instructions already in the pipeline must be completed
before the interrupt call takes effect.  You will also have a cycle or
two delay in the NVIC, and perhaps a cycle or two delay getting the
signal out of the timer block (especially if the timer does not run at
full core speed).  Cortex M interrupts are handled quickly - but not
immediately.

Yes - hence my follow-up post.

In such cases, you rarely need the same level of accuracy.  Use your
accurate ticks32() counter to let you track a seconds counter in the
main loop of your code - this can be read freely without worrying about
synchronisation.
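
One possible shape for such a main-loop seconds counter is sketched
below.  SecondsClock and seconds_update are hypothetical names, the
tick value is passed in rather than read from the hardware, and the
sketch assumes the function is called at least once per wrap of the
32-bit counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TIMER_FREQ 875000u   /* ticks per second, from the original post */

typedef struct {
    uint32_t last_ticks;  /* tick value already accounted for */
    uint32_t seconds;     /* whole seconds elapsed so far */
} SecondsClock;

/* Call from the main loop with the current 32-bit tick value.
   Consumes whole seconds and keeps the remainder for the next call. */
static void seconds_update(SecondsClock *c, uint32_t now) {
    assert(c != NULL);
    /* Unsigned subtraction keeps working across a counter wrap. */
    while ((uint32_t)(now - c->last_ticks) >= TIMER_FREQ) {
        c->last_ticks += TIMER_FREQ;
        c->seconds++;
    }
}
```

Because only the main loop writes `seconds`, other main-loop code can
read it without any locking or double-read tricks.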

Another method is to make the timer overflow interrupt occur more often
- say, every millisecond.  This increments a single millisecond counter
(either 32-bit, or 64-bit, or split lo/hi 32-bit - whatever suits).
Then reading that is easy, because the change to this counter is always
done in the same context as an atomic operation - there is no
complication of independent updates of the low and high halves.  And
most of the time, it is enough to just use the low 32-bit part.



So you have to add a flag - big deal.  You are not using any more
memory, your code is smaller, and it is simpler to be sure that
everything is correct.

<snip>

Good luck - it is not an easy task.  Don't forget to let us know the
source of the problem, and the solution!



Re: A timer driver for Cortex-M0+... it rarely doesn't work
On 27/04/2017 12:54, David Brown wrote:

Yes, sure.  I am very careful when writing interrupt code, even if that
wasn't sufficient in this case.


Really?  So, the core is able to read the register of a peripheral
(the TC0->COUNT32.COUNT register, in my case) with a *new* (i.e., rolled)
value, but the ISR hasn't run yet?  In other words, a bus access can be
done (TC0 is on the bus) while an interrupt request is pending?

If this is the case, it is a mess :-(


Yes, of course. My point here is that you have to **remember** that the
timers you are using can roll over at any time in the future, so they
can change from "not expired" to "expired".

After calling TimerSet32() and after the timer expires, you could expect
it to stay "expired" until you arm it again with TimerSet32(). This is
not true, because TimerExpired32() could return "false" at a certain
time in the future.

This is not a problem with repetitive timers (armed again as soon as
they expire), but it is with one-shot timers.


I configured TC0 in 16-bit mode, so now TC0_Handler() fires every
75ms.  I changed ticks() accordingly, to build the 64-bit value by
shifting _ticks_high by 16 bits.

     if (h2 != h1) return ((uint64_t)h2 << 16) + 0;
     else          return ((uint64_t)h1 << 16) + l1;

I was lucky because I can reproduce the problem more often... and  
incredibly the problem is the opposite.

Remember my state-machine in bus.c:

   switch(bus.state) {
     case BUS_IDLE:
       if (TimerExpired(&bus.tmr_answer)) {
         /* Send new request on the bus */
         ...
         TimerSet(&bus.tmr_answer, timeout_answer);
         bus.state = BUS_WAITING_ANSWER;
       }
       break;

     case BUS_WAITING_ANSWER:
       if (TimerExpired(&bus.tmr_answer)) {
         /* No reply */
         bus.state = BUS_IDLE;
         TimerSet(&bus.tmr_answer, 0);
       } else {
         if (reply_received() == true)    {
           /* Analyze the reply */
           bus.state = BUS_IDLE;
           TimerSet(&bus.tmr_answer, 0);    [*]
         }
       }
       break;
   }

When the problem occurs (the bus blocks), bus.state is BUS_IDLE. The
problem is with instruction [*].  That instruction should arm the
bus.tmr_answer timer such that it expires immediately and a new request
is sent on the bus (in the future, it will be simple to introduce a
delay between the reply and the next request).

Sometimes the timer doesn't expire immediately, but only after the hw
counter rolls over again.  Indeed I see the bus blocked for 75ms.

We were thinking that _ticks_high hasn't been incremented yet in task()
when the hw counter value is read.  But this would have produced an old,
already expired time, not a future, not-yet-expired time. Here the
problem is with a wrong time in the *future*.
In other words, when the problem occurs, ticks() reads a new, correctly
incremented value for _ticks_high, but an old, not-yet-rolled hw counter
value.

Why is this?  I think it's the usual problem with register
synchronization in Atmel SAM devices.  Before reading or writing certain
registers, you need to check whether the peripheral is still syncing.  I
don't know exactly what this means, but it relates to the presence of
different asynchronous clocks.  But I use only one reference clock (an
external crystal) that is routed to the Cortex-M core and all
peripherals... so I thought synchronization wasn't necessary.

In this case, it seems synchronization solves my problem, so my ticks()
function is now:

   static inline uint64_t ticks(void) {
     uint32_t h1 = volatileAccess(_ticks_high);
     TC0->COUNT32.CTRLBSET.reg =
          TC_CTRLBSET_CMD(TC_CTRLBSET_CMD_READSYNC_Val);
     while (TC0->COUNT32.SYNCBUSY.reg) {
       /* wait until register synchronization has finished */
     }
     uint32_t l1 = TC0->COUNT32.COUNT.reg;
     uint32_t h2 = volatileAccess(_ticks_high);
     if (h2 != h1) return ((uint64_t)h2 << 32) + 0;
     else          return ((uint64_t)h1 << 32) + l1;
   }

The datasheet says you need to write the CMD bits of the TC0->CTRLB
register with a known value (TC_CTRLBSET_CMD_READSYNC_Val) before
reading the COUNT register.  Why?  I don't know.  TC0->CTRLB is a
**Write-Synchronized** register, i.e. the value you write only really
takes effect after the sync time.  Maybe the sync loop I added waits for
the time the CTRLB command needs to execute... otherwise the COUNT value
you read immediately afterwards could be wrong (IMHO!)


After this... I think my code is still affected by the original problem
if, as you explained, the TC0 interrupt is delayed while I'm reading the
hw counter in ticks().  I don't have the expertise to tell whether this
can really happen, so it's better to find another method.

I liked the idea of using a 64-bit counter for ticks that will never
roll over during the entire lifetime of the device.

Re: A timer driver for Cortex-M0+... it rarely doesn't work
On 27/04/17 13:51, pozz wrote:

I cannot say for sure that this can happen.  But unless I can say for
sure that it /cannot/ happen, I prefer to assume the worst is possible.


True.  But perhaps that can be baked into your "Set" and "Expired"
functions, or otherwise made "unforgettable".  The aim is to simplify
the code that /may/ have rare race conditions into something that cannot
possibly have such problems - even if it means other code is bigger or
less efficient.
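
One way the "active" flag might be baked into the timer itself is
sketched here.  OneShot32, oneshot_set and oneshot_fired are
hypothetical names, and the current tick value is passed in so the
sketch can run on a PC; on the target it would come from the 32-bit
counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t deadline;  /* tick value at which the timer expires */
    bool     armed;     /* false until set, and again after firing */
} OneShot32;

/* Arm the timer delay_ticks after now (delay must stay below 2^31). */
static void oneshot_set(OneShot32 *t, uint32_t now, uint32_t delay_ticks) {
    assert(t != NULL);
    t->deadline = now + delay_ticks;   /* wraps modulo 2^32 */
    t->armed = true;
}

/* Returns true once per arming, on the first poll at or after the
   deadline; an unarmed timer never reports expiry. */
static bool oneshot_fired(OneShot32 *t, uint32_t now) {
    assert(t != NULL);
    if (!t->armed) return false;
    if ((int32_t)(now - t->deadline) < 0) return false;
    t->armed = false;                  /* disarm so it cannot fire again */
    return true;
}
```

The timer still has to be polled at least once within 2^31 ticks of
its deadline, but the caller can no longer forget the flag, and a
timer that was never set does not fire at a random time.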


If you can make your hardware timer function run every millisecond (or
whatever accuracy you need), then use this:

extern volatile uint64_t tickCounter_;

static inline uint64_t ticks(void) {
    uint64_t a = tickCounter_;
    while (true) {
        uint64_t b = tickCounter_;
        if (a == b) return a;
        a = b;
    }
}

void TC0_Handler(void) {
    if (TC0->COUNT32.INTFLAG.reg & TC_INTFLAG_OVF) {
        tickCounter_++;
        TC0->COUNT32.INTFLAG.reg = TC_INTFLAG_OVF;
    }
}    

If ticks never accesses the timer hardware register, there cannot be a
problem with synchronisation.  There is no need to use the timer
peripheral's complicated synchronisation and locking mechanism, nor any
concern about interrupt delays.  Re-reading the 64-bit value until you
have two identical reads is sufficient to ensure that you have a
consistent value even if a timer interrupt occurs in the middle of the
64-bit read.



Re: A timer driver for Cortex-M0+... it rarely doesn't work
On 27/04/2017 15:39, David Brown wrote:
[...]

Yes, you're right.  One of my colleagues would have said: "put on metal
underwear, just to be sure" :-)

[...]

Yes, I know.  Indeed I will abandon my first approach of combining the
hw and sw counters, joined together in ISR code.
It is a technique I learned from this newsgroup... it's a pity the
original author isn't reading (I remember Don Y added some personal
ideas to this approach, and he was reading this ng in the past days).

[...]

Yes, it is a solution.  There's a small drawback: you have a frequent
interrupt (every 1ms).


Maybe there's another solution to fix the first approach.  The problem
was that the hw counter can roll over and the "rolled" value can be
read while the sw counter (my _ticks_high) is still at the "old" (not
incremented) value.
The idea is to configure the timer to stop when it reaches the TOP
value 0xFFFFFFFF (one-shot timer). It can be restarted in the ISR,
together with incrementing _ticks_high.

There's another drawback, a small one.  The hw counter is the clock
of the machine. When it reaches the TOP value, it stops for a short
time.  So the system time appears frozen for this short time.
However this happens only every 2^32 / Counter_Freq (in my case, every
1h and 21').


From this story, I learned another important thing.  Why did I miss
this bug?  Because it could appear only every 1h and 21'.  It /could/
appear, because it is random, so it could appear after 1000 times 1h21'
(i.e. after 2 months!!!!)

In the future I will avoid using such a long period.  In my case, I
don't really need the full 64 bits.  If I use a smaller 16-bit hw
counter and the full 32-bit sw counter, I will have a 48-bit system tick
(in my case, a period of 10 years).
In this case, a potential bug is tied to the shorter period of the
16-bit hw counter, only 75ms.  There is a much greater chance of seeing
the problem in my lab during testing and not in the user's hands.
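
The 48-bit layout and its two periods can be checked with a few lines
on a PC.  This is only a sketch under the figures given in the thread
(875kHz clock, 16-bit hw counter, 32-bit sw counter); compose48 and
wrap_period_ms are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

#define TIMER_FREQ 875000u   /* ticks per second, from the original post */

/* Build the 48-bit tick from the 32-bit sw counter and 16-bit hw counter. */
static uint64_t compose48(uint32_t high, uint16_t low) {
    return ((uint64_t)high << 16) | low;
}

/* Whole milliseconds before a counter of the given width wraps.
   Limited to 48 bits so the multiplication cannot overflow 64 bits. */
static uint64_t wrap_period_ms(unsigned bits) {
    assert(bits >= 1 && bits <= 48);
    return (((uint64_t)1 << bits) * 1000u) / TIMER_FREQ;
}
```

wrap_period_ms(16) comes out at 74 (just under 75ms), and
wrap_period_ms(48) corresponds to a little over 10 years, in line with
the numbers above.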


