timestamp in ms and 64-bit counter

Ah yes. Early versions of Windows NT. Crashed if they had an uptime of 49 and a bit days - I wonder why :-)

Mind you, getting early NT to stay up that long without crashing or needing a reboot was bloody difficult.

Reply to
Jim Jackson

Which NT version was that?

My NT 3.51 very seldom needed reboots. For many years I booted it only three times a year: after Easter, Christmas and the summer vacation, since I did not want to leave the computer unattended for a week or more at a time.

Reply to
upsidedown

I believe it was Windows 95 that had this problem - and it was not discovered until about 2005, because no one had kept Windows 95 running for 49 days.

Maybe early NT had a similar fault, of course. But people /did/ have NT running for long uptimes from very early on, so such a bug would have been found fairly quickly.

Reply to
David Brown

That's another good approach, yes.

Reply to
David Brown

This helps if the bug is deterministic. If it isn't, and it doesn't happen at the first wrap-around after startup, it takes another 49 days before there is another chance to see the bug.

Reply to
pozz

That is a more believable explanation.

Both VAX/VMS and Windows NT use 100 ns as the basic unit for time-of-day timekeeping.

On Windows NT the clock interrupt rate was 100 Hz on a single processor and 64 Hz on multiprocessors.

Some earlier Windows versions used a 55 Hz (or was it 55 ms?) clock interrupt rate, so I really don't understand where the 1 ms clock tick or the 49 days comes from.

Reply to
upsidedown

This is actually a very tricky problem. I believe it is not possible to solve it with the constraints you have laid out above. David Brown's solution in his GetTick() function is correct, but he doesn't explain why it works.

If you have a valid 64-bit counter which you can only reference 32-bits at a time (which I'll make functions, read_high32() and read_low32(), but these can be hardware registers, volatile globals, or real functions), then an algorithm to read it reliably is basically your original algorithm:

uint64_t GetTick() {
    uint32_t old_high32 = read_high32();
    while (1) {
        uint32_t low32 = read_low32();
        uint32_t new_high32 = read_high32();
        if (new_high32 == old_high32) {
            /* The high word did not change across the low read, so the
               two halves are consistent. */
            return ((uint64_t)new_high32 << 32) | low32;
        }
        /* A carry propagated while we were reading; try again. */
        old_high32 = new_high32;
    }
}

Reply to
Kent Dickey

But as long as the timing is such that we cannot do BOTH the read_low32() and the read of ticks_high within that window, we can't get the wrong number.

This is somewhat a function of the processor and of how much the instruction pipeline 'skids' when an interrupt occurs. The processor he mentioned, an STM32L4R9, which uses a Cortex-M4 core, doesn't have that much skid, so this can't be a problem unless you do something foolish like disabling interrupts while doing the sequence.

If we put a proper barrier instruction between the low read and the second high read (we may need that just to avoid getting a value cached from the first read), and declare the variables volatile so the compiler doesn't do its own caching, the problem doesn't occur. Again, not a problem on his processor, as it is a single-core part (but we still need the volatile).
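A minimal sketch of that sequence on a single-core Cortex-M, assuming GCC-style inline asm, a memory-mapped 32-bit timer register TIMER_CNT at a made-up address, and a ticks_high word maintained by the overflow ISR (all names here are placeholders, not anything from the original posts):

    #include <stdint.h>

    #define TIMER_CNT (*(volatile uint32_t *)0x40000024u) /* placeholder address */
    static volatile uint32_t ticks_high;                  /* bumped by overflow ISR */

    uint64_t get_tick64(void)
    {
        for (;;) {
            uint32_t high1 = ticks_high;
            __asm volatile ("" ::: "memory");  /* compiler barrier around the low read */
            uint32_t low   = TIMER_CNT;
            __asm volatile ("" ::: "memory");
            uint32_t high2 = ticks_high;
            if (high1 == high2)                /* no overflow between the high reads */
                return ((uint64_t)high2 << 32) | low;
        }
    }

The volatile qualifiers stop the compiler from caching ticks_high across the loop; the barriers are cheap insurance that nothing gets reordered around the hardware read.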

Reply to
Richard Damon

49 days? You mean 49 minutes?

Never used NT, but I used W2k and it was great! W2k was widely pirated, so MS started a phone-home type of licensing with XP, which was initially not well received but over time became accepted. Now people reminisce about the halcyon days of XP.

Networking under W2k required a lot of manual setting up, but it was not hard to do. A web site, World of Windows Networking, made it easy, until it was bought and ruined with advertising and low-quality content.

Now I have trouble just getting two Win10 computers to share a file directory.

--

  Rick C. 

Reply to
Rick C

Oh, Win95 OSR 2 was not /too/ bad. It could keep going long enough to play a game or two. (For actual work, I moved from Win3.1 to OS/2, until NT 4.0 came out.)

Did you not use NT 4.0? It was quite solid. W2K was also good, but XP took a few service packs before it became reliable enough for serious use.

I don't remember any significant issues with networking with W2K. I see a lot more now with Win10, and even Win7 when people have forgotten to turn off the automatic updates.

Indeed. And a recent update just stopped local names on the network working properly with DNS - Windows 10 has decided that you need to use the full name (with the ".network.net" or whatever added), or you have to manually configure it to append that suffix automatically. The DNS and DHCP setup we have has been working happily for many years - I don't know what MS have done to screw it up in the latest Win 10 and Win 7 updates.

Reply to
David Brown

The interrupt skid matters for how large the window is, but the problem happens even if the "skid" was 0.

Look at it this way: the hardware counter logic is something like:

always @(posedge clk) begin
    if (do_inc) begin
        cntr += 1;
        if (cntr == 0) begin
            interrupt = 1;
        end
    end
end

Then at cycle 0, cntr=ffff_ffff and do_inc=1. At cycle 1, cntr=0 and interrupt=1.

In that cycle, software could read cntr=0. The interrupt CANNOT have been taken yet, since interrupts aren't instantaneous--the signal hasn't even made it to the interrupt controller yet; it's just that this clock module has decided to request an interrupt. (The ARM GIC supports asynchronous interrupts, so it takes several clocks just for it to register the interrupt.)

This is always somewhat a function of the processor, but the problem is inherent to all CPUs. A simple 6502 or 8086 or whatever has the same problem and cannot fix it easily either.

The hardware cannot get this case right without some extreme craziness, such as a pre-interrupt detection circuit prepared to drive the interrupt early so the CPU reacts in time.

The right way to look at it: hardware interrupts are always delayed by tens or hundreds of cycles from when you think they happen to when you receive them. Keep that in mind and you'll get your algorithms right.

Kent

Reply to
Kent Dickey

Good points, Rick, but this conversation has me wondering:

Why use a design that handles the high 32 bits in the application layer and the low 32 bits separately in the ISR?

Apparently you are using this only for interval timing? If you are looking to maintain calendar time, then you will need to store the high 32 bits, as Rick mentioned.

Restoring the high 32 bits from nonvolatile storage is a boot-up issue, and storing the value may require work outside the ISR. But it is required only once in about 3 years, as Rick pointed out. You would have some drift anyway, since you have no way to measure the time the system is down.

Ed

Reply to
Ed Prochak


That's why stuff that is hard on a CPU is so easy in an FPGA. Even if I have a soft core in my FPGA design, I know the clock cycle timing of the CPU, and it doesn't have the many cycles of delay in processing an interrupt. In my design, on the next clock after an interrupt is asserted, the CPU is fetching the first instruction of the interrupt handler as well as saving the return address and status register.

Commercial CPUs provide a bunch of hard-to-use features like interrupts with priority, etc., because programmers think of CPU cycles as a precious commodity and so want to make a single CPU do many things. The CPU is often much smaller than the memory on the die, and all of it is inordinately inexpensive, really.

On an FPGA it is easier to just provide the 64-bit counter in the first place. lol But if you want, it is easy to make the counter interrupt the CPU before it rolls over, so the software can increment the upper half at the correct time, as well as to make the full 64 bits an atomic read, as sketched below.
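One way the FPGA can make the 64-bit value atomic over a 32-bit bus (a hypothetical register pair, not any particular part): have a read of the low word latch the high word into a snapshot register in the same cycle, so the CPU's two bus reads always see a consistent pair.

    #include <stdint.h>

    /* Hypothetical FPGA registers at made-up addresses: reading CNT_LO
       latches the high half into CNT_HI_SNAP in the same clock cycle. */
    #define CNT_LO      (*(volatile uint32_t *)0x80000000u)
    #define CNT_HI_SNAP (*(volatile uint32_t *)0x80000004u)

    uint64_t read_counter64(void)
    {
        uint32_t lo = CNT_LO;       /* this read also snapshots the high half */
        uint32_t hi = CNT_HI_SNAP;  /* consistent with lo, however late we read it */
        return ((uint64_t)hi << 32) | lo;
    }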

This thread is a perfect example of why I prefer FPGAs for most applications.

--

  Rick C. 

Reply to
Rick C

The "tick count" in the (Win32) OS was always 1000Hz (as reported by GetTickCount(), for example). The physical ticks were massaged to correctly update that count.

Reply to
Robert Wessel

If you have an atomic 32-bit read, and a 32-bit compare-and-swap, it's not too hard. I had to deal with this in Windows back before the OS grew a GetTickCount64().

The basic idea is to store a 32-bit extension word, with 8 bits* that you match to the top of the system tick, plus 24 bits of extension (IOW, you'll get a 56-bit tick out of the deal).

Basically, you get the system tick, then read the extension word, and compare the high bytes. If they're the same (IOW, the extension word was set in the same epoch as the system tick is currently in), just concatenate the low 24 bits of the extension word to the left of the system tick, and return that.

If the high bytes are different, determine if there's been a rollover (high byte of system tick lower than high byte of extension word). Then update the extension word value appropriately (unconditionally copy the system tick high byte to the extension word high byte, and increment the low 24 bits if there was a rollover). CAS that updated value back to extension word in memory (ignore failures). Loop to the start of the process to re-get the extended tick.

The one thing this requires is that this routine get polled at least once every ~49 days.

*You can fiddle the exact parameters here a bit, this is just what I did.
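A sketch of that scheme in portable C11 (the original was Win32 with the Interlocked functions; get_tick32() here is a stand-in for the 32-bit system tick source, and the 8/24 split follows the post):

    #include <stdint.h>
    #include <stdatomic.h>

    extern uint32_t get_tick32(void);      /* stand-in for GetTickCount() */

    /* Extension word: top 8 bits mirror the top byte of the system tick
       (the "epoch"), low 24 bits count rollovers. */
    static _Atomic uint32_t ext_word;

    uint64_t get_tick56(void)
    {
        for (;;) {
            uint32_t tick = get_tick32();
            uint32_t ext  = atomic_load(&ext_word);
            uint8_t  tick_hi = (uint8_t)(tick >> 24);
            uint8_t  ext_hi  = (uint8_t)(ext >> 24);

            if (tick_hi == ext_hi) {
                /* Same epoch: concatenate the 24 extension bits to the
                   left of the 32-bit tick. */
                return ((uint64_t)(ext & 0x00FFFFFFu) << 32) | tick;
            }

            /* Different epoch: copy the tick's high byte into the
               extension word, bumping the rollover count if the tick
               has wrapped (its high byte went down). */
            uint32_t count = ext & 0x00FFFFFFu;
            if (tick_hi < ext_hi)
                count = (count + 1) & 0x00FFFFFFu;
            uint32_t new_ext = ((uint32_t)tick_hi << 24) | count;

            /* CAS it back; if another thread won the race, just loop
               and re-derive the extended tick. */
            atomic_compare_exchange_strong(&ext_word, &ext, new_ext);
        }
    }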
Reply to
Robert Wessel

I think he refers to the scenario where the lower 32 bits are a hardware timer register and the upper 32 bits are incremented in software by an interrupt handler. It is possible to read the lower 32 bits as ffffffff and then the upper 32 bits *after* the ffffffff -> 0 transition and the interrupt processing; there is no sensible way of knowing that this occurred based only on reading these two 32-bit values.

If 1 ms is the target the solution is easy: just do an IRQ every 1 ms and handle both longwords in the IRQ service routine (while masked). IIRC the OP suggested that in his original post.
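A minimal sketch of that scheme, assuming a CMSIS-style environment (__disable_irq()/__enable_irq()) and hypothetical variable names:

    #include <stdint.h>

    static volatile uint32_t ticks_lo, ticks_hi;

    /* 1 ms timer interrupt: both words are updated here, so they only
       ever change together. */
    void timer_1ms_isr(void)
    {
        if (++ticks_lo == 0)
            ++ticks_hi;
    }

    uint64_t get_tick64(void)
    {
        __disable_irq();            /* keep the ISR from running mid-read */
        uint64_t t = ((uint64_t)ticks_hi << 32) | ticks_lo;
        __enable_irq();
        return t;
    }

(If the caller might already be running with interrupts masked, save and restore PRIMASK instead of unconditionally re-enabling.)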

Reading the 64 bits using 32-bit reads is then trivial; e.g., this is how the timebase registers are read on 32-bit Power (read upper, read lower, read upper again; if the upper changed, do it all over again). I think Kent referred to that in his previous post as well, and probably more people have mentioned it in the thread.

Dimiter


Reply to
Dimiter_Popoff

This is why it's important to be able to seed the timer counter.
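For example (hypothetical names), seeding the software tick just below the 32-bit wrap point lets the rollover path be tested seconds after boot instead of 49 days later:

    #include <stdint.h>

    extern volatile uint32_t ticks_lo;   /* hypothetical tick state */
    extern volatile uint32_t ticks_hi;

    /* Test hook: start ~5 s before the 32-bit wrap (1 ms per tick),
       so the wrap-around code runs almost immediately. */
    void seed_ticks_near_wrap(void)
    {
        ticks_hi = 0;
        ticks_lo = 0xFFFFFFFFu - 5000u;
    }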

--
Les Cargill
Reply to
Les Cargill

The point of the described algorithm is specifically to detect and handle the case where the base (short) timer value overflows, but without locks and without requiring more than a single additional word of storage.

Reply to
Robert Wessel

With multimedia timers enabled that may be the case; however, without them the sleep-time granularity is 10 ms, which would indicate a 100 Hz update rate. Also, the SetTimeAdjustment() Win32 call assumes a 100 Hz (or 64 Hz) update rate.
Reply to
upsidedown

NT 4.0 solid??

NT4 moved graphical functions into kernel mode to speed up window updates. When doing operations in kernel mode on behalf of a user-mode function, the first thing the kernel-mode routine should do is check that the parameters passed to it are accessible from _user_ mode. Unfortunately this was not done initially, so accidentally passing a NULL pointer to these functions crashed the whole computer, not just the application. SP1 added these checks.

In general, each NT4 service pack introduced new bugs, and soon the next SP was released to correct the bugs introduced by the previous one. Thus only every other SP was actually usable.

Even NT5 beta was more stable than NT4 with most recent SP. NT5 beta was renamed Windows 2000 before final release.

Reply to
upsidedown
