switching context on MSP430

Hi all,

I am working on a small RTOS for the MSP430. It should just provide a scheduler with the ability to switch and change context. Of course I have a system tick ISR which handles the ticks. And here is the thing: when I take a look at the disassembly, it says that before the MSP430 enters the system tick ISR it pushes registers R15, R14, R13 and R12 (and of course PC and SR) on the stack. So I used that for context switching. At the beginning of the ISR I just save SP in the TCB of the interrupted task, so I could later restore it just by changing SP (changing stacks) and popping from the stack. But what will happen if I have code like this in my task routine?

/////////////////////////////
unsigned int j,i,k,q,m; for(i=1;i

Reply to
brOS

As near as I can establish from your message, you have just observed that the compiler will not save registers that are not used in the ISR. This is a common optimization. If your ISR uses all possible registers, the compiler will automatically save all of them.

Reply to
larwe

I would save the context in the TCB, to maintain locality and so it can't be got at by an out-of-bounds stack pointer. The safest way is to save all the registers. You can fine-tune later, once it's all working, if there are performance issues...

Regards,

Chris

Reply to
ChrisQ

No. Entering the ISR is a hardware function, and it saves whatever the processor saves. If you're seeing it in assembly it's _after_ the processor has jumped to the ISR. Keep that straight; confusion here can lead to really weird code behavior later.

You need to save the whole context of the machine, meaning all the registers that may ever get used. If you happen to know that your tool chain never ever uses some set of registers, or only ever uses them after saving them locally in protected blocks, then you don't have to save those particular registers. Otherwise -- save 'em.

If your compiler lets you declare functions as 'interrupt' then you can get its view of what's important by declaring an external function and calling it from an interrupt function. This will force the compiler to make no assumptions -- see what it saves off to the stack then.
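
For example (a sketch only -- the syntax here assumes msp430-gcc, the vector and the external function are made up, and IAR/CCS spell the interrupt declaration differently):

#include <msp430.h>

extern void scheduler_tick(void);    /* lives in another source file */

/* Because scheduler_tick() is external, the compiler can't see which
 * registers it might clobber, so the generated ISR prologue has to save
 * every register the calling convention treats as scratch.  Build this
 * and look at the prologue/epilogue to see your tool chain's view of
 * what needs saving. */
__attribute__((interrupt(WDT_VECTOR)))
void tick_probe_isr(void)
{
    scheduler_tick();
}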

--
www.wescottdesign.com
Reply to
Tim Wescott

That's not right. You've got to save _all_ of the registers when switching context (including the flags/status register).

Because the ISR didn't _use_ R10. The compiler knows what registers are used by an ISR, so it only saves the ones that are used. When switching contexts you've got to save _all_ the registers.

Definitely all of them.

Right.

No. There is no limitation because an RTOS will save the entire context.

It doesn't matter much -- it requires the same number of bytes either way.

--
Grant
Reply to
Grant Edwards

Which can mean cooperatively, or not.

Are they allowed to pre-emptively end the execution of one task and start another? Or is this just a 'tick'?

Disassembly of what, exactly? The ISR, itself?

By the way, are you aware that not every c compiler for the MSP430 will operate in exactly the same fashion? I'm not sure what c compiler you are targeting, but if you want this to work with several, you will need to investigate several. Not only in the case of pre-emption but even for co-operative task switching.

It's fairly common to simply push registers and then store the SP in a task block. Or, alternatively, store some registers in the task block along with the SP, instead of pushing them. The choice depends.
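
Just to make the first of those approaches concrete, here is a rough sketch of a tick ISR that saves everything, switches stacks through the TCB, and restores everything. Treat it as a sketch only: it assumes msp430-gcc on a classic (non-X) part, that the saved stack pointer is the first field of the TCB, and a scheduler_pick_next() routine of your own design; the attribute spelling and vector name depend on your compiler and device, and you may prefer to put this in an assembly file outright.

struct tcb {
    unsigned int sp;     /* saved stack pointer; must stay the first field */
    /* ... whatever else you keep per task ... */
};

extern struct tcb *current_tcb;
extern struct tcb *scheduler_pick_next(void);   /* returns the next TCB (in R12) */

__attribute__((naked, interrupt(TIMER0_A0_VECTOR)))
void tick_isr(void)
{
    /* Hardware has already pushed PC and SR.  Save the rest of the task's
     * context, record its SP, pick the next task, switch stacks, restore. */
    __asm__ volatile (
        "push r15 \n push r14 \n push r13 \n push r12 \n"
        "push r11 \n push r10 \n push r9  \n push r8  \n"
        "push r7  \n push r6  \n push r5  \n push r4  \n"
        "mov  &current_tcb, r12    \n"
        "mov  r1, 0(r12)           \n"  /* outgoing task: save SP in its TCB */
        "call #scheduler_pick_next \n"  /* runs on the outgoing task's stack */
        "mov  r12, &current_tcb    \n"
        "mov  @r12, r1             \n"  /* incoming task: load its saved SP  */
        "pop  r4  \n pop  r5  \n pop  r6  \n pop  r7  \n"
        "pop  r8  \n pop  r9  \n pop  r10 \n pop  r11 \n"
        "pop  r12 \n pop  r13 \n pop  r14 \n pop  r15 \n"
        "reti"                          /* pops the new task's SR and PC     */
    );
}

Note that the scheduler call happens before the stack switch, so every task stack needs enough headroom for it; on a 430X part you would also want the pushm/popm and 20-bit forms instead.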

Part of the question of what exactly to do will depend on the c compiler and whether or not you are allowing pre-emption. From your earlier comments, I gather you _do_ want to support quantums and timer pre-emption. If that's the case, you are right to worry a bit. But the above doesn't really touch on the important issues. I'll say why. Later, I'll talk more.

In the above case, a given c compiler will have some registers assigned special purposes that transcend any and all functions. The SP and PC and status register are obvious examples which are shared throughout. Often, there is something else called a 'frame pointer' or 'base pointer' (x86 parlance) which is used to keep track of the current function's stack frame. There may be other such things, as well. All of these special registers must be preserved for a given thread... but some may automatically be saved, so it's good if you know exactly what is going on at all times. You will need to explore the c compiler and chip docs for this information.

Also, a c compiler will usually divide the remaining registers into two types: those preserved across calls and those allowed to be scratched across calls. Where that dividing line is drawn and exactly which registers these are will depend on the c compiler, itself. Those which are scratched do not need to be preserved by the called routine. However, they still need to be preserved by an interrupt, because that doesn't take place by way of an event that the c compiler can 'see' though normal function calls (except in the case of the 'R08 perhaps.) However, an interrupt function only needs to preserve what is likely to change. (In the case of a thread switch driven by a timer interrupt event, though, that usually means all of them.)

Actually, the above isn't strictly correct. Some may also be assigned for use as function parameter values, too. It might be zero, one, two, three, four, etc. However many there are, the c compiler may assign as many of the first few c function parameters as it can to them. For example, if the c compiler allows up to four here (let's say these are R12 to R15, but in reverse priority) and the function includes just two 'int' parameters, then R14 and R15 may be assigned these parameter values and R12 and R13 may either be assumed to be scratchable by the called function, or not, depending on the compiler. It can choose either way. Whichever it chooses, it will be consistent about it. And if the c compiler needs to re-use R14 and R15, or if the called function needs to take a pointer to passed parameters, then it is likely the c compiler will "spill" the parameter registers onto the stack so that they have 'addresses' which can be applied.

Returning to your example above, I'll assume that i, j, k, q, and m are all 'auto' variables. In that case, they may... or may not... be assigned to registers. Or both, depending on which block of code in the function is currently executing. You cannot count on either way. If they are (currently) on the stack and not in registers, then you don't need to worry. If they are (currently) in registers, then you do. But then, you have to worry about registers, anyway, if you are going to pre-empt executing code bodies like the above. So just save appropriate registers, appropriately. And you are okay.

Mostly. There may be hardware states you need to preserve. On some MSP430 chips, the multiplier may be one of these. The c compiler may already protect it by using interrupt disable/enable sequences. But it may not and, if not, you may need to worry about that somehow (it's not necessarily always solvable... as that depends upon the hardware.) Another example may be writing to flash, etc. Not everything is interruptible and restorable under pre-emption. Pre-emption imposes some significant thinking time on your part.

Not for the most part. There are exceptions, such as cases that hardware may impose from time to time. But not in the way you asked... for loops are always, in my experience, easy to handle.

Depends on your own needs. Do you need to access the registers of a process, elsewhere? If so, a task block location is usually somewhat easier to access and modify. But not always. There is no single answer here. This is one of those answers you should be providing, not us. It's a design issue for you to work out.

I'm going to add a few comments of mine below. They are general, but I hope useful.

Preemption vs Cooperation
-------------------------

Probably one of the more important decisions to make when tailoring an operating system is whether or not preemption is supported -- because preemption involves some careful thought and effort to implement well.

Just so we are clear, preemption is defined this way: when preemption is disabled, an interrupt is not permitted to cause a rescheduling event. When preemption is enabled, such hardware events may cause rescheduling to occur.

Preemptive systems may take control of the CPU, and switch away from the current process to some new process, at any point in time. Because of that, there are special considerations to worry about and usually more state to save away. This means more RAM gets used, in addition to taking more time (and code.) Cooperative systems will only change from the current process to a new one when the current process uses a system function call of some kind. That is usually a much more convenient point: certain registers are assumed to be scratched, and it's pretty certain that any temporary but static variables used by some library routine shared among various threads/processes have served their purpose and can safely be re-used by the next process. For example, if the c compiler has assigned some registers as "scratchable across function calls" and the cooperative process calls an operating system function to switch to a new process, that task switch does not need to save those scratchable registers. This saves space and time.
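
In code, a cooperative switch point can be as small as this (a sketch with made-up names, assuming the TI/msp430-gcc convention where R4-R10 are preserved across calls and R11-R15 are scratch). The assembly helper behind it only has to deal with R4-R10, the stack pointer and the return address of each task -- that's the saving being described:

struct tcb;                                  /* your task control block      */
extern struct tcb *current_tcb;
extern struct tcb *scheduler_pick_next(void);
extern void os_switch_to(struct tcb *next);  /* asm helper: saves R4-R10, SP
                                                and the return PC of the
                                                caller, restores the same for
                                                'next' (not shown here)      */
void os_yield(void)
{
    struct tcb *next = scheduler_pick_next();
    if (next != current_tcb)
        os_switch_to(next);                  /* returns when this task is
                                                scheduled again              */
}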

Let me emphasize the library issue by pointing out some specific examples that confound preemptive switching. I've encountered floating point libraries on IBM PC C compiler tools that use temporary, static memory to perform their calculations before returning (in cases, for example, where the CPU does not include floating point support, itself.) [Yes, that's old news these days. But there was a time when floating point wasn't a given and software did the task on the x86 processor.] In any case, if you preempt a thread that is in the middle of one of these FP library calls and switch to a new thread that may also call a similar floating point library call which also uses those same temporary, static locations -- then it very well may ruin the proper execution of these threads, unless the process state includes these static temporaries in what it saves for each thread/process. This doesn't just happen with FP libraries (in fact, the FP libraries may be just fine), but with other libraries as well. Anything that has 'static state' that may be disturbed by some new thread can be a problem. It also may not be -- for example, you may _want_ the side effects in the case of I/O calls.

There may also be memory-mapped hardware that is commonly used by various threads, such as add-on multiplication units, that rely upon a very precise sequence of events in order to specify what operation they will perform. Interrupting such a sequence either requires saving the temporary state of the hardware so that it can be restored, having application software prevent interrupts/preemption during these sequences, or else adding back-tracking software to analyze such sequences and restart them properly (if it isn't possible to read the current hardware state.) [That can be done by analyzing the instruction stream leading up to the point of interruption.]

I've also encountered several different C compilers using static memory for compiler temporaries. It's not good practice by c compiler vendors, but it happens. Trust me. It happens. These compilers often add "live variable analysis" so they can save and restore such temporaries across calls they can analyze at compile-time, but they are unable to analyze what happens to them at run-time when interrupts may occur. Again, this can yield very serious difficulties to an implementer who wants to support preemption.

In the case where cooperative switching takes place, it only happens when there is a system function call made, directly or indirectly, by the current thread or process. It's almost always safe to switch threads at this point, because unknown or difficult-to-track hardware states aren't left at loose ends, and neither are unexpected temporary static variables (because a compiler can see that a function call is being made to a routine not in the currently compiled source code unit and will therefore use its live variable analysis to generate any necessary saving and restoring of those temporaries across the function call.) More, at these points there are usually several registers that are assumed to be scratched by the compiler, so there are usually fewer registers to worry about saving, as well. (As already pointed out.)

In other words, it makes a difference. If you are in a very tight memory system for your embedded work and every byte makes a difference, then try and carefully consider the idea of avoiding preemption of any kind. If round robin sharing of the CPU is important or if you need somewhat more precise timing of sleeping threads starting up or if you need to quickly start processes on the basis of other hardware events, you'll probably use preemption. But be aware that there is a price in terms of your diligence in evaluating exactly what constitutes the complete state of a thread and also finding enough RAM in order to save it on a per-thread basis.

You'll encounter arguments saying that no self-respecting compiler would use static temporaries or that any decent floating point library would use the stack for its temporary state or that any reasonable compiler will automatically protect sequential hardware states with disable/enable interrupt prologues and epilogues. In short, you might be willing to gamble on the idea that the world will be kind to you and that these difficulties simply won't occur and you shouldn't have to worry about or deal with them. That's your choice, too. But be wary. You may get what you deserve and find troubles down the road. My advice is to enable preemption if and only if you really know what it means, have done your homework and due diligence, and have been able to assure yourself through careful, thoughtful research that these problems are not present and/or that you do know how to cope with them. Or, just take a shot at it, and test the heck out of your application. Just know that cooperation is fairly safe to use, even in relative ignorance and bliss, and that preemption simply isn't safe unless you've thought about it a bit.

Real-Time Clock
---------------

Another important decision to make is whether or not a real-time clock is desired for the application. I think you've basically said it is. But the real-time clock serves two primary purposes. First, the real-time clock can count down a delay so that threads in the sleep queue can awaken (get moved to the ready queue) when their time delay expires. (Technically, you can have a sleep queue even if you don't use a real-time clock, but then there will be no way to automatically count down time delays and thus appropriately move such threads to the ready queue when their delay expires.)

Second, the real-time clock can time out the currently running thread if quantums are enabled. This won't do anything if preemption is disabled. But with preemption, this will cause the operating system to reschedule the current thread and, if other processes have the same priority, to round robin share the CPU time.

Including real-time clock support does NOT necessarily imply preemption. If preemption is disabled but the real-time clock is used, this only means that threads can move from the sleep queue to the ready queue on timed intervals and that the current thread can have its quantum counted down. But if preemption is disabled, then that is all that happens. No change in the current thread need be considered until the current thread makes a system call of some kind, directly or indirectly, that may then cause a reschedule event.

A real-time clock is essentially a hardware interrupt event. Disabling preemption means only that this real-time clock event isn't permitted to cause rescheduling. The remaining actions may still continue. Enabling preemption means that rescheduling is permitted when the interrupt occurs and this implies that thread motion from the sleep queue to the ready queue or else thread quantum time-out can cause the current process to change, if there is an equal or higher priority process ready to run.
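
Something like the following (made-up names, just a sketch) is all the clock event handler has to do: with preemption disabled, the flag simply sits there until the next system call notices it, and with preemption enabled the ISR exit path would act on it immediately.

struct tcb {
    struct tcb   *next;
    unsigned int  ticks_left;      /* assumed >= 1 while on the sleep queue */
};

extern struct tcb *sleep_queue;                /* singly linked sleep queue  */
extern void make_ready(struct tcb *t);         /* move a task to ready queue */
extern volatile unsigned char need_resched;

void rtc_tick(void)                            /* called from the timer ISR  */
{
    struct tcb **pp = &sleep_queue;

    while (*pp) {
        struct tcb *t = *pp;
        if (--t->ticks_left == 0) {
            *pp = t->next;                     /* unlink the expired sleeper */
            make_ready(t);
            need_resched = 1;
        } else {
            pp = &t->next;
        }
    }
}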

[Obviously, the real-time clock may also be used to keep track of elapsed times or other similar purposes (wall clock time, up time, various metrics, etc.)]

Making a real-time clock available has a price to pay. Some CPU time is used by the clock event handler each time the clock interrupts the CPU. If the clock is operated too rapidly the time spent in its event handler can consume nearly 100% of the CPU time, leaving almost nothing left for normal process operations. So be careful about deciding the interval.

It also uses (or at least, shares) a hardware timer resource, which may itself be a scarce resource and/or used for something important to the application. Sharing a timer may also link together two very different functions which should be allowed to have their timing determined entirely independently.

Quantums
--------

Quantums are used for round robin sharing of the CPU time -- if preemption is enabled and the real-time clock is also enabled. Without a real-time clock the current thread quantum cannot time out, and without preemption enabled no switching to the next ready thread of equal priority can occur (assuming no higher priority thread is ready, of course.)

Quantums can be enabled even if there is no preemption and even if there is no real time clock event. A quantum merely requires some RAM for a quantum value for the current process. If there is no real time clock, then the quantum isn't automatically updated. And if there is no preemption, then there is no rescheduling when the quantum reaches zero. But no conflict inherently arises from the lack of either one of these associated facilities -- or from the lack of both of them.
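
The quantum part of the tick is correspondingly tiny (again, made-up names; the flag is only acted upon right away if preemption is enabled):

extern volatile unsigned char need_resched;
extern unsigned int current_quantum;   /* copy of the running task's quantum,
                                          reloaded at each task switch       */

void quantum_tick(void)                /* also called from the timer ISR     */
{
    if (current_quantum != 0 && --current_quantum == 0)
        need_resched = 1;              /* time slice used up; round robin to
                                          an equal-priority task if any      */
}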

.....

Hope that helps a little and doesn't make things worse.

Jon

Reply to
Jon Kirwan

One thing to watch out for on the '430 is the hardware multiplier. Its state isn't part of the normal register set, and HW multiply operations are not atomic. If you use the HW multiplier in more than one task, you've either got to save/restore its state as well, or you have to make sure that interrupts are disabled around HW multiply operations.
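
Something along these lines is one way to do the latter -- a sketch only, using the register names of the 16x16 hardware multiplier (MPY, OP2, RESLO, RESHI) found on devices that have it, and intrinsics as spelled in msp430-gcc/IAR-style headers; adjust for your tool chain:

#include <msp430.h>

unsigned long mul16(unsigned int a, unsigned int b)
{
    unsigned int gie = __get_SR_register() & GIE;  /* remember interrupt state */
    unsigned long result;

    __disable_interrupt();         /* keep the multiplier sequence atomic      */
    MPY = a;                       /* first operand, unsigned multiply mode    */
    OP2 = b;                       /* writing OP2 starts the multiplication    */
    result = ((unsigned long)RESHI << 16) | RESLO;
    __bis_SR_register(gie);        /* restore GIE only if it was set before    */

    return result;
}

The alternative is to treat MPY/OP2/RESLO/RESHI/SUMEXT as part of the task context and save/restore them in the switch code, at the cost of a few extra words per task.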

--
Grant
Reply to
Grant Edwards

Did I fail to introduce that idea in my post? I seem to recall including some words there.

Jon

Reply to
Jon Kirwan

I probably missed it.

--
Grant
Reply to
Grant Edwards

I probably wrote too much.

Jon

Reply to
Jon Kirwan

Have you ever thought about setting up a website (perhaps a wiki), and just posting pointers when answering posts like these? Sometimes in this group you post an amazing amount of information - it seems a shame to think this work will disappear (no one actually follows the rule of searching the newsgroup archives before posting a question). Posts like that are also so long that many people will skip them, thinking they'll read them later when they have more time. If you had this stuff on a wiki, I know I would read it more often.

mvh.,

David

Reply to
David Brown

Hehe. Okay. I admit I do have a site I can use where I can install server software. I'll see about setting something up. You are probably right, though it does take some effort to set things up, think about what may be desired, and get them seeded well enough with starting material. But it is something I should do. Kick me more. ;)

jon

Reply to
Jon Kirwan

There are generally two cases where you save the context of registers, but essentially only one reason.

You save the context of the registers because you want to be able to restore them and you have to save them because they are potentially going to be trashed by the ISR code. The two cases of save and restore of context are on ISR entry/exit and pre-emptive task switching.

So the standard entry routine of an ISR will save the registers that are going to be over-written by the ISR code. In theory, you only need to save the registers that have the potential to be changed, because all of the other registers will remain intact. By only preserving the registers that are going to be used in your ISR, you greatly speed up entry to the ISR and thereby reduce interrupt latency; in the case of high-frequency ISRs this also improves overall system performance.

In a non-pre-emptive system, generally the context would be restored by the ISR as it exits and the system would go back to where it was previously. Those registers that were trampled on by the ISR would be restored and the program would continue as if nothing had happened other than the progress of time. The event of the ISR will have been injected into the system and dealt with at some later point.

When you move on from the standard context save and restore of an ISR and into the realm of pre-emptive task switching, you open up a whole new world (of pain). You could constrain your entire application code to using local variables and use compiler and assembler directives to prevent registers from ever being used. This would reduce the context that was required to be saved and restored, but could seriously affect the performance of your application. In most pre-emptive systems, tasks maintain their own stack linked to the TCB and the context can be safely saved and forgotten about.

My opinion is that your ISR should save and restore as little as possible so that your latency and performance are as good as possible. Your scheduler however should save and restore as much context as is practicable to make each task as free as possible to do the job in hand.

In the code example you have given, if the ISR does not use R10, then there is no need to save and restore it because it will still be whatever it was when the ISR exits. However, if ANY task based code ANYWHERE in your system uses R10 as you have shown, then you will need to save it and restore it in your task stack (or force your entire application not to use R10).

The bottom line is that you only need to save and restore what might be trashed between pre-emption and return. In the case of an ISR you only need to save and restore what you know your ISR uses. In the case of your task being pre-empted, you need to save and restore what any other task in the system might trash.

Reply to
Kevin P

I'd like to thank all of you who helped me with this. It means a lot, especially because I'm new to this. But I would like to ask you something else. Is there some kind of universal test which can verify correct operation and measure RTOS performance? Let's say I have finished my small RTOS for the MSP430. Of course it's very simple. It just has a kernel API for task creation (which cannot be called from a task routine, at least for now) and task suspension. Basically it just saves and switches context, and it has two possible scheduling algorithms to choose from (both preemptive). Now I would like to measure its performance and be sure that everything is fine so far, because I would like to add more features later. Any help regarding this subject is welcome. Is there some book that you could recommend, or an online benchmark?

Thanks in advance !!!

Reply to
brOS

I don't know of a "universal test." And I'm not sure how I'd attempt to design one. Do you know what you'd expect in one?

As far as the MSP430 goes, which is actually a fairly simple case, you might look at specifying the fixed part of the data footprint together with the per-process and per-semaphore (and per-whatnot) data required and specify a formula there. That would allow folks to get a close figure on what it would cost them in terms of scarce ram on the MSP. You might also provide a similar figure for the flash requirements. Add to that the timer interrupt cpu time requirements and the typical case in terms of % of cpu used by the timer event, context switch times, and latency requirements from the sleep queue to running a highest priority process; such things might be nice to know. Features supported are important to know, too.
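
As a sketch of the kind of formula I mean (the numbers here are invented, purely to show the shape of the thing):

/* Illustrative only -- substitute your own measured sizes. */
unsigned int os_ram_bytes(unsigned int n_tasks, unsigned int n_sems,
                          unsigned int stack_per_task)
{
    const unsigned int fixed_kernel = 32;   /* queues, current-task pointer, ... */
    const unsigned int tcb_size     = 16;   /* per task                          */
    const unsigned int sem_size     = 4;    /* per semaphore                     */

    return fixed_kernel
         + n_tasks * (tcb_size + stack_per_task)
         + n_sems  * sem_size;
}

With figures like those, someone can tell at a glance what four tasks with 96-byte stacks and a couple of semaphores will cost them in ram on their particular part.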

I have built my own O/S that runs on the MSP430 and some other processors. The principal feature is compile-time configurability. When a feature is not enabled or selected, the code and data associated with it are eliminated from the resulting object code. It's a prime goal that adding features to the source code does not incur a cost if an end-user doesn't require them and deselects them in the compile-time configurations. A minimally compiled system does NOT have a real-time clock, quantums, semaphores, messages, thread priorities, sleeping threads, and so on -- but it does support all of those features if desired at compile-time. In the minimal case, though, it's just a simple, cooperative switcher, able to keep the stacks separated (and that's about it.) It supports Harvard and von Neumann architectures and the kinds of memory systems that common microcontrollers provide (such as read/write and read-only for either code or data.) It has an explicit design requirement to isolate those portions of the thread/process data structures which can be arranged into read-only memory from those portions which require read/write access during operation, in order to reduce ram requirements further. (Threads can be defined at compile-time, not only at run-time.) I could provide figures to compare with, if I knew what you were wanting to compare against.

You and I are neither the first nor the last to do this kind of stuff, but I think it is important that you try your hand at it. I wish everyone did. It's an extremely good learning experience for any c programmer to go through to improve their own understanding of both the c language and exactly what an operating system actually does. Please keep up the work!

Jon

Reply to
Jon Kirwan

Sounds like a different version of the halting problem to me. ;)

There are RTOSes that are certified by various agencies, but I think that involves a lot of code review/inspection and detailed test-cases. I don't see how a universal test would be possible.

--
Grant
Reply to
Grant Edwards

Maybe it's just me recollecting teaching myself 8086 assembly in the 80's, but I think the practice back then was to save all the registers.
--
// This is my opinion.
Reply to
jebblue

For orthogonal instruction sets in which autoincrement and autodecrement addressing modes can be used with at least one register other than the stack pointer, it really does not matter very much.

However, if short push/pop operations can only be used with the stack pointer and the reference to the TCB would require longer/slower instructions (or if the TCB is in a different address space requiring extra code for the access), saving on the current stack would be preferable.

Also, saving to the TCB would usually require a longer period during which interrupts must be disabled.

Saving on the stack, the interrupts could be enabled during the context save phase and later during the context restore phase; they need only be disabled while storing the current stack pointer to the current TCB and loading the new stack pointer from the selected new TCB.

Paul

Reply to
Paul Keinanen
