Delta Queue Help - Paging Mr. Kirwan

E

eeboy 14 years ago

I am trying to implement a delta queue to schedule some tasks in the future. The structure is based a lot on comments that Jon Kirwan has made in this news group. I have constructed something but I don't think its very elegant (I don't write code often). It also seems a little flaky perhaps due to incorrect use of the volatile specifier.

Implementation is located here:

formatting link

The idea behind what I have is there exists a static buffer which is filled with delta queue objects. Each delta queue object has pointers to the previous/next delta queue object, a relative delay to the previous task and some process information (process ID and function pointer). There are 3 global pointers, the ready pointer, sleep pointer and the available pointer. The ready pointer points to a list of tasks to be executed. The sleep pointer to a list of tasks which are... well... asleep and not ready to execute. The available pointer basically points to the end where there is an available slot. These pointers only move forward. When one is pushed up against another, that 'sub-queue' is empty. For example, when the ready pointer equals the sleep pointer, there are no ready tasks.

SchedulerTick() is called by the systick interrupt at a rate of 1ms. SchedulerRun() is called from the main loop of the application (when it's not doing anything else) so that my systick interrupt is very short. SchedulerInsert() is called as needed to schedule a task.

I may have more issues than I know but my current questions are:

1) Since the sleep pointer is the only thing adjusted in the interrupt (SchedulerTick()), I have declared it as volatile so that it's value is never assumed by the compiler. Is this the correct usage? A few times, I have gotten stuck in the while loop within SchedulerTick() because the next and previous pointers have gotten corrupted and I never reach the available pointer (although they still point to valid queue objects). Not sure if they are related...

2) The SchedulerTick() and SchedulerRun() functions are pretty straightforward. The SchedulerInsert() gets a bit messy. Most of the mess is caused by the fact that a task can be inserted in the queue before the sleep pointer. This is what forces me to disable interrupts in the SchedulerInsert() function. Anyone have any recommendations to keep from disabling the interrupts? I can live with the small delay.

Any suggestions?

--------------------------------------- Posted through

formatting link

Vote

J

Jon Kirwan 14 years ago

I have a few questions before I go digging through your code. First, what is the processor? I would guess from the NVIC mnemonic I see that it is an ARM, probably Cortex-M3? I haven't tried doing much (near zero, to be honest) coding on such processors... but I do have a setup here that I could try things on, I suppose. Would take some reading, though, to be sure I had a clue first. Second, I have the Code Sourcery stuff (free style) set up on my machine to work with the Energy Micro tools and they do seem to compile and go. (About all I've done so far.) So what compiler tools are _you_ using? Just curious.

So rather than go through your code with an ignorant eye, it might be better that I just describe the overview and you check to see if you are achieving it.

I will use TIMER_BLOCKED and TIMER_UNBLOCKED to mean that timer interrupts are disabled but that the timer is allowed to continue running. I will use TIMER_OFF and TIMER_ON to mean that the timer counter is enabled or disabled. No change to the interrupts are implied. I will use TIMER_INIT to mean setting up the timer registers but that the timer counter is disabled from counting. Nothing is implied about the timer interrupts here, either. So it leaves them alone.

You probably need some kind of init function for the queue. I prefer to write it so that it works even if the queue is in operation. So I typically block the queue timer right away to prevent it from looking at anything while I'm screwing around in the init routine. Then unblock. Of course, I also set up the timer.

I do something like this:

TIMER_BLOCK; quantum= 0; sleepq= availq= readyq= 0; for ( i= 0; i < N_DQUEUE; ++i ) { n[i]= i+1; q[i]= MAXTIME; z[i]= NULL; } n[N_DQUEUE-1]= 0; TIMER_INIT; // timer should be off but ready. TIMER_UNBLOCK;

I've kept the above pretty simple. However, to make things slightly faster, I create a special variable called 'quantum' that holds the counter value for the first sleeping function. That way I don't have to index into an array, but just use a simple static which is usually faster. So the value of the first sleeping function's remaining time is in 'quantum' and not in the associated timer slot for the node. (It's copied from there when the sleep queue advances.)

You may notice that I use n[] for the next pointer (no previous pointer exists), q[] for the delta timer quantums, and z[] for the function pointer. q[i] is set to the maximum possible value for empty slots, z[i] is always NULL for them of course, and each points to the next except for the last one, which is of course set to the first once the loop exits above. So that's about it for the init routine.

The interval timer routine may sometimes need to re-enable the timer right away. Depends on the processor and I am not familiar yet with the Cortex-M3. But obviously, the interval timer ISR must handle any timer-hardware-related stuff, as required. Beyond that the algorithm is pretty simple:

if ( quantum > 0 && --quantum == 0 ) { auto int i= sleepq; do { i= n[i]; if ( i == availq ) { sleepq= i; TIMER_OFF; return; } } while ( q[i] == 0 ); sleepq= i; quantum= q[i]; }

If quantum is 0, there's nothing to do because that's not legal unless there is nothing to do. In that case, skip out and handle hardware related things before leaving. Else, the quantum is decremented and checked to see if the first sleeping process times out. If so, all that extra code in there happens.

At this point, I load up the sleepq into a temporary variable and advance it past any subsequent sleeping processes that _also_ have a delta quantum of 0. Note that this means reading the next pointer for each, since they are not necessarily in sequential array index order (though they start that way.) Once the sleepq has been pushed forward, the sleepq variable is updated.

It's a good time to note here that sleepq and quantum are marked as 'volatile.' As you can see, they are modified in a timer interrupt. So that's important. Note that availq, q[], and n[] are not modified here. Just read. The rest isn't even examined.

The execution routine is pretty simple, too:

auto int i; auto void (*f)( void ); while ( (i= readyq) != sleepq ) { f= z[i]; q[i]= MAXTIME; z[i]= 0; readyq= n[i]; (*f)( ); }

I call this routine "cooperatively." This means it is not driven by any timer, nor does the timer routine attempt to execute anything. So usually the main code will repeatedly call this routine after setting up some first routine as the sleeping one, as in:

for ( ; ; ) dqueue_execute( ); // or whatever you name it

Anyway, since this routine executes functions and these functions may in turn call this function or the insertion routine (or even the init routine, I suppose), it's important to re-check the value of readyq at the top of the loop. Do not place it into a temp variable as part of a for() loop, as it's value may become stale. So this may require (depending on optimizations) you to mark readyq as volatile, as well. sleepq should already be volatile, so that will be examined each iteration, already.

The while loop tests if the process is ready to run. If so, it gets the function pointer into a temporary, sets the queue entry back to default values, advances the readyq pointer, and _then_ executes the function once the structure is safely updated.

This order is very important. The structure must be in good shape _before_ making any calls because they may call into the dqueue routines. This is a common mistake (okay, I admit it, for me anyway; sometimes I get too aggressive with the loop code.)

It's pretty vital that the readyq be read each iteration and it is similarly important that executing the function is the very last thing the loop does before going at it for another try.

This leaves 'insert.' Okay. The most complicated for last. I have two parameters for it, so call them f and dtime. f is the function pointer to insert and dtime is the number of ticks to wait out from 'now.' The code starts out like:

int cur= availq; if ( dtime

Vote

E

eeboy 14 years ago

Thanks for the detailed response! I am still digesting your input but I thought I'd respond to answer your questions. You are correct. I am using a Cortex-M3. I am compiling on IAR's Embedded Workbench as well.

We seem to have a lot of similarities at first glance. On the first read I noticed a few things I need to double check and so I'll spend some time doing that now. I think the main difference is in how the queue is set up. You use several arrays and I have defined the following:

//! Delta queue structure encapsulating a complete process entry into the queue

typedef struct dq{ struct dq * psPrev; //Address of previous queue entry struct dq * psNext; //Address of next queue entry unsigned long ulDelta; //Delta ticks (ticks relative to the next/previous process)

tProcessObject sProcess; //Process to be executed } tDeltaQueueObject;

.. where tProcess contains some process specific stuff (the function, a PID, etc). Note that I am maintaining a pointer to the previous entry here.

Thanks again! I'll be back shortly with questions :).

--------------------------------------- Posted through

formatting link

Vote

J

Jon Kirwan 14 years ago

I guess I'd add that the .h files, and most especially the deltaqueue.h files, are missing from the web page. That would help -- for example, I do not see the definition for sDeltaQueue, though I can assume.

Let's start here. In SchedulerInit() you don't do any intialization of the .psPrev or .psNext pointers in each of the array elements of sDeltaQueue. Why not? This may be the result of a carefully constructed design, so I'm more curious than anything else. But I'm slightly uncomfortable with the queues not actually having their links set up at the beginning.

I do notice that you test the null cases in FreeEntry() in order to scan the entire array for a free node. So I'm sure you have written some code to deal with at least one of the situations that may arise. But there is no need for this work. If you arrange things at the outset, then it is always the case that the next node pointed at by the available queue pointer is the right one to use. You don't need to search. In your case, and I've not gone through all the code yet, you appear to just imagine that anything that isn't nailed down is available... but you need to search for them. So it works a little different from what I'm used to. It must be the case that you just unlink stuff in some list and then null out the psPrev and psNext in the freed item and just leave it to be rediscovered.

My design, which has worked for a few decades in a variety of places, is to keep all nodes linked in a large loop which is broken into three sectors and to use the avail, sleep, and ready queue pointers moving around the loop. Moving a node from the sleep queue to the ready queue is nothing more than advancing the sleep queue pointer around the loop by one step. Nothing else to do for the interval timer and this helps keep things fast where it counts. Moving a node from the avail queue to the sleep queue, and I disallow the case where the time to wait is zero so an avail queue node is never moved directly the ready queue, involves a simple advancement of the avail queue pointer and then a sleep queue insertion process which is more complex. So the insertion code involves a focus on that insertion process, which makes sense. And moving a node from the ready queue back to the avail queue involves, once again, nothing more than advancing the ready queue pointer. So this puts the 'hard' work on the insertion process. And it's not that bad. The rest is nil.

I wonder about the approach of having a 'sea of unconnected' available nodes with ready and sleep queues that grow and shrink and have to manage the case where pointers in once available nodes are garbage values until they get installed in the sleep queue. I suppose that can be made to work. And I'm not complaining. Just commenting. But since you are having troubles, perhaps it's not entirely unrelated to that approach and confusion about what I'd written about earlier where I wasn't imagining things the way you've chosen to go.

That said, let's take the case of an empty structure and the first insertion. Here, most of the insertion code gets skipped (the while bails out) and after FreeEntry() gets called you have psAvailable and q both pointing at the same node. You mark psPrev of that node to point to itself and psNext to be NULL. (I'm ignoring the NVIC stuff, being ignorant.) So at the end, I don't even see the sleep queue being modified, but I do see that the available pointer now points at something. That seems a little bizarre to me. Why do you even bother with an available queue pointer, if you have a design where everything not pointed at by the sleep or ready queue is, by definition, available? It seems not needed. And why here do we wind up with the sleep and ready queue pointers unmodified while the available queue pointer is modified?

I'm in need of an explanation of your design concept, I guess. That will go a long way in figuring out when some bit of code is right or wrong.

Jon

Vote

J

Jon Kirwan 14 years ago

Thanks. I think I'm using Eclipse, though of course I have the IAR, Atollic, and the one from Rowley that I could try out within the associated limitations of each that I must follow (fair usage, optimization, space, etc.) In the Eclipse case and the Code Sourcery tools, there are no limitations I'm aware of right now. And the price was right (I owe Code Sourcery a vote of thanks for their generosity in every six months putting out a formal revision for us very cheap types.)

But that's no real difference at all. I just unbundled them. You bundled them up. Either way is just fine. To access psNext, I might use psNext[4] where you might use queue[4].psNext. Same semantic, really. And if the compiler is smart enough (most are) very similar code will result.

However, it turns out in laying these things into larger systems (I usually do include them with semaphores, as well), there are some space advantages in my method. (Not every node in every queue requires every component.) So I can cut down just a tiny bit on memory usage. In the Cortex-M3, it will likely make no nevermind. But on something like the PIC16C54, with 25 bytes of sram, it makes all the difference in the world. Okay. Yeah. The PIC16C54 is maybe pushing an extreme to make a point. But you get the idea. Sometimes it does help a little. And I don't see much difference in the concepts, so I break it all out.

I noticed.

Okay.

Jon

Vote

J

Jon Kirwan 14 years ago

That last probably should be: TIMER_UNBLOCKED;

Sorry, Jon

Vote

E

eeboy 14 years ago

There's not much to the deltaqueue.h file but I've posted it here (

formatting link

). sDeltaQueue is actually defined on line 43 of deltaqueue.c. Note that in the original posting of deltaqueue.c I left out SchedulerElapsedTicksGet and SchedulerTickCount. They aren't important to the deltaqueue functionality... but they are the reason I leave the timer always on. Hope that makes sense...

Besides the obvious goal of scheduling a task to be completed in the future, I also wish to be able to remove tasks from the queue by process ID. In my application this could be very useful. I have a need for many timers but only 6 hardware timers available. MOST of the timers that I need don't need great precision but I need to be able to call off the timer. I haven't even begun to implement this as the insert is not solid yet... but it is one of my requirements.

Also, I wanted to decouple this code and make it as portable as possible so I tried to not have anything specific to any architecture. For example, ScheduleTick() is linked to the system tick interrupt vector in the main c file. I will eventually remove the other included files (interrupt.h, hw_nvic.h, hw_types.h) by registering callbacks to turn on and off the master interrupt.

Finally, the NVIC references are nothing to worry about. I am just accessing the hardware register of the current count so that I could calculate how long an insert took. This code will not remain. Make sense?

Ok... so now let me explain what I am doing without confusing myself. I have declared sDeltaQueue sufficiently large. Initially everything is cleared. I assume an entry with NULL pointers, no PID and no function pointer is an open slot. My ready, sleep, and available pointers are all equivalent and point to the sDeltaQueue[0].

When an insert is performed, I begin iterating at the sleep pointer and do so until I find a slot (based on delta) or I bump up against the available pointer. So, when nothing is in the queue, I insert at the next available location pointed to by the available pointer and I increment the available pointer by finding an open slot. This new available pointer now points back at the task we just inserted and that task points forward to the new available pointer.

The same thing happens on subsequent inserts where the delay provided to the insert function is greater than the sum of the deltas (it's placed at the end of the list).

When a task is inserted and it goes somewhere besides the end of the queue it's handled a bit differently. First, a free slot is located. The slot is then linked in (adjust pointers of new slot, previous, and next). The delay of the inserted and the task ahead are adjusted. Then the process information is added (PID, pf). Finally, I check to see if the task that we just inserted was inserted before the sleep pointer. If it was, then it becomes the new sleep pointer (I don't like doing this, I'd rather I never touched the sleep pointer).

That's the insert function. The run function is run from the main loop of the program. It is pretty straight forward. I just monitor the ready pointer to see when it does not equal the sleep pointer. When this is true, there are tasks to run. I simply execute the process, clear out the task (zero pointers, and pid so that it becomes a free slot) then increment the ready pointer. I did note that my execute should come last after reading your message. However, I only ever call the insert function from this executed task. I never call the run task from the executed task. So, the ready pointer should not be modified.

The tick function is even simpler. If the the sleeping task has a delta, I decrement it. I then move the sleep pointer forward until I reach a non-zero delta or bump up against the available pointer.

So that's the thought behind it. Not that there can't be any major flaws. I struggled with finding a nice insert process. I found the links were easy to think about and seemed efficient... but I never liked the idea of modifying the sleep pointer inside the insert.

I wanted to get that off... I still need to go through your original message and compare it to my implementation. I've been interrupted all day but have quiet time now.

Thanks again!

--------------------------------------- Posted through

formatting link

Vote

J

Jon Kirwan 14 years ago

If you notice, in my large post I carefully exhibit a method for insert that _also_ does not modify the sleep queue pointer. That's why the "switcheroo." So I share your intuition here. We are in synch.

Also, since you are looking for a delete function this really is going beyond the simpler delta queue pattern and more into a very simple operating system pattern. I can provide something to help on that score. In fact, a lot. If you want it. Of course, if you are like me you will want to design and code it yourself but perhaps learn the pattern well from something you look over. So I'm good with that.

Jon

Vote

E

eeboy 14 years ago

Actually, yes, I'd love to look it over. I am interested in how complex/simple it is. A delta queue would certainly serve my purposes for quite a while. I can work around not being able to remove items from the queue but it would make things messy (lots of flags). The ability to remove processes would be tops!

I think I have finally got the delta queue working reliably with all your help. I ended up dropping the method of linking/unlinking that I was using in favor of your method. I've single stepped through the insert process so many times and all looks well. So far, no flaky behavior.

--------------------------------------- Posted through

formatting link

Vote

J

Jon Kirwan 14 years ago

Hmm. Very simple? For example, I can send you one that I wrote in less than a day, from scratch since I had no code with me at the time while sleeping under a desk at a client site 1000 miles away from home.

It provides what you've mentioned and includes per-process exception handling in c (not c++), as well (the start of a linked list of longjmp buffers are maintained in the process structure.) It also includes simple messages targeting processes, where there is a word stored in the message structure for that (and if two are sent before the first is grabbed, then the first is lost -- which is fine for many uses and very simple and safe to implement.)

I think of it as simple, yet useful. And it is only a slight extension to this delta queue idea. I didn't add semaphores in this case. But it would be easily added.

Operating systems is something I enjoy, I guess. Got my start in 1975 with a timesharing system, then went on to working on the Unix v6 kernel in 1978 (where I learned c), and I am more of a parsimonious type when it comes to operating systems than a full-featured, general purpose type. I like compile-time feature sets that can be included or excluded by simply setting #defines, for example, and take up only those resources that are absolutely required by those selections and no more than that. And I want the c code to adjust itself accordingly. (I suppose a compiler-generator would be another answer to that.)

Small is better, if it works. I dislike unnecessary excess.

On the more general purpose types, I preferred the earlier BSD386 to the earlier Linux because the kernel was a far cleaner design from the ground up, for example. But I haven't looked over today's FreeBSD, yet. So I can't comment with an informed opinion on how things are now.

I did spend a couple of years teaching operating systems in an undergrad classroom (65 students in an average classroom) at a decent 4 year university. So I know a couple of things, or at least those who hired me felt it was so. I recently stopped by (it's been 15 years) and was greeted with many smiles and remembrances from staff who still work there. So I guess I made some kind of impression. (Yeah, probably not so good and they were just being kind to me. ;)

It's not a problem for me. I enjoy it when folks take a crack at these things and actually put some work into it. You clearly have. So I feel that anything I contribute will be matched and exceeded by your own efforts and that makes it all worth the moment to me.

I guess a longish post wouldn't kill the group. I could then worry about other venues later on (web site using doxygen, for example.) I suppose this is something I should do.

Perhaps write a book, someday, if my ego gets big enough and I'm crazy enough to try (there are plenty good ones out there already.) But perhaps the fact that there are only a few targeted at self-educating types who need access to several different operating system patterns to understand and select may mean there is room for one more. Something that starts with a very simple pattern and builds upon it, incrementally.

That's excellent. Especially considering that I was typing from memory. ;) Which is why I ordered things the way I did. It helped me to think as I went, to save the more complicated insert for last.

I'm very glad you have struggled through all this. It has probably deepened well within you this way and may help inform embedded projects in the future.

I'll see about preparing a post of some kind. :) ADHD types beware.

Jon

Vote

E

eeboy 14 years ago

I've just recently discovered Doxygen (as you probably noticed from my code). I am just starting to use it, and I am probably using it wrong but I love the LaTeX output from it and how it utilizes PGF/Tikz to create the maps.

Yeah... that's impressive! I kept that in mind the whole time (looking for a potential bug). There were a few times that I thought I found one but of course, I ended up being wrong... :)

Only way to do it with something like this. This is the foundation of an application. Funny thing is, even though you pretty much laid out in your original post it still took me many hours to code that piece because I was trying to understand every little detail that was going on.

Great! Looking forward to it...

--------------------------------------- Posted through

formatting link

Vote

J

Jon Kirwan 14 years ago

Oh, I really want to learn about it, myself. I have looked on from afar, but haven't started using it yet. I've also looked at literate coding, as well, but also never really had tools I wanted to try out and use under Windows when I've troubled myself to go look around and I've not met anyone yet using these tools in their work that I could talk with.

Point here is that I might ask you about your experiences. I'm interested in how it works to generate HTML, because in some cases I'd like to do that rather than Latex (which I'd use to generate PDF, most likely.)

I try. :)

There is a HUGE difference between some hand-waving post and achieving the intimate details of a successful implemetation. You've read the former and attempted the latter and that gains a great deal of respect from me. And it means you appreciate the results.

Well, I'll give the idea a shot soon. May as well link the post into this thread. It's not as though it deserves a separate thread. Likely you are the only one who will appreciate the details, for now. And that's enough for me.

Jon

Vote

J

Jon Kirwan 14 years ago

You mentioned the desire for deleting cooperative processes and this usually entails some kind of process ID so that one process can use it to terminate another.

An isolated delta queue pattern doesn't include that feature and shouldn't. It's job is to simply time the execution of functions. Functions that want to repeat themselves are left with the responsibility to re-insert themselves when they are started, or sometime during their execution or the execution of some other function, so that they get another shot. A more complex example may get the fuller details across on this last point:

Imagine a test and measurement system for which each measurement cycle requires a set of three timed and different types of ADC raw measurement events, plus a final computation event to combine those raw values into a final measurement. You might break this up into the functions: M (measurement), R1, R2, and R3, plus C (computation.) Let's say you also want the measurement output to be precisely timed so that the DAC is updated with a predictable, precision interval. Let's assume for discussion that the timer rate is set to 100us intervals and the measurements are 100ms apart. Some initialization is needed in M before R1 starts, too. And R1 is triggered at 2ms, R2 at 20ms, R3, at 70ms, and C at 85ms (enough time for the computation to complete before the next M cycle.) It might look like:

M: DAC=prior measurement; insert(M,1000); insert(R1,20); insert(R2,200); insert(R3,700); insert(C,850); do some inialization; return; R1: Sample lead point of integration slope; return; R2: Collect array of timed samples along slope; return; R3: Sample ending offset; return; C: Perform calculations and place result into the variable used for 'prior measurement' by M.

Note that M restarts itself for the next cycle, then sets up the right timing for the other parts of the cycle. Those parts do NOT reinsert themselves. That's the job of M. They just do their job. Since M is precisely timed, the measurements are emitted to the DAC precisely with little jitter (and that can be made very little on some cpus.) This may be important if the DAC output is used for closed loop control or FFT where the regularity is _assumed_ by some other mathematics going on following this. So it can be important.

An obvious assumption in the above layout is that M can do its work in less than 2ms (because R1 will want to start then) and so on. So the time between each function must be sufficient to get the work done and return in time for the execution function to start the next one when the timer moves it over to the ready queue. In a measurement system, that is probably important. In some other systems, it may not be so crucial. So your use of the pattern will vary, depending.

The delta queue is a sufficient pattern for such things and I use it in test and measurement equipment I design and code, pretty often. It's good enough for a lot of good work.

So the functions come up, run once, and that's it. There is no need for process IDs. Once they are inserted, they _will_ get run once when their time comes. If they want to run again, someone is responsible for that job.

This simplicity means that a very simple implementation is possible and doesn't require solving things like a separate stack (which c and c++ don't intrinsically provide and breaks with traditional compiler and linker systems design models) or process IDs. A cooperative approach can be used, too. And if a for(;;) execute(); style approach is used in main, execution jitter relative to the timer events is usually modest and tolerable without getting into complexities like separate stacks.

Once you introduce the idea of deleting functions, you likely need process IDs or, at least, some kind of pointer. The process ID is better than a memory pointer (for safety, at the very least.) So there you go. You now need to at least modify the insert routine to provide process IDs to the caller. Of course, you might also want functions that can find a process ID from a function pointer. And, if you are adding in process IDs and search-by-function-pointer support, you are getting precariously close to an actual operating system pattern -- even if cooperative.

You could just take the delta queue pattern and add those details. In other words, the insert function returns some kind of ID that can be used by a delete function that would search the sleep queue (and possibly the run queue if you want to catch even those functions that have timed out and are ready to run but haven't run yet because your function is now running.) A delete would then readd the delta time to some other following function (if there is one) so that the time isn't lost (remember that during insert, time is subtracted before insertion, so that needs to be added back.) It's possible the function to be deleted is the one being timed, so just remember that in my earlier design 'quantum' holds that value to be added back when modifying such code.

But I won't bother with those details. I think you are perfectly capable of extending what I wrote before to keep track of these details while adding a delete function to the existing pattern. And it doesn't make sense to post another boring post with small details I'm sure you are capable of.

So what is the next pattern? Probably an operating system that is still cooperative-based (in that functions do not get executed, even when ready, until some main function decides to call execute.) Why is this the next _step_ in patterns? Because it would be a HUGE STEP to move to preemption. The difference between cooperative and preemptive designs is monstrously large. Partly because preemption may involve saving static areas of libraries you aren't even aware of, if they might be called. Or it may require very special saving of registers in coprocessors of various kinds that might be in use when preemption occurs but can be guaranteed not to be in use across function calls. In short, you have a LOT MORE work involved than merely allocating stacks, if you want preemption to work well. And I consider that at least two steps forward, not one.

So the next logical progression is to add process IDs, add separate stacks (even this can be a problem if the compiler inserts "stack check" code you aren't familiar with), and add the basic ideas of process IDs and the ability to delete impending threads. But not to create complete processes in the usual sense where all initialized, non-initialized, heap, and stack are copied or re-inited for the new process -- in other words, global static data will be global to all threads; only the 'auto' variables are on a per process basis; and the heap is still shared. It's simple. But it means that one truly new concept is added. Which is why it is the next step. That new concept is to separate the stacks.

So that is where I'm going with this. Is that okay?

Jon

Vote

J

Jon Kirwan 14 years ago

... but _cannot_ be guaranteed not to be ...

Vote

E

eeboy 14 years ago

I follow and I am on board. This sounds to be exactly what I am after and you haven't scared me too much. I think I may be capable of it.

--------------------------------------- Posted through

formatting link

Vote

J

Jon Kirwan 14 years ago

okay. So let's figure out what we already have in the delta queue pattern. We've got an interval timer and interrupt event, from the hardware. That's good. We've got this basic concept of a delta queue. That's going to still work for the sleep queue, just fine. Processes can go to sleep and get moved to the ready queue, like before. So at least the basic concept appears good on the face of it.

Perhaps an early question to consider is whether or not you want to maintain the circular loop, though, that divides a fixed loop into three sectors of sleep, ready, and avail. That design was predicated on the idea that a function gets one shot and must re-insert itself to get another shot. In a more traditional operating system design, a process or thread stays in the ready queue until it's no longer needed and either destroys itself (by returning, for example) or else is killed by some other process or thread.

I didn't mention this earlier, but this is another conceptual change. In the delta queue case, it's _all_ about timing functions and not really about threads.

But now that we are considering the idea of a separate stack, we now have the idea of retaining the contents of 'auto' variables, which greatly improves the readability of code. It would also be a terrible burden to create a separate stack only to destroy it after an invocation and then to have to re-create it when it re-inserts itself again, etc, or else somehow keep it if the routine does re-insert itself but destroy it magically after invocation otherwise. The whole thing gets messy unless you choose to move to a new idea -- that a thread stays on the ready queue until self-destruction or getting killed by the running thread.

The simplest idea here is to run it "round robin" without the use of priorities. In this case, a thread just goes to the back of the ready queue when it voluntarily gives up its time. But now we are, in effect, deleting and re-inserting at the end. This is why your request for a delete made me think about this pattern. It really is logical to take things on with this new idea of keeping threads on the ready queue unless they wish to go to sleep and wait in the delta queue.

So let's think a second on this. If a thread sits on the ready queue, how do they wind up in the sleep queue? Well, they ask. They just say, sleep(10), or something like that in their code. That moves them off of the ready queue and inserts them into the sleep queue, which is still the venerable delta queue concept. How does a process move from the sleep queue to the ready queue? The old way we've already discussed -- it times out and gets moved (along with any following threads that also have their adjacent delta times set to zero.) So that part works as before, anyway.

What happens then when a thread is created? In the delta queue case, it's just a call to the insert function. But that inserted it into the sleep queue. With this new concept, the default probably should be to insert the new thread into the ready queue (at the end?) and let the thread get a little time first before it decides for itself that it wants to sleep (if it does) or otherwise wants to relinquish control without going to sleep.

Ah hah! We have a new need now. A process must be able to, in effect, give up control and reinsert itself into the ready queue and allow the operating system to choose which thread will next execute. So now we have a sleep call and a ... hmm, let's call it a switch call to reflect the fact that it allows the operating system to switch to another thread. So a sleep and switch call. Ultimately in concept, a thread can only lose control in a cooperative system by making an operating system call -- say to pkill(), psleep(), or pswitch(). Or some other thing we might later devise.

Okay. So what about main()? Is it a thread? Or not? Well, it inherits the default stack that the compiler provides. Basically, it's still the main program. Everything else is threads. So what does the main program do in this context? Well, it needs to initialize the thread system and start some threads if it wants to and then basically could do nothing much than call the execute function over and over. Just like before, really.

There is a problem with this, though. It means the main function must always be executing. Technically, this isn't strictly necessary anymore because if processes can stay on the ready queue now and if there is a pswitch() call that hands control back to the operating system then it can switch to the same or different thread again and there really is no longer a strict need for "pumping" through a repeated call to execute, now. It will all just happen, so long as there is at least one ready queue thread available. (Which can be arranged by adding a 'null' process or else setting up 'main' as a valid process in the ready queue and making sure they are always ready to run, if needed.)

Another interesting detail to consider is if you'd like to put the processor itself into some sleep mode if there are no ready processes and depend upon the delta queue timer interrupt to wake the processor up, potentially move a thread to the ready queue, and then run it or else go back to sleep if nothing woke up. This can be handled in the null or main thread. It stays in the ready queue, per requirements to have something there, but just hits 'halt' or goes into one of those sleep modes using a function call or other means. When the timer event ticks, the processor will wake up and vector over, decrement the quantum, check to see if anything moved, and if something _did_ move then insert that into the ready queue and start running it.

But what if moving a thread to the ready queue adds it at the end rather than the front? Then the old null or main routine would likely be re-invoked and perhaps that would restart the sleep mode or halt instruction. Which is not desired.

Lots of ways of solving this. Of course, the operating system could do all that itself and just handle the empty ready queue case that way. Another would be to add priorities and set up the main/null thread as the lowest possible priority so that it doesn't get a chance unless it is the only thing going on. I prefer the latter method, myself. But I've no objections to the former.

The nice thing about adding priorities rather than building into the operating system special detection and then hard coding what it does in that case is that you get some more interesting features in the ready queue -- a priority system

-- and you can still get the round robin behavior if you set everything to the same priority (except null.) Plus, you already need an int member within your process structure to handle the delta queue delta-time count-down value and you might as well use that same member as a priority when the node is in the ready queue. Otherwise it goes to waste. So there is a nice symmetry here, too. It's more like an insertion key -- in the sleep queue case it is handled as a delta time value but determines position; in the ready queue case it is a priority and determines position.

It's not quite as good as I make it sound here, though. You still need to add a separate, fixed priority value for the thread structure. This is because a process may move from the ready queue to the sleep queue and the priority gets destroyed then since it is now a time value. And when moving a thread from the sleep queue back to the ready queue, a similar problem arises. So you do need a separate member value for that, regardless. But the same field serves the same purpose in both queues -- a key value for insertion -- and this makes writing the queue insertion code able to accept a queue without having to know if it is a ready queue or a sleep queue. Even that isn't in practice as good as it seems. But it has a certain symmetry, still.

So all this talk gets me to the point that priorities are likely. Might as well stick them in. It doesn't cost much and there are benefits, should you need them. So let's do it.

Now I should return to that delta queue loop, broken into sectors in the earlier pattern. Since threads can get deleted and reinserted into the ready queue (if they pswitch()), deleted from the ready queue permanently (if they return or are pkill()'d), moved from the ready queue to the sleep queue (if they psleep()), and moved from the sleep queue to the ready queue if the interval timer vector moves them, it is really time to forget the simple 3-sector idea with simple, advancing circular pointers. It just doesn't work anymore for us. We will still use the delta queue idea for insertion into the sleep queue. But we really need a slightly more flexible queue concept than the "loop" thing. In short, a queue with a head and tail and probably both a next and previous pointer. But that's not a big deal. So let's bite that one off, as well.

Hopefully, by now, I've provided a broad overview of the design considerations leading up to talking specifics.

I'd like to stop at this point and give you a chance to comment or question anything I've said here before I start laying out details.

Jon

Vote

J

Jon Kirwan 14 years ago

Just waiting until you get a moment, eeboy. I'd like to make sure we are on the same page, still.

Jon

Vote

E

eeboy 14 years ago

Sorry for the delay, busy Saturday. I read it once late last night and wasn't completely clear. Reading it again now to pin point exactly why it wasn't clear so that I can ask specific questions.

Not to discontinue this conversation, because I still want to see the implementation (and create my own), but my original thought on implementing a remove in the delta queue was to simply clear the PID (set it to 0). Then I could adjust the execution in the run function to something like so...

if(sleeping.pf && sleeping.PID) sleeping.pf();

Or, I could simply clear the function pointer and test for it alone.

I thought that would be simpler to implement than actually iterating the queue and removing the process from the queue. If removing processes is a frequent occurrence then the queue size would likely be much larger than it would be without a remove function. However, this would be a tad bit faster in execution.

Anyway, I just wanted to throw that out there for comment... I'll be back shortly after I re-read (and maybe re-re-read) your last post.

Thanks!

--------------------------------------- Posted through

formatting link

Vote

J

Jon Kirwan 14 years ago

The idea of nulling out the function pointer doesn't complicate things much at all, actually, for a delete operation. You may still wish to scan the ready queue, as well. But up to you on that score. So you could certainly retain the existing design and easily handle deletes that way. I like it fine.

Jon

Vote

E

eeboy 14 years ago

A lot of this if thinking out loud... so you may want to read through all of my comments first before responding. I may answer some of my own questions.

This 'one shot' behavior was/is actually desirable in a few scenarios specific to my application. I want to do something once x milliseconds from now and I don't want to track a handle/process ID so that I have to kill it after it runs that one time.

Of course, the run until killed behavior is desirable as well. For example, a lot of my modules require a 'tick' every few hundred milliseconds by calling ModuleNameTick(). In normal operation, there is no reason to kill this process.

Confusion here... and probably a very silly question. Why the need for the separate stack and how does that work theoretically? Before we were simply calling the provided function... its autos all on the default stack? What do we get from creating a separate stack?

So to get that module tick functionality, instead of inserting into the delta queue I would just call Sleep(500)? How does the scheduler know what to put to sleep? No process ID is provided. Is it based on what the ready pointer is pointing at?

This confuses me too. Again, in my application this behavior would be desirable sometimes but other times I want that one shot behavior. Do task x in 500 ms.

I follow the 'main' discussion. It's not really different at all other than the fact that my sleep call becomes the required perpetual process. Everything in the ready queue is processed in a circular fashion, so, the processor goes to sleep after executing all processes. Easy enough.

Yes, I can visualize the above.

Ok, so I did some 'thinking out loud' with my comments. I am going to attempt to describe this back to you in a general sense so that you may point out any flaws in my understanding.

With the new ideas you present here I think I need to stop thinking of the delta queue (other than it being an efficient method for counting down). A process is now started in a similar manner and it returns some sort of identification so that it can be killed if needed. The process sleeps for a defined interval and then is transferred to the ready queue to execute. This process stays in the ready queue forever unless it is put back to sleep (via a call to Sleep(sometime)). The sleep function is a beautiful thing that would benefit me greatly in a scenario such as this...

Without OS example 1:

SendMessageToDevice(); While(!SomeFeedBack){ ; //Burning CPU }

Without OS example 2:

SendMessageToDevice(); //Create a function to wait for feedback SchedulerInsert(CheckFeedBack,1000);

CheckFeedBack(){

//Keep waiting if(!SomeFeedBack) SchedulerInsert(CheckFeedBack,1000); else DoFeedbackEvent();

}

With OS:

SendMessageToDevice(); While(!SomeFeedback){ Sleep(10); //Relinquish CPU so that it can do other things }

.. and that is exactly the reason we need a separate stack. This way the function can pause execution and all of its autos remain intact. I get more efficient use of the CPU and cleaner code (second example without OS requires 3 functions). This part has me excited.

The ready queue contains a single perpetual thread. In my case, it would be a sleep function. So, main now looks roughly (probably additional parameters for insert) like:

void main(void){

SchedulerInit()

SchedulerInsert(ProcessorSleep,0);

for(;;) ;

}

Finally, to get the one shot behavior, all I need to do is let the executed function return and it will be returned. If I want the ModuleNameTick() behavior my tick functions need to be rewritten to execute forever...

for(;;){

//DoModuleTasks DoStuff();

Sleep(MODULE_FREQUENCY);

}

Is the switch function used to stop execution momentarily without being placed back into the sleep queue?

Do I comprehend?

--------------------------------------- Posted through

formatting link

Vote

Delta Queue Help - Paging Mr. Kirwan

Join the Discussion

Didn't find your answer?