bus allocation strategy

Hi everyone,

I'm not sure whether this is the best place to discuss this subject, and I'd be glad if someone could point me to wherever is more appropriate.

I'm architecting a system that has a master unit (MU) and several slave units (SUs) all happily sitting on a shared bus (I'd like to keep the level of description abstract and not go into the details of the bus implementation).

The system has to perform several activities with different 'priorities' [1] within a repeating time cycle T. Each SU needs a certain amount of data in and out per cycle in order to perform its function, so the MU must provide the necessary input and retrieve the available output at the right pace for the function to perform as intended.

Besides the SUs, I have a memory unit (ME) which is shared among all SUs and the MU for configuration and data. The ME contains configuration parameters for all SUs, which can be updated from cycle to cycle; moreover, it is the place where the results from all SUs are stored.

How do I allocate bus access to all the SUs so that they work properly (with a certain margin to handle failures and recovery actions)?

Is there a 'formal' way to define such a problem and find a solution?

I realize the problem as stated is potentially too incomplete to be solved and may lack information, but I'm open to questions so that I can add elements to my problem definition.

Any suggestion/hint/pointer is appreciated.

Al

[1] Priority may not be the best term here. The real intent is that each SU works at its own frequency, a multiple of the 1/T frequency, so that within one cycle we have N1 accesses for SU1, N2 accesses for SU2, etc.
Reply to
alb

Yes, your problem is not adequately specified. Even to establish that there is a problem to be solved, you need the time duration of each memory access and the maximum latency each unit can tolerate without impacting its performance. In essence, the memory unit is a shared resource and there are multiple devices vying for access to it.

If I understand the problem as you have put it, this is exactly the same as a single processor with multiple interrupts. In your case the processing units are vying for access to the memory. In the case of interrupts the interrupt routines are vying for access to the processor time.

I'm sure you can find a lot of material on interrupt priority assignments.

--

Rick
Reply to
rickman

Google Rate Monotonic Analysis. It may answer your questions.

--
www.wescottdesign.com
Reply to
Tim Wescott

If you can do something as simple as establishing a master cycle and making a Gantt chart for all the message passing within that cycle, then you're done. Your solution will probably be somewhat fragile, but you'll be done for the moment.

CAN has the concept of prioritized messages, and you can use that for the purposes of dividing the network traffic into short messages that need low latency, and long messages that can take a while. CAN does it in the context of breaking up the long messages into short ones (CAN has an 8-byte maximum message length).

Basically, your long slow messages either have to come at some scheduled "slow traffic time", or your physical (or maybe link) layer needs to specify a maximum message length, with higher layers having a way to send long messages as a collection of short ones.
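That second option is easy to sketch. Below is a minimal Python illustration (function names and the 8-byte limit borrowed from the CAN discussion above; a real protocol would be more elaborate): a higher layer fragments long messages into link-sized chunks, the receiver reassembles them.

```python
MAX_LEN = 8  # CAN-style per-message payload limit, in bytes

def fragment(payload, max_len=MAX_LEN):
    """Higher-layer side: split a long message into link-sized chunks.
    A real protocol would also carry sequence / last-fragment markers
    so the receiver can reassemble out-of-order traffic; omitted here."""
    return [payload[i:i + max_len] for i in range(0, len(payload), max_len)]

def reassemble(chunks):
    """Receiver side: concatenate the fragments back together."""
    return b"".join(chunks)
```

With MAX_LEN = 8, a 20-byte message becomes three frames of 8, 8 and 4 bytes, and a high-priority short message can be interleaved between any two of them.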

--
www.wescottdesign.com
Reply to
Tim Wescott

Hi Rick,

rickman wrote: []

[]

While the memory access itself may be negligible, any handshake protocol on the bus adds overhead and makes the transaction longer. Let's talk about 'transaction time' instead of memory access time, then.

Latency was indeed what I specified via 'priority'. If you prefer, it can be expressed as a data rate per time unit T. The data rate (in and out) expresses what the unit needs in order to function properly. If for any reason the data rate is not sufficient, there will be an impact on functionality.

Or a single master with multiple slaves to poll. Interrupts in this case, with a commonly shared resource, may cause priority inversion, which is usually undesired.

Yes, there's lots of material indeed, but I'd reserve interrupts for 'asynchronous' events which need to be served in a timely manner. In the use case presented I have not mentioned any need to react to asynchronous information; rather, I want to allocate resources with margins in order to allow recovery from failures, should they happen.

I had in mind a protocol similar to MIL-STD-1553, where the Bus Controller continuously grants bus access to each Remote Terminal within a framed structure (often divided into major and minor frames). Reading around in a not-so-well-organized way, I've found that this type of scheduler is called a cyclic executive, a form of cooperative multitasking.

Al

Reply to
alb

Hi Tim,

Tim Wescott wrote: []

That's it! And the good thing is that this scheduling algorithm can be tested to verify that the tasks will meet their deadlines (Liu & Layland, 1973).

I also found an interesting program that allows one to perform the analysis and confirm schedulability:

formatting link

Adding the right amount of margin to 'react' to failure cases would then be sufficient to fully verify that those transactions can fit on the bus.
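For completeness, the Liu & Layland sufficient test mentioned above is small enough to sketch in a few lines of Python (the task-set numbers below are made up, not taken from this thread):

```python
def rms_schedulable(tasks):
    """Sufficient (not necessary) Liu & Layland bound for rate-monotonic
    scheduling of periodic tasks: total utilization <= n * (2^(1/n) - 1)."""
    n = len(tasks)
    utilization = sum(cost / period for cost, period in tasks)
    bound = n * (2 ** (1.0 / n) - 1)
    return utilization <= bound

# Hypothetical SU transactions as (transaction time, period), same time unit.
light_load = [(1, 8), (2, 16), (4, 32)]   # U = 0.375, bound ~ 0.780
heavy_load = [(5, 8), (3, 8)]             # U = 1.0,   bound ~ 0.828
```

Failure-recovery margin can be modeled by simply inflating the per-transaction cost before running the test.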

Al

Reply to
alb

Hi Tim,

Tim Wescott wrote: []

[]

Why fragile? What would be the pitfall of this approach? Surely one evident point is that changing the priority of one unit may screw up the whole mechanism, since everything is linked to everything else.

On CAN, priority is granted at bit level with recessive and dominant bits, and the necessity arises because it is a multi-master bus. Priorities are needed to resolve clashes; if the scheduler is such that there are no clashes, then no priority needs to be implemented.

There are two aspects to consider IMO:

  1. long and slow messages
  2. short and frequent messages

In the first case we need to provide a bit more transaction time in order not to stretch the message over a long span of time. OTOH, short messages may suffer from protocol overhead and be penalized. There must be an optimum transaction size, given the sizes of each transfer and their respective periods. And I'm also convinced there's a study out there that provides exactly this answer... :-)
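In the meantime, the tradeoff can be explored with a toy model (all numbers hypothetical): each transaction pays a fixed protocol overhead, and the latency cap for other traffic is expressed as a maximum chunk size.

```python
import math

def transfer_cost(total_bytes, chunk, overhead):
    """Total bus time (in byte-times) to move total_bytes in transactions
    of `chunk` bytes, paying `overhead` byte-times of protocol cost each."""
    return math.ceil(total_bytes / chunk) * (overhead + chunk)

def best_chunk(total_bytes, overhead, max_chunk):
    """Cheapest chunk size subject to a maximum transaction length
    (the latency cap protecting short, frequent messages)."""
    return min(range(1, max_chunk + 1),
               key=lambda s: transfer_cost(total_bytes, s, overhead))
```

With 256 bytes to move, 4 byte-times of overhead per transaction and a 64-byte cap, the cheapest chunk is the cap itself (272 byte-times total); shrinking the chunk to 8 bytes raises the cost to 384 byte-times — the short-message penalty mentioned above.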

Al

Reply to
alb

Fragile in the sense that as you add messages the whole job may have to be done over again, rather than having any structured way to do it.

The optimum size differs with the environment. CAN is optimized such that the most important message never has to wait longer than 84 bit times or some such (I can't remember the exact number). It gives up considerably on overhead (and thus throughput) to do so.

Ethernet, IIRC, is the opposite -- it has a relatively small amount of overhead, because the overhead is amortized over a HUGE packet size (up to 1500 payload bytes). So, totally aside from the fact that there's no prioritization in "regular" Ethernet, you may have to wait many bit times for a message to finish before another one starts.

--
www.wescottdesign.com
Reply to
Tim Wescott

I believe you are splitting hairs that aren't even part of your problem. Priority inversion can only occur under certain circumstances, which I don't believe exist in your context. "Asynchronous" events imply something to be asynchronous *to*, which I don't believe exists in your context either.

I was drawing an analogy between the two types of problem, which is valid. You can simply ignore the parts of interrupt handling that do not apply.

Great.

--

Rick
Reply to
rickman

Hi Rick,

rickman wrote: []

Priority inversion can occur whenever a resource shared among differently prioritized tasks is held by a low-priority task which then gets preempted by a middle-priority one. At that point the high-priority task cannot run, because it needs the low-priority task to release the lock on the shared resource; hence the priority is /inverted/.
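The scenario is small enough to demonstrate with a toy fixed-priority preemptive scheduler (task names, step lengths and release times below are invented for illustration):

```python
def simulate():
    """One-step-at-a-time fixed-priority preemptive scheduler (toy model).
    'lock'/'unlock' manipulate a single shared lock, 'cs' is work done
    inside the critical section, 'run' is ordinary work."""
    tasks = {
        "LOW":  {"prio": 1, "release": 0, "work": ["lock", "cs", "cs", "unlock"]},
        "HIGH": {"prio": 3, "release": 1, "work": ["lock", "cs", "unlock"]},
        "MED":  {"prio": 2, "release": 2, "work": ["run", "run", "run", "run"]},
    }
    lock_owner = None
    trace = []
    for t in range(50):
        ready = [n for n, tk in tasks.items() if tk["work"] and tk["release"] <= t]
        # A task whose next step is 'lock' while another task holds it is blocked.
        runnable = [n for n in ready
                    if not (tasks[n]["work"][0] == "lock"
                            and lock_owner not in (None, n))]
        if not runnable:
            continue
        name = max(runnable, key=lambda n: tasks[n]["prio"])
        step = tasks[name]["work"].pop(0)
        if step == "lock":
            lock_owner = name
        elif step == "unlock":
            lock_owner = None
        trace.append(name)
    return trace

trace = simulate()
```

MED, which never touches the lock, finishes before HIGH gets to run at all -- that is the inversion. Priority inheritance would temporarily bump LOW to HIGH's priority while it holds the lock, so it releases sooner.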

Here's a nice read about priority inversion:

formatting link

Does this mean I'm splitting hairs? Maybe, but this is the architecture phase, and if things like this get forgotten they're going to hit back later on...and be remembered for a long time!

[]

I do appreciate the analogy; I'm just trying to be provocative and test my reasoning. So thanks for being my 'rubber duck' in this case ;-)

Thanks!

Reply to
alb

A friend of mine once told me of the so called "oh shit" factor about designs.

You either say it at the beginning or at the end.

Dimiter

:-)

Reply to
Dimiter_Popoff

Hi Dimiter,

Dimiter_Popoff wrote: []

Nice one! The main difference is that when you are conceiving a system you can still rather easily dump that shit in the garbage bin without too much pain, while when you find it at the end of your design you can be forced to leave it in.

Al

Reply to
alb

Exactly. In the context of accessing memory there is no "locking" of the resource... so no priority inversion.

There is nothing to forget.

--

Rick
Reply to
rickman

To be clear, from your footnote, nodes access the bus at multiples of the "cycle FREQUENCY", not *period*. I.e., SU#i accesses the bus Ni times in each interval T (assume N is an integer) and NOT "one in every Ni intervals".

Furthermore, you don't indicate if these accesses need to be "phase-locked" to that "time cycle". I.e., whether or not any skew is allowed in accessing the medium.

For example, if Ni=1 and Nj=3, they could each want to access the medium concurrently -- i.e., the second of SUj's accesses coinciding with the first/only of SUi's.

0-------j-------j-------j-------T
0---------------i---------------T

Whereas if you allow the accesses to be arbitrarily (though, perhaps, predictably!) skewed:

0-------j-------j-------j-------T
0-------------------i-----------T

Similarly, if any pair of SUs have the same "priority" (you haven't indicated whether all Ni will be unique):

0---a---a---a---a---a---a---a---T
0---b---b---b---b---b---b---b---T

But, this behavior is consistent and repetitive. I.e., SUi doesn't suddenly decide it needs to access the medium more often in *this* T cycle than in the previous, etc. Nor do more nodes (SU) come on-line. This sort of arbitration is handled "elsewhere" in the protocol, if needed.

And, you are sure (at design time) that the sum of all accesses (times) will never exceed T. (?)

But, do the accesses from each node need to be TRULY equally spaced in time? (think carefully about your answer -- and, put some numbers on it in your head to verify they CAN be equally spaced in time :> ) I.e., are accesses *truly* at harmonics of the fundamental? Or, can you tolerate instantaneous variations in the access frequency (for a particular node)? (Much of this will depend on whether or not your comms subsystem is blocking/nonblocking and synchronous/asynchronous -- this determines how tightly coupled it will be to your "application/implementation")

For a concrete example, assume each access takes a fixed time, t. Assume three nodes (SUs). Assume the Ni are 1, 3 and 4 and this saturates the medium. I.e., 1t + 3t + 4t = T

So, you could view each T as being cut into 8 t's.

|-----|-----|-----|-----|-----|-----|-----|-----|

Assign the Ni=4 pieces (i.e., for the last SU):

|--4--|-----|--4--|-----|--4--|-----|--4--|-----|

Now, assign the Ni=3 pieces... No matter how you do it, the time between consecutive accesses for that SU will *never* be constant (given the conditions I laid out, above):

|--4--|--3--|--4--|--3--|--4--|--3--|--4--|--1--|

|--4--|--3--|--4--|--3--|--4--|--1--|--4--|--3--|

|--4--|--3--|--4--|--1--|--4--|--3--|--4--|--3--|

|--4--|--1--|--4--|--3--|--4--|--3--|--4--|--3--|

(this is obvious once you put in real numbers)

Alternatively, do you merely want to give each SUi enough *bandwidth* to accommodate Ni transfers in each period, T? In other words, is there anything wrong with:

|--4--|--4--|--4--|--4--|--3--|--3--|--3--|--1--|

in which case, there are only 3 bus acquisitions per T cycle ("4" acquires, then releases as "3" acquires, then releases as "1" acquires, then releases as...) instead of the previous *8*.
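Both observations -- the non-constant spacing and the acquisition count -- are easy to check numerically. A small Python sketch, using the slot labels from the diagrams above:

```python
def acquisitions(schedule):
    """Bus acquisitions = number of contiguous runs of the same owner."""
    return 1 + sum(a != b for a, b in zip(schedule, schedule[1:]))

def gaps(schedule, owner):
    """Slot distances between consecutive accesses of `owner`,
    wrapping around into the next T period."""
    pos = [i for i, o in enumerate(schedule) if o == owner]
    return ([b - a for a, b in zip(pos, pos[1:])]
            + [pos[0] + len(schedule) - pos[-1]])

interleaved = ["4", "3", "4", "3", "4", "3", "4", "1"]
grouped     = ["4", "4", "4", "4", "3", "3", "3", "1"]
```

The interleaved layout costs 8 acquisitions per period versus 3 for the grouped one, and in the interleaved layout node "3" sees gaps of 2, 2 and 4 slots -- never constant, exactly as argued above -- while node "4" sees a constant gap of 2.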

[Again, this depends on how your system deals with comms. If a node blocks until it can send/receive, then any time a node spends waiting for access to the medium is lost processing time. OTOH, if comms are handled by an asynchronous (wrt the app) process, then the app can continue while waiting on comms]

Does each access (assuming operating at true harmonics) take the same amount of time? Or, do they vary between nodes and/or between transactions on an individual node? E.g., can a node's access "now" require a bigger piece of time than the *previous* access did (perhaps previous access transferred a smaller packet of information than current access)?

For example, can a node's *second* access in this T cycle require more time than its first did? Can a node's first access in this T cycle require more time than its first in the *previous* cycle?

If each access (transaction) takes a fixed amount of time (t) such that N1*t + N2*t + N3*t + ... <= T, then, regarding your question "Is there a 'formal' way to define such a problem and find a solution?":

As you appear to suggest your "priorities" are static and the times (frequencies) invariant, there is no real need for formal RUN-TIME scheduling approaches -- those would just be wasted CPU cycles. You can achieve an optimal strategy "at compile time" without having to resort to evaluating (changing) "priorities" at run time.

The TDM approach is an example of a "reserves" (reservations) scheduling algorithm -- you KNOW the resource will be available because it is intentionally set aside for your use! (e.g., timeslot) It can also be applied to scheduling tasks/jobs in a multitasking system that must *guarantee* performance -- a task can calmly wait for the processor (or any other resource) because it KNOWS that it will be given the resource for the amount of time that it needs -- regardless of how long it actually waits!!

No need to reply to any of this. Rather, intended as things for you to think about in how (and WHEN!) you "allocate" your "bus resource".

Reply to
Don Y

Hi Rick,

rickman wrote: []

Depending on what gets written to the memory, it is still possible to create locking. Assume you have a data structure in memory where a bit says that an area is being updated. If the area is shared, the high-priority task will need to wait for the low-priority task to finish before accessing that area, otherwise it will read the wrong value. There you go: priority inverted.

You can easily forget about these issues and implement your scheduler incorrectly. AFAIK there are techniques to avoid priority inversion (like priority inheritance), but first you need to be aware that your system is potentially affected by these problems.

Al

Reply to
alb

The OP appears to be talking about bus allocation strategies, not communications techniques using memory.

As I said, with a bus allocation there is no locking and so no possibility of priority inversion. But then the OP can give us more detail about what is happening in his situation.

--

Rick
Reply to
rickman

Hi Don,

Don Y wrote: []

thanks for the correction.

This is a good question, and I believe the answer is that some skew is allowed; that is why we asked the customer to specify frequencies with a tolerance value (e.g. 25 kHz +/- 1%).

There'll always be some slack between the two processes since they need to share the bus. The transaction could indeed be non-atomic, with the two processes running *in parallel*, but I'd like to avoid this situation.

I think this is more reasonable and equally acceptable, provided that the execution time of j does not delay i far beyond the tolerances.

Well, starting from the SU throughput needs (frequency and amount of data to be transferred), we can derive the minimum 'transaction' time and slice the whole period into slices which are then allocated to the SUs accordingly.

The bus specification will need to comply with the transaction time in order to meet system level requirements.

Events are stored with a timestamp, so the acquisition rate at ~25 kHz can effectively 'fluctuate' a bit without introducing an error, especially considering that most of the activity is for housekeeping purposes and therefore not time critical. Yet there are some tasks that are expected to execute at a certain frequency (a multiple of the 'S' frequency) with much more accuracy.

If I take my time slices, I'd allocate the time-critical ones as needed and let the other tasks execute *around* the time-critical ones so as to meet their repetition cycle.

Funny, I came up with the same approach, and I call t the transaction time, without needing to specify the protocol for the transaction (maybe a handshake, or a read/write access -- who cares for the time being).

[]

Unfortunately, tasks need to run at regular time intervals w.r.t. the 'S' period, and I cannot reason only in terms of 'bandwidth'.

At a certain stage we thought about separating control flow and data flow onto two separate buses, meaning the controller would instruct a DMA to take care of sending data to each SU, while the controller itself stays free to serve other activities.

The transaction time can be made constant, reserving the necessary time slots to serve 'asynchronous' events (like failure handling). The main idea is to be agnostic w.r.t. previous accesses and schedule bus activity on worst-case scenarios.

TDM seems to work very well indeed, and from an SU perspective transactions can be arranged so that they never span two time slices. This will simplify allocation and avoid complex logic on the sequencer side.

I'm not sure how far apart TDM and rate monotonic scheduling are, since they both look to me like cooperative multitasking algorithms.

That is correct and it was my feeling too.

[]

Too late! :-)

Al

Reply to
alb

Note that this doesn't (shouldn't) relate to "frequencies". Rather, the question revolves around how you deal with the *relationship* (phase) of these activities relative to each other and your "T" cycle.

My question is meant to address what your *specification* requires of the *implementation*. I.e., I can concoct all sorts of scenarios that will *break* unfortunate implementations. So, you have to determine how to resolve them in a generic manner.

If Tom boards the bus, he must be given a seat. If Bill boards the bus, he must be given a seat.

If you *reserve* a particular seat for Tom, then Tom can always be guaranteed a seat -- THAT seat!

If you reserve the *same* seat for Bill, then the same guarantee applies -- unless Tom is ALSO present on the bus!

So, either ensure Tom & Bill can never both be present on the bus at the same time *or* come up with a different solution (e.g., reserve *different* seats for each of them!)

Exactly. If this never changes (or, changes infrequently and with forewarning), then you can do the analysis at design time and just drive the implementation with your results.

Unsure of your intent, here. Guessing you mean the stuff pushed down the bus can tolerate some skew because you have a notion of "when" it actually occurred -- regardless of when it is "reported".

Note that you can allocate timeslots over MULTIPLE T-periods. E.g.,

|-a-|-b-|-a-|-c-|-a-|-b-|-a-|-q-|-a-|-b-|-a-|-c-|-a-|-b-|-a-|-r-|
T...............................T...............................T

a gets 4 slots (Na=4) ALWAYS. b gets 2 (Nb=2), always. c gets 1, always. This leaves a spare slot every T period that sometimes gets used by q and other times gets used by r. (i.e., allocate timeslots over multiple T periods instead of assuming every T period allocation is identical.)
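That bookkeeping can be sketched in a few lines of Python (the pattern and the names a/b/c/q/r are taken from the example above; the helper itself is hypothetical):

```python
# Fixed per-period pattern with one '?' spare slot, per the diagram above.
BASE = ["a", "b", "a", "c", "a", "b", "a", "?"]

def build_table(spares):
    """Expand the per-period pattern over a hyperperiod of len(spares)
    T periods, handing period k's spare slot to spares[k]."""
    table = []
    for spare_owner in spares:
        table.extend(spare_owner if slot == "?" else slot for slot in BASE)
    return table

table = build_table(["q", "r"])  # a 2-period hyperperiod: 16 slots
```

Every T period still grants a its 4 slots, b its 2 and c its 1, while q and r each appear once per hyperperiod.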

It's not just how the comms layer is implemented but, also, how the application expects data to be *delivered*. E.g., assume your comms layer is asynchronous to the process that uses it. So, a "program" can just say "send this datum" and know that it will get sent. However, if it expects a datum from someone -- or, an acknowledgement of that datum that *it* asked to have sent -- then it will spin its wheels waiting, depending on how the comms system works.

RMA is a means of scheduling periodic tasks that allows you to (relatively) easily determine worst case criteria for "guaranteed" schedulability.

TDM is an example of "reservations" scheduling -- reserving a resource for a particular use -- in this case, the bus.

But, it's far more intuitive than that. "I've got these things that HAVE to get done. There's enough resource available for them. How do I arbitrate who gets that resource and when?"

You can also do this dynamically at run-time -- but, why add complexity if your needs are static?

If, OTOH, your needs varied, then you could look into other algorithms to schedule the resource -- earliest deadline first, shortest job first, round robin, etc.

Reply to
Don Y

Hi Don,

Don Y wrote: []

Yep, I realized this, and indeed there are 'tasks' (or accesses) which need to be phase-locked to the 'S' signal, while others can be scheduled 'around' the first ones, as long as they keep a certain /mean/ frequency value.

[]

After a long discussion with the customer, we both realized that there's no need to serve the 'S' signal asynchronously, since it's deterministic; moreover, the tolerances on the task frequencies are such as to allow flexibility in the scheduling.

Exactly. The scheduling will be deterministic in order to *guarantee* both a seat. Since the scheduler is deterministic and sequential, there's no risk of having both Tom and Bill 'sent' to the bus at the same time []

Extra margin will be taken only to accommodate the extra 'changes' the customer will discover along the project. Margin considerations will limit the acceptability of a change request and give us quick feedback on whether the extra need will require an architecture change.

Exactly. The acquisition goes on regardless of the bus access allocation, and since the data are timestamped, they can be retrieved at any moment.

Good intuition -- this is actually what is going to happen. There are certain activities that are done once every 19 cycles, so they can be allocated a slot.

Obviously, I need to slice my time frame so as to take into account the extra slots allocated every 19 cycles. In your example, q and r each contribute an additional slot, even though each of them needs it less often than once every two periods.

That is correct. The main idea for the time being is that there are two types of accesses to the bus:

  1. master to slave (read or write)
  2. slave to slave (read or write)

The master is always the one that enables data transfer between units, so even in the second type of access the master is involved and is the one that initiates the transfer.

[]

The main difference I see is that TDM resource allocation is fixed and fits the need for repeatability well, while the order of execution under RMA is not known a priori.

Agreed. And indeed, as long as the needs are static, I'd assume the simplest way remains TDM.

Al

Reply to
alb

No, TDM doesn't *require* fixed assignments. You have two problems that (I see) you need to address. One is who gets the resource. The other is *when* the resource is "get-able".

Cutting the resource into time-slots reduces the effort for this second aspect -- the resource gets chopped up into discrete units. So, there are fixed points in time (relative to the T event) at which a consumer can acquire that resource. Contrast this with something like a CSMA scheme where anytime is a potentially "right" time -- harder to arbitrate AND PLAN for.

E.g., you could slice up the T interval into N units (possibly even different size units!) and then layer another protocol on top of it that allows you to dynamically assign particular timeslots to particular nodes.
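A minimal sketch of that layered protocol's bookkeeping (Python; the interface is invented for illustration, not a real protocol):

```python
class SlotTable:
    """Dynamic assignment of a fixed grid of N timeslots per T period.

    The grid itself never changes (so the "when" question stays trivial);
    only the owner of each slot does.
    """
    def __init__(self, n_slots):
        self.owner = [None] * n_slots

    def assign(self, node):
        """Give `node` the first free slot; fail if the period is saturated."""
        for i, current in enumerate(self.owner):
            if current is None:
                self.owner[i] = node
                return i
        raise RuntimeError("no free timeslot in this period")

    def release(self, slot):
        """Return a slot to the free pool."""
        self.owner[slot] = None
```

Validating fairness then reduces to validating this small piece of arbitration code, rather than re-validating the whole schedule whenever assignments change.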

My point was to take all of this variability out of the implementation (*if* it can be removed) and just "hard wire" accesses to the resource, so you can do a static analysis "at design time" instead of having code running *in* the product constantly modifying its timeslot assignments (then you have to validate that code to be sure it remains "fair" to the various consumers vying for the resource at run time).

Reply to
Don Y
