UDP timers

Hi,

I've been discussing this in another forum and we haven't (yet!) come up with any generic approaches that can be bent to the task. So, I figured it was worth exploring, here...

I've got some UDP-based protocols that try to be really lean (hence the avoidance of TCP!). They also must provide certain timeliness guarantees for the mechanisms that rely on them. And, reliable delivery (heh heh heh).

Deployed, the network configuration is either known a priori or network discovery *at* deployment suffices (thereafter, networks are *very* static!). Networks are intended to be private and contain only "well behaved" devices.

I currently use RTT estimates based on information that the device, AS AN INTEGRATED ENTITY, can glean from the *set* of protocols running thereon. (This lets me avoid adding extra traffic as "overhead" -- low information content per octet.) I.e., instead of explicit acknowledgements (which add overhead), the acknowledgements are *inferred* from other observable behaviors in the device as a whole.

While I can get a rather precise metric for the temporal costs of the *fabric*, I'm having a harder time trying to factor in the variance in the processing times on the (remote) node without artificially tightening its deadline constraints.

Specifically, I'm trying to (dynamically) maintain the functional equivalent of the RTO at the minimum level that *just* catches a lost datagram -- without either waiting wastefully long *or* "correcting/recovering" prematurely. The algorithms used in commodity stacks tend to be very naive in their expectations (of necessity!) so you don't get a finely tuned response. There, the cost of a dropped packet is inconsequential. I want to dynamically alter the deadline that invokes the "recovery handler" for these particular protocol instances -- not too soon (wasted effort, as the "dropped" datagram may be arriving *as* the handler starts running!) nor too late (threatens performance).
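For reference, the "naive" algorithm in commodity stacks is Jacobson/Karels: a pair of EWMA filters tracking the smoothed RTT and its mean deviation, with the RTO set a few deviations above the mean. A minimal Python sketch (the constants are the classic TCP values; a tightly characterized private fabric could presumably use faster gains and a smaller K):

```python
class RtoEstimator:
    """Jacobson/Karels-style RTO: smoothed RTT plus a multiple of the
    smoothed mean deviation. alpha/beta/k are the classic TCP values."""

    def __init__(self, alpha=0.125, beta=0.25, k=4.0, min_rto=0.001):
        self.srtt = None      # smoothed RTT estimate (seconds)
        self.rttvar = None    # smoothed mean deviation (seconds)
        self.alpha = alpha
        self.beta = beta
        self.k = k
        self.min_rto = min_rto

    def update(self, sample):
        """Feed one RTT sample (seconds); returns the updated RTO."""
        if self.srtt is None:
            # First sample: seed the filters per the usual convention.
            self.srtt = sample
            self.rttvar = sample / 2.0
        else:
            err = sample - self.srtt
            self.rttvar += self.beta * (abs(err) - self.rttvar)
            self.srtt += self.alpha * err
        return self.rto()

    def rto(self):
        return max(self.min_rto, self.srtt + self.k * self.rttvar)
```

Note how a steady stream of identical samples drives `rttvar` toward zero, so the RTO converges down onto the RTT itself -- exactly the "just catches a lost datagram" behavior, *if* the variance really is that low.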

I guess this boils down to filters to guesstimate runtimes of specific (remote!) tasks?

Thx,

--don

Reply to
Don Y

That's because there isn't a generic approach which I can see. Different response requirements and network bandwidth/costs (and types of costs) require different solutions. Read on for some of my thoughts.

So you don't want TCP, but you want 80% of what TCP provides ? :-)

You know what they say about those who fail to understand TCP... :-)

You have also left out some details: you have not said anything about expected transmission rates or required confirmation-of-receipt response times.

What is the nature of the network ? Directly connected Ethernet/wireless, some cost per packet communications network, or a mobile phone data network (or something else) ?

How many nodes in this network ?

Is this a one-way data flow (from remote device to some host) or a bi-directional data flow ?

...until a device fails and starts pouring junk onto the network or until some bright spark decides to plug something new and unexpected into your network...

Why do this ? Are you using a financial cost per packet communications network or some very low data rate network ?

I also don't fully understand how this avoids acknowledgements. A specific example is required here I think. :-)

However, for whatever specific subset of problem domains this approach may work for, it's not going to be scalable to the generic approach you are looking for.

It also seems fragile even for the problem domains for which it can be made to appear to work.

So you don't want TCP, but you behave like TCP in that you only send packets when you have something to transmit ?

However you do it, you still need some acknowledgement packet/signal back from the other end because that's the only way you are going to know for sure (at least in your desired generic solution) the packet was received.

If there's no cost involved in actually transmitting packets, then why not simply transmit packets all the time, at an acceptable rate, in both directions ?

The device's packet contents can tell the host if there's any data within the packet and the packet header from the host can contain information about the last packet received from the device. It also provides a keep-alive capability so that if the packets stop flowing you know either a node or the part of the network the node is connected through is down.

Of course the above only works if you are not paying a financial cost for keeping the network up (either connect time or per packet/bandwidth charges) and you are not using a really low bandwidth network.
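As a sketch of that scheme (field names and sizes purely illustrative, not a spec): each packet carries its own sequence number, echoes the highest sequence seen from the peer, and an empty payload acts as a pure keep-alive.

```python
import struct

# Hypothetical fixed header for the "always transmitting" scheme:
# each side stamps its own sequence number and echoes the last
# sequence it has seen from the peer. An empty payload is a pure
# keep-alive; a non-empty one piggybacks data on the same frame.
HEADER = struct.Struct("!IIH")   # seq, ack-of-peer, payload length

def make_packet(seq, last_seen_from_peer, payload=b""):
    return HEADER.pack(seq, last_seen_from_peer, len(payload)) + payload

def parse_packet(data):
    seq, ack, plen = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + plen]
    return seq, ack, payload
```

The acknowledgement thus costs nothing extra whenever there is data flowing anyway; the 10-byte header is only "overhead" on the idle keep-alives.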

I don't see anything in your current solution which gives you a keep-alive capability (assuming data is not expected from the device at regular intervals) but a specific example of your current solution in use may help me understand that.

No, because you seem to still be thinking like TCP (in that you want to respond to transmission of specific pieces of data from the device) even though that's what you seem to be saying you don't want.

Of course, what you may _really_ mean is that you want everything TCP gives you but without TCP's latency in the presence of packet loss.

Also, you talk about a generic solution but there's no way your approach can scale to a generic set of task workloads. You may have a specific subset of tasks for which your approach is suitable but that's not the generic solution you are seeking.

If you want a generic solution then go with some form of ACK packet. If latency is a concern when dealing with packet loss, and if your network is suitable, then just transmit on a regular basis, even if you have nothing to send or nothing new to acknowledge.

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP 
[Note: email address not currently working as the system is physically moving] 
Microsoft: Bringing you 1980s technology to a 21st century world

I was looking more for a "line of thinking" (i.e., how to approach the solution) than a generic solution.

Exactly -- hence the "snicker" :>

Actually, it is my understanding of TCP (its costs, limitations and intended use) that motivates my *avoiding* it!

All of those are essentially immaterial, as one can scale the compute resources UNavailable to make even generous times look tight! C.a.e. -- think "resource constrained" (regardless of how *much* resource you have at your disposal, it's typically not "far more than necessary").

In one application (network multimedia tank), (audio) clients receive datagrams at a roughly 500Hz average rate. Acknowledgements would have to occur at a similar rate -- though possibly skewed in time as allowed by the application buffer depth (which, of course, we want to keep very shallow to save recurring costs! :> )

Keeping the buffer shallow means you have very little slack time in the event that a packet is dropped or lost in the ether. E.g., waiting *seconds* for an acknowledgement would necessitate a very *deep* buffer to carry over while awaiting acknowledgement(s) for the oldest packets.

This gets dramatically worse when dealing with video, where the packet rates are far higher.

[I.e., the figure of merit is the difference between when you *rely* on an acknowledgement vs. when your buffer of events *preceding* that unacknowledged event will be exhausted.]
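To put rough numbers on that figure of merit (using the 500Hz audio case above):

```python
import math

def min_buffer_depth(packet_rate_hz, recovery_deadline_s):
    """Packets that must be buffered to ride out the interval between
    relying on an (inferred) acknowledgement and exhausting the queue
    of events preceding the unacknowledged one."""
    return math.ceil(packet_rate_hz * recovery_deadline_s)
```

At 500Hz, tolerating a 2 second wait for the acknowledgement means ~1000 buffered packets per client; a 10ms recovery deadline needs only 5. Hence the interest in keeping the RTO *just* above the true loss-detection point.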

Currently wired ethernet. Wireless would be a practical extension (though I may move to a mesh network for that to exploit geographical location)

Hundreds to ~1000. Beyond that, I will need to subdivide.

The individual channels tend to be simplex. But, there are lots of communication channels active concurrently -- for differing roles. Instead of treating *a* connection in isolation as a separate entity (like TCP would), I'm leveraging my knowledge of *everything* happening in the system to carry the data (in this case, the acknowledgements) "for free" without having to bear the higher cost of building that into the protocol FOR A SINGLE COMM LINK (which is what TCP does).

When you walk up to someone's door and "press the doorbell", there is typically no *direct* feedback (i.e., acknowledgement) that the bell has, in fact, rung. Depending on the size of the residence, the location of the annunciator and its "sounding" characteristics (if the residents are deaf, it probably doesn't make a NOISE that you could hear standing outside the house!), you have no idea if there was *any* notification delivered to the occupants!

The doorbell manufacturer could have designed his product to provide that notification to you (at some ridiculous cost!). Or, the annunciator could have been designed to be REALLY LOUD so you could hear it regardless of where it was situated. Etc.

Instead, you rely on a subconscious timer that you start in your mind. If it expires before you hear footsteps approaching the door, you ring the bell again (retransmit message). You repeat this some number of times (some PERIOD of time) before giving up (missing the deadline). "Maybe no one is home?"

OTOH, your ringing the doorbell may have woken the dog who was sunning himself by the back porch. He may start barking. Or, come to the door. I.e., *he* acts as your acknowledgement that the bell did, in fact, ring -- even though the homeowner never appeared! You rely on your knowledge of the *complete* "Residence.application" to acquire the data/acknowledgement that you need!
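The doorbell "protocol" reduced to code (purely illustrative: `ring` stands in for "retransmit" and `answered` for "observe *any* acknowledging side effect" -- footsteps, barking dog, whatever):

```python
import time

def ring_until_answered(ring, answered, interval_s, max_attempts):
    """Retransmit on a timer; give up after a bounded number of tries.
    `answered` may observe any side effect, not only an explicit ack."""
    for _attempt in range(max_attempts):
        ring()                                  # (re)send the message
        deadline = time.monotonic() + interval_s
        while time.monotonic() < deadline:
            if answered():
                return True                     # some evidence it got through
            time.sleep(interval_s / 20)         # poll, don't spin
        # timer expired with no evidence: ring again
    return False                                # "Maybe no one is home?"
```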

That "bright spark" could just as easily remove the cover from your computer/workstation and pour liquid mercury into the unused PCI connectors on your motherboard. Or, wrap some components with metal foil (short). What measures have you taken to ensure this does not happen? :>

There are lots of "private"/dedicated networks deployed in industry that inherently rely on their private nature to operate reliably. They are most often protected by *policy* rather than physical or protocol issues.

(relatively) high data rates with large numbers of relatively low capability devices.

Imagine using TCP for every communication channel. How many such sockets can you support on the server? And each client? Those costs are there *solely* for maintaining a connection-oriented protocol and its "guarantees" -- a protocol that was designed for GLOBAL communication, through dynamically varying routes, comprised of diverse active devices of variable reliability, temporal characteristics, etc.!

(You will note that the PCI bus in your workstation *could* have been designed with this level of "robustness" -- but wasn't!

*WHY?* -- cost!)

Imagine comm path #1 is delivering sensor readings to a controller. The controller *could* implement a reliable comm link for this. That assures the "sensor device" that its data *is* being delivered (and not being "lost in transmission").

If the sensor device can see the commands sent to the corresponding *actuator* (whose actions are governed by its reports), it can *infer* that its data is being delivered by noticing how the actuator is being commanded. [E.g., in a simplistic case, imagine the controller powers down the actuator if it fails to see all the data that it *expects* to see from the sensor. The sensor noticing the actuator being commanded to power down *knows* that some of its data has not been seen. I.e., instead of implementing two *reliable* communication channels, use your knowledge of what "the other" is supposed to carry to deduce how *your* transmissions have been received]
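A toy sketch of that sensor-side inference (the names and the POWER_DOWN convention are invented here purely for illustration):

```python
# The sensor never receives an explicit ACK, but it *can* overhear the
# commands the controller sends to the actuator. A power-down command
# is (by convention, in this toy) the controller's reaction to missing
# sensor data -- so observing it tells the sensor its reports were lost.
POWER_DOWN = "POWER_DOWN"

class SensorTx:
    def __init__(self):
        self.unconfirmed = []          # readings sent, delivery unknown

    def send_reading(self, reading):
        self.unconfirmed.append(reading)
        # ... transmit the datagram to the controller (not shown) ...

    def on_actuator_command(self, command):
        """Called whenever a command to the actuator is overheard.
        Returns the readings that need retransmission, if any."""
        if command == POWER_DOWN:
            # Controller acted as if data were missing: resend it all.
            to_resend = self.unconfirmed
            self.unconfirmed = []
            return to_resend
        # Any normal command implies the recent readings arrived.
        self.unconfirmed.clear()
        return []
```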

I'm not looking for a generic approach. I'm not trying to implement a globally distributed system, through dynamically varying routes, comprised of diverse active devices of variable reliability, temporal characteristics, etc.! :> If I was, I'd have to live within *those* constraints and use a protocol designed with that sort of "generic" solution in mind!

Yes. But the acknowledgement need not be the *explicit* acknowledgement that TCP requires -- "per connection".

"The dog is barking. The doorbell works and was just rung!"

There is always a cost for every action. Buffer space, CPU cycles, latency, etc.

Why don't doorbells provide an acknowledgement to the person standing on the doorstep? (because the cost far outweighs the alternatives -- ring the bell again! hopefully with some sense of "appropriateness" dictated by societal norms)

A keep-alive is redundant. If the data in an audio stream suddenly goes missing, something is broken! The client doesn't need the overhead of a periodic keep alive to tell it that the connection is still "up" -- the fact that it is receiving "content" suffices for that. The keep alive packets are just wasteful overhead.

If the client sees the link go quiet, it can attempt to contact the server -- bearing the cost of doing so AS AN EXCEPTION not as part of the general protocol.
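I.e., the client's "keep-alive" is just a watchdog refreshed by the content itself; only silence past the deadline triggers the (costly) exception path. Sketch:

```python
import time

class StreamWatchdog:
    """Content doubles as keep-alive: the arrival of any datagram
    refreshes the timer. Only silence beyond the deadline indicates
    the link (or server) is down and the exception path is warranted."""

    def __init__(self, deadline_s, now=time.monotonic):
        self.deadline_s = deadline_s
        self.now = now                 # injectable clock, eases testing
        self.last_rx = now()

    def on_datagram(self):
        self.last_rx = self.now()      # any content proves the link is up

    def link_quiet(self):
        return self.now() - self.last_rx > self.deadline_s
```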

Imagine the protocol had the server receiving an acknowledgement from the client (EACH client) for each packet sent. What is it going to *do* if the client doesn't ack a packet? Cry?? If packets are still getting through to the client, the client can continue to operate (in this example application) as its ack's are not germane to the protocol's proper operation. OTOH, if the server suddenly stops *sending* in order to attempt some sort of recovery, then the fault has been compounded to a fatal fault, needlessly. (The system could have kept operating with one-way comms to the client).

If the client attempts to contact the server (because it *stopped* seeing packets -- even though the server was still sending them) and there is no reply (i.e., failure in fabric between client and server), there is no way for the server to *know* that the client is trying to contact it to attempt recovery. And, nothing for the client to do besides "go silent".

[Presumably, *other* periodic mechanisms eventually alert the parties -- including the persons served -- of these sorts of failures. But, far too late to recover a "lost packet"]

TCP builds a virtual connection. A *single* connection. One conductor from node A to node B.

That connection is ignorant of any *other* connections from A to B (or, B to A, to make it more obvious! :> ).

The "application", however, *is* aware of all of these connections!

*It* can rely on its knowledge of them and their content to infer what TCP expects to be EXPLICITLY STATED (ack'ed).

I want neither the latency nor the costs associated with TCP.

I can connect a printer to a PC with a "printer cable". No TCP involved. The guarantees that this gives me (user) are different than the guarantees TCP affords. I.e., I have no idea if there are breaks in the data conductors such that what is being *sent* isn't what is actually being received (no checksums!). The "loop is closed" by the human user examining the printed document to verify that it contains what was intended.

Yet, I suspect "six nines" of the data sent over "printer cables" was actually delivered as intended. At a very low cost (in terms of product -- cable -- and user experience).

I think you misread my initial comment: "come up with any generic approaches that can be BENT TO THE TASK." (emphasis mine). I.e., I am looking for a very *specific* solution (to THIS task) and have been examining the generic solutions (e.g., RTT estimation algorithms) to see what I might carry away from their implementations... their *assumptions*... to apply to my *specific* needs.

I received a suggestion for a different way to implement my estimator that might give me better performance (in the estimate) esp given the static nature of the application. Now, I just need to code it and run some simulations...

Thanks!

--don


I suspect what he wants is a reliable datagram service rather than a reliable byte-stream service (which is what TCP provides). I also suspect the datagrams are pretty limited in size. But, with no real stated requirements, I'm just guessing.

He could achieve this by layering a datagram protocol on top of TCP. OTOH, he could layer an ack/timeout/retransmit protocol on top of UDP. If he's willing to limit his datagram size to 1KB, he could probably even get away with skipping the IP fragmentation/reassembly stuff.
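Something like this (a stop-and-wait sketch with a single sequence bit; the transport is injected so only the logic is shown -- a real version would want a send window plus the RTO estimation discussed upthread, and these names are illustrative, not a real library API):

```python
import struct

# Minimal ack/timeout/retransmit layer over an unreliable datagram
# transport: one outstanding frame, a sequence bit, retransmit on
# timeout, ACK from the receiver. Duplicates (from a lost ACK) are
# re-ACKed but not re-delivered.
HDR = struct.Struct("!BB")     # frame type (0=data, 1=ack), sequence bit

def sender_send(payload, seq, send, recv_ack, max_tries=5):
    """Transmit until the matching ACK arrives or retries run out.
    `recv_ack` blocks for one RTO and returns an ACK frame or None."""
    frame = HDR.pack(0, seq) + payload
    for _ in range(max_tries):
        send(frame)
        ack = recv_ack()
        if ack is not None:
            kind, bit = HDR.unpack_from(ack)
            if kind == 1 and bit == seq:
                return True            # delivered; caller flips seq next
    return False

def receiver_on_frame(frame, expected_seq, deliver, send):
    """ACK every data frame; deliver only new ones. Returns the next
    expected sequence bit."""
    kind, bit = HDR.unpack_from(frame)
    if kind == 0:
        send(HDR.pack(1, bit))         # always (re-)ACK, even duplicates
        if bit == expected_seq:
            deliver(frame[HDR.size:])
            return expected_seq ^ 1    # advance the sequence bit
    return expected_seq
```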

Without knowing detailed requirements it's impossible to say much more...

--
Grant Edwards               grant.b.edwards        Yow! An INK-LING?  Sure -- 
                                  at               TAKE one!!  Did you BUY any 
                              gmail.com            COMMUNIST UNIFORMS??
