Efficient methods

Okay :-)

While reading about the various designs, I came across an interesting piece of information: TCP servers are mostly designed so that whenever the server accepts a connection, a new process is invoked to handle it. In the case of UDP servers, however, the design seems to use a single process to handle all client requests. Why such a difference in the design of TCP and UDP servers? How is a TCP server able to handle a large number of very rapid, near-simultaneous connections? Any ideas?

Thanks in advance, Karthik Balaguru

Reply to
karthikbalaguru

While I understand that some lazy programmers might use TCP/IP for some minor ad hoc applications, I still do not understand why anybody would use TCP/IP for any critical 24x7 application.

Reply to
Paul Keinanen

Paul Keinanen wibbled on Saturday 20 February 2010 14:12

Because it's a *reliable* transport protocol? Why waste effort in the application doing what an off the shelf stack can do for you?

Perhaps someone wants to shift more than 64k's worth of data and doesn't want to be bothered with checking for duplicate packets, sequencing or failure to deliver.

Most of the internet runs quite happily on TCP/IP with only a very few critical components over UDP (eg DNS, NTP - and even then either *may* use TCP).

But the choice goes deeper - is the problem better solved with a reliable connection oriented protocol or a datagram based one?

That is a rather sweeping statement.

--
Tim Watts

Managers, politicians and environmentalists: Nature's carbon buffer.
Reply to
Tim Watts

karthikbalaguru wibbled on Saturday 20 February 2010 13:10

Not generally true these days - used to be the method of choice, see below...

First, I recommend signing up to O'Reilly's Safari books online service - or buy some actual books. There are some excellent O'Reilly books specifically on TCP/IP.

In the meantime, speaking generally (without embedded systems specifically in mind):

TCP = reliable stream connection oriented protocol. No worrying about sequences, out of order packet delivery, missed packets - except in as much as your application needs to handle the TCP stack declaring it's given up (exception handling). Some overhead in setting up (3 way handshake) and closedown.

UDP = datagram protocol and your application needs to worry about all the rest above, if it cares. But very light - no setup/closedown.
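To make that single-process UDP model concrete, here is a minimal Python sketch (the function names are mine, purely illustrative): one socket, one loop, no per-client state. Every recvfrom() returns the sender's address, so the same process can answer any client.

```python
import socket

def make_udp_server(host="127.0.0.1", port=0):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))   # port=0 lets the OS pick a free port
    return sock

def serve_one_datagram(sock):
    data, client = sock.recvfrom(2048)   # blocks until some client sends
    sock.sendto(data.upper(), client)    # reply to whichever client it was
```

In a real server the second function would simply run in a `while True` loop; note there is no accept(), no handshake, and nothing to fork.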

Regarding TCP service architecture, there are 3 main classes:

1) Forking server
2) Threaded server
3) Multiplexing server

1 - simplest to program, heaviest on system resources. But you can potentially (on a real *nix system) simply write a program that talks to STDIN/STDOUT and shove it behind (x)inetd and have a network server without a single line of network code in your program. Perfectly good method for light load servers where latency is not an issue.
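A sketch of that inetd trick in Python (names are illustrative): the program contains no network code at all. (x)inetd accepts the TCP connection and wires the socket to the program's STDIN/STDOUT, so a plain line filter becomes a TCP server.

```python
import sys

def run_filter(infile=sys.stdin, outfile=sys.stdout):
    for line in infile:
        outfile.write(line.upper())   # trivial "service": shout each line back
        outfile.flush()               # flush so the client sees each reply at once
```

Pointed at, say, TCP port 9000 by an xinetd service entry, each incoming connection gets its own fresh process running this filter.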

2 - Popular - little harder to program, much more efficient, assuming your OS can handle thread creation more lightly than process creation.
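A thread-per-connection sketch of class 2 (again Python, names illustrative): the accept loop stays tiny, and each accepted socket is handed off to a worker thread.

```python
import socket
import threading

def handle(conn):
    with conn:
        data = conn.recv(1024)
        if data:
            conn.sendall(data)        # trivial echo service, then close

def threaded_server(srv, max_conns):
    for _ in range(max_conns):        # a real server would loop forever
        conn, _addr = srv.accept()
        threading.Thread(target=handle, args=(conn,), daemon=True).start()
```

The accept loop returns to accept() almost immediately, which is why this copes better with rapid connection bursts than a forking server, provided thread creation is cheap on the OS in question.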

3 - Very efficient. One process maintains state for all connections, often using an event methodology to call service subroutines when something interesting happens (eg new connection, data arrived, output capable of accepting data, connection closed). Sounds horrible, but with an OO approach, very easy to get your head around. Bearing in mind that anything OO can be bastardised into a handle and an array of structs holding the equivalent data an OO object would hold, this could be a very suitable method for embedded systems where C may be the language of choice and there may be no OS, or only a very simple one that doesn't map well.
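A minimal event-loop sketch of class 3 (Python's selectors module here stands in for whatever event mechanism the target system offers; names are mine): one process, one loop, and a plain dict playing the role of the "array of structs" of per-connection state.

```python
import selectors
import socket

def run_multiplexed(srv, events_to_handle):
    sel = selectors.DefaultSelector()
    sel.register(srv, selectors.EVENT_READ, data=None)   # data=None marks the listener
    handled = 0
    while handled < events_to_handle:     # a real server loops forever
        for key, _mask in sel.select(timeout=5):
            if key.data is None:                  # event: new connection
                conn, _ = key.fileobj.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ, data={})
            else:
                conn = key.fileobj
                chunk = conn.recv(1024)
                if chunk:                         # event: data arrived - echo it
                    conn.sendall(chunk)
                else:                             # event: peer closed
                    sel.unregister(conn)
                    conn.close()
            handled += 1
    sel.close()
```

Each pass through the loop dispatches on "which socket is ready, and what kind of event is it" - exactly the service-subroutine-per-event shape described above.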

Now, doing 3 wouldn't be so far different to doing it all in UDP *except* you now have to care about packet delivery unreliability - as you can get a variety of stacks for many embedded systems, why not let someone else's hard work help you out?

--
Tim Watts

Managers, politicians and environmentalists: Nature's carbon buffer.
Reply to
Tim Watts

TCP is a "reliable" connection, whereas UDP is "unreliable". If you understand the difference between these two types of connections, it should be clear why this is so, and you would know which connection type best suits your application.

I would go down to the book store, and look for a load of books with "TCP/IP" in the title, then sit and read a couple of chapters to decide which one to buy.

The datagrams carry identification numbers that enable them to be related to the controlling processes, enabling them to be easily managed.

You could implement a large number of "connections" over UDP too, if you wished to do so.

Mark.

--
Mark Hobley
Linux User: #370818  http://markhobley.yi.org/
Reply to
Mark Hobley

...

Ugh. Please don't crosspost that widely, and especially don't include additional groups in mid-thread without saying so in your posting.

It makes me reluctant to answer your question, because I don't know if the comp.arch.embedded and comp.os.linux.networking crowds want to hear me, or if they'd rather not see the rest of the thread. I don't know what's on topic there, because I don't read those groups. See the problem?

/Jorgen

--
  // Jorgen Grahn    O  o   .
Reply to
Jorgen Grahn

Agreed, but the query is about the design of the TCP server versus the UDP server. Whenever a new connection arrives, a TCP server accepts it and invokes a new process to handle the connection request. The main point here is that a new process is created to handle every new connection that arrives at the server. In the case of a UDP server, it seems that most designs use only one process to handle the various clients. Will the TCP server get overloaded if it creates a new process for every new connection? How is this managed?

The point here is: consider a scenario in which multiple connection requests arrive while the TCP server is still busy creating a new process for an earlier connection request. How does the TCP server handle those connection requests in that scenario?

Thanks in advance, Karthik Balaguru

Reply to
karthikbalaguru

...

Heaviest relatively speaking yes, but less heavy than many people think. I was going to say "measure before you decide", but you'd need to write the server first to get realistic data :-/

...

[Here I'm sticking to Unix, which was your original three-bullet list.]

There are few technical reasons not to use C++ in embedded systems (no performance impact compared to C if you do it right) but maybe cultural reasons in some places.

I spent Friday comparing my own C++ implementation of the listen/accept part of (3) and cleaning up someone else's C implementation of the same thing. Mine was much simpler, but only maybe 30% of that simplicity came from OO things which could be simulated in C. Then 30% were due to other features of C++ such as RAII, standard containers and algorithms. The remaining 40% was sane naming and lack of misleading documentation.

That's *one* feature of TCP, but there are others which, if you forget to implement them correctly, usually spells disaster. Just to mention two: flow control and congestion avoidance.

/Jorgen

--
  // Jorgen Grahn    O  o   .
Reply to
Jorgen Grahn

Tim Watts did an excellent job two posts up-thread describing three different architectures for TCP servers. To summarize the part that relates directly to your question: if you've got a really heavy load, the server can indeed get overloaded. In that case, you need to work harder and do something like a threaded or multiplexing server.

That's what the backlog parameter on the listen() call is for. If the number of pending requests is less than or equal to that number, they get queued. When the number of pending requests exceeds it, requests start getting refused.
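The queueing behaviour is easy to see in a few lines of Python (illustrative only; exact queue limits are OS-dependent): connections completed while the server has not yet called accept() are simply held by the kernel.

```python
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(5)                      # ask the kernel to queue up to ~5 pending connections
port = srv.getsockname()[1]

# Three clients connect before the server accepts anything...
clients = [socket.create_connection(("127.0.0.1", port)) for _ in range(3)]

# ...and all three are still waiting when the server finally calls accept().
served = [srv.accept()[0] for _ in range(3)]
```

Only once the pending queue exceeds the backlog do further connection attempts start being refused (or, on some stacks, silently dropped so the client retries).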

--
As we enjoy great advantages from the inventions of others, we should
be glad of an opportunity to serve others by any invention of ours;
Reply to
Joe Pfeiffer

This is true only as long as there is an existing TCP/IP connection.

Once this is lost, you have to use something else to establish a new TCP/IP connection.

Once you have to create a new TCP/IP connection to replace a broken TCP/IP connection, you need a similar amount of logic to that of a UDP or raw Ethernet packet system.

Only if there is 100 % certainty that a TCP/IP connection I create today will remain there long after I am retired and long after I am dead. A 99.9 % certainty is _far_ too low.

The main problem with TCP/IP is that you can not take immediate action as soon as the link is lost.

As soon as the link is lost, you need to implement a similar amount of recovery logic to what the TCP/IP stack itself would require.

Reply to
Paul Keinanen

As long as you have a simple transaction system - one incoming request, one outgoing response - why on earth would any sensible person create a TCP/IP connection for this simple transaction?

Reply to
Paul Keinanen

Consider a scenario in which multiple high-speed TCP connection requests arrive within a very short time frame. In that scenario, the TCP server would get overloaded if a separate thread were created for every new connection that arrives at the server.

Karthik Balaguru

Reply to
karthikbalaguru

In many cases, the outgoing response will not fit in one packet. TCP takes care of handling out-of-order responses being received by the client.

Regards, Dave Hodgins

--
Change nomail.afraid.org to ody.ca to reply by email.
(nomail.afraid.org has been set up specifically for
Reply to
David W. Hodgins

Interesting to know a method for having a light-load TCP server by using the existing utilities in Linux/Unix, in the form of a forking server!

A threaded server seems good, but it might overload the TCP server very quickly in the case of many connection requests arriving within a very short time frame. As you said, if thread creation has low overhead in the particular OS on which the TCP server is running, then it would work well.

I came across preforking tricks too, where a server launches a number of child processes when it starts. These in turn serve the new connection requests, with some kind of locking mechanism around the call to accept so that at any point in time only one child can use it while the others are blocked until the lock is released. There seem to be ways around that locking problem. I think the idea of creating one child for every new connection/client is better than the preforking trick, but both approaches can overload the TCP server in the case of fast successive/near-simultaneous connection requests within a short time frame.

Having one process maintain the state of all connections, with an event methodology that calls service subroutines whenever a specific event occurs, sounds interesting. It appears to be the ideal method for embedded systems where no OS is present and C is the main language. Anyhow, I need to analyze the drawbacks, if any.

Karthik Balaguru

Reply to
karthikbalaguru

...

To continue in the same confrontational tone:

Because UDP doesn't work in practice except in some very limited scenarios. This is also why almost no widely used application protocols use UDP.

If you're arguing that UDP is generally a better transport protocol than TCP, you're in a small minority.

I also note that it was you who brought up these simple transaction systems. It's not implied in the text you quoted (although it does make pointless comparisons between the internal workings of UDP and TCP servers -- pointless because that's not what makes you decide to use TCP or UDP.)

/Jorgen

--
  // Jorgen Grahn    O  o   .
Reply to
Jorgen Grahn

True! I need to decide on the best design methodology: either a threaded or a multiplexing server.

Great! I checked the backlog parameter, and it does seem to provide support in terms of queueing :-) !! The link below seems to have some good info about the backlog parameter -

formatting link

It seems that it plays a role in determining the maximum rate at which a server can accept new TCP connections on a socket. Interesting to know that the rate at which new connections can be accepted is roughly equal to the listen backlog divided by the round-trip time of the path between client and server.
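Plugging illustrative numbers into that rule of thumb (these figures are mine, not from the article), the arithmetic is just:

```python
# Back-of-envelope use of the rule quoted above:
#   max accept rate ~= listen backlog / round-trip time
backlog = 128          # listen() backlog
rtt_seconds = 0.100    # 100 ms client-to-server round trip
max_rate = backlog / rtt_seconds   # ~1280 new connections per second
```

So for a given RTT, raising the backlog raises the ceiling on the sustainable connection-acceptance rate.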

Thanks, Karthik Balaguru

Reply to
karthikbalaguru

karthikbalaguru wibbled on Saturday 20 February 2010 19:49

Yes, it's actually very simple. Get yourself a Linux machine and write a trivial program in perl, python, C, whatever, that accepts lines on STDIN and writes something trivial (eg echoing the contents of STDIN) back to STDOUT. Run it and see if it does what you expect.

Now configure xinetd (the modern inetd, usually the default on any modern linux) to bind your program to say, TCP port 9000.

On the same machine, telnet localhost 9000 and you should have the same experience as running the program directly. telnet to it 3 times simultaneously from 3 different terminal windows. telnet to it from a different machine on your network.

Yes - if you don't mind thread programming - it does have its own peculiar issues.

Apache does that in one of its modes (and it has several modes). That is an example of a high performance bit of server software. Much more complicated to manage of course and less likely to be suitable for a tiny embedded system - but could be suitable for a decent 32 bit system with some sort of OS.

I actually used this when I coded a bunch of servers in perl [1] to interface to dozens of identical embedded devices. It was mentally much easier than worrying about locking issues, as all the separate connections had to be coordinated onto one data set in RAM, ie they weren't functionally independent.

[1] Yes, perl. This was a rapid prototyping exercise to prove a point. My aim was to recode in C if necessary. It wasn't necessary, as the overhead of using perl was nearly two orders of magnitude less significant than the load of talking to an RDBMS - so it stayed in perl, working quite happily in production.
--
Tim Watts

Managers, politicians and environmentalists: Nature's carbon buffer.
Reply to
Tim Watts

karthikbalaguru wibbled on Saturday 20 February 2010 21:26

I might have missed it - but what system is your code going to run on? Linux, something else fairly "fat" or a teeny embedded system with no resources?

It makes a difference, because there's no point in looking at (say) forking servers if you have no processes!

There is also the element of code simplicity and maintainability. If this were running on a high-end system, you might be better off using a well-known and debugged framework to manage your connections, so you write as little code as possible and what you do write deals mostly with the actual logic of your program rather than a whole overhead of connection management. No disrespect intended regarding your programming abilities ;-> but less code is always better :)

I was in several minds about how to approach my problem, until I found perl's less well known IO::Multiplex library - after that it was plain sailing with a multiplexed server. If that hadn't existed, I might well have used a multiprocess model with a lump of shared memory and some semaphores (I had the advantage that there was only one persistent TCP connection incoming from each of a finite number of embedded systems, so connection setup overhead was lost in the wash).

--
Tim Watts

Managers, politicians and environmentalists: Nature's carbon buffer.
Reply to
Tim Watts

Jorgen Grahn wibbled on Saturday 20 February 2010 16:31

That is interesting. I've never considered C++ for AVRs (it's that cultural thing you speak of ;-> ), but seeing as GCC targets them, it would be interesting to try.

Good point. I'm a bit confused as to what system the OP is targeting. On a weedy embedded system, flow control is possibly less of an issue unless the network physical layer is very low bandwidth such as a modem speed radio link (quite possible I admit).

--
Tim Watts

Managers, politicians and environmentalists: Nature's carbon buffer.
Reply to
Tim Watts

To avoid having to deal with dropped/corrupted packets yourself.

--
As we enjoy great advantages from the inventions of others, we should
be glad of an opportunity to serve others by any invention of ours;
Reply to
Joe Pfeiffer
