High packet rate ethernet with LwIP

UDP and TCP are independent protocols, so there is no problem for a server to have UDP port 1234 and TCP port 1234 open and active at the same time.

For the client side of the connection, the network stack picks a different source port for each connection. So the client end can have as many connections (UDP, TCP, or both) as it wants to the same destination port at the server side.

Reply to
David Brown

...

That might be my problem as well. ;-)

Turned on LwIP statistics and did a first test with a single simulated serial port (data not actually coming in, just generated by the processor itself).

LwIP runs on an LPC4088 and the 'serial port' is opened by Docklight running on a W10 PC.

First, 512-byte packets at 1-second intervals, printing the stats every 5 seconds.

- TCP xmit: 44 recv: 45 fw: 0 drop: 0 chkerr: 0 lenerr: 0 memerr: 0 rterr: 0 proterr: 0 opterr: 0 err: 0 cachehit: 0

No memerr, and xmit and recv stay at a constant difference of 1.

Then a test with one port, 10 ms intervals:

- (Apparently) Immediate hangup; the stack keeps going into msDelay().

Then one port, 20 ms intervals: runs for a while but ultimately ends up in repeated msDelay(). Last statistics output (run time not always equal):

- TCP xmit: 1088 recv: 827 fw: 0 drop: 0 chkerr: 0 lenerr: 0 memerr: 270 rterr: 0 proterr: 0 opterr: 0 err: 0 cachehit: 0

In the previous outputs you can see the gap between xmit and recv increasing, as well as the memerr count.

Looks like the PC just does not respond fast enough, forcing LwIP to keep each packet until an ACK is received?

Last 6 lines from the Wireshark packet window when the above stops:

6600 78.330701 192.168.125.130 192.168.125.128 TCP 1078 10001 → 51826 [PSH, ACK] Seq=975873 Ack=1 Win=5840 Len=1024
6601 78.379841 192.168.125.128 192.168.125.130 TCP 54 51826 → 10001 [ACK] Seq=1 Ack=976897 Win=63216 Len=0
6602 78.379850 192.168.125.128 192.168.125.130 TCP 54 [TCP Dup ACK 6601#1] 51826 → 10001 [ACK] Seq=1 Ack=976897 Win=63216 Len=0
6607 78.413921 192.168.125.130 192.168.125.128 TCP 566 10001 → 51826 [PSH, ACK] Seq=976897 Ack=1 Win=5840 Len=512
6609 78.459611 192.168.125.128 192.168.125.130 TCP 54 51826 → 10001 [ACK] Seq=1 Ack=977409 Win=64240 Len=0
6610 78.459619 192.168.125.128 192.168.125.130 TCP 54 [TCP Dup ACK 6609#1] 51826 → 10001 [ACK] Seq=1 Ack=977409 Win=64240 Len=0

What seems weird to me is that all ACKs are followed by a Dup ACK. Is this normal? I see that a lot in other TCP traffic as well, but not on all ACKs.

--
Stef    (remove caps, dashes and .invalid from e-mail address to reply by mail) 

There is no opinion so absurd that some philosopher will not express it. 
Reply to
Stef

Make sure that you do not load the PC excessively by causing all kinds of other activity on it.

If you use Telnet (or another display program) as the TCP client, any screen updates will load the PC. At least minimize the Telnet window to reduce the load. The same applies to Wireshark: just capture to disk. If possible, use a separate PC for Telnet and Wireshark, but then you need a hub, not a switch, so that Wireshark can see the traffic between the embedded device and the Telnet PC.

Look at the Wireshark frame numbering (first column): why is there a jump between 6602 and 6607? Is this due to display filtering? Use capture filtering to reduce the Wireshark load.

Look at the timestamps (second column); the last digits are microseconds. There seems to be a much larger gap than 20 ms between two consecutive embedded sends, and the time between data and Ack is also much larger than 20 ms.

The Ack sequence number has fallen seriously behind the data messages. Apparently the sender would have to buffer about 1000 data frames (500 KB) before they are acknowledged, so clearly some messages are being lost.

This suggests that the Telnet screen update slows down the traffic, so minimize the Telnet window.

The difference between an Ack and its duplicate Ack is only 8-9 µs. Strange.

Reply to
upsidedown

I was half joking, but only half. Modern systems are very chatty on the network, with newer OSes being worse than old ones. Even Linux desktop systems produce a lot of traffic, though my experience is that Windows is much worse. (I have not looked at Macs, and server systems are usually a lot quieter.) Printers and other devices also chatter continuously. Some of this is necessary low-level traffic, like ARP and DHCP packets. Some is from the several dozen different "automatic configuration" and "name service" protocols that everyone has to implement, but almost no one uses. Some is from applications like DropBox or Steam trying to find neighbours on the network.

This all means there is a ridiculous amount of traffic on typical Ethernet networks, with 90% of it being basically useless, and a good deal of it being broadcast to all nodes.

If you are getting any memerr results, you have not given LWIP enough resources. I don't know off-hand what buffer types or other resources can lead to memerr counts, but you are low on something. These statistics show that a quarter of your communications are failing to get the buffers they need - not good at all.

PCs are ridiculously fast. Despite the network chatter, and despite the mess of silliness Windows 10 machines are always running with their absurd "start menu" full of adverts and other junk, they should be handling packets in a fraction of a millisecond. When I did the testing I mentioned before, the test programs on the PC side (Linux rather than Windows) were in Python, and not even particularly efficient Python - they even had manual "sleep" calls. Handling was well under a millisecond per transaction.

I /think/ the dup ACKs are the result of your board failing to handle some packets, probably related to limited buffers (giving the memerr counts).

Reply to
David Brown

Nonsense. A modern PC will handle this with barely a blip on its processor usage graph. You /might/ see some limitations if you are using a 2 GB "Intel compute stick" with a tiny Celeron and Windows 10. But assuming the Win 10 has halfway sane specs, it is going to be absolutely fine. Run Task Manager and watch the graphs of processor and memory usage to confirm that.

If you are doing large captures, then capture filtering makes a difference by reducing the quantity of data. For short captures, it will make no measurable difference.

No, it suggests things are failing at the LWIP side, especially when we see the LWIP statistics full of memerr counts. A memerr occurs when LWIP can't find a free buffer of some sort (there are several types used) to handle a packet coming in or going out - it has no choice but to drop that packet. This will lead to resends and delays.

It is, I think, due to lost packets. (Google "wireshark dup ack" for suggestions.)

Reply to
David Brown

On 2019-09-25 snipped-for-privacy@downunder.com wrote in comp.arch.embedded:

The PC is not heavily loaded: debugger, terminal window, Wireshark, a browser with internet radio, not much more.

Yes, display filtering is used.

Why the sends are more than 20 ms apart, I don't know. The sender loop tries to send a packet every 20 ms. Must be due to the lagging ACKs and limited buffer space at the sender?

OK

I don't think the terminal (just a plain serial terminal, no telnet) is slowing the PC down too much, see below.

Yes.

As discussed in other parts of this thread, UDP might be a better match for this application, so I tried that.

A test with 4 ports sending 512 byte packets at 10 ms intervals was successful (mostly). During this test 4 terminal screens were updating and wireshark was running with display filtering. So the PC seems to have no problem keeping up.

7888 23.208722 192.168.125.132 192.168.125.128 UDP 554 10001 → 10001 Len=512
7889 23.209536 192.168.125.132 192.168.125.128 UDP 554 10002 → 10002 Len=512
7890 23.210885 192.168.125.132 192.168.125.128 UDP 554 10003 → 10003 Len=512
7891 23.211184 192.168.125.132 192.168.125.128 UDP 554 10004 → 10004 Len=512
7892 23.219123 192.168.125.132 192.168.125.128 UDP 554 10001 → 10001 Len=512
7893 23.219593 192.168.125.132 192.168.125.128 UDP 554 10002 → 10002 Len=512
7894 23.220354 192.168.125.132 192.168.125.128 UDP 554 10003 → 10003 Len=512
7895 23.221256 192.168.125.132 192.168.125.128 UDP 554 10004 → 10004 Len=512

So it looks like UDP is indeed a better match for this application. I have to implement some stuff in the application that TCP took care of, like opening a port and handling packet loss. Packet loss was already an issue that needed to be handled by the final application anyway, because (as another poster already mentioned) the data is just serial data with no loss detection to begin with.

There was still an issue: at all tested speeds (10/20/40/100 ms) the embedded side goes into a hard fault handler after a while. Approx. 30 seconds for 10/20/40 ms and 2 minutes for 100 ms. This was probably due to my sending the 4 packets for each interval back to back without calling any of the ethernet handlers in between. If I call the handlers after each packet, the hangups disappear. So I may have to revisit the TCP version to check if that was a problem there as well. :-(

Reply to
Stef

On 2019-09-25 Stef wrote in comp.arch.embedded: ...

Forget that: the failing TCP test was running only one port at 20 ms, so it did not have the back-to-back problem. The failure was also different: a repeating wait loop, not a hard fault.

Reply to
Stef

No, not good. But a PBUF_POOL_SIZE of 64 and 32k mem should be okay for the few packets that could be in flight if the PC responds fast? I see no other obviously related tunable items in my lwipopts.h.

But even from the start of transmission, the packets come out too slowly. Sometimes 512-byte packets after a > 20 ms interval, mostly 1024-byte packets at > 40 ms (mostly 50 ms). So this would quickly cause the buffers to fill up as I keep feeding them at 20 ms intervals.
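One more guess from my side (the option names are from lwIP's opt.h; the values below are illustrative only, not tested on this board): besides PBUF_POOL_SIZE and MEM_SIZE, the TCP send path is capped by TCP_SND_BUF and TCP_SND_QUEUELEN, which limit how much unacknowledged data a single connection may hold. With the defaults in some lwIP versions that is only a couple of segments, which would throttle a sender long before the pools run out:

```
/* lwipopts.h -- illustrative values, not tested here */
#define TCP_MSS            1460
#define TCP_SND_BUF        (8 * TCP_MSS)                 /* max unacked send data per connection */
#define TCP_SND_QUEUELEN   (4 * TCP_SND_BUF / TCP_MSS)   /* max queued send pbufs */
#define TCP_WND            (4 * TCP_MSS)                 /* advertised receive window */
```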

I'll proceed with the UDP approach for now. Looks a lot cleaner.

Reply to
Stef

What are your other lwip config numbers? I have:

#define MEM_SIZE (48*1024)
#define MEMP_NUM_PBUF 32
#define MEMP_NUM_UDP_PCB 6
#define MEMP_NUM_TCP_PCB 128
#define MEMP_NUM_TCP_PCB_LISTEN 8
#define MEMP_NUM_TCP_SEG 16
#define MEMP_NUM_SYS_TIMEOUT 10
#define PBUF_POOL_SIZE 32
#define PBUF_POOL_BUFSIZE 1518

I can't say I have studied the usage in detail, so I might have much more than I need for some of these. But if any of your values are hugely lower than mine, check them to be sure. /Something/ is giving you memory allocation errors, and you need to find and fix that something.

It is also possible that the rest of program structure means you are getting slow handling of the packets - I can't guess anything about that, because I don't know your code at all.

Reply to
David Brown

On 2019-09-25 David Brown wrote in comp.arch.embedded: ...

In lwipopt.h:
#define MEM_SIZE (32 * 1024)
#define MEMP_NUM_SYS_TIMEOUT 300
#define PBUF_POOL_SIZE 64

Most were not in my lwipopt.h, so at default from opt.h:
#define MEMP_NUM_PBUF 16
#define MEMP_NUM_UDP_PCB 4
#define MEMP_NUM_TCP_PCB 5
#define MEMP_NUM_TCP_PCB_LISTEN 8
#define MEMP_NUM_TCP_SEG 16
#define PBUF_POOL_BUFSIZE LWIP_MEM_ALIGN_SIZE(TCP_MSS+40+PBUF_LINK_HLEN)

The last one expands to: ((1460+40+14)+3) & ~3 = 1516

Tried changing MEMP_NUM_TCP_PCB, MEMP_NUM_PBUF and MEMP_NUM_SYS_TIMEOUT to your values, but no change in behaviour.

It's just a loop calling the lwip handlers (the standard stand alone echo example) and a function that checks a timer for when to send a (dummy) packet. So really lightweight, just to test if I can get packets out at the required rate.

I'm giving up on the TCP version for now and going with UDP, which seems to work fine (with the same settings). I'll give TCP another try some time.

Reply to
Stef

Even stranger, when I just ping the board:

229 5.081046 192.168.125.128 192.168.125.133 ICMP 74 Echo (ping) request id=0x0001, seq=3492/41997, ttl=128 (no response found!)
230 5.081051 192.168.125.128 192.168.125.133 ICMP 74 Echo (ping) request id=0x0001, seq=3492/41997, ttl=128 (reply in 234)
234 5.081615 192.168.125.133 192.168.125.128 ICMP 74 Echo (ping) reply id=0x0001, seq=3492/41997, ttl=255 (request in 230)

A duplicate request goes out from W10 with only a 5 µs difference? The same happens on the rest of the pings, one second apart.

Nothing to do with the eval kit running LwIP; the same happens if I ping my router:

380 7.010693 192.168.125.128 192.168.125.254 ICMP 74 Echo (ping) request id=0x0001, seq=3539/54029, ttl=128 (no response found!)
381 7.010700 192.168.125.128 192.168.125.254 ICMP 74 Echo (ping) request id=0x0001, seq=3539/54029, ttl=128 (reply in 382)
382 7.011670 192.168.125.254 192.168.125.128 ICMP 74 Echo (ping) reply id=0x0001, seq=3539/54029, ttl=255 (request in 381)

And after looking a little closer, it seems to happen on almost all packets transmitted from this PC, but not on TLSv1.2 packets.

Something weird in W10 or Wireshark? Or the PC network interface?

Reply to
Stef
