Windows tcp Rx hanging

- Son of a Sea Cook

Sat, Sep 12, 2009 2:43 AM

What are those? "Oooops packets"? Severe FEC. :-)

- Proteus IIV

Sat, Sep 12, 2009 3:30 PM

YOUR MISTAKE IS THINKING ANYONE IN USENET ACTUALLY BELIEVES OR NEEDS YOUR ADVICE YOU FLIPPING COX.NET TROLL

BEGONE !

I AM PROTEUS

- Proteus IIV

Sat, Sep 12, 2009 3:30 PM

AND TAKE ALL YOUR TROLLOPING COX.NET BUDDIES WITH YOU

I AM PROTEUS

- Nobody

Sun, Sep 13, 2009 7:22 AM

There's a big difference between "half a minute" and "forever".

If the receiving client application doesn't consume received data (e.g. because it is suspended, or in a blocking wait on user input), the receive buffer will fill up, resulting in the kernel's TCP implementation reporting a window size of zero. It will continue to report a window size of zero so long as the receive buffer is full (i.e. until the application eventually retrieves data from the buffer).

Unless the sending application implements its own timeout, the sender will continue to probe the receive window indefinitely. The kernel won't time out the connection so long as the receiver continues to send ACKs, even if the ACKs report a zero window size.
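The stall described above can be reproduced on a loopback connection. The following is a minimal sketch, assuming a local TCP stack with finite socket buffers; the function name `fill_until_blocked` and the chunk size are illustrative choices, not anything from the thread. The receiver never calls `recv()` while the sender writes, so the buffers fill and the kernel eventually refuses more data, yet the connection itself stays open.

```python
import socket

def fill_until_blocked(chunk_size=65536):
    """Send to a peer that never reads until the kernel refuses more data.

    Returns (bytes_accepted, first_byte_read_afterwards).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # any free local port
    listener.listen(1)

    sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sender.connect(listener.getsockname())
    receiver, _ = listener.accept()

    # Non-blocking, so a full buffer raises instead of hanging this sketch.
    sender.setblocking(False)
    total = 0
    try:
        while True:
            total += sender.send(b"x" * chunk_size)
    except BlockingIOError:
        # Send buffer and the peer's receive buffer are both full.
        pass

    # The connection is still alive: the unread data is waiting.
    first = receiver.recv(1)

    for s in (sender, receiver, listener):
        s.close()
    return total, first

if __name__ == "__main__":
    accepted, first = fill_until_blocked()
    print("kernel accepted", accepted, "bytes before blocking; first byte:", first)
```

With a blocking socket and no application-level timeout, the same `send()` call would simply wait until the receiver reads again.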

The RFCs say nothing about the case where you get two conflicting versions of a particular byte.

In many cases, the behaviour which you suggest (allow the newer version to override the older version) is impossible, as the older version will already have been passed up to the application.

In order for that behaviour to even be possible, the older data must still exist in the kernel's receive window, either because the application simply hasn't retrieved it, or because preceding data is missing.

If the receiver reports a window size of zero, the sender should continue to send window probes containing a single byte of data. Such probes should have an exponential back-off capped at 60 seconds, and should be sent until the socket is closed, either by the application or by the protocol (i.e. either a RST is received, or the receiver stops ACKing the probes for an extended period; 9 minutes is typical).
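The capped exponential back-off for window probes can be sketched as follows. This is only an illustration: the 60 second cap comes from the post, while the 1 second initial interval and the function name `probe_intervals` are assumptions (real stacks usually derive the first interval from the retransmission timeout).

```python
def probe_intervals(initial=1.0, cap=60.0, count=10):
    """Return the delays (in seconds) before each successive window probe.

    The delay doubles after every probe and is clamped at `cap`.
    `initial` is an assumed starting value, not a figure from any RFC.
    """
    intervals = []
    delay = initial
    for _ in range(count):
        intervals.append(min(delay, cap))
        delay *= 2
    return intervals

if __name__ == "__main__":
    print(probe_intervals(count=8))
```

With the defaults above, the first eight delays are 1, 2, 4, 8, 16, 32, 60 and 60 seconds; after that the probes continue once a minute for as long as they keep being ACKed.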

The correct meaning is "I'm busy, and cannot consume the data right now; please hold".

There are any number of reasons why reception may be suspended temporarily, e.g. the user suspended the application, or the application is blocked waiting for a new tape to be inserted, etc. Note: "temporarily" could easily mean "until Monday morning".

An application doesn't have to do anything special to wait indefinitely for its data to be consumed; rather, it has to explicitly set a time-out if it doesn't want to wait forever.

That may be reasonable (if somewhat aggressive) behaviour for an application, but the TCP stack shouldn't time-out the connection so long as the window probes continue to be ACKed.

- Dimiter Popoff

Sun, Sep 13, 2009 10:40 AM

Well yes, of course, but on a 100 Mbps link it feels like "forever". After having transferred hundreds of megabytes at the speed it will sustain (approx. 8 megabytes/s; the buses of the smaller system (mine) are the limiting factor), things stop and it is obvious there will be no recovery - the hanging application is not struggling, it reacts immediately to my "ABOR" over the control connection.

Page 69 of rfc793 says something about that.

Clearly so; but since the receiving side has no control over the segment size except over setting its maximum, it is wise to take the latest data as the sender may choose to resize the retransmitted segments (which if mixed with old, differently sized/overlapping, will be quite a mess to dig through). What I do is to take the new segments, discard the old ones, and ignore the beginning of the first segment of the "new" series which overlaps an old which has been consumed by the application. I already forgot how I decide that the peer has begun to retransmit it all (and not just retransmitting one segment, this can be successfully caught both ways without discarding the rest), but it works fine.

I know, and I know where that comes from. The clear error of the TCP implementation on the Windows side is the fact that it chooses to take part of a segment; the sending side (mine) sees too late that the TCP window is too small and sends a segment which the receiver cannot accept (no matter how low my system latencies are, at 12-13 µs/segment this can still happen). Instead of discarding the segment, the receiver acks *part* of it; this is illegal (rfc793, page 69: "If the RCV.WND is zero, no segments will be acceptable, but special allowance should be made to accept valid ACKs, URGs and RSTs". The table on that page is also quite explicit about that).
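For reference, the acceptability table on page 69 of RFC 793 can be written out as a small function. This is a sketch of the four cases in that table using 32-bit sequence arithmetic; the helper names `in_window` and `segment_acceptable` are made up for illustration.

```python
MOD = 1 << 32  # TCP sequence numbers wrap at 2**32

def in_window(seq, rcv_nxt, rcv_wnd):
    """True if RCV.NXT =< seq < RCV.NXT + RCV.WND, modulo 2**32."""
    return ((seq - rcv_nxt) % MOD) < rcv_wnd

def segment_acceptable(seg_seq, seg_len, rcv_nxt, rcv_wnd):
    """RFC 793 acceptability test for an incoming segment."""
    if seg_len == 0:
        if rcv_wnd == 0:
            return seg_seq == rcv_nxt
        return in_window(seg_seq, rcv_nxt, rcv_wnd)
    if rcv_wnd == 0:
        # Data segments are never acceptable against a zero window.
        return False
    last = (seg_seq + seg_len - 1) % MOD
    # Acceptable if either the first or the last byte falls in the window.
    return in_window(seg_seq, rcv_nxt, rcv_wnd) or in_window(last, rcv_nxt, rcv_wnd)

if __name__ == "__main__":
    print(segment_acceptable(1000, 1460, 1000, 0))     # zero window, data segment
    print(segment_acceptable(1000, 1460, 1000, 1200))  # segment larger than window
```

Note what the table does and does not say: a 1460-byte segment against a zero window fails the test, but the same segment against a 1200-byte window passes it, because its first byte lies inside the window.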

Since I do not see that behaviour too often from the Windows side, and never at lower speeds, I believe it is safe to say they have some issue, shared with the application (the latter fails to process the data, the TCP will accept the wrong offset).

The 0 window does mean that indeed, the 1200 ack is nonsense; and I agree that a 30 second timeout may be somewhat aggressive, but at 100 Mbps it seems an eternity (or is it an ethernity :-). It is my TCP sending action which times out, but the application can set, on a per-connection basis, whether there is a timeout and how long it is; in this case it is 30 seconds (I believe this was the default setting, also settable).

Dimiter

------------------------------------------------------
Dimiter Popoff
Transgalactic Instruments
------------------------------------------------------

- Rocky

Sun, Sep 13, 2009 11:05 AM

Just for interest, what was the window size on the previous ACK? The receiver should always be able to handle its advertised window size.

- Dimiter Popoff

Sun, Sep 13, 2009 11:44 AM

I believe it was the right one, i.e. it takes exactly as many bytes as last advertised and acks as many (which is sheer nonsense, partial segment ack); then it fails to handle the case of being overflown - sometimes.

Dimiter

- Rocky

Sun, Sep 13, 2009 3:32 PM

What I was getting at is that unless fragmentation took place - which on a direct connection is somewhat unlikely :) - then the window of the receiver should have been greater than or equal to about 1400 if a 1460 segment was sent. If it was >= 1400, but only 1200 bytes got acked, then my view would be that the TCP engine of the receiver is possibly broken. If however the receiver window was only 1200 bytes and your transmitter sent about 1400, then the transmitter is broken.

I have approximated 1400 bytes for the window, because of the TCP overhead.

- Dimiter Popoff

Sun, Sep 13, 2009 4:17 PM

Mine does at times send more than the actual window size - this cannot be avoided, 12 µs per segment is negligible compared to overall RTT, internal system latencies (both sides) etc. There are a lot of unacked segments in transit, and sometimes the smaller window information is seen too late. The receiving side must be prepared to receive such a segment, discard it and stay alive (and open the window at some point). When I get conservative enough to guarantee I will never ever overflow the foreign window, the overall speed drops about 3-4 times; not a good deal. It is normal to have such an overflow every now and then (a few seconds, perhaps tens of seconds apart), recover and maintain maximum link speed. My side keeps on receiving that all the time and handles it without any hiccups.
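The situation described here, data already in flight exceeding a window that has since become smaller, falls out of the usual usable-window calculation (SND.UNA + SND.WND - SND.NXT). A minimal sketch, with hypothetical helper names and made-up numbers:

```python
MOD = 1 << 32  # TCP sequence numbers wrap at 2**32

def seq_diff(a, b):
    """Signed difference a - b in 32-bit sequence space."""
    d = (a - b) % MOD
    return d - MOD if d >= MOD // 2 else d

def usable_window(snd_una, snd_nxt, snd_wnd):
    """How many more bytes may be sent: SND.UNA + SND.WND - SND.NXT.

    A negative result means more data is already in flight than the
    most recently advertised window allows.
    """
    return snd_wnd - seq_diff(snd_nxt, snd_una)

if __name__ == "__main__":
    # Six 1460-byte segments in flight against a 17520-byte window.
    print(usable_window(1000, 1000 + 6 * 1460, 17520))
    # An ACK then arrives advertising only 1200 bytes.
    print(usable_window(1000, 1000 + 6 * 1460, 1200))
```

In the second call the result is negative: the sender learned about the smaller window only after the segments had left, which is the case the receiver has to tolerate.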

Well, not that it matters in this context but the segment size (minus overhead) is 1460 bytes, $5b4.

Dimiter


- David Schwartz

Mon, Sep 14, 2009 12:33 AM

The other side keeps trying to get you to accept its ACK until its calculated RTT gets high. Once that happens, the recovery is going to be slow.

There is nothing wrong with taking part of a segment. And you cannot rely on a zero window size advertised meaning there will still be a zero window size when the packet is received.

How do you figure that? Why do you think the window size invalidates the ACK?

And what should the other side infer from the fact that you keep ignoring the ACK other than that it dropped?

DS

- Dimiter Popoff

Mon, Sep 14, 2009 1:24 AM

Please consult the paragraph past line 4264 of rfc793, it explains that.

Thanks for your comments,

Dimiter

- Nobody

Mon, Sep 14, 2009 5:37 AM

I presume that you are referring to:

If a segment's contents straddle the boundary between old and new, only the new parts should be processed.

Which implies that the old version should take precedence, regardless of whether it has been passed up to the application.

Although it shouldn't make any difference (the sender shouldn't be sending conflicting data for a given range of sequence numbers), this has been suggested as a possible attack vector, to "smuggle" malicious data past a security scanner built into a router.

Note that it's talking about the receive window within the receiver's TCP stack, not the last advertised window. It's possible that the window has just opened but this fact hasn't yet been announced. If the receiver responds to a probe with a zero window size, then ACKs some data from the next packet, it isn't necessarily violating the rules.

That's an API issue. The BSD sockets API offers the SO_SNDTIMEO socket option (for all socket families). This specifies how long a send/write/etc call can block for; if the timeout is exceeded, the call will return a short count (or -1 with errno set to EAGAIN), but the socket remains valid for further operations (i.e. it doesn't terminate the connection).

One thing I'm not entirely clear on is whether the result reflects:

1. the data actually acknowledged by the receiver,
2. the data actually sent (and scheduled for retransmission until acknowledgement), or
3. the amount of data copied into the kernel's transmit buffer.

I assume that it would be either 2 or 3; data sent but not acknowledged may have already been received and passed up to the application, so it cannot be "rescinded". If it's 2, the kernel can just discard any data which hasn't been sent yet.
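As an illustration of the SO_SNDTIMEO option mentioned above, here is a minimal sketch of setting and reading it back. It assumes a Linux or BSD style platform where the option value is a `struct timeval` that packs as two native longs; Windows uses a different value format. The helper names are invented for this example.

```python
import socket
import struct

# Assumption: struct timeval packs as two native longs (typical on
# 64-bit Linux/BSD). This layout is platform-dependent.
TIMEVAL = "ll"

def set_send_timeout(sock, seconds):
    """Limit how long a blocking send on `sock` may wait."""
    sec = int(seconds)
    usec = int((seconds - sec) * 1_000_000)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDTIMEO,
                    struct.pack(TIMEVAL, sec, usec))

def get_send_timeout(sock):
    """Read the current send timeout back, in seconds."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDTIMEO,
                          struct.calcsize(TIMEVAL))
    sec, usec = struct.unpack(TIMEVAL, raw)
    return sec + usec / 1_000_000

if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    set_send_timeout(s, 30)   # the 30 second figure discussed in the thread
    print(get_send_timeout(s))
    s.close()
```

The timeout only bounds how long an individual send call may block; as the post says, the socket stays usable afterwards and it is up to the application to decide whether to close it.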

- Nobody

Mon, Sep 14, 2009 5:40 AM

OTOH, the sender should always be able to handle the receiver failing to handle its advertised window size (i.e. window shrinking).

- Dimiter Popoff

Mon, Sep 14, 2009 6:30 AM

Well yes, but it cannot do much except probing and waiting. Now why my 1 byte probing does not kick in in that case is something I have to investigate (I have had it there for years, since day one, actually), but it is completely irrelevant which segment size (and at which offset) is sent to get a valid ack with a valid window size in reply; in our case the fact is that my peer is stuck at 0 window size, clearly messed up.

Dimiter


- Dimiter Popoff

Mon, Sep 14, 2009 6:51 AM

You are right, this is the correct interpretation. Come to think of it, this is how I do it (I set the newcomer's offset to the first byte high enough to skip over the data overlapping with the previous segment, which does exactly that...). But I am sure I had read somewhere about newer stuff taking precedence; it must have been in rfc791 regarding defragmentation, so I am not completely making that up :-). It's been several years since I wrote that implementation; now I am inside it because it is seeing 100 Mbps for the first time.

Very similar behaviour here. Obviously my "send" (send and wait for ack), sendq (queue for sending and return - can be polled for status), and a new out25 (...:-), which allows the application to serve the connection in a loop while getting back key parameters (ack position, queued position etc.) all just return with the proper status upon timeout; it is up to the application to decide whether to close the connection. The only case where it will get closed automatically is if the task is killed.

Dimiter


- David Schwartz

Mon, Sep 14, 2009 9:22 AM

It's stuck at 0 window size because it believes its ACK keeps dropping.

DS

- Jorgen Grahn

Mon, Sep 14, 2009 10:42 AM

FYI, he's probably referring to this text on page 69 of 85:

Segments are processed in sequence. Initial tests on arrival are used to discard old duplicates, but further processing is done in SEG.SEQ order. If a segment's contents straddle the boundary between old and new, only the new parts should be processed.
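The "only the new parts should be processed" rule in the quoted text amounts to trimming the front of a segment that overlaps data already received. A minimal sketch, assuming 32-bit sequence arithmetic; the function name `trim_to_new` is made up, and trimming at the right-hand window edge is left out for brevity:

```python
MOD = 1 << 32  # TCP sequence numbers wrap at 2**32

def trim_to_new(seg_seq, data, rcv_nxt):
    """Drop the leading bytes of a segment that lie before RCV.NXT.

    Returns (sequence number of the first kept byte, kept bytes).
    """
    overlap = (rcv_nxt - seg_seq) % MOD
    if overlap == 0 or overlap >= MOD // 2:
        # Segment starts at or beyond RCV.NXT: nothing old to drop.
        return seg_seq, data
    if overlap >= len(data):
        # Entirely old data: a pure duplicate.
        return rcv_nxt, b""
    return rcv_nxt, data[overlap:]

if __name__ == "__main__":
    # Bytes 1000..1002 were already received; only "defgh" is new.
    print(trim_to_new(1000, b"abcdefgh", 1003))
```

Under this reading the previously received version of an overlapping byte always wins, since the retransmitted copy of it is never looked at.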

/Jorgen

--
  // Jorgen Grahn    O  o   .

- Dimiter Popoff

Mon, Sep 21, 2009 7:50 PM

Out of curiosity - and after a wasted week because of a running nose and a head full of what should have been some rubber glue - I tried to make the stuck Windows host happy, i.e. I began probing it not by repeating the segment it was partially ack-ing, but just with the part past what it had acked in the hope it would eventually recover.

No such luck, though. Absolutely no change; it keeps on repeating the same thing (acks the position it had last acked, window size 0). Clearly dead - not that it was not obvious before that, I know a messed up system when I see it, but I did try and thought I'd post the result as well. Only with FileZilla, though - another FTP server, Xlight something, does not do it. A difference between the two I notice is the fact that only FileZilla opens the data connection using window scaling; this may have to do with their problem (I see a much larger window advertised than the buffer size which is currently set, not that changing that buffer size had any effect during earlier tests). With both servers, the upload speed is about 7.5 Mbytes/s - the window scaling is not really needed, this is a local (via a buffering switch) connection. I tried to install an FTP server under Linux to do the same test, but after wasting an hour trying to make that run (under Ubuntu) without success I gave up, no time for that now.
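On the window scaling remark: when the option has been negotiated, the 16-bit window field in each non-SYN segment is shifted left by the scale factor agreed in the SYN exchange, so the effective window can be far larger than 65535 bytes. A small sketch of that arithmetic; the shift used in the example is hypothetical, since the thread does not say what was negotiated:

```python
MAX_SHIFT = 14  # largest window scale shift the TCP option allows

def effective_window(window_field, shift):
    """Window in bytes given the 16-bit header field and the scale shift."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift must be between 0 and 14")
    return window_field << shift

if __name__ == "__main__":
    # Hypothetical values: field 65535 with a shift of 2.
    print(effective_window(65535, 2))
```

A sender that ignores the negotiated shift, or applies the wrong one, will misjudge how much the peer can actually take, which is one reason to check the shift values when comparing the two servers.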

Dimiter


- Paul Carpenter

Mon, Sep 21, 2009 10:05 PM

Considering the problems I have seen with Passive mode support in the FileZilla client when connecting through an ftp-proxy to external FTP servers, I would blame FileZilla.

In my case I replaced the client with a 10 year old ftp client on the SAME system and everything worked.

...

If you can get it going you will probably find it works.

Now back to sorting some bugs in some USB to SPI controller driver, that does not work for all CPOL and CPHA modes as advertised.

--
Paul Carpenter | snipped-for-privacy@pcserviceselectronics.co.uk
PC Services
Timing Diagram Font
GNU H8 - compiler & Renesas H8/H8S/H8 Tiny
For those web sites you hate