TCP Retransmissions b/c of Checksum Incorrect
Q:
We have a problem where some users complain of slow application response times (on the LAN) from our ERP server, while others are fine. After many hours of troubleshooting, I've discovered the following:

- Packet sniffs of affected users show TCP retransmissions because the checksum from the server is incorrect.
- Replacing the NIC on the workstation temporarily fixes the errors. The problem may then start happening on the new NIC, so I swap back to the old NIC and the problem goes away for a while.
- Sometimes the problem clears itself up for certain people after a couple of months.
- There is no pattern among the affected machines in NIC, driver, version, or type (laptops, desktops, wireless).
- Affected users have been tested on different segments of the LAN with no success.
ASKED: Jul 19 2007  3:55 PM GMT
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
A:
You said "the Checksum is Incorrect from the server". Did you mean server or client? Just to be absolutely clear!!

If you really meant "server", then you should be looking at the server's configuration & hardware.

If not, can you identify any other common characteristics of the users?
Last Answered: Dec 20 2007  3:39 PM GMT by Itguy1509   220 pts.
Latest Contributors: bobkberg   895 pts.
Discuss This Answer:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _



_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

tbitner   495 pts.  |   Jul 19 2007  4:41PM GMT

I accidentally hit reply before I had finished composing my last reply. The retransmission requests are coming from the server, stating “Checksum: 0x493a (incorrect, should be 0x4939)”.
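For anyone wondering how a sniffer decides a checksum "should be" something else: it recomputes the 16-bit ones' complement sum over the segment and compares. Here is a minimal Python sketch of that sum in the style of RFC 1071. The function names are my own, and it leaves out building the TCP pseudo-header (source/destination IP, protocol, length) that a real TCP checksum also covers.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones' complement checksum of data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back into 16 bits
    return ~total & 0xFFFF


def checksum_is_valid(data_with_checksum: bytes) -> bool:
    """Data that already includes its checksum field should sum to zero."""
    return internet_checksum(data_with_checksum) == 0


if __name__ == "__main__":
    sample = bytes.fromhex("0001f203f4f5f6f7")  # example words from RFC 1071
    print(hex(internet_checksum(sample)))
    flipped = bytes.fromhex("0001f203f4f5f6f6")  # same data with one bit changed
    print(hex(internet_checksum(flipped)))
```

Changing a single low-order bit moves the result by one, so a reported value that is off by one from the expected value (like 0x493a vs 0x4939) is consistent with a very small change somewhere in the covered bytes. It does not by itself tell you where the change happened.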

The server is HP-UX 11.11 but will be migrating to 11.23 in the next couple months. From my testing it seems to be something wrong with the server, but we’re DBA-less currently so I’m reluctant to make any changes!

 

jtt555   0 pts.  |   Jul 19 2007  4:58PM GMT

If OS is Microsoft you may find this useful:
article - 224829

One possible reason for the incorrect checksum is that your network cards perform TCP checksum offload. Broadcom and Intel gigabit cards are among those that can offload TCP checksum calculation. Linux enables TCP checksum offload automatically when it is available.

With TCP checksum offload, outgoing packets are captured before the card calculates the checksum, so a capture taken on the sending machine may show checksums that look incorrect. The checksum actually transmitted on the wire and received by the destination host will be correct.

On Linux, it is possible to disable TCP checksum offload, for example with ethtool.
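As a rough sketch of what that looks like on a Linux capture box, the Python below builds the ethtool command lines for checking and switching TX/RX checksum offload. The interface name "eth0" and the helper names are only assumptions for illustration. Applying the change needs root and the ethtool utility installed, and it is best done temporarily while capturing.

```python
import subprocess
from typing import List


def show_offload_command(interface: str) -> List[str]:
    """Command that lists the current offload settings (lowercase -k)."""
    return ["ethtool", "-k", interface]


def checksum_offload_command(interface: str, enable: bool) -> List[str]:
    """Command that turns TX/RX checksum offload on or off (uppercase -K)."""
    state = "on" if enable else "off"
    return ["ethtool", "-K", interface, "tx", state, "rx", state]


def set_checksum_offload(interface: str, enable: bool) -> None:
    """Apply the setting; raises CalledProcessError if ethtool fails."""
    subprocess.run(checksum_offload_command(interface, enable), check=True)


if __name__ == "__main__":
    # "eth0" is a placeholder; substitute the interface you capture on.
    print(" ".join(show_offload_command("eth0")))
    print(" ".join(checksum_offload_command("eth0", enable=False)))
```

If the "incorrect" checksums disappear from the capture once offload is off, they were most likely a capture artifact and not the cause of the retransmissions.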

Of course, it could also be due to any number of other conditions, such as hardware failure, corruption of an IP datagram in transit, a faulty router, or congestion. Make sure your NIC drivers (server and workstation) are up to date. You may need to configure an NLB setup to give your ERP server a fatter pipe.
Good luck!

 

tbitner   495 pts.  |   Jul 20 2007  11:28AM GMT

jtt555,

I don’t think the client OS is generating these checksums, since I’m seeing retransmission requests from the server. From my tests, I’m starting to lean towards the server as the culprit. Can a bad cable cause incorrect checksums, or would it be the server’s NIC?

Thanks

 

Snapper70   540 pts.  |   Oct 12 2007  9:17PM GMT

You might want to verify the duplex setting between the HP and the switch it’s connected to. If the HP is set to 100 full and the switch is set to autonegotiate, then you may have a mismatch, and as load increases you WILL get a lot of runts and retransmissions. The OTHER thing is that some older HPs didn’t seem to run full duplex even when configured that way, although recent models don’t have that issue.

What we HAVE done is to FTP to/from the HP to a high end workstation, and verified the transmission rate. If you’re on a 100 Meg connection, you should get at LEAST 30 Meg from a high end workstation to/from the HP via FTP (use a large file of 50 Meg or more). If your rate is only 5 meg or so, you’ve probably got a duplex mismatch.
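To turn an FTP transfer into a number you can compare against those figures, something like the Python sketch below works. It assumes "Meg" above means megabits per second (FTP clients often report bytes or kilobytes per second, so convert first). The 30 and 5 thresholds are taken from the rule of thumb above, and the function names are made up for the example.

```python
def throughput_mbit_per_s(size_bytes: int, seconds: float) -> float:
    """Convert a transfer of size_bytes taking `seconds` into megabits per second."""
    if seconds <= 0:
        raise ValueError("transfer time must be positive")
    return size_bytes * 8 / seconds / 1_000_000


def classify_rate(rate_mbit: float,
                  healthy_floor: float = 30.0,
                  suspect_ceiling: float = 5.0) -> str:
    """Apply the rule of thumb for a 100 Mbit link (thresholds are assumptions)."""
    if rate_mbit >= healthy_floor:
        return "ok"
    if rate_mbit <= suspect_ceiling:
        return "possible duplex mismatch"
    return "inconclusive"


if __name__ == "__main__":
    # A 50 MB file (50,000,000 bytes) transferred in 100 seconds.
    rate = throughput_mbit_per_s(50_000_000, 100.0)
    print(rate, classify_rate(rate))
```

Run the transfer in both directions, since a duplex mismatch can hurt one direction more than the other.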

If you use Autonegotiate on the switches, you should also use Autonegotiate on the server; if you hard-code speed/duplex on one side, make sure you set it the same on the other. A duplex mismatch will severely impact performance, but may appear normal under low-volume traffic.
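The reason the mixed case bites is that, on 10/100 Ethernet, an autonegotiating port that gets no negotiation response from a hard-coded peer typically detects the speed but falls back to half duplex. Here is a small Python sketch of that rule. The PortConfig type and function names are hypothetical, and it assumes both ends support full duplex when they do negotiate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortConfig:
    autonegotiate: bool
    duplex: str = "full"  # forced duplex; only used when autonegotiate is False


def effective_duplex(local: PortConfig, peer: PortConfig) -> str:
    """Duplex the local port ends up using, given how the peer is configured."""
    if not local.autonegotiate:
        return local.duplex  # hard-coded ports use exactly what was set
    if peer.autonegotiate:
        return "full"  # assumption: both ends advertise full duplex
    return "half"  # peer does not negotiate, so the auto side falls back to half


def has_duplex_mismatch(a: PortConfig, b: PortConfig) -> bool:
    """True when the two ends of the link end up in different duplex modes."""
    return effective_duplex(a, b) != effective_duplex(b, a)


if __name__ == "__main__":
    server = PortConfig(autonegotiate=False, duplex="full")  # HP set to 100 full
    switch = PortConfig(autonegotiate=True)                  # switch port on auto
    print(has_duplex_mismatch(server, switch))
```

Real hardware can behave differently, so confirm the actual negotiated state on both the switch port and the server NIC.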

 