Owners manual
6: Troubleshooting and Technical Support
Modbus Protocol User Guide 26
After a while, the IAP Device Server seems to take longer and longer to
answer. After a few hours, it takes 10 minutes or more for system changes
to propagate up to the master/client.
All of these symptoms trace back to the same issue: the master/client's queuing
behavior and expectations do not match the realities of Ethernet. (The IAP Device
Server is not behaving poorly.) Resetting the IAP Device Server fixes the problem
because it flushes the TCP queues, which have become bloated with stale requests.
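The growing delay can be estimated with simple arithmetic. The sketch below is only an illustration, not a model of the IAP Device Server's internals: the function names are hypothetical, the rate of 4 polls serviced per second is an assumed figure, and the model assumes constant rates and ignores TCP buffer limits. The rate of 5 polls per second matches the 300 polls per minute used in the scenario in this section.

```python
def backlog_after(seconds, polls_per_sec, served_per_sec):
    """Polls still queued after `seconds`, assuming constant rates."""
    return max(0.0, (polls_per_sec - served_per_sec) * seconds)


def queue_delay(seconds, polls_per_sec, served_per_sec):
    """Seconds a poll arriving at time `seconds` waits behind the backlog."""
    return backlog_after(seconds, polls_per_sec, served_per_sec) / served_per_sec


if __name__ == "__main__":
    # Assumed rates: the master sends 5 polls/s, the serial side clears 4 polls/s.
    for hours in (1, 2, 3):
        delay = queue_delay(hours * 3600, polls_per_sec=5.0, served_per_sec=4.0)
        print(f"after {hours} h: about {delay / 60:.0f} min behind")
```

Under these assumed rates the backlog grows by one poll every second, so after one hour a new poll waits about 15 minutes behind 3600 stale requests. If the master sends no faster than the serial side can service, the backlog stays at zero.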
The core problem is that the master/client still relies on the old RS485 serial
assumption that no answer means the poll was lost. With the IAP Device Server, no
answer can also mean that the IAP Device Server is overworked and has not yet had
time to answer. Remember also that TCP is reliable: the IAP Device Server receives
every poll sent, without error. When the master/client retries, it therefore adds
another poll to the queue, which makes it harder for the IAP Device Server to catch up.
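One way a master can avoid mistaking a late answer for the current one is to check the sequence number that Modbus/TCP carries in its header (the transaction identifier). The sketch below is a minimal illustration under stated assumptions, not the behavior of any particular master: the helper names `build_frame`, `split_frames`, and `pick_response` are hypothetical. The header layout (transaction identifier, protocol identifier, length, unit identifier) follows the Modbus/TCP MBAP header.

```python
import struct

# MBAP header: transaction id, protocol id, length, unit id (big-endian).
MBAP = struct.Struct(">HHHB")


def build_frame(transaction_id, unit_id, pdu):
    """Prefix a Modbus PDU with an MBAP header carrying `transaction_id`."""
    # The length field counts the unit id byte plus the PDU bytes.
    return MBAP.pack(transaction_id, 0, len(pdu) + 1, unit_id) + pdu


def split_frames(buffer):
    """Split received bytes into (transaction_id, unit_id, pdu) tuples.

    Returns the complete frames plus any incomplete trailing bytes.
    """
    frames = []
    while len(buffer) >= MBAP.size:
        tid, _proto, length, unit = MBAP.unpack_from(buffer)
        if length < 1:
            raise ValueError("malformed MBAP length field")
        end = MBAP.size + length - 1  # header already includes the unit id
        if len(buffer) < end:
            break  # wait for the rest of this frame
        frames.append((tid, unit, buffer[MBAP.size:end]))
        buffer = buffer[end:]
    return frames, buffer


def pick_response(frames, expected_tid):
    """Return the PDU answering `expected_tid`, skipping stale frames."""
    for tid, _unit, pdu in frames:
        if tid == expected_tid:
            return pdu
    return None
```

A master using this approach would read everything waiting in its receive buffer, pass it through `split_frames`, and accept only the frame whose transaction identifier matches the poll it is currently waiting on. Late answers to earlier polls are discarded instead of being taken as the answer to the current poll.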
Here is the scenario that is causing the problem:
1. Master sends out MB/TCP Poll #A with a timeout of 1000 msec.
2. IAP Device Server receives the poll, but the serial link is busy, so it waits.
Possibly another MB/TCP master is being serviced, or timeouts waiting on off-line
stations are creating a backlog of new requests.
3. After approximately 850 msec, the serial link is now free and the IAP Device
Server forwards the MB/RTU request.
4. The IAP Device Server receives the response. Because the timeouts on the IAP
Device Server and the master are not inherently synchronized, the IAP Device
Server does not know that the master's timeout is about to expire, so it sends
the MB/TCP response into the TCP socket.
5. In the best of times, it may take 5-10 msec for this response to actually go down
the IAP Device Server's TCP stack, across the wire, and up the master's TCP
stack. If a WAN or satellite is involved, it could take 750 msec or longer.
6. Meanwhile, before the master receives Response #A, it gives up and makes
the Modbus/RTU assumption that the request must have been lost. The master
sends out a new MB/TCP Poll #B.
7. A few msec later, a response arrives that looks like a good Response #B but
is really Response #A. If the master does not use a sequence number (the
transaction identifier in the Modbus/TCP header, which many masters do not use)
and has forgotten about pending Poll #A, it wrongly assumes this is Response #B
(possibly with catastrophic results if Poll #B was the same size but covered a
different register range). This is the source of the problem “IAP Device Server
returns the wrong data for wrong slave.”
8. The master is idle and has no outstanding polls. Yet the IAP Device Server has
received Poll #B by TCP/IP. It sends this out to the Modbus/RTU slave and gets an
answer. The IAP Device Server is doing its job!
9. The IAP Device Server returns Response #B to the master (if the socket is still
open), where it sits in the master's TCP/IP receive buffer. The master is not
expecting more responses, so it neither reads nor purges the "extra" response.
10. Master sends Poll #C and magically finds "a response" waiting as soon as it
looks in the receive buffer, yet this is stale Response #B, received before Poll #C
was even issued. If the master does not implement Modbus/TCP sequence
numbers, then it accepts Response #B as satisfying Poll #C. Imagine if the
master is putting out 300 polls per minute (5 polls per second), but the IAP