FAQs about Monitoring Mediation Servers | Troubleshooting

OMNM 6.5.2 User Guide 725

FAQs about Monitoring Mediation Servers

After making a UDP-based JGroups discovery request and receiving a response from an application

server in the cluster, each mediation server makes an RMI (TCP) call to an application server every

30 seconds. This RMI call results in a “call on cluster” on the application server cluster, using

JGroups (UDP by default), to call the agentHeartbeat method of the OWMedServerTrackerMBean

on each application server in the cluster. The primary application server updates the timestamp for

the medserver in question, and the others ignore the call. Every five seconds, the primary

application server checks to see if it has not received a call from a mediation server in the last 52

seconds. If it has not, it attempts to verify down status by pinging the suspected mediation server.

Then it issues an RMI call on that mediation server. It considers the mediation server down if the

ping or the final RMI call fails. This avoids false mediation server down notifications when a

network cable is pulled from an application server.
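
The timing rules above can be sketched as a small simulation. This is only an illustration of the described logic, not the product's implementation: the class name MedServerTracker, the injected ping and rmi_call callables, and the injectable clock are all hypothetical. Only the 52-second threshold and the 5-second check interval come from the text.

```python
import time
from typing import Callable, Dict, List

HEARTBEAT_TIMEOUT_S = 52  # from the text: silence threshold before verification
CHECK_INTERVAL_S = 5      # from the text: how often the primary server checks


class MedServerTracker:
    """Hypothetical stand-in for the tracker on the primary application server."""

    def __init__(
        self,
        ping: Callable[[str], bool],       # assumed: returns True if the host answers
        rmi_call: Callable[[str], bool],   # assumed: returns True if the RMI call succeeds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._last_seen: Dict[str, float] = {}
        self._ping = ping
        self._rmi_call = rmi_call
        self._clock = clock

    def agent_heartbeat(self, medserver: str) -> None:
        """Record the time of the latest call from a mediation server."""
        self._last_seen[medserver] = self._clock()

    def check_once(self) -> List[str]:
        """Return the mediation servers considered down at this check."""
        now = self._clock()
        down = []
        for name, seen in self._last_seen.items():
            if now - seen <= HEARTBEAT_TIMEOUT_S:
                continue  # heard from recently enough; no verification needed
            # Verify before reporting: ping first, then an RMI call.
            # A failure of either one marks the server as down.
            if not self._ping(name) or not self._rmi_call(name):
                down.append(name)
        return down
```

In a real deployment, check_once would be invoked by a scheduler every CHECK_INTERVAL_S seconds. In this sketch the RMI call is skipped when the ping already fails, which is one possible reading of the description.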

• Does the application server wait 15 seconds after receiving the mediation server's response? Or

does it monitor the mediation server every 15 seconds regardless of the mediation server's

response?

The receipt of the mediation server's RMI call is on a different thread than the monitoring

code. The monitoring code should run every 5 seconds, regardless of the frequency of

mediation server calls. However, our investigation of the scheduling mechanism (the JBoss

scheduler, http://community.jboss.org/wiki/scheduler) suggests that other tasks using this

scheduler could affect the schedule because of a change in the JDK timer implementation

after JDK 1.4.

• What kind of functionality (JMS?) does the application server use to send and receive

OpenManage Network Manager messages?

The application server does not actively monitor the mediation servers unless it fails to get a

call from one for 52 seconds. If it does try to verify a downed mediation server, it uses an RMI

call.

The RMI calls use TCP sockets. Communication may involve multiple ports: 1103/1123 (UDP -

JGroups Discovery), 4445/4446 (TCP - RMI Object), 1098/1099 (TCP - JNDI), 3100/3200 (TCP -

HAJNDI), and 8093 (UIL2).
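
When a firewall is suspected of blocking this traffic, a quick reachability check of the TCP ports listed above can help. The following is a minimal sketch, not a product utility: the function names are hypothetical, and 8093 (UIL2) is assumed to be TCP because the text does not state its protocol. The UDP discovery ports (1103/1123) are left out because a plain connection attempt cannot test them.

```python
import socket
from typing import Dict, Tuple

# TCP ports named in the text, grouped by the service the text assigns them to.
# Assumption: 8093 (UIL2) is treated as TCP here.
TCP_PORTS: Dict[str, Tuple[int, ...]] = {
    "RMI Object": (4445, 4446),
    "JNDI": (1098, 1099),
    "HAJNDI": (3100, 3200),
    "UIL2": (8093,),
}


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_tcp_ports(
    host: str,
    ports: Dict[str, Tuple[int, ...]] = TCP_PORTS,
    timeout: float = 2.0,
) -> Dict[str, bool]:
    """Map 'service:port' to whether that port accepted a connection."""
    return {
        f"{service}:{port}": tcp_port_open(host, port, timeout)
        for service, group in ports.items()
        for port in group
    }
```

Calling check_tcp_ports with the address of an application server returns one entry per port. A False result means only that the connection attempt failed from the machine running the check, not that the service itself is faulty.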

• What kind of problem or bug would cause the application server to falsely detect that a

mediation server is down? For example, would failing to allocate memory cause the application

server to think a mediation server is down (dead)?

An out of memory error on an application server could result in a false detection of a downed

medserver.

• If such memory depletion occurs as described in the previous answer, would a record of it

appear in the log? If it doesn't appear in the log, would it possibly appear if the log level is

changed?

An out of memory error usually appears in the log without modifying logging configuration,

since it is logged at ERROR level.
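
To confirm whether an out-of-memory condition was recorded, the log can be searched for the Java error class name. This is a minimal sketch under stated assumptions: the function name is hypothetical, the log is assumed to be plain text, and the only thing matched is the standard class name java.lang.OutOfMemoryError. The location and layout of the log file are not assumed.

```python
from typing import Iterable, List, Tuple

# Standard Java class name that appears when the JVM runs out of memory.
OOM_MARKER = "java.lang.OutOfMemoryError"


def find_oom_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line that mentions the OOM marker."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if OOM_MARKER in line
    ]
```

Because an open text file is an iterable of lines, the function can be passed a file object for the server log directly. An empty result means only that the marker text was not found in that file.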

• The log shows that a mediation server was detached from the cluster configuration. What

logic decides that a server is detached from the cluster? For instance, would the application

servers detach the mediation server if they detect that it is down?

JBoss (JGroups) has a somewhat complex mechanism for detecting a slow server in a cluster,

which can result in a server being “shunned.” This logic remains, even though we have never

observed the shunning of a server resulting in a workable cluster. This is the only mechanism

which automates removing servers from the cluster. The configuration for this service is