Now that Oracle Enterprise Manager (OEM) 12c Cloud Control is becoming a mainstay as a robust enterprise tool, it is important that all of OEM's functionality be available after a reboot. That functionality extends well past the Oracle Management Server and Repository, out to the Management Agents that monitor an environment. After it is installed, the Management Agent should be configured to restart automatically after a reboot or if the process dies. Oracle recommends using the operating system's own mechanisms to restart the Management Agent.
Example: on a UNIX/Linux platform this is done by placing a script in /etc/init.d that starts the agent at boot; on Windows, a Windows service should be used.
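As a rough illustration of the /etc/init.d approach, here is a minimal sketch of such a script. The script name (oemagent), the agent home path, and the oracle user are assumptions made for this example, not values from an actual install. How the script is registered for boot (chkconfig, update-rc.d, and so on) depends on the distribution and is not shown.

```shell
#!/bin/sh
# Hypothetical /etc/init.d/oemagent (the name is made up for this sketch).
# AGENT_HOME and AGENT_USER are assumed values; adjust them to the real install.
AGENT_HOME="${AGENT_HOME:-/u01/app/oracle/agent12c/agent_inst}"
AGENT_USER="${AGENT_USER:-oracle}"
EMCTL="${EMCTL:-$AGENT_HOME/bin/emctl}"

# Run a command line as the agent owner; init scripts normally run as root.
run_as_agent_owner() {
    if [ "$(id -un)" = "$AGENT_USER" ]; then
        sh -c "$1"
    else
        su - "$AGENT_USER" -c "$1"
    fi
}

# Map the usual init actions onto emctl.
oemagent() {
    case "$1" in
        start|stop|status)
            run_as_agent_owner "$EMCTL $1 agent"
            ;;
        restart)
            oemagent stop
            oemagent start
            ;;
        *)
            echo "Usage: oemagent {start|stop|restart|status}" >&2
            return 1
            ;;
    esac
}

# Dispatch only when an action was passed, as init does at boot ("start").
if [ "$#" -gt 0 ]; then
    oemagent "$1"
fi
```

Running the agent under its owning account rather than root keeps file ownership in the agent home consistent, which is why the sketch goes through su when it is invoked as root.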
Once the Management Agent is started after installation, it is watched by a watchdog process (emwd.pl). The watchdog monitors the Management Agent and attempts to restart it if the agent fails. Its behavior is controlled by environment variables that must be set before the Management Agent is started:
EM_MAX_RETRIES – This is the maximum number of times the watchdog will attempt to restart the Management Agent within the EM_RETRY_WINDOW. The default is to attempt restart of the Management Agent three (3) times.
EM_RETRY_WINDOW – This is the time interval, in seconds, that is used together with the EM_MAX_RETRIES environment variable to determine whether the Management Agent is to be restarted. The default is 600 seconds.
Note: The watchdog will not restart the Management Agent if it detects that the Management Agent has required a restart more than EM_MAX_RETRIES times within the EM_RETRY_WINDOW time period.
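To make the interaction between the two variables concrete, here is a small sketch of the retry rule as described above. It is not the code inside emwd.pl. The function name should_restart and the exact boundary handling (giving up once the number of restarts inside the window has reached EM_MAX_RETRIES) are assumptions made for illustration.

```shell
#!/bin/sh
# Illustration only: a simplified model of the rule, NOT the logic in emwd.pl.
# Defaults mirror the documented ones: 3 retries within 600 seconds.
EM_MAX_RETRIES="${EM_MAX_RETRIES:-3}"
EM_RETRY_WINDOW="${EM_RETRY_WINDOW:-600}"

# should_restart NOW TS...
#   NOW is the current time and each TS is the time of an earlier restart,
#   all in epoch seconds. Succeeds (exit 0) if another restart is allowed.
should_restart() {
    now=$1
    shift
    recent=0
    for ts in "$@"; do
        # Count only the restarts that fall inside the retry window.
        if [ $((now - ts)) -le "$EM_RETRY_WINDOW" ]; then
            recent=$((recent + 1))
        fi
    done
    # Assumed boundary: stop once the window already holds EM_MAX_RETRIES restarts.
    [ "$recent" -lt "$EM_MAX_RETRIES" ]
}
```

To change the limits on a real agent, the variables would be exported in the shell (or init script) that runs emctl start agent, for example EM_MAX_RETRIES=5 and EM_RETRY_WINDOW=900. Those two values are arbitrary examples.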
Let’s talk about the watchdog process. It is started when ./emctl start agent is executed. When the agent starts, the watchdog process (emwd.pl) is started with a parent PID of 1. The watchdog is responsible for monitoring the agent's Java process and will restart the agent if it becomes hung or unresponsive. The watchdog process can be viewed by running ps -ef | grep -i emagent; look for emwd.pl in the output.
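The check can also be wrapped in a small helper, sketched here. The function names find_watchdog and watchdog_pids are made up for this example, and the column positions assume the usual ps -ef layout (UID, PID, PPID, ...).

```shell
#!/bin/sh
# Hypothetical helpers; both read `ps -ef` output on standard input.

# Keep only the emwd.pl lines, dropping any grep command that matched itself.
find_watchdog() {
    grep 'emwd\.pl' | grep -v grep
}

# Print "PID PPID" for each watchdog process found.
watchdog_pids() {
    find_watchdog | awk '{ print $2, $3 }'
}
```

Running ps -ef | watchdog_pids on the agent host should print one line per watchdog. A second column of 1 matches the parent PID described above.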