Job manager


Overview

A job manager lets us submit administrative jobs asynchronously for application servers registered to admin agents and for deployment managers (dmgrs). We can submit these jobs to a large number of servers spread over a wide geographic area.

We can make both appserver nodes that are registered to admin agents and dmgrs known to the job manager through a registration process. After we register appserver nodes and dmgrs with the job manager, we can queue admin jobs directed at them through the job manager.

To register appserver nodes and dmgrs with the job manager, use the wsadmin command...

registerWithJobManager
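
As a rough illustration, the registration might be scripted in wsadmin's Jython mode along the following lines. The option names (-managedNodeName, -host, -port), the node name, the host name, and the port value are assumptions made for this sketch and should be checked against the product's command reference. The helper only builds the argument string; the AdminTask call is left commented out because that object exists only inside wsadmin.

```python
# Sketch only: builds the argument string for the registerWithJobManager command.
# The option names below are assumptions; verify them in the command reference.

def build_register_args(managed_node_name, host, port):
    """Return a wsadmin-style bracketed argument string."""
    return "[-managedNodeName %s -host %s -port %s]" % (managed_node_name, host, port)

# Hypothetical values for illustration.
args = build_register_args("storeNode01", "jobmgr.example.com", 9943)
print(args)

# Inside wsadmin (Jython mode), where AdminTask is available:
# AdminTask.registerWithJobManager(args)
```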

Use the job manager to asynchronously administer job submissions...

Each appserver node or dmgr registered with the job manager is known as a managed node to the job manager.

Node groups allow us to submit jobs for a group of nodes.

Many of the management tasks that we can perform with the job manager are tasks that we can already perform with WAS ND, such as application management, server management, and node management. However, with the job manager, we can aggregate tasks and perform those tasks across multiple appservers or dmgrs.

Examples of situations where a job manager is useful...

Branch office environment

A business has a thousand stores geographically dispersed across the continent. Each store contains either a few application servers, or a small ND cell consisting of two or three machines. Each store is managed locally for daily operations. However, each store is also connected to the data center at headquarters, potentially thousands of miles away. Some connections to HQ are at modem speeds.

HQ uses the job manager to periodically submit admin jobs for the stores.

Environment consisting of hundreds of application servers

An administrator sets up hundreds of low-cost machines running identical clones of an appserver. Each appserver node, which is registered with an admin agent, is registered with the job manager.

The administrator uses the job manager to aggregate commands across all the appservers, for example...

  • create a new server
  • install or update an application

Environment consisting of dozens of dmgr cells

An administrator sets up hundreds of application servers, which are divided into thirty different groups. Each group is configured within a cell. The cells are geographically distributed over five regions, with three to seven cells per region. Each cell supports one to fifteen member institutions, with a total of 230 institutions supported. Each cell contains approximately thirty applications, each running on a cluster of two servers for failover purposes, resulting in a total of about 1800 application servers.
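
The server count in this scenario can be checked with a few lines of arithmetic. The variable names are illustrative only, and the figures are the approximate ones given in the scenario.

```python
# Worked check of the server count in the scenario above.
cells = 30                 # thirty groups, one cell per group
apps_per_cell = 30         # approximately thirty applications per cell
servers_per_cluster = 2    # each application runs on a cluster of two

total_servers = cells * apps_per_cell * servers_per_cluster
print(total_servers)  # 1800

# Five regions with three to seven cells each allow 15 to 35 cells,
# so thirty cells is consistent with that range.
regions = 5
min_cells, max_cells = regions * 3, regions * 7
print(min_cells <= cells <= max_cells)  # True
```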

The administrator uses the job manager to aggregate commands across all the cells, for example, to...

  • stop servers
  • install or update an application
  • start servers

 

Example topology

The following example topology shows a dmgr and a managed node on machine A; two application servers, Profile01 and Profile02, registered with an admin agent on machine C; a job manager on machine D; and a Web server on machine B. Firewalls provide additional security for the machines.

The admin agent and the dmgr are registered to the job manager. The admin agent and dmgr periodically poll the job manager to determine whether the job manager posted jobs that require action.
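
The pull-based flow described above can be pictured with a toy model. This is a minimal sketch, not product code: the class, its methods, and the endpoint and job names are all hypothetical and exist only to show that jobs wait in a queue until an endpoint polls for them.

```python
# Toy model of pull-based job delivery: the job manager only queues jobs,
# and each registered endpoint asks for its pending work when it polls.
# All class and method names are hypothetical, not product APIs.

class ToyJobManager:
    def __init__(self):
        self.pending = {}  # endpoint name -> list of queued job names

    def register(self, endpoint):
        self.pending.setdefault(endpoint, [])

    def submit(self, job, endpoints):
        for endpoint in endpoints:
            self.pending[endpoint].append(job)

    def poll(self, endpoint):
        """Called by an endpoint; returns and clears its queued jobs."""
        jobs = self.pending[endpoint]
        self.pending[endpoint] = []
        return jobs


manager = ToyJobManager()
manager.register("dmgr-machineA")
manager.register("adminagent-machineC")

manager.submit("installApplication", ["dmgr-machineA", "adminagent-machineC"])

first = manager.poll("dmgr-machineA")
second = manager.poll("dmgr-machineA")
print(first)   # ['installApplication']
print(second)  # []
```

In this model the job manager never contacts an endpoint; a job starts only after the target's next poll, which is why the polling interval affects both the load on the job manager and the delay before a job runs.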

 

Job manager has high CPU when managing many nodes or dmgrs

Technote (troubleshooting)

CPU usage on the job manager increases as more nodes or dmgrs are registered with it. This occurs even if no jobs are submitted to the job manager. The job manager eventually runs at 100% CPU as more nodes are registered. The number of nodes that we can register without causing high CPU usage depends on the hardware capability of the machine that runs the job manager.

Resolving the problem

Tune the polling interval that each dmgr or admin agent uses to poll the job manager for jobs. The default polling interval is 30 seconds. As the polling interval increases, the rate at which the admin agent or dmgrs contact the job manager decreases, which alleviates the high CPU issue. The trade-off is that each job takes longer before it starts running.

To change the polling interval for the dmgr...

System administration > Deployment manager > Job managers > UUID

Change the value for the polling interval setting.

Save the changes and restart the dmgr.

To change the polling interval for a node that is registered with admin agent...

System administration > Administrative agent > Nodes > node_name > Job managers > Job Manager UUID

Change the value for the polling interval.

After changing the setting for all of the nodes that an administrative agent manages, save the changes, and restart the administrative agent.

As the number of nodes or dmgrs that are registered with the job manager increases, the polling interval must increase to avoid high CPU on the job manager. The appropriate value depends on the hardware that the job manager runs on, so determine it on a case-by-case basis. With up to 100 nodes or dmgrs registered with the job manager, we might need a polling interval as long as 20 minutes.
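
To get a feel for the trade-off, the following sketch compares the default 30-second interval with a 20-minute interval for 100 registered endpoints. It is a back-of-the-envelope model: it assumes polls are spread evenly over time and that a job is submitted at a random moment relative to each endpoint's polling cycle. The function names are illustrative and are not part of the product.

```python
def polls_per_second(endpoints, interval_seconds):
    """Average rate of poll requests arriving at the job manager."""
    return float(endpoints) / interval_seconds

def start_delay_seconds(interval_seconds):
    """(average, worst-case) wait before an endpoint picks up a new job."""
    return interval_seconds / 2.0, float(interval_seconds)

endpoints = 100
for label, interval in (("default 30 s", 30), ("20 min", 20 * 60)):
    rate = polls_per_second(endpoints, interval)
    avg, worst = start_delay_seconds(interval)
    print("%-12s %.3f polls/s, avg wait %.0f s, worst %.0f s" % (label, rate, avg, worst))
```

Under these assumptions, lengthening the interval from 30 seconds to 20 minutes cuts the poll rate by a factor of 40, while the worst-case delay before a job starts grows from 30 seconds to 20 minutes.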




Related concepts

Administrative agent

 

Related tasks

Administer nodes using the job manager
Administer jobs in a flexible management environment using scripting
Set up the admin architecture
Create a management profile with a job manager

 

Related

ManagedNodeAgent