[V5.1 and later] Troubleshooting the installation

 

Overview

If you did not receive the Successful installation message or the INSTFIN: The installation is complete. line in the installation log, use this information to troubleshoot the installation.

 

Steps for this task

  1. When installing both products on the same machine, install the base product before the Network Deployment product.

    The embedded messaging feature is included in the default installation.

    See also: Installing WAS products in order on the same machine, when installing the embedded messaging component

  2. To manage multiple base product Application Servers, install the Network Deployment product.

    The WAS, V5.x product does not provide centralized management of multiple servers. The WAS Network Deployment, V5.x product provides this function.

    The deployment manager process (dmgr) in the Network Deployment product manages a cell of Application Servers. You can federate a base Application Server into the cell or unfederate it. While federated, the configuration of the base Application Server is managed from the deployment manager.

  3. Use the First Steps tool to run the installation verification test (IVT). Select Verify installation. Check...

    install_root/logs/ivt.log

    ...for a summary of test results. Correct any errors and retry.

    The installation wizard automatically starts the First Steps tool at the end of installation.

  4. Check the installation log files for errors after installing:

    During installation, a single entry in...

    install_root/logs/log.txt

    ...points to the temporary log file, /tmp/log.txt

    The installation program copies the file from the temporary directory to the install_root/logs/log.txt location at the end of the installation. If the installation fails and the install_root/logs/log.txt file contains only this one entry pointing to the temporary directory, open the log.txt file in the temporary directory for clues to the installation failure.

    Uninstalling creates...

    install_root/logs/uninstlog.txt

    The following table shows the installation log locations when installing the application clients for V5.1. The location is fixed and might not agree with the installation root location if you specified a location during installation.

    Installation log locations when installing the WAS Application Clients

    /opt/WebSphere/AppClient/logs/WAS.Client.install.log


    Installation log locations when installing WAS

    IBM HTTP Server                          ihs_log.txt
    Embedded messaging installation log      mq_install.log
    Embedded messaging configuration log     createMQ.node.server1.log
    Default application                      installDefaultApplication.log
    Samples Gallery                          installSamples.log
    Administrative console                   installAdminConsole.log
    Migration tools                          WASPostUpgrade.log


    Installation log locations when installing Network Deployment

    Embedded messaging prerequisites error log     mq_prereq.log
    Network Deployment                             log.txt
                                                   uninstlog.txt
    Embedded messaging installation log            mq_install.log
    Administrative console                         installAdminConsole.log
    File transfer                                  installFiletransfer.log
    Migration tools                                WASPostUpgrade.log

    [V5.1 and later] If the installation program cannot locate the WAS_update directory from disk 2, the installation program fails immediately. The immediate failure leaves the WAS.WBISF.install.log file only in the /tmp directory.

    Normally, the installation program creates installation logs in the install_root/logs directory. In some instances, such as the one where the WAS_update directory does not exist, the installation program fails before it can copy the log files from the system temporary directory to the install_root/logs directory. If you cannot find the installation log files in the install_root/logs directory, look in the system temporary directory.
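
    The log lookup described in this step can be sketched as a pair of shell functions. This is only an illustration: the function names are hypothetical, and treating a log.txt file of one line or less as "pointer only" is an assumption drawn from the description above, not a documented rule.

```shell
# Print the path of the installation log that should be inspected.
# $1: installation root (install_root in the text)
# $2: temporary directory (optional, defaults to /tmp)
find_install_log() (
    main_log="$1/logs/log.txt"
    tmp_log="${2:-/tmp}/log.txt"
    lines=0
    if [ -f "$main_log" ]; then
        lines=$(wc -l < "$main_log")
    fi
    # Assumption: a log of one line or less holds only the pointer to
    # the temporary log, which means the final copy never happened.
    if [ $((lines)) -gt 1 ]; then
        printf '%s\n' "$main_log"
    elif [ -f "$tmp_log" ]; then
        printf '%s\n' "$tmp_log"
    else
        exit 1
    fi
)

# Succeed if the given log contains the completion line quoted in the Overview.
# $1: path to an installation log
install_completed() {
    grep -F -q "INSTFIN: The installation is complete." "$1"
}
```

    For example, install_completed "$(find_install_log /opt/WebSphere/AppServer)" succeeds only when the chosen log records a completed installation. The /opt/WebSphere/AppServer path is a placeholder for your own installation root.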

  5. See if there are files in the install_root/classes directory. When IBM Support queues work for you and provides you with test or debugging fixes, place the fixes in the install_root/classes directory. By default, the install_root/classes directory is picked up first in the WAS class path to let it override other classes.

    This directory lets you verify or debug a fix. After accepting the test fix or finishing with the debugging of the debug fix, you are supposed to delete the fix from the install_root/classes directory to return the system to a working state. If you do not remove such fixes from the install_root/classes directory, you can experience errors.
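
    A quick way to see whether any fixes were left behind is to list the files in that directory. The sketch below assumes a hypothetical helper name, list_leftover_fixes, and takes the installation root as its only argument.

```shell
# List every file under install_root/classes. Print nothing if the
# directory is absent or empty.
# $1: installation root (install_root in the text)
list_leftover_fixes() {
    if [ -d "$1/classes" ]; then
        find "$1/classes" -type f
    fi
}
```

    Any path that the function prints is a candidate for removal once the test or debug fix is no longer needed.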

  6. Turn on tracing if the installation logs do not contain enough information to determine the cause of the problem.

    • Report the stdout and stderr logs to the console window, by adding the -is:javaconsole parameter to the install command:

      install -is:javaconsole

      ...or capture the stream to a file with the following command...

      install -is:javaconsole > captureFileName.txt 2>&1

    • Capture additional information to a log of your choice with the -is:log filename option.

    • Turn on additional installation logging by passing...

      -W Setup.product.install.logAllEvents="true"

      ...to the install command...

      install -W Setup.product.install.logAllEvents="true"

      If you install on an AIX 5.1 system with maintenance level 2, the Web-based System Manager, a standard component of AIX systems, might already use port 9090. When you start the server, a message reports that port 9090 is already in use. To resolve the port conflict, change the port assignment for the HTTP_TRANSPORT_ADMIN port in the server.xml file. The file path is:

      /usr/websphere/appserver/config/cells/cell/nodes/node/servers/server1/server.xml
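
    Before starting the server, you can check whether something is already listening on port 9090. The following sketch filters netstat -an output. It assumes the common column layout in which the local address is the fourth field and the state is the last field, and the function name port_is_listening is hypothetical.

```shell
# Read `netstat -an` output on standard input and succeed if a TCP
# listener is bound to the given port.
# $1: port number, for example 9090
port_is_listening() {
    awk -v port="$1" '
        # Local addresses end in ".9090" on some systems and ":9090" on others.
        $NF == "LISTEN" && $4 ~ ("[.:]" port "$") { found = 1 }
        END { exit found ? 0 : 1 }
    '
}
```

    A typical call is netstat -an | port_is_listening 9090 && echo "port 9090 is already in use".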

  7. Use the First Steps tool or the command line method to start the Application Server.

    To start the server from the command line:

    startServer.sh server1

    If you enable security, specify the -user and the -password parameters of the command.

  8. Verify whether the server starts and loads properly, by looking for a running Java process and the Open for e-business message in the SystemOut.log and SystemErr.log files.

    If no Java process exists or if the message does not appear, examine the same logs for any miscellaneous errors. Correct any errors and retry. You can find the SystemOut.log and SystemErr.log files in...

    install_root/logs/server1
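
    Both checks in this step can be scripted. The sketch below uses hypothetical function names. The process check is deliberately coarse, because it matches any process whose command line contains "java".

```shell
# Succeed if the given log file contains the startup message.
# The match ignores case because the message is quoted with
# different capitalization in different places.
# $1: path to a log file such as install_root/logs/server1/SystemOut.log
server_is_open() {
    grep -i -q "open for e-business" "$1"
}

# Succeed if at least one process with "java" in its command line is running.
# The [j] bracket keeps grep from matching its own command line.
java_process_running() {
    ps -ef | grep "[j]ava" > /dev/null
}
```

    For example, server_is_open install_root/logs/server1/SystemOut.log && java_process_running reports success only when both conditions hold.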

  9. Use the First Steps tool or the command line method to stop the Application Server, if it is running, and to start the deployment manager.

    To stop the server from the command line:

    stopServer.sh server1

    If you enable security, specify the -user and the -password parameters of the command.

    To start the deployment manager from the command line:

    startManager.sh

  10. Verify that the server starts and loads properly by looking for a running Java process and the...

    Server dmgr open for e-business

    ...message in the dmgr_stdout.log and dmgr_stderr.log files. If no Java process exists or if the message does not appear, examine the same logs for any miscellaneous errors. Correct any errors and retry.

  11. Refer to the plug-in configuration documentation, if you have installed plug-ins and the Web server does not come up properly.

  12. Start the Snoop servlet.

    In a Network Deployment environment, the Snoop servlet is available in the cell only if you included the DefaultApplication when adding the Application Server to the cell. The -includeapps option for the addNode command migrates the DefaultApplication to the cell. If the application is not present, skip this step.

    1. Start the Application Server.

    2. Start the IBM HTTP Server or the Web server that you are using. Use a command window to change the directory to the IBM HTTP Server installed image, or to the installed image of your Web server. Issue the appropriate command to start the Web server, such as these commands for IBM HTTP Server:

      To start the IBM HTTP Server from the command line:

      Access the apache and apachectl commands in the IBMHttpServer/bin directory.

      ./apachectl start

    3. Point your browser to http://localhost:9090/snoop to test the internal HTTP transport provided by the Application Server. Point your browser to http://localhost/snoop to test the Web server plug-in.

    4. Verify that Snoop is running.

      Either Web address should display the Snoop Servlet - Request/Client Information page.
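
    If no browser is available on the machine, the same check can be approximated from a command window. This sketch assumes that the curl command is installed, which the product does not require, and it uses hypothetical function names. It looks only for the page title text quoted above.

```shell
# Read an HTML page on standard input and succeed if it looks like
# the Snoop servlet page.
is_snoop_page() {
    grep -q "Snoop Servlet"
}

# Fetch both Web addresses from this step and report each result.
check_snoop_urls() {
    for url in http://localhost:9090/snoop http://localhost/snoop; do
        if curl -s "$url" | is_snoop_page; then
            echo "OK      $url"
        else
            echo "FAILED  $url"
        fi
    done
}
```

    A FAILED result for only the second address points at the Web server or its plug-in rather than at the Application Server.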

  13. Start the WAS administrative console.

    1. Start the Application Server.

    2. Point your browser to http://localhost:9090/admin.

    3. Type any ID and click OK at the administrative console window.

    The server starts. The administrative console starts. You can access the administrative console through the browser. The administrative console accepts your login.

  14. Federate the base Application Server into the cell.

    To add the base Application Server into the cell:

    install_root/AppServer/bin/addNode.sh localhost 8879

    If you enable security, specify the -user and the -password parameters of the command.

  15. Verify that the Application Server was incorporated into the cell. The command window displays a sequence of messages when you issue the addNode command:

    Begin federation of node xxxx with deployment manager at localhost:8879.
    Successfully connected to deployment manager Server: localhost:8879
    Creating node agent configuration for node: xxxx
    Reading configuration for node agent process: nodeagent
    Adding node xxxx configuration to cell: AdvancedDeploymentCell
    Performing configuration synchronization between node and cell.
    Launching node agent process for node: xxxx
    Node agent launched. Waiting for initialization status.
    Node agent initialization completed successfully. Process ID is: 3012
    Node xxxx has been successfully federated.

    The last message is an indicator of success. A second Java process is running, which is the nodeagent process. The stdout.log file and stderr.log file in the node directory contain a Server node_name open for e-business message.

  16. Resolve any IP address caching problems.

    By default, the Java 2 SDK caches the IP address for the domain name service (DNS) naming lookup. After resolving the host name successfully, the IP address stays in the cache. By default, the cache entry remains forever. This default IP caching mechanism can cause problems, as described in the following problem scenarios.

    Problem scenario 1

    Suppose the Application Server at host1.ibm.com has an initial IP address of 1.2.3.4. When a client at host2.ibm.com conducts a DNS lookup of host1.ibm.com, the client stores the 1.2.3.4 address in the cache. Subsequent DNS name lookups return the cached value, 1.2.3.4. The cached value is not a problem until the host1.ibm.com IP address changes, to 5.6.7.8, for example. The client at host2.ibm.com does not retrieve the current IP address, but always retrieves the previous address from the cache. If this scenario occurs, the client cannot reach host1.ibm.com unless you stop and restart the client process.

    Problem scenario 2

    Suppose the Application Server at host1.ibm.com has an initial IP address of 1.2.4.5. Although the IP address of the Application Server does not change, a network outage can record an exception code as the IP address in the cache, where it remains until the client is restarted on a working network. For example, if the client at host2.ibm.com disconnects from the network because of an unplugged cable, the disconnected lookup of the Application Server at host1.ibm.com fails. The failure causes the IBM Developer Kit to put the special exception code entry into the IP address cache. Subsequent DNS name lookups return the exception code, which is java.net.UnknownHostException.

    IP address caching and WAS process discovery

    If you change the IP address of a federated WAS node, processes running in other nodes cannot contact the changed node until you stop and restart them.

    If a deployment manager process starts on a disconnected node, it cannot communicate with cell member processes until you stop and restart the deployment manager process. For example, plugging in an unplugged network cable does not restore proper addresses in the IP cache until the deployment manager process is restarted.

    Using the IP address cache setting

    You can always stop and restart a deployment manager process to refresh its IP address cache. However, this process might be expensive or inappropriate.

    The networkaddress.cache.ttl (public, JDK1.4) and sun.net.inetaddr.ttl (private, JDK1.3) parameters control IP caching. The value is an integer that specifies the number of seconds to cache IP addresses. The default value, -1, specifies to cache forever. A value of 0 specifies to never cache.

    Using a zero value is not recommended for normal operation. If you do not anticipate network outages or changes in IP addresses, use the cache forever setting. The never caching setting introduces the potential for DNS spoofing attacks.

    For more information about the Java 2 SDK

    The Java 2 SDK, Standard Edition 1.4 Web site describes the private sun.net.inetaddr.ttl property, which works in both Java 2 SDK, Standard Edition 1.3 (WAS V5.0.0, V5.0.1, and V5.0.2) and Java 2 SDK, Standard Edition 1.4. The Networking section of the Java 2 SDK, Standard Edition 1.4 Web site describes a change in the behavior of the java.net.URLConnection class.
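
    As a rough illustration of how the two settings differ: sun.net.inetaddr.ttl is a Java system property, so it can be supplied with -D on the JVM command line, whereas networkaddress.cache.ttl is a security property that is normally read from the java.security file of the JRE. The helper below only builds the -D argument. Its name, inetaddr_ttl_arg, is hypothetical, and where the argument is passed to a WAS process depends on your configuration.

```shell
# Print a JVM argument that sets the IP address cache lifetime.
# $1: lifetime in seconds; -1 caches forever, 0 never caches
inetaddr_ttl_arg() {
    case "$1" in
        -1) ;;
        ''|*[!0-9]*)
            echo "ttl must be -1 or a non-negative integer" >&2
            return 1
            ;;
    esac
    printf '%s\n' "-Dsun.net.inetaddr.ttl=$1"
}
```

    For example, inetaddr_ttl_arg 60 prints -Dsun.net.inetaddr.ttl=60, which asks the JVM to keep cached addresses for 60 seconds. The 60-second value is an arbitrary example, not a recommendation.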

  17. Avoid segmentation faults when installing on Red Hat Enterprise Linux 3.0 U1.

    If you are installing from an operator console attached to the RHEL 3 U1 machine and you receive a message that is similar to the following message, you might be experiencing a known limitation of RHEL 3 U1:

    A suitable JVM could not be found

    Red Hat Enterprise Linux V3.0 has a known limitation that causes a segmentation fault when calling the JVM. The problem occurs when you run the installation program, log off as root, log back on as root, and run the installation again on the operator console that is attached to the machine (not a telnet session). The installation fails with a segmentation fault.

    Test the JVM to see if it is failing by running the following command:

    /mnt/cdrom/platform/linuxi386/jdk/java/jre/bin/java -version

    If you receive a "segmentation fault" message, reboot your machine or press Ctrl-x to reinstall. Rebooting the machine or pressing Ctrl-x resolves the problem.
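
    The JVM test can also be interpreted by exit status instead of by reading the message. The sketch below relies on the shell convention that a process ended by a signal returns 128 plus the signal number, so a segmentation fault (signal 11 on Linux) appears as status 139. The function name jvm_check is hypothetical.

```shell
# Run "<java> -version" and report how the JVM ended.
# $1: path to the java executable to test
jvm_check() (
    rc=0
    "$1" -version > /dev/null 2>&1 || rc=$?
    if [ "$rc" -eq 139 ]; then
        # 139 = 128 + 11, the status the shell reports for SIGSEGV
        echo "segmentation fault"
    elif [ "$rc" -eq 0 ]; then
        echo "ok"
    else
        echo "failed with exit status $rc"
    fi
)
```

    For example, jvm_check /mnt/cdrom/platform/linuxi386/jdk/java/jre/bin/java tests the JVM on the product CD.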

  18. Preserve symbolic links when copying product CDs to network file system (NFS) disks. When copying a CD for one operating system platform to an NFS disk on another type of operating system using the cp command, you can encounter errors such as those in the following example:

    a file is bad no such file or directory exists libCSup.2 cannot be accessed

    The copy errors occur when symbolic link files are copied incorrectly.

    An example of such an error occurs when copying an HP-UX CD image onto an AIX platform with the cp -frp command. The default cp command behavior on AIX is to resolve the symbolic links by copying the files to which the symbolic links point. Errors occur when a symbolic link resolves to a platform-specific library or file that is not present on the NFS operating system.

    Use options on the copy command of the NFS system to copy symbolic links instead of resolving them. For example, the -h option on the cp command of the AIX platform copies symbolic links from the HP-UX CD to the NFS disk on the AIX platform.

    Even with the -h option, the cp command on a Solaris platform does not preserve symbolic links when copying an HP-UX disk. On a Solaris platform, use the tar -cvf command to copy data from an HP-UX disk and preserve the symbolic links:

    1. Insert disk 2 for HP-UX platforms into the drive on the Solaris system.

    2. Close the file explorer window if one opens.

    3. Open a command window.

    4. Change directories:

      cd /cdrom/cdrom0

    5. Issue the following command:

      tar -cvf /workarea/filename.tar *

    6. Change directories:

      cd /workarea

    7. Issue the following command:

      tar -xvf filename.tar

    Consult the man page for the copy command on the NFS system to understand how the platform supports copying symbolic links.
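
    A common variation on the tar procedure above skips the intermediate archive file by piping one tar process into another. This is a different technique from the documented steps, shown only as a sketch. The function name copy_tree_keep_links is hypothetical, and the behavior should be confirmed against the tar man page on your platform.

```shell
# Copy a directory tree, keeping symbolic links as links instead of
# copying the files they point to.
# $1: source directory, for example /cdrom/cdrom0
# $2: destination directory, created if it does not exist
copy_tree_keep_links() {
    mkdir -p "$2" || return 1
    # The first tar writes the archive to standard output (-f -);
    # the second reads it from standard input and unpacks it.
    (cd "$1" && tar -cf - .) | (cd "$2" && tar -xf -)
}
```

    For example, copy_tree_keep_links /cdrom/cdrom0 /workarea/cdimage copies the CD contents into a hypothetical /workarea/cdimage directory.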

  19. Pick up secondary user groups for root before installing the embedded messaging feature.

    On many systems, such as SuSE Linux, if you log in through telnet and issue the id command or the groups command, you cannot see the groups mqm or mqbrkrs even though they might exist. Solve this problem in one of two ways:

    • Use the ssh command to log in

    • Issue the su - command

    After using one of the commands, verify the required groups with the id command or the groups command.

    In a normal root login, issue the su command. For a real root login, issue the su - command.

    Display settings for a normal root login are automatic. For a real root login, set your display environment properly to successfully view the GUI installation wizard. Otherwise, you see a message about Preparing Java(tm) Virtual Machine... and seven rows of dots, but no installation GUI and no further messages. Refer to the documentation for your platform to determine proper display settings.

    If you see the following messages in the SystemOut.log file, you have not picked up the required secondary groups for root:

    [date time CDT] 60cf2faf JMSRegistrati 
       A MSGS0601I: WebSphere Embedded Messaging has not been installed 
    [date time CDT] 60cf2faf JMSEmbeddedPr A MSGS0050I: 
       Starting the Queue Manager 
    [date time CDT] 60cf2faf JMSEmbeddedPr E MSGS0058E: 
       Unable to start the JMS Server as WebSphere Embedded Messaging has not been installed 
    [date time CDT] 60cf2faf JMSService    E MSGS0001E: 
       Starting the JMS Server failed with exception: 
       java.lang.Exception: MSGS0058E: 
          Unable to start the JMS Server as 
            WebSphere Embedded Messaging has not been installed

    Also, the following associated messages are added to the mq_install.log file:

    wmsetup: date time Checking if user "root" is in group "mqm"
    wmsetup: date time
    wmsetup: date time ERROR: Group "mqm" exists, id "root" is 
         defined to the group but does not
    wmsetup: date time have the group in its current set of effective groups.
    wmsetup: date time Current group membership is :
    wmsetup: date time uid=0(root) gid=0(system) groups=2(bin)
    wmsetup: date time You may need to login.
    wmsetup: date time
    wmsetup: date time ... RC 4 from Check_root
    wmsetup: date time ERROR: User "root" not in group "mqm"
    wmsetup: date time Check_root mqbrkrs
    wmsetup: date time Checking for group "mqbrkrs" ...
    wmsetup: date time lsgroup returned "mqbrkrs id=203 
         admin=false users=root adms=root registry=files " RC=0
    wmsetup: date time Checking if user "root" is in group "mqbrkrs"
    wmsetup: date time
    wmsetup: date time ERROR: Group "mqbrkrs" exists, id "root" is 
         defined to the group but does not
    wmsetup: date time have the group in its current set of effective groups.
    wmsetup: date time Current group membership is :
    wmsetup: date time uid=0(root) gid=0(system) groups=2(bin)
    wmsetup: date time You may need to login.
    wmsetup: date time
    wmsetup: date time ... RC 4 from Check_root
    wmsetup: date time ERROR: User "root" not in group "mqbrkrs"
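
    The group check described in this step can be run before starting the installation. The sketch below takes a list of group names, such as the output of the id -Gn command, and prints whichever of the two required groups is absent from it. The function name missing_mq_groups is hypothetical.

```shell
# Print each required group that is absent from a group list.
# $1: space-separated group names, for example the output of `id -Gn`
missing_mq_groups() (
    for g in mqm mqbrkrs; do
        # Pad with spaces so that only whole names match.
        case " $1 " in
            *" $g "*) ;;
            *) echo "$g" ;;
        esac
    done
)
```

    Running missing_mq_groups "$(id -Gn)" as root prints nothing when both groups are in the effective group set of the current login.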

  20. Do not use the installation verification test (IVT) from the First Steps program or by running the ivt.sh command if the node name contains double-byte characters. The installation verification test is not supported on node names that contain double-byte characters. You can receive an error that is similar to the following example:

    java.lang.reflect.InvocationTargetException
           at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:79)
           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:41)
           at java.lang.reflect.Method.invoke(Method.java:386)
           at com.ibm.ws.bootstrap.WSLauncher.main(WSLauncher.java:94)
    Caused by: java.lang.NumberFormatException: For input string: ""
           at java.lang.NumberFormatException.forInputString(NumberFormatException.java:62)
           at java.lang.Integer.parseInt(Integer.java:469)
           at java.lang.Integer.parseInt(Integer.java:498)
           at com.ibm.websphere.ivt.client.ivtClient.main(Unknown Source)
           ...

  21. Starting the Launchpad program for WAS clients, V5.1 using the Konqueror file manager in the K Desktop Environment (KDE) on SuSE Linux Enterprise Server (SLES) causes an error.

    When starting the Launchpad program for WAS clients, Version 5.1 using the Konqueror file manager in the K Desktop Environment (KDE) on Linux systems, a "Couldn't find the program launchpad.sh" error occurs.

    The launchpad.sh command uses a relative path to locate the Java program. When you use the Konqueror file manager to issue the launchpad.sh command, the current directory is your home directory, so the relative path does not resolve and the command fails. To avoid the error, run the launchpad.sh command from the directory where the launchpad.sh command is located for the client program.
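
    One way to run a script from its own directory regardless of where you start is a small wrapper like the following. The function name run_from_own_dir is hypothetical, and the sketch assumes the script is executable.

```shell
# Change to the directory that contains a script, then run it there.
# The body runs in a subshell, so the caller's current directory
# is left unchanged.
# $1: path to the script, for example /mnt/cdrom/launchpad.sh
run_from_own_dir() (
    script_dir=$(dirname "$1") || exit 1
    cd "$script_dir" || exit 1
    "./$(basename "$1")"
)
```

    The /mnt/cdrom/launchpad.sh path in the comment is a placeholder. Use the actual location of launchpad.sh for the client program.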

 

Results

You can troubleshoot the installation.

 

What to do next

The troubleshooting section of the information center, as described in Troubleshooting or problem determination, contains more detailed debugging and reporting instructions. See Installation component troubleshooting tips for more information about troubleshooting the installation.

See IBM Support for current information on known problems and their resolution.

IBM Support has documents that can save you time gathering information needed to resolve this problem. Before opening a PMR, see the IBM Support page.


Related tasks
Troubleshooting or problem determination
Related reference
addNode command
removeNode command
startNode command
stopNode command
startServer command
serverStatus command
stopServer command
startManager command
stopManager command