
Monday, March 23, 2015

Node Manager Errors



While configuring Node Manager, we often run into different kinds of issues. In this post I will try to summarize the issues I have faced and their solutions.


Problem1:


For server WLS_WSM3, the Node Manager associated with machine sca3-prd is not reachable.

All of the servers selected are currently in a state which is incompatible with this operation or are not associated with a running Node Manager or you are not authorized to perform the action requested. No action will be performed.



In the environment where I hit this problem, Node Manager was started through the wlst.sh script rather than the standard way, so no nohup log or Node Manager log was being generated.

No log messages were written to the nodemanager.log file when I issued the WLS server start command from the console.

However, after enabling the debug flag for the machine that had the problem, I was able to see some errors in the AdminServer.out file.
 


 

Once I enabled the debug option, I was able to see messages like the following in AdminServer.out:

DEBUG: ShellClient: Executing shell command: ssh -o PasswordAuthentication=no sca13-prd wlscontrol.sh -d aio_prd_domain   -s \'WLS_BAM11\' STAT
DEBUG: ShellClient: STDERR: Host key verification failed.



<Mar 20, 2015 12:26:09 PM EST> <Error> <NodeManager> <BEA-300033> <Could not execute command "getVersion" on the node manager. Reason: "Host key verification failed.".>
DEBUG: ShellClient: Executing shell command: ssh -o PasswordAuthentication=no sca13-prd wlscontrol.sh -d aio_prd_domain   -s \'WLS_BAM11\' STAT
DEBUG: ShellClient: STDERR: Host key verification failed.

 

When the Admin Server tried to connect to the sca13-prd server, it failed to verify the host using ssh keys.

This blog helps to fix this problem 

Just access the "sca13-prd" server manually from the source server using the command "ssh -o PasswordAuthentication=no sca13-prd". It will ask whether to store the remote server's ssh host key in the known_hosts file located in the /home/oracle/.ssh folder.
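To check the result without going back to the console, the Admin Server's non-interactive call can be imitated from the shell. This is only a sketch: probe_host is a hypothetical helper name, and BatchMode=yes is my addition (it does not appear in the logged command) so that ssh fails instead of prompting.

```shell
# Hypothetical helper: try a prompt-free ssh call to the host given as $1,
# similar to what the Admin Server's ShellClient does.
probe_host() {
    # BatchMode=yes (an addition, not in the logged command) disables prompts,
    # so an unknown host key makes ssh fail immediately instead of asking.
    if probe_out=$(ssh -o PasswordAuthentication=no -o BatchMode=yes "$1" true 2>&1); then
        echo "OK: $1 accepts non-interactive ssh"
    else
        probe_rc=$?
        # Show ssh's exit code and its error text for comparison with AdminServer.out
        echo "FAIL($probe_rc): $probe_out"
        return "$probe_rc"
    fi
}
```

Run as the oracle user on the Admin Server host, for example "probe_host sca13-prd". Before the host key has been accepted, it should report the same "Host key verification failed." text seen in AdminServer.out.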




After doing this, I tried again but was hit with another error message:

DEBUG: ShellClient: Executing shell command: ssh -o PasswordAuthentication=no sca3-prd wlscontrol.sh -d aio_prd_domain     VERSION
DEBUG: ShellClient: STDERR: Permission denied (publickey,gssapi-with-mic,password).
 

This URL has detailed steps which helped me fix the above error message -
 
The above error occurred because the remote server's public ssh key had not been added to the “authorized_keys” file in the /home/oracle/.ssh folder on the source server, or vice versa.

We need to generate a new ssh key pair on both the remote server and the source server, and add each server's public key to the other's “authorized_keys” file.

-bash-3.2$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/home/oracle/.ssh/id_dsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/oracle/.ssh/id_dsa.
Your public key has been saved in /home/oracle/.ssh/id_dsa.pub.
The key fingerprint is:
21:c0:fe:bc:90:53:3d:1e:d4:aa:68:5e:42:f5:9b:20 oracle@AIO-EXL-H105.corp.asciano.ad
-bash-3.2$


Once the ssh key pair is generated, two files are created: id_dsa and id_dsa.pub.
 
 

Then add the remote server's “id_dsa.pub” key value to the “authorized_keys” file located in the /home/oracle/.ssh folder on the source server.

Also add the "id_dsa.pub" value from the source system to the “authorized_keys” file located in the /home/xxxx/.ssh folder on the remote server.

After that, the permission error above goes away.
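The key exchange described above can be sketched as a small helper. This is an illustration under assumptions, not the exact steps I ran: add_authorized_key is a hypothetical function name, and it expects the other server's id_dsa.pub to have already been copied over to a local file. It also sets the 700/600 permissions that sshd normally expects on the .ssh folder and the key file.

```shell
# Hypothetical helper: append a public key to an authorized_keys file once.
# $1 = local copy of the other server's id_dsa.pub
# $2 = authorized_keys file to update (e.g. /home/oracle/.ssh/authorized_keys)
add_authorized_key() {
    key_dir=$(dirname "$2")
    # Create the .ssh folder if needed and restrict it to the owner
    mkdir -p "$key_dir" && chmod 700 "$key_dir" || return 1
    touch "$2" && chmod 600 "$2" || return 1
    # -x: match whole lines, -F: fixed strings, -f: read the key from the file.
    # Only append when the key is not already present.
    if ! grep -qxFf "$1" "$2"; then
        cat "$1" >> "$2"
    fi
}
```

Running it twice with the same key leaves a single entry in the file. Note that newer OpenSSH releases disable DSA keys by default, so on a current system an RSA or Ed25519 key pair would be the likelier choice.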

After making these changes, I tried restarting the WLS_SCA13 managed server again but ran into yet another issue:

DEBUG: ShellClient: Executing shell command: ssh -o PasswordAuthentication=no sca3-prd wlscontrol.sh -d aio_prd_domain     VERSION
DEBUG: ShellClient: STDERR: bash: wlscontrol.sh: command not found
<Mar 23, 2015 12:02:57 PM EST> <Error> <NodeManager> <BEA-300033> <Could not execute command "getVersion" on the node manager. Reason: "bash: wlscontrol.sh: command not found".>
 

 

The above error occurred because the WL_HOME path was not set in the oracle user's profile for the bash shell.

Run “ps -ef | grep $$ | grep -v grep” or “echo $SHELL” to check which shell is in use.

 






Add the WL_HOME path to the /home/oracle/.bashrc file.

Then reload the oracle user's profile.
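A minimal sketch of that profile change follows. The function name ensure_wl_home is hypothetical, and the PATH line assumes wlscontrol.sh sits in $WL_HOME/common/bin, which may differ between WebLogic versions, so check where the script lives in your install first.

```shell
# Hypothetical helper: make sure a profile file exports WL_HOME and puts
# the wlscontrol.sh directory on PATH.
# $1 = profile file (e.g. /home/oracle/.bashrc), $2 = value for WL_HOME
ensure_wl_home() {
    # Already configured: leave the file alone
    grep -q '^export WL_HOME=' "$1" 2>/dev/null && return 0
    {
        echo "export WL_HOME=$2"
        # Assumption: wlscontrol.sh lives in $WL_HOME/common/bin
        echo 'export PATH=$WL_HOME/common/bin:$PATH'
    } >> "$1"
}
```

After calling it with your real WL_HOME value, reload the profile with ". /home/oracle/.bashrc". Something like "ssh sca3-prd 'command -v wlscontrol.sh'" from the Admin Server host should then print the script's path if the non-interactive shell picks up the change.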


 

By following the steps above, I was able to fix the issue described in Problem1 and start the WLS_SCA13 server from the admin console.
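As a quick recap, the three STDERR messages from this post can be mapped to their fixes with a small lookup. This is just a sketch: suggest_fix is a hypothetical helper, and it only knows the three messages I ran into here.

```shell
# Hypothetical helper: map a ShellClient STDERR line from AdminServer.out
# to the fix described in this post.
suggest_fix() {
    case "$1" in
        *"Host key verification failed"*)
            echo "Accept the remote host key into known_hosts" ;;
        *"Permission denied (publickey"*)
            echo "Exchange public keys via authorized_keys" ;;
        *"wlscontrol.sh: command not found"*)
            echo "Add the WL_HOME path to the oracle user's .bashrc" ;;
        *)
            # Anything else is outside the cases covered in this post
            echo "Unrecognized error" ;;
    esac
}
```

For example, passing it the line "DEBUG: ShellClient: STDERR: Host key verification failed." points back to the known_hosts step.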
