No — no firewalls, no SELINUX.
This just started today. I can use the manage service interface to kick off various aspects of the installation process, and the pop-up dialog shows that what I chose to do has started, but the dialog showing the various progress bars NEVER refreshes. It seems like completed results are never communicated back.
I was trying to move off of using localhost.localdomain and used the hosts file to declare a FQDN. I reinstalled puppet and hmc to get a new pem file, but now there is no communication between what I kick off and the web page that is supposed to report problems.
Any suggestions about where to look?
The topic ‘hmc console not reporting service results’ is closed to new replies.
If you did everything correctly, HMC installs MySQL for you.
Any firewalling/SELinux running?
The mysql that I am using was the one that Hortonworks obtained during setup. I did not do any special provisioning or account setup. Hortonworks implied that this was all done automatically behind the scenes.
Did I get the wrong impression?
Specifically, what extra things do I need to do to supply the added permissions? Is there a recommended section in the Mysql docs to read? What permissions must be added manually?
This error message means that the MySQL user does not have the needed permissions.
Please check the documentation on setting up the MySQL account.
Last posting for me today: I moused over the Description cell and the full error message pops up:
CRITICAL: Error accessing hive-metaserver status [Exception in thread "main" java.io.IOException: Permission denied]
Maybe this gives support something to hang on to in their investigation? What permission is missing?
I am now past the cluster startup — I think my recent errors were because mysqld was in fact not running. I made sure mysql was installed and mysqld running when I did another cluster install. This worked.
BUT I still see the HIVE-METASTORE status check critical error and no amount of refreshing the Monitoring console makes this error disappear. So apparently the problem wasn’t the MySQL host entry after all.
As root, I was able to enter the grant instructions Sasha mentioned earlier. For good measure I did one each for localhost, jjscentos64 (my hosts file host name) and jjscentos64.local (the FQDN). But these don’t affect the error message either.
One odd thing is that the value in the Duration column is: 0day 7hr 34min but the cluster hasn’t been up that long. Could this be stale data that the console is bringing in?
So I am kind of at the same spot as I was in when today began — *sigh*
Once again I started over — I have a high tolerance for installer pain.
I was worried that I may have a bad version of mysql installed so I did a yum erase of it as well.
When I did the setup cluster sequence it got to the same failure point as before (Hive/HCatalog test) with what appears to be the same error:
Fri Jul 27 16:00:16 -0400 2012 /Stage/Hdp-hive::Hive::Service_check/Exec[/tmp/hiveSmoke.sh]/returns (notice): Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided)
But this time it appears that mysql wasn’t even installed at all. I thought that the cluster install would install it if it was not found. mysqld is an unknown service and mysql is not in the root user’s path.
Is this what should have happened?
Still looking for answers.
And just for kicks, I logged into mysql as my admin user set up during the install process and I can get the mysql prompt. I tried:
mysql> grant all privileges on *.* to 'HCAT_USER'@'jjscentos64.local' identified by 'HCAT_PW';
ERROR 1045 (28000): Access denied for user 'hdwDBadmin'@'localhost' (using password: YES)
So that does not seem to be an answer for my dilemma either.
Further info: in the /var/log/hive file I see messages like this:
** BEGIN NESTED EXCEPTION **
MESSAGE: Connection refused
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.security.AccessController.doPrivileged(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
** END NESTED EXCEPTION **
Does this help diagnose my misconfiguration?
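For anyone hitting the same trace: "Connection refused" generally means nothing accepted the TCP connection at the target host and port, which is a different failure from a permissions error. Here is a minimal Python sketch for probing that, assuming the refused connection is the metastore trying to reach MySQL on its default port 3306 (the trace itself does not name the port). `port_is_open` is a hypothetical helper name, not part of HMC or Hive.

```python
import socket


def port_is_open(host, port=3306, timeout=3.0):
    """Return True if a TCP connection to host:port is accepted.

    Any socket-level failure (refused, timed out, name not resolvable)
    is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Host names taken from this thread; 3306 is assumed, not confirmed.
    for name in ("localhost", "jjscentos64.local"):
        state = "open" if port_is_open(name) else "not reachable"
        print(f"{name}:3306 is {state}")
```

If the port is not reachable under either name, mysqld is probably not running or not listening on TCP, and grants are not the first thing to look at.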
I did some checking after the failed cluster install with mysqld running and it turns out that this command works:
mysql -h localhost -u hdwDBadmin -p
But this command does NOT:
mysql -h jjscentos64.local -u hdwDBadmin -p
The hostname jjscentos64.local is the FQDN for the experimental setup. I am told that:
ERROR 1130 (00000): Host 'jjscentos64.local' is not allowed to connect to this MySQL server
Does this help with a work-around? I do NOT want the critical Hive metadata error to be reported AND I want to install a cluster without using localhost.localdomain as my FQDN.
Maybe this is impossible for the current release?
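ERROR 1130 is about which client host an account may connect from: MySQL accounts are user@host pairs, so a grant for 'user'@'localhost' does not cover the same user arriving as jjscentos64.local. A small Python sketch that prints one GRANT per host name follows. It assumes a MySQL 5.x server where `GRANT ... IDENTIFIED BY` is still accepted, and it reuses the HCAT_USER / HCAT_PW placeholders from this thread. `grant_statements` is a hypothetical helper, and the statements still have to be run by an account that itself holds the GRANT privilege.

```python
def grant_statements(user, password, hosts):
    """Build one GRANT statement per client host name.

    The user and password values are placeholders; nothing is sent
    to a server here, the statements are only generated as text.
    """
    template = (
        "GRANT ALL PRIVILEGES ON *.* "
        "TO '{user}'@'{host}' IDENTIFIED BY '{password}';"
    )
    return [
        template.format(user=user, host=host, password=password)
        for host in hosts
    ]


if __name__ == "__main__":
    # Host names as used in this thread.
    hosts = ["localhost", "jjscentos64", "jjscentos64.local"]
    for statement in grant_statements("HCAT_USER", "HCAT_PW", hosts):
        print(statement)
```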
It looks like if I leave the mysql host name undefined, the setup process uses the name localhost. Then Hive installs BUT then the critical error I reported earlier in this discussion happens. If I try to use the FQDN (or the hostname by itself) for the mysql host name, then the cluster install fails. Could it be that there is a bug in the initialization here — I shouldn’t have to muck with mysql grant statements.
Stuck for now.
I did replace the mysql host with both the FQDN and the assigned hostname. I then reinstalled the cluster and the install FAILS during the Hive test. I have the log file but http://ftp.support.hortonworks.com appears to be unreachable for me right now so I can’t upload the log file from the Hive test failure.
When I ping the FQDN I do NOT get 127.0.0.1 — I get the address I have in my hosts file: 192.168.150.130.
This is getting very weird!
Can you confirm if the ftp site is up?
If it returns 127.0.0.1, then you are resolving to localhost, but the metastore has put in a grant for your FQDN.
You can simply add a grant in the MySQL server to fix this:
grant all privileges on *.* to 'HCAT_USER'@'localhost' identified by 'HCAT_PW';
What is the IP when you ping your HMC server from the HMC server using the FQDN?
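The ping check can also be done without ping. A minimal Python sketch, assuming all you want to know is whether the name resolves to a loopback address such as 127.0.0.1; `resolves_to_loopback` is a hypothetical helper name, and the `resolver` parameter exists only so the lookup can be swapped out.

```python
import ipaddress
import socket


def resolves_to_loopback(name, resolver=socket.gethostbyname):
    """Resolve name to an IPv4 address and report whether it is loopback."""
    address = resolver(name)
    return ipaddress.ip_address(address).is_loopback


if __name__ == "__main__":
    # Checks this machine's own FQDN as Python sees it.
    fqdn = socket.getfqdn()
    try:
        print(fqdn, "resolves to loopback:", resolves_to_loopback(fqdn))
    except OSError as exc:
        print(fqdn, "could not be resolved:", exc)
```

A True result matches the 127.0.0.1 case described here; False means the name maps to a real interface address.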
Could this issue be due to the choice of Hive hostname? I left this blank and it appears that localhost is then used as the name. Since every other part of the hadoop config is now using the new FQDN, perhaps I should manually enter that name for the mysql host.
I am trying that now — will report back when the cluster installation completes.
I ftp’d the file just now — named hive-fail.txt. Hopefully support can see it and make some sense of it.
After seeing the error, I tried to stop Hive and this also required stopping Templeton. Templeton stopped OK, but after a bit of waiting, the Hive stop FAILed. Despite the failure, I was able to start Hive and Templeton successfully BUT the same critical error was reported in the hmc console.
I will upload the operations log documenting the Hive stop failure tomorrow to the Hortonworks website.
Not sure I know what you are asking. The HMC host is the ONLY host — one VM to rule them all.
I have hostname: jjscentos64 and I have set the domain name as local so the FQDN is jjscentos64.local.
This seems to be working for the most part.
BUT… (there is always a but)
I am seeing a new critical error related to HIVE-METASTORE that has appeared several times now.
The alert name is: HIVE-METASTORE status check
Its status is CRIT
The description is: CRITICAL: Error accessing hive-metaserver status [Exception in thread "main"]
This error did not seem to occur when using localhost.localdomain for the VM.
I see no other errors. I have not suspended the VM — this is just running after the installation.
Are you certain you are able to resolve all the hosts from the HMC host,
and that you are able to resolve the HMC host from all the hosts?
I was connected, but the dialog reporting progress after each step was not refreshing in Firefox.
But then I noticed that the results of hostname and hostname -f were different. I had added an entry in /etc/hosts but I forgot about the hostname property in /etc/sysconfig/network which was not set to the FQDN. It was set to the hostname only (e.g. jjscentos64). Once I changed to the FQDN, the browser behaved as it did before.
Learning more about networking every day as I go through this exercise.
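In case it saves someone else the same hunt, the hostname vs. hostname -f comparison can be scripted. This is only a rough Python sketch: `socket.gethostname()` returns the configured host name, while `socket.getfqdn()` is an approximation of what `hostname -f` reports and may differ on some systems. `hostname_matches_fqdn` is a hypothetical helper name.

```python
import socket


def hostname_matches_fqdn(short_name, fqdn):
    """Return True when the configured host name already equals the FQDN.

    The comparison ignores case and surrounding whitespace.
    """
    return short_name.strip().lower() == fqdn.strip().lower()


if __name__ == "__main__":
    short_name = socket.gethostname()  # value the `hostname` command prints
    fqdn = socket.getfqdn()            # approximation of `hostname -f`
    if hostname_matches_fqdn(short_name, fqdn):
        print("hostname and FQDN agree:", fqdn)
    else:
        print("mismatch:", short_name, "vs", fqdn)
```

A mismatch is the situation I had before fixing the hostname property in /etc/sysconfig/network.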