Failed to get host-alive check for host

Posted: Thu Jun 04, 2020 10:12 am
by plawer
I am fairly new to NEMS (and Linux in general). Today I updated my Pi from 1.5.0 to 1.5.2, having set up the initial server about a month ago. I restored my backup.nems to the 1.5.2 installation, and it looks like the previous configuration imported correctly. I did manage to mess up the old SD card, so I no longer have the 1.5.0 installation.

When I try to add a new host now, I get an error when I go to "Generate Nagios config":
[INFO] Starting generate_config script
[INFO] Generating global config files
[INFO] Generating config for Nagios-collector 'Default Nagios'
[ERROR] Failed to get host-alive check for host 'XXXXXX'. Make sure the host is linked with a host-preset. Aborting.
It should be noted that the host in question is one of the existing hosts migrated from the old configuration.

And the additional info reads
Nagios Core 4.4.3
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2019-01-15
License: GPL

Website: https://www.nagios.org
Reading configuration data...
Error: Cannot open main configuration file '/var/www/html/nconf/temp/test/Default_collector.cfg' for reading!
I checked over SSH: the directory /var/www/html/nconf/temp/ exists, but /var/www/html/nconf/temp/test/ does not.
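
The same check can be scripted. This is only a minimal sketch, assuming Python 3 is available on the server; the helper name path_states is made up for illustration, and the two paths are the ones from the error output above.

```python
from pathlib import Path


def path_states(paths):
    """Map each path to 'dir', 'file', or 'missing' (hypothetical helper)."""
    states = {}
    for raw in paths:
        path = Path(raw)
        if path.is_dir():
            states[raw] = "dir"
        elif path.is_file():
            states[raw] = "file"
        else:
            states[raw] = "missing"
    return states


if __name__ == "__main__":
    # Paths taken from the Nagios error message in this post.
    to_check = [
        "/var/www/html/nconf/temp/",
        "/var/www/html/nconf/temp/test/",
        "/var/www/html/nconf/temp/test/Default_collector.cfg",
    ]
    for raw, state in path_states(to_check).items():
        print(f"{state:8} {raw}")
```

The missing test directory is probably a symptom rather than the cause: the generate script aborts on the host-alive error first, so Nagios later finds no file to read.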

When I searched for the error online I came across a similar error message mentioned in the release notes for version 1.2.2.
May 26, 2017 - NEMS Linux Migrator updated to fix bug in host presets. Was causing these two errors: “[ERROR] Failed to get host-alive check for host ‘NEMS Linux’. Make sure the host is linked with a host-preset. Aborting.” and “Error: Cannot open main configuration file ‘/var/www/html/nconf/temp/test/Default_collector.cfg’ for reading!” - Thanks to Rick for giving me access to his affected system so I could fix this. Requires NEMS Linux 1.2.1 or higher.
Can anyone assist with what to do to resolve this issue?

Re: Failed to get host-alive check for host

Posted: Sun Jun 07, 2020 5:43 am
by StrackeJ
I got the same error.
First I thought this was caused by updating 1.5.2 to the latest patches, but it wasn't.
I installed a fresh copy of 1.5.2 and restored the backup from my previous 1.5.1.
After that, I got the same error.
Restoring the backup from the fresh installation (with only the nems host) didn't help either.
So this seems to be a general issue after restoring from older versions.
Hopefully somebody can help us quickly!
;-)

Re: Failed to get host-alive check for host

Posted: Sun Jun 07, 2020 7:54 am
by JatBee
I don't know if it is related, but I got the same error after my 1.5.2 upgrade. I found it was related to new host-presets and O/Ses I had defined not moving over.

If you don't have the old config, that might be hard to determine for sure. Mine all ended up defaulting to "linux" for the O/S and "Linux-Server" for the host-preset. I don't know if you remember tweaking those, or creating new ones.

I just posted a separate thread about that. Let's see if that is expected behavior. Maybe there is an easy tweak to the import that will set them to rights.

Jim

Re: Failed to get host-alive check for host

Posted: Sun Jun 07, 2020 8:16 am
by JatBee
To follow up one more level: check which host-preset is being used for the host where you are getting the error. If there isn't one, select one. If there is one, find it in the host-presets menu and make sure it has a good "host alive check" filled in. I don't remember what the default one is called; I've customized mine too much. But drop down the menu and reselect one of them, and see if it helps.
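
If you have a lot of hosts, the same checklist can be written down as a small script. This is just a rough sketch of the logic, assuming you copy the host-to-preset and preset-to-check pairs by hand from the NConf screens into two dictionaries. It does not read the NConf database, and the function name and sample data are made up for illustration.

```python
def hosts_failing_alive_check(hosts, presets):
    """Return {host: reason} for every host that would fail the checklist.

    hosts   -- host name -> linked host-preset name (None if not linked)
    presets -- host-preset name -> host-alive check name ('' if empty)
    Both dicts are hand-entered, hypothetical data.
    """
    problems = {}
    for host, preset in hosts.items():
        if not preset:
            problems[host] = "no host-preset linked"
        elif preset not in presets:
            problems[host] = f"host-preset '{preset}' is not defined"
        elif not presets[preset]:
            problems[host] = f"host-preset '{preset}' has no host-alive check"
    return problems


if __name__ == "__main__":
    # Made-up example data, not taken from a real NEMS server.
    hosts = {"web01": "linux-server", "db01": None, "sw01": "switch"}
    presets = {"linux-server": "check_host_alive"}
    for host, reason in hosts_failing_alive_check(hosts, presets).items():
        print(f"{host}: {reason}")
```

A host can still fail even when everything looks filled in on screen, so reselecting the check in the menu is worth trying regardless.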

Jim

Re: Failed to get host-alive check for host

Posted: Sun Jun 07, 2020 9:09 am
by Marshman
Hi All,

There is a patch for this error. What happened was that the "host alive" check was actually trying to check email, and a quick change back and forth fixed it. A patch should have been pushed overnight to resolve this issue, but I will check with Robbie on this.

Re: Failed to get host-alive check for host

Posted: Mon Jun 08, 2020 1:30 am
by StrackeJ
JatBee wrote: Sun Jun 07, 2020 8:16 am
To follow up one more level: check which host-preset is being used for the host where you are getting the error. If there isn't one, select one. If there is one, find it in the host-presets menu and make sure it has a good "host alive check" filled in. I don't remember what the default one is called; I've customized mine too much. But drop down the menu and reselect one of them, and see if it helps.

Jim
Hi Jim,
In the "linux-server" host-preset, I switched the "host-alive check" from "check-host-alive" to "notify-host-by-email".
And it evidently works.
Thanks a lot for that hint!

Juergen

Re: Failed to get host-alive check for host

Posted: Mon Jun 08, 2020 8:07 am
by Robbie Ferguson
Hi all,
Indeed, as @MarshMan points out, this issue has been patched. However, the patch is not retroactive for NEMS Servers which have already had their backup restored.

@StrackeJ nearly has it! The only step they're missing is to restore the setting after changing it.

The key is to change the host-alive check of the linux-server host-preset to any other value and save, then change it back to check_host_alive and save again. Then generate your config.

I have recorded a video to show you how to quickly fix this issue:

https://youtu.be/opMX029x9Qo

Following this easy fix, you can once again generate config.

This issue occurs because your NEMS Migrator Backup is from an earlier database build of NEMS Migrator, and the identifier has changed for your NEMS Server. So during import, the host-alive check fails to configure correctly. Again, this has been fixed for future releases, and the patch is issued automatically to any connected NEMS Server. If unsure, feel free to run sudo nems-quickfix (and wait, it could take a while) before doing a NEMS Migrator restore. But this video demonstrates how quick and easy it is to fix the issue should it occur.

Note: Video taken in NEMS Linux 1.6 Developer Build. The steps are the same for all versions of NEMS Linux, even though the interface looks a little different.

Cheers!
Robbie // The Bald Nerd

Re: Failed to get host-alive check for host

Posted: Mon Jun 08, 2020 11:39 am
by StrackeJ
Hi,

I tried it out, but if I do it that way, it kills nearly 50% of the services I want to check.
First, I had 91 services directly after the restore:
[Attachment: directly-after-restore.PNG]
After cycling the host-preset, I have only 47:
[Attachment: after-changing-host-presets.PNG]

Re: Failed to get host-alive check for host

Posted: Mon Jun 08, 2020 11:54 am
by Robbie Ferguson
I would guess the issue is in fact inflating the number of services before fixing. Can you confirm whether services are actually missing after the fix, or whether only the count goes down?
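
One way to tell the two cases apart is to note the (host, service) pairs before and after the fix and compare the two lists, rather than comparing only the totals. A minimal sketch, assuming the pairs are written down by hand; the host and service names below are invented, and the function name is hypothetical.

```python
def missing_services(before, after):
    """Return the (host, service) pairs present before but absent after."""
    return sorted(set(before) - set(after))


if __name__ == "__main__":
    # Invented example data: pairs noted before and after cycling the preset.
    before = [("web01", "http"), ("web01", "icmp"), ("db01", "icmp")]
    after = [("web01", "http"), ("db01", "icmp")]
    for host, service in missing_services(before, after):
        print(f"missing: {service} on {host}")
```

Because the lists are turned into sets, duplicate entries in the "before" list do not count as missing, which is the point if the pre-fix count was inflated.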

Re: Failed to get host-alive check for host

Posted: Mon Jun 08, 2020 12:03 pm
by StrackeJ
I can indeed confirm that there are services missing.
On some servers where I check, e.g., http, I also check icmp. And icmp is missing after the fix.