
Signal 11 (SEGV) caught by ps (3.3.12).

Posted: Fri Mar 09, 2018 8:30 am
by JatBee
Not sure if this is a bug, or if something else is going on.

NEMS instance has been cooking along without issue. No changes made. Been super useful as we've been moving things around on the network.

One day Nagios starts throwing a warning (amber) on NEMS (it is monitoring itself). The "Total Processes" check goes amber, and the status information says "System call sent warnings to stderr: Signal 11 (SEGV) caught by ps (3.3.12)."

Huh.

I log in to the NEMS host at the CLI, and the login screen that usually shows the colorful NEMS status looks like this:

login as: admin
admin@nems's password:
Linux NEMS 4.9.59-v7+ #1047 SMP Sun Oct 29 12:19:23 GMT 2017 armv7l
Last login: Fri Mar  9 06:50:05 2018 from 192.168.1.113
0
Signal 11 (SEGV) caught by ps (3.3.12).
ps:ps/display.c:66: please report this bug
    while executing
"exec -- ps -A h | wc -l"
    invoked from within
"expr {[lindex [exec -- ps -A h | wc -l] 0]-000}"
    invoked from within
"set psa [expr {[lindex [exec -- ps -A h | wc -l] 0]-000}]"
    (file "/etc/motd.tcl" line 51)
admin@NEMS:~ $

Huh.

Sure enough, if I try to run "ps" manually, I get more or less the same thing.

admin@NEMS:~ $ ps
Signal 11 (SEGV) caught by ps (3.3.12).
ps:ps/display.c:66: please report this bug
Segmentation fault
admin@NEMS:~ $

This seems to be persistent across hard/soft reboots.
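
As a rough cross-check that does not depend on the ps binary at all, the process count that /etc/motd.tcl computes with "ps -A h | wc -l" can be approximated by counting the numeric directories under /proc. This is only a sketch, assuming a Linux /proc filesystem; count_procs is a hypothetical helper name, not something shipped with NEMS or procps.

```shell
#!/bin/sh
# count_procs: hypothetical helper, not part of NEMS or procps.
# Counts processes by counting all-numeric directory names under /proc
# (or under the directory given as $1), so it still works when ps crashes.
count_procs() {
    procdir="${1:-/proc}"
    n=0
    for d in "$procdir"/[0-9]*; do
        name=${d##*/}
        # Skip anything that is not purely digits (this also covers an
        # unexpanded glob when nothing matched).
        case "$name" in
            *[!0-9]*) continue ;;
        esac
        if [ -d "$d" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

count_procs
```

The number may differ slightly from what ps reports, since processes come and go between the two counts.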

Now, it is possible something just got corrupted. I have not re-imaged. Googling the various messages turns up a number of past and present bugs that look similar.
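
One way to test the corruption theory before re-imaging, sketched here on the assumption that the system is Debian-based and that ps comes from the procps package: ask dpkg to compare the package's installed files against the checksums it recorded at install time. verify_pkg is a hypothetical wrapper name used for illustration only.

```shell
#!/bin/sh
# verify_pkg: hypothetical helper, not part of NEMS.
# Runs "dpkg --verify" for one package (default: procps, an assumption).
# dpkg lists files that fail verification; no output for the package
# means every checked file matched.
verify_pkg() {
    pkg="${1:-procps}"
    if ! command -v dpkg >/dev/null 2>&1; then
        echo "dpkg not found; this check only applies to Debian-based systems" >&2
        return 2
    fi
    dpkg --verify "$pkg"
}
```

Running "verify_pkg procps" only covers files owned by that package. A damaged shared library from another package, or a failing SD card, would need a separate check.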

I guess my questions are:
Has anyone else run into this?
Was any update slipped in about 6 or 7 days ago?

NEMS seems to be running fine otherwise, so the urgency is low, but something is not quite right.

I have no issue with re-imaging but wanted to make an inquiry before I did so.

Thanks for this, and a great product.

Jim

RE: Signal 11 (SEGV) caught by ps (3.3.12).

Posted: Tue Mar 13, 2018 8:52 am
by Robbie Ferguson
Thanks Jim.
Hmm, no I have not seen this on any of my NEMS appliances.

Have you run a dist-upgrade task recently? Perhaps through the Webmin interface?

Is your NEMS server on a workgroup, or an AD domain?

Which kernel version is your NEMS server currently running (uname -a)?

Cheers,
Robbie

RE: Signal 11 (SEGV) caught by ps (3.3.12).

Posted: Fri Mar 16, 2018 5:06 am
by JatBee
I'd not done any recent updates (shame on me). After poking at it for a few days I just re-imaged and problem solved. It may be a blunt technique, but it is sure effective. I'll just chalk this one up to gamma rays or something.

RE: Signal 11 (SEGV) caught by ps (3.3.12).

Posted: Fri Mar 16, 2018 9:01 am
by Robbie Ferguson
Well, at least with nems-migrator, re-flashing is quite a viable fix. I'm curious why this happened though.

Re: updates - no, not shame on you. NEMS Linux updates itself automatically as needed. Doing it manually could break things (which is why I'd asked).

Enjoy!

Robbie