Collectd is used to track hard drive space and CPU usage. The data is collected on the servers and sent to the Moose indexers via HEC.
To restart collectd on a server:
systemctl restart collectd
05/08/2020
Basic
| mstats count WHERE index=collectd metric_name=* by host, metric_name
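If only the list of metric names per host is needed, without counting data points, mcatalog can return it directly. This is a minimal sketch assuming the same collectd index; the output field name "metrics" is only illustrative.

```
| mcatalog values(metric_name) as metrics WHERE index=collectd by host
```

Adding a filter such as metric_name=cpu.* to the WHERE clause should narrow the list to the CPU metrics used in the searches on this page.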
CPU idle with timechart
| mstats avg(_value) as "Avg" WHERE index=collectd metric_name=cpu.idle.value host=bastion* span=1m
| timechart max("Avg") span=5m
CPU Usage with timechart
| mstats avg(_value) as "Avg" WHERE index=collectd host=bastion* metric_name=cpu.system.value OR metric_name=cpu.user.value by metric_name span=1m
| timechart first("Avg") by metric_name span=1h
There is currently a bug in collectd where it writes the response from HEC into the system log, /var/log/messages. There's a GitHub issue, Collectd Issue 3105. Duane has a PR in to fix it, in theory: PR 3263.
Duane's PR has been merged.
This search is useful for making sure everything's checking in:
| mstats max(_value) as "val" WHERE index=collectd metric_name=* host=* earliest=-24h latest=@m span=10m by host metric_name
| eventstats dc(metric_name) as metrics by host
| stats latest(_time) as last, latest(metrics) as metrics by host
| sort last
| convert ctime(last)
| eventstats count(host)
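To narrow the check-in search to hosts that have gone quiet, a variation along these lines may help. It is a sketch, not a tested alert: the 30-minute threshold and the minutes_silent field name are assumptions, not values taken from this setup.

```
| mstats max(_value) as "val" WHERE index=collectd metric_name=* host=* earliest=-24h latest=@m span=10m by host
| stats max(_time) as last by host
| eval minutes_silent = round((now() - last) / 60)
| where minutes_silent > 30
| sort - minutes_silent
| convert ctime(last)
```

Because mstats with span=10m reports the start of each 10-minute bucket as _time, minutes_silent can overstate the gap by up to ten minutes. Hosts that sent nothing in the 24-hour window will not appear at all.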