Today I learned a cool new trick, made possible by the Item Preprocessing feature that has been available since Zabbix 3.4. This is not a Zabbix blog, but I consider this so useful and so unintuitive that I just have to write it down.
For the occasional visitor: Zabbix is an open source, enterprise-class monitoring solution. I use it at work and at home, and not only to monitor the meat on my BBQ.
In my opinion, Zabbix is really great when it comes to anything that can be expressed numerically. It can also handle string data, but having had minor problems with that, I try to avoid strings whenever possible. States can be expressed numerically and stored as integer values, and value mapping makes it possible to display those states in a meaningful way.
Today I came across a problem that I wasn’t able to solve at first. A web service told me the state of an application, which could be either RUNNING, STOPPING or STOPPED. It’s easy to create a string item to handle this, and to implement a trigger that notifies if the service is STOPPED. However, in this case I had to create a trigger that fires if the service stays in the state STOPPING for more than a certain time, indicating that it might have a problem shutting down. I couldn’t find a solution to this with a string item, since the trigger function str("STOPPING",15m) would trigger if at least one of the values during the last 15 minutes was “STOPPING”.
Item Preprocessing to the rescue!
Zabbix 3.4 brought a feature called Item Preprocessing. A value can be fetched, and before it gets stored it can be dissected or converted in several ways. One of these is to apply a regular expression to the fetched value, in a “find and replace” kind of way.
It turned out that I needed a fairly complex expression, but in the end I was able to convert the states from the web service to simple integers. Searching the web, I found something about “conditional replacement”, and using this great online regex tester I was able to come up with this beauty:
Using this, I can convert the string that is extracted from the web service’s output using a JSON path in two further steps:
- First, attach a “dictionary” to the value: replace the full value (.*) with itself, followed by the replacement values:
- Then, replace the regex (STOPPED|STOPPING|RUNNING)(?=.*:\1=(\d)) by the value of the second capturing group
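The two steps above can be tried outside of Zabbix. The following is only a rough Python sketch of the idea, not the actual preprocessing configuration: the helper names regex_step and to_state_code are made up for illustration, and the dictionary string is an assumption. Only STOPPING=1 follows from the trigger described in this post; the digits for RUNNING and STOPPED are guesses.

```python
import re

# Assumed dictionary: only STOPPING=1 is implied by the post,
# the digits for RUNNING and STOPPED are illustrative guesses.
DICTIONARY = ":RUNNING=0:STOPPING=1:STOPPED=2"


def regex_step(value: str, pattern: str, output: str) -> str:
    """Rough stand-in for one regex preprocessing step:
    find the first match and return the expanded output template."""
    match = re.search(pattern, value)
    if match is None:
        raise ValueError(f"pattern {pattern!r} did not match {value!r}")
    return match.expand(output)


def to_state_code(state: str) -> int:
    """Convert a state string to its integer code in two steps."""
    # Step 1: replace the full value with itself plus the dictionary,
    # e.g. "STOPPING" -> "STOPPING:RUNNING=0:STOPPING=1:STOPPED=2".
    with_dictionary = regex_step(state, r"(.*)", r"\1" + DICTIONARY)
    # Step 2: the lookahead searches the dictionary for ":<state>=<digit>"
    # and captures the digit as group 2, which becomes the new value.
    digit = regex_step(
        with_dictionary,
        r"(STOPPED|STOPPING|RUNNING)(?=.*:\1=(\d))",
        r"\2",
    )
    return int(digit)


if __name__ == "__main__":
    for state in ("RUNNING", "STOPPING", "STOPPED"):
        print(state, "->", to_state_code(state))
```

The backreference \1 inside the lookahead is what makes the replacement “conditional”: whichever state matched first decides which dictionary entry the digit is read from.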
In this way the item can be configured as an unsigned integer, since only the numbers 0, 1 and 2 have to be stored. And I can apply the usual trigger-function magic to notify if the value is 1 for a certain length of time. Added benefit: I can use the graph view of this item to see if there were any deviations from the wanted “RUNNING” state, when they occurred and how long they lasted.
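To illustrate why the integer form helps, here is a small Python sketch of the two kinds of checks over a window of collected values. It is not Zabbix trigger syntax, and the function names are hypothetical. It assumes that “the value is 1 for a certain length of time” means that every value in the window equals 1, which with integers boils down to the minimum and the maximum both being 1.

```python
from typing import Sequence

STOPPING = 1  # integer stored for the STOPPING state


def any_stopping(states: Sequence[str]) -> bool:
    """Behaviour described for str("STOPPING",15m):
    true as soon as one value in the window contains the string."""
    return any("STOPPING" in state for state in states)


def stuck_stopping(values: Sequence[int]) -> bool:
    """True only if every value in the window is 1,
    i.e. the minimum and the maximum are both 1."""
    return len(values) > 0 and min(values) == STOPPING and max(values) == STOPPING


if __name__ == "__main__":
    # A normal shutdown: STOPPING shows up once, then the service is STOPPED.
    # The integer codes for RUNNING (0) and STOPPED (2) are assumed.
    print(any_stopping(["RUNNING", "STOPPING", "STOPPED"]))  # True, a false alarm
    print(stuck_stopping([0, 1, 2]))                         # False
    # A hanging shutdown: the whole window is STOPPING.
    print(stuck_stopping([1, 1, 1]))                         # True
```

Checking only the minimum or only the maximum would not be enough with three states, because a window such as 1, 2 has a minimum of 1 and a window such as 0, 1 has a maximum of 1.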
I’m pretty enthusiastic about this way of processing the values. But I’m also very interested in opinions: is there a better way to deal with this problem? Something obvious that I haven’t seen?