We’re using Consul for our cloud deployment. Consul makes it very easy and cheap to run health checks on every node, with each node validating its own services. It does all the hard work of monitoring that you might once have relied on e.g. Nagios for. The obvious drawback is that each check runs locally, so you’re not validating that your service actually works across the network.
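For reference, a node-local check of the kind described above is declared in Consul’s JSON check-definition format. Here’s a sketch of a TCP check against local SSH (the id, name, and interval are our own choices, not anything prescribed):

```json
{
  "check": {
    "id": "ssh-local",
    "name": "SSH listening on port 22",
    "tcp": "localhost:22",
    "interval": "30s"
  }
}
```

The agent on each node runs this itself and reports the result into the cluster, which is exactly why it’s cheap, and exactly why it never leaves the host.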
I spent a bit of time wondering how I could get node X to monitor node Y in a way that
- fits neatly with Consul’s data model (service checks running on node A can’t easily report failures that are really on node B), and
- doesn’t require a lot of orchestration when you have tens or hundreds of thousands of nodes that all need to keep an eye on themselves and (some subset of) each other.
I realised that I didn’t actually need another node to perform the monitoring. I just needed to have the network be a factor in the check. After learning that the MIRROR target in iptables went the way of the dodo over a decade ago, I hacked together a little script called reflector.
You simply install it on a host on your network, run e.g. "reflector --return-port 22 10022", and any connection to port 10022 will be reflected back to the connecting node.
In other words, any node connecting to port 10022 on the reflector host will actually connect back to itself on port 22, except the connection will have traversed the network, thus verifying that the service is reachable remotely.
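The mechanism is simple to sketch: accept a connection, look up the peer’s address, dial back to that peer on the return port, and shovel bytes in both directions. The following is a minimal Python sketch of the idea, not the actual reflector script (function names and structure are made up for illustration):

```python
import socket
import threading

def pipe(src, dst):
    """Copy bytes from src to dst until EOF, then close dst's write side."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass

def handle(conn, return_port):
    """Dial back to the connecting node and proxy in both directions."""
    peer_ip = conn.getpeername()[0]
    back = socket.create_connection((peer_ip, return_port))
    t = threading.Thread(target=pipe, args=(conn, back), daemon=True)
    t.start()
    pipe(back, conn)
    t.join()
    conn.close()
    back.close()

def reflector(listen_port, return_port):
    """Listen on listen_port; reflect every connection back to its origin."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", listen_port))
    srv.listen(5)
    while True:
        conn, _ = srv.accept()
        threading.Thread(target=handle, args=(conn, return_port),
                         daemon=True).start()
```

One caveat with a sketch like this: the reflector accepts the inbound connection before dialing back, so a bare TCP handshake against port 10022 succeeds even when the return service is down. A health check built on it should actually exchange data (e.g. read the SSH banner) rather than merely connect.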