Using Chef to configure Datadog

The Datadog documentation for Chef is, to be generous, minimal. It does helpfully explain how to write a recipe that installs the Chef handler, which begins uploading metrics about Chef runs to Datadog. I've yet to find those particular stats all that helpful, though.

What I want is for Chef to manage the actual configuration of Datadog, so that when we add new systems or change what we're watching, it all happens automatically, the way Chef is meant to work.

To begin, I created a simple recipe that sets up Datadog. It does what the original docs do, with one extra bit: if attributes are set for any of the other Datadog monitors, the recipe sees them and includes the matching monitor recipes. If those attributes aren't set, the additional monitor recipes are skipped. Datadog's recipes don't gracefully handle being included when there are no attributes telling them what to watch (a simple, empty default attribute in each of those recipes would go a long way here).

Here's the bulk of the interesting stuff from our default.rb Chef recipe:

It simply checks for each of the monitor attributes and, where one exists, adds the matching include_recipe.
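A minimal sketch of that check in plain Ruby, so it can be run outside Chef. The method name monitor_recipes_for and the MONITOR_RECIPES table are hypothetical, and the recipe names are assumed to match the datadog cookbook version in use; verify them before relying on this.

```ruby
# Hypothetical mapping from a Datadog attribute key to the recipe that uses it.
# The recipe names are assumptions; check them against your cookbook version.
MONITOR_RECIPES = {
  'iis'             => 'datadog::iis',
  'windows_service' => 'datadog::windows_service',
  'process'         => 'datadog::process'
}.freeze

# Returns the recipes whose 'instances' attribute is present and non-empty.
# `datadog_attrs` is a plain Hash here; in a recipe it would be node['datadog'].
def monitor_recipes_for(datadog_attrs)
  MONITOR_RECIPES.select do |key, _recipe|
    instances = (datadog_attrs[key] || {})['instances']
    instances.respond_to?(:empty?) && !instances.empty?
  end.values
end

# Inside a real recipe (Chef only, not runnable standalone):
#   include_recipe 'datadog::dd-agent'
#   monitor_recipes_for(node['datadog']).each { |r| include_recipe r }
```

Keeping the decision in a small pure method means a monitor with no instances is never included, which sidesteps the missing-attribute problem described above.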

To get the right attributes, we've added the following to our attribute file:

This is a little more complicated, but it breaks down into just a few parts. There are two helper functions: one checks whether a service exists, and one returns a list of IIS websites, if there are any.
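Helpers along these lines might look like the following sketch. It is not the original code: the names service_exists?, iis_sites, parse_iis_sites and run_command are hypothetical, the appcmd.exe path is an assumed default install location, and the sketch assumes that `sc query` exits non-zero for a missing service and that `appcmd list site` prints one `SITE "name" (...)` line per site.

```ruby
require 'open3'

# Runs a command and returns [combined output, success?]. Kept separate so the
# helpers below can be exercised with a fake runner on a non-Windows machine.
def run_command(*cmd)
  out, status = Open3.capture2e(*cmd)
  [out, status.success?]
rescue SystemCallError
  ['', false] # command not found, e.g. on a non-Windows node
end

# True when `sc query NAME` succeeds. Assumption: sc.exe exits non-zero
# when the service is not installed.
def service_exists?(name, runner: method(:run_command))
  _out, ok = runner.call('sc', 'query', name)
  ok
end

# Assumed default location of appcmd.exe.
APPCMD = 'C:\Windows\System32\inetsrv\appcmd.exe'.freeze

# Pulls site names out of `appcmd list site` output. Assumed line format:
#   SITE "Default Web Site" (id:1,bindings:http/*:80:,state:Started)
def parse_iis_sites(output)
  output.scan(/^SITE "([^"]+)"/).flatten
end

# Returns the IIS site names, or an empty list if IIS is absent.
def iis_sites(runner: method(:run_command))
  out, ok = runner.call(APPCMD, 'list', 'site')
  ok ? parse_iis_sites(out) : []
end
```

Both helpers return a harmless value (false or an empty list) when the command is missing, so the same attribute file can load on machines without IIS or without the service.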

Then, at least on Windows machines, we build a list of the IIS sites. If any exist, they are added to the default['datadog']['iis']['instances'] attribute. Going back to the recipe above, this also causes the Datadog IIS recipe to be included.
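As a rough illustration of that step, the sketch below turns a list of site names into a value for the instances attribute. The method name iis_instances is hypothetical, and the 'host' and 'sites' keys are assumptions about what the IIS check expects; grouping all sites into a single instance is one possible choice, not necessarily the one used in the original attribute file.

```ruby
# Builds a value for default['datadog']['iis']['instances'] from site names.
# Returns an empty list when there are no sites, so the IIS recipe is skipped.
# The 'host' and 'sites' keys are assumptions about the IIS check's config.
def iis_instances(sites)
  return [] if sites.empty?

  [{ 'host' => '.', 'sites' => sites }]
end

# In an attributes file (Chef only, not runnable standalone), with `sites`
# being the list of IIS site names gathered on the node:
#   if node['platform'] == 'windows'
#     instances = iis_instances(sites)
#     default['datadog']['iis']['instances'] = instances unless instances.empty?
#   end
```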

We can also check for particular services. We have a few of our own, and we can easily test whether each one exists (through that helper function) and add the service and its process to the appropriate lists so that Datadog watches them too. The windows_service and process Datadog checks watch different things, and we want both. If you only wanted to see the service status, or to watch a process that isn't a service, this wouldn't need much modification.
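The service and process lists could be built along these lines. Everything named here is hypothetical: the method service_and_process_instances, the example service and process names, and the `exists` callable standing in for the service-check helper. The instance keys ('host', 'services', 'name', 'search_string') are assumptions about the windows_service and process checks.

```ruby
# `watched` maps a Windows service name to the process it runs as.
# `exists` is any callable that answers whether a service is installed.
# Returns instance lists for both checks; both are empty if nothing is found.
def service_and_process_instances(watched, exists)
  found = watched.select { |service, _process| exists.call(service) }
  services = found.keys

  {
    # One windows_service instance covering every installed service.
    'windows_service' => services.empty? ? [] : [{ 'host' => '.', 'services' => services }],
    # One process instance per installed service, matched by process name.
    'process' => found.map do |service, process|
      { 'name' => service, 'search_string' => [process] }
    end
  }
end

# In an attributes file (Chef only, not runnable standalone):
#   lists = service_and_process_instances(
#     { 'OurAppService' => 'ourapp.exe' },   # hypothetical names
#     ->(name) { service_exists?(name) }     # assumed helper
#   )
#   default['datadog']['windows_service']['instances'] = lists['windows_service']
#   default['datadog']['process']['instances'] = lists['process']
```

To watch only the service status, drop the 'process' entry; to watch a process that isn't a service, append a process instance directly without going through the existence check.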