Correlating Puppet changes to events in your infrastructure using graphite

Sometimes it is pretty obvious when Puppet changes something in your infrastructure and bad things happen in a big dramatic way. Other times it’s not so obvious. It can be invaluable to be able to correlate changes made by Puppet to other events happening in your infrastructure.

For example, in this diagram we have plotted the load average from a group of servers. Blue vertical lines mark the points in time when Puppet modified a resource on a host in the group. We can see that the load spiked on one of the servers immediately after a Puppet change.
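As a rough illustration of the idea (not the code from the post), a change event can be recorded as a single datapoint in Graphite's plaintext protocol, which is one `path value timestamp` line sent to carbon (port 2003 by default). The metric prefix `puppet.changes`, the hostnames, and the helper names below are assumptions made for this sketch. Graphite's drawAsInfinite() function is one way to render such datapoints as vertical lines on a graph.

```python
import socket


def puppet_change_path(hostname, prefix="puppet.changes"):
    """Build a metric path for a host (the prefix is a hypothetical naming choice).

    Dots in the hostname are replaced so they do not create extra path levels.
    """
    return f"{prefix}.{hostname.replace('.', '_')}"


def graphite_line(path, value, timestamp):
    """Format one datapoint in Graphite's plaintext protocol."""
    return f"{path} {value} {int(timestamp)}\n"


def send_to_graphite(line, host="graphite.example.com", port=2003):
    """Send one line to carbon's plaintext listener (hypothetical host name)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Record "Puppet changed something on web01" at a given Unix timestamp.
line = graphite_line(puppet_change_path("web01.example.com"), 1, 1300000000)
print(line, end="")
```

Calling send_to_graphite(line) from wherever Puppet reports are processed would then store the event alongside the other metrics.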

Code available on github:

List of statsd server implementations

Statsd is a simple client/server mechanism from the folks at Etsy that allows operations and development teams to easily feed a variety of metrics into a Graphite system. For more info on statsd read the seminal blog article on Statsd “Measure Anything, Measure Everything”.

As you would expect, there are statsd clients in many languages. There are also many implementations of the statsd server, which is nice because each organization can pick the one that fits it best. For example, a python shop might prefer to deploy a python-based statsd instead of Etsy’s original node.js implementation. Some statsd implementations also diverge from the original design and provide additional features.
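What most of these servers have in common is the small text protocol that clients send over UDP: `name:value|type`, with an optional `|@rate` sample rate. The sketch below parses one such line. It is only an illustration of the wire format, not code from any of the implementations listed here, and the `Metric` type and function name are made up for the example.

```python
from typing import NamedTuple


class Metric(NamedTuple):
    name: str
    value: float
    kind: str           # "c" for counters, "ms" for timers
    sample_rate: float  # 1.0 when the client did not sample


def parse_statsd_line(line: str) -> Metric:
    """Parse 'name:value|type' or 'name:value|type|@rate'."""
    name, _, rest = line.strip().partition(":")
    fields = rest.split("|")
    if not name or len(fields) < 2:
        raise ValueError(f"malformed statsd line: {line!r}")
    value = float(fields[0])
    kind = fields[1]
    sample_rate = 1.0
    if len(fields) > 2 and fields[2].startswith("@"):
        sample_rate = float(fields[2][1:])
    return Metric(name, value, kind, sample_rate)


print(parse_statsd_line("gorets:1|c|@0.1"))
print(parse_statsd_line("glork:320|ms"))
```

A real server would additionally aggregate these values per flush interval before forwarding them to its backend.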

I could not find a single resource that listed all of the different implementations, so I figured I would try to start one here.

  • Flickr’s StatsD: Perl. This is the real original statsd which inspired the Etsy StatsD. It’s here for historical purposes and is not recommended for production use.
  • Etsy’s statsd: node.js. The (new) Original.
  • petef-statsd: ruby. Supports AMQP.
  • statsd_rb: ruby.
  • quasor/statsd: ruby. Can send data to Graphite or MongoDB.
  • py-statsd: python (including python client code).
  • zbx-statsd: python, based on py-statsd. Sends data to Zabbix instead of graphite.
  • statsd.scala: scala. Sends data to Ganglia instead of Graphite. Different messaging protocol, uses JSON.
  • txStatsD: python + twisted, from the folks @ Canonical
  • statsd-librato: node.js. Fork of etsy’s statsd for sending data to Librato instead of graphite from the folks @ Engine Yard.
  • estatsd: erlang. From the folks @ Opscode
  • metricsd: scala. Should be drop-in compatible with etsy’s statsd, but with support for additional metric types (eg: meter, gauge, histogram)
  • statsd-c: C. compatible with original etsy statsd
  • statsd (librato): node.js.  Librato’s officially maintained fork of statsd based on the changes from Engine Yard. Supports multiple graphing services including Librato Metrics
  • bucky: python. A unique spin on statsd that supports collecting data from statsd clients, collectd, and metricsd, with output to graphite. The ability to translate collectd plugin names to be more graphite-friendly is very compelling.
  • clj-statsd-svr: Clojure.
  • statsite: C. Statsite is designed to be both highly performant, and very flexible, using libev to be extremely fast.
  • statsdaemon: Go. From bitly.

Other interesting statsd-like projects that are not protocol compatible with the original Etsy statsd but may offer compelling features not found in other implementations:

  • estatsd (opscode): Erlang. Inspired by etsy’s statsd but not protocol compatible.
  • estatsd (fauxsoup): Erlang. Fork of opscode’s estatsd with a focus on multi-datacenters and high-scalability.

Deprecated:

  • statsite: python. Replaced by a new implementation in C, see above.

Please leave a comment if you have an implementation that should be listed here. Feedback on any of the above implementations would be helpful too.

Network Link Conditioner in Xcode 4.1, Lion

Previously, I wrote a post about using the ‘dummynet’ functionality in Mac OS X’s ipfw(8) firewall to simulate a variety of network conditions, such as limited bandwidth, packet loss, and latency (delay). This is a great feature for testing software, but it can be a little tough to use unless you’re comfortable at the command line or, even better, have unix scripting skills, since multiple commands are required to create even simple scenarios.
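To give a sense of what "multiple commands" means, here is a small sketch that builds (but does not run) the ipfw commands for one scenario: a dummynet pipe with a bandwidth limit, a delay, and a packet loss rate, plus a rule that sends all IP traffic through it. The function names, the rule number 100, and the pipe number 1 are arbitrary choices for this example, and running the commands would require root.

```python
def dummynet_setup(bandwidth_kbit, delay_ms, loss_rate, rule=100, pipe=1):
    """Return the ipfw commands (as argv lists) that create one scenario.

    loss_rate is a fraction between 0.0 (no loss) and 1.0 (drop everything).
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0.0 and 1.0")
    return [
        # Route all IP traffic through the pipe.
        ["ipfw", "add", str(rule), "pipe", str(pipe),
         "ip", "from", "any", "to", "any"],
        # Configure the pipe: bandwidth, delay in milliseconds, loss rate.
        ["ipfw", "pipe", str(pipe), "config",
         "bw", f"{bandwidth_kbit}Kbit/s",
         "delay", str(delay_ms),
         "plr", str(loss_rate)],
    ]


def dummynet_teardown(rule=100, pipe=1):
    """Return the commands that remove the rule and the pipe again."""
    return [
        ["ipfw", "delete", str(rule)],
        ["ipfw", "pipe", str(pipe), "delete"],
    ]


# Roughly a slow, lossy mobile link: 300 Kbit/s, 200 ms delay, 5% loss.
for cmd in dummynet_setup(300, 200, 0.05) + dummynet_teardown():
    print(" ".join(cmd))
```

Each list could be passed to subprocess.run() by a script running as root; printing them first makes it easy to review what would be executed.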

Then, today I noticed that Apple now includes a new prefPane in Xcode 4.1 and Lion called “Network Link Conditioner” that simplifies all of this, and even includes a few profiles to get you started (eg: “Wifi, Average case”, “3G, Lossy Network”.) Pretty cool feature. Especially useful for iOS developers. Screenshot below.

  • Install: find and run /Developer/Applications/Utilities/Network Link Conditioner/Network Link Conditioner.prefPane

 

groovy-statsdclient

In an attempt to learn some Groovy and Gradle, I wrote a Statsd client in Groovy. It’s similar to statsd clients in other languages and supports the typical increment(), decrement(), and timing() methods.
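For readers who have not seen a statsd client before, the three methods boil down to formatting a short string and sending it in a UDP packet. The sketch below shows the general shape in Python. It is not the Groovy client's API: the class name, the constructor parameters, and the `_format` helper are assumptions made for this example. Port 8125 is the usual statsd default.

```python
import socket


class StatsdClient:
    """Minimal fire-and-forget statsd client (illustrative sketch)."""

    def __init__(self, host="localhost", port=8125, prefix=""):
        self.addr = (host, port)
        self.prefix = prefix
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def _format(self, name, value, kind):
        """Build the 'name:value|type' payload, with an optional prefix."""
        full_name = f"{self.prefix}.{name}" if self.prefix else name
        return f"{full_name}:{value}|{kind}"

    def _send(self, payload):
        # Metrics are best-effort: never let a send failure break the caller.
        try:
            self.sock.sendto(payload.encode("ascii"), self.addr)
        except OSError:
            pass

    def increment(self, name, delta=1):
        self._send(self._format(name, delta, "c"))

    def decrement(self, name, delta=1):
        self._send(self._format(name, -delta, "c"))

    def timing(self, name, millis):
        self._send(self._format(name, millis, "ms"))


client = StatsdClient(prefix="myapp")
client.increment("logins")
client.timing("db.query", 42)
```

Because UDP is connectionless, these calls return immediately even when no statsd server is listening, which is what makes it cheap to instrument application code this way.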

It’s not available on Maven Central or via Grape at this time, which will be a future learning exercise. In the meantime you can download a pre-built .jar or the source code from github:

For more background on Statsd, check out this blog article from Etsy:  Measure Anything, Measure Everything

 

#monitoringsucks – but it doesn’t have to

On May 26, 2011, John Vincent, aka @lusis, started a conversation with a simple tweet that led to an hour-plus chat on IRC freenode between some very smart folks.  The IRC channel was supposed to go away after the chat was over but instead it has survived and is attracting more and more attention.

The genesis of it all is the general staleness of the open source monitoring landscape. Operations tools have improved rapidly over the last 10 years, but when it comes to monitoring we’re still pretty much stuck on the same tools we’ve been using since the late 90s. That would be understandable if these tools were “as good as it gets”, but they’re not, and there’s definitely room for improvement and for sharing ideas and experiences. If nothing else, at least there’s now a central place for people to discuss monitoring. Hardly a day goes by without someone stopping into ##monitoringsucks on IRC to announce a new tool or concept they’re hacking away on.

Since the initial chat, a set of github.com repos has popped up to collect links to various tools and blog posts on the topic of monitoring:

  • irc-logs – IRC log of the initial chat on 5/26/2011 that started it all
  • tools-repo – a great resource for many monitoring tools
  • blog-posts – a growing list of interesting monitoring related blog posts

If you’re interested in helping make the art of monitoring better, consider stopping by IRC Freenode or contributing to the github repos.  I’m hoping to learn quite a bit out of this little spark that John created.

Alternative skins for Skype 5 and 6 on Mac OSX

The consensus at this point is that Skype 5 on Mac OS X is ugly and wastes a ton of screen space for no good reason (all of this is true, btw). But you don’t have to live with the ugliness. It’s not widely known, but Skype 5 supports skins, or “chatstyles”, which are built entirely in HTML/CSS/JavaScript, so they’re easy to write and hack.

Skype’s official page for information on ChatStyles:  http://macthemes.skype.com/start

Here are a few of the alternate ChatStyles available on github:

  • PanamericanaMini – this is probably the first alternate ChatStyle.  It’s very similar to the default Panamericana theme, but with some whitespace removed.
  • Brief – nice theme.  Check out the screenshots.
  • My fork of Brief on github – I like the Brief chatstyle, but I made some small modifications to further reduce extra whitespace and I modified colors slightly.
  • Simples – heavily influenced by the Renkoo style for Adium.   Screenshot.
  • StyleShift – reduced whitespace, every image URL is expanded in a linked thumbnail, custom emoticons.

More ChatStyles are available in Skype’s ChatStyle Gallery.  They’re also running a competition, so you can go vote for your favorite.

Charlie Sheen plugin for the Hudson/Jenkins CI Server

I built a Charlie Sheen “Persona” plugin for our Hudson continuous integration server at work, which is now available on my github page.  If you’re familiar with the Chuck Norris plugin, then you know what this is all about.

Add a little bit of Sheen-ius to your Hudson/Jenkins server!!  Tiger blood is included.  #winning

Installation instructions are in the README.

Kanbanops – new mailing list

I am a fan of using Kanban boards to organize and visualize the work coming into an infrastructure team, but most of the Kanban resources on the internet are geared towards software development teams.

Recently, a new mailing list was started for the sole purpose of sharing Kanban experiences in operations teams. The list is appropriately named kanbanops, and you can sign up, browse the archives, etc., over at Yahoo Groups:

Collectd-Graphite plugin. Bringing together two great tools

Collectd is a powerful tool for gathering metrics using its wide range of plugins, such as cpu, disk, load, memory, etc.  But there is a lack of good frontend tools for visualizing the data collectd produces.

Graphite is an amazingly powerful tool from Orbitz for visualizing metrics, but there is a lack of tools for gathering host-level stats and sending them to Graphite.

It would be great if we could leverage the strengths of both tools.

So I wrote collectd-graphite, a plugin for collectd that sends data to a Graphite instance. The plugin runs inside the collectd process using the collectd-perl interface. This makes it different from Jordan Sissel’s collectd-to-graphite tool, which runs in a separate process using node.js. The plugin-based approach reduces the number of moving parts we need to worry about (ie: less stuff to monitor, fail, restart, etc.). Jordan’s tool is good, btw, but I wanted something different.
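The core job of any collectd-to-Graphite bridge is translating collectd's value identifiers, which look like `host/plugin-instance/type-instance`, into dotted Graphite metric paths. The sketch below shows one possible mapping in Python. The `collectd` prefix and the exact naming scheme are assumptions for illustration and are not necessarily what the collectd-graphite plugin (which is written in Perl) emits.

```python
def split_instance(part):
    """Split 'cpu-0' into ('cpu', '0'); the instance is '' when absent."""
    name, _, instance = part.partition("-")
    return name, instance


def graphite_path(identifier, prefix="collectd"):
    """Map a collectd identifier to a dotted Graphite path (hypothetical scheme)."""
    host, plugin_part, type_part = identifier.split("/")
    plugin, plugin_instance = split_instance(plugin_part)
    type_name, type_instance = split_instance(type_part)
    pieces = [
        prefix,
        host.replace(".", "_"),  # keep the hostname as a single path level
        plugin,
        plugin_instance,
        type_name,
        type_instance,
    ]
    # Drop empty pieces so missing instances do not produce '..' in the path.
    return ".".join(piece for piece in pieces if piece)


print(graphite_path("web01.example.com/cpu-0/cpu-idle"))
print(graphite_path("web01.example.com/load/load"))
```

Each resulting path, together with a value and a timestamp, forms one line of Graphite's plaintext protocol.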

Grab the plugin on github:  https://github.com/joemiller/collectd-graphite

HOWTO: install graylog2 on CentOS 5 with RVM + Passenger

Graylog2 is an open-source self-hosted centralized log management tool. Think of it as a do-it-yourself version of loggly.com, or perhaps a simpler alternative to Splunk. Logs are stored in a MongoDB database.

I won’t go into too much detail, so if you want more info check out graylog2.org

Graylog2 consists of two components:

  • graylog2-server: a Java process which receives logs and writes them to a MongoDB database.
  • graylog2-web-interface: a Ruby on Rails app that seems happiest with modern versions of Ruby and Rails.

CentOS 5, however, ships with an old version of Ruby (1.8.x) which did not play nicely with all of the gems that graylog2 wanted.

If the server were dedicated to running only graylog2, we could probably overwrite the installed Ruby with a newer version. However, the server we wanted to install it on was already running a complex Rails app under mod_passenger, which we did not want to risk breaking.

I decided to see if RVM (Ruby Version Manager) would let me set up an isolated Ruby environment just for graylog2 without disturbing the other Ruby apps on the machine. I also wanted to set up an isolated instance of Passenger standalone for graylog2, then configure Apache to listen on port 80 and forward requests to it with mod_proxy.

Everything worked even better than I expected. In the Ruby world, with so much rapid change, it can be challenging to put together a compatible set of gems for multiple apps running on the same machine. RVM worked so well I will consider using it to provide isolated ruby environments for other apps in the future. I love RVM!

Here is how I put it all together:
