Finding Out-of-Sync Packages Across Servers with Fabric
Photo by Bernat Casero

by Corey Oordt •  Published 1 Dec 2010

We have several servers that are supposed to have the same packages installed, but they often get out of sync over time. To make it easier to find these out-of-sync packages and servers, I wrote a quick Fabric script to do the checking for me.

The server environment

A few things are assumed in this post:

None of this will work without pip, as it relies on pip’s freeze command, and it will work really badly if you have multiple projects on the server and no virtualenvs.

Starting the fabfile

You may already be using Fabric for some automation, so you could skip to the next section. We are going to set up our “fabfile,” or list of commands that Fabric will use.

Create an empty file named fabfile.py and insert this at the top:

from __future__ import with_statement
from fabric.api import env, run, settings, hide
from fabric.decorators import hosts, runs_once

venv = "/home/websites/.virtualenvs/coolsite/"

env.user = 'webdev'
env.hosts = [
    # add the hostname of each server to compare
]

This imports the pieces we need, sets the path to the virtual environment, and defines the connection information.

Getting the list of packages

pip has a nice command, freeze, that will list all the currently installed packages and versions. It has a flag, -l or --local, that will only list packages installed within the current virtual environment. The Python function to do this via Fabric is:

def _get_package_list():
    """
    Get the list of currently installed packages and versions via pip freeze
    """
    # Hide Fabric's own output; we only want the returned string
    with settings(hide('warnings', 'running', 'stdout', 'stderr')):
        return run("%sbin/pip freeze -l" % venv)

The function is prefixed with an underscore so that Fabric won’t list it as an executable command when someone types fab -l. This function, as is, will not print anything to the screen if executed from the command line.

The with statement hides the output and simply runs pip freeze -l, but uses the pip installed within the virtual environment. Fabric’s run command returns the output as a string, with some other attributes added on.

pip and packages

Before we get knee-deep into the processing of the package lists, let’s cover the two ways that pip installs packages. pip allows a normal package install, like from PyPI, by simply specifying the name, and optionally the specific version. In this case, pip freeze will always output package_name==version, even if originally no version was specified.

The second method is with a source code checkout, or “editable” packages. This form of the command looks like:

pip install -e svn+http://myrepo/svn/MyApp#egg=MyApp

The pip freeze output for this type of install is very different from the other packages:

-e svn+http://myrepo/svn/MyApp@<revision>#egg=myapp-0.2-py2.6-dev

In this format, the package name is after the #egg=, and we are going to treat everything before #egg= as the version number.
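
The two line formats can be split with a small helper. This is a minimal sketch rather than part of the original fabfile: the name parse_freeze_line is hypothetical, and the sample lines (including the revision number 1234) are made up for illustration.

```python
def parse_freeze_line(line):
    """Split one line of pip freeze output into (package, version)."""
    line = line.strip()
    if line.startswith("-e"):
        # Editable install: the name follows "#egg=", and everything
        # before it (including the "-e" prefix) is treated as the version.
        version, pkg = line.split("#egg=", 1)
    else:
        # Normal install: package_name==version
        pkg, version = line.split("==", 1)
    return pkg, version


# Made-up sample lines, one in each format
print(parse_freeze_line("Django==1.2.3"))
print(parse_freeze_line(
    "-e svn+http://myrepo/svn/MyApp@1234#egg=myapp-0.2-py2.6-dev"))
```

Treating the whole prefix as the version means two servers only count as matching when they point at the same repository and the same revision.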

Storing the info for processing

This project seemed a breeze until I had to store and aggregate the package, version, server data for later processing. There are two things that we really want to know:

  • Which packages have multiple versions installed across all the servers, and which servers have which version.
  • Which packages are installed on only some of the servers, and which servers are missing them.

I came up with a 2-dimensional dictionary with a list that makes it easy to discover what we want to know. Here is the format:

packages = {
    'pkg1': {
        '1.0': ['server1',],
        '1.0.1': ['server2',]
    },
    'pkg2': {
        '0.4': ['server2',],
    },
    'pkg3': {
        '2.0.1': ['server1', 'server2']
    }
}

The example shows that pkg1 has a different version installed on each server, pkg2 is only installed on one server, and pkg3 is uniformly installed across all the servers.
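
As a rough sketch of how such a structure can be filled in one entry at a time, the snippet below rebuilds the example using dict.setdefault. The helper name record is hypothetical and not part of the original fabfile.

```python
def record(packages, pkg, version, server):
    """Note that `server` has `version` of `pkg` installed."""
    # setdefault creates the inner dict (and then the list) the first
    # time a package or version is seen, and returns the existing one
    # on every later call.
    packages.setdefault(pkg, {}).setdefault(version, []).append(server)


packages = {}
record(packages, "pkg1", "1.0", "server1")
record(packages, "pkg1", "1.0.1", "server2")
record(packages, "pkg2", "0.4", "server2")
record(packages, "pkg3", "2.0.1", "server1")
record(packages, "pkg3", "2.0.1", "server2")
print(packages)
```

This does the same job as explicit "if key not in dict" checks, just in one line.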

@runs_once
def check_package_versions():
    """
    Check the versions of all the packages on all the servers and print out
    the out of sync packages
    """
    packages = {}
    for host in env.hosts:
        with settings(host_string=host):
            print "Getting packages on %s" % host
            result = _get_package_list()
            pkg_list = result.splitlines()
            for package in pkg_list:
                if package.startswith("-e"):
                    # Editable install: everything before #egg= is the version
                    version, pkg = package.split("#egg=")
                else:
                    pkg, version = package.split("==")
                if pkg not in packages:
                    packages[pkg] = {}
                if version not in packages[pkg]:
                    packages[pkg][version] = []
                # Record that this host has this version installed
                packages[pkg][version].append(host)
    _process_packages(packages)

The @runs_once decorator makes sure that Fabric runs this function only once. Normally Fabric runs each command once for every item in env.hosts.

The with statement tells Fabric to alter the current settings and set the current host to which we will connect.

The result of _get_package_list() is a multi-line string, which we convert to a list (one line is one list item) and loop through it.

Lastly, we check the format of the line (“-e” or not) and populate the data structure. All the processing and output is done in _process_packages().

Processing the information

def _process_packages(packages):
    """
    Convert the packages data structure into the multiple versions and missing
    servers lists and output the result
    """
    multi_versions = {}
    missing_servers = []
    for package, versions in packages.items():
        if len(versions.keys()) > 1:
            # There is more than one version installed on the servers
            multi_versions[package] = versions
        elif len(versions[versions.keys()[0]]) != len(env.hosts):
            # The package is not installed on all the servers
            missing_hosts = set(env.hosts) - set(versions[versions.keys()[0]])
            missing_servers.append(
                "%s: %s" % (package, ", ".join(missing_hosts))
            )
    if missing_servers or multi_versions:
        print ""
        print "Packages out-of-sync:"
    if multi_versions:
        print ""
        print "Multiple versions found of these packages:"
        for package, versions in multi_versions.items():
            print package
            for ver, servers in versions.items():
                print "  %s: %s" % (ver, ", ".join(servers))
    if missing_servers:
        print ""
        print "These packages are missing on these servers:"
        for item in missing_servers:
            print item

Pulling the information from the data structure is pretty easy. As we loop through the first dictionary, we make sure there is only one version installed (only one key in the versions dictionary) and that the length of the list of that one key is equal to the number of servers.
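
The same check can be tried without any servers by separating the analysis from the printing. The following is a minimal sketch under that assumption: find_out_of_sync is a hypothetical name, it takes the host list as an argument instead of reading env.hosts, and it uses a set difference rather than comparing list lengths.

```python
def find_out_of_sync(packages, hosts):
    """Return (multi_versions, missing) for a packages dictionary."""
    multi_versions = {}
    missing = {}
    for pkg, versions in packages.items():
        if len(versions) > 1:
            # More than one version is installed across the servers
            multi_versions[pkg] = versions
            continue
        # Exactly one version: compare its server list to all hosts
        servers = next(iter(versions.values()))
        absent = sorted(set(hosts) - set(servers))
        if absent:
            missing[pkg] = absent
    return multi_versions, missing


hosts = ["server1", "server2"]
packages = {
    "pkg1": {"1.0": ["server1"], "1.0.1": ["server2"]},
    "pkg2": {"0.4": ["server2"]},
    "pkg3": {"2.0.1": ["server1", "server2"]},
}
multi, missing = find_out_of_sync(packages, hosts)
print(multi)
print(missing)
```

As in the function above, a package that has several versions is reported only in the multiple-versions group, even if some server lacks it entirely.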

Putting it to use

After saving your fabfile, you can run the command from a shell. Just make sure that you are in the same directory as your fabfile:

fab check_package_versions

When it finishes, you’ll see something like:

Getting packages on
Getting packages on
Getting packages on
Getting packages on

Packages out-of-sync:

Multiple versions found of these packages:
  -e git+[email protected]f4be:,
  -e git+[email protected]4867:,

These packages are missing on these servers:

Got the gist?

The example code is available as a GitHub gist.
