Our infrastructure automation is driven by Puppet, so this post is mainly going to talk about Puppet – however the key problem we have (and issues we’re solving) is equally relevant for most other current configuration management tools (such as Chef). One of the key challenges for configuration management systems is determinism – i.e. being able to rebuild the same system in the same way.
In a ‘traditional’ world view, the ‘system’ means an individual machine – however, in the real world, there are very few cases where a new production system can be brought on-line with only one component. For resiliency (if not scalability) purposes you probably want to have more than one machine able to fulfil a particular role so that a single hardware failure won’t cause a system wide outage.
Therefore your system consists of more than one machine – whilst the current crop of configuration management tools can be deterministic for a single machine, they’re much less deterministic when you have inter-machine dependencies.
Beyond individual per-application clusters of servers, you want the monitoring of your entire infrastructure to be coupled to the systems running inside it. That is, you shouldn't have to duplicate any effort: when you add a new web server to the pool serving an application, you expect the monitoring of that server to adjust so that the new service becomes monitored automatically.
In the puppet ecosystem, the traditional solution to this is exported resources. In this model, each host running puppet ‘exports’ a set of data about the system under management (for instance nagios checks), and then other hosts can ‘collect’ these resources when they run puppet.
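As a sketch of that pattern (using Puppet's standard exported-resource syntax, with illustrative resource titles): each web server exports a Nagios check describing itself, and the monitoring host collects everything that has been exported:

```puppet
# On each managed web server: export (@@) a check for this host.
# Nothing is realised here; the data is just stored for later collection.
@@nagios_service { "check_http_${::fqdn}":
  check_command       => 'check_http',
  host_name           => $::fqdn,
  service_description => 'HTTP',
  use                 => 'generic-service',
}

# On the monitoring host: collect every exported nagios_service
# resource from all other nodes' catalogs.
Nagios_service <<| |>>
```

Note the ordering implication: the monitoring host only picks up a check after the web server has had a puppet run to export it.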
Traditionally this was not very scalable, although this has largely been addressed with the introduction of PuppetDB. It was also difficult to arrange things such that you could get the set of resources you wanted onto the host you wanted – with the newer versions of PuppetDB this issue is ameliorated with the introduction of a more flexible query interface.
All of these advancements have been great progress, and kudos to Puppet Labs for doing much-needed work in this area. However, stepping back from the immediate problems, my team and I have come to consider exported resources the wrong solution for the problems they are commonly used to solve.
Exported resources introduce puppet run-order dependencies: in order to reach the correct state, puppet must run on some machines before it runs on others. The implication is that this management method is a convergent system, as the system can reach its final state by more than one route. Any system which relies on convergence is complicated, because it's very hard to know whether you have converged to the end state – or whether you ever will.
The key issue is, of course, determinism: if host A exports resources to host B, then the order in which you build host A and host B matters, making them co-dependent and non-deterministic. If you're rolling out an entirely new environment, this likely means running puppet again and again across the machines until things appear to stop changing – and this is only _apparent_ convergence, rather than proven convergence.
We can go some way towards solving this by forcing the order in which machines are provisioned (or in which puppet runs on them). We wrote 'puppet roll', a tool which executes puppet on hosts in order according to a dependency graph. But this is the wrong problem to be solving: eliminate provisioning-order dependencies and we eliminate a difficult and brittle problem altogether.
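The ordering problem such a tool has to solve is essentially a topological sort over the host dependency graph (this is a generic sketch, not puppet roll's actual internals):

```ruby
# Sketch: given a map of host => [hosts that must run puppet first],
# produce a run order that respects every dependency, failing on cycles.
def run_order(deps)
  order = []
  visited = {}
  visit = lambda do |host|
    state = visited[host]
    raise "dependency cycle at #{host}" if state == :visiting
    return if state == :done
    visited[host] = :visiting
    (deps[host] || []).each { |dep| visit.call(dep) }
    visited[host] = :done
    order << host
  end
  deps.keys.each { |h| visit.call(h) }
  order
end
```

For example, a load balancer that collects resources exported by two app servers must run after both of them – and the whole scheme breaks down if the graph ever contains a cycle.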
In recent work, we have rejected the traditional exported-resources anti-pattern and instead created a model of 'our system' entirely outside puppet. This means we can build a model of the entire system which contains no mutable state. We wire this model up to puppet to generate ENC (external node classifier) data for each machine. All the data needed to define a machine's state is supplied by the ENC, meaning that machines can be built in any order, each in exactly one pass.
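A minimal sketch of the idea (all names here are invented for illustration – the real tooling is richer): because the model knows every machine up front, it can compute the complete ENC data for each host, including the load balancer's realserver list, without any host exporting data to another:

```ruby
# Sketch: derive per-host ENC data from a single immutable model.
# Hostname and naming conventions are illustrative assumptions.
def enc_for(env, app, instances, location)
  vip = "#{env}-#{app}-vip.#{location}.net.local"
  servers = (1..instances).map { |n| format("#{env}-#{app}-%03d.#{location}.net.local", n) }

  enc = {}
  servers.each do |host|
    enc[host] = {
      'role'        => 'http_app_server',
      'environment' => env,
      'application' => app,
      'vip_fqdn'    => vip,
    }
  end

  # The load balancer's ENC is derived from the same model, so the
  # realserver list requires no exported resources from the app servers.
  enc["#{env}-lb-001.#{location}.net.local"] = {
    'role' => 'loadbalancer',
    'virtual_servers' => {
      vip => { 'type' => 'http', 'realservers' => servers },
    },
  }
  enc
end
```

Every host's classification exists before any puppet run happens, which is exactly what makes build order irrelevant.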
This entirely removes the key problems with determinism, convergence, multiple puppet runs and so on. In our experience it also, in many cases, radically simplifies things. Whereas previously we would have bent things to fit the model offered by exported resources, we can now write our business-specific logic in our own model layer – meaning that we can represent things as they should naturally be modelled.
The thing we like most about puppet for individual systems and services is its declarative, model driven nature – thus we’ve tried to replicate something with a similar ‘feel’ at the whole system level.
Given this (somewhat simplified) description of a service:
```ruby
stack 'example' do
  virtual_appserver 'exampleapp', :instances => 2
  loadbalancer :instances => 2
end

env 'ci', :location => 'dc1' do
  instantiate_stack 'example'
end

env 'production', :location => 'dc2' do
  instantiate_stack 'example'
end
```
The (again somewhat simplified) ENC generated for the 2 application servers, looks like this:
```yaml
---
role: http_app_server
environment: ci
application: 'exampleapp'
vip_fqdn: ci-exampleapp-vip.dc1.net.local
```
The ENC for the 2 load balancers look like this:
```yaml
---
role: loadbalancer
virtual_servers:
  ci-exampleapp-vip.dc1.net.local:
    type: http
    realservers:
      - ci-exampleapp-001.dc1.net.local
      - ci-exampleapp-002.dc1.net.local
```
This sort of configuration eliminates the need for a defined puppet run-order (as each host has all the details it needs to configure itself from the model – without any data being needed from other hosts), and goes a long way towards achieving the goal of complete determinism.
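Mechanically, Puppet's ENC hook is just an executable that receives the node's certname and prints YAML on stdout. A minimal sketch, assuming the pre-computed model is available as a hash keyed by hostname (the hash contents here are illustrative):

```ruby
#!/usr/bin/env ruby
# Minimal ENC sketch: look the node up in the pre-computed model and
# emit its classification as YAML. MODEL is an illustrative stand-in
# for data generated from the system model.
require 'yaml'

MODEL = {
  'ci-exampleapp-001.dc1.net.local' => {
    'classes'    => ['role::http_app_server'],
    'parameters' => { 'vip_fqdn' => 'ci-exampleapp-vip.dc1.net.local' },
  },
}

def classification_yaml(certname)
  # Unknown nodes get an empty classification rather than an error.
  YAML.dump(MODEL.fetch(certname, {}))
end

puts classification_yaml(ARGV[0]) if __FILE__ == $0
```

Puppet is pointed at a script like this via the `node_terminus = exec` / `external_nodes` master settings; the key property is that the script only reads from the model and never from other hosts.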
The example shows a traditional load-balancer-to-web-server dependency, but the same technique can be (and is) applied in our code wherever we have clusters of servers of the same type that need to inter-communicate – for example RabbitMQ, Ehcache and Elasticsearch clusters.
If you haven’t guessed yet, as well as being theoretically correct, this approach is extremely powerful in practice:
- We’re able to smoke test our puppet code in a real integration environment.
- We can provision entire clusters of servers for QA or development purposes with 1 line of code and a < 10 minute wait.
- We’ve used this system to build our newest production applications.
- We can rebuild an entire app environment during scheduled maintenance.
- We can add servers or resources to an application cluster with a 1 line code change and 1 command.
We’ve still got a lot of work to do on this system (and on our internal applications and puppet code before everything fits into it), but it is already quite obviously a completely different – and superior – model to traditional convergence for how to think about, and use, configuration management tools across many servers.
- Why Order Matters: Turing Equivalence in Automated Configuration Management (USENIX LISA 2002)