Introducing Orc and its agents

This post is the third in a series, documented here (part 1) and here (part 2).
Orc at a high level

In the previous post we discussed the Application Infrastructure Contracts. These contracts mean that new applications can be deployed to production with minimal effort because the infrastructure tools can make assumptions about their behaviour. In this post we will discuss the tool-set that leverages this.

Orc is split into three main components (or types of component): the central orchestration tool, its agents, and its model (the CMDB). The diagram above shows this clearly.

Each box must run an agent that is capable of “auditing” itself. This means it should be able to report whether an application is running or not, what version it is on, whether it is in or out of the load balancer pool, whether it is stoppable, and so on. This audit information differs for different types of component; for a database component, for example, there is no concept of stoppable.
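As a rough illustration, an agent's audit report for one application instance might look like the following. This is a hypothetical sketch; the field names are illustrative, not Orc's actual schema.

```python
def audit():
    """Report the observed state of this box's application instance.

    A hypothetical example -- a real agent would inspect the process,
    the load balancer, and the installed package to build this.
    """
    return {
        "application": "myapp",   # illustrative application name
        "running": True,          # is the process up?
        "version": "5",           # currently installed version
        "participating": True,    # in the load balancer pool?
        "stoppable": True,        # safe to stop right now?
    }
```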

Orc is a model-driven tool. It continually audits the environment by sending messages to all nodes, and each node responds with its current state. Orc then compares the information retrieved from the audit with its model (the CMDB) and decides, for each node, what action to take, if any. Orc reviews each action for conformance with its policies and removes any illegal actions (such as removing all instances from the load balancer). It then sends messages to the agents to perform the intended actions. Currently we have one simple policy: “never leave zero instances in the load balancer”.
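One pass of that loop can be sketched as below. This is a minimal sketch with hypothetical helper names, not the real tool's API; the point is that policies veto proposed actions rather than reorder them.

```python
def never_empty_pool(node, action, audits):
    """The one current policy: never leave zero instances in the pool."""
    if action != "disable_participation":
        return True
    in_pool = sum(1 for s in audits.values() if s["participating"])
    return in_pool > 1  # blocking the action keeps at least one instance in


def tick(model, audits, decide, policies=(never_empty_pool,)):
    """One audit/compare/act pass: decide an action per node from its
    audited state and desired state, then drop any action a policy vetoes."""
    actions = {}
    for node, state in audits.items():
        action = decide(state, model[node])  # e.g. "upgrade", "disable_participation"
        if action and all(p(node, action, audits) for p in policies):
            actions[node] = action
    return actions
```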

The final component is the model of the desired state of the world (the CMDB). It records which applications should be on which versions and whether they should be participating in the load-balanced service. This is currently a simple YAML file in a git repository. Having it in git gives us a desirable side-effect: changes to the CMDB are themselves audited, via the commit history.
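A CMDB entry for one application might look something like this. The key names here are illustrative, not the actual schema:

```yaml
# Hypothetical CMDB fragment: desired version and pool membership per host.
myapp:
  production:
    - host: HostA
      version: "6"
      participating: true
    - host: HostC
      version: "5"
      participating: false
```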

So now we have a way to audit the actual state of the world (via our agents), and we know what the world is supposed to look like (our CMDB). All that is left is to execute the upgrade steps in the correct order without breaking our policies (well, the one policy at the moment).

This diagram explains the transitions that Orc makes given any particular state of any one instance:
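The per-instance transition logic the diagram describes can be sketched as a single decision function. The state fields are assumptions carried over from the earlier audit example; the real transition table may differ.

```python
def next_action(state, desired):
    """Decide the next step for one instance: an instance on the wrong
    version leaves the pool, upgrades, then rejoins."""
    wrong_version = state["version"] != desired["version"]
    if wrong_version and state["participating"]:
        return "disable_participation"  # leave the pool before upgrading
    if wrong_version:
        return "upgrade"                # out of the pool: safe to install
    if state["participating"] != desired["participating"]:
        return ("enable_participation" if desired["participating"]
                else "disable_participation")
    return None  # in line with the CMDB: nothing to do
```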

So if the world started out looking like this:

Then HostA and HostB would be instructed to upgrade (install) to version 6, as they are on the wrong version and are not participating. The action for HostC and HostD would be to disable participation, but this is currently blocked because it would violate the policy (there must be at least one instance in the load balancer).

Then participation is enabled for HostA and HostB; HostC and HostD remain blocked, awaiting more instances in the load balancer.

Finally, HostA and HostB are now in line with the CMDB, so no further action is taken for them. This leaves the final action of disabling participation on HostC and HostD.

This concludes our introduction. Future posts will cover further work on Orc, particularly respecting component dependencies and performing database upgrades.