This Monday and Tuesday a few of us went to DevOpsDays London 2013 Fall.
We asked each attendee for highlights, and this is what they had to say about the conference:
Francesco Gigli: Security, DevOps & OWASP
There was an interesting talk about security and DevOps, with a follow-up during one of the open sessions.
We discussed capturing security-related work in user stories, or rather “Evil User Stories”, and the use of anti-personas as a way to keep malicious users in mind.
OWASP, which I did not know before DevOpsDays, was also mentioned: it is an organization “focused on improving the security of software”. One of the resources they make available is the OWASP Top 10, a list of the most critical web application security flaws. Very good for awareness.
Tom Denley: Failure Friday
I was fascinated to hear about “Failure Fridays” from Doug Barth at PagerDuty. They take an hour out each week to deliberately fail over components that they believe to be resilient. The aim is not to take down production, but to expose unexpected failure modes in a system that is designed to be highly available, and to verify the operation of the monitoring/alerting tools. If production does go down, better that it happens during office hours, when staff are available to make fixes, and in the knowledge of exactly what event triggered the downtime.
Jeffrey Fredrick: Failure Friday
I am very interested in the Failure Fridays. We already do a Failure Analysis for our application where we identify what we believe would happen with different components failing. My plan is that we will use one of these sessions to record our expectations and then try manually failing those components in production to see if our expectations are correct!
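As a rough sketch of how the notes from such a session might be kept, the Python below records an expectation per component before the exercise and flags any component whose observed behaviour differs. The class name, field names, and the example components are all hypothetical; none of this describes PagerDuty's process or any tooling we have today.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FailureExperiment:
    """One component we plan to fail on purpose (illustrative structure)."""
    component: str
    expected: str                   # written down before the session
    observed: Optional[str] = None  # filled in during the session

    def matches(self) -> Optional[bool]:
        # None means the experiment has not been run yet.
        if self.observed is None:
            return None
        return self.observed == self.expected


def surprises(experiments: List[FailureExperiment]) -> List[FailureExperiment]:
    """Return only the experiments whose outcome contradicted the expectation."""
    return [e for e in experiments if e.matches() is False]


# Hypothetical components and outcomes, for illustration only.
session = [
    FailureExperiment("cache node", "requests fall back to database",
                      "requests fall back to database"),
    FailureExperiment("message queue", "jobs retry automatically",
                      "jobs are dropped"),
    FailureExperiment("secondary database", "no user impact"),
]

for e in surprises(session):
    print(f"{e.component}: expected '{e.expected}', observed '{e.observed}'")
```

Keeping the expectation separate from the observation makes the mismatches, which are the interesting output of the session, easy to list afterwards.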
Mehul Shah: Failure Fridays & The Network – The Next Frontier for Devops
I very much enjoyed DevOpsDays. Apart from the fact that I won an HP Slate 7 in the HP free raffle, I drew comfort from the fact that ‘everyone’ is experiencing the same or similar problems to us, and it was good to talk and share that stuff. It felt good to understand that we are not far from what most people are doing: emphasizing strong DevOps communication and collaboration. I really enjoyed most of the morning talks, in particular Failure Fridays and The Network – The Next Frontier for Devops, which was all about creating a logically centralized program to control the behaviour of an entire network. This should make networks easier to configure, manage and debug. We are doing some cool stuff here at TIM Group (at least from my standpoint), but I am keen to see if we can work toward this as a goal.
Waseem Taj: Alerting & What science tells us about information infrastructure
At the open space session on alerting, there was a good discussion on adding context to alerts. One of the attendees mentioned that each alert they get has a link to a page that describes the likely business impact of the alert (why we think it is worth getting someone out of bed at 3am), a runbook with typical steps to take, and the escalation path. We have already started documenting how to respond to Nagios alerts. I believe expanding that to include the perceived ‘business impact of the alert’, and integrating it with Nagios, will be most helpful in moments of crisis in the middle of the night, when the brain just does not want to cooperate.
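To make the idea concrete, here is a minimal Python sketch of the kind of context that could sit behind each alert: business impact, runbook steps, and escalation path. Everything in it is an assumption made for illustration (the `AlertContext` class, the `ALERT_CONTEXT` registry, the check name, and the example text); it is not how the attendee's system or our Nagios setup actually works.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AlertContext:
    """Context attached to one alert (all fields are illustrative)."""
    business_impact: str
    runbook_steps: List[str] = field(default_factory=list)
    escalation_path: List[str] = field(default_factory=list)


# Hypothetical registry keyed by check name.
ALERT_CONTEXT: Dict[str, AlertContext] = {
    "db_replication_lag": AlertContext(
        business_impact="Reports may show stale data to clients.",
        runbook_steps=["Check replica status", "Restart replication if stopped"],
        escalation_path=["on-call engineer", "database owner"],
    ),
}


def describe_alert(check_name: str) -> str:
    """Render the context for an alert as text for a notification or wiki page."""
    ctx = ALERT_CONTEXT.get(check_name)
    if ctx is None:
        # Make undocumented alerts visible instead of failing silently.
        return f"{check_name}: no context documented yet"
    lines = [check_name, f"Business impact: {ctx.business_impact}", "Runbook:"]
    lines += [f"  {i}. {step}" for i, step in enumerate(ctx.runbook_steps, start=1)]
    lines.append("Escalation: " + " -> ".join(ctx.escalation_path))
    return "\n".join(lines)


print(describe_alert("db_replication_lag"))
```

The fallback message for unknown checks is a deliberate choice: it shows at a glance which alerts still lack a documented business impact.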
The talk by Mark Burgess on ‘What science tells us about information infrastructure’ certainly had the intended impact on me: I will be reading his new book on the subject.