Every time I manage a change to a customer network I have a chance to taste the many shades of possible IT Operations maturity levels. I collected some best practices over the years about how to reduce risk and speed-up the change and testing process. I’ll share some in this post. Improvements and suggestions are welcome in the comments of the post or on my Twitter account.
This story starts with a phone call at night. If you worked in IT long enough you know what it means. Customer’s HQ network is down and since the day before I’ve replaced a pair of data center switch in a remote site I’m somehow involved based on the well-known principle “last one who made changes is responsible”. I state that all the facts took place with my telephone support, without any remote access to the machines.