How to define ‘Emergency change’

Posted: June 10, 2010 in IT Service Management

These days, most companies rely on IT and IT infrastructure to support part or all of their business. The need to fix major incidents quickly is therefore becoming more and more pressing, since companies tend to lose money when services are down.

Most companies also have change and release control in place, which balances stability vs flexibility in their organisation.

Emergency changes and the emergency change process (ECP) are what reconcile the urgent need for an immediate fix (read: change) with the control and management required for a normal change.

In an outsourced environment, a few more parameters complicate the matter. Who will authorise the emergency change? How is it charged?

I’ve seen several companies that have defined the ECP as ‘fix now, do all change steps later’. This is (almost) always a certain path to disaster. Furthermore, there is no framework or best practice that tells you to do this… And you are opening the door for a specialist to change the IT environment whenever they feel like it, (ab)using the emergency change as the means to get things done.

So, how to cope with this?
The easiest way to deal with this is to formalize the emergency change process for each critical service or system. This ECP lists the criteria to invoke it, who will be on the crisis team (including a contact list with backup/escalation), and what steps are to be taken (e.g. minimal testing requirements). These are usually defined when a new service or system is released.
The biggest advantage is that there is little room for discussion… The rules are set (and agreed with the business).
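
To make this concrete, here is a minimal sketch of what such an ECP definition could look like if you recorded it as data. Everything in it is made up for illustration: the class and field names, the example service and the criteria are assumptions, and so is the rule that one agreed criterion is enough to invoke the process. Your own agreement with the business may be stricter.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    name: str
    phone: str
    backup: str  # who to escalate to if this person cannot be reached


@dataclass
class EmergencyChangeProcess:
    service: str
    invocation_criteria: list[str]  # conditions agreed with the business
    crisis_team: list[Contact]
    required_steps: list[str]       # e.g. minimal testing requirements

    def may_invoke(self, observed: set[str]) -> bool:
        """Assumed rule: at least one agreed criterion must be observed."""
        return any(c in observed for c in self.invocation_criteria)


# Hypothetical ECP for a hypothetical critical service.
ecp = EmergencyChangeProcess(
    service="webshop",
    invocation_criteria=["service down", "data loss in progress"],
    crisis_team=[Contact("On-call engineer", "555-0100", backup="Service manager")],
    required_steps=["smoke test on staging", "rollback plan written down"],
)
```

With the criteria written down like this, `ecp.may_invoke({"service down"})` holds, while a request that matches none of the agreed criteria does not qualify and has to go through normal change.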

After the change has been implemented, it needs to be formally handed over to the CAB, in order to:
– make sure everyone is aware of exactly what was changed
– establish the impact of the implemented change on other systems and/or services
– establish the impact of the implemented change on other ongoing changes/releases
– identify what could have been done to avoid this (lessons learned)
– confirm the current version/state of the production environment
– assess whether the emergency change was really needed
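
The handover points above can be treated as a simple checklist. The sketch below is only an illustration: the item names and the `outstanding_items` helper are hypothetical, not part of any ITSM tool. It reports which points the handover to the CAB has not yet covered.

```python
# Hypothetical keys, one per handover point in the list above.
HANDOVER_ITEMS = [
    "what_was_changed",
    "impact_on_other_systems",
    "impact_on_ongoing_changes",
    "lessons_learned",
    "current_production_state",
    "emergency_justified",
]


def outstanding_items(handover: dict[str, str]) -> list[str]:
    """Return the checklist items that are missing or left blank."""
    return [item for item in HANDOVER_ITEMS if not handover.get(item, "").strip()]


# Example: only the first point has been documented so far.
todo = outstanding_items({"what_was_changed": "patched the database driver"})
```

The emergency change is only ready to be closed once `outstanding_items` returns an empty list.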

I also advise launching a problem (after the implementation) to make sure the root cause analysis (RCA) of the incident is really handled.
