
Disaster Recovery: Famous Last Words

A quick scan of search engine results and tweets about developing business continuity and disaster recovery plans turns up a large number of results to research. In fact, there seems to be an overwhelming bias towards the planning portion of these two areas. This bias is not unique to them; it can be seen in almost any area. I think one reason for it is that few organizations actually make the investment to create credible plans. Still, that’s not what this post is about. This post is about what happens to those organizations that invest resources in creating their plans and then stop.

Why is this a problem? It’s just like the old joke/saying:

In theory, theory and practice are the same. In practice, they’re actually different.

Anonymous

Whether or not we think that’s funny, it’s pretty accurate. When something goes wrong during the execution of a plan, someone will undoubtedly say something along the lines of “well, that wasn’t supposed to happen like that”. Why something didn’t go “according to plan” is irrelevant when you’re in the heat of a response effort. You just know that it didn’t work the way it was supposed to. That’s why I’m calling this post “Famous Last Words”.

Let’s consider this in some more detail.

One basic tenet in the development of business continuity and disaster recovery plans is this: all planning is based upon assumptions. To build a credible plan, you have to make some assumptions during its development. There’s no way around it. No organization has the resources required to mitigate and respond to all of the potential risks that it faces. Because resources are limited, choices have to be made as to where those resources will be invested. Typically this will be where a risk is most likely to occur and would have the highest impact on the organization.
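One common way to make that trade-off explicit is to score each risk by likelihood times impact and fund mitigations in that order until the budget runs out. The sketch below is only an illustration of that idea: the `Risk` class, the `prioritize` function, the greedy funding rule and every number in it are my own assumptions, not figures from a real plan.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: float       # assumed annual probability, between 0 and 1
    impact: float           # assumed cost to the organization if it occurs
    mitigation_cost: float  # assumed cost to mitigate

    @property
    def exposure(self) -> float:
        # Simple expected-loss score: likelihood multiplied by impact.
        return self.likelihood * self.impact


def prioritize(risks, budget):
    """Fund mitigations in order of exposure, skipping any that no longer fit."""
    funded, remaining = [], budget
    for risk in sorted(risks, key=lambda r: r.exposure, reverse=True):
        if risk.mitigation_cost <= remaining:
            funded.append(risk.name)
            remaining -= risk.mitigation_cost
    return funded


# Hypothetical risk register, for illustration only.
risks = [
    Risk("regional power outage", 0.10, 500_000, 40_000),
    Risk("storage array failure", 0.30, 200_000, 30_000),
    Risk("data center flood", 0.01, 2_000_000, 60_000),
]
funded = prioritize(risks, budget=80_000)
print(funded)
```

Every input here is an assumption, which is exactly the point: the risks left unfunded are the ones your plan is quietly assuming will not happen.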

In systems modeling and development, we use models quite often, because they provide a useful platform for exploring potential designs and testing those against a customer’s explicit requirements. In many cases, we can actually “execute the model” and see how well it holds up logically. Because of this, we get to ask a number of interesting questions about the system in question:

  • Does it perform the way that we expected it to?
  • Were the interactions what we had planned on?
  • Did we have all the information/resources we required?

The nice thing about an approach like this is that it allows us to check our thinking and explore our options prior to committing to development of the system. In essence, it’s a bit of an insurance policy against getting requirements wrong. Sometimes we find that the model does not hold up and the execution will fail. This may be because a system element failed, an expectation wasn’t met or there was a logical error in the construction of the model. For the most part, we are not unhappy to run into these boundary conditions. In fact, that’s why we do the simulations! We want to uncover what the boundary conditions are and then take steps to address those.
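To make “executing the model” a little more concrete, here is a minimal sketch that treats a plan as an ordered list of steps, each with things it requires and things it provides, and walks through them until a precondition is not met. The `execute_model` function, the step names and the resources are all hypothetical; real modeling tools are far richer than this.

```python
def execute_model(steps, initial_state):
    """Walk the steps in order and stop at the first unmet precondition."""
    state = set(initial_state)
    for name, requires, provides in steps:
        unmet = set(requires) - state
        if unmet:
            # A boundary condition: the model cannot proceed past this step.
            return False, f"{name}: missing {sorted(unmet)}"
        state |= set(provides)
    return True, "all steps completed"


# Hypothetical recovery model: (step name, requires, provides).
steps = [
    ("declare incident", [], ["incident declared"]),
    ("restore database", ["incident declared", "offsite backup"], ["database online"]),
    ("restart application", ["database online"], ["service restored"]),
]

# The run succeeds when the assumed backup exists and fails when it does not.
ok, detail = execute_model(steps, initial_state=["offsite backup"])
broken_ok, broken_detail = execute_model(steps, initial_state=[])
print(ok, detail)
print(broken_ok, broken_detail)
```

The second run fails at the restore step, which is the kind of result we are happy to find in a simulation rather than during a real response.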

Unfortunately, most of the plans that we get the chance to see don’t get the same level of attention. In many organizations, the first time that a plan is actually executed is in response to an event. Finding out that your plan is insufficient to meet its intended purpose in the middle of your response can be deadly to the health, well-being and future of your organization. This is why we recommend that all such plans be regularly reviewed and evaluated.

There are many potential methods for accomplishing this evaluation. They can be as simple as informal tabletop exercises or as complicated as a full-scale dress rehearsal involving actual execution of the plans. For the most part, when people think about exercising their plans, they tend to think of the full-scale activities first and minimize (or disregard) the value that can be obtained from conducting a well-designed tabletop exercise. Doing this well requires specialized knowledge and experience. It can be obtained by hiring a practitioner, developing a resource internally or retaining a services firm to assist in the development and execution of your exercise. The latter two methods can be very cost effective and provide significant benefit when properly planned.

As you’re going through the various exercises, it is worthwhile to consider questions such as:

  • What metrics are you tracking that can help you gauge how well your response effort is going?
  • What did not go according to plan?
  • What did you assume that you’d have access to that you didn’t?
  • How were the conditions different from what was specified during the planning process?
  • Were all of the team members familiar enough with their roles and what was expected to function effectively?
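A few of these questions lend themselves to simple record keeping during an exercise. The sketch below compares what the plan assumed against what was observed, both for step timings and for resources. The `StepResult` class, the `review` function, the field names and the sample figures are all illustrative assumptions on my part, not a standard or a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    step: str
    target_minutes: int  # what the plan assumed
    actual_minutes: int  # what the exercise measured


def review(results, assumed_resources, available_resources):
    """Summarize where the exercise departed from the plan."""
    overruns = [r.step for r in results if r.actual_minutes > r.target_minutes]
    missing = sorted(set(assumed_resources) - set(available_resources))
    return {"overruns": overruns, "missing_resources": missing}


# Hypothetical observations from a tabletop exercise.
results = [
    StepResult("notify response team", target_minutes=15, actual_minutes=40),
    StepResult("fail over to secondary site", target_minutes=120, actual_minutes=95),
]
summary = review(
    results,
    assumed_resources=["vendor support contact", "current network diagram"],
    available_resources=["vendor support contact"],
)
print(summary)
```

Even a summary this small gives you something concrete to discuss afterwards: one step ran long, and one thing everyone assumed would be on hand was not.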

Indeed, the number of questions that one could potentially ask is about as large as the number of potential risks one could focus on mitigating or responding to. This is another reason why it’s extremely worthwhile to have an outside resource review your plan. In doing so, you are more likely to uncover costly assumptions and gain a more robust evaluation of your plans. Outside eyes are much more likely than an internal resource to ask critical questions, identify plan weaknesses/limitations and recommend strategies for improvement. Why is that? The main reason is that an outside resource will not walk in “already knowing” how “things really work around here”. They will evaluate and make recommendations about your plan and response capability without internal bias.

Needless to say, answering the above questions, reviewing your plan and actually exercising it can give you the feedback you need to:

  • Uncover hidden risks, assumptions and failures in logic;
  • Identify what you need to do to prepare to execute the plan;
  • Help ensure that your actual response efforts are as effective as possible; and
  • Gather valuable ideas for the continual evolution and improvement of your plan.

If you would like to find out more about the topic or find out how I might be able to assist you with your business continuity or disaster recovery needs, please feel free to reply to this post and start a discussion or contact me.


Posted In: Data Center Management
Tags Used: Business Continuity, Disaster Recovery, Preparedness