Does the term ‘warm site’ leave you cold? Are you baffled by ‘redundancy’? And when your IT consultant mentions your RPO do you just grunt and nod your head knowingly, when really you don’t have a clue? You’re not on your own.
When it comes to disaster recovery terminology many small business owners and employees are left in the dark. But you need to rectify this. Your company relies on its IT system to be able to operate; for that reason, creating a disaster recovery plan is a necessity – and you can’t do this properly until you know what you are talking about. So, to help, this article will explain some of the key terms and acronyms used in disaster recovery planning.
Disaster
You can’t begin to have a discussion about disaster recovery until you know what a disaster is. In IT parlance, a disaster is when your system goes down. This can include your website going offline, the loss of email, your manufacturing software not working, or any other critical application ceasing to operate.
The causes are as numerous as the problems they create: hacking, faulty wiring, fire, flood, a corrupted hard drive, software conflict or human error.
Disaster Recovery
Frequently referred to using the acronym DR, disaster recovery is the process of getting your system fully working again. This involves restoring your data, IT equipment, software applications and any other technical resources on which your operations depend.
Disaster Recovery Plan
Unsurprisingly referred to in the trade as a DRP, a disaster recovery plan is a highly detailed plan of action to get your system back online if it goes down. It should give full details of who does what, when, why and how as well as containing all necessary documentation, lists of equipment needed and where to find it. For a full understanding of what’s required, read our post, ’10 Tips for an Effective Disaster Recovery Plan.’
Other disaster recovery terminology – in alphabetical order
Application Recovery
Application recovery is the process of restoring your business system software and data. This is done after restoring your hardware and operating system.
Business Continuity
Your disaster recovery plan should be an element of an overarching business continuity plan. Business continuity looks at how you continue to keep your business operating in any disruptive situation, whether that’s a system failure, industrial action, severe weather or anything else.
Business Impact Analysis
In order to make informed decisions when drafting your disaster recovery plan, you need to have a thorough understanding of how a disaster will impact upon your business. Undertaking a business impact analysis will help you prioritise the different applications you need to bring back online and give you an idea of how long you can afford to be offline for. It will also help you develop contingency plans for business continuity whilst disaster recovery is taking place.
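As a rough illustration of how the output of a business impact analysis can drive prioritisation, the sketch below ranks applications for recovery. The application names come from the examples earlier in this article; the cost and downtime figures, the `Application` class and the `recovery_order` function are all made-up assumptions for illustration, not a standard method.

```python
from dataclasses import dataclass


@dataclass
class Application:
    name: str
    hourly_cost_of_downtime: float       # hypothetical estimate, in pounds
    max_tolerable_downtime_hours: float  # hypothetical estimate


def recovery_order(apps):
    """Return applications in the order they should be brought back online.

    Applications that can tolerate the least downtime come first;
    ties are broken by the higher hourly cost of being offline.
    """
    return sorted(
        apps,
        key=lambda a: (a.max_tolerable_downtime_hours, -a.hourly_cost_of_downtime),
    )


# Illustrative figures only; a real analysis would supply its own.
apps = [
    Application("email", 200.0, 8.0),
    Application("website", 1500.0, 1.0),
    Application("manufacturing software", 900.0, 4.0),
]

for position, app in enumerate(recovery_order(apps), start=1):
    print(position, app.name)
```

The smallest tolerable downtime in the list also gives you a starting point for the recovery time you will need to plan for.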
Disaster Recovery Site
A disaster recovery site, also referred to as a secondary site, is essentially a replica of your data centre, containing backup hardware, applications and data that can be brought into action if your data centre cannot function. See ‘Hot Sites, Warm Sites and Cold Sites’ below.
Disaster Recovery Team
This is the team of individuals who are responsible for bringing your system back online. Every member should be listed in your DRP, together with their contact details and each one should have a clearly defined role. The team may include both internal and external members, so besides employees, you may have software developers, web host technical support and other consultants.
High Availability
In simple terms, high availability refers to a system that is capable of staying online almost all of the time. By that, we mean that the system will continue to process and function for 99.99% of the time or more. High availability can be achieved by using redundant components which can be brought into service if there is a failure. One of the best ways to achieve high availability is to utilise cloud hosting.
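To put an availability percentage into perspective, it helps to convert it into the downtime it permits. The short sketch below does that arithmetic for a non-leap year; the function name `allowed_downtime_minutes` is our own, and the 99% and 99.9% figures are included purely for comparison.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def allowed_downtime_minutes(availability_percent, period_hours=HOURS_PER_YEAR):
    """Return the minutes of downtime permitted at a given availability level."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * period_hours * 60


for pct in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(pct)
    print(f"{pct}% availability allows {minutes:.1f} minutes of downtime per year")
```

At 99.99%, the figure works out at a little under an hour of downtime across the whole year.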
Hot Sites, Warm Sites and Cold Sites
Disaster recovery sites come in three different forms: hot sites, warm sites and cold sites. Here’s an explanation of what they are so you can see the differences between them.
A hot site is essentially a fully operational replica of your data centre to which you can switch your operations in the event of a disaster, preventing your mission-critical services from going offline. It is the best method for ensuring business continuity but, as you have to keep the centre up and running, it is also the most expensive. A more cost-effective solution can be achieved by creating a cloud-based hot site which can utilise pay-as-you-go, scalable resources in the event of a disaster.
A warm site is one where you have hardware installed and preconfigured in case of a disaster, but to keep costs down you do not install software or data. In the event of a disaster, software and data backups need to be installed before recovery can be completed. Again, a cloud-based warm site can be a more cost-efficient solution.
A cold site is basically just a data centre enabled space, providing power, network connectivity, air conditioning, telephone lines, etc. In the case of a disaster, you would need to install the hardware, software and data before recovery could take place. It’s the cheapest solution, but the one that would take the longest to bring back online.
Mission Critical
One of the most bandied-about terms in IT phraseology, ‘mission critical’ refers to an application without which your business would not be able to function. Think of TomTom losing GPS, Heathrow losing air traffic control or Amazon losing its payment facilities.
Recovery Time Objective (RTO)
One of the key things you should find out from your business impact analysis is how long your business can realistically accept its applications being offline. From this, you can then set your Recovery Time Objective (RTO) which is the maximum time you will allow for disaster recovery to take place. Obviously, the quicker you can recover, the sooner you can be back in business, but this may have an impact on your Recovery Point Objective (see below).
Recovery Point Objective (RPO)
If you are familiar with a PC or a laptop, the best way to understand Recovery Point Objective is to think about it in the same way as a system restore point. RPO is about restoring your system and, in particular, its data, as it was at a certain point in time. It is usually expressed as a length of time: the maximum age of data you can afford to lose, measured back from the moment of the disaster.
From your business impact analysis, you should have a good understanding of the point in time you need to restore to. For example, if you are an e-commerce business, you may need to restore right up to the moment that your website went offline in order not to lose track of any orders that were taken.
Knowing what your needs are and setting your RPO will affect two important things: firstly, your backup process and, secondly, your RTO. This is because, if you need to restore all the way up to the moment of disaster, you will need backups taking place constantly, and this is likely to increase the amount of time you need to fully restore your system.
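The relationship between backups, RPO and RTO can be sketched as two simple checks. This is a minimal illustration under stated assumptions: the worst-case data loss is taken to be the full gap between backups (a disaster striking just before the next backup runs), and the function names and all the time figures are hypothetical.

```python
from datetime import timedelta


def meets_rpo(backup_interval, rpo):
    """True if the worst-case data loss stays within the RPO.

    Assumes the worst case is a disaster just before the next backup,
    so everything since the previous backup is lost.
    """
    return backup_interval <= rpo


def meets_rto(estimated_restore_time, rto):
    """True if the estimated time to restore fits within the RTO."""
    return estimated_restore_time <= rto


# Hypothetical targets and schedules, for illustration only.
rpo = timedelta(hours=1)
rto = timedelta(hours=4)

nightly_backup = timedelta(hours=24)
frequent_backup = timedelta(minutes=15)
estimated_restore_time = timedelta(hours=3)

print("Nightly backup meets RPO:", meets_rpo(nightly_backup, rpo))
print("15-minute backup meets RPO:", meets_rpo(frequent_backup, rpo))
print("Restore estimate meets RTO:", meets_rto(estimated_restore_time, rto))
```

In practice, the restore estimate is not a fixed number: the more frequent your backups, the more there may be to replay, so both checks are worth repeating whenever the backup schedule changes.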
Redundancy
Redundancy, in IT terminology, means having spare resources at your disposal, such as servers or disk space, that you can put into operation when the need arises. For disaster recovery, it means having these resources in place so that you can either prevent failure happening altogether or have the means to recover very quickly.
Unfortunately, redundancy can also be what happens to employees of companies that don’t have a disaster recovery plan. Don’t let this happen to you!
Hopefully, from reading this article, you will now have a better understanding of the disaster recovery jargon and acronyms used in the IT industry.
If you are looking for a cost-efficient way to meet your disaster recovery objectives, take a look at our VMware and HyperV cloud hosting packages.