Modern Disaster Preparedness & Recovery in the Age of the Cloud

11.30.20 Best Practices

The Internet Has Become Complacent Again, & That’s Got to Stop

November 13th - 25th, 2020, was a rough two weeks on the internet, seeing multiple outages and actions by malevolent hackers. And even though not all systems are back online, we’re thankful the worst of it is over. The period of darkness began when GoDaddy — the world’s largest purveyor of domains — was targeted by phishing scammers who succeeded in gaining access to prominent cryptocurrency website and email traffic. Then Windows hosting player Managed.com was infected by a ransomware attack that same weekend.

And finally, on the 25th, the world’s largest public cloud infrastructure — Amazon AWS — suffered a major outage across twenty-eight of its core services that wreaked havoc across huge swaths of the internet from Adobe and Roku to Pokémon Go and League of Legends.

We have nothing but empathy for the vendors and all the services affected. These were the most prominent breaches and outages, but they were far from the only ones. There were dozens of breaches and outages across nearly every industry segment and in every industrialized country in the world, generating billions of dollars in economic impact. All are just the latest examples underscoring that, even in the age of the hyperscale cloud and technological automation, the internet can be a dangerous place and there is no substitute for Disaster Recovery Best Practices.

Here are our recommended minimum best practices for technology disaster preparedness and recovery.

Note: This article does not address standard security and server hardening measures. We cover those topics in other articles and white papers. Visit our Insights page or contact us for information on those Best Practices.

The Diversification Dilemma

If you manage online resources, the major players have a strong incentive to capture as much of your services business as possible. And bundling services in incentivized discount packages is remarkably convenient and can provide the appearance of cost-effectiveness. But there is another side to this coin.

It’s a classic risk scenario, the kind used by insurance companies around the world. If all of your primary resources are delivered through a single supplier, then the odds that a supplier failure touches your technology assets go down, because you are exposed to only one supplier. But the impact of an outage or breach at that supplier is magnified greatly: if they are down, then everything is down for you as well. If your assets are diversified across multiple datacenters and suppliers, then the odds of any incident occurring go up, but the impact of any single event goes down.
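The trade-off can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that every supplier independently has the same chance of an incident in a given year and that your services are split evenly across suppliers. The 5% rate and the function names are hypothetical, not measured figures.

```python
def incident_probability(p: float, suppliers: int) -> float:
    """Chance that at least one of `suppliers` independent suppliers has an incident."""
    return 1 - (1 - p) ** suppliers


def share_affected_per_incident(suppliers: int) -> float:
    """Fraction of your services knocked out when one supplier fails (even split assumed)."""
    return 1 / suppliers


if __name__ == "__main__":
    p = 0.05  # hypothetical 5% yearly incident rate per supplier
    for n in (1, 2, 4):
        print(f"{n} supplier(s): "
              f"P(any incident) = {incident_probability(p, n):.3f}, "
              f"services affected per incident = {share_affected_per_incident(n):.0%}")
```

Under these assumptions, moving from one supplier to two roughly doubles the chance of seeing some incident while halving how much of your operation each incident takes down.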

The Best Practice is to diversify enough that you can maintain key operational functions and communications no matter what happens, without over-diversifying and needlessly expanding your risk footprint.

We recommend that the following services be distributed out to different service provider locations outside of your primary hosting environment for mission-critical and enterprise applications:

Domain Registration
DNS
Email
Disaster Preparedness Landing Page/Site
Cloud Backup
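One way to check the list above against your own setup is a simple inventory audit. This is a minimal sketch: the provider names are made up, and the inventory would have to be filled in by hand from your own accounts.

```python
PRIMARY_HOST = "ExampleHost"  # hypothetical name of your primary hosting provider

# Hypothetical inventory: service -> provider currently delivering it
inventory = {
    "Domain Registration": "ExampleRegistrar",
    "DNS": "ExampleHost",
    "Email": "ExampleMail",
    "Disaster Preparedness Landing Page/Site": "ExampleHost",
    "Cloud Backup": "ExampleBackup",
}


def services_sharing_primary_host(inventory: dict[str, str], primary_host: str) -> list[str]:
    """Return the services that would go down together with the primary host."""
    target = primary_host.strip().lower()
    return [service for service, provider in inventory.items()
            if provider.strip().lower() == target]


if __name__ == "__main__":
    for service in services_sharing_primary_host(inventory, PRIMARY_HOST):
        print(f"Not diversified: {service} is also provided by {PRIMARY_HOST}")
```

In this made-up inventory, DNS and the landing page would be flagged because they share a provider with the primary hosting environment.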

Domain Registration

It can seem convenient to use your hosting provider to purchase your domain. But in the vast majority of cases, the hosting provider is, in fact, purchasing or leasing the domain through a primary domain registrar. Allowing an intermediary, whether it is your hosting provider or website designer, to have full ownership and admin rights to your domain adds a layer of risk that is simply unnecessary.

It is a Best Practice recommendation to purchase your domain directly through your own account at a major recognized domain registrar and to allow edit-only access to trusted contractors.

DNS

The Domain Name System (DNS) is critical. It’s how the world finds your website, how your internet-connected services find your data, and how your employees access their email and collaboration tools. It is the addressing system of the internet that makes everything work. But it is also the one thing that is too often disregarded in disaster preparedness plans.

If your DNS is provided by your hosting provider, then your DNS is down when your host is down for whatever reason. If your DNS has been compromised, then the bad guys have control over your web traffic. And in the event of an issue, it can take as much as 72 hours to transfer your DNS from one provider to another.

The solution is to have a redundant, geographically diversified network supporting your DNS. A number of professional DNS providers do just that. Cloudflare started as a Content Delivery Network (CDN) designed to create speedier connections and protect sites from DDoS and other attacks, and its network has withstood some of the largest DDoS attacks on record. A few years ago, the company put DNS services up on that same network.

We discussed this with Matthew Prince, Founder and CEO of Cloudflare, and he said, “DNS provides the foundation for Internet access, so its security is critical if you want to be safe online. That’s why we built an extremely fast and secure DNS infrastructure and made it free for all Cloudflare customers.”

We agree.

It is a Best Practices recommendation to use a well-distributed third-party DNS provider for all enterprise and mission-critical applications.

Email

In an emergency, one of the biggest mistakes that companies make is attempting to communicate with their customers via social media. In a crisis, social media quickly becomes overwhelmed and incoherent while calling more and more attention to the outage or breach.

In addition, email is the key to the kingdom — meaning that email is the central point of contact and identification for all your employees' online services. Password resets, email verifications, many customer service applications, alerts, and more are delivered via email. It’s bad enough if your website is down. If your website AND your email are down, you have no effective way to communicate with your customers and your team.

It is a Best Practices recommendation to ensure that your email is hosted separately from your website on a different network and — if feasible — in a different physical location.

Disaster Preparedness Landing Page/Site

If your company website is unavailable or has been breached but you have maintained control of your DNS, then you can quickly redirect traffic to a temporary failover page while your technical team works to solve the issue. This allows you to retain control of your customer and web traffic and to communicate by posting updates and additional information as it becomes available. This is a standard part of all professional Disaster Preparedness plans.

It is a Best Practices recommendation to ensure that this failover page or website is hosted separately from your website on a different network and in a different physical location.
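The decision to redirect traffic can be sketched as a small rule: only fail over after several health checks in a row have failed, so a single blip does not send visitors to the failover page. This is an illustrative sketch, not a description of any provider's feature. The addresses come from ranges reserved for documentation, the threshold of three is an assumption, and the actual DNS change would still be made through your DNS provider's control panel or API.

```python
PRIMARY_ADDRESS = "192.0.2.10"      # documentation-only example address (RFC 5737)
FAILOVER_ADDRESS = "198.51.100.20"  # documentation-only example address (RFC 5737)


def choose_dns_target(recent_checks: list[bool], failures_needed: int = 3) -> str:
    """Pick the address DNS should point to.

    recent_checks holds health-check results, oldest first (True = site healthy).
    Fail over only after `failures_needed` consecutive failures at the end of
    the list; otherwise keep pointing at the primary site.
    """
    if failures_needed < 1:
        raise ValueError("failures_needed must be at least 1")
    latest = recent_checks[-failures_needed:]
    if len(latest) == failures_needed and not any(latest):
        return FAILOVER_ADDRESS
    return PRIMARY_ADDRESS


if __name__ == "__main__":
    print(choose_dns_target([True, True, False]))          # one failure: stay on primary
    print(choose_dns_target([True, False, False, False]))  # three in a row: fail over
```

Requiring consecutive failures trades a slightly slower reaction for fewer false alarms; the right threshold depends on how often you run the checks.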

Cloud Backup

Running and maintaining local backups can protect you from individual server (hardware) failures. But local backups cannot protect you from network failures or malicious attacks like malware and ransomware. We will be talking more about backups later in this article, but the first order of business is to ensure that your backups are secured and protected from the hazards that might impact your primary network.

It is a Best Practices recommendation to ensure that your backups are hosted separately from your website on a different network and in a different physical location.

You Often Get What You Pay For

We have been called in to consult on business technology projects from small businesses to large enterprise organizations. And we are still surprised when multi-million-dollar companies are using bargain, consumer-level hosting plans to support their technology.

For most organizations, data has become the most valuable asset they have. It should be treated that way. There is simply no way that a host can provide the level of maintenance, security, and support across fully-patched and updated networks cheaply. Note that we are not saying that a hosting company has to be big — although that can help. We are saying that professional technology management and support does not come cheaply.

It is a Best Practices recommendation to ensure that your technology assets and resources are hosted in a professionally managed facility at a cost commensurate with the size, importance, and value of your business.

Backups

It used to be that the primary purpose of backups was to protect data from hardware failures and human error. Both of those are still important considerations. But as technology has advanced, we have developed more and more reliable hardware, solid-state drives, automated redundancies, and fault-tolerant systems that mitigate those risks. Now, however, we have entirely new threats to our data in the form of major network/cloud outages and malicious attacks.

Just this morning as we are completing this article we have learned of another ransomware attack — this time on the Baltimore County Public Schools system. If they have not prepared with proper backups, they will be forced to choose between paying the ransom and losing their data. In the absence of conforming backups, their best-case scenario will be a long and expensive forensic recovery process that can take days or even weeks.

It is our Best Practice recommendation that all businesses have a three-tier backup strategy:

Daily automated incremental backups to cloud storage.
Weekly full backup to cloud storage.
Monthly full backup to storage device(s) without direct internet access.
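The three tiers above can be expressed as a simple schedule. This sketch assumes the weekly full backup runs on Sundays and the monthly offline backup runs on the first day of the month; both choices are illustrative, and the function only decides what is due on a given day. It does not perform any backup itself.

```python
from datetime import date


def backups_due(day: date) -> list[str]:
    """List the backup tiers that should run on `day`.

    Assumptions for this sketch: the weekly full backup runs on Sundays and the
    monthly offline backup runs on the first day of the month.
    """
    due = ["daily incremental backup to cloud storage"]
    if day.weekday() == 6:  # Monday is 0, Sunday is 6
        due.append("weekly full backup to cloud storage")
    if day.day == 1:
        due.append("monthly full backup to offline storage")
    return due


if __name__ == "__main__":
    for d in (date(2020, 11, 30), date(2020, 11, 29), date(2020, 11, 1)):
        print(d.isoformat(), "->", "; ".join(backups_due(d)))
```

A scheduler such as cron could call a function like this once a day and trigger the matching jobs.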

Backups will not prevent a disaster or malicious attack. But they will allow you to recover from them quickly.

Conclusion

There are always unknowns. But the Best Practice recommendations in this article are industry-standard techniques to mitigate the impact of unforeseen events on your business and ensure that they do not result in lost data.

If you have any questions about Disaster Preparedness or need help with your technology project, just let us know. We are always happy to help.

Links & Additional Resources for Disaster Preparedness & Recovery

Cyware Reports dozens of breaches and outages in November 2020:
https://cyware.com/category/breaches-and-incidents-news

Cloudflare DNS Information:
https://www.cloudflare.com/dns/

The ransomware attack on Baltimore County Schools:
https://www.nytimes.com/2020/11/29/us/baltimore-schools-cyberattack.html
