Around 4 years ago one of our dedicated servers developed an irretrievable disk error. All data written to the RAID and tape archive during the previous week was corrupt! Not good. As we couldn’t restore our customers’ data from our most recent tape archive, we had to install new hardware and import data from a previous week’s backup cycle, effectively losing a week’s worth of data. To add insult to injury, it took over 2 days to restore the data and get our customers back online. A very dark few days. Fortunately we did have back-up mail servers on the secondary MX records, so there was no interruption to our customers’ email service.
Once the ordeal was over we immediately changed our disaster recovery plan. We created a system that automatically synchronised files and databases to remote servers – in addition to the automated tape archive and RAID.
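To give a feel for what the file half of that synchronisation involves, here is a minimal sketch in Python. It is an illustration only, not our production system: the names `file_digest` and `sync_tree` are hypothetical, and it assumes the remote server is reachable as an ordinary directory path (for example a mounted network share). Database synchronisation is not covered.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def sync_tree(source: Path, mirror: Path) -> list[str]:
    """Copy new or changed files from source to mirror.

    Returns the relative paths that were copied. Files that exist only
    in the mirror are left alone (assumption: the mirror never deletes).
    """
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = mirror / rel
        # Skip files whose content is already identical on the mirror.
        if dst.is_file() and file_digest(dst) == file_digest(src):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copies content and timestamps
        copied.append(rel.as_posix())
    return copied
```

Running `sync_tree(Path("/var/www"), Path("/mnt/mirror/www"))` on a schedule (both paths are placeholders) would copy only what has changed since the last run. The sketch compares content hashes rather than timestamps, which is slower but does not depend on the clocks of the two machines agreeing.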
Our objective was to provide a fast, robust website hosting solution with data backup and a disaster recovery plan at an extremely affordable price.
In theory the plan would expose our customers to an average downtime of 1 hour. Obviously the perfect solution is zero downtime. Whilst (in theory) this is possible using clustered servers with load-balancing software, we could not implement this solution because of the prohibitive cost that we would have to pass on to the majority of our customers.
However, we proudly provide a dedicated server solution including the disaster recovery plan (tested!) for a modest monthly fee.
After the hardware failure earlier this week we brought the majority of our customers back online within 1 hour. Whilst I apologise for any inconvenience caused during that period of downtime, I sincerely believe we have learned from our past experience and now have a recovery plan that wasn’t too painful for the majority of our customers.
We would love to hear your comments or any feedback that would help us to improve our service to you.