Around 4 years ago one of our dedicated servers developed an irretrievable disk error. All data written to the RAID and tape archive during the previous week was corrupt! Not good. As we couldn’t restore our customers’ data from our most recent tape archive, we had to install new hardware and import data from a previous week’s backup cycle, effectively losing a week’s worth of data. To add insult to injury, it took over 2 days to restore the data and get our customers back online. A very dark few days. Fortunately we did have back-up mail servers on the secondary MX records, so there was no interruption to customers’ email service.

Once the ordeal was over we immediately changed our disaster recovery plan. We created a system that automatically synchronised files and databases to remote servers – in addition to the automated tape archive and RAID.

Our objective was to provide a fast, robust website hosting solution with data backup and a disaster recovery plan at an extremely affordable price.

In theory the plan would expose our customers to an average downtime of 1 hour. Obviously the perfect solution is zero downtime. Whilst (in theory) this is possible using clustered servers with load-balancing software, we could not implement this solution because the prohibitive cost would have to be passed on to the majority of our customers.

However, we proudly provide a dedicated server solution including the disaster recovery plan (tested!) for a modest monthly fee.

After the hardware failure earlier this week we brought the majority of our customers back online within 1 hour. Whilst I apologise for any inconvenience caused during that period of downtime, I sincerely believe we have learned from our past experience and now have a recovery plan that wasn’t too painful for the majority of our customers.

We would love to hear your comments or any feedback that would help us to improve our service to you.
 

2 Comments

  1. Havens are one of Silkstream’s customers and would like to thank Leigh, Steve and Team for averting potential disaster. I too have written an article on our personal experience of website hosting server failure and how the disaster recovery plan swings into place. There are lessons for all to learn. Do not be complacent and think it will never happen. If there is no recovery plan in place you could be stuffed!

  2. Silkstream have been hosting one of our websites and, more importantly, over 200,00 wedding and travel images in our online galleries, for over 6 years. In that time they have never once lost my data, and only on 3 occasions have our galleries been down at all, and the main site and Blog NEVER! Considering we have about 10GB of images online, that is a miraculous feat! They are exceptional! Thanks Silkstream!