Our technical administrators have identified a service issue and are
working to resolve it as quickly as possible. Since your system may be
among those affected, you may experience some disruption in service
for the next 24 hours while we resolve this issue.
On December 10th at 2:00am EST, our technicians discovered a malfunction
in the server that hosts your site. Currently, your website is still offline.
However, we have isolated the malfunction and are working to restore your
service as quickly as possible. The restoration of your website will be
complete tonight around 10:00pm EST. You will not experience any loss of data.
We know that you trust us with one of your most important resources, your
website, and we are truly sorry for the negative impact that this outage has
had on your business or personal web presence.
Please be assured that we have done everything within our capacity in handling
this matter. Our technicians understand the priority of this, and have followed
all procedures in place to ensure limited downtime for your site, which is our
primary concern.
We are investigating any improvements we can make to our processes for the
future to ensure that an incident like this does not develop to a major outage.
***************************************
Technical Information:
***************************************
The web server that your site is hosted on has been offline due to a hardware
failure in the RAID setup.
RAID stands for "Redundant Array of Independent Disks" and is a technology
that employs the simultaneous use of two or more hard disk drives to achieve
greater levels of reliability and performance.
Your website is stored across the RAID system twice over different hard drives,
if one of the hard drives fails your web site will continue to run. The failed
hard drive is replaced and the data that was on the drive copied again from the
other drives within the RAID, this is known as rebuilding the RAID, and normally
happens seamlessly without any effect to the web hosting server or your website.
This is a daily task performed in our data centers and is standard for large
data storage systems such as used in the web hosting environment.
In this instance, we replaced the failed drive with a new drive and the RAID
started to rebuild. While this was happening the rebuild process failed,
corrupting all the data within the RAID set. This should not happen and we
have open tickets with the RAID manufacturer to understand what went wrong
in this case and to ensure that they can prevent this for the future.
Our system administrators do not rely on the RAID system as our only source
of backup. We run a rolling backup of the live system to external backup servers
to ensure that in a case like this we have a restore solution.
After the RAID corruption occurred, our engineers analyzed the situation and
found that the only solution left to us was to recover the data from our backup
systems. At this point the RAID was reinitialized ready to receive data, this
process itself takes several hours to perform.

Okay, so there would be additional delay, but all should be well at last. Not so fast, grasshopper. Yeah, the restore finished, but no files were brought back. Only the folder (directory) names were restored. So we notify the support desk, which in turn notifies the admins, which in turn notify the support desk, who reply with the following email on Dec. 17th:

This is an update with the back up you have requested.
Unfortunately, our administrators was not able to restored
files in your web space.
We sincerely apologized for the inconvenienced this has caused you.