07-04-2011, 03:30 AM
rasiel
Administrator
 
Join Date: Sep 2005
Posts: 276
Tantalus rebuilt

It's still early in the restoration process, but I wanted to give a
status update on what's happened so far.

Sometime late on June 27th Tantalus went offline suddenly. The first
thought was that this was due to a programming error in one of the
scripts that overwrote the main index. However, when we managed to log
in to the server, all the site files were gone. As far as we can tell,
this is what happened:

-After having a number of minor outages, we asked to switch servers to
improve performance.
-The migration, however, was not fully completed. The software that
manages the site was updated to reflect the new location, but the data
was still on the old hardware.
-Since there was no data yet at the new location, the software was
creating an empty backup each night.
-Meanwhile, as far as we can tell, the staff took the old server
offline and reformatted the hard drive, not realizing, of course, that
there was a live site on it which was not being backed up.

Adding to this list of errors is my own fault for relying solely on
the host's backups and not creating a set of my own. The reason is
that the site had grown to over 60 gigabytes in size, and a file of
this size takes days to download.

Fortunately, most of the database was recoverable. The database
contains all the data about the users and the records and keeps it
separate from the images. At this point it looks like we're missing
all of June and the last week of May. The images seem only partly
recoverable. An old copy had images stored up to December 2009, and a
few more may have been saved from testing. It's too early to tell how
we can patch everything together, but we should have a better idea by
the end of the weekend.

Now this situation has caused me a lot of distress. I know many users
and guests have come to rely on Tantalus over the years and data loss
is a huge blow for everyone involved. Whatever the sequence of events,
as site owner the responsibility falls on me. Once the dust settles I
will see how I can compensate everyone for the inconvenience. In the
meantime, please accept my sincere apology.

Ras