Data Integrity
Warning: Geeky Content Ahead
I learned a lot when ficlets got shut down. It was just the latest in a series of web products that I loved that died untimely deaths. Thankfully, in ficlets’ case, we had some warning and we were able to save the stories. Even though Jason and I built ficlets, we didn’t have time to add a real data retrieval API into it or any way for users to back up their own stories – we had to crawl the site and hope we could get decent data out of it. So, when we started talking about building Ficly, I wanted to make sure that no matter what happened to the site or the hardware it runs on, the stories would be safe.
Here’s what Ficly has so far:
- The database that holds everything is backed up and uploaded to Amazon’s Simple Storage Service every 12 hours. Your avatars are also uploaded there.
- Every feed has built-in pagination support (at least, I think they all do – I know the important ones do). If you want to back up your stories, all you need to do is grab and save your feed, changing the page number in http://ficly.com/authors/kevin-lawver/stories.atom?page=1 to the next number until no more stories show up (change kevin-lawver to your URL name).
- All the code behind Ficly is stored in version control using Beanstalk, down to the configuration files for all the major processes we run. Jason and I also have local copies, of course.
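For anyone who would rather script the feed backup than click through pages by hand, here is a minimal sketch in Python. It assumes the feed is standard Atom with entries as direct children of the feed element, and that a page past the end comes back as a valid feed with no entries rather than an error. The function names, the output file names, and the MAX_PAGES safety cap are all made up for illustration.

```python
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path

# Standard Atom namespace; assumes the feed is plain Atom.
ATOM_ENTRY = "{http://www.w3.org/2005/Atom}entry"
# Hypothetical safety cap so the loop stops even if a feed never runs empty.
MAX_PAGES = 1000


def feed_url(author, page):
    """Build the paginated story feed URL for an author's URL name."""
    return f"http://ficly.com/authors/{author}/stories.atom?page={page}"


def fetch(url):
    """Download one feed page and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def count_entries(feed_xml):
    """Count the story entries in one page of Atom XML."""
    return len(ET.fromstring(feed_xml).findall(ATOM_ENTRY))


def backup_stories(author, out_dir, fetch_page=fetch):
    """Save feed pages 1, 2, 3, ... until a page has no stories.

    Returns the number of pages saved.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = 0
    for page in range(1, MAX_PAGES + 1):
        feed_xml = fetch_page(feed_url(author, page))
        if count_entries(feed_xml) == 0:
            break  # no more stories showed up, so we are done
        (out / f"stories-page-{page}.atom").write_bytes(feed_xml)
        saved += 1
    return saved
```

Calling backup_stories("kevin-lawver", "ficly-backup") would save one .atom file per page into the ficly-backup folder. Swap in your own URL name, and pass a different fetch_page if you want to add delays or retries between requests.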
Here’s what we plan on doing as we find time:
- Generate a nightly backup of all the published stories on the site and make it available to whoever wants it as an XML or JSON file. This will eventually include the data to accurately rebuild chains of sequels and prequels (something you can’t yet do with the story feeds).
- Generate a nightly backup of every author’s stories so you could download them periodically for a personal backup.
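To make the sequel and prequel part concrete, here is a rough sketch in Python of what a JSON export could look like. None of this is Ficly's actual format: the field names (id, author, title, body, sequel_to, prequel_to) and both functions are assumptions for illustration only. The point is just that storing each story's link to its neighbor is enough to rebuild a chain later.

```python
import json


def export_stories(stories):
    """Serialize stories to JSON, keeping the links needed to rebuild chains.

    Each story is a dict; "sequel_to" and "prequel_to" are hypothetical
    fields holding the id of the related story, or None if there is none.
    """
    records = [
        {
            "id": s["id"],
            "author": s["author"],
            "title": s["title"],
            "body": s["body"],
            "sequel_to": s.get("sequel_to"),
            "prequel_to": s.get("prequel_to"),
        }
        for s in stories
    ]
    return json.dumps({"stories": records}, indent=2, sort_keys=True)


def sequels_of(export_json, story_id):
    """Return the ids of every story in the export that is a sequel to story_id."""
    data = json.loads(export_json)
    return [s["id"] for s in data["stories"] if s["sequel_to"] == story_id]
```

With an export like this, rebuilding a chain is a matter of starting at one story and following sequel_to or prequel_to links until they run out.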
I can’t think of anything else right now, but I don’t like repeating mistakes, and I want to make sure that Ficly’s data, or better yet, your stories, stay safe no matter what happens. Ficly’s young, and we’re running very lean (on a single server that hosts everything). If something happens to that server, I want to make sure that everything is recoverable so we can bring it all back up where it belongs.
What else can we do? I’m open to ideas. I can’t say we’ll implement all of them or when, but we’ll consider everything.