Sunday, 15 December 2013

Personal Backup strategies

Its been on my mind of late that I don't have a very good backup strategy in place for my own things at home. I've got many gigs of photos, code, documents, videos that are locally backed up but all over the place and not very consistent, and then there is gmail and the 4 gig of emails in there. So I'm doing something about it.

The solution is:

Dropbox

I use dropbox for cloud sync and storage. This is not back up. I use it to get access to files easily from anywhere, but if I accidentally delete or change something, then the change is propagated straight to dropbox, so (unless you have packrat) its quite hard to undo the change or get to an older version.

I keep a local copy of all the dropbox files on my home server.

gmail

A weekly download of all gmail to local machines using gmvault. The official guide to set up
is here, but Scott Hansleman did a great write up of how to do this here.

This boils down to two commands. The first for the initial sync, the second for incremental backups on top of the same folder structure.
gmvault sync youremail@gmail.com -d D:\foldertosaveto
gmvault sync -t quick youremail@gmail.com -d D:\foldertosaveto

Output of my initial run. Yes took a while to run...
================================================================
Sync operation performed in 2h 36m 35s.
Number of reconnections: 70.
Number of emails quarantined: 0.
Number of emails that could not be fetched: 0.
Number of emails that were returned empty by gmail: 0
================================================================

Scheduled job

I have set up a scheduled job (in windows task scheduler) which runs a script every Friday that backs up the week's email to my hard disk. This script is just a simple .bat file where the contents are thus:
gmvault sync -t quick youremail@gmail.com -d D:\foldertosaveto
You will need to make sure that gmvault is on your path if you do it this way. setting up scheduled jobs is easy too. There are loads of on-line tutorials, here is one for windows 8.

Amazon glacier

  1. Sign up for Amazon glacier, you will need your credit card for this (first you need to sign up for an Amazon AWS account)
  2. Once logged in, create a key pair (Access Keys (Access Key ID and Secret Access Key)) save them to your machine.
  3. Go to the glacier console and create a vault for each type of backup you are planning on doing. I've created two for now, one for my photos and one for my mail backups. I might create another for music later.
  4. Ensure you have chosen the data centre closest to you for the vaults. Both of mine are in EU Ireland.

Cloudberry online backup

I use cloudberry online backup to do the heavy lifting of actually sending all my files up to amazon
http://www.cloudberrylab.com/amazon-glacier-storage-backup.aspx#amazonglacier. Its great you just set up some backup plans and a schedule and cloudberry does the rest. Its not free but really quite cheap given what it does and how well it does it.
  1. Install the cloudberry online backup desktop version (download from: http://www.cloudberrylab.com/amazon-s3-cloud-desktop-backup.aspx )
  2. Add a glacier cloud storage account (File->amazon glacier)
  3. Follow the wizard - it's really easy
  4. Go to the backup plans tab and create a new plan or use a predefined plan
  5. For my gmail backup I created a new plan
  6. Click the backup wizard (backup files). Again, a real easy wizard to follow. Select the glacier account/vault, the files to back up and the schedule. So easy.

Costs

I'm storing 150 gig in amazon glacier, that costs me £1.50 per month and I can store as much as I like, practically unlimited storage. Be careful though because it costs a lot more to get it out. But that's ok right? This is emergency backup. You might be able to get your files back from dropbox, local backup etc. Glacier is the long term emergency backup we all need.

Summary

The whole point was to get all the files that I care about into cheap storage with multiple redundant backup locations, so if/when I lose some data I can get it back. Dropbox provides an easy way to get back files but its not a total solution, amazon provides the cheap offsite secure backup that I want for my 150 gig+ of data.

Other options:




Comments from https://news.ycombinator.com/item?id=6927659

* by drdaeman

Isn't Glacier overpriced, compared to other personal backup solutions?

Say, I have a mere 2TiB of historical data (various junk I made or collected over last ten years or so). Storing on them with Amazon is $20/mo, and if I want to look on that photos from 2008 I have to wait for several hours just to find that I misremembered where they were stored and pulled out wrong files. And unless it happened that I uploaded a good amount of data on that exact day, I'll have to pay for downloads.

Other offers for unlimited storage are Cyphertite at $10/mo, Crashplan at $6/mo, Carbonite at $100/yr, AltDrive at $4.5/mo and so on. While they're probably not-so-unlimited (they don't say that, but I guess one won't have much luck storing a petabyte), less respectable than Amazon, and most services lack an API and require to use not-so-trusty proprietary software that has to be sandboxed properly, Glacier doesn't look like a good deal to me unless we're talking about backing up some either quite big data (like tens of terabytes) or relatively small amounts of data (less than 500GiB).

Disclaimer: I have no affiliation to any of companies mentioned above. Just happens that I'm currently fleeing from Bitcasa (they suck hard) and looking at various options to not maintain a self-hosted NAS.

* by tfe

The difference is that I trust Amazon far more than those other companies you mentioned. If they go out if business or even change their "unlimited" policy, you're exposed until you can get your 2TB re-uploaded to another provider. It's a pain and a risk I'm unwilling to take. I know Amazon isn't going to suddenly try to dump me as a customer.

* by damianstanger

Yes all good points. I have a relatively small data set < 200GiB and so my costs with glacier are less than $2 per month :-)


* by hengheng

I am using Glacier to store a backup of most of my personal data. This includes my home directory, the most relevant photos I have taken as jpeg, my gmvault and that's about it. I do not copy over any movies, music, raw photos or software, as this is my last line of defense, so it only needs to cover the essentials. I am under 1€ per month this way, and the backup gets refreshed only every other month or so.

I do have a local server that stores a windows backup image of my whole laptop, a second Harddisk in that Server to store a copy of the server, and an external hard disk with a windows backup at my parents that gets a refresh every time I am over there. All backups are truecrypt images for good measure, and I have tested recovery. Amazon stores a split truecrypt archive. Recovery cost about 20€ and took a day.

So yes, glacier is great as a personal backup, if you make it part of a larger strategy. To me, this is disaster recovery, and a small price to pay for this kind of insurance of important files and memories.

4 comments:

  1. You can undelete things in dropbox from their web UI.

    ReplyDelete
  2. Dropbox does that for you =) you can "check previous VERSIONS" and restore.

    ReplyDelete
  3. Cloudberry offers several backup products for almost any situation, from a single laptop to a Microsoft Small Business Server to your standard Windows server… they do it all. From what I have seen they have built in VSS support if you want to use it, but for my use case I left it off as I just wanted to push my documents and pictures to the cloud for which I got this suggestion from Datapro technology now I can get backup without any effort.

    ReplyDelete
  4. It's wise to have as many options ready as possible in your storage endeavors. Cloud storage is really a very viable option, since it relies on the larger memory latitudes beyond the confines of your existing physical hardware, and into the vastness of the worldwide net. You should keep on this course, and disperse your data into as many venues as you can.

    Manda @ Scality

    ReplyDelete