AWS – Backing up data for compliance


We work with a lot of companies (especially in the finance industry) that have strict rules about the minimum length of time data must be kept. This can become onerous and expensive if, for example, you need to keep customer records for at least seven years.

Using S3, Glacier, and life cycle rules, we can create a flexible long-term backup solution while also automating the archiving and purging of backups and reducing costs.

We will also use versioning to limit the damage caused by a file being accidentally deleted or overwritten in our backup bucket.
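
As a rough sketch of what versioning makes possible: in a versioned bucket, deleting an object only adds a delete marker, so removing that marker makes the previous version current again. The helper name `undelete_backup` and its arguments are hypothetical and not part of the recipe; the `aws s3api` subcommands are the standard ones, and the sketch assumes the key contains no single quotes.

```shell
# Hypothetical helper: restore an accidentally deleted object in a versioned
# bucket by removing its current delete marker.
# Usage: undelete_backup <bucket> <key>
undelete_backup() {
  bucket=$1
  key=$2

  # Look up the version ID of the latest delete marker for this exact key.
  marker_id=$(aws s3api list-object-versions \
    --bucket "$bucket" \
    --prefix "$key" \
    --query "DeleteMarkers[?Key=='$key' && IsLatest].VersionId | [0]" \
    --output text)

  # With --output text, a missing result is printed as "None".
  if [ -z "$marker_id" ] || [ "$marker_id" = "None" ]; then
    echo "No delete marker found for $key" >&2
    return 1
  fi

  # Deleting the delete marker makes the previous version current again.
  aws s3api delete-object \
    --bucket "$bucket" \
    --key "$key" \
    --version-id "$marker_id"
}
```

Once the bucket from this recipe exists, it could be called as `undelete_backup <bucket-name> path/to/backup.tar.gz`. This only works while the previous version still exists, that is, before the previous-versions expiry configured later in the recipe removes it.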

How to do it…

Execute the following steps to create an S3 bucket with life cycle policies to migrate data to Glacier:

  1. First, we need to define a few parameters:
    • ExpirationInDays: This is the maximum length of time we want to keep our files in backup. The default is 2,555 days (seven years).
    • TransitionToInfrequentAccessInDays: After a backup has been copied to S3, we want to move it to the infrequently accessed class to reduce our costs. This doesn’t affect the durability of the backup, but it does have a small impact on its availability. We’ll set this to 30 days.
    • TransitionToGlacierInDays: After the backup has been kept in the infrequently accessed class for a while, we want to move it to Glacier. Again, this helps us reduce our costs at the expense of retrieval times. If we need to fetch a backup from Glacier, a standard retrieval takes approximately 3 to 5 hours. We’ll set the default for this to 60 days.
    • PreviousVersionsExpirationInDays: Given that we will have versioning enabled on our bucket, we want to make sure old versions of files aren’t kept forever – we’re only using this feature to mitigate accidents. We’ll set this value to 60 days, which gives us more than enough time to identify and recover from accidental deletion or overwrite.
    • PreviousVersionsToInfrequentAccessInDays: Just like our other backup files, we want to move our old versions to the infrequently accessed class after a period of time in order to minimize costs. We’ll set this to 30 days:
      AWSTemplateFormatVersion: '2010-09-09'
      Parameters:
        ExpirationInDays:
          Description: The maximum amount of time to keep files
          Type: Number
          Default: 2555
        TransitionToInfrequentAccessInDays:
          Description: How many days until files are moved to the Infrequent Access class
          Type: Number
          Default: 30
        TransitionToGlacierInDays:
          Description: How many days until files are moved to Glacier
          Type: Number
          Default: 60
        PreviousVersionsExpirationInDays:
          Description: The maximum amount of time to keep previous versions of files for
          Type: Number
          Default: 60
        PreviousVersionsToInfrequentAccessInDays:
          Description: How many days until previous versions of files are moved to the Infrequent Access class
          Type: Number
          Default: 30
  2. Next, we’ll need to create the S3 bucket in which to store our backups. Note that we’re omitting the name property for this bucket in order to avoid bucket name conflicts and maximize region portability. We’re also enabling versioning and adding life cycle rules that reference the parameters we defined in the previous step:
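      Resources:
        BackupBucket:
          Type: AWS::S3::Bucket
          Properties:
            VersioningConfiguration:
              Status: Enabled
            LifecycleConfiguration:
              Rules:
                - Status: Enabled
                  ExpirationInDays:
                    Ref: ExpirationInDays
                  Transitions:
                    - StorageClass: STANDARD_IA
                      TransitionInDays:
                        Ref: TransitionToInfrequentAccessInDays
                    - StorageClass: GLACIER
                      TransitionInDays:
                        Ref: TransitionToGlacierInDays
                  NoncurrentVersionExpirationInDays:
                    Ref: PreviousVersionsExpirationInDays
                  NoncurrentVersionTransitions:
                    - StorageClass: STANDARD_IA
                      TransitionInDays:
                        Ref: PreviousVersionsToInfrequentAccessInDays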

  3. Finally, let’s add an output so that we know which bucket to store our backups in:
      Outputs:
        BackupBucket:  # output name assumed here; any logical ID works
          Description: Bucket where backups are stored
          Value:
            Ref: BackupBucket
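
To make the effect of these parameters concrete, the following sketch maps the age of a current backup file to the storage class the life cycle rule would nominally place it in, using the default values from step 1. The variable and function names are hypothetical, and the result is only an approximation: S3 applies life cycle actions asynchronously, so the real transition can lag behind the configured day.

```shell
# Default values from the template's parameters.
EXPIRATION_DAYS=2555     # ExpirationInDays
TO_IA_DAYS=30            # TransitionToInfrequentAccessInDays
TO_GLACIER_DAYS=60       # TransitionToGlacierInDays

# Hypothetical helper: print the nominal storage class of a current
# (not deleted or overwritten) object at a given age in days.
storage_class_at() {
  age_days=$1
  if [ "$age_days" -ge "$EXPIRATION_DAYS" ]; then
    echo "EXPIRED"
  elif [ "$age_days" -ge "$TO_GLACIER_DAYS" ]; then
    echo "GLACIER"
  elif [ "$age_days" -ge "$TO_IA_DAYS" ]; then
    echo "STANDARD_IA"
  else
    echo "STANDARD"
  fi
}

# Print the class at each boundary.
for age in 0 30 60 2555; do
  echo "day $age: $(storage_class_at "$age")"
done
```

With the defaults, a backup spends its first 30 days in the standard class, the next 30 in the infrequently accessed class, and the rest of the seven years in Glacier.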

How it works…

Go ahead and launch this CloudFormation stack. If you’re happy with the default values for the parameters, you don’t need to provide them with the CLI command:

aws cloudformation create-stack \
  --stack-name backup-s3-glacier-1 \
  --template-body file://03-backing-up-data-for-compliance.yaml
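
If the defaults do not suit you, the parameters can be overridden at creation time. The wrapper below is a sketch of one way to do that: it sets a custom `ExpirationInDays`, waits for the stack to finish, and prints the generated bucket name. The function name and its arguments are hypothetical; it assumes the template defines exactly one output (the bucket), which is why it reads the first output value.

```shell
# Hypothetical wrapper: create the stack with a custom retention period,
# wait for completion, then print the name of the generated bucket.
# Usage: create_backup_stack <stack-name> <template-file> <expiration-days>
create_backup_stack() {
  stack_name=$1
  template_file=$2
  expiration_days=$3

  aws cloudformation create-stack \
    --stack-name "$stack_name" \
    --template-body "file://$template_file" \
    --parameters "ParameterKey=ExpirationInDays,ParameterValue=$expiration_days" \
    || return 1

  # Block until CloudFormation reports CREATE_COMPLETE (or the wait fails).
  aws cloudformation wait stack-create-complete --stack-name "$stack_name" \
    || return 1

  # Read the first (and, by assumption, only) stack output: the bucket name.
  aws cloudformation describe-stacks \
    --stack-name "$stack_name" \
    --query "Stacks[0].Outputs[0].OutputValue" \
    --output text
}
```

For example, `create_backup_stack backup-s3-glacier-1 03-backing-up-data-for-compliance.yaml 3650` would keep backups for roughly ten years instead of seven.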

Once the stack has been created, you’ll be all set to start copying backups to the S3 bucket and start worrying less about your backups’ life cycle and management. If you decide that the expiry or transition times need to change after you’ve created the bucket, you can do this by simply updating the parameters for the stack.
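
Both tasks can be sketched with the AWS CLI. The helper names below are hypothetical. The second helper changes only the Glacier transition time; it lists every other parameter with `UsePreviousValue=true` because parameters omitted from an update fall back to the template defaults rather than to the values currently in use.

```shell
# Hypothetical helper: upload one backup file to the stack's bucket.
# Usage: upload_backup <local-file> <bucket>
upload_backup() {
  aws s3 cp "$1" "s3://$2/"
}

# Hypothetical helper: change only the Glacier transition time, reusing the
# existing template and the current value of every other parameter.
# Usage: set_glacier_transition <stack-name> <days>
set_glacier_transition() {
  aws cloudformation update-stack \
    --stack-name "$1" \
    --use-previous-template \
    --parameters \
      "ParameterKey=TransitionToGlacierInDays,ParameterValue=$2" \
      "ParameterKey=ExpirationInDays,UsePreviousValue=true" \
      "ParameterKey=TransitionToInfrequentAccessInDays,UsePreviousValue=true" \
      "ParameterKey=PreviousVersionsExpirationInDays,UsePreviousValue=true" \
      "ParameterKey=PreviousVersionsToInfrequentAccessInDays,UsePreviousValue=true"
}
```

For instance, `set_glacier_transition backup-s3-glacier-1 90` would keep backups in the infrequently accessed class for 60 days before moving them to Glacier.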

There’s more…

Glacier is a companion service to S3 that acts as its cold storage option. Cold storage is a service where you are unable to directly access your data; you must lodge a request for data to be restored (to S3), and you will be notified when it is ready. A physical example of cold storage might be backup tapes that are stored in a secure location. Similar to S3, files are referred to as objects. Files are grouped together and stored in archives, which can be created and deleted but never modified. Archives are in turn grouped together into vaults, which allow you to control access.

The shortest restoration time is 1 to 5 minutes (expedited retrievals, which come with limitations). Standard retrievals take 3 to 5 hours, and bulk retrievals, the cheapest option, typically take 5 to 12 hours.
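
For objects that the life cycle rule has moved to Glacier, a restore can be requested through the S3 API. The helpers below are a sketch with hypothetical names and defaults (7 days, standard tier); `restore-object` and `head-object` are the standard `aws s3api` subcommands.

```shell
# Hypothetical helper: ask S3 to make an archived object readable again
# for a number of days. Tier is one of Expedited, Standard, or Bulk.
# Usage: request_restore <bucket> <key> [days] [tier]
request_restore() {
  bucket=$1
  key=$2
  days=${3:-7}
  tier=${4:-Standard}

  aws s3api restore-object \
    --bucket "$bucket" \
    --key "$key" \
    --restore-request "{\"Days\":$days,\"GlacierJobParameters\":{\"Tier\":\"$tier\"}}"
}

# Hypothetical helper: print the object's Restore field. While the restore
# is still running it contains ongoing-request="true".
# Usage: restore_status <bucket> <key>
restore_status() {
  aws s3api head-object --bucket "$1" --key "$2" --query Restore --output text
}
```

After `request_restore <bucket-name> db.dump 2 Expedited`, polling `restore_status <bucket-name> db.dump` shows when the temporary copy is ready to download.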

The following are some recommended use cases for Glacier:

  • Long-term (that is, cold) backups
  • Compliance backups
