Navigating AWS Services – Static Storage

When you are trying to deploy an application to the Cloud, there are so many choices! Honestly, I was overwhelmed by the sheer number of options the first time I came across them. In this series of blog posts we will discuss factors that can help you decide which type of storage service (and which S3 storage class) may be ideal for you.

This post will explore the different static storage options offered by AWS’ S3 service.

What is S3?

Amazon S3 (Simple Storage Service) is an object storage service where you can store a virtually unlimited amount of data, organized into buckets. It is generally an ideal place to store static data like sites and assets. Each file (called an “Object”) can be a maximum of 5 TB in size and can be configured to be accessible globally via a REST-supported URL. Every object has a key, a version ID (useful if we expect to replace these objects), a value, metadata, and subresources. Buckets must have a globally unique name and are defined at the Region level. Regions are the geographical areas where AWS has data centres (e.g. ca-central-1 in Canada, us-east-1 in the US), and each Region contains multiple Availability Zones (AZs) such as us-east-1a and us-east-1b, where each AZ has one or more data centres.
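To make these pieces concrete, here is a minimal sketch using boto3 (the AWS SDK for Python) that creates a bucket and writes an object with a key and some metadata. The bucket name, region and key are hypothetical placeholders.

```python
import boto3

# A bucket lives in a specific Region; the client should target that Region.
s3 = boto3.client("s3", region_name="ca-central-1")

# Bucket names must be globally unique – this one is a made-up placeholder.
s3.create_bucket(
    Bucket="my-example-bucket-12345",
    CreateBucketConfiguration={"LocationConstraint": "ca-central-1"},
)

# Each object is addressed by its key and can carry user-defined metadata.
s3.put_object(
    Bucket="my-example-bucket-12345",
    Key="assets/logo.png",
    Body=b"...binary content here...",
    Metadata={"uploaded-by": "blog-demo"},
)
```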


Benefits of using S3

  1. Durability
    • Ensures data is not lost in scenarios like AWS suffering a hardware failure.
    • S3 provides 11 9’s (99.999999999%) of durability.
  2. Scalability
    • Virtually unlimited capacity per bucket.
    • Each file can be up to 5 TB in size.
  3. Availability
    • Access data whenever you need it.
    • The Standard tier of S3 is designed for 4 9’s (99.99%) of availability.
  4. Security
    • Fine-grained access control at the per-bucket and per-object levels.
  5. Performance
    • Exceptional performance that is not affected by your key-naming patterns – S3 can even be used for static website hosting (see the sketch after this list!).
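As a quick illustration of the static-hosting point above, here is a minimal boto3 sketch that turns a bucket into a website endpoint. It assumes a hypothetical bucket named my-static-site-bucket already exists and that its objects are publicly readable.

```python
import boto3

s3 = boto3.client("s3")

# Serve the bucket as a static website with index and error pages.
s3.put_bucket_website(
    Bucket="my-static-site-bucket",  # hypothetical bucket name
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "error.html"},
    },
)
```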

Securing Data on S3

  • New buckets are private and protected by default – you can change this when setting up your bucket.
  • If the principle of least privilege is followed (i.e. you grant only the minimum amount of access that is absolutely necessary), your data stays safe – and AWS’s IAM service offers the granular control needed to implement this.
  • There are many more features that can help us secure our buckets and objects, like Block Public Access, bucket policies, Access Control Lists (ACLs), S3 Access Points, pre-signed URLs and AWS Trusted Advisor. The sketch after this list shows pre-signed URLs in action.
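As an example of one of these features, here is a minimal boto3 sketch that generates a pre-signed URL for a private object; the bucket and key names are hypothetical.

```python
import boto3

s3 = boto3.client("s3")

# Anyone holding this URL can download the (otherwise private) object
# until the link expires – here, after one hour.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-private-bucket", "Key": "reports/q1.pdf"},
    ExpiresIn=3600,  # seconds
)
print(url)
```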

Encryption on S3

  • On the server side, S3 encrypts data by default using 256-bit AES encryption (SSE-S3). This can be configured on a bucket-level basis to instead use keys managed through AWS KMS (Key Management Service), or we can supply our own keys with each request (SSE-C) – a sketch of the KMS option follows this list.
  • On the client side, it is up to the developer/customer to encrypt data before/after transmitting it to/from AWS.
  • Data in transit can be encrypted using SSL/TLS and HTTPS.
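As a sketch of the server-side options, here is how a bucket’s default encryption might be switched from SSE-S3 to AWS KMS using boto3; the bucket name and key alias are hypothetical.

```python
import boto3

s3 = boto3.client("s3")

# New objects in this bucket will now be encrypted with a KMS key by default.
s3.put_bucket_encryption(
    Bucket="my-example-bucket-12345",
    ServerSideEncryptionConfiguration={
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": "alias/my-app-key",  # hypothetical key alias
                }
            }
        ]
    },
)
```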

What types of Storage solutions are provided by S3?

S3 provides 8 different storage classes which you can select according to your requirements:

1. S3 Standard

This is the general-purpose storage class for active data that is frequently accessed. Objects can be retrieved in milliseconds. It is designed for low latency, high availability and high throughput, which makes it ideal for content distribution.

2. S3 Standard – Infrequent Access (S3-IA)

A lower-cost tier for data that is accessed less frequently (say, monthly) whilst still providing millisecond retrieval times. While this has a lower per-GB storage cost, it comes with a per-GB retrieval cost. This can be useful for long-term storage needs like backups and disaster recovery.

3. S3 One Zone – Infrequent Access

A storage solution for when you are less concerned with data redundancy – objects are stored in a single Availability Zone instead of the usual three or more – and are not accessed very often, yet still need to be retrievable quickly when required. This tier can cost up to 20% less than S3-IA, which can make it ideal for secondary backups.

The Glacier tiers are lower-cost storage solutions that can be used for archiving purposes:

4. S3 Glacier – Instant Retrieval

The lowest-cost storage for long-term archival data that still needs millisecond retrieval speeds. We can save up to 68% on cost compared to S3-IA. This could be ideal for data that may be accessed as rarely as once a quarter.

5. S3 Glacier – Flexible Retrieval

A secure, durable and low-cost archival solution with 3 retrieval tiers – expedited (1-5 minutes), standard (3-5 hours) and bulk (12 hours). This makes it ideal for use cases that only occasionally need to read or reference data at rest, such as backup and disaster-recovery archives.

6. S3 Glacier – Deep Archive

This is the lowest-cost storage solution offered by AWS for data archiving, with costs going as low as about $1 per TB per month. There are 2 data retrieval tiers – standard (12 hours) and bulk (48 hours). This can be ideal for data that is accessed very rarely (say, once a year).
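To see how the Glacier classes are used in practice, here is a boto3 sketch that writes an object directly into Glacier Flexible Retrieval and later requests a restore; all names and numbers are illustrative.

```python
import boto3

s3 = boto3.client("s3")

# Upload straight into an archival class by naming it at write time.
with open("2023-12-backup.tar.gz", "rb") as f:
    s3.put_object(
        Bucket="my-archive-bucket",          # hypothetical bucket
        Key="backups/2023-12-backup.tar.gz",
        Body=f,
        StorageClass="GLACIER",              # Glacier Flexible Retrieval
    )

# Flexible Retrieval objects must be restored before they can be read;
# Tier selects the speed/cost trade-off described above.
s3.restore_object(
    Bucket="my-archive-bucket",
    Key="backups/2023-12-backup.tar.gz",
    RestoreRequest={
        "Days": 7,  # keep the restored copy around for a week
        "GlacierJobParameters": {"Tier": "Standard"},  # or "Expedited"/"Bulk"
    },
)
```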

Almost there! There are just a couple more solutions that might interest you:

7. S3 Intelligent Tiering

Automatically moves objects between cost-optimized access tiers based on observed access patterns, providing automatic cost savings. This can be especially useful when you want to optimize storage costs without tracking access patterns yourself. Alternatively, we can manually set up lifecycle rules that automatically move objects between different storage classes to optimize our storage costs – see the sketch below.
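As a sketch of such manual rules, here is how a lifecycle configuration might be set with boto3; the bucket name, prefix and day counts are hypothetical.

```python
import boto3

s3 = boto3.client("s3")

# Move objects under "logs/" to Standard-IA after 30 days
# and to Glacier Deep Archive after 180 days.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-example-bucket-12345",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-old-logs",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    },
)
```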

8. S3 on Outposts

This is a unique offering from AWS that brings S3 object storage on premises, which may be needed for local data processing or to meet regulatory requirements.

The image below (Credit: AWS Docs @ Object Storage Classes – Amazon S3) summarizes and offers some insights:

[Image: AWS comparison table of the S3 storage classes]

How to get data into S3?

Now that we are clear about the different storage solutions provided by AWS, we can explore diverse ways to get our data into AWS:

We can get data into S3 by using the Console (web), the command line (CLI) and the SDKs (multiple languages like Java, Python, Go, JavaScript, etc. are supported). Features like multipart upload (breaking a large file into multiple parts during upload) and S3 Transfer Acceleration (using edge locations to accelerate transfer speed) can be used for faster, more reliable transfers – the sketch below shows multipart upload via the Python SDK.
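Here is a minimal boto3 sketch of multipart upload; upload_file splits and parallelizes the transfer for us, and the thresholds, file name and bucket name are illustrative.

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Split files larger than 100 MB into 25 MB parts, uploaded 8 at a time.
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,
    multipart_chunksize=25 * 1024 * 1024,
    max_concurrency=8,
)

s3.upload_file(
    Filename="site-backup.tar.gz",          # hypothetical local file
    Bucket="my-example-bucket-12345",
    Key="backups/site-backup.tar.gz",
    Config=config,
)
```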

When you want to transfer data at enterprise scale, there are solutions like AWS Snowball (multiple terabytes) and AWS Snowmobile (up to 100 petabytes).

And it’s a wrap – this concludes my blog on S3 – I hope you are excited to try it out! Remember to explore the resources at https://docs.aws.amazon.com/s3/ if you get stuck, and feel free to reach out to me at sdave@dal.ca to share your opinions or suggestions. Keep an eye out for my next blog!
