Storage

S3

Limitations

  • Object size: 0 bytes to 5 TB per object.
  • 100 buckets per account (soft limit).

Properties

  • Object storage service
  • No fixed structure
  • Flat address space; every object has a unique URL.
  • Regional service; a Region must be specified for the bucket that objects are uploaded to.
  • Automated replication across multiple AZs for durability.
  • Objects live in buckets that you create; a bucket name must be unique across all of S3.
  • A unique object key identifies a file in a bucket. Unique key: bucket + folder + file.
  • Each object also gets a fully qualified URL.
  • Virtual folders can be created for easier management.
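
As a rough illustration of the flat address space, the sketch below builds an object URL from bucket + key and treats a "folder" as nothing more than a shared key prefix. The bucket name, Region, and keys are made-up examples, and the URL follows the virtual-hosted style; the helper names are hypothetical.

```python
from urllib.parse import quote


def object_url(bucket: str, region: str, key: str) -> str:
    """Build a virtual-hosted-style URL for an object (bucket + key)."""
    # quote() leaves "/" unescaped by default, so "folder" separators survive.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


def list_folder(keys: list[str], prefix: str) -> list[str]:
    """Emulate a virtual folder: keys are flat, a folder is a shared prefix."""
    return [k for k in keys if k.startswith(prefix)]


# Hypothetical keys; the "/" is part of the key, not a real directory.
keys = ["reports/2024/q1.csv", "reports/2024/q2.csv", "images/logo.png"]
print(object_url("example-notes-bucket", "eu-west-1", keys[0]))
print(list_folder(keys, "reports/"))
```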

Spec

  • Durability - 11 9s
  • Availability - 99.5% - 99.99% (depends on storage class)
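
To make the availability range concrete, the arithmetic below converts a percentage into a yearly downtime budget. It is only a back-of-the-envelope sketch that assumes a 365-day year; it says nothing about how AWS measures or credits availability.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; ignores leap years


def max_downtime_minutes(availability_percent: float) -> float:
    """Yearly downtime budget implied by an availability percentage."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


for pct in (99.99, 99.9, 99.5):
    print(f"{pct}% -> {max_downtime_minutes(pct):.1f} minutes/year")
```

Under this assumption, 99.99% allows roughly 53 minutes per year, while 99.5% allows about 43.8 hours.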

Versioning

  • Keeps all versions of the objects in the same bucket.

Server Access Logging

  • Logs for access to bucket.

Object-level logging

  • Object-level API activity.

Encryption

Bucket-level Encryption

Folder level Encryption

Storage Classes

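| | S3 Standard | S3 INT | S3 S-IA | S3 Z-IA |
|---|---|---|---|---|
| High throughput | ✅ | ✅ | ✅ | ✅ |
| Low latency | ✅ | ✅ | ✅ | ✅ |
| Access to data | Frequent | Unknown (frequent for 30 days, then infrequent) | Infrequent | Infrequent |
| Durability | Eleven 9s | Eleven 9s | Eleven 9s | Eleven 9s across a single AZ |
| Availability | 99.99% | 99.9% | 99.9% | 99.5% |
| SSL/TLS | ✅ | ✅ | ✅ | ✅ |
| Lifecycle rules | ✅ | ✅ | ✅ | ✅ |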

S3 Glacier

  • 11 9s durability
  • Lower cost of storage.
  • Retrieval is not instant
  • Vaults and Archives. Vaults are containers for Archives. Vaults are Region-specific.
  • No GUI for data transfer; the console is for vault management only.
  • Moving data in is a two-step process: (1) create a vault, (2) use the API/SDK to move data. Alternatively, use a data lifecycle rule from S3.
  • Access is via the API/SDK or AWS CLI. Either way, an archive retrieval job needs to be created first.
  • Retrieving data comes at a cost that depends on the retrieval option:
  • Set 1 (retrieval options)
    • Expedited - <250 MB, 5 mins, most expensive
    • Standard - Any size, 3-5 hours, 2nd most expensive
    • Bulk - Any size, 5-12 hours, cheapest
  • Set 2 (storage classes)
    • Instant Retrieval - milliseconds
    • Flexible Retrieval - minutes to hours
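
The trade-off in Set 1 can be sketched as a small lookup: pick the cheapest option whose worst-case time still meets a deadline. The sizes and durations are the ones listed in these notes, not a pricing or SLA reference, and `cheapest_tier` is a hypothetical helper.

```python
from typing import Optional

# (name, max size in MB or None for any size, worst-case hours), cheapest first.
# Figures come from the notes above and are assumptions of this sketch.
TIERS = [
    ("Bulk", None, 12.0),
    ("Standard", None, 5.0),
    ("Expedited", 250, 5 / 60),
]


def cheapest_tier(size_mb: float, deadline_hours: float) -> Optional[str]:
    """Return the cheapest retrieval option whose worst case fits the deadline."""
    for name, max_mb, worst_case_hours in TIERS:
        fits_size = max_mb is None or size_mb < max_mb
        if fits_size and worst_case_hours <= deadline_hours:
            return name
    return None  # nothing in the table satisfies both constraints


print(cheapest_tier(100, 24))
print(cheapest_tier(100, 1))
```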

S3 Glacier Deep Archive

  • Retrieval takes about 12 hours at the fastest.

Fees

  • Minimum storage duration of 30 days on all tiers except Standard. Glacier has 90 days, Deep Archive 180 days.
  • Per-object monitoring fees in Intelligent-Tiering.
  • Retrieval fees per GB for Infrequent Access and Glacier.
  • S3 Glacier storage costs a fraction of S3 storage.
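
The effect of a minimum storage duration can be sketched as follows: an object deleted early is still charged for the minimum number of days. The class labels are informal names for this example, not API identifiers, and the day counts are the ones given in these notes.

```python
# Minimum storage duration in days, as listed in the notes above.
# Class names are informal labels for this sketch, not API identifiers.
MIN_DAYS = {
    "standard": 0,
    "intelligent-tiering": 30,
    "standard-ia": 30,
    "one-zone-ia": 30,
    "glacier": 90,
    "deep-archive": 180,
}


def billable_days(storage_class: str, days_stored: int) -> int:
    """Days charged for an object deleted after `days_stored` days."""
    return max(days_stored, MIN_DAYS[storage_class])


# An archive deleted from Glacier after 10 days is still billed for 90.
print(billable_days("glacier", 10))
```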

Lifecycle Rules

  • Move data automatically between different tiers.
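
A lifecycle rule is ultimately a small piece of configuration. The sketch below shows one possible rule as a plain Python dict in the shape that boto3's `put_bucket_lifecycle_configuration` takes for its `LifecycleConfiguration` argument. The rule ID, prefix, and day thresholds are hypothetical, and nothing here calls AWS.

```python
import json

# Shape follows the LifecycleConfiguration argument of boto3's
# put_bucket_lifecycle_configuration; the rule ID, prefix, and day
# thresholds are hypothetical values chosen for illustration.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-old-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
                {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
            ],
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```

With boto3 installed and credentials configured, a dict like this could be passed as `LifecycleConfiguration` alongside a `Bucket` name.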

EBS

  • Persistent Block-level storage
  • Attached to EC2 instances and can meet IOPS requirements.
  • Generally, an EBS volume can be attached to a single instance (exception: Multi-Attach).
  • EBS backups are snapshots, taken manually or by a CloudWatch scheduled event. They are stored on S3 and only incremental changes are stored. A snapshot is restored to another EBS volume.
  • Copy a snapshot to another Region for higher durability.
  • Replicated multiple times within the same AZ. Lives only in a single AZ.
  • When creating an EBS volume directly, you need to specify the AZ; the volume can only be attached to EC2 instances in the same AZ.
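
The idea behind incremental snapshots can be illustrated with a toy model in which a volume is a mapping of block IDs to contents and each snapshot keeps only the blocks that changed since the previous one. This is a conceptual sketch, not a description of how EBS implements snapshots internally.

```python
def incremental_snapshot(previous: dict[int, bytes], current: dict[int, bytes]) -> dict[int, bytes]:
    """Return only the blocks that are new or changed since `previous`."""
    return {
        block_id: data
        for block_id, data in current.items()
        if previous.get(block_id) != data
    }


# Hypothetical volume contents on two consecutive days.
volume_monday = {0: b"boot", 1: b"logs-v1", 2: b"data"}
volume_tuesday = {0: b"boot", 1: b"logs-v2", 2: b"data", 3: b"new"}

first = incremental_snapshot({}, volume_monday)                # everything is new
second = incremental_snapshot(volume_monday, volume_tuesday)   # only the delta
print(len(first), sorted(second))
```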

EBS Types

SSD

  • Better for work with smaller blocks
  • Can be used as a boot volume
  • General Purpose SSD (gp2)
    • Balanced price and performance
    • Can be used as a boot volume
    • Up to 16,000 IOPS and 250 MiB/s of throughput per volume
  • Provisioned IOPS SSD (io1)
    • Low latency / high throughput

HDD

  • Throughput Optimized HDD (st1)
    • Frequently accessed data, large throughput, data streaming, log processing
    • Cannot be used as a boot volume
  • Cold HDD (sc1)
    • Lowest cost
    • Large in size and accessed infrequently
    • Cannot be used as a boot volume
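
The selection rules above can be condensed into a small decision helper. The workload labels ("balanced", "low-latency", "streaming", "cold") are hypothetical names invented for this sketch; only the four volume types and the boot-volume restriction come from the notes.

```python
def suggest_volume_type(boot: bool, access: str) -> str:
    """Map a coarse workload description to one of the EBS types listed above.

    `access` is a hypothetical label: "balanced", "low-latency",
    "streaming", or "cold".
    """
    ssd = {"balanced": "gp2", "low-latency": "io1"}
    hdd = {"streaming": "st1", "cold": "sc1"}
    if access in ssd:
        return ssd[access]
    if access in hdd:
        if boot:
            # HDD-backed types cannot hold the boot volume.
            raise ValueError("HDD types (st1, sc1) cannot be used as boot volumes")
        return hdd[access]
    raise ValueError(f"unknown access pattern: {access!r}")


print(suggest_volume_type(boot=True, access="balanced"))
print(suggest_volume_type(boot=False, access="cold"))
```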

Security

  • Enable encryption at the time of creation.
  • Uses AWS KMS to automatically encrypt data.
  • Snapshots of encrypted volumes are also encrypted.
  • Encryption is only available on selected instance types.
  • A Region-level setting can configure all EBS volumes in that Region to use encryption.

EFS

Properties

  • File storage, accessed like a file manager
  • Uses a hierarchical structure
  • Low latency and multi-access.
  • EFS is accessed via mount points that each live in a particular AZ and can be used by EC2 instances in that AZ.
  • Uses standard OS APIs over NFS v4.0 and v4.1.
  • Replicated across AZs in a single region, making it highly available and reliable.
  • Regional service. Not available in all regions.

Storage Classes and Performance

Standard

  • Default
  • Charged for the storage used per month.

Infrequent Access

  • Rarely accessed, lower cost.
  • Higher latency of first byte read.
  • Cost for storage space used + Cost for read and write.

Lifecycle

  • Moves data between the storage classes.
  • Fixed transition periods: 14, 30, 60, or 90 days.
  • Metadata and files smaller than 128 KB are not moved.
  • Can be switched on / off.
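
A simplified model of the lifecycle decision is sketched below. It assumes that a file moves to Infrequent Access once it has gone unaccessed for the configured period, and that files under 128 KB stay in Standard; both are assumptions of this sketch rather than a statement of the exact service rules.

```python
MIN_SIZE_BYTES = 128 * 1024           # assumption: smaller files stay in Standard
ALLOWED_POLICY_DAYS = (14, 30, 60, 90)


def moves_to_infrequent_access(size_bytes: int, days_since_access: int, policy_days: int) -> bool:
    """Decide whether a file would transition under this simplified model."""
    if policy_days not in ALLOWED_POLICY_DAYS:
        raise ValueError(f"policy must be one of {ALLOWED_POLICY_DAYS}")
    return size_bytes >= MIN_SIZE_BYTES and days_since_access >= policy_days


# A 1 MiB file untouched for 45 days, under a 30-day policy.
print(moves_to_infrequent_access(1024 * 1024, 45, 30))
```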

Performance Modes

| | General Purpose | Max I/O |
|---|---|---|
| Throughput | Standard | Unlimited |
| IOPS | <= 7K | >= 7K |
| Latency | Low latency | High latency |
| Metric | CloudWatch: % of IOPS | |

Throughput Modes

| | Bursting | Provisioned |
|---|---|---|
| Throughput | 100 MiB/s per TB | Additional charges beyond the bursting option |

Duration of Burst

  • 50 MiB/s per TB is the baseline.
  • Throughput below the baseline accumulates burst credits.
  • The burst credit balance is viewable in CloudWatch metrics.
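
The credit mechanism can be sketched with a one-second-step simulation. It uses the per-TB figures from these notes (50 MiB/s baseline, 100 MiB/s burst) and a deliberately simplified accounting rule, so treat it as an illustration of the idea rather than the service's actual credit formula.

```python
BASELINE_MIBPS_PER_TB = 50   # from the notes above
BURST_MIBPS_PER_TB = 100     # from the Throughput Modes figures


def simulate_credits(size_tb: float, usage_mibps: list[float], start_credits: float = 0.0) -> list[float]:
    """Track a burst-credit balance (in MiB) over one-second steps.

    Simplified model: credits grow when usage is below the baseline,
    shrink when above it, and throughput is capped at the burst rate
    (or at the baseline once credits run out).
    """
    baseline = BASELINE_MIBPS_PER_TB * size_tb
    burst = BURST_MIBPS_PER_TB * size_tb
    credits = start_credits
    history = []
    for wanted in usage_mibps:
        cap = burst if credits > 0 else baseline
        actual = min(wanted, cap)
        credits = max(0.0, credits + baseline - actual)
        history.append(credits)
    return history


# 1 TB file system: idle for 3 s, then a client asks for 100 MiB/s for 4 s.
print(simulate_credits(1, [0, 0, 0, 100, 100, 100, 100]))
```

In this run the balance climbs while the file system is idle, drains during the burst, and stays at zero once throughput falls back to the baseline.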

SNOW Family

  • Storage and compute capabilities.
  • Physical transfer of a few TB up to 100 PB of data into and out of AWS.

| | Snowcone | Snowball Compute Optimized | Snowball Compute Optimized with GPU | Snowball Storage Optimized | Snowmobile |
|---|---|---|---|---|---|
| vCPU | 2 | 52 | 52 | 24 | n/a |
| Memory | 4 GB | 208 GB | 208 GB | 32 GB | n/a |
| Storage | 8 TB | 39.5 TB | 39.5 TB | 80 TB | 100 PB |
| SSD | n/a | 7.68 TB | 7.68 TB | n/a | n/a |
| GPU | n/a | n/a | NVIDIA Tesla | n/a | n/a |
| Cluster | n/a | 5-10 | 5-10 | 5-10 | n/a |
| Use cases | Portable, battery, up to 8 TB, DataSync, 10 Gbit | S3 API, 100 Gbit network | Compute power, cluster | 80 TB, SSD, rack mounting, HIPAA compliant | 10 petabyte |

FAQ