A Tenth Revolution Group Company

AWS storage guide for beginners

By Nicola Wright

Cloud storage is everywhere these days.

If you have a smartphone, you probably already have a cloud storage account, like iCloud. Maybe you have a Google Drive folder on your desktop for your vacation photos, or perhaps you’ve used OneDrive or Dropbox to send files to friends or colleagues.

Why wouldn’t you take advantage of being able to store your stuff on someone else’s server? It’s easy, the things you need are accessible from anywhere, and you’re not eating into space on your own devices.

Cloud storage, however, isn’t just available on an individual scale. There are massive, commercial online data warehousing services out there for businesses too, and they offer all of the benefits of personal cloud storage, and then some.

Storage is one of the most popular and useful facilities offered by cloud service vendors today. Businesses are using cloud storage for a whole heap of reasons, from simple file storage and backup, to creating data lakes to feed into big data and machine learning platforms.

Data is central to how we live today. Businesses need it to make informed decisions and deliver terrific experiences for customers and clients.

How a company handles, exploits, and safeguards its data can make or break its chances of survival, and with the amount of data we generate every day growing to colossal and unprecedented levels, it’s never been more vital for organizations to have scalable, secure, smart storage services at their disposal.

Given its position as the world’s leading cloud service, it should come as no surprise that AWS offers a multitude of cloud storage options for a wide range of use cases. Let’s take a look at just a few of the ways that businesses across the globe are using AWS to store their valuable data.

Why use AWS cloud storage?

Archiving

In this fast-paced, data-fueled world, there are plenty of instances where fast, immediate access to your data is a must, so that you can wring valuable insight from it as quickly as possible.

But not all of your business data needs to be at your fingertips. You’ll also have plenty of older data, legacy information, and outdated files that are ready to be “boxed away.”

Of course, it’s important that any data you’re archiving remains safe, secure, and accessible, no matter how ancient and seemingly irrelevant it might be. That’s why the cloud is a top option for archiving.

AWS has a number of services that can be used to archive data, depending on an organization’s specific needs. Amazon Glacier, for example, is an affordable option for storing data you’re not going to need to get your hands on in a hurry (hence the name).

Amazon Simple Storage Service (S3) lets you dip into your archive a little faster, or you can use AWS Storage Gateway to create a tailor-made storage solution that suits your business.

All of AWS’s cloud storage solutions adhere to rigorous security and compliance standards, and come with in-built encryption too, so you can be sure that your data is safe and meets all necessary data handling regulations.

Backup and restore

More data means greater potential, but it also means more backups.

Whether you need object, file, or block storage, the cloud offers backup options that are scalable, so you can easily increase and decrease the amount of storage you have based on your needs, without having to deal with hardware or working out where to put everything.

This scalability means you can be more agile and address changes in your business needs immediately.

Cloud backups on AWS are also super durable and secure. Copies of backups made on Amazon S3 and Amazon S3 Glacier are stored on at least three highly secure devices in a single AWS Region, creating a level of stability that can’t be matched on-premise.

Backing up data to the cloud can be more cost-effective too. Like most cloud storage vendors, AWS uses a pay-as-you-go pricing model, so you only pay for the storage you’re using right now, and offers cost management tools to help optimize spending.

With no infrastructure to maintain or invest in when you need more space, using the cloud for your backup needs can drastically reduce your IT overheads.

Big data analytics

Putting all your data into one centralized, well-managed data lake is the first step towards harvesting valuable insights through analytics and business intelligence. AWS data lake services provide all the tools a business needs to integrate its data into one scalable, accessible, and agile data warehouse to help them get the best results from their analytical strategy.

More organizations have data lakes and analytics on AWS than anywhere else, with companies like NASDAQ, Zillow, and Yelp hosting the foundation for their business intelligence tools on the cloud platform.

AWS cloud storage options

Amazon S3

One of AWS’s most popular services, Amazon Simple Storage Service—commonly known as Amazon S3—is an object storage service.

What makes Amazon S3 so appealing (aside from its durability, speed, and cost-effectiveness) is its simplicity. A simple web storage service with a user-friendly interface, S3 makes it really straightforward to store and retrieve data from wherever you are, whenever you need it.

It’s designed to maximize the benefits of scale, offering huge amounts of storage space at inexpensive, pay-as-you-go rates. Its simplicity is also what makes it so useful: it can be leveraged in many use cases, and by organizations of every type and size.

Businesses can take advantage of Amazon S3 to securely store data for all kinds of use; from websites and simple mobile apps to complex enterprise applications, from backups and archival data to IoT devices and big data analytics.

The clear-cut interface includes management features that enable users to organize their data and configure their data lake to meet business and compliance requirements, ensuring that their information is secure.

“Fleximize has been using AWS cloud services for a number of years. For a development team, one of the key benefits of cloud storage is the speed with which you can scale to address the changing requirements of your business. On a semi-regular basis, we find ourselves needing to clone or rebuild servers and databases, or to move around large numbers of assets stored in the cloud. While these sorts of projects had traditionally taken days or weeks to plan and execute on local storage (often at massive expense, with little option to roll back), cloud-based storage makes things much simpler, with hosting providers often providing tools to allow developers to quickly and securely perform these sorts of tasks in a matter of minutes.
“For basic file storage, products like Amazon S3 buckets also provide almost unlimited scalability for enterprise-level operations, while remaining an equally cost-effective option for small business customers. And with free tier and trial options now available on many cloud storage platforms, the barrier to entry is lower than it has ever been.” Cormac Scanlan, CTO at Fleximize 

AWS designs Amazon S3 for 99.999999999% (that’s eleven nines…) durability and between 99.5% and 99.99% availability, depending on the storage class, so users rarely have to worry about data corruption or being unable to get their hands on data when they need it.
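To put those percentages into more tangible terms, here is a rough back-of-the-envelope sketch. It assumes the durability figure can be read as the chance that a single object survives a given year, and that availability translates directly into permitted downtime; the function names are made up for this illustration.

```python
from fractions import Fraction

# Figures quoted in the text above.
DURABILITY = Fraction("0.99999999999")   # eleven nines
MINUTES_PER_YEAR = 365 * 24 * 60         # 525,600 (ignoring leap years)


def expected_annual_losses(object_count: int) -> Fraction:
    """Average number of objects lost per year at the design durability."""
    return object_count * (1 - DURABILITY)


def allowed_downtime_minutes(availability_percent: str) -> float:
    """Minutes per year a service may be unreachable at a given availability."""
    unavailable = 1 - Fraction(availability_percent) / 100
    return float(unavailable * MINUTES_PER_YEAR)


losses = expected_annual_losses(10_000_000)
print(f"Expected losses per year for 10M objects: {float(losses)}")
print(f"That is one object every {1 / losses} years on average")
print(f"Downtime at 99.99%: {allowed_downtime_minutes('99.99'):.1f} minutes/year")
```

Under those assumptions, someone storing ten million objects would expect to lose one roughly every 10,000 years, and 99.99% availability leaves room for a little under an hour of downtime a year.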

Netflix, reddit, Dropbox, Tumblr, and Pinterest are among the big-name companies that use Amazon S3 to stow their data. London-based FinTech company iwoca is another organization that’s seen the benefits of S3 storage.

“We use S3 for serving all static files, including the react SPA frontend, as well as some data in parquet format for analytical purposes,” said Will Hayes, Head of Engineering at iwoca. “It’s also used for call recordings and document uploads.

“Quite simply, we use S3 to upload data and download it when it’s needed! The main benefits of using S3 are its high-availability, speed, and simplicity of use.

“In terms of comparing S3 to using on-premise servers, iwoca doesn’t have any data centers. By doing everything in AWS we can take advantage of high availability, multi-availability zone replication, auto-scaling, and easy server provisioning. Doing this ourselves would be prohibitively expensive.”

Amazon S3 is an object storage service, meaning data is rolled together alongside its metadata, creating an “object.” Each object is given a unique, user-assigned key that can be used to identify it. These objects are then grouped together in “buckets,” similar to file folders.
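If the object, key, and bucket idea feels abstract, the toy model below may help. It is a hypothetical in-memory stand-in written for this guide, not the real S3 API: the `Bucket` and `StoredObject` names, the example keys, and the metadata fields are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """An object: the data plus the metadata that describes it."""
    data: bytes
    metadata: dict[str, str] = field(default_factory=dict)


class Bucket:
    """Hypothetical in-memory stand-in for a bucket (not the S3 API)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._objects: dict[str, StoredObject] = {}

    def put(self, key: str, data: bytes, **metadata: str) -> None:
        # Writing to an existing key replaces the object stored under it.
        self._objects[key] = StoredObject(data, dict(metadata))

    def get(self, key: str) -> StoredObject:
        return self._objects[key]

    def list_keys(self, prefix: str = "") -> list[str]:
        # Keys are flat strings; a shared prefix only *looks* like a folder.
        return sorted(k for k in self._objects if k.startswith(prefix))


bucket = Bucket("example-bucket")
bucket.put("recordings/2020/call-001.wav", b"...", content_type="audio/wav")
bucket.put("uploads/passport.pdf", b"...", content_type="application/pdf")

print(bucket.list_keys("recordings/"))
print(bucket.get("uploads/passport.pdf").metadata)
```

The point to take away is that each object is looked up by its key, and that the folder-like feel comes from keys sharing a prefix rather than from a true directory tree.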

Object storage vs file storage vs block storage

Block storage is a storage method that groups data into fixed-size chunks that are rigidly arranged within sectors and tracks. Each block of data gets its own location—much like how a library might organize books using the Dewey Decimal system—that can be used to find and access it. Block storage is great for enterprise databases.

File storage is where bits of often unrelated data are stored as files within a folder hierarchy, the way you would likely arrange your “documents” folder on your personal computer. This method is best for active documents that are accessed often.

The object storage method stores data as “objects” in scalable buckets that can be easily accessed, making it an ideal choice for storing large amounts of unstructured data, for example, for archiving or analytical purposes. Objects store data and that data’s descriptive metadata together in one place.

There are a handful of storage “classes” within Amazon S3, designed to help users choose the best storage option for their data and, in turn, optimize spending.

Cloud storage classes are a little like savings accounts; the less frequently and immediately you need to access the contents, the better rate you get.

Classes are assigned at object level, so everything in a single object must be stored using the same class, but you can have objects with different storage classes within the same bucket.

Amazon S3 classes

| Class | Details |
| --- | --- |
| Amazon S3 Standard | The default class, offering general-purpose storage best suited for frequently accessed data. |
| Amazon S3 Standard Infrequent Access (IA) | Ideal for less frequently accessed data like backup and disaster recovery solutions. |
| Amazon S3 One Zone-Infrequent Access | Best for infrequently accessed data that needs to be surfaced rapidly when required. Stored in one location only; if that data center is destroyed, all data is lost. |
| Amazon S3 Intelligent-Tiering | Designed to optimize costs by automatically moving data to the most cost-effective class depending on how often data is accessed. |
| Amazon S3 Glacier | Secure, durable, and low-cost storage class for data being archived long-term. |
| Amazon S3 Glacier Deep Archive | The lowest cost option. Best for data to be retained for 7-10 years, that only needs to be accessed a few times a year, and doesn’t need to be pulled immediately. |

Use Amazon S3 when you need: Scalable storage that lets you access data from any Internet location. Ideal for storing user-generated content, archived data, Big Data for analytical purposes, or backup and recovery data.
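As a rough way of thinking through the class descriptions above, the sketch below maps a simple access profile onto a class name. The function, its parameters, and especially the numeric cut-offs are assumptions made for this example rather than AWS guidance, so treat it as a starting point for your own rules of thumb.

```python
def suggest_storage_class(
    accesses_per_year: int,
    needs_immediate_access: bool = True,
    access_pattern_known: bool = True,
    tolerates_single_location: bool = False,
) -> str:
    """Map a rough access profile onto the S3 classes described above.

    The numeric thresholds are illustrative assumptions, not AWS guidance.
    """
    if not access_pattern_known:
        # Let the service move the data between tiers automatically.
        return "Amazon S3 Intelligent-Tiering"
    if not needs_immediate_access:
        # Archive tiers: cheapest storage, slower retrieval.
        if accesses_per_year <= 2:  # assumed meaning of "a few times a year"
            return "Amazon S3 Glacier Deep Archive"
        return "Amazon S3 Glacier"
    if accesses_per_year >= 12:  # assumed cut-off: roughly monthly or more
        return "Amazon S3 Standard"
    if tolerates_single_location:
        return "Amazon S3 One Zone-Infrequent Access"
    return "Amazon S3 Standard Infrequent Access (IA)"


print(suggest_storage_class(365))
print(suggest_storage_class(4))
print(suggest_storage_class(1, needs_immediate_access=False))
print(suggest_storage_class(50, access_pattern_known=False))
```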

Amazon Elastic Block Store

Amazon Elastic Block Store (Amazon EBS) is a persistent block storage service, most commonly used in conjunction with Amazon EC2.

What is EC2?

Amazon EC2—or Amazon Elastic Compute Cloud to give it its full title—is a core AWS offering. It provides computing power without the need for hardware, allowing users to “rent” virtual machines to run their own applications on.

These virtual computing environments are known as “instances.” Using Amazon EC2, businesses can run as many or as few virtual servers as they need to develop and deploy their applications, and can manage security, networking, and storage just like they would with a real computer.

Unlike Amazon S3, which is an object storage service, Amazon EBS stores data in raw, unformatted “blocks.” These blocks, hosted on Amazon EBS, are called Amazon EBS volumes, and they allow you to store the persistent data you need for the applications you’re running on Amazon EC2.

This persistent data is different from the temporary data that’s generated by the app but not needed long-term: this type of data is stored by EC2 automatically in “instance store volumes,” and is deleted when you shut down your instance. You don’t need additional storage for this kind of data like you do for persistent data.

To use Amazon EBS, simply create a volume, select a size, and attach it to any EC2 instance (provided both the volume and the instance exist in the same Availability Zone).

Once the instance and the volume are linked, data can be pulled from the EBS volume and fed into whatever apps or services are built on the instance. You can only attach an EBS volume to one instance at a time, though an instance can have multiple volumes attached to it.
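The attachment rules just described (same Availability Zone, one instance per volume, many volumes per instance) can be modeled in a few lines. This is a simplified sketch for illustration only: the classes, the `attach` function, and the IDs are hypothetical and are not how you would talk to AWS in practice.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Instance:
    instance_id: str
    availability_zone: str
    volume_ids: list = field(default_factory=list)


@dataclass
class Volume:
    volume_id: str
    size_gib: int
    availability_zone: str
    attached_to: Optional[str] = None  # instance ID, or None when detached


def attach(volume: Volume, instance: Instance) -> None:
    """Attach a volume, enforcing the two rules described in the text."""
    if volume.availability_zone != instance.availability_zone:
        raise ValueError("volume and instance must be in the same Availability Zone")
    if volume.attached_to is not None:
        raise ValueError("volume is already attached to an instance")
    volume.attached_to = instance.instance_id
    instance.volume_ids.append(volume.volume_id)


# Hypothetical IDs, used only for this example.
web = Instance("i-web", "us-east-1a")
worker = Instance("i-worker", "us-east-1a")
remote = Instance("i-remote", "us-east-1b")

data = Volume("vol-data", 100, "us-east-1a")
logs = Volume("vol-logs", 50, "us-east-1a")

attach(data, web)
attach(logs, web)  # one instance, several volumes: allowed
print(web.volume_ids)

for target in (worker, remote):
    try:
        attach(data, target)  # rejected: already attached, or wrong zone
    except ValueError as err:
        print(f"{target.instance_id}: {err}")
```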

To use a microcosm example, this works the same way as loading a program like Adobe Photoshop on your computer: you open the program, which runs on your operating system (the instance), then you open a file, and that data is pulled from your hard drive (the volume).

Any changes you make, or any new data generated by the program, are then saved back to the hard drive. The only difference here is that it’s happening on a much larger scale, and both your “computer” and your “hard drive” exist on AWS’s servers.

So, in essence, EBS is a cloud-based drive for a virtual computer.

Every Amazon EBS volume is automatically replicated within its Availability Zone, protecting you if a component fails, and making the data highly durable and consistently available. Data stored in EBS volumes can be encrypted at rest, and encryption keys can be created and managed through AWS Key Management Service.

There are four types of storage volume available through Amazon EBS:

  • General Purpose SSD (gp2): EBS’s default option, which balances affordability and performance. gp2 is a good choice for many different use cases, such as virtual desktops, applications, and sandbox and testing environments.
  • Provisioned IOPS SSD (io1): This powerful option is best suited to critical workloads, large databases, and apps that require faster read/write performance.
  • Throughput Optimized HDD (st1): A lower-cost, hard-disk-based option, st1 is designed for large, frequently accessed workloads like big data apps and data warehousing.
  • Cold HDD (sc1): This is the lowest cost volume option, designed for large quantities of data that needs to be accessed infrequently.
Use Amazon EBS when you need: Persistent storage for your Amazon EC2 application data.

Amazon Elastic File System

Amazon Elastic File System (Amazon EFS) is a network file system storage solution.

What is a Network File System?

A network file system (NFS) is a type of storage system that enables the storage and retrieval of data from multiple locations across a shared network, as if that data were stored on the user’s own device.

Amazon EFS scales automatically, expanding and shrinking to accommodate the needs of the workload in real-time as you add and remove files, so there’s no need to manually manage capacity to ensure applications run as they should.

It’s based on the AWS Cloud, but it can be used as a file system for both AWS cloud services and applications, and on-premise resources.

Like Amazon EBS, EFS is designed to be used with Amazon EC2 instances. One of the key differences between the two storage services, however, is that data stored on Amazon EFS can be accessed by multiple instances at the same time.

With EBS, you can add more volumes to an instance if necessary, but EFS does this automatically, which is ideal if you want to run an application with high workloads that needs a fast output.

As well as acting as a storage component for EC2, EFS can also be used as a network file system for on-premise servers by connecting to the AWS Cloud through a service called AWS Direct Connect.

Of the three services we’ve covered at this point (S3, EBS, and EFS), Amazon EFS is the most expensive. This is because of how “active” the storage is. S3 is the cheapest as it’s used more like traditional storage: written once, accessed multiple times.

EBS and EFS are made to be accessed more often, with the data actively used by apps hosted on EC2. EBS volumes can be attached to just one instance at a time, while EFS can be attached to many. AWS pricing is informed by how much resource is used, which is why EFS typically costs several times more per GB than EBS.

Amazon EFS is a good choice for use cases such as lift-and-shift enterprise applications, big data analytics, content management, app development and testing, media workflows, database backups, and container storage.

There are two storage classes to choose from with Amazon EFS:

  • Standard: Used to store files you need daily access to.
  • Infrequent Access (EFS IA): A lower-cost alternative for storing data that you don’t need to access often, like audit data, historical analytic data, and backup and recovery files.
Use Amazon EFS when you need: A scalable file system for use with AWS Cloud services like Amazon EC2 and on-premises resources.

Amazon FSx for Lustre

Amazon FSx for Lustre is a fully managed file storage system, designed for use with workloads that require a lot of compute power – for example, high-performance computing, machine learning, or media data processing. Amazon FSx for Lustre enables users to launch and run a Lustre file system able to process massive data sets quickly and capably.

What is Lustre?

Lustre is an open-source, parallel file system commonly used for large-scale cluster computing.

The platform integrates with Amazon S3, so users can link S3 data sets with their high-performance Lustre file systems to run compute-intensive workloads. FSx for Lustre automatically copies data from S3, and writes results back to S3 once workloads have been run.

Like Amazon EFS, FSx for Lustre can also be used on-premise. Users can connect their FSx for Lustre file system to their in-house servers using AWS Direct Connect.

Using FSx for Lustre is a cost-effective way for users to utilize their S3-hosted data for compute-intensive workloads.

Use Amazon FSx for Lustre when you need: A managed file system optimized for compute-intensive workloads.

Amazon FSx for Windows File Server

Amazon FSx for Windows File Server is a fully managed file system created to help move storage for Windows-based apps to AWS.

FSx for Windows File Server is built on Windows Server, and provides shared file storage that’s compatible with Windows-based apps. It fully supports the SMB protocol, Windows NTFS, Active Directory integration, and Distributed File System.

Amazon FSx uses SSD storage, so Windows app workloads like those generated by CRM, ERP, and .NET applications will enjoy high performance and durability.

Through Amazon FSx, Windows file systems can be accessed from multiple compute instances at once.

Use Amazon FSx for Windows File Server when you need: A Windows-native file system that lets you move any Windows-based apps that require file storage to AWS.

AWS Storage Gateway

The AWS Storage Gateway is a hybrid storage service that enables users to connect an on-premise app to AWS Cloud storage.

Through the AWS Storage Gateway, data can be stored in the cloud for backup and archiving, disaster recovery, data processing, or migration purposes. On-premise apps connect to AWS storage services—such as S3, Glacier, or EBS—through the AWS Storage Gateway via a virtual machine or hardware gateway appliance.

Files, volumes, and virtual tapes can all be stored on AWS through the Storage Gateway, eliminating storage limits and allowing on-premise environments to benefit from the security and durability of AWS cloud storage.

Use AWS Storage Gateway when you need: A hybrid cloud option that links an on-premise environment with AWS Cloud storage solutions.

AWS Backup

Launched in early 2019, AWS Backup is a fully managed backup service that helps streamline and automate data backups across a range of AWS storage services.

Storage volumes, databases, and file systems hosted on DynamoDB, EBS, EFS, Amazon RDS, and AWS Storage Gateway can be backed up, restored, configured, and managed from a centralized AWS Backup console.

The console also allows users to automate backup schedules and retention management, and apply backup policies across all their AWS storage resources.
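To make “backup schedules and retention management” a little more concrete, here is a small sketch of the underlying idea: backups are taken on a schedule, and any backup older than the retention window becomes eligible for deletion. AWS Backup handles this for you; the functions and dates below are hypothetical and only illustrate the logic, not the service’s own interface.

```python
from datetime import date, timedelta


def daily_schedule(start, count):
    """Dates of `count` consecutive daily backups beginning on `start`."""
    return [start + timedelta(days=i) for i in range(count)]


def expired_backups(backup_dates, retention_days, today):
    """Return the backup dates that have outlived the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return [d for d in backup_dates if d < cutoff]


backups = daily_schedule(date(2024, 1, 1), 10)  # 1 Jan to 10 Jan
to_delete = expired_backups(backups, retention_days=7, today=date(2024, 1, 10))
print([d.isoformat() for d in to_delete])
```

With a seven-day retention rule checked on 10 January, only the backups from 1 and 2 January fall outside the window.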

All backups made through AWS Backup are then encrypted in transit and at rest, making it simple to ensure compliance and audit backup data when required.

Use AWS Backup when you need: A storage service that allows you to centralize and automate the backing up of data across multiple AWS storage services, both in the cloud and on-premise.

AWS storage pricing

AWS storage pricing is a complex animal; how much you pay per GB of storage used varies massively depending on:

  • Which storage product you’re using
  • What class or tier of storage you’ve chosen
  • How much data you’re storing
  • What AWS region you’re storing your data in

You can get a rough idea of how much each storage option will cost by using AWS’s pricing calculator.
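The way those factors combine can be sketched as a simple lookup: pick the rate for your product, class, and region, then multiply by the amount stored. The prices and region names below are made-up placeholders, not real AWS rates, and the sketch ignores volume-based price tiers, request charges, and data transfer, so use the official calculator for real numbers.

```python
# Hypothetical USD prices per GB-month, keyed by (product, class, region).
# These numbers are placeholders for illustration, not real AWS prices.
EXAMPLE_RATES = {
    ("S3", "Standard", "region-a"): 0.020,
    ("S3", "Standard", "region-b"): 0.025,
    ("S3", "Glacier", "region-a"): 0.004,
    ("EBS", "gp2", "region-a"): 0.100,
}


def monthly_storage_cost(product, storage_class, region, gigabytes,
                         rates=EXAMPLE_RATES):
    """Storage-only cost: the matching per-GB rate times the volume stored."""
    try:
        rate = rates[(product, storage_class, region)]
    except KeyError:
        raise ValueError(
            f"no rate for {product}/{storage_class} in {region}"
        ) from None
    return round(rate * gigabytes, 2)


# The same 500 GB costs different amounts depending on class and region.
print(monthly_storage_cost("S3", "Standard", "region-a", 500))
print(monthly_storage_cost("S3", "Standard", "region-b", 500))
print(monthly_storage_cost("S3", "Glacier", "region-a", 500))
```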

The good news is that AWS offers more than 60 products at no cost on its Free Tier, including six types of storage. Here’s what kind of storage you can get on AWS without paying a penny.

| Product | What do you get? | How much storage? | How long is it free for? |
| --- | --- | --- | --- |
| Amazon S3 | Secure, durable, and scalable object storage infrastructure. | 5 GB of Standard Storage; 20,000 Get Requests; 2,000 Put Requests | 12 months |
| Amazon CloudFront | Web service to distribute content to end users with low latency and high data transfer speeds. | 50 GB of Data Transfer Out; 2,000,000 HTTP or HTTPS Requests | 12 months |
| Amazon EFS | Simple, scalable, shared file storage service for Amazon EC2 instances. | 5 GB | 12 months |
| Amazon Elastic Block Store | Persistent, durable, low-latency block-level storage volumes for EC2 instances. | 30 GiB of Amazon EBS: any combination of General Purpose (SSD) or Magnetic; 2,000,000 I/Os (with EBS Magnetic); 1 GB of snapshot storage | 12 months |
| Amazon Glacier | The free tier allowance can be used at any time during the month and applies to Standard retrievals. | 10 GB of Amazon Glacier data retrievals per month for free | 12 months |
| AWS Storage Gateway | Hybrid cloud storage with seamless local integration and optimized data transfer. No transfer charges into AWS. | 100 GB free per account | Forever |
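If you want a quick sanity check on whether your usage fits inside these allowances, something like the sketch below would do. It uses the Amazon S3 figures from the table above; the function name and the sample usage numbers are invented for the example.

```python
# Amazon S3 Free Tier allowances as listed in the table above.
S3_FREE_TIER = {"storage_gb": 5, "get_requests": 20_000, "put_requests": 2_000}


def free_tier_overage(usage, allowance=S3_FREE_TIER):
    """Return how far each metric exceeds its allowance (0 if within it)."""
    return {
        metric: max(0, usage.get(metric, 0) - limit)
        for metric, limit in allowance.items()
    }


# Hypothetical month of usage: storage and PUTs fit, GET requests do not.
usage = {"storage_gb": 3.5, "get_requests": 26_000, "put_requests": 1_200}
overage = free_tier_overage(usage)
print(overage)
print("Within free tier:", not any(overage.values()))
```

Anything reported as an overage would be billed at the normal rate for that product.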
