By Nicola Wright
So you’ve decided to migrate your business to the cloud—good call!
Now, there’s just the small matter of transferring your data. Here’s what you need to know about transferring your data to AWS, what tools are available, and how much it’ll cost.
Data transfer is the process of moving data, applications, or other business components from an organization’s on-premises infrastructure to the cloud, or moving them from one cloud service to another.
Data transfer services work both online and offline, and which one you should use depends on several factors: the amount of data, the time available, transfer frequency, network bandwidth, and cost.
Before you decide which is the right data transfer method for you, you need to think about some key questions. According to Ed Laczynski, CEO at video CMS creator Zype, there are four factors to pay particular attention to.
“First consideration is the size of the data that is being transferred,” says Ed. “How much data? How many files? Second is the format. How is it stored: Lots of small files in lots of folders? Lots of large files? Is it compressed?
“Third is the source location, and the network environment at that location. Is the location geographically near an AWS Region? Is the network bandwidth on the source end adequate and consistent? Speed of light matters!
“Fourth are the privacy and security needs for the data. Can anyone move this data? Are your employees authorized to move the data? Do special arrangements need to be made to encrypt the data in preparation for the move? How will it be secured in the cloud?”
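Bandwidth is often the deciding factor. As a back-of-the-envelope check, you can estimate how long an online transfer will take from the dataset size and your sustained upload speed. The `transfer_days` helper and the 70% efficiency factor below are illustrative assumptions, not AWS figures:

```python
# Rough estimate of how long an online transfer takes.
# Real-world throughput is usually well below the line rate,
# so we apply an assumed efficiency factor.

def transfer_days(data_tb: float, bandwidth_mbps: float,
                  efficiency: float = 0.7) -> float:
    """Days needed to move data_tb terabytes over a bandwidth_mbps link."""
    data_bits = data_tb * 1e12 * 8                      # decimal TB -> bits
    seconds = data_bits / (bandwidth_mbps * 1e6 * efficiency)
    return seconds / 86400                              # seconds -> days

# Moving 50 TB over a 100 Mbps line at 70% efficiency:
print(round(transfer_days(50, 100), 1))  # about 66 days
```

Two months for 50 TB is exactly the kind of result that pushes teams toward the offline options covered below.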
In the majority of cases, a major data transfer to AWS will be undertaken as part of a migration to the cloud. That’s the scenario we’re going to be thinking about here.
There may well be other, more efficient data transfer tools for other use cases, but we’re going to look at the best options when migrating a large amount of data in a single instance.
Given that the AWS Cloud is an online platform that promises to free its customers from the shackles of hardware and in-house servers, you might be surprised to hear that AWS offers some decidedly analog methods of transferring your data.
Depending on how much data you have to shift and how quickly you need it transferred, you can opt to migrate your data over an internet connection, or by transferring your data to a physical storage device which AWS then takes away and imports for you.
With an online data transfer method, you can set up a network link to the AWS Virtual Private Cloud (VPC) and transfer your data to AWS via an internet connection.
Network data transfers via AWS’ VPC are useful for lifting and shifting large datasets once, and help you integrate existing process flows like backup and recovery.
AWS offers several methods that allow you to create a network link to your VPC, with options to suit your needs depending on the quantity of your data, where it’s located, and how fast you want to shift it.
If you have too much data to spend weeks, months, or years watching it stream into AWS over a network connection, if your connection speed isn’t great, or if you’d rather someone just turned up and dealt with the whole data transfer thing for you, you can opt to migrate the “old-fashioned” way.
AWS offers a set of services that allow you to drop your data onto a secure device and ship it back to AWS for transfer onto its cloud platform.
How much data you need to ship will dictate which service you need: AWS Snowball, a rugged, end-to-end encrypted device with an 80TB capacity, or AWS Snowmobile, a 100 petabyte storage facility housed in a 45ft semi-truck.
Both of these options offer secure, shippable devices that take the hassle out of moving archives, data lakes, or other huge volumes of data that can’t be transferred over a network. Which one is right for the job at hand depends largely, according to Ed, on how much data you have to move and how quickly you need access to it once it reaches its destination.
“Offline services like Snowball are good for very large datasets,” explains Ed, “without a requirement for immediate availability. For most use cases, online transfer is suitable and much more convenient, plus you can test the outcome of the move much faster.
“For example, you can move a small subset of data, test the move, make any necessary changes, and then move a larger set of data. Offline mechanisms make that feedback loop much longer.”
If an organization wants to take advantage of cloud storage but has applications running on-premises that require low-latency access to their data, AWS hybrid cloud storage architectures may be the way to go. These connect on-premises applications and systems to the cloud, helping to minimize data costs and management requirements.
Let’s take a closer look at some of your online data transfer options.
AWS Virtual Private Network (AWS VPN) allows users to establish a secure, private connection from their network or device to the AWS Cloud. There are two options when it comes to using AWS VPN:
AWS VPN is encrypted, quick to set up, and pretty cost-effective for small data transfers. It is a shared connection, however, so it’s not always as reliable as other options.
Migrating a whole database to AWS? The AWS Database Migration Service (DMS) is built just for that purpose. Migrating databases to AWS with this tool is fast and secure, and best of all, the database you’re transferring remains fully operational and usable throughout the shift.
This is because AWS DMS continuously replicates your data with high availability. It can also consolidate disparate databases into a multi-petabyte data warehouse by streaming data to Amazon Redshift and Amazon S3.
If you’re transferring your data to the popular AWS storage platform S3 over long distances, AWS S3 Transfer Acceleration helps you do it faster: 50-500% faster for the long-distance transfer of larger objects, according to AWS.
Amazon S3 Transfer Acceleration is a service that allows you to upload data to an S3 bucket quickly and securely over the public Internet. If you’re uploading to a centralized bucket from different locations across the globe, S3 Transfer Acceleration can save a lot of transfer time.
Data is routed to S3 on optimized network paths via Amazon CloudFront edge locations, which are spread across the globe. This helps maximize available bandwidth no matter how far the data is traveling or how much the latency varies.
There are no special clients or proprietary network protocols involved either. Just turn on Transfer Acceleration for an S3 bucket using the Amazon S3 console, set your S3 endpoint to one of two TA options, and the acceleration is applied automatically.
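Once acceleration is enabled, the only change on your side is the endpoint hostname. The two Transfer Acceleration endpoint formats (standard and dual-stack) follow a fixed pattern, sketched here with a placeholder bucket name:

```python
# Hypothetical helper that builds the two Transfer Acceleration
# endpoint hostnames for a given bucket. "my-bucket" is a placeholder.

def accelerate_endpoints(bucket: str) -> dict:
    return {
        "standard": f"{bucket}.s3-accelerate.amazonaws.com",
        "dualstack": f"{bucket}.s3-accelerate.dualstack.amazonaws.com",
    }

print(accelerate_endpoints("my-bucket")["standard"])
# my-bucket.s3-accelerate.amazonaws.com
```

Point your uploads at one of these hostnames instead of the regular bucket endpoint and the accelerated routing kicks in automatically.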
The service is available for both reading and writing data to Amazon S3, making it useful for recurring jobs like media uploads, backups, and local data processing.
AWS DataSync is a service that allows users to automate the shifting of data between on-premises storage and Amazon S3 or Amazon EFS.
There’s lots to consider when migrating data: it’s not always as simple as drag and drop. The administrative tasks that come with data transfers can eat up resources and slow down cloud migration, but DataSync aims to take care of some of that burden for you.
Running instances, encryption, managing scripts, network optimization, and validating data can all be handled automatically through DataSync, all while transferring data up to 10 times faster than many open source migration services.
The tool uses an on-premises software agent to connect to your in-house storage systems using Network File System (NFS) and Server Message Block (SMB) protocols, eliminating the need for you to write scripts or modify your applications to make them compatible with AWS APIs.
DataSync can be used to copy data over AWS Direct Connect or other internet links to AWS, and is suitable for both one-time data migrations, recurring workflows, and for automated backup and recovery purposes.
The service works by deploying a DataSync agent: a virtual machine that reads or writes data from your on-premises storage system. The agent connects to your file system, you select either Amazon EFS or Amazon S3 as the destination and configure your transfer options, and away your data goes.
Once it’s in transit, you can monitor its progress either through the DataSync console or using Amazon CloudWatch.
DataSync is perfect for businesses that are looking to migrate active data sets, want timely transfers for continuously updating data streams, or need to replicate data for business continuity.
As the most portable and compact storage device, AWS Snowcone is the baby of the AWS Snow Family.
It weighs just 4.5lbs and is available with SSD or HDD options, letting users deploy ultra-portable data transfer and edge computing devices anywhere. Snowcone connects on-premises Windows, Linux, and macOS systems and file-based applications to AWS via the NFS interface.
AWS Snowball is a suitcase-sized, super-tough device that can be loaded with up to 80TB of data.
Although copying your data onto a giant hard drive and posting it off to be uploaded might seem like a pretty archaic way to get your data onto the cloud, it’s actually becoming one of the most popular methods of AWS data transfer, thanks to its security, cost-effectiveness, and fast turnaround time.
Transferring terabytes of data over an internet connection, even a super-fast one, can take months, and can rack up high costs in the process. Snowball makes it simple and fast to migrate colossal amounts of data while keeping costs down.
The humble Snowball device boasts 256-bit encryption, and its heavyweight casing is tamper-resistant thanks to its industry-standard Trusted Platform Module (TPM). Encryption keys are managed through AWS Key Management Service (KMS) and are never sent to or stored on the device itself, for extra security.
Snowball Edge can also run Lambda and EC2-based applications locally, even without a network connection. This makes it ideal for use cases that need local processing or analysis before the data makes its way onto the AWS Cloud.
Snowball Edge supports local workloads in remote or offline locations, facilitating one-time data transfers where the data is being migrated from an on-premises environment that’s in a remote location or has limited network bandwidth.
Got too much data to fit on a handful of Snowballs? Time to call in the big guns.
AWS Snowmobile is an exabyte-scale data migration solution that packs the equivalent of 1,250 AWS Snowball devices into a 45ft shipping container.
AWS Snowmobile is able to transport up to 100PB of data in a single trip, at around a fifth of the cost of transferring data over a high-speed internet connection.
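The capacity figure checks out against the Snowball equivalence quoted above:

```python
# Sanity-checking Snowmobile's quoted capacity:
# 1,250 Snowballs x 80 TB each, in decimal units (1 PB = 1,000 TB).

snowball_tb = 80
snowball_equivalents = 1250
total_pb = snowball_tb * snowball_equivalents / 1000
print(total_pb)  # 100.0
```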
After assessing your migration needs, AWS will transport a Snowmobile to you, where AWS personnel will configure it for you, setting up a removable, high-speed network switch from Snowmobile to your local network.
Once this is in place, you can start loading up the Snowmobile with data from an unlimited number of sources within your data center. The Snowmobile is driven back to AWS to be loaded into Amazon S3.
Snowmobile is not only the fastest and cheapest way to transfer huge amounts of data to the cloud, it’s also highly secure.
Dedicated security personnel, GPS tracking, alarm monitoring, and 24/7 video surveillance work together to keep your data safe throughout its journey. Data is encrypted with 256-bit encryption keys too.
AWS Direct Connect is a dedicated connection from your datacenter to the AWS Cloud.
Like AWS VPN, Direct Connect provides a secure connection between your infrastructure and the AWS Cloud. Where Direct Connect differs is that it circumvents the public internet, establishing a private connection over a dedicated 1 Gbps or 10 Gbps fiber-optic Ethernet link between your router and an AWS Direct Connect router.
Because of this dedicated connection, Direct Connect is significantly more costly than its public internet-using VPN counterpart.
If you’re going to need a fast, reliable, and secure channel to continuously stream large amounts of data back and forth from the AWS Cloud, it might be worth setting up a Direct Connect line for your data center.
Otherwise, given the cost and the time it takes to set up, it’s not an ideal data transfer option if you only need it to execute a one-off migration.
Storage Gateway is a handy tool that enables users to connect and extend on-premises applications to AWS storage. Storage Gateway provides cloud-backed file shares and creates a low-latency cache to access data in AWS for on-premises applications.
This service is provided through three different gateways:
AWS data transfer costs are incurred when moving data either between AWS and the internet or between different AWS services.
AWS wants you to store data on its platform, so as a rule, it doesn’t charge for importing data into its cloud platform, though there are often costs associated with transferring back out.
Generally, what you’re paying for with AWS data transfer services is the resources and infrastructure required to facilitate the transfer. How much you’ll pay to transfer your data will depend on the method you choose, your region(s), how many resources the transfer uses, and how fast the connection is.
In April 2022, AWS announced that inter-Availability Zone (AZ) data transfers within the same AWS Region for AWS PrivateLink, AWS Transit Gateway, and AWS Client VPN would be completely free of charge.
AWS VPN costs are calculated by hourly usage, so you’ll be charged for every hour that the connection is active:
If you’re using AWS Database Migration Service to transfer your existing databases to Amazon Aurora, Redshift, or DynamoDB, you can use it free for six months.
Once that period is up, you’ll only pay for the compute resources, or instances, used to port databases to AWS, and for any additional log storage needed.
Each DMS database migration instance includes enough storage for swap space, replication logs, and data caching for the majority of cases.
These on-demand instances are priced by hourly usage, on a sliding scale depending on how powerful the instance is and whether you opt for single or multiple Availability Zone (AZ) instances. Multi-AZ instances give your migration more durability: if one zone goes down, you’ve still got another to run the migration in.
Instance pricing begins at $0.018 per hour, topping out at $21.65 per hour for multi-AZ instances with the highest level of processor performance and lowest network latency.
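Using those published rates, estimating a migration’s instance cost is simple multiplication. The week-long scenario below is hypothetical:

```python
# Estimate on-demand DMS instance cost from hours and hourly rate.
# Excludes any additional log storage charges.

def dms_migration_cost(hours: float, hourly_rate: float) -> float:
    return round(hours * hourly_rate, 2)

# A week-long migration on the cheapest instance ($0.018/hr):
print(dms_migration_cost(7 * 24, 0.018))  # 3.02
```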
Pricing for the AWS S3 Transfer Acceleration service is calculated by the volume of data you’re migrating to S3, rather than by how long you’re using the connection, as with some other tools.
Transfer Acceleration constantly monitors its speeds, and if, for whatever reason, the service doesn’t move your data faster than a standard transfer over public internet would, you won’t be charged for using it.
Using AWS DataSync, you’ll be charged based on the amount of data that you transfer through the service, costing $0.0125 per gigabyte (GB) of data copied.
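At that per-gigabyte rate, DataSync costs are easy to project; the 10 TB example below is illustrative:

```python
DATASYNC_RATE = 0.0125  # USD per GB copied (rate quoted above)

def datasync_cost(gb: float) -> float:
    return round(gb * DATASYNC_RATE, 2)

# Copying a 10 TB (10,000 GB) dataset:
print(datasync_cost(10_000))  # 125.0
```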
Amazon Kinesis Data Firehose costs are charged by the volume of data you put into the service, with no set up fees or upfront commitments required.
The exact pricing will depend on which of the four on-demand usages you’ve utilized: ingestion, format conversion, VPC delivery, or Dynamic Partitioning.
Data transfer using AWS Snowball is charged one of two ways: either on-demand or upfront by project.
Snowball’s on-demand pricing is charged per “job”: each job includes use of a single Snowball device for ten days, and the import of data into Amazon S3.
How much a job will cost you depends on the device needed:
You get ten days to upload your data to the Snowball(s) before you start getting charged for additional days; each extra day the device is onsite incurs a per-day fee of $30–$50.
The job fee doesn’t include shipping, though: you’ll be charged the standard shipping rate for whichever shipping service you opt to use.
If you’ve got more data than will fit on a single Snowball, you can get several shipped to you at once, though you’ll still be charged as though they were individual jobs.
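Putting those pieces together, a rough job-cost calculator looks like the sketch below. The base job fee varies by device and isn’t quoted here, so the figure used is a placeholder; only the $30–$50 per-day fee and the ten included days come from the pricing above (shipping excluded):

```python
# Rough Snowball on-demand job cost. job_fee=200.0 is a PLACEHOLDER,
# not a published AWS price; the per-day fee after the included ten
# days is the $30-$50 range quoted in the article.

def snowball_job_cost(days_onsite: int, job_fee: float = 200.0,
                      per_day_fee: float = 30.0,
                      included_days: int = 10) -> float:
    extra_days = max(0, days_onsite - included_days)
    return job_fee + extra_days * per_day_fee

print(snowball_job_cost(14))  # 200 + 4 extra days * $30 = 320.0
```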
There’s a significant discount available to users ready to commit to longer-term usage. In fact, users can save up to 62% on 1-year and 3-year usage commitments, and avoid paying additional service fees and per-day fees all the while.
Snowmobile jobs are priced a little differently, with costs calculated by both storage used and how long the job takes.
With Snowmobile, you’ll pay $0.005 per GB per month. The clock starts ticking when a Snowmobile leaves an AWS data center and starts its journey to you, and stops when all your data has been uploaded to the AWS Cloud.
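At $0.005 per GB per month, a full Snowmobile adds up quickly; the three-month scenario below is hypothetical:

```python
SNOWMOBILE_RATE = 0.005  # USD per GB per month (rate quoted above)

def snowmobile_cost(pb: float, months: float) -> float:
    gb = pb * 1_000_000  # decimal units: 1 PB = 1,000,000 GB
    return gb * SNOWMOBILE_RATE * months

# A full 100 PB load that takes three months end to end:
print(snowmobile_cost(100, 3))  # 1500000.0
```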
Like other AWS data transfer services, with AWS Direct Connect you only pay for what you use, with no minimum spend required.
Direct Connect is priced by the hour, with two cost options depending on the capacity of your Dedicated Connection:
If you ever want to transfer data out using Direct Connect, there are additional charges to pay.
Charges for AWS Storage Gateway accumulate based on usage—it all depends on the type and amount of storage you use, as well as what requests you make and the amount of data being transferred out.
Data transfer out from the AWS Storage Gateway service to your on-premises gateway appliance is charged at between $0.05 and $0.09 per GB, while data transfer out to your gateway appliance in Amazon EC2 costs $0.02 per GB.
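Those egress rates translate into costs like the following sketch (the 500 GB figure is illustrative):

```python
# Storage Gateway data-transfer-out cost. The on-premises rate
# ranges $0.05-$0.09/GB; we default to the top of that range.

def gateway_egress_cost(gb: float, to_ec2: bool = False,
                        on_prem_rate: float = 0.09) -> float:
    rate = 0.02 if to_ec2 else on_prem_rate
    return round(gb * rate, 2)

print(gateway_egress_cost(500))               # 45.0 at the top-end rate
print(gateway_egress_cost(500, to_ec2=True))  # 10.0
```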
Knowing how to reduce the cost of your AWS data transfer can be invaluable to your business’s cost optimization and cloud strategy. Of course, some AWS data transfers will inevitably cost more than others—it all depends on the size and scale of the project—but here are some top tips on how to reduce AWS data transfer costs.