Introduction to Snowball

Although a Storage Gateway can be a great solution for delivering data from our on-premises data center into AWS, it still depends on the bandwidth of the internet connection. Some datasets are so large that transferring them across the internet would be slow, expensive, and inefficient.

For example, let's take a look at how long it takes to transfer a large dataset across the internet. Consider a fairly common dataset size of 50 TB. If we dedicate a 1 Gbps connection to transferring that dataset, it will take about 111 hours to move all the data across that link to the target destination, and that figure assumes ideal conditions and 100% link utilization. Once protocol overhead and competing traffic are taken into account, we can safely say that on a 1 Gbps uplink, which is a common corporate internet uplink these days, transferring that dataset takes roughly 5 to 6 days. What about an even larger dataset, say 100 TB? That would take more than 9 days at best, and realistically closer to 12. What about 1 PB? That would take more than 3 months at best, and closer to 4 in practice!
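The arithmetic behind these estimates can be checked with a short calculation. The following is a minimal sketch, assuming decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second). The function name transfer_hours and its utilization parameter are illustrative only and are not part of any AWS tooling.

```python
def transfer_hours(dataset_tb: float, link_gbps: float, utilization: float = 1.0) -> float:
    """Estimate the hours needed to move dataset_tb terabytes over a link.

    Assumes decimal units: 1 TB = 1e12 bytes, 1 Gbps = 1e9 bits per second.
    utilization is the fraction of the link actually available (0 < u <= 1).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    total_bits = dataset_tb * 1e12 * 8                # terabytes -> bits
    effective_bps = link_gbps * 1e9 * utilization     # usable bits per second
    return total_bits / effective_bps / 3600          # seconds -> hours


if __name__ == "__main__":
    # Dataset sizes discussed in the text: 50 TB, 100 TB, and 1 PB (1,000 TB).
    for size_tb in (50, 100, 1000):
        hours = transfer_hours(size_tb, link_gbps=1)
        print(f"{size_tb:>5} TB over 1 Gbps: {hours:7.0f} h ({hours / 24:.1f} days)")
```

Under these assumptions, the results are best-case lower bounds: 50 TB at full utilization comes to roughly 111 hours, and lowering utilization to 0.9 raises that to roughly 123 hours. Real links rarely sustain full utilization, so practical estimates should be padded accordingly.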

AWS Snowball was designed to enable us to quickly and efficiently transfer large datasets in exactly these kinds of scenarios. The Snowball device is a self-contained shipping unit that is able to deliver either 42 or 72 TB of data from on-premises locations into AWS. With much larger datasets, we can use multiple Snowball devices in parallel to quickly and efficiently transfer data at petabyte scale to AWS. The device itself hosts an S3-compatible endpoint that we can plug into our network via a 10 GbE copper, SFP, or SFP+ connector located on the device. The service can then be addressed on the local network, and the full transfer speed of the link can be utilized to copy data onto the device. With the 10 GbE connection, we can fill up the 42 TB Snowball with data in about 10 hours, while the 72 TB model will take about 18 hours to fill completely. This means that a 1 PB dataset can be transferred to about 24 smaller devices in about 10 hours, or to 15 larger ones in about 18 hours, when all devices are filled to capacity in parallel.
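The device counts and fill times above can be estimated in the same way. The following is a minimal sketch, assuming decimal units, the usable capacities quoted in the text (42 TB and 72 TB), and a 10 Gbps link running at full line rate. The helper names devices_needed and fill_hours are hypothetical and exist only for illustration.

```python
import math


def devices_needed(dataset_tb: float, usable_tb_per_device: float) -> int:
    """Number of devices required to hold the dataset, rounded up."""
    return math.ceil(dataset_tb / usable_tb_per_device)


def fill_hours(usable_tb: float, link_gbps: float = 10.0) -> float:
    """Best-case hours to fill one device at full line rate.

    Assumes decimal units: 1 TB = 1e12 bytes, 1 Gbps = 1e9 bits per second.
    """
    return usable_tb * 1e12 * 8 / (link_gbps * 1e9) / 3600


if __name__ == "__main__":
    dataset_tb = 1000  # 1 PB, treated as 1,000 TB (an assumption)
    for capacity_tb in (42, 72):
        count = devices_needed(dataset_tb, capacity_tb)
        hours = fill_hours(capacity_tb)
        print(f"{capacity_tb} TB devices: {count} needed, "
              f"about {hours:.1f} h each when filled in parallel")
```

For a 1,000 TB dataset this yields 24 of the 42 TB devices or 14 of the 72 TB devices. The fill times it reports are best-case figures at line rate, so actual copies will usually take somewhat longer, and adding a spare device or some time margin is a reasonable precaution.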

Once the copy is complete, we simply power off the unit and call our local carrier. The unit has an E Ink label that displays the shipping information, so we don't need to worry about anything else. Once the Snowball arrives at the designated data center, the data is transferred to an S3 bucket in our account. We can then perform any action available on S3, such as applying lifecycle rules or serving the data via HTTP/HTTPS.
