Choosing a Cloud Provider—Storage: Part 2—Azure & GCP

In the first part of this series, I covered the basics of cloud storage and AWS’ most popular cloud storage services. While Amazon was the first to introduce object, block, and file storage in the cloud, Microsoft Azure and Google Cloud Platform (GCP) both have some tricks up their sleeves, despite being late to the cloud game. This includes features not currently supported by AWS, which I’ll discuss in this blog post.

Cloud Storage on Azure

Azure Cloud Storage offers a range of services suitable for any kind of enterprise workload, from hosting development and testing applications to running redundant databases with extensive I/O requirements. Depending on your needs, you can choose from Azure Blob Storage and Azure Archive Storage, Managed Disks, or Azure Files.

Azure Blob Storage

Azure Blob Storage is Microsoft’s object cloud storage service, where data is organized in containers and blobs. This means you can have an unlimited number of containers in which you can store unlimited numbers of blobs. There are three different types of blobs:

      • Block blobs: These store text and binary data and have a maximum size of 4.75 TB. However, Microsoft is letting select customers try out block blobs of 190.7 TB, and will likely make them generally available soon.
      • Append blobs: These are similar to block blobs, but used in scenarios where you frequently append data, such as streaming log files from your virtual machines (VMs) or databases.
      • Page blobs: These are random access files, with a maximum size of 8 TB. Page blobs can also store VHD files for Azure VMs and serve as actual disk volumes in running VMs. This is different from AWS, where Amazon S3 can be mounted inside a VM, but cannot serve as its boot volume. Note that the performance of page blobs used as disk volumes is not great. In Azure terminology, page blobs are referred to as Unmanaged Disks.

Azure Blob Storage, in general, can also be used as a storage option for data lakes and high-performance computing. Its pricing model is similar to that of Amazon S3: you pay for the amount of data stored, the storage tier you choose (Premium, Hot, or Cool), data transfer, and each operation you perform (read, write, delete, etc.).

One advantage of Azure over AWS is the option to reserve a specific amount of storage capacity (starting at 100 TB, in 100 TB increments) for one or three years. In return, you get a discounted price per GB stored, but this requires estimating your storage needs in advance.

Azure Archive Storage

Microsoft defines Azure Archive Storage as “highly available secure cloud storage for rarely accessed data with flexible latency requirements,” which makes this service a direct competitor of Amazon S3 Glacier. In essence, however, Archive Storage is another tier of Blob Storage, in which storing data is significantly cheaper ($0.00099 per GB, compared to $0.0184 per GB in the Hot storage tier above), while retrieving it is significantly more expensive ($5 per 10,000 read operations, compared to $0.004 in the Hot storage tier). The use case for this cloud storage type is quite clear: You can store large amounts of data you don’t plan on retrieving often, if ever, and simply delete the data when it’s no longer needed.
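
To make this trade-off concrete, here is a rough sketch that uses only the per-GB and per-operation prices quoted above. It assumes the storage prices are monthly, and it ignores per-GB retrieval fees, early-deletion charges, and transfer costs, so treat it as an illustration rather than a billing estimate. The names `TierPrice`, `monthly_cost`, and `break_even_reads` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPrice:
    storage_per_gb: float  # USD per GB stored (assumed to be per month)
    reads_per_10k: float   # USD per 10,000 read operations


# Prices as quoted in this article; check the current price list before relying on them.
HOT = TierPrice(storage_per_gb=0.0184, reads_per_10k=0.004)
ARCHIVE = TierPrice(storage_per_gb=0.00099, reads_per_10k=5.0)


def monthly_cost(tier: TierPrice, stored_gb: float, read_ops: float) -> float:
    """Storage cost plus read-operation cost for one month."""
    return tier.storage_per_gb * stored_gb + tier.reads_per_10k * read_ops / 10_000


def break_even_reads(stored_gb: float) -> float:
    """Number of monthly reads at which Archive stops being cheaper than Hot."""
    storage_saving = (HOT.storage_per_gb - ARCHIVE.storage_per_gb) * stored_gb
    extra_cost_per_read = (ARCHIVE.reads_per_10k - HOT.reads_per_10k) / 10_000
    return storage_saving / extra_cost_per_read


if __name__ == "__main__":
    gb = 10_000
    print(f"Hot, no reads:     ${monthly_cost(HOT, gb, 0):.2f}")
    print(f"Archive, no reads: ${monthly_cost(ARCHIVE, gb, 0):.2f}")
    print(f"Break-even reads:  {break_even_reads(gb):,.0f}")
```

Under these simplified assumptions, 10,000 GB of data would need roughly 350,000 read operations a month before the Hot tier becomes the cheaper choice.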

Azure Managed Disks

For non-production VMs, like those running in development and testing environments, you can use page blobs to store a VM operating system and its accompanying data. In fact, this was the only way to provision a VM on Azure until recently, when Microsoft introduced Managed Disks. Microsoft now offers four types of managed disks: Standard (SSD and HDD variants), Premium, and Ultra.

      • Standard: These disks come in SSD and HDD variants and are designed for general purpose VMs. 
            • SSD: These serve production workloads that don’t require high IOPS and/or throughput. The smallest option is 4 GB, and capacity goes up to 32 TB (or, as Microsoft calls these disks, from E1 to E80, where capacity doubles with each level). The maximum IOPS and throughput with this disk are 6,000 and 750 MB/s, respectively.
            • HDD: The size of these disks varies from 32 GB to 32 TB (S4 to S80, in Microsoft’s language), with the maximum IOPS and throughput capped at 2,000 and 500 MB/s, respectively.
      • Premium: This tier offers only SSD disks, which have the same capacity range as the Standard tier but higher IOPS (up to 20,000) and throughput (capped at 900 MB/s). This, of course, impacts price. However, Microsoft offers reservations for this disk type; the only downside is that reservations apply only to P30–P80 types (1 TB drives and higher).
      • Ultra: This tier offers maximum configurability to end users, so when picking a specific disk size (from 4 GB to 32 TB), there’s a range of available IOPS and throughput values. This way, users can tailor their disk to the workloads running on top of these volumes. 
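
Because disk capacity comes in fixed steps, picking a disk usually means rounding your requirement up to the next tier. The sketch below illustrates that lookup. Only the endpoints (4 GB to 32 TB, E1 to E80, S4 to S80, P30 at 1 TB) come from this article; the intermediate tier numbers and sizes in `ASSUMED_TIERS` are my assumption about Microsoft’s naming scheme and should be checked against the current documentation. The function name `smallest_tier` is hypothetical.

```python
# (tier number, capacity in GiB). Assumed values; verify against Azure's documentation.
ASSUMED_TIERS = [
    (1, 4), (2, 8), (3, 16), (4, 32), (6, 64), (10, 128), (15, 256),
    (20, 512), (30, 1024), (40, 2048), (50, 4096), (60, 8192),
    (70, 16384), (80, 32767),
]

# Smallest tier number offered per family: Standard HDD starts at S4 (32 GB),
# while Standard SSD (E) and Premium (P) start at the 4 GB tier.
FIRST_TIER = {"S": 4, "E": 1, "P": 1}


def smallest_tier(required_gib: int, family: str = "E") -> tuple[str, int]:
    """Return (tier name, capacity in GiB) of the smallest disk that fits."""
    if family not in FIRST_TIER:
        raise ValueError(f"unknown disk family: {family!r}")
    if required_gib <= 0:
        raise ValueError("required_gib must be positive")
    for number, size_gib in ASSUMED_TIERS:
        if number >= FIRST_TIER[family] and size_gib >= required_gib:
            return f"{family}{number}", size_gib
    raise ValueError(f"{required_gib} GiB exceeds the largest disk size")


if __name__ == "__main__":
    for need, family in [(100, "E"), (10, "S"), (1024, "P")]:
        print(need, family, smallest_tier(need, family))
```

Note that you pay for the whole tier, not for the space you use, so a 130 GB requirement lands you on a 256 GB disk.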

Azure Files

Azure Files is Microsoft’s implementation of cloud file storage, with support for SMBv3 and NFSv4. There are four different tiers: Premium, Transaction Optimized, Hot, and Cool.

      • Premium: Naturally, this tier is the most expensive and offers the best performance, since the underlying shared network storage is based on SSD disks.
      • Transaction Optimized: Latency in this tier is significantly higher than in the Premium tier. However, this tier is suitable for workloads with a lot of transactions, and can be used with a plethora of database engines, even for production needs.
      • Hot and Cool: These tiers are mostly used for instances that need shared file storage, but are not under heavy load, like user home directories or traditional enterprise user file shares. When choosing between the two, keep the following in mind:
            • Hot has higher storage prices than Cool, but when it comes to the operations you perform on the files (read, write, delete), Cool comes with a higher price tag.
            • You should therefore use Hot for shared network file systems with lots of operations. Use Cool for bulk-storing data that’s not accessed or changed frequently, but still needs to be available from multiple servers (like in backup scenarios).

Cloud Storage on GCP

Last but not least, I will cover storage options on GCP. The most important is probably Google Cloud Storage (a direct competitor to AWS’ most popular service, Amazon S3), but we shouldn’t neglect Google Persistent Disks or Google Filestore.

Google Cloud Storage

Google Cloud Storage (GCS) follows in the footsteps of Amazon S3: it’s an object storage service where data is organized into buckets. The difference is that Google offers dual-region and multi-region buckets (where data is stored in several regions, like the entire United States, the EU, and Asia), whereas with AWS, buckets can only exist in one region.

Google Cloud Storage offers four storage classes:

      • Standard: For frequently used data (e.g., to be used as a file share for your enterprise)
      • Nearline: For backups and data accessed a couple of times a month
      • Coldline: For disaster recovery needs (accessed about once a quarter, as Google suggests)
      • Archive: Equivalent to Amazon S3 Glacier and Azure Archive Storage
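
The classes above map fairly directly to how often you expect to read the data. Here is a minimal sketch of that decision. The thresholds are my own reading of the descriptions in the list ("a couple of times a month" taken as up to 24 accesses a year, "once a quarter" as up to 4), not official limits, and the function name `suggest_storage_class` is hypothetical. The returned strings match the storage class identifiers Google Cloud Storage uses.

```python
def suggest_storage_class(accesses_per_year: float) -> str:
    """Suggest a storage class from the expected number of accesses per year.

    Thresholds are illustrative assumptions, not limits enforced by Google.
    """
    if accesses_per_year < 0:
        raise ValueError("accesses_per_year must not be negative")
    if accesses_per_year > 24:   # more than a couple of times a month
        return "STANDARD"
    if accesses_per_year > 4:    # more often than once a quarter
        return "NEARLINE"
    if accesses_per_year > 1:    # more often than once a year
        return "COLDLINE"
    return "ARCHIVE"


if __name__ == "__main__":
    for n in (365, 12, 4, 0):
        print(n, suggest_storage_class(n))
```

In practice you would also weigh minimum storage durations and retrieval fees, which this sketch leaves out.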

Google Cloud Storage also supports static website hosting and object versioning, and buckets can be accessed via a web browser, Google’s gcloud CLI, or Google’s client libraries, which are available for all popular programming languages.

Google Persistent Disk

Google Compute Engine and Kubernetes Engine instances use Google Persistent Disk to run their underlying operating systems and to store application and user data. This block storage service can provision volumes on SSD or HDD drives up to 64 TB in size, and IOPS and throughput are directly related to the size you choose. This means that if you want better performance, you need to provision larger volumes. By default, all Persistent Disks are encrypted, and users can choose whether they want Google to manage the encryption keys or prefer to import their own.
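
Since performance scales with size, sizing a disk means taking the larger of two numbers: the space your data needs and the space required to reach your target IOPS. The sketch below shows that calculation. The per-GB rates in `ASSUMED_IOPS_PER_GB` are illustrative assumptions, not figures from this article, and the sketch ignores per-disk and per-instance performance caps; only the 64 TB maximum comes from the text above. The function name `size_for_iops` is hypothetical.

```python
import math

# Illustrative read IOPS gained per provisioned GB. Assumed values;
# check Google's documentation for the current rates and caps.
ASSUMED_IOPS_PER_GB = {"ssd": 30.0, "hdd": 0.75}

MAX_DISK_GB = 64 * 1024  # 64 TB maximum volume size, as stated above


def size_for_iops(target_iops: float, disk_type: str, data_gb: int) -> int:
    """Return the disk size in GB that holds the data and reaches target_iops."""
    if disk_type not in ASSUMED_IOPS_PER_GB:
        raise ValueError(f"unknown disk type: {disk_type!r}")
    needed_for_iops = math.ceil(target_iops / ASSUMED_IOPS_PER_GB[disk_type])
    size_gb = max(data_gb, needed_for_iops)
    if size_gb > MAX_DISK_GB:
        raise ValueError("requirement exceeds the maximum disk size")
    return size_gb


if __name__ == "__main__":
    # 100 GB of data, but 15,000 IOPS wanted: the IOPS target drives the size.
    print(size_for_iops(15_000, "ssd", data_gb=100))
    # 200 GB of data and a modest IOPS target: the data drives the size.
    print(size_for_iops(300, "ssd", data_gb=200))
```

The practical consequence is that a small but busy database can end up paying for far more capacity than it stores.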

One significant advantage over the competition is that Google offers regional Persistent Disks, which are replicated across two zones in the selected region. This comes in particularly handy if you want to use containers with persistent storage in regional GKE clusters, a feature not currently available on AWS or Azure.

Google Filestore

Google’s cloud file storage, Filestore, offers two service tiers: Basic and the newly introduced High Scale, which is still in beta.

      • Basic: This tier allows you to provision instances with up to 64 TB of storage, backed by HDD or SSD disks, depending on your needs. The minimum capacity is 2.5 TB, which will cost around $200 a month if you select HDD disks in U.S. regions. If you opt for SSDs, prepare for a monthly bill of more than $750. Obviously, this cloud storage type is for enterprises that require large shared storage in the cloud.
      • High Scale: This tier offers from 60 TB to a stunning 320 TB of storage. With the minimum configuration of 60 TB, you get 90,000 read and 30,000 write IOPS, with 3,000 read and 660 write MB/s of throughput. This amazing performance comes with a very high price tag—over $18,000 in U.S. regions and more in the rest of the world. So, be careful when provisioning—accidentally misconfiguring your setup can really hit your cloud bill hard.
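
To get a feel for how these bills grow, you can back out an implied price per GB from the figures quoted above and scale it. This is only a back-of-the-envelope sketch: it assumes pricing is linear in capacity, that 1 TB equals 1,024 GB, and that the High Scale figure is a monthly price, none of which the article confirms. The quoted amounts are themselves approximate ("around", "more than", "over"), and the names `QUOTED`, `implied_rate_per_gb`, and `estimate_monthly` are hypothetical.

```python
GB_PER_TB = 1024  # assumption: binary terabytes

# (capacity in TB, approximate monthly USD) as quoted in this article for U.S. regions.
QUOTED = {
    "Basic HDD": (2.5, 200.0),
    "Basic SSD": (2.5, 750.0),
    "High Scale": (60.0, 18_000.0),
}


def implied_rate_per_gb(capacity_tb: float, monthly_usd: float) -> float:
    """Monthly USD per GB implied by a quoted capacity and price."""
    return monthly_usd / (capacity_tb * GB_PER_TB)


def estimate_monthly(tier: str, capacity_tb: float) -> float:
    """Scale the quoted price linearly to another capacity (assumes linear pricing)."""
    quoted_tb, quoted_usd = QUOTED[tier]
    return implied_rate_per_gb(quoted_tb, quoted_usd) * capacity_tb * GB_PER_TB


if __name__ == "__main__":
    for tier, (tb, usd) in QUOTED.items():
        print(f"{tier}: ~${implied_rate_per_gb(tb, usd):.3f} per GB per month")
    print(f"High Scale at 320 TB: ~${estimate_monthly('High Scale', 320):,.0f} per month")
```

Run against the quoted numbers, the Basic SSD and High Scale tiers imply a similar per-GB rate, so the High Scale price tag is driven mostly by its 60 TB minimum.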

Summary

Here is a table summarizing the services I covered in this article, based on their type.

| Storage type | AWS | Azure | GCP |
|---|---|---|---|
| Object storage | Amazon S3, Amazon S3 Glacier | Azure Blob Storage, Azure Archive Storage | Google Cloud Storage |
| Block storage | Amazon EBS | Azure Managed Disks, Azure Unmanaged Disks/page blobs | Google Persistent Disk |
| File storage | Amazon EFS, Amazon FSx | Azure Files | Google Filestore |

Table 1: Storage services comparison

One final note: In this blog series, I tried to summarize all of the main cloud storage services on AWS, Azure, and GCP. However, there are other services that integrate with each provider’s object, block, and file storage services, such as expansions for hybrid connectivity, options to perform backups of your on-premises data to the cloud, and even tools to complete a cold migration (where you get a storage device and ship it to your cloud provider to take care of the rest). 

While you might not pick your cloud based on the storage services offered, storage is an integral part of any cloud migration and shouldn’t be neglected. Before deciding on a specific provider, consider creating a proof-of-concept project and trying out as many services as you can. And be sure to track costs for each step of the project and perform a post-mortem analysis every month, based on your findings.

With thorough research and preparation, the sky’s the limit. In the future, you might even decide to use a multi-cloud solution or use services from different providers in parallel. 

Be sure to check out the next post in this series, Choosing the Right Cloud Provider—Databases.
