Everything You Need to Know About Azure Stack (a First Course)

By Shiji Sujai, IOD Expert

My first tryst with cloud can be traced back to 2007, when the term public cloud was synonymous with AWS. At the time, our team had an AWS subscription we played around with. We were mostly fresh recruits out of college and had little idea that “the cloud” that piqued our curiosity would one day own more than half of the planet’s IT landscape.

Fast forward to today. I earn my living as a cloud consultant, but here's the twist: instead of AWS, I work on Microsoft Azure. How times have changed and how technology has transformed! Public cloud no longer == AWS. It is Microsoft Azure, Google Cloud, IBM, and many others.

Microsoft Azure was a late bloomer, but it did carve out a niche for itself with a plethora of offerings across the IaaS, PaaS, and SaaS landscape. However, what makes Azure stand out from the competition is its very vocally publicized hybrid cloud strategy. The latest tool in Microsoft's arsenal to reinforce that strategy is its private cloud offering, Azure Stack. The unique value proposition of Azure Stack is its close-knit integration with Azure public cloud in terms of management, security, operations, and identity. Microsoft markets Azure Stack as “an extension of Azure” because of this consistent experience.

Did Someone Say Private Cloud?

One might wonder why Microsoft is pitching a private cloud solution at this point when it’s neck and neck with AWS in the public cloud market. Why not focus more on the public cloud offering to win the race? In fact, this is a brilliant move to bring more customers to Microsoft Azure.

Some organizations still have reservations about adopting public cloud, mostly due to security or regulatory concerns. Hybrid cloud architecture offers a solution, where critical systems or data stay on-premise, but other applications are migrated to the cloud. Azure Stack brings the same code base as its public cloud to your data center, thereby catering to one or more of the following use cases:

Edge and disconnected deployments: In disconnected environments with limited or no connectivity to the internet, Azure Stack can be used to store and process data locally in a private cloud environment.

Meets regulatory and compliance standards: Organizations can deploy the same version of an application on-premise in Azure Stack, as well as in Azure public cloud, without any code changes. The platform and the technology used in the backend are the same. However, deploying in Azure Stack helps to meet the regulatory requirements associated with specific industry verticals.

Deploy on-premise applications using a cloud application model: Azure Stack enables consistent usage of development and DevOps toolsets for application modernization and for greenfield applications targeting the cloud. Applications deployed in Azure Stack can be scaled to the cloud whenever required, with minimal or no changes, using the same toolsets.

Demystifying the Integrated Systems Approach

It came as a surprise to many when Microsoft decided to go for an integrated systems delivery for Azure Stack instead of the more popular DIY private cloud approach, where you simply download the software and install it on hardware of your choice. Microsoft has announced partnerships with the following hardware vendors to deliver Azure Stack as an integrated system: Cisco, Dell EMC, HPE, Huawei, Lenovo, and Terra. There is also an SDK version of Azure Stack available for testing purposes.

However, for a fully supported production deployment, you need to purchase one of the integrated systems solutions. Though it might look counterproductive in terms of flexibility, the integrated systems approach was adopted by Microsoft for specific reasons:

  • Offers a turnkey solution including all hardware and software components designed to work from Day One.
  • All components are pre-validated and pre-integrated with the right configuration required for Azure Stack, which saves customers the time and money otherwise spent on iterating to find the right mix.
  • Simplified update and patching of the private cloud platform directly from the management interface without affecting any of the workloads.
  • Helps to maintain a minimum standard in the hardware being used, thereby meeting the expected performance benchmarks.
  • Not much time and effort required at the infra layer as everything is prescriptive and standardized.

Azure Stack Under Wraps

The basic building block of Azure Stack is a scale unit, which by default consists of four servers, two Top of Rack (TOR) switches, and one Baseboard Management Controller (BMC) switch. The BMC switch connects to each physical server’s baseboard management controller to provide DRAC-like capabilities. The system has redundancy built in at all layers. The servers are all part of a Windows Server failover cluster. They are redundantly connected to both TOR switches, as well as to the BMC switch. There are two connections from the BMC switch to the TOR switches. The TOR switches should be connected to the switches in the datacenter in a redundant manner. At general availability, an Azure Stack scale unit can have 4-12 nodes. Microsoft has announced that work is in progress to increase the number of nodes to 16 in the near future. It is important to note that all nodes in a scale unit must have identical CPU, memory, and storage configurations.
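
To make the sizing rules above concrete, here is a minimal Python sketch of a pre-order sanity check. The Node class and validate_scale_unit function are names I made up for illustration; they are not part of any Azure Stack tooling, and the only rules encoded are the 4-12 node range and the identical-hardware requirement described above.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical model for illustration; not an Azure Stack API.
@dataclass(frozen=True)
class Node:
    cpu_cores: int
    memory_gb: int
    storage_tb: float

MIN_NODES = 4   # lower bound at general availability, per the article
MAX_NODES = 12  # upper bound at general availability, per the article

def validate_scale_unit(nodes: List[Node]) -> List[str]:
    """Return a list of problems; an empty list means the layout looks acceptable."""
    problems = []
    if not MIN_NODES <= len(nodes) <= MAX_NODES:
        problems.append(f"expected {MIN_NODES}-{MAX_NODES} nodes, got {len(nodes)}")
    # Frozen dataclasses are hashable, so a set collapses identical configurations.
    if len(set(nodes)) > 1:
        problems.append("nodes must have identical CPU, memory, and storage")
    return problems
```

Calling validate_scale_unit with four identical nodes returns an empty list, while a three-node or mixed-hardware layout returns a description of what is wrong.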

Image courtesy: Microsoft

A separate server called the Hardware Lifecycle Host (HLH) is shipped along with the Azure Stack components. The HLH is used to manage and monitor the hardware components and is also used by OEM engineers for the initial deployment of Azure Stack.

Azure Stack uses a hyper-converged architecture and leverages the software-defined compute, storage, and networking components of Windows Server 2016 in the backend. It uses Hyper-V for compute, Windows Server 2016 SDN for networking, and Storage Spaces Direct for storage aggregation. The high-level logical architecture of Azure Stack is as follows:

Image courtesy: Microsoft
  • The Azure Resource Manager (ARM) is the same as the one used in Azure public cloud. It sits behind the various interfaces available to customers and administrators, such as the Azure portal, PowerShell, CLI, and SDK.
  • The ARM layer interacts with the underlying resource provider (RP) layer, which forwards requests from ARM to the targeted service controllers. The resource providers are the Fabric Resource Provider (FRP), Storage Resource Provider (SRP), Compute Resource Provider (CRP), Network Resource Provider (NRP), Hardware Resource Provider (HRP), and Update Resource Provider (URP).
  • The different controllers carry out the tasks related to the requests transferred to them from the RP layer. For example, when a request to create a VM is forwarded by the CRP, the compute controller interacts with the Hyper-V layer to create the VM.
  • Several infrastructure roles get deployed inside Hyper-V VMs during the initial setup of Azure Stack. These VMs carry out specific functions defined in the architecture. For example, the network controller VM is the SDN network controller that executes requests forwarded by the Network Resource Provider.
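
The request flow in the list above (ARM to resource provider to controller) can be sketched as a simple dispatch table. This is only a toy illustration of the layering, not how ARM is implemented; the function names, the request dictionary shape, and the returned strings are all my own assumptions.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for the real service controllers.
def compute_controller(request: dict) -> str:
    return f"Hyper-V: create VM '{request['name']}'"

def network_controller(request: dict) -> str:
    return f"SDN: create virtual network '{request['name']}'"

# Resource provider layer: maps a provider abbreviation to its controller.
RESOURCE_PROVIDERS: Dict[str, Callable[[dict], str]] = {
    "CRP": compute_controller,
    "NRP": network_controller,
}

def arm_dispatch(provider: str, request: dict) -> str:
    """Stand-in for ARM: forward a request to the resource provider that owns it."""
    try:
        controller = RESOURCE_PROVIDERS[provider]
    except KeyError:
        raise ValueError(f"no resource provider registered for {provider!r}") from None
    return controller(request)

print(arm_dispatch("CRP", {"name": "web01"}))  # prints: Hyper-V: create VM 'web01'
```

The point of the table is that the caller never talks to Hyper-V or the SDN layer directly, which mirrors why tenants and administrators only ever interact with ARM.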

From a customer perspective, administration of Azure Stack does not involve any direct interaction with the backend components. In fact, Microsoft advises against any action on these infrastructure roles unless supervised by Microsoft support personnel.

Azure Stack Management and Operations

Azure Stack provides a management portal for administrators and a user portal for users, backed by separate instances of Azure Resource Manager (ARM). The management boundaries are linked to the concepts of cloud, regions, and scale units.

A scale unit is always associated with a single location or region. It defines the unit of capacity of Azure Stack. There can be more than one scale unit in a region. Scale units have fault domain and update domain capabilities built in, the same as in Azure public cloud.

A region refers to one or more scale units in a given physical location. There should be high-bandwidth, low-latency Layer 3 connectivity between the scale units in a region.

 

A cloud is a single instance of Azure Resource Manager (ARM). A cloud can have one or more regions under the same ARM, multiple scale units in a region, and four or more nodes in a scale unit.

At general availability of Azure Stack, only a subset of this is supported: up to 12 nodes in a scale unit, one scale unit per region, and a single region in a cloud.
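
The cloud, region, and scale unit hierarchy can be pictured as nested data structures. The sketch below uses class and function names of my own choosing (nothing here is an Azure Stack API) and only encodes the general availability limits quoted above.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical classes mirroring the cloud > region > scale unit hierarchy.
@dataclass
class ScaleUnit:
    node_count: int

@dataclass
class Region:
    name: str
    scale_units: List[ScaleUnit] = field(default_factory=list)

@dataclass
class Cloud:  # one cloud corresponds to one ARM instance
    regions: List[Region] = field(default_factory=list)

def within_ga_limits(cloud: Cloud) -> bool:
    """Check the GA limits: 1 region, 1 scale unit per region, 4-12 nodes per unit."""
    if len(cloud.regions) != 1:
        return False
    for region in cloud.regions:
        if len(region.scale_units) != 1:
            return False
        if any(not 4 <= unit.node_count <= 12 for unit in region.scale_units):
            return False
    return True
```

A layout such as Cloud([Region("local", [ScaleUnit(4)])]) passes the check, while adding a second region or a second scale unit makes it fail. If the limits are relaxed in later releases, only the numbers in within_ga_limits would need to change.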

Services, Plans, Offers, and Subscriptions

The long-term roadmap of Azure Stack is to support all services that are available in the public cloud. Currently, a limited set of services is available, which includes virtual machines, virtual machine scale sets, Azure Storage (blobs, tables, queues), virtual networks, load balancers, VPN gateways, Azure Key Vault, Azure App Service, API apps, Azure Functions, Azure Service Fabric, and SQL Server.

  • An administrator can decide which services should be made available to a customer by grouping them into plans that can be offered to tenants. Each plan has quotas associated with it that cap the VM, CPU, and RAM usage per user subscription.
  • One or more plans can be grouped to create an offer that tenants can purchase.
  • A subscription comprises a tenant plus an offer. In other words, a subscription has a one-to-one mapping with an offer. Tenants can subscribe to more than one offer.
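
The relationship between plans, offers, and subscriptions in the list above can be sketched as a small data model. The class names, field names, and the subscribe helper are my own illustrative choices rather than anything exposed by Azure Stack, and the quota fields simply mirror the VM, CPU, and RAM limits mentioned above.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

# Hypothetical data model for illustration only.
@dataclass(frozen=True)
class Plan:
    name: str
    services: Tuple[str, ...]
    max_vms: int      # quota fields mirror the VM, CPU, and RAM limits
    max_cores: int
    max_ram_gb: int

@dataclass(frozen=True)
class Offer:
    name: str
    plans: Tuple[Plan, ...]  # one or more plans grouped together

@dataclass(frozen=True)
class Subscription:
    tenant: str
    offer: Offer  # exactly one offer per subscription

def subscribe(tenant: str, offers: List[Offer]) -> List[Subscription]:
    """A tenant may take several offers; each one yields its own subscription."""
    return [Subscription(tenant, offer) for offer in offers]

def available_services(subscription: Subscription) -> Set[str]:
    """Union of the services across every plan in the subscription's offer."""
    return {svc for plan in subscription.offer.plans for svc in plan.services}
```

Because each Subscription holds exactly one Offer, a tenant that wants two offers ends up with two subscriptions, which is the one-to-one mapping described in the last bullet.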

Identity Providers

Azure Stack can be configured to use either Azure AD or ADFS for authentication. Azure AD provides cloud-based identity-as-a-service and is recommended for hybrid cloud deployments. ADFS is preferred in disconnected deployment scenarios with limited or no internet connectivity. Azure Stack creates AD and ADFS VMs as part of the infrastructure role deployment. These can be integrated with on-premise AD by establishing an ADFS federation trust. The Azure AD Graph API components are also included in the ADFS VM, which helps to enable the RBAC feature in Azure Stack.

Security

Azure Stack uses a strict layering enforced by a sealed host with well-defined components and interactions.

  • Fine-grained access control to resources is enforced using RBAC.
  • Only whitelisted applications signed either by Microsoft or OEM can run on the Azure Stack infrastructure.
  • Only those infrastructure components that need to talk to each other are allowed to communicate.
  • Data at rest is encrypted across all components using BitLocker.  
  • All network traffic is encrypted using TLS 1.2.
  • Patch and update tool (PNU) should be used to update the core infrastructure. It is not possible for an administrator to upgrade the environment other than using the PNU tool.
  • Windows Server 2016 security features like Credential Guard, Device Guard, and Windows Defender are leveraged extensively to enforce security.

Azure Stack does not need any effort from the customer side to harden the core infrastructure, as best-in-class security is prebuilt into the system. However, do keep in mind that tenant workload security is still the customer’s responsibility. For example, if a VM is deployed in Azure Stack, it is up to the customer to do OS hardening and implement any other required security measures.

Backup and DR

Protecting your environment from unexpected failures or disasters is an important part of the Azure Stack operations strategy. An Infrastructure Backup Controller is created during Azure Stack deployment, which interacts with the Backup Resource Provider to back up Azure Stack internals. The backup includes the identity configuration, root certificates used by the internal components, ARM configuration data related to users, subscriptions, and plans, and tenant information. If the Azure Stack infrastructure goes down and is restored from backup, users will still have their subscriptions, plans, quotas, and the metadata of their deployed services. However, it should be noted that the Storage RP does not back up any deployed services like VMs and apps. It is up to the user to implement a service-level backup and DR strategy.

Purchase and Licensing

The hardware purchase and hardware support agreement for an Azure Stack integrated system are executed directly with the OEM vendor. Azure services and support are purchased from Microsoft. Customers can use either the pay-as-you-go model or a capacity model for purchasing Azure services for Azure Stack. Pay-as-you-go uses the same principle as the public cloud, i.e. consumption-based pricing. The capacity model is more suitable in a disconnected scenario. It licenses all physical cores in the deployment under a fixed-fee annual subscription. A customer can opt for either an IaaS package covering base VM images and storage, or an App Service package that includes App Service along with the services in the IaaS package.

To Stack or Not to Stack?

Microsoft previously dabbled in private cloud with Azure Pack, but then shifted its efforts to Azure Stack, as it supports the new ARM model. The focus on Azure Stack is steady, with more and more hardware partners associating with it and new features being worked on.

Since major rivals such as AWS and Google are focusing on a public cloud strategy, Microsoft has the floor for hybrid cloud opportunities. Azure Stack truly brings a slice of Azure public cloud to your data center.

If you haven’t had a taste of the Stack yet, I suggest you start by testing the waters with the Azure Stack SDK version before going full swing with an integrated system from one of the OEM vendors.

Happy ‘Stack’ing!!
