10 Things I Learned Shipping an Ancient Data Center to AWS (Part 2)

By John Fahl, IOD Expert
Moving a data center is hard. It takes a ton of work to move years of cruft, drift, tech debt, and forgotten relics to a completely different place.
Last month, I shared with you the first five lessons I learned from shipping an ancient data center. This month, I’m going to share with you five more, with the focus on knowing why you should or shouldn’t migrate and the impacts of doing so.

6. On-Prem != Cloud, So Don’t Design it Like it Does

When you start constructing the designs of your new networks, subnets, and permissions (hopefully using IaC), take note that your cloud network environment doesn’t follow the same rules as an on-premises data center. I saw one example of an unnecessarily complicated firewall sandwich for external zones. Another design had three DMZ layers and nearly a dozen other segmentation zones, split at the subnet boundary for each Availability Zone (as if someone were designing VLANs). Don’t do this.
When designing your new cloud environment, start simple, but segment at the largest level, then go smaller. Concepts like VPC Peering make it easy to have multiple VPCs in your new environment. You can route only what you need between VPCs, then start breaking up subnets inside each VPC into easy-to-use segments. If you don’t need to strictly control one-way communication (think DMZ), then NACLs may not be needed, so very large subnets may be just fine. Controlling access to ports and protocols through security groups works perfectly fine and allows for simpler nesting.
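As a rough illustration of segmenting at the largest level first, here is a minimal Python sketch that uses the standard-library ipaddress module to carve a VPC range into one large subnet per Availability Zone. The 10.20.0.0/16 range, the AZ names, and the /18 subnet size are assumptions picked for the example, not values from any real design.

```python
import ipaddress

def plan_subnets(vpc_cidr, az_names, new_prefix):
    """Assign one large, equal-sized subnet per Availability Zone."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # subnets() yields equal-sized blocks in address order.
    blocks = vpc.subnets(new_prefix=new_prefix)
    plan = dict(zip(az_names, blocks))
    if len(plan) < len(az_names):
        raise ValueError("VPC range is too small for the requested subnets")
    return plan

# Hypothetical VPC range and AZ names, for illustration only.
plan = plan_subnets("10.20.0.0/16", ["us-east-1a", "us-east-1b", "us-east-1c"], 18)
for az, block in plan.items():
    print(az, block, block.num_addresses)
```

Starting with a few big blocks like this leaves unallocated space in the VPC range, so smaller segments can still be added later if a real need shows up.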
Do yourself a favor: if you’re migrating workloads to AWS, put them in a separate VPC from the cloud-native workloads. Since the migrated workloads won’t likely be running in auto scaling groups and using IAM profiles to call resources, why bother coupling the legacy workloads with the cloud-native services?

7. Lift and Shift: Moving Pets to New Pens

All major cloud vendors will gladly take your legacy dinosaurs. Why wouldn’t they? Those servers will likely be up 24/7, which means you’ll need to purchase Reserved Instances for 1-3 years. If you don’t, you’ll be paying 30-70% more at the on-demand price. Using most tools, lifting and shifting ecosystems is difficult. You lose a lot of comfort you’ve experienced on-premises, such as clustered storage and daily backups, just to move the workload to a data center your admins don’t fully control. Your admins will also lose the comfort of recovering servers through hypervisor console access or crash carting.
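To put the 30-70% on-demand premium in perspective, here is a back-of-the-envelope Python sketch. The $0.10 hourly reserved rate and the fleet of 100 servers are made-up placeholders, not real AWS prices; only the 30% and 70% premiums come from the text above.

```python
HOURS_PER_YEAR = 24 * 365  # always-on workloads run every hour of the year

def annual_cost(hourly_rate, servers):
    """Yearly cost of running `servers` instances around the clock."""
    return hourly_rate * HOURS_PER_YEAR * servers

def on_demand_penalty(reserved_hourly_rate, servers, premium):
    """Extra yearly spend when on-demand costs `premium` (0.30 = 30%) more than reserved."""
    return annual_cost(reserved_hourly_rate, servers) * premium

# Hypothetical fleet: 100 migrated servers at an assumed $0.10/hour reserved rate.
reserved_total = annual_cost(0.10, 100)
low = on_demand_penalty(0.10, 100, 0.30)   # 30% premium
high = on_demand_penalty(0.10, 100, 0.70)  # 70% premium
print(f"Reserved: ${reserved_total:,.0f}/year")
print(f"Extra on-demand spend: ${low:,.0f} to ${high:,.0f}/year")
```

Plug in your own rates and server counts; the point is that the premium scales with every always-on server you lift and shift.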
Moving pets to the cloud is not modernization. It’s feigning innovation. If you’re telling yourself you’ll move the servers and then automate them on the next project, that is unlikely to happen. In fact, I’ve never seen it done after the fact.
After migrating servers to the cloud, every workload I moved was online all day, every day:

  • None of them were stateless.
  • None of them could auto scale.
  • All of them needed to be backed up.
  • Most of them lost storage redundancy and high availability.

Many ecosystems:

  • Abandoned a clustering model (like Oracle RAC),
  • Left a thin provisioned virtual machine and went to a thick provisioned instance,
  • Required work on the other end of migration (very few machines were smooth), and
  • Had to be moved by ecosystem, sometimes with other ecosystems.

Make sure you want this migration more than you want the benefits of your data center. It is expensive to migrate and you will lose comfort for almost no cloud-native benefits.

8. If You Haven’t Shipped Before, Get Help Now

Most companies don’t have the budget to fully train their personnel or the luxury of spending months properly planning and executing their migrations. Even if your company paid for full training and planned for a year, if you have no one on the migration team with experience, you will get it wrong, guaranteed.
Whether you’re building a hybrid cloud or migrating some or all workloads, get help from professionals who know how to build for cloud. They can analyze your workloads and provide recommendations to save you from making expensive or catastrophic mistakes.
Furthermore, when you get advice from individuals who have done this, do not ignore it. You would be amazed how often companies pay expensive consultants for help moving to the cloud and then ignore the consultant’s recommendations. One company I worked with got their migration to cloud so wrong, it took them $150M to learn their mistake. Guess what? In the end, the help “fired” the company that hired them, after two years of trying to help them move to the cloud the right way. Six months later, the company scrapped the whole cloud move. True story (and a very juicy one). Don’t be that company.

9. Examine Your Moving Ecosystems

What am I calling an “ecosystem?” An ecosystem is a full system and the ancillary systems that interact with it. Need a brief example?

Customer Portal (Top to bottom)

  • F5 VIP
  • 3 web servers running IIS
  • 10 app servers
  • 3 different file servers behind a DFS Root
    • Replicates to a second data center using AD (DFS-R)
  • SQL database cluster
    • Talks to a data warehouse
    • Customer Shopping App uses a LinkedDB (for a legacy reason)
    • Replicates to a second data center
  • Notes
    • Every hour the application is offline costs $50K under the SLA
    • SQL cluster is on bare metal
      • 50GB change rate daily
    • DFS root file servers house 50 million files, over two million added/updated daily

This sample ecosystem isn’t actually such a challenging one to move. I’ve moved far, far worse. Considering this sample ecosystem, we now need to look under the hood to determine data points to help us make a good move decision:

  • What can be moved “as-is?” If servers can be replicated with a tool, or migrated as OVAs, great. But some things will have to be updated, such as drivers, IP addresses, DNS settings, firewalls, any nasty manual pointers.
  • What must change? Will that F5 now be an ALB? Will the web/app servers work in an ASG? Will the SQL cluster need to be changed from a traditional cluster to “Always On?” Will firewall rules need to be opened to come back over the WAN (or Direct Connect) to the data warehouse? Also, don’t forget to check those scheduled tasks. Those are often filled with static pointers that would need updating.
  • What must be synced (just as important, what can be left behind)? If you’re not using a vendor appliance to sync data, and resort to Robocopy or Active Directory, how long does it take to ship a full copy? How long does it take to ship deltas when the change ceases (in other words, when you stop replication to make the final sync and cutover)? Is your data change rate overrunning your ability to sync it? I’ve run into that before. When it happens, it’s time to ship data with an appliance or use some Snowball magic.
  • In what order must the move happen? This is typically bottom up: data, files, services, app, web, endpoint. The order is important to understand so the cutover outage is as short as possible.
  • Are any applications connecting to this ecosystem that are sensitive to new latency? In other words, if my business is in Kentucky and I’m moving to AWS in Virginia, will 40ms+ cause an issue until I can move the other applications?
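The sync questions in the list above mostly come down to arithmetic, so here is a minimal Python sketch for estimating them. The 50 GB daily change rate comes from the sample ecosystem; the 2,000 GB full copy, the 100 Mbps link, and the 70% link efficiency are assumptions you should replace with your own measurements.

```python
def transfer_hours(data_gb, link_mbps, efficiency=0.7):
    """Hours needed to ship data_gb over a link, at an assumed usable fraction."""
    usable_mbps = link_mbps * efficiency
    megabits = data_gb * 8 * 1000  # decimal GB to megabits
    return megabits / usable_mbps / 3600

def sync_plan(full_gb, daily_change_gb, link_mbps, efficiency=0.7):
    """Estimate the full-copy time and whether daily deltas can keep up."""
    daily_delta_hours = transfer_hours(daily_change_gb, link_mbps, efficiency)
    return {
        "full_copy_hours": transfer_hours(full_gb, link_mbps, efficiency),
        "daily_delta_hours": daily_delta_hours,
        # If one day of change takes 24 hours or more to ship,
        # replication never catches up with the source.
        "keeps_up": daily_delta_hours < 24,
    }

# Hypothetical sizes and link speed; only the 50 GB change rate is from the example.
estimate = sync_plan(full_gb=2000, daily_change_gb=50, link_mbps=100)
slow_link = sync_plan(full_gb=2000, daily_change_gb=50, link_mbps=2)
print(estimate)
print(slow_link)
```

If keeps_up comes back False, or the full copy takes longer than you can tolerate, that is the signal to look at an appliance or Snowball rather than syncing over the wire.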

Keep in mind, this is mostly lifting and shifting an ecosystem. An ecosystem that can be deployed to AWS is, of course, much easier. Not many ancient data centers have CI/CD pipelines to redeploy 10-year-old applications.

10. Invest In DevOps

There is no perfect time to make the paradigm shift, but there are certainly bad times to do it (like during a big lift and shift project). If you find yourself migrating a data center that lacks pipelines and automation, build the framework for these tools in your cloud directly or use hosted solutions. Designate a team to manage getting these tools off the ground, then have them train your developers and administrators on how to use them. This is a painful evolution, so give your personnel time to adapt. Some will refuse to abandon the old ways. That’s okay. They can manage the crap you migrated.
You’ve moved your old stuff to the cloud. You have new tools and a new cloud with a green field ready for planting. Time to build all new products or replace migrated workloads with cloud-native applications the right way. You’ve crossed the bridge, congrats.

Summary

My hope is that at least someone looking to make the leap can avoid a few mistakes with this advice. This work is not for the faint of heart and certainly can make or break careers. Make sure you train, plan, get help, and know what you are committing to. The last thing you want is a snarky engineer telling you, “You’re doing it wrong.”