Allow me to introduce a good friend, Peter Sankauskas, whom I met through the AWS cloud community. Our level of cloud experience evolved with the expansion of the cloud, Amazon in particular, and we have both become prominent members of the AWS community. Having learned more about open source and development, I’d like to focus on the idea of using Netflix open source tools to generate great value for cloud users.
Peter is the Founder and CEO of CloudNative, a company known for its continuous deployment tools. He started CloudNative after consulting at Answers for AWS and building the same system multiple times. Peter has been using AWS for over 7 years across many companies, and has designed, implemented, and managed systems that remain performant, highly available, and scalable during exponential growth.
He led the engineering team at Motally to become one of the finalists in the AWS Start-Up Challenge in 2009. He wrote the EC2 inventory plugin for Ansible, giving Ansible its first taste of AWS. In 2013, Peter won a NetflixOSS Cloud Prize for Best Usability Enhancement for his work on making the NetflixOSS Stack easier to get up and running.
So, without further ado, let’s see what Peter has to say…
Ofir Nachmani: So Peter, what triggered your interest in the AWS cloud?
Peter Sankauskas: I started using AWS cloud at around the end of 2008. Back then, in order to get hardware, you needed to talk to a vendor, fill out some forms, and figure out what you needed in advance, and it would take a couple of months to materialize. When EC2 came out, I would make one API call, and then five or ten minutes later, I had the hardware I needed ready to go. It was clear to me that this was the future.
In 2009, I was Director of Engineering at Motally, a company that did analytics for mobile websites and mobile applications, and we relied heavily on AWS. We entered Motally into the AWS Start-Up Challenge and were one of the six finalists. That was my first introduction to the AWS community and my first realization of how involved AWS is in it.
ON: I also started in 2008, trying to turn traditional enterprise solutions into SaaS solutions. However, taking gigantic MSSQL servers and complex enterprise apps to Amazon back then was a huge headache.
ON: Let’s fast forward 4 or 5 years to 2013. I think at this point you were receiving an award from Amazon for something that you built, based on Netflix’s open source solution, am I right?
PS: Yes, that’s right. Netflix held a competition to promote their open source software.
I had been using Netflix tools for a while and had firsthand experience of just how difficult they were to set up in your own Amazon account. As part of the contest, I made it easier to use some of the tools: Asgard, Eureka, Aminator, the Simian Army (Chaos Monkey), and more.
I began making Ansible playbooks, CloudFormation templates, and AMIs to make it easy to test out their tools. This turned out to be a good entry into the contest, and at AWS re:Invent in 2013, I won the Best Usability Enhancement prize from Netflix. It was a fantastic honor and great publicity for my consulting company, Answers for AWS. On top of that, it was simply a lot of fun to do.
ON: How did Answers for AWS come about?
PS: I left the previous start-up and started the consulting business because at the time, if there was a glitch with AWS, half the internet would go offline. I’d been using AWS for long enough to know that there were plenty of techniques that existed to keep glitches from becoming as catastrophic as they were. That’s when I started writing blog posts, recording screencasts and contributing more to open source tools.
ON: Can you give me an example of the work you did? This was 2013, still, am I right?
PS: Yes, I started consulting in 2013. What really helped me learn about Ruby on Rails was Ryan Bates’ RailsCasts. He had recorded around 400 videos on how to do different things in Ruby on Rails. Nothing like that existed for AWS, so I started recording videos, and by the time I got up to the fifth video, I was already too busy with consultancy work to continue.
I had visions of recording many videos, and I’ve still got a long list of topics. Most of them are still relevant today, even a year and a half, two years later. The consultancy work just took over from that point.
ON: What type of consultancy did you focus on? Do you have a good story from back then?
PS: I did two different types of consultancy. One was more short-term, where I consulted for clients that had already been using AWS for a while, but had questions about where they could improve. We would then go through their cloud infrastructure and architecture, identify any single points of failure and security concerns, then look at what they were spending and find ways to reduce costs. In the end, I’d give them a list of recommendations.
The other type of consultancy was for companies that had just come out of one of the accelerator programs, built a great system, but didn’t really know how to take it to the next level. I would help them understand how to create a production environment. “I’ve heard about this thing called VPC, but I don’t know how it works, will you come and help us?” some would ask. For those customers it would be a much longer term engagement, where I would go in, take a look at the various components of their environment, help with architecture decisions, create and automate the process to get everything up and running. All clients wanted a production environment that was highly available and scalable.
That’s what led me to create CloudNative. All of the clients I was helping were building very different things, from baby monitors to boat rentals to restaurant delivery services. They were all very different products, but the foundation that I built for each one was the same. As a consultant, how do you scale that? Do you hire more people and build a bigger consultancy team? Well, still, that only scales as much as people do. Could I not take that same knowledge and put it into software that does scale? That’s when I started working on the Bakery and Delta.
ON: So what are the principles/methods that allow you to build a generic product? I assume that Netflix leads the way here…
PS: This is where I get into best practices. A lot of the time, you look at what more established companies have been doing for a while. Netflix is the most vocal and open example here. They have been building and configuring AMIs that have everything ready to go. This is how you launch EC2 instances that are instantly ready to receive traffic. Then they used a tool called Asgard to deploy their stacks. They have one stack of hardware running the existing version and another stack running the new version. Then the load balancer switches traffic from the old to the new. This is called blue/green deployment.
That process and those two components are exactly the same no matter what your company does, whether it’s a web service or a background service. That’s one of the best practices that exists in terms of managing virtual machines. It’s very repeatable with very little deployment risk. That’s how CloudNative started. It was a case of, “Let’s take this and make it really easy for another company to get up and start using these tools within minutes.”
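As a rough illustration of the blue/green switch Peter describes, here is a minimal sketch in plain Python. It does not use Asgard or any AWS API: the `Stack` and `LoadBalancer` classes are hypothetical stand-ins, and the image IDs are placeholders. The point is only that the old stack keeps running, so a rollback is just another switch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Stack:
    """Hypothetical group of instances launched from one pre-baked image."""
    image_id: str
    healthy: bool = True


@dataclass
class LoadBalancer:
    """Hypothetical load balancer that sends all traffic to one stack."""
    live: Stack
    previous: Optional[Stack] = None

    def switch_to(self, new_stack: Stack) -> None:
        # Refuse the cut-over if the new stack is not healthy.
        if not new_stack.healthy:
            raise RuntimeError("new stack is unhealthy; traffic stays on the old stack")
        self.previous, self.live = self.live, new_stack

    def roll_back(self) -> None:
        # The old stack was never modified, so rolling back is another switch.
        if self.previous is None:
            raise RuntimeError("no previous stack to roll back to")
        self.live, self.previous = self.previous, self.live


blue = Stack(image_id="ami-old-placeholder")   # existing version
green = Stack(image_id="ami-new-placeholder")  # new version, baked as a new image
lb = LoadBalancer(live=blue)
lb.switch_to(green)  # traffic now goes to the new version
lb.roll_back()       # something went wrong: traffic returns to the old version
```

In a real setup, the health check and the traffic switch would be done by the deployment tool and the load balancer themselves; this sketch only models the order of operations.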
Going further, take a look at what the others are doing – Airbnb, Pinterest, and Hailo. They all started with a monolithic application, and then split that off following a microservices architecture. This requires some common tooling, because you are deploying not just one service, but many. It is advisable to have something that is repeatable with some shared knowledge between the various components.
ON: Very interesting. Before wrapping up, what are the top 3 open source tools that AWS cloud users need to get their hands on?
PS: For open source tools, I’d say something like Packer, to build AMIs, which is a tool from HashiCorp. You can use it to build an image of your software. It supports AMIs for Amazon, images for Google Cloud, and Docker containers. Actually, under the covers, CloudNative’s Bakery is using a slightly modified version of Packer.
It’s a great way to get to the point of having immutable infrastructure. When a machine comes online, all it has to do is boot and it’s ready for action. When you need to change something, you don’t need to modify a live machine. Simply create a new image containing the new code, launch it, and test it out on a new machine.
For the blue/green deployment where you can quickly roll back to a previous version if something goes wrong, in the open source world, there’s only one package that really helps, and that’s Asgard from Netflix. Asgard will let you create and define an application, which then allows you to have multiple versions of that application over time. Asgard has a lot of other useful tools, too. If you think back to when the Asgard project was started, about five years ago, Amazon was very different. The deployment model that is built into Asgard is fantastic, but the other parts aren’t quite as necessary anymore, given how much Amazon has changed since then.
Within CloudNative, we have a tool called Delta that uses the same deployment model that you get with Asgard, but has been updated to match the current AWS feature set. It has the security you would expect from a SaaS and integrates with the tool that you use to build AMIs, the Bakery.
The other open source tool that I really like is Troposphere, a Python library that you can use to write CloudFormation templates. If combined with boto, you can look up various IDs and fields that you need in your AWS account, then plug them into the Troposphere CloudFormation template. You can then use boto again to launch the CloudFormation template. That way, you have an infrastructure that is very, very repeatable with a single script.
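To give a feel for the "look up IDs, then plug them into a template" pattern, here is a minimal sketch that uses only the Python standard library. It deliberately does not call Troposphere or boto: `lookup_ids` is a hypothetical stub standing in for the account lookups, its return values are placeholders, and the template is built as a plain dictionary rather than with Troposphere's classes. The resource name `WebServer` and the default instance type are also assumptions made for illustration.

```python
import json


def lookup_ids():
    """Hypothetical stub for the account lookups (values are placeholders)."""
    return {"image_id": "ami-EXAMPLE", "subnet_id": "subnet-EXAMPLE"}


def build_template(ids, instance_type="t2.micro"):
    """Return a CloudFormation template, as a dict, for a single EC2 instance."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "WebServer": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": ids["image_id"],
                    "InstanceType": instance_type,
                    "SubnetId": ids["subnet_id"],
                },
            }
        },
    }


# Serialize the template to the JSON text that CloudFormation accepts.
template_body = json.dumps(build_template(lookup_ids()), indent=2, sort_keys=True)
print(template_body)
```

In the workflow Peter describes, the stub would be replaced by real lookups and the resulting template body would be handed to CloudFormation to create the stack, so the whole environment can be rebuilt from one script.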
ON: To wrap up, what is the latest, most interesting change in the Amazon cloud in your opinion, and what is one thing that Amazon needs to improve?
PS: Lambda. It’s all about speed. In the past, it’s taken five to ten minutes to get infrastructure up to run an EC2 instance. Then Docker came along and reduced that to a few seconds. With Lambda, you get charged in 100-millisecond increments for the time your function is running. The speed at which you can perform actions and get feedback is getting faster and faster. Lambda is a reflection of this.
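The 100-millisecond billing granularity mentioned above can be sketched as simple arithmetic. This assumes run time is rounded up to the next 100 ms increment, which is our reading of the billing model at the time of the interview; the function name `billed_ms` is hypothetical and no actual prices are used.

```python
def billed_ms(duration_ms, increment_ms=100):
    """Round a run time in milliseconds up to the next billing increment (assumed 100 ms)."""
    # Ceiling division using integers only, to avoid floating-point surprises.
    increments = -(-duration_ms // increment_ms)
    return increments * increment_ms


print(billed_ms(230))  # a 230 ms run is billed as 300 ms under this assumption
print(billed_ms(100))  # an exact multiple is billed as-is: 100 ms
```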
The great thing about Amazon Web Services is that there are so many options. A great example is the EC2 Container Service. It’s a great start, but it doesn’t integrate with as many other services as we would like. No code deployment, no monitoring or logging, and no auto scaling. There is plenty of room for improvement. All of those things need to be factored into the equation. It’s a preview service, so it will get better over time, and I have no doubt that it will support all of the things I mentioned at some point. In order to improve, new features and services need to incorporate the other parts of the ecosystem.
You can find Peter at @pas256 on Twitter, pas256 on GitHub, and occasionally on his own blog.