Cloud-native readiness in seconds, not years.
How to get to cloud-native readiness in two minutes flat!
I used to have a boss who often said…
“I want to have an idea in the morning and put it in production that same evening”.
It’s a simple idea. A simple Ideal. But somehow one that’s incredibly difficult to achieve, even with years of work by smart engineers.
I should know. I’ve worked with smart engineers for many years now. I used to work for large software companies in Chief Software Architect roles, designing cloud infrastructure that was supposed to facilitate this kind of same-day agility. I noticed that developers spent quite a bit of time learning Kubernetes, Istio, Prometheus, Grafana, Vault and AWS concepts, along with the dozens of other technologies modern cloud architecture requires. Keeping on top of all the latest techniques, tools, versions and best practices is hard but not impossible, and learning a new technology can be rewarding. But there was a big problem.
A Big Dam Problem
The problem was that no matter how many times I or the developers around me reinvented the Ops part of DevOps to get better availability, scalability, security, or lower latency, we always ended up having to turn around and do it again and never did get to the point where we could put an idea into production the same day. My old boss never did get what he wanted.
The ceremonial overhead of configuring logging, tracing, metrics, secrets management, load balancing, service discovery, scaling, TLS, mTLS, DNS, ingress controllers, network security and the like eats a very significant portion of every project I’ve ever seen.
Every single project, and yet we’re doing the same kinds of things each time. It’s as though we feel the need to re-wire the house every time we plug in the TV. The strange thing is that we do need to re-wire the house every time. There’s not a way to just plug your app in and go. Some days it feels like we also need to build a big dam to generate the electricity.
This is a big dam problem.
As a developer, your objective is to have reliable, always-available API endpoints that serve an elastic user count, no matter where users are. You want high availability and low latency with uncompromising security, observability and serviceability, so you can fix it if anything breaks. Your objective is a backend API that just works and never fails. Developers should be solving unique business problems in the most efficient, secure and reliable manner and providing value to the organization’s customers, not reinventing for the millionth time what every organization on the planet is reinventing: the Ops part of DevOps.
But for this to happen, the best practices in Ops would have to be codified into something that operates like a wall socket – giving you a platform for your app without having to learn fifty domains. An added benefit would be the ability to take a vacation without worrying about Kubernetes falling over, or the Ops wires getting crossed and causing a fire.
A Dan Clone?
I’ve spent a lot of time thinking about this problem. Ironically, though, a key to solving the problem came during a part of my career when I didn’t have to think about it quite as much.
I was working at SAP alongside this genius Ops guy named Dan Wilson. I knew Kubernetes, but Dan was like a Kubernetes magician. And his skills didn’t stop there. He knew Istio, Vault, Consul, Terraform and the rest of the Ops stack and he knew how to make them work together. It was a huge relief and it made me and the people around me far more productive. So much so that I started thinking to myself “the world needs Dan-Wilson-as-a-Service – DWaaS for short – so that developers can get back to doing what they do best”. Every dev team should have DWaaS, but of course most don’t.
At first, this was just a passing thought. I moved on from SAP to VMware to serve as Chief Software Architect for their Cloud Services Platform. But the idea stayed with me and kept growing until it was all I could think about. I decided to make a move.
Leaving Big Tech for a Big Idea
I quit my chief architect job at VMware in late 2019. I wanted to bring the equivalent of the “wall socket” to backend developers, so that they could focus on the Dev part and have the equivalent of 200 of the best Site Reliability Engineers in Big Tech companies (or maybe just one guy like Dan Wilson) behind them to handle the “Ops” part.
I was going to need help, and you can guess who I asked first.
But Dan had a big job and a promising career path at a big company. Any reasonable person would have listened to me politely while rubbing his golden handcuffs and then found an excuse to get off the phone.
Fortunately for me, Dan is not completely reasonable. I told him that I was building a company to bring his “special set of skills” to millions of backend developers who do not have time to spend a decade becoming Ops ninjas. I told him I was looking for a CTO. To my delight, Dan loved the mission and he said YES! I was in business.
I had other doors to knock on. I had been lucky enough to work with some truly incredible engineers at VMware, SAP and my old startup BiTKOO. I was doubly lucky to be able to persuade them to leave huge established companies and join Control Plane. I was triply lucky to be able to raise capital in short order from a group of fantastic investors.
But now came the hard part. We knew what we wanted, but we had no idea whether it was possible.
Dreaming the Build and Building the Dream
We flew the team from around the world to California and closed ourselves in an office with a whiteboard. My directive to the team was to not worry about the laws of physics (what is possible) but rather design what we, as backend engineers, want as our ultimate “microservices substrate”. We agreed to disregard for the moment whether something seemed doable or not.
After about three days, we were in science fiction territory. In our dream world, we envisioned AWS, GCP and Azure merging into some sort of virtual uber-cloud with developers mixing and matching any of their services without dealing with credentials. We imagined a single pane of glass to configure access, without the pain and complexity of IAM and the different types of IAM interfaces across different clouds. We could see a world in our minds in which workloads were portable and didn’t care where they were running (on premises, or on Azure, GCP and/or AWS) and in which we could extend a workload’s network to any private network anywhere on the planet (to a VPC, some data center, or even a developer’s laptop).
If you’re not a developer, the alternate reality I just described may not sound very exciting, but we had each spent the last several decades struggling in the real world with those same limitations. We each had memories of the long nights and weekends away from our families trying to make the Ops-related goo running beneath our microservices function more like a cloud to help us get off the ground and less like a swamp that we could never quite get unstuck from. Just allowing ourselves to dream of this better world was cathartic.
We came up with a list of requirements that any sane engineer would say are impossible, or at least impossible to meet in a couple of years. But my team is even less sane than they are reasonable. It wasn’t easy, and we had moments where we thought we should have had more respect for the laws of physics, but in the end we built our dream “microservices substrate”: a platform that combines the power of AWS, GCP and Azure, providing developers an “Uber Cloud”. We named the company Control Plane.
We’ve been onboarding customers since Q3 2021 and the feedback so far has been nothing short of miraculous. It turns out that many other developers have dreamed about the same things we did. Many other developers (and many of their bosses) have dreamed about having an idea in the morning and putting it in production that same evening.
Coding Past and Future
I always liked freedom (I guess everyone does!). I always disliked being confined. I’ve spoken to many software architects who felt they were forced to make a bet on a particular cloud, only to regret their original decision a year later because the “other” cloud had leapfrogged their original choice in one or several categories (graph DB, machine learning, etc.).
I wanted a platform that would free developers to focus on writing code in their language of choice (Python, Node.js, Ruby, C#, Java, PHP, yes, even COBOL) and have the platform take care of the Ops without sacrificing flexibility. It’d also be nice if it remained pleasant and easy to operate. I thought that if I could make the norm easy, then that might make what is hard possible.
If we look back at why we became software developers in the first place, and we’re honest with ourselves, we’ll realize that we’re lazy. We’d rather spend time automating a task than doing something repetitive. We like to work smart, not hard. We want to create something really new rather than constantly rehash something old.
For this to happen, for developers to create the future, some of the complexities of the past have to be covered over with a new layer of abstraction – have to be standardized and simplified. It has happened before, and it will happen again. I’m excited to report that it’s also happening now, and even more excited to be a part of it. I hope the work we’re doing now will enable developers years from now to close themselves in a room and dream bigger than we ever could.