The Unfulfilled Potential of Serverless

Corey Quinn, Cloud Economist (and perpetual thorn in AWS's side), recently published a post titled The Unfulfilled Promise of Serverless. Twitter reacted as we would expect, with plenty of folks feeling vindicated, others professing their staunch disagreement, and perhaps even a few now questioning their life (and technology) choices. My take is that he's not wrong, but he's also not entirely right.

Let me start by saying I'm a big fan of Corey's (seriously, this video about AppConfig might be the funniest thing he's ever created). However, much of Corey's brand is predicated on creating controversy with his unique style of snark and clever shitposting (the Twitterverse wants what the Twitterverse wants). So while his hot take on serverless might resonate with many of us, it's important to understand both its motivation and its missing context.

In order for a promise to be "unfulfilled," we must understand both what was promised and when delivery was promised. While this might sound like a semantic argument, there is a crucial piece of context: we've barely scratched the surface of what's possible with serverless technology. I agree with Corey that Werner's "just write business logic" claim conveniently ignored the mountain of configuration that came with it, but it also put a stake in the ground and set the vision. To the best of my knowledge, no one ever gave a timeline to achieve everything that was promised, but what I've seen over the last seven years are massive investments by public cloud providers, established players, and startups laying the groundwork that brings us closer to that vision. No, we're not there yet, but that just means we have more work to do.

Below I've tried to add some context to the main points Corey made in his post. As I said, many of his points resonate with me, but hopefully this brings some comfort to those of you who were disheartened by his message. For those of you who have yet to "buy in" to serverless, I simply ask that you keep an open mind. The vision for serverless and its current reality are very different, and as with most things, we're likely all trying to get to the same place.

Serverless, but not potable

Yes, potable, you read that right. The (lack of) portability with serverless is less about lock-in, and more about the lack of consensus around implementation standards. Sure, if you're AWS, keeping customers in your ecosystem by providing native services with first-class integrations (even if they're as terrible as Cognito) makes it harder for them to look elsewhere and even harder for them to migrate away later. But if I take my cynical-colored glasses off for a moment, there's also a greater than zero percent chance that some of the amazing engineers and product managers that work behind the curtain at AWS are actually trying to "innovate."

For me, there is a major disconnect between AWS and most of the other major cloud providers when it comes to their approach to serverless. AWS took the same security and isolation properties of hardware virtualization technology and built them into Firecracker. Why? So that they could optimize the execution of microVMs by moving them as close to the metal as possible. What did just about everyone else do? They built their serverless compute solutions in containers running on top of Kubernetes, just about as far away from the metal as you can get.

Kelsey Hightower recently tweeted that "Serverless based on containers is the future." Really? I guess that's true if we're prepared to give up on the optimizations already possible by letting the platform be the orchestrator rather than requiring 30 layers of abstraction just to execute a few lines of code. Even if we go down the container route, the "portability" problem is only partially solved. As Forrest Brazeal has pointed out, choosing "cloud-agnostic" technology often means going with the "lowest common denominator." In other words, regardless of where you "port" your serverless compute, your (wise) reliance on CSP-specific services will make migrating to other providers the bigger hurdle. This is true for all of "cloud," by the way, and isn't just a serverless phenomenon.

So back to my original point. I see "lock-in" as merely an excuse for people's unwillingness to buy into the uncertainty. Sure, there is no competitor to AWS Step Functions, but that's because this service is ahead of its time. State machines have been a thing since the dawn of computing; the gap exists because other cloud providers have a short-sighted vision for serverless. Customers that are willing to go "all in" on AWS's vision might lose portability, but what they gain in capabilities (at least in my opinion) far outweighs any potential drawbacks. It's not about portability; it's the fact that too many people want to reach for the familiar and are unwilling to drink the Kool-Aid.

The imperceptible value fallacy

On a recent episode of his podcast, Corey so eloquently stated that "serverless sucks," to which Ant Stanley, co-founder of A Cloud Guru/ServerlessConf/ServerlessDays and arguably one of the earliest of early adopters of serverless, agreed. It's not the serverless technology that sucks, but rather the complexity we've created around it. As Ant explains, "I think folks have focused far too much on the technical aspects of serverless, and what is serverless and not serverless, or how you deploy something, or how you monitor something, observability, instead of going back to basics and first principles of what is this thing? Why should you do it? How do you do it? And how do we make that easy? There's no real focus on user experience and adoption for inexperienced folks."

Corey's experience with serverless isn't an outlier. AWS has come a long way with CloudWatch and its related tools, but the fact that an entirely new industry was required to help you understand and observe your serverless applications (Lumigo, Epsagon, IOpipe, Thundra, Dashbird, etc.) should have been the canary in the coal mine for widespread adoption. Most of these tools have either been acquired or have pivoted to focus on more than just serverless because, as Corey said, they faced the unfortunate reality that a pure serverless market just wasn't valuable enough.

Serverless adoption (including the use of "serverless services" that Corey casually dismisses) continues to grow year over year. But even though more than 50% of the "thousands of companies in Datadog's customer base" are using some form of FaaS, that figure is based only on whether "they ran at least five distinct functions in a given month." While we don't have the underlying data, this suggests that FaaS makes up just a portion of their cloud workloads. So maybe we don't have full buy-in at this point, but plenty of cloud companies are embracing the serverless paradigm and extracting some value from it.

However, there is a major problem with partial buy-in: TCO. The total cost of ownership (TCO) of serverless (or at least the promise of it) is significantly lower than setting up and maintaining traditional compute (yes, even if the compute charges are higher). But the incremental cost of adding workloads to an existing Kubernetes cluster or cloud environment supported by an experienced Ops team is minimal. For a startup (or even a small team within a larger org), going "serverless first" should be a no-brainer. Unfortunately, the value of those customers probably won't move the needle, hence the fact that it "doesn't feel like it's solving an expensive problem."
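To make the fixed-versus-incremental argument concrete, here is a minimal sketch. Every figure in it is a made-up placeholder, not data from Corey's post, Datadog, or any provider, and the function names are hypothetical. It only illustrates the shape of the comparison: a team that already pays for a cluster and an Ops team sees just the marginal cost of one more workload, while a team starting from nothing has to carry the fixed cost as well.

```python
def cost_on_existing_platform(marginal_compute: float) -> float:
    """Cluster and Ops team are already paid for, so only marginal compute counts."""
    return marginal_compute


def cost_from_scratch(fixed_platform: float, marginal_compute: float) -> float:
    """A new team must also pay to build and run the platform itself."""
    return fixed_platform + marginal_compute


def cost_serverless(compute: float) -> float:
    """No platform to run; the compute itself may be priced higher."""
    return compute


# Hypothetical placeholder figures (arbitrary units per month), for illustration only.
FIXED_PLATFORM = 10_000.0   # assumed cluster upkeep plus Ops time
MARGINAL_COMPUTE = 200.0    # assumed cost of one more workload on that cluster
SERVERLESS_COMPUTE = 600.0  # assumed higher compute bill for the same workload

established = cost_on_existing_platform(MARGINAL_COMPUTE)
startup_diy = cost_from_scratch(FIXED_PLATFORM, MARGINAL_COMPUTE)
serverless = cost_serverless(SERVERLESS_COMPUTE)

# Under these assumed inputs, serverless beats building from scratch,
# but it does not beat the established team's marginal cost.
print("serverless cheaper than DIY startup:", serverless < startup_diy)
print("existing platform cheaper than serverless:", established < serverless)
```

Change the placeholders and the conclusion can flip, which is really the point: the answer depends on whether the fixed cost is already sunk.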

Who isn't going to own the upgrades?

I had Ant Stanley on the Serverless Chats podcast not long after his interview with Corey, and when I asked about the complexity issue, he also pointed out that the number of choices with serverless creates more cognitive load on developers when making decisions.

"I would love to see a simplification of serverless going, 'Hey, if you want a relational database, this is it. If you want a NoSQL database system, this is it. If you want to run synchronous workloads, this is where you do it. If you want asynchronous workloads, this is where you do it, and give one option. And here's your development workflow, and one set of tooling to make that work,' but I think we're way off that."

While Ant's vision of serverless simplicity would certainly make things easier, others have posited ideas that get us much closer to the ultimate serverless vision. Shawn @SWYX Wang recently articulated his vision of The Self Provisioning Runtime. He says, "If the Platonic ideal of Developer Experience is a world where you 'Just Write Business Logic', the logical endgame is a language+infrastructure combination that figures out everything else."

At the time Shawn published his article in late August, he was unaware of Serverless Cloud, the project that my team has been working on at Serverless, Inc. for nearly a year. The goal was to achieve almost exactly what Shawn was describing: a platform that would interpret your code and deploy the necessary infrastructure to best support it. It's what we call Infrastructure from Code at Serverless Cloud. Others are doing similar work, including Dark Lang, Lambdragon, and the projects mentioned in Shawn's post.
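For readers who want a feel for the idea, here is a toy sketch of what "interpreting your code" could look like. To be clear, this is not the Serverless Cloud API or how any of the projects above actually work; the App class, its methods, and the resource names are all hypothetical. It assumes a tiny framework that records what the application code uses, and a function that derives the list of resources to provision from those records.

```python
from typing import Callable, Dict, List


class App:
    """Hypothetical toy framework: it records what the application code uses."""

    def __init__(self) -> None:
        self.routes: Dict[str, Callable] = {}
        self.schedules: Dict[str, Callable] = {}
        self.tables: List[str] = []

    def get(self, path: str) -> Callable[[Callable], Callable]:
        """Register an HTTP GET handler."""
        def register(handler: Callable) -> Callable:
            self.routes["GET " + path] = handler
            return handler
        return register

    def every(self, interval: str) -> Callable[[Callable], Callable]:
        """Register a handler that should run on a schedule."""
        def register(handler: Callable) -> Callable:
            self.schedules[interval] = handler
            return handler
        return register

    def table(self, name: str) -> Dict[str, dict]:
        """Hand back a key-value store; in this toy it is just a dict."""
        self.tables.append(name)
        return {}


def infer_infrastructure(app: App) -> List[str]:
    """Derive the resources to provision from what the code registered."""
    resources: List[str] = []
    if app.routes:
        resources.append("http-gateway")
    resources += ["function for " + route for route in app.routes]
    resources += ["scheduled function every " + i for i in app.schedules]
    resources += ["key-value table " + t for t in app.tables]
    return resources


# Business logic only: no configuration file describes the infrastructure.
app = App()
users = app.table("users")


@app.get("/users")
def list_users() -> list:
    return list(users.values())


@app.every("1 hour")
def cleanup() -> None:
    users.clear()


plan = infer_infrastructure(app)
print(plan)
```

A real platform would have to do far more (permissions, scaling, deployment), but the direction is the same: the code is the source of truth, and the infrastructure plan falls out of it.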

But we're now at the point where the technological groundwork painstakingly laid over the last seven years is ready for the next major step towards the ultimate vision. The complexity added by configuration was an intermediate step that told the cloud what was needed to run our code. The near future is a world that no longer requires that step. So, yes, there are plenty of things still required for serverless computing to reach its full potential, but we're a lot closer to "just write business logic" than you might think. And when anyone can fully express their application with just their code, then the true potential of serverless will finally be met.

Tags: #aws, #lambda, #cloud