On Tuesday, November 20, 2018, AWS announced the release of the new Aurora Serverless Data API. This has been a long awaited feature and has been at the top of many a person’s #awswishlist. As you can imagine, there was quite a bit of fanfare over this on Twitter.
Obviously, I too was excited. The prospect of not needing to use VPCs with Lambda functions to access an RDS database is pretty compelling. Think about all those cold start savings. Plus, connection management with serverless and RDBMS has been quite tricky. I even wrote an NPM package to help deal with the
max_connections issue and the inevitable zombies 🧟♂️ roaming around your RDS cluster. So AWS’s RDS via HTTP seems like the perfect solution, right? Well, not so fast. 😞
“What? You can’t use MySQL with serverless functions, you’ll just exhaust all the connections as soon as it starts to scale! And what about zombie connections? Lambda doesn’t clean those up for you, meaning you’ll potentially have hundreds of sleeping threads blocking new connections and throwing errors. It can’t be done!” ~ Naysayer
I really like DynamoDB and BigTable (even Cosmos DB is pretty cool), and for most of my serverless applications, they would be my first choice as a datastore. But I still have a love for relational databases, especially MySQL. It had always been my goto choice, perfect for building normalized data structures, enforcing declarative constants, providing referential integrity, and enabling ACID-compliant transactions. Plus the elegance of SQL (structured query language) makes organizing, retrieving and updating your data drop dead simple.
But now we have SERVERLESS. And Serverless functions (like AWS Lambda, Google Cloud Functions, and Azure Functions) scale almost infinitely by creating separate instances for each concurrent user. This is a MAJOR PROBLEM for RDBS solutions like MySQL, because available connections can be quickly maxed out by concurrent functions competing for access. Reusing database connections doesn’t help, and even the release of Aurora Serverless doesn’t solve the
max_connections problem. Sure there are some tricks we can use to mitigate the problem, but ultimately, using MySQL with serverless is a massive headache.
Well, maybe not anymore. 😀 I’ve been dealing with MySQL scaling issues and serverless functions for years now, and I’ve finally incorporated all of my learning into a simple, easy to use NPM module that (I hope) will solve your Serverless MySQL problems.
I’m a huge fan of building microservices with serverless systems. Serverless gives us the power to focus on just the code and our data without worrying about the maintenance and configuration of the underlying compute resources. Cloud providers (like AWS), also give us a huge number of managed services that we can stitch together to create incredibly powerful, and massively scalable serverless microservices.
I’ve read a lot of posts that mention serverless microservices, but they often don’t go into much detail. I feel like that can leave people confused and make it harder for them to implement their own solutions. Since I work with serverless microservices all the time, I figured I’d compile a list of design patterns and how to implement them in AWS. I came up with 19 of them, though I’m sure there are plenty more.
In this post we’ll look at all 19 in detail so that you can use them as templates to start designing your own serverless microservices.
Amazon announced the General Availability of Aurora Serverless on August 9, 2018. I have been playing around with the preview of Aurora Serverless for a few months, and I must say that overall, I’m very impressed. There are A LOT of limitations with this first release, but I believe that Amazon will do what Amazon does best, and keep iterating until this thing is rock solid.
The announcement gives a great overview and the official User Guide is chock full of interesting and useful information, so I definitely suggest giving those a read. In this post, I want to dive a little bit deeper and discuss the pros and cons of Aurora Serverless. I also want to dig into some of the technical details, pricing comparisons, and look more closely at the limitations.
Someone asked a great question on my How To: Reuse Database Connections in AWS Lambda post about how to end the unused connections left over by expired Lambda functions:
I’m playing around with AWS lambda and connections to an RDS database and am finding that for the containers that are not reused the connection remains. I found before that sometimes the connections would just die eventually. I was wondering, is there some way to manage and/or end the connections without needing to wait for them to end on their own? The main issue I’m worried about is that these unused connections would remain for an excessive amount of time and prevent new connections that will actually be used from being made due to the limit on the number of connections.
🧟♂️ Zombie RDS connections leftover on container expiration can become a problem when you start to reach a high number of concurrent Lambda executions. My guess is that this is why AWS is launching Aurora Serverless, to deal with relational databases at scale.
At the time of this writing it is still in preview mode.
Update September 2, 2018: I wrote an NPM module that manages MySQL connections for you in serverless environments. Check it out here.
Update August 9, 2018: Aurora Serverless is now Generally Available!
Overall, I’ve found that Lambda is pretty good about closing database connections when the container expires, but even if it does it reliably, it still doesn’t solve the MAX CONNECTIONS problem. Here are several strategies that I’ve used to deal with this issue.