Managing MySQL at Serverless Scale

“What? You can’t use MySQL with serverless functions, you’ll just exhaust all the connections as soon as it starts to scale! And what about zombie connections? Lambda doesn’t clean those up for you, meaning you’ll potentially have hundreds of sleeping threads blocking new connections and throwing errors. It can’t be done!”  ~ Naysayer

I really like DynamoDB and BigTable (even Cosmos DB is pretty cool), and for most of my serverless applications, they would be my first choice as a datastore. But I still have a love for relational databases, especially MySQL. It had always been my goto choice, perfect for building normalized data structures, enforcing declarative constants, providing referential integrity, and enabling ACID-compliant transactions. Plus the elegance of SQL (structured query language) makes organizing, retrieving and updating your data drop dead simple.

But now we have SERVERLESS. And Serverless functions (like AWS Lambda, Google Cloud Functions, and Azure Functions) scale almost infinitely by creating separate instances for each concurrent user. This is a MAJOR PROBLEM for RDBS solutions like MySQL, because available connections can be quickly maxed out by concurrent functions competing for access. Reusing database connections doesn’t help, and even the release of Aurora Serverless doesn’t solve the max_connections problem. Sure there are some tricks we can use to mitigate the problem, but ultimately, using MySQL with serverless is a massive headache.

Well, maybe not anymore. 😀 I’ve been dealing with MySQL scaling issues and serverless functions for years now, and I’ve finally incorporated all of my learning into a simple, easy to use NPM module that (I hope) will solve your Serverless MySQL problems.

Introducing Serverless MySQL

Serverless MySQL is a wrapper for Doug Wilson’s amazing mysql Node.js module. It adds a connection management component to the mysql module that is designed specifically for use with serverless applications. The module can monitor the number of connections being utilized, and then based on your settings, manage those connections to allow thousands of concurrent executions to share them. It will reuse connections when possible, clean up zombie threads, enforce connection limits per user, and retry connections using trusted backoff algorithms.

In addition, Serverless MySQL also adds modern async/await support to the mysql module, eliminating callback hell or the need to wrap calls in promises. It also dramatically simplifies transactions, giving you a simple and consistent pattern to handle common workflows.

NOTE: This module has been tested with AWS’s RDS MySQL, Aurora MySQL, and Aurora Serverless, but it should work with any standards-based MySQL server.

A Simple Example

Installation and Requirements

Serverless MySQL is an NPM module that can be installed and included in your Node.js functions.

This module uses modern JavaScript methods and therefore requires Node 8.10+. It also requires the use of a MySQL-compatable server/cluster.

There really isn’t anything special that needs to be done in order for your MySQL server (including RDS, Aurora, and Aurora Serverless) to use serverless-mysql. You should just be aware of the following two scenarios.

If you set max user_connections, the module will only manage connections for that user. This is useful if you have multiple clients connecting to the same MySQL server (or cluster) and you want to make sure your serverless app doesn’t use all of the available connections.

If you’re not setting max user_connections, the user MUST BE granted the PROCESS privilege in order to count other connections. Otherwise it will assume that its connections are the only ones being used. Granting PROCESS is fairly safe as it is a read only permission and doesn’t expose any sensitive data.

How to use Serverless MySQL 🚀

Serverless MySQL wraps the mysql module, so it supports pretty much everything that the mysql module does. It uses all the same connection options, provides a query() method that accepts the same arguments when performing queries (except the callback of course), and passes back the query results exactly as the mysql module returns them. There are a few things that don’t make sense in serverless environments, like streaming rows, so there is no support for that yet.

To use Serverless MySQL, require it OUTSIDE your main function handler. This will allow for connection reuse between executions. The module must be initialized before its methods are available.

There are a number of supported configuration options (including event callbacks) detailed in the documentation. These must be passed in during initialization.

MySQL connection options can be passed in at initialization (using the config property) or later using the config() method.

You can explicitly establish a connection using the connect() method if you want to, though it isn’t necessary. This method returns a promise, so you’ll need to await the response or wrap it in a promise chain.

Running queries is super simple using the query() method. It supports all query options supported by the mysql module, but returns a promise instead of using the standard callbacks. You either need to await them or wrap them in a promise chain.

Once you’ve run all your queries and your serverless function is ready to return data, call the end() method to perform connection management tasks. This will do things like check the current number of connections, clean up zombies, or even disconnect if there are too many connections being used. Be sure to await its result before continuing.

Note that end() will NOT necessarily terminate the connection. Only if it has to in order to free up connections. If you’d like to explicitly terminate connections, use the quit() method instead.

If you need access to the connection object, you can use the getClient() method. This will allow you to use any supported feature of the mysql module directly.

Transaction Support 🙌

Transaction support in Serverless MySQL has been dramatically simplified. Start a new transaction using the transaction() method, and then chain queries using the query() method. The query() method supports all standard query options.

You can specify an optional rollback() method in the chain. This will receive the error object if a transaction fails, allowing you to add additional logging or perform some other action. Call the commit() method when you are ready to execute the queries.

Automatic Connection Retries and Backoff 🤓

Serverless MySQL will automatically kill idle connections or disconnect the current connection if the connUtilization limit is reached. Even with this aggressive strategy, it is possible that multiple functions will be competing for the same available connections. The backoff setting uses the strategy outlined here to use Jitter instead of Exponential Backoff when attempting connection retries.

The two supported methods are full and decorrelated Jitter. Both are effective in reducing server strain and minimize retries. The module defaults to full.

Full Jitter: LESS work, MORE time

Decorrelated Jitter: MORE work, LESS time

You can also provide your own backoff algorithm.

Does this thing actually work? 🤔

Fair question. I’ve run a lot of tests using a number of different configurations. Ramp ups appear to work best, but once there are several warm containers, the response times are much better. Below is an example test I ran using AWS Lambda and Aurora Serverless. Aurora Serverless was configured with 2 ACUs (and it did not autoscale), so there were only 90 connections available to the MySQL cluster. The Lambda function was configured with 1,024 MB of memory. This test simulated 500 users per second for one minute. Each user ran a sample query retrieving a few rows from a table.

From the graph below, you can see that the average response time was 41 ms (min 20 ms, max 3743 ms) with ZERO errors.

Other tests that use larger configurations were extremely successful too, but I’d appreciate other independent tests to verify my assumptions. 📊

Where do we go from here?

There is plenty more to do with this module. I’ve done a number of simulated tests, but there are several others that need to be run. The module also needs some automated tests written along with code coverage reporting. I’ll be adding that soon so that we don’t introduce regressions as new features are added.

I’ve actually been using this module in production for quite some time and it has worked perfectly for several of my applications. Even if you are not experiencing massive scale, this module is a handy tool for simply managing your connections and providing a more intuitive interface into the mysql module.

I’d love to know your thoughts and see your test results. There is always room for improvement, so your feedback, bug reports, suggestions, and contributions are greatly appreciated.

Good luck! 🤘🏻

Tags: , , , , , , , ,


Did you like this post? 👍  Do you want more? 🙌  Follow me on Twitter or check out some of the projects I’m working on. You can sign up for my WEEKLY newsletter too. You'll get links to my new posts (like this one), industry happenings, project updates and much more! 📪

Sign Up for my WEEKLY email newsletter


I respect your privacy and I will NEVER sell, rent or share your email address.

6 thoughts on “Managing MySQL at Serverless Scale”

    1. Hi Chandru,

      I wish I had the time. I don’t use MongoDB in serverless production environments, so I’m not as familar with the ins and outs of the technology. I do think most of the same techniques I use would work for MongoDB, so perhaps someone wants to fork the repo and add that functionality?

      Thanks,
      Jeremy

  1. Thanks for sharing! Do you plan to integrate this with existing ORMs? Like sequelize? Or would you prefer to just keep using raw SQL queries?

    1. Hi Tom,

      I personally hate ORMs. I think they mostly provide a layer of abstraction that has little to no benefit. They often require several queries to be run when a simple join could have solved the problem. Plus, the likelihood of me changing my underlying database technology is slim to none, and even if I did, most ORMs would require code changes anyway. I do have a separate NPM module I’ve been working on that integrates a caching layer, but it’s not quite ready to share.

      Thanks for feedback,
      Jeremy

  2. Thanks for the article and the lib, we have the same Lambda/Aurora SLS combo so we’ll definitely give it a try.

    Regarding your comment on ORMs and the likelihood to change DB tech… I don’t believe that “convenient switching of underlying DB tech” is something that is highlighted as an ORM benefit. At the beginning of a project, you can pick whatever you like / need, but at later stages, it immediately disappears because switch is not “just plug-in a different DB driver”, there so much more to that.

    1. Hi Adrian,

      I did point out that changing your DB tech often requires a number of query rewrites. You’ve obviously had experience with ORMs, so you are well aware of this fact. However, Doctrine ORM (as an example) says, “This provides developers with a powerful alternative to SQL that maintains flexibility without requiring unnecessary code duplication.” IMO, someone new to ORMs might interpret “maintains flexibility” as a reference to the underlying DB tech. In some cases this is true, until you add complexity.

      My disdain for most ORMs is that they provide too much abstraction, and in doing so, require unnecessary steps (read multiple queries) to accomplish simple things. If you are just using it to list, sort and paginate data, they may work well. I am a fan of query builders, since they can shortcut quite a bit of code, but I don’t like handing query planning over to an ORM.

      Thanks for the comment and I hope the lib works for you.

      – Jeremy

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.