How To: Manage RDS Connections from AWS Lambda Serverless Functions

Someone asked a great question on my How To: Reuse Database Connections in AWS Lambda post about how to end the unused connections left over by expired Lambda functions:

I’m playing around with AWS lambda and connections to an RDS database and am finding that for the containers that are not reused the connection remains. I found before that sometimes the connections would just die eventually. I was wondering, is there some way to manage and/or end the connections without needing to wait for them to end on their own? The main issue I’m worried about is that these unused connections would remain for an excessive amount of time and prevent new connections that will actually be used from being made due to the limit on the number of connections.

🧟‍♂️ Zombie RDS connections leftover on container expiration can become a problem when you start to reach a high number of concurrent Lambda executions. My guess is that this is why AWS is launching Aurora Serverless, to deal with relational databases at scale. At the time of this writing it is still in preview mode.

Update September 2, 2018: I wrote an NPM module that manages MySQL connections for you in serverless environments. Check it out here.

Update August 9, 2018: Aurora Serverless is now Generally Available!

Overall, I’ve found that Lambda is pretty good about closing database connections when the container expires, but even if it does it reliably, it still doesn’t solve the MAX CONNECTIONS problem. Here are several strategies that I’ve used to deal with this issue.

1. Implement a good caching strategy

As your application scales, use something like REDIS to cache common data queries. REDIS’s maxclient default is 65,000, far greater than the connection limits for RDS. You should also design your application to ONLY create a connection to RDS when you need to. It’s possible that your thousands of concurrent requests might not actually need a database connection if they get data from a cache or are performing another action. Your connect() call should always be INSIDE your handler and only called when your flow requires a connection.

2. Lower your connection timeouts

RDS doesn’t give your root account the SUPER privilege, but you can create “RDS DB Parameter Groups” that allow you to set the wait_timeout and interactive_timeout to values more appropriate for serverless applications (the default is 8 hours). I’ve set these to 10 minutes in certain implementations and it seems to work well. My code is always designed to reconnect if the connection drops, so even if “warm” containers lose their connection, this isn’t an issue.

3. Limit concurrent executions

If your database only supports 1,000 concurrent connections, limit your Lambda function’s concurrent connections to something less that that. This isn’t necessarily ideal, since it will impact your users during high traffic situations, but it will prevent the database from maxing out connections.

4. Limit connections per username

Another strategy is to set a resource limit on the number of connections a given db user can have. MAX USER_CONNECTIONS can be limited to a number less than the max connections allowed by the RDS instance. This will throw an error in your Lambda function, but it is possible for you to gracefully handle that error with a Too many connections response or something else. This is handy, because you can then…

5. Clear connections with a proxy function

Even without SUPER privilege, RDS does allow SHOW PROCESSLIST which lets you view all open connections. You could trigger a proxy Lambda function on a Too many connections error (or on a regular schedule) and have it inspect all the open connections. Then it can issue a CALL mysql.rds_kill(thread-ID) on sleeping connections with high time values.

6. Increase max_connections setting

As I mentioned earlier, RDS lets you create “RDS DB Parameter Groups” to adjust your database’s default settings. This can be used to set the max_connections value up to 16,000 connections!  This obviously has its tradeoffs, especially around memory and CPU consumption. If you are performing long-running or CPU intensive queries, make sure you size your instances to appropriately handle the load. There is a reason why the defaults are the defaults. Also, you can’t set this value with Aurora Serverless, so something to think about there.

7. Use the serverless-mysql NPM module

I found myself continually using the same strategies to manage RDS connections, so I finally decided to package them into an NPM module. This module not only handles connection management, but also throws in a bunch of extras, like transaction support, to make working with MySQL in your apps much easier. You can install it by running npm i serverless-mysql. The documentation and code is available on GitHub.

The bottomline

Serverless isn’t perfect…yet! 😉 However, using appropriate strategies like the ones mentioned above help to work around some of the current limitations. And don’t forget to check out my new NPM module: serverless-mysql.

Do you have other strategies for dealing with RDS connections in your serverless functions? Let me know in the comments.

Tags: , , , , ,


Did you like this post? 👍  Do you want more? 🙌  Follow me on Twitter or check out some of the projects I’m working on. You can sign up for my WEEKLY newsletter too. You'll get links to my new posts (like this one), industry happenings, project updates and much more! 📪

Sign Up for my WEEKLY email newsletter


I respect your privacy and I will NEVER sell, rent or share your email address.

12 thoughts on “How To: Manage RDS Connections from AWS Lambda Serverless Functions”

  1. Hi Jeremy, Thanks a lot for sharing this amazing article, it ensured my observations and made me confident about my moves.

    I would like to discuss this point with you: “3. Limit concurrent executions”, what happens if the limit reached? would AWS queue the requests to run the lambda later or we just lost it?

    Best,

  2. How can your #1 suggestion be to use Redis when we are talking about serverless applications? Redis would require you to setup and maintain a server.

    1. Serverless is not about having no servers, it’s about handing off server management to the cloud provider. If you are using a relational database in a serverless application, you’ll need to either maintain that yourself, or use a managed service like RDS from AWS. When I implement caching, I typically use AWS Elasticache, a managed service that maintains my Redis cluster.

  3. Hi jeremy,

    Thank you for your post, your site is a great place to increase my knowledge about serverless.

    Do you have any post related as Kinesis ? did you use this service?

    Thank you!

    1. Hi Hugo,

      I’m glad you find the site useful. I don’t have any posts about Kinesis yet, but I do use the service for a number of applications. If I think of a good topic, I’ll be sure to write about it.

      Thanks,
      Jeremy

  4. Good work on that NPM module! A question about transactions though:
    How can I get the result (i.e. the id of the inserted row) of the first query, to use it in the following query?
    I used to do this with custom functions that returned promises:

    beginTransaction()
    .then(() => executeQuery(‘INSERT INTO table (foo) VALUES(?)’, [1])
    .then((r) => {
    console.log(r.insertId);
    return executeQuery(‘INSERT INTO table2 (bar), [r.insertId])
    });

    but with your .transaction() it’s not really clear how to do this… pls help!

    1. Hi Dominggus,

      Thank you for pointing this out! I actually didn’t account for this. The queries are all queued, so right now your example wouldn’t work with Serverless MySQL’s transaction method. I will add an issue to GitHub and add this capability in. You can still access the underlying connection by calling the getClient() method, which would allow you to work around it for now.

      Github issue here: https://github.com/jeremydaly/serverless-mysql/issues/1

      Thanks again,
      Jeremy

  5. Hi Jeremy,
    Your post has been very helpful for my lambda implementation. I have a question: Can I use your mysql NPM module for Postgres? My Postgres database is running in an EC2 instance under a VPC and my lambda makes query on that. The project is expected to have huge number of concurrent connections from postgres via server less lambda invocation, through an AWS API gateway.
    Any help/pointer would be very much appreciated.
    Thanks,
    Sanjoy

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.