How To: Reuse Database Connections in AWS Lambda

Update 9/2/2018: I wrote an NPM module that manages MySQL connections for you in serverless environments. Check it out here.

I work with AWS Lambda quite a bit. The ability to use this Functions-as-a-Service (FaaS) has dramatically reduced the complexity and hardware needs of the apps I work on. This is what’s known as a “Serverless” architecture since we do not need to provision any servers in order to run these functions. FaaS is great for a number of use cases (like processing images) because it will scale immediately and near infinitely when there are spikes in traffic. There’s no longer the need to run several underutilized processing servers just waiting for someone to request a large job.

AWS Lambda is event-driven, so it’s also possible to have it respond to API requests through AWS’s API Gateway. However, since Lambda is stateless, you’ll most likely need to query a persistent datastore in order for it to do anything exciting. Setting up a new database connection is relatively expensive. In my experience it typically takes more than 200ms. If we have to reconnect to the database every time we run our Lambda functions (especially if we’re responding to an API request) then we are already adding over 200ms to the total response time. Add that to your queries and whatever additional processing you need to perform and it becomes unusable under normal circumstance. Luckily, Lambda lets us “freeze” and then “thaw” these types of connections.

Update 4/5/2018: After running some new tests, it appears that “warm” functions now average anywhere between 4 and 20ms to connect to RDS instances in the same VPC. Cold starts still average greater than 100ms. Lambda does handle setting up DB connections really well under heavy load, but I still favor connection reuse as it cuts several milliseconds off your execution time.

The Lambda documentation tells you to keep your variable declarations inside your handler function. For example:

Any variable outside the handler function will be frozen in between Lambda invocations and possibly reused. The documentation states to “not assume that AWS Lambda always reuses the container because AWS Lambda may choose not to reuse the container.” See here for AWS’s introduction to Lambda. I’ve found that depending on the volume of executions, the container is almost always reused.

This “freezing” process allows us to maintain state between executions. For example, we could create a simple counter variable to see how many times the Lambda container was reused:

This would increment the counter every time we called the Lambda function until AWS decided to expire the container. Lambda is able to freeze any type of variable, including the connection to a database like MySQL. We simply create our connection outside of our handler function like so:

This is all fine and good, but the problem is that this will never actually return unless you close the connection to the database. This is because Lambda waits for Node’s Event Loop to finish before returning anything via the callback. However, Lambda has a “context” object that can be tweaked to make this work. All we need to do is update context.callbackWaitsForEmptyEventLoop to false and Lambda will return as soon as we execute the callback() function.

Another important thing to remember is that module references are also “frozen”, so you can add all of your database connection and state management functionality into a separate module. I typically use a closure to store the state of my database connections and then use promises to manage my async calls. This is a great way to connect to the database ONLY when you need to instead of making sure it is enabled for every invocation.

How do I manage these connections, especially after the Lambda container expires? Great question! I’ve written a post to answer this question: How To: Manage RDS Connections from AWS Lambda Serverless Functions

Looking to build an serverless API with AWS Lambda? Read my post How To: Build a Serverless API with Serverless, AWS Lambda and lambda-api

Tags: , , , , , ,


Did you like this post? 👍  Do you want more? 🙌  Follow me on Twitter or check out some of the projects I’m working on. You can sign up for my WEEKLY newsletter too. You'll get links to my new posts (like this one), industry happenings, project updates and much more! 📪

Sign Up for my WEEKLY email newsletter


I respect your privacy and I will NEVER sell, rent or share your email address.

17 thoughts on “How To: Reuse Database Connections in AWS Lambda”

  1. Hey Jeremy,

    Your post was very helpful. Especially, the trick with context.callbackWaitsForEmptyEventLoop
    However, I stumble upon an issue. Once, I refactored the code to NOT close the connection explicitly all my test started to behave in the same way as Lambda without context.callbackWaitsForEmptyEventLoop = false! Basically, once all tests have completed, the event loop is waiting for the connection to be closed and therefore, sls invoke test never finishes.

    Please see my question on SO:
    https://stackoverflow.com/questions/48933248/how-to-prevent-constructor-call-in-lambda-function-test

    Do you have any suggestions or patterns on how to rework the test? I have one solution (that actually works) but it involves changing NOT the test but the actual code to prevent constructor call. Even though, it works I wonder if there is a better and cleaner way – my favourite one would be to simply stub or mock constructor from the test so the actual application code can be left intact.

    Thanks!

    1. Hey Arek,

      Glad you found the post useful. To answer your Stack Overflow question, try this post I wrote about stubbing AWS services: https://www.jeremydaly.com/stub-aws-services-lambda-functions-using-serverless-sinon-js-promises/

      I’m not sure what you’re using for a database connection, but you should instantiate your Class outside of your handler, and then create a connect() method that gets triggered inside your handler. This will allow you to freeze the connection while still having control over when it connects.

      The best solution I’ve found to overcome the context.callbackWaitsForEmptyEventLoop = false issue when running tests locally is to conditionally execute code if I’m in the test environment. I put this at the end of my main handler:

      // For local testing
      if (process.env.IS_LOCAL) {
      REDIS.quit();
      MYSQL.quit();
      } // end if local

      The serverless-mocha-plugin doesn’t provide the IS_LOCAL environment variable by default, but you can assign it to one of your stages and then run sls invoke test -f myFunction -s local.

      Hope that helps!

  2. Hello Jeremy,

    i’m new to use the AWS Lambda, this looks cool and i found this post and your other post a great help too. I’ve a question, since you said:

    This would increment the counter every time we called the Lambda function until AWS decided to expire the container

    So what if i create a connection as a module, and use it with context.callbackWaitsForEmptyEventLoop = false; after each function execution AWS would freeze that.

    At the time AWS decides to expire this container, will it create connection automatically on new function execution? (i’m using pg-promise ) What would be the behaviour?

    Thanks!

    1. AWS Lambda will “freeze” any variable or connection that is outside of your main handler function’s scope. Therefore, requiring a module (outside of the main handler) that uses pg-promise to create a connection should behave as you noted above: reuse the connection until Lambda expires, and then create a new connection on the next cold start.

      Also note, the connection doesn’t actually need to be created outside of the main handler, just a global variable that will store the reference. So for instance, you could create a module that handled all of your Postgres db interactions and require that outside of the main handler function.

      Something like const myPostgres = require('./myPostgresModule.js'). The module could return a object that stores the connection and a method to create a connection. You could then call myPostgres.connect() and that would actually make the database connection for you. This would allow you to only connect to the database when needed and not for every call.

      Hope that helps.

  3. Is this an “officially” supported Lambda behavior or could it potentially disappear in a future release?

    1. Yes, this is officially supported by AWS Lambda. See https://docs.aws.amazon.com/lambda/latest/dg/running-lambda-code.html

      Any declarations in your Lambda function code (outside the handler code) remains initialized, providing additional optimization when the function is invoked again. For example, if your Lambda function establishes a database connection, instead of reestablishing the connection, the original connection is used in subsequent invocations. We suggest adding logic in your code to check if a connection exists before creating one.

  4. Hi Jeremy,

    This was a very insightful post, thanks for sharing!

    I’m playing around with AWS lambda and connections to a RDS database and am finding that for the containers that are not reused the connection remains. I found before that sometimes the connections would just die eventually. I was wondering, is there some way to manage and/or end the connections without needing to wait for them to end on their own? The main issue I’m worried about is that these unused connections would remain for an excessive amount of time and prevent new connections that will actually be used from being made due the the limit on the number of connections.

    Thanks!

  5. What if you get a big spike of traffic? AWS spins up 10,000 containers to serve the requests, but if your database only allows, say, 100 connections then most of the requests (and other systems) won’t be able to connect to the DB. If the spike continues the Lambda containers will hold the connection open, never sharing with other containers or systems.

  6. Did you mean to use let in blocks that check if a variable exists yet, such as let client = mysql.createConnection({ ...? Because that makes the client variable local to the if block, so you can’t use client later like you can with var.

    1. Hi Ryan,

      You’re right, let in this case would scope the variable inappropriately. I’ve updated the post with var so that it would work. I typically use closures with my persistent connections, so my returned methods have access to the client. But thanks for pointing in out, I missed it in the contrived example.

      Thanks,
      Jeremy

  7. Jeremy.. What if it takes a couple seconds for the database connection to connect and then handler is called by Lambda before the connection is open?

    1. Hi Mit,

      The mysql module will wait for a connection when you try to use query, so you wouldn’t need to worry about that. The callback() wouldn’t fire until after the query completed. This example is a bit out of date now that Lambda supports async/await. I will update the example to make it a bit clearer.

      – Jeremy

  8. Hi Jeremy
    Good idea! I think I can test and use it in my lambda that is written in python.
    I’ll use the sqlalchemy as the ORM.

    Thanks

  9. Thank you. This post was really helpful. I was stuck on the same issue of live connection resulting in timeout of Lambda.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.