Update 9/2/2018: I wrote an NPM module that manages MySQL connections for you in serverless environments. Check it out here.
I work with AWS Lambda quite a bit. Using this Functions-as-a-Service (FaaS) platform has dramatically reduced the complexity and hardware needs of the apps I work on. This is what’s known as a “Serverless” architecture since we do not need to provision any servers in order to run these functions. FaaS is great for a number of use cases (like processing images) because it will scale immediately and near infinitely when there are spikes in traffic. There’s no longer a need to run several underutilized processing servers just waiting for someone to request a large job.
AWS Lambda is event-driven, so it’s also possible to have it respond to API requests through AWS’s API Gateway. However, since Lambda is stateless, you’ll most likely need to query a persistent datastore in order for it to do anything exciting. Setting up a new database connection is relatively expensive. In my experience it typically takes more than 200ms. If we have to reconnect to the database every time we run our Lambda functions (especially if we’re responding to an API request) then we are already adding over 200ms to the total response time. Add that to your queries and whatever additional processing you need to perform and it becomes unusable under normal circumstances. Luckily, Lambda lets us “freeze” and then “thaw” these types of connections.
Update 4/5/2018: After running some new tests, it appears that “warm” functions now average anywhere between 4 and 20ms to connect to RDS instances in the same VPC. Cold starts still average greater than 100ms. Lambda does handle setting up DB connections really well under heavy load, but I still favor connection reuse as it cuts several milliseconds off your execution time.
The Lambda documentation tells you to keep your variable declarations inside your handler function. For example:

'use strict';

module.exports.handler = (event, context, callback) => {
  const someVar = "foo"
  callback(null, { result: someVar })
}

Any variable outside the handler function will be frozen in between Lambda invocations and possibly reused. The documentation states to “not assume that AWS Lambda always reuses the container because AWS Lambda may choose not to reuse the container.” See here for AWS’s introduction to Lambda. I’ve found that depending on the volume of executions, the container is almost always reused.
This “freezing” process allows us to maintain state between executions. For example, we could create a simple counter variable to see how many times the Lambda container was reused:

'use strict';

let counter = 0

module.exports.handler = (event, context, callback) => {
  counter++
  console.log(counter)
  callback(null, { count: counter })
}

This would increment the counter every time we called the Lambda function until AWS decided to expire the container. Lambda is able to freeze any type of variable, including the connection to a database like MySQL. We simply create our connection outside of our handler function like so:

'use strict';

const mysql = require('mysql'); // require mysql

// If 'client' variable doesn't exist
if (typeof client === 'undefined') {
  // Connect to the MySQL database
  var client = mysql.createConnection({
    // your connection info
  });

  client.connect()
}

module.exports.handler = (event, context, callback) => {
  client.query('SELECT * FROM `books`', function (error, results) {
    callback(null, results)
  });
}

This is all fine and good, but the problem is that this will never actually return unless you close the connection to the database. This is because Lambda waits for Node’s Event Loop to finish before returning anything via the callback. However, Lambda has a “context” object that can be tweaked to make this work. All we need to do is update context.callbackWaitsForEmptyEventLoop to false and Lambda will return as soon as we execute the callback() function.

'use strict';

const mysql = require('mysql'); // require mysql

// If 'client' variable doesn't exist
if (typeof client === 'undefined') {
  // Connect to the MySQL database
  var client = mysql.createConnection({
    // your connection info
  });

  client.connect()
}

module.exports.handler = (event, context, callback) => {
  // This will allow us to freeze open connections to a database
  context.callbackWaitsForEmptyEventLoop = false;

  client.query('SELECT * FROM `books`', function (error, results) {
    callback(null, results)
  });
}

Another important thing to remember is that module references are also “frozen”, so you can add all of your database connection and state management functionality into a separate module. I typically use a closure to store the state of my database connections and then use promises to manage my async calls. This is a great way to connect to the database ONLY when you need to instead of making sure it is enabled for every invocation.
How do I manage these connections, especially after the Lambda container expires? Great question! I’ve written a post to answer this question: How To: Manage RDS Connections from AWS Lambda Serverless Functions
Looking to build a serverless API with AWS Lambda? Read my post How To: Build a Serverless API with Serverless, AWS Lambda and lambda-api
Tags: amazon web services, aws lambda, faas, javascript, mysql, nodejs, serverless
Hey Jeremy,
Your post was very helpful, especially the trick with context.callbackWaitsForEmptyEventLoop.
However, I stumbled upon an issue. Once I refactored the code to NOT close the connection explicitly, all my tests started to behave the same way as Lambda without context.callbackWaitsForEmptyEventLoop = false! Basically, once all tests have completed, the event loop is waiting for the connection to be closed, and therefore sls invoke test never finishes.
Please see my question on SO:
https://stackoverflow.com/questions/48933248/how-to-prevent-constructor-call-in-lambda-function-test
Do you have any suggestions or patterns on how to rework the test? I have one solution (that actually works) but it involves changing NOT the test but the actual code to prevent the constructor call. Even though it works, I wonder if there is a better and cleaner way – my favourite one would be to simply stub or mock the constructor from the test so the actual application code can be left intact.
Thanks!
Hey Arek,
Glad you found the post useful. To answer your Stack Overflow question, try this post I wrote about stubbing AWS services: https://www.jeremydaly.com/stub-aws-services-lambda-functions-using-serverless-sinon-js-promises/
I’m not sure what you’re using for a database connection, but you should instantiate your Class outside of your handler, and then create a connect() method that gets triggered inside your handler. This will allow you to freeze the connection while still having control over when it connects.
The best solution I’ve found to overcome the context.callbackWaitsForEmptyEventLoop = false issue when running tests locally is to conditionally execute code if I’m in the test environment. I put this at the end of my main handler:
// For local testing
if (process.env.IS_LOCAL) {
  REDIS.quit();
  MYSQL.quit();
} // end if local
The serverless-mocha-plugin doesn’t provide the IS_LOCAL environment variable by default, but you can assign it to one of your stages and then run sls invoke test -f myFunction -s local.
Hope that helps!
Hello Jeremy,
I’m new to AWS Lambda. It looks cool, and I found this post and your other post a great help too. I have a question, since you said:
This would increment the counter every time we called the Lambda function until AWS decided to expire the container
So what if I create a connection as a module and use it with context.callbackWaitsForEmptyEventLoop = false; so that AWS freezes it after each function execution? When AWS decides to expire this container, will it create a connection automatically on the next function execution? (I’m using pg-promise.) What would be the behaviour?
Thanks!
AWS Lambda will “freeze” any variable or connection that is outside of your main handler function’s scope. Therefore, requiring a module (outside of the main handler) that uses pg-promise to create a connection should behave as you noted above: reuse the connection until Lambda expires, and then create a new connection on the next cold start.
Also note, the connection doesn’t actually need to be created outside of the main handler, just a global variable that will store the reference. So for instance, you could create a module that handled all of your Postgres db interactions and require that outside of the main handler function.
Something like const myPostgres = require('./myPostgresModule.js'). The module could return an object that stores the connection and a method to create a connection. You could then call myPostgres.connect() and that would actually make the database connection for you. This would allow you to only connect to the database when needed and not for every call.
Hope that helps.
Is this an “officially” supported Lambda behavior or could it potentially disappear in a future release?
Yes, this is officially supported by AWS Lambda. See https://docs.aws.amazon.com/lambda/latest/dg/running-lambda-code.html
Hi Jeremy,
This was a very insightful post, thanks for sharing!
I’m playing around with AWS Lambda and connections to an RDS database and am finding that for the containers that are not reused the connection remains. I found before that sometimes the connections would just die eventually. I was wondering, is there some way to manage and/or end the connections without needing to wait for them to end on their own? The main issue I’m worried about is that these unused connections would remain for an excessive amount of time and prevent new connections that will actually be used from being made due to the limit on the number of connections.
Thanks!
Hi Gabriel,
This is a great question, so I wrote a post to answer it for you: How To: Manage RDS Connections from AWS Lambda Serverless Functions.
Hope this helps,
Jeremy
What if you get a big spike of traffic? AWS spins up 10,000 containers to serve the requests, but if your database only allows, say, 100 connections then most of the requests (and other systems) won’t be able to connect to the DB. If the spike continues the Lambda containers will hold the connection open, never sharing with other containers or systems.
Hi Nick,
This is certainly a potential problem given the bottleneck that relational databases could create. I have another post (https://www.jeremydaly.com/manage-rds-connections-aws-lambda/) that gives some strategies to mitigate these problems.
Hope that helps,
Jeremy
Did you mean to use let in blocks that check if a variable exists yet, such as let client = mysql.createConnection({ ... ? Because that makes the client variable local to the if block, so you can’t use client later like you can with var.
Hi Ryan,
You’re right, let in this case would scope the variable inappropriately. I’ve updated the post with var so that it would work. I typically use closures with my persistent connections, so my returned methods have access to the client. But thanks for pointing it out, I missed it in the contrived example.
Thanks,
Jeremy
Jeremy.. What if it takes a couple seconds for the database connection to connect and then handler is called by Lambda before the connection is open?
Hi Mit,
The mysql module will wait for a connection when you try to use query, so you wouldn’t need to worry about that. The callback() wouldn’t fire until after the query completed. This example is a bit out of date now that Lambda supports async/await. I will update the example to make it a bit clearer.
– Jeremy
Hi Jeremy
Good idea! I think I can test and use it in my lambda that is written in python.
I’ll use the sqlalchemy as the ORM.
Thanks
Thank you. This post was really helpful. I was stuck on the same issue of live connection resulting in timeout of Lambda.
Thanks, this post was extremely helpful. I was also stuck on the lambda timeout issue.
Curious, why are you checking if the variable exists:
// If 'client' variable doesn't exist
if (typeof client === 'undefined')
Doesn’t the code outside the handler only run once, so it will always be undefined?
Hi Alex,
The variable only exists if the container is being reused. There is no guarantee that Lambda will reuse the same function that already has our frozen connection, so we must check for its existence first. By wrapping the connection information with the “client” check conditional, we are ensuring that we reuse the connection if it exists, or create a new one if it doesn’t.
Hope that helps,
Jeremy
Hi Jeremy, any update for using async/await? Thanks
Hi Hongbo,
Using NodeJS 8.10+ will allow you to use async/await as an alternative to promises.
– Jeremy
Hi Jeremy,
Do you still need to use context.callbackWaitsForEmptyEventLoop = false if you’re returning a promise? For example:
const handler = (event, context = {}) => {
  context.callbackWaitsForEmptyEventLoop = false
  return promiseFunction(event)
    .catch(error => {
      // handle error
    })
}
Hi Che,
Yes you do, because you’re still returning a resolved promise. If you don’t set context.callbackWaitsForEmptyEventLoop = false, then the Lambda function will hang without returning the result.
– Jeremy
We use mysql2 NodeJS library to connect to RDS and cache connections outside of the Lambda’s handler method. We handle dead connections and re-establish them.
We don’t set callbackWaitsForEmptyEventLoop to false, however connections are reused and Lambda returns right away.
That’s interesting about callbackWaitsForEmptyEventLoop. Unless something was changed, any open connections should be blocking. You’re not disconnecting after each invocation, are you?
I have a similar experience. I didn’t set callbackWaitsForEmptyEventLoop to false, but connections are reused and Lambda returns right away (but I am not using mysql2). It turns out this is because another module I use sets callbackWaitsForEmptyEventLoop to false. In my case, that module is apollo-server-lambda.
Interesting. Unless something changed, callbackWaitsForEmptyEventLoop has to be set for Node, or the process hangs.
Today I found that callbackWaitsForEmptyEventLoop only applies to callback-style, non-async Lambda functions and has no effect when an async-style Lambda function is used.
From documentation:
> callbackWaitsForEmptyEventLoop – Set to *false* to send the response right away when the *callback* executes.
This applies only when the *callback* function is executed.
I tested with the following Lambda function and indeed, response is returned immediately.
exports.handler = async (event, context) => {
  console.log(context.callbackWaitsForEmptyEventLoop)
  let d = Date.now()
  let resp = 'this is response ' + d
  console.log(resp)
  setTimeout(function () {
    console.log(Date.now() + ' Timeout complete. ' + d)
  }, 2000)
  return resp
};
Related documentation:
https://docs.aws.amazon.com/lambda/latest/dg/nodejs-prog-model-context.html
https://docs.aws.amazon.com/lambda/latest/dg/nodejs-prog-model-handler.html
That makes total sense. I’ve always had it as boilerplate in my code, so I didn’t even think twice about it when I migrated to v8.10 and started using async/await.
Thanks for the clarification!
Hey Jeremy,
Great article and very informative. I actually found this article while searching for a solution to something, and got halfway through it before I realised whose site I was on… I was actually in the audience for your presentation at #BFSServerless just a couple of days ago!
I’m trying to figure out if Lambda execution context is the “right” solution for my issue.
I’m currently making a call to an external service for sending SMS messages, from a lambda function. The SMS API requires 2 requests… one to their /token endpoint to get an auth token valid for 20 mins, and another to actually send the SMS (passing in the auth token).
Currently I am making both calls one after the other on every lambda invocation which feels wasteful and inefficient.
I’m wondering if it would be efficient/correct/secure dumping the auth token into the execution context, ready for the next invocation which could be 2 seconds later, or could be 24 hours later to try and grab the token (and check the expiry to see if it’s still valid).
Am I way off here? Is there a better way to do this? Or even another AWS service better suited to this?
Thanks for the advice and thanks for a great presentation at #BFSServerless… I thoroughly enjoyed it!
Hi Mark,
I’m sorry for the late reply. Great question. Hopefully you’ve already figured this out, but there would be multiple ways to achieve this. If the /token endpoint allows you to create multiple tokens (creating a new one doesn’t expire the old one), then having each function store a token in the GLOBAL scope for reuse on warm invocations would probably work quite well. You’d just need to check the timeout on each call and then potentially refresh the token if it’s about to expire. The other option is to store the token in DynamoDB, or ElastiCache, or even Parameter Store and coordinate your Lambdas to pull updates from that source. Each concurrent Lambda would still need to track expiration times so that it could either refresh the token from your store, or potentially fetch a new token and update the store.
Hope that helps. Feel free to DM on Twitter if you have more questions.
Hi Jeremy,
Thanks for this information.
This solved my problem of running into my database connection limits.
Here is the solution:
http://github.com/dsanandiya/lambda-nodejs-mysql-redis
Hi Jeremy. I have another question about this if you don’t mind.
Most of the time you would have your DB setup stuff in a different module – you wouldn’t do this all in the same index.js file where you define your handler.
You can require your “DB module” in the index.js but you may not use it there.
Do you know of any best practices around this, or is simply requiring it and setting to an unused variable enough (due to how Node caches modules)?
Also, if you have any additional pointers for doing this with Express too, that would be good to know.
Thanks in advance – much appreciated.
Jeremy Daly says:
March 7, 2018 at 12:38 pm
AWS Lambda will “freeze” any variable or connection that is outside of your main handler function’s scope. Therefore, requiring a module (outside of the main handler) that uses pg-promise to create a connection should behave as you noted above: reuse the connection until Lambda expires, and then create a new connection on the next cold start.
Also note, the connection doesn’t actually need to be created outside of the main handler, just a global variable that will store the reference. So for instance, you could create a module that handled all of your Postgres db interactions and require that outside of the main handler function.
Something like const myPostgres = require('./myPostgresModule.js'). The module could return an object that stores the connection and a method to create a connection. You could then call myPostgres.connect() and that would actually make the database connection for you. This would allow you to only connect to the database when needed and not for every call.
Hope that helps.
Jeremy: thanks for https://www.npmjs.com/package/serverless-mysql and your blog posts surrounding as well. This is good stuff man. I have a minor question that I hope doesn’t cross the line into asking for free consulting 😉
Specifically: I have a group of lambda based apis deployed as one serverless project, (each uses the underlying serverless Aurora MySql db for its own query) and was using https://www.npmjs.com/package/mysql, but was “randomly” getting unexplained db connection fails even though the aws metrics did not show the number of outstanding connections was ever over 4 for a db that supposedly had 80 total. So switching to your library, but my question is: what should my value for “connUtilization” be? My guess is that while each lambda is going to have its own cache of the max and total connections, that since they are all sharing the same db user and your library’s code is hitting the same common db resource (e.g. @@max_user_connections), they will share just fine? Or do I need to scale it down for each to something like 0.3?
Hi Tom,
The MySQL npm package doesn’t handle connection management very well, so you likely have connections getting dropped and the library doesn’t know how to reconnect. The Serverless MySQL package will take care of that for you. Probably no need to change the connection utilization since the caching is very low (or maybe off) by default. The only time you might want to change the connUtilization is if you need to trigger autoscales for Aurora Serverless. If set too low, the library will prevent scaling because it will aggressively manage connections.
– Jeremy
Hi Jeremy
Could there be a situation where too many connections are open because a spike caused too many instances of the microservice to be created, so the database connection limit is exceeded? Whenever I think of using Lambda for database-related functionality, this doubt bothers me, as I have no prior experience with Lambda handling millions of requests per second.
My experience is with Java J2EE Spring Boot microservices deployed on Kubernetes. I don’t have much idea how a similarly highly scalable requirement would work on AWS Lambda.
Hi Arun,
Are you seeing any resource leaks with millions of requests when doing it this way? I’m trying to meet the same requirement.