Aurora Serverless: The Good, the Bad and the Scalable

Amazon announced the General Availability of Aurora Serverless on August 9, 2018. I have been playing around with the preview of Aurora Serverless for a few months, and I must say that overall, I’m very impressed. There are A LOT of limitations with this first release, but I believe that Amazon will do what Amazon does best, and keep iterating until this thing is rock solid.

The announcement gives a great overview and the official User Guide is chock full of interesting and useful information, so I definitely suggest giving those a read. In this post, I want to dive a little bit deeper and discuss the pros and cons of Aurora Serverless. I also want to dig into some of the technical details, pricing comparisons, and look more closely at the limitations.

Update May 2, 2019: Amazon Aurora Serverless Supports Capacity of 1 Unit and a New Scaling Option

Update November 21, 2018: AWS released the Aurora Serverless Data API BETA that lets you connect to Aurora Serverless using HTTP as opposed to a standard MySQL TCP connection. It isn’t ready for primetime, but is a good first step. You can read my post about it here: Aurora Serverless Data API: A First Look.

What is Aurora Serverless?

Let’s start with Aurora. Aurora is Amazon’s “MySQL and PostgreSQL compatible relational database built for the cloud, that combines the performance and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases.” If you are using MySQL or PostgreSQL, you can basically migrate to Aurora without any changes to your application code. It’s fully managed, ridiculously fast (up to 5x faster than MySQL), and can scale using shared-storage read replicas and multi-availability-zone deployments. This extra power (and convenience) comes with an added cost (~23% more per hour), but I’ve found it to be well worth it.

Now let’s talk about the “serverless” part. Yes, serverless has servers (I’m getting sick of typing this 🤦🏻‍♂️), but the general idea is that for something to be “serverless”, it should:

  • Charge you only for what you use
  • Require zero server maintenance
  • Provide continuous scaling
  • Support built-in high availability and fault tolerance

Aurora Serverless provides an on-demand, auto-scaling, high-availability relational database that only charges you when it’s in use. While it may not be perfect, I do think that this fits the definition of serverless. All you need to do is configure a cluster, and then all the maintenance, patching, backups, replication, and scaling are handled automatically for you. It almost sounds too good to be true, so let’s jump in and investigate.

Setting up your Aurora Serverless cluster

Setting up an Aurora Serverless cluster is fairly simple. Note that you can only create a cluster in a VPC.

1. Click “Create Database” from the main RDS Console screen or from within “Instances” or “Clusters”

2. Select “Amazon Aurora” and the “MySQL 5.6-compatible” edition (Aurora Serverless only supports this edition) and click “Next”

3. Select “Serverless” as the DB engine, enter the cluster settings including the identifier, master username and password, then click “Next”

4. Select your “Capacity settings” and your “Additional scaling configuration” to control auto-pause settings

5. Configure “Network & Security” by selecting (or creating) a VPC and subnet group as well as choosing (or creating) VPC security groups

6. Expand “Additional configuration” and confirm your parameter group, backup retention setting and your encryption key and then click “Create database”

Note that data at rest in Serverless Aurora clusters appears to be automatically encrypted using KMS. There is no option to disable it.
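
If you prefer to script the cluster creation rather than click through the console, the AWS SDK for Node.js can do it as well. Here’s a minimal sketch (the identifier, credentials, and scaling settings are placeholders; you’d also typically pass your subnet group and security groups):

const AWS = require('aws-sdk')
const rds = new AWS.RDS({ region: 'us-east-1' })

rds.createDBCluster({
  DBClusterIdentifier: 'test-serverless-db',
  Engine: 'aurora', // the MySQL 5.6-compatible edition
  EngineMode: 'serverless',
  MasterUsername: 'root',
  MasterUserPassword: process.env.DB_PASSWORD,
  ScalingConfiguration: {
    MinCapacity: 2, // minimum ACUs
    MaxCapacity: 64, // maximum ACUs
    AutoPause: true, // see the note on auto-pause later in this post
    SecondsUntilAutoPause: 300
  }
}, (err, data) => {
  if (err) console.error(err)
  else console.log(data.DBCluster.Status)
})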

It took just over 2 minutes to create the cluster for me. Once it is available, you can view the cluster information by clicking on “Clusters” in the left menu and then clicking on your cluster name in the list.

This dashboard is quite robust, with lots of CloudWatch metrics (including the ability to compare clusters) and detailed information about your cluster configuration. From here you can also grab your “Database endpoint”.

Aurora Serverless Cluster Details

Connecting to your Aurora Serverless Cluster

From a connection standpoint, your Aurora Serverless cluster is very similar to a regular Aurora cluster. Any VPC resource that has access to your cluster (based on your chosen security groups) can connect to your cluster on port 3306. If you have an existing EC2 instance (with proper security group access and the mysql CLI tools installed), you can connect from your terminal using the standard:

mysql -u root -h test-serverless-db.cluster-XXXXXXXXXX.us-east-1.rds.amazonaws.com -p

If you want to connect to your cluster from your local machine, you can either connect through a VPN, or set up an SSH tunnel through a bastion EC2 instance in your VPC. You can configure the tunnel by adding an entry like the one below to your ~/.ssh/config file (a minimal sketch; the bastion host, user, and key file are placeholders for your own values):
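
Host tunnel
    HostName <your-ec2-bastion-public-dns>
    User ec2-user
    IdentityFile ~/.ssh/your-key.pem
    LocalForward 3306 test-serverless-db.cluster-XXXXXXXXXX.us-east-1.rds.amazonaws.com:3306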

Then simply run ssh -Nv tunnel in your terminal. From another terminal window run:

mysql -u root -h 127.0.0.1 -p

And that should connect you. I wrote a detailed post about how to do this with Elasticsearch in a VPC. That should give you more detailed information if you have an issue with the above setup.

Your applications would connect just like they would to any other MySQL database. I was able to easily add my new cluster to phpMyAdmin and connect from Lambda using the mysql node package.
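
For reference, a bare-bones Lambda handler using the mysql package might look something like this (a minimal sketch; the environment variables and query are placeholders):

const mysql = require('mysql')

exports.handler = (event, context, callback) => {
  const connection = mysql.createConnection({
    host: process.env.DB_HOST, // your cluster endpoint
    user: process.env.DB_USER,
    password: process.env.DB_PASSWORD,
    database: process.env.DB_NAME
  })
  connection.query('SELECT 1 AS ok', (error, results) => {
    connection.end() // close the connection so Lambda can exit cleanly
    if (error) return callback(error)
    callback(null, results)
  })
}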

Limitations of Aurora Serverless

As I mentioned earlier, there are A LOT of limitations to Aurora Serverless. However, depending on your use case, most of these probably won’t matter much and will have little to no impact on your application.

A few top level limitations:

  • Aurora Serverless is only compatible with MySQL 5.6
  • The port number for connections must be 3306
  • Aurora Serverless DB clusters can only be accessed from within a VPC
  • AWS VPN connections and inter-region VPC peering connections are not able to connect to Aurora Serverless clusters

You are also extremely limited with your ability to modify cluster-level parameters in custom Cluster Parameter groups. As of now, you can only modify the following cluster-level parameters:

  • character_set_server
  • collation_server
  • lc_time_names
  • lower_case_table_names
  • time_zone

Modifying other cluster-level parameters would have no effect since Aurora Serverless uses the default values. Instance-level parameters are not supported either. Noticeably absent here is the max_connections parameter. See Max Connections below for more information.

There are also several other features that Aurora Serverless doesn’t currently support. Some of these features are quite useful when using provisioned Aurora, so these might be a dealbreaker if you are relying on any of the following:

  • Loading data from an Amazon S3 bucket
    This is a great feature for bulk loading data into your MySQL database, especially if you’re using Aurora for reporting. Aurora Serverless doesn’t have support yet.
  • Invoking an AWS Lambda function with an Aurora MySQL native function
    This is one of Aurora’s greatest features, essentially giving you the ability to create triggers based on actions in your MySQL database. Comparable to DynamoDB streams.
  • Advanced auditing
    Useful for some applications, but something I can live without.
  • Aurora Replicas
    This is another great feature of Aurora that scales reads by creating new instances that access the same shared cluster volume. I’m sure Amazon thought this through, but I wonder how auto-scaling of Aurora Serverless compares.
  • Backtrack
    Another unique feature of Aurora is the ability to rewind and fast-forward transactions in your cluster. This isn’t available in Aurora Serverless.
  • Database cloning
    If you are using database cloning (especially for things like chaos engineering), you’re out of luck with Aurora Serverless.
  • IAM database authentication
    This is a nice and secure way to protect security credentials, but since Aurora Serverless doesn’t have public endpoints, it probably isn’t a huge deal.
  • Cross-region read replicas
    Helpful for disaster recovery, but I think it is unnecessary given the built-in high availability and distributed nature of Aurora Serverless.
  • Restoring a snapshot from a MySQL DB instance
    Not a huge deal if you are already using Aurora. If you are migrating from MySQL, you could always migrate to an Aurora provisioned cluster and then to Serverless.
  • Migrating backup files from Amazon S3
    Same as above. Useful, but there are workarounds.
  • Connecting to a DB cluster with Secure Socket Layer (SSL)
    Again, useful for security, but not necessary if using username/password to connect.

If none of these limitations affect you, then Aurora Serverless might be right for your use case. Let’s look at the cost comparison between using a provisioned versus a serverless cluster.

Cost comparisons

Both provisioned and serverless versions of Aurora charge you for storage, I/O, and data transfer. Storage and I/O are flat rates per corresponding unit of measure:

Storage Rate $0.10 per GB-month
I/O Rate $0.20 per 1 million requests

Data transfer rates from Amazon RDS to the Internet shouldn’t apply to Aurora Serverless since it can’t be accessed directly from the Internet. However, depending on where data is being transferred to within AWS, data transfer fees may apply. Below is a small sample of the pricing for data transfer.

Data Transfer OUT From Amazon RDS To
CloudFront $0.00 per GB
US East (N. Virginia) $0.01 per GB
Asia Pacific (Mumbai) $0.02 per GB
Asia Pacific (Sydney) $0.02 per GB
EU (London) $0.02 per GB
Complete pricing https://aws.amazon.com/rds/aurora/pricing/

NOTE: Data transferred between Amazon RDS and Amazon EC2 Instances in the same Availability Zone is free.

The real difference in pricing is based on the way Aurora Serverless scales and how it charges for usage. Provisioned Aurora instances utilize per hour billing based on instance size, like EC2 and other virtual instances. Aurora Serverless, on the other hand, uses ACUs (or Aurora Capacity Units) to measure database capacity. Each ACU has approximately 2 GB of memory with corresponding CPU and networking resources that are similar to provisioned instances of Aurora.

ACUs are billed by the second at a flat rate of $0.06 per hour. Even though you are only billed per second of usage, there is a minimum of 5 minutes billed each time the database is active. There is also a minimum provisioning of 2 ACUs (with 4 GB of memory). Updated May 2, 2019: You can now set your minimum capacity to 1 ACU (with 2 GB of memory) if you are using the MySQL version. PostgreSQL still has a minimum of 2 ACUs. You can scale all the way up to 256 ACUs with approximately 488 GB of memory. If you were to keep an Aurora Serverless database running 24 hours per day at 2 ACUs, it would cost you $2.88 ($0.06 * 2 * 24) per day (or roughly $86 per month). If you scale down to 1 ACU (on MySQL), the base cost (without any scale-ups) would be about $43 per month.

In order to compare apples to apples, I’ve created a chart below that makes some assumptions based on corresponding memory to see the difference in cost between provisioned versus serverless. The instance prices below reflect on-demand pricing, which are obviously much higher than reserved instances.

ACUs Memory (GB) Serverless/hr Instance Type Cost/hr Diff/hour Diff/month
1 2 $0.06 db.t2.small $0.041 $0.019 $13.68
2 4 $0.12 db.t2.medium $0.082 $0.038 $27.36
8 16 $0.48 db.r4.large $0.29 $0.190 $136.80
16 32 $0.96 db.r4.xlarge $0.58 $0.38 $273.60
32 64 $1.92 db.r4.2xlarge $1.16 $0.76 $547.20
64 122 $3.84 db.r4.4xlarge $2.32 $1.52 $1,094.40
128 244 $7.68 db.r4.8xlarge $4.64 $3.04 $2,188.80
256 488 $15.36 db.r4.16xlarge $9.28 $6.08 $4,377.60

As you can see, for sustained loads, the pricing for Aurora Serverless quickly becomes extremely expensive as compared to a single provisioned Aurora instance. However, according to the announcement, Aurora Serverless “creates an Aurora storage volume replicated across multiple AZs.” It also seems to indicate that it relies on multiple nodes to handle requests, which suggests that the service automatically provides high-availability and failover via multiple AZs. If you follow AWS’s recommendation to place “at least one Replica in a different Availability Zone from the Primary instance” to maximize availability, then the cost in the above chart would double for on-demand instances. This is a fairer comparison given the inherent high-availability nature of Aurora Serverless. The chart below assumes that each instance has a replica in a different Availability Zone.

ACUs Memory (GB) Serverless/hr Instance Type Cost/hr Diff/hour Diff/month
1 2 $0.06 db.t2.small $0.082 ($0.022) ($15.84)
2 4 $0.12 db.t2.medium $0.164 ($0.044) ($31.68)
8 16 $0.48 db.r4.large $0.58 ($0.10) ($72.00)
16 32 $0.96 db.r4.xlarge $1.16 ($0.20) ($144.00)
32 64 $1.92 db.r4.2xlarge $2.32 ($0.40) ($288.00)
64 122 $3.84 db.r4.4xlarge $4.64 ($0.80) ($576.00)
128 244 $7.68 db.r4.8xlarge $9.28 ($1.60) ($1,152.00)
256 488 $15.36 db.r4.16xlarge $18.56 ($3.20) ($2,304.00)

From this we can see significant cost savings, even if you maintained comparable capacity for 24 hours each day. Of course, the idea behind Aurora Serverless is to assume unpredictable workloads. If your application doesn’t have extremely long periods of sustained load, only paying for occasional spikes in traffic would actually be significantly cheaper.

With regards to reserved instances, obviously there is a significant price difference. However, I always find it hard to plan database capacity (especially a year out), so buying reserved instances is always a crap shoot (for me anyway). However, the ability to buy “Reserved ACUs” would be a really interesting concept. That way I could prepay for HOURS of capacity at a discounted rate. Something to think about, Amazon. 😉

Autoscaling Aurora Serverless

A cornerstone of serverless architectures is the ability to provide continuous scaling. Aurora Serverless is designed to scale up based on the current load generated by your application. Your cluster will automatically scale ACUs up if either of the following conditions is met:

  • CPU utilization is above 70% OR
  • More than 90% of connections are being used

Your cluster will automatically scale down if both of the following conditions are met:

  • CPU utilization drops below 30% AND
  • Less than 40% of connections are being used

Update May 2, 2019: According to the documentation, “There is no cooldown period for scaling up. Aurora Serverless can scale up whenever necessary, including immediately after scaling up or scaling down.”

Originally there was also a 3 minute cooldown period after a scale-up operation, which seemed to restrict the system from autoscaling more than once every 3 minutes (per the update above, this no longer applies). There is, however, a cooldown period of 15 minutes for scale-down operations, meaning that your scaled-up capacity will be maintained for at least 15 minutes after a scale-up operation. This makes sense to avoid scaling down too quickly. After a scale down, there is a 310 second cooldown period before the cluster will scale down again.

IMPORTANT NOTE: As we saw in the setup, it’s also possible to have your cluster automatically pause itself after a period of inactivity. If you are using your Aurora Serverless cluster in a production environment, I strongly suggest disabling this feature. It takes about 30 seconds to “unpause” a database, which is much too long for client-facing applications.

Aurora Serverless also introduces the concept of “scaling points”, which refer to a point in time at which the database can safely initiate a scaling operation. The documentation specifies long-running queries, transactions, temporary tables, and table locks as reasons why it might not be able to scale. Aurora Serverless will try to scale the cluster five times before cancelling the operation.

Update May 2, 2019: “You can now choose to apply capacity changes even when a scaling point is not found. If you opt to forcibly apply capacity changes, active connections to your database may get dropped. This configuration could be used to more readily scale capacity of your Aurora Serverless DB clusters if your application is resilient to connection drops.”

Manually Setting the Capacity

An extremely handy feature of Aurora Serverless is the ability to manually provision capacity for your cluster. This can be done via the console, AWS CLI, or the RDS API. If you are anticipating an increase in transactions, whether by detecting some type of leading indicator or preparing for a large batch operation, manually setting the ACUs allows you to quickly scale your cluster to handle the extra workload. Note that setting the capacity manually might drop existing connections if they prevent scaling operations. Once capacity is manually scaled, the same cooldown periods apply for autoscaling the cluster up and down.

Manually scaling capacity in the Console

Details for modifying the capacity manually can be found in the API documentation here. The AWS Node.js SDK already provides support for manual scaling as well.
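
For example, a manual scale-up with the Node.js SDK might look something like this (a sketch; the cluster identifier and capacity are placeholders, and the TimeoutAction option arrived with the May 2019 scaling update):

const AWS = require('aws-sdk')
const rds = new AWS.RDS({ region: 'us-east-1' })

rds.modifyCurrentDBClusterCapacity({
  DBClusterIdentifier: 'test-serverless-db',
  Capacity: 16, // target ACUs
  SecondsBeforeTimeout: 300, // how long to look for a scaling point
  TimeoutAction: 'RollbackCapacityChange' // or 'ForceApplyCapacityChange' to drop connections
}, (err, data) => {
  if (err) console.error(err)
  else console.log(`Pending capacity: ${data.PendingCapacity}`)
})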

Max Connections

A major limitation of relational databases in serverless architectures is the maximum number of concurrent connections allowed by the database engine. While FaaS services like Lambda may scale infinitely (in theory anyway), massive spikes in volume can quickly saturate the number of available connections to the underlying database. There are ways to manage connections in serverless environments (also see Managing MySQL at Serverless Scale), but even with Aurora Serverless, this still appears to be a possible limiting factor.

AWS uses the following formula for generating the max_connections value for Aurora instances (note that the log here is base 2):

log( ( <Instance Memory> * 1073741824) / 8187281408 ) * 1000 = <Default Max Connections>

A db.r4.xlarge instance with 30.5 GB of memory, for example, would have a default max_connections value of 2,000.

log( (30.5 * 1073741824) / 8187281408 ) * 1000 = 2000
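
A quick way to sanity-check this in Node:

// the "log" in the formula above is base 2
const defaultMaxConnections = (memoryGB) =>
  Math.round(Math.log2((memoryGB * 1073741824) / 8187281408) * 1000)

console.log(defaultMaxConnections(30.5)) // 2000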

You can see a list of default max_connections settings for provisioned Aurora instances here.

Aurora Serverless uses a similar formula based on memory allocation. I manually scaled a test cluster and ran select @@max_connections; after each operation to retrieve the actual value. The following chart outlines my results.

ACUs Memory (in GB) Max Connections
1 2 90
2 4 180 (up from 90)
4 8 270 (up from 135)
8 16 1,000
16 32 2,000
32 64 3,000
64 122 4,000
128 244 5,000
256 488 6,000

Update May 2, 2019: The max_connections values for 2 ACUs and 4 ACUs have now doubled, as reflected in the chart above.

If you compare this to the chart of provisioned instance values, you’ll notice that once they are at or above 8 ACUs, they are essentially identical to their corresponding instance types. However, with provisioned instances, you can manually change the max_connections value and squeeze a few more connections out of it. Aurora Serverless does not allow you to modify this value.

Provisioned clusters also allow for read replicas that can scale the number of available connections as well. Aurora Serverless does not allow for these.

Time to Scale

I ran a few tests to see how quickly the cluster would actually scale. Rather than simulating load (which I assume would have similar results), I manually scaled the cluster and measured the time it took for the max_connections value to change. I’m assuming that at that point, the new capacity was available for use. I also measured the time it took for the cluster status to change from “scaling-capacity” to “available”.

ACUs Time to Capacity Time to Completion
Up to 1 0:56 2:15
Up to 4 0:48 2:15
Up to 8 1:30 3:00
Up to 16 0:45 1:37
Up to 32 0:50 1:45
Up to 64 1:00 1:40
Up to 128 1:25 2:05
Up to 256 2:30 3:45
Down to 2 0:35 2:21
Down to 1 1:10 2:56

My tests showed most of the scaling operations taking less than a minute. Assuming similar performance for autoscaling operations, this would seem to be adequate for handling steadily increasing or potentially even sudden traffic bursts.
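
If you’d like to reproduce this test yourself, here’s a rough sketch of the approach (it assumes the aws-sdk and mysql packages, placeholder identifiers and credentials, and that your connection survives the scaling operation, which isn’t guaranteed):

const AWS = require('aws-sdk')
const mysql = require('mysql')

const rds = new AWS.RDS({ region: 'us-east-1' })
const db = mysql.createConnection({
  host: process.env.DB_HOST,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD
})

const maxConnections = () =>
  new Promise((resolve, reject) =>
    db.query('SELECT @@max_connections AS mc', (err, res) =>
      err ? reject(err) : resolve(res[0].mc)))

const main = async () => {
  const before = await maxConnections()
  const start = Date.now()
  await rds.modifyCurrentDBClusterCapacity({
    DBClusterIdentifier: 'test-serverless-db',
    Capacity: 16
  }).promise()
  let current = before
  while (current === before) { // poll until the new capacity is reported
    await new Promise((resolve) => setTimeout(resolve, 5000))
    current = await maxConnections()
  }
  console.log(`New capacity available after ${(Date.now() - start) / 1000}s`)
  db.end()
}

main().catch(console.error)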

Update May 2, 2019: I reran the scaling tests and the numbers were all very similar to my original results. You can expect approximately 1 minute to scale capacity, and another minute for the operation to fully complete.

Monitoring

CloudWatch provides all the same metrics for Aurora Serverless that it does for provisioned clusters. In addition, CloudWatch allows you to monitor the capacity allocated to your cluster using the ServerlessDatabaseCapacity metric. This would make for a great alarm.

ServerlessDatabaseCapacity CloudWatch Metric
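
For example, a simple alarm that fires when the cluster scales beyond an expected capacity might look like this (a sketch; the threshold, cluster identifier, and SNS topic ARN are placeholders):

const AWS = require('aws-sdk')
const cloudwatch = new AWS.CloudWatch({ region: 'us-east-1' })

cloudwatch.putMetricAlarm({
  AlarmName: 'aurora-serverless-capacity-high',
  Namespace: 'AWS/RDS',
  MetricName: 'ServerlessDatabaseCapacity',
  Dimensions: [{ Name: 'DBClusterIdentifier', Value: 'test-serverless-db' }],
  Statistic: 'Average',
  Period: 300, // five-minute periods
  EvaluationPeriods: 1,
  Threshold: 8, // alarm above 8 ACUs
  ComparisonOperator: 'GreaterThanThreshold',
  AlarmActions: ['arn:aws:sns:us-east-1:123456789012:db-alerts']
}, (err) => {
  if (err) console.error(err)
  else console.log('Alarm created')
})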

Final Thoughts

So far I’m very impressed by Aurora Serverless and the new capabilities it introduces. I also think that most of the limitations, such as IAM authentication, S3 integration, and Lambda triggers, will eventually make their way into the platform. But even the lack of those features (which didn’t even exist until AWS introduced them with Aurora) isn’t enough to outweigh the tremendous value that Aurora Serverless provides. If you are a PostgreSQL shop (support is supposedly coming soon), or you really need MySQL 5.7, then you might need to wait, but otherwise, this could be the new way to do relational databases at scale.

I need to run some additional tests to see how well it plays with Lambda, especially with regards to the connection limitation, but after playing with this for a few months, I’m inclined to start using it in production. I already have several clusters that are over provisioned to handle periodic traffic spikes. If this scales as expected, it very well could be the silver bullet I’ve been looking for. 🤔

Update May 2, 2019: I have been running Aurora Serverless clusters in production for several months now and haven’t had any problems with them. As far as the max_connections issue is concerned, I have been using the serverless-mysql package without issue.
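
If you’re curious what that looks like, here’s a minimal serverless-mysql example (the environment variables are placeholders):

const mysql = require('serverless-mysql')({
  config: {
    host: process.env.DB_HOST,
    user: process.env.DB_USER,
    password: process.env.DB_PASSWORD,
    database: process.env.DB_NAME
  }
})

exports.handler = async (event) => {
  const results = await mysql.query('SELECT 1 AS ok')
  await mysql.end() // runs connection management and returns the connection to the pool
  return results
}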

Are you planning on using Aurora Serverless? Let me know in the comments and describe the problem you plan on solving with it.

Did you like this post? 👍  Do you want more? 🙌  Follow me on Twitter or check out some of the projects I’m working on.

97 thoughts on “Aurora Serverless: The Good, the Bad and the Scalable”

  1. Excellent article, to add a note on the approach to connect to the database you could sshuttle as a wrapper of the SSH Tunnel.

    Terminal 1:
    $ sshuttle --dns -r username@jumphost 192.168.1.0/24
    Connected…

    Terminal 2:
    $ mysql -h aurora-serverless -u dbuser -p -P 3306
    Enter password:
    mysql>

  2. Excellent article! A question I have is: how does it help lambda in light of your previous articles? Do we still need to manage a connection pool for lambda functions? or Aurora Serverless connects so fast and renders the connection pool unnecessary? – Thanks.

    1. Hi Ken,

      Unfortunately, Aurora Serverless doesn’t solve the max_connections issue yet. It’s still possible that zombie connections or high concurrency could use all the connections. Aurora Serverless should definitely keep scaling to mitigate it, but there’s no guarantee it will be fast enough. I’m working on a little something to better handle this myself that I hope to share soon.

      Interestingly, there is a session called “SRV301 – Best Practices for Using AWS Lambda with RDS/RDBMS Solutions” at this year’s re:Invent. I plan on attending to get some more insight myself.

      – Jeremy

  3. Jeremy,
    Thanks for your answer. That is what I thought also. The Lambda with RDS session is indeed interesting. Along the same lines, do you use an ORM in your Lambda functions? I am wondering what your thoughts are about Lambda with ORMs?
    – Thanks.

    1. I hate ORMs with a passion. I have a project that I inherited that is using Doctrine ORM with Symfony and the amount of unnecessary queries it runs is ridiculous.

      A key component to keeping serverless functions fast and efficient is to minimize dependencies as much as possible. I doubt your Lambda function needs the full power of an ORM to run a few single-purpose RDS queries. I just don’t see the benefit to abstracting simple SQL queries within serverless environments.

      – Jeremy

  4. I totally agree with you. I feel that an ORM is not a good fit for microservices either, even when not combined with serverless. I know you work with microservices also. What are your thoughts on microservices with ORMs?

    1. Same feeling. I really dislike how much control you give to ORMs. Oftentimes a simple join can accomplish what takes an ORM 5+ queries to do. I avoid them at all costs. I’m unlikely to change my underlying RDBMS, so the flexibility it provides in that regard is essentially useless to me.

  5. Glad I’m on the same page as you in this regard. Changing databases is always a big effort in reality. I doubt an ORM can help a whole lot, unless it is a simple CRUD application.

  6. I’d like to say thank you because your articles are so clear and very useful!!
    After reading this I’m really keen to test Aurora Serverless, but I’m just wondering…
    Lambda service uses an elastic network interface (ENI) to access resources inside of a private VPC.
    A cold start has to create and attach the ENI to the lambda container.
    This seems to be a very slow process.
    I’m quite scared about that because it could have a huge impact, especially in a development environment context.
    Can I ask you what do you think about that?
    Thank you so much and sorry for my English 🙂

    1. Hi Diego,

      I’m glad you found the article helpful. You’re right that Lambdas inside a VPC need to use ENIs and it does have an impact on cold start times. However, VPC functions stay warm for at least 15 minutes (and much longer if in use), so once you get past the cold start (which usually lasts less than a few seconds), subsequent requests will be very fast. Also, Lambda functions smartly reuse ENIs, so a new ENI doesn’t necessarily need to be created on each cold start. I use Aurora heavily in production and I find cold starts to have a very minimal effect on overall performance.

      Hope that helps,
      Jeremy

  7. Hi, I’m trying to connect to an Aurora Serverless database from Spring Cloud, but it requires the attribute “db-instance-identifier”, and with serverless there are no instances, only a cluster. I hope for your answer. Thanks

    1. Hi Fernando,

      I’m not overly familiar with Spring, but the Aurora cluster endpoint should be all you need. Aurora takes care of the routing behind the scenes, so unless Spring is asking you for something beyond the connection endpoint, username and password, then you should be all set.

      – Jeremy

  8. Great article!

    Had a look at using it for our test workloads. But the limitations are a bit of a hindrance.
    1.) 8k column size. There’s a workaround for Aurora, but not for the serverless version.
    2.) ACU is a bit overkill in size.

    For production workloads that have no activity, let’s say out of office hours, a +/- 10 second cold start will never be acceptable on a web application.

    Will keep my eye out for the next couple of iterations. Really a game changer in the world of RDBMS :-D.

    Thanks for the summary!
    JJ

  9. Hello excellent article.

    I managed to connect to Aurora from an EC2 instance and it worked very well; however, when I try to connect through a Lambda function to Aurora Serverless I get a timeout. I configured the Lambda function with the same VPC as Aurora, but it still does not connect. Could you tell me what I’m possibly doing wrong?

    Thank you

    1. Hi Brahyam,

      It sounds like you might be missing a security rule for your Aurora Serverless cluster. Make sure that you have an outbound rule in your Lambda security group and that you are allowing port 3306 from your VPC.

      Hope that helps,
      Jeremy

  10. Excellent article. I am new to Aurora & Amazon RDS. Currently I am using a MySQL server hosted on a regular server. This article helped me get started.

    Thank You

  11. Very Good article.

    We have a requirement to copy a MySQL DB from Production in a different VPC to Integration, and also to Sandboxes (20-30) in a different VPC. Would Serverless Aurora be a viable option?

    Ram

  12. Hello, great article!
    I had a couple of questions around the cold start. We were thinking of using it with a synchronous web application where endpoints are handled with Lambdas, but I read that Aurora Serverless has a 25 second cold start (https://aws.amazon.com/blogs/aws/aurora-serverless-ga/). Is that still correct? You haven’t mentioned it in your post.
    The Data API seems promising, but in your other blog post it looks like it’s not quite mature enough yet for a synchronous web application?

    1. Hi Dan,

      Aurora Serverless has the ability to scale to ZERO ACUs, which basically pauses the database so that you are not paying for it. In order to “wake it up”, it does take ~30 seconds. However, in production, you would NEVER do this. As I suggest in the post, simply disable this feature, and your cluster will always be running at a minimum of 2 ACUs, making it always available. It will cost about $80 USD per month, but you get all the multi-AZ, high-availability failover benefits I mention above.

      Hope that helps,
      Jeremy

  13. How do you compare the performance metrics between Aurora serverless and Dynamo DB when the traffic load is pretty high? Our system usually gets pretty high spikes in the traffic and recently we observed 100 thousand requests to dynamo in 1 sec. We scale up the dynamo pretty high before the estimated traffic spike but since, Auroral serverles has limitations on the number of connections, will it be able to handle 100k requests/sec constantly for about 30 mins?

    1. Rahul,
      That’s a good question. The max_connections limitation will most likely be an issue here if you’re managing that yourself. Even scaled to the largest ACU setting, you only have 6,000 connections available. If you can use connection pooling, it might handle that amount of load. If you are using Lambda, then your options are more limited. There is the new Data API for Aurora Serverless that handles the connection pooling for you, but it is still in beta.
      – Jeremy

  14. I’ve been searching, but I don’t have a clear answer yet. What logging of Aurora Serverless is available, if any? Can I track slow queries?

    If my application isn’t tuned properly, should I avoid using Aurora Serverless because it has little or no logging facilities?

    1. I’ve reached out to the team, but haven’t gotten a clear answer either. Right now I think it would be worth tuning your queries on a separate “dev” cluster, and then using Aurora Serverless for production.

  15. Thanks Jerremy, great article. Given the current limitations with Aurora Serverless, what can we use to export tables from Amazon Serverless into S3? Do we have to use some other AWS service to achieve this?

    To elaborate further, the below syntax works perfectly on an Aurora Provisioned Cluster but what do we need to do the same on Aurora Serverless?
    mysql> SELECT * FROM table_name INTO OUTFILE S3 's3-region://bucketname/sample.manifest' FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' OVERWRITE ON;

    Thanks!

    1. Hi Naren,

      The last time I checked Aurora Serverless doesn’t have support for S3 integration yet. However, the benefits of Aurora Serverless, IMO, outweigh the need for some workarounds. You can use DMS to replicate data to a non-serverless instance and export the data from there, as one possible solution.

      I know the Aurora team is working to add all sorts of great features, so it is likely just a matter of time before these things are built-in.

      – Jeremy

  16. Hi,
    I want to connect to Aurora Serverless using an SSH tunnel from my local machine via EC2. I’m using Windows 7 with Git Bash.
    Do I need to install the mysql CLI on EC2 in this case?
    Thank you.

  17. Jeremy, thanks for taking the time to write this – I am in the process of launching a web-based application and intend to use Aurora Serverless at least to get me started so I can right-size an instance later. I do have some questions around encryption and number of databases. I know I can encrypt my data at rest with a key, no brainer. Can I use multiple keys to encrypt different tables in the same DB? So each table can hold its own siloed data, encrypted with its designated key. I don’t think this is possible, but thought I’d ask.

    Assuming I cannot, is the ACU charge of Serverless per DB, or can I run say 1,000 DBs and ACUs scale as needed under one “serverless” account. This meets my encryption requirement, but I need to be sure it’s cost effective. If instead I am charged 1,000 x 2 ACUs that isn’t financially scalable.

    1. Hi Chris,

      Data in Aurora Serverless is encrypted at rest by default, but there is no way to use different KMS keys for different databases, at least not as far as I know.

      The ACU charge is per “cluster”, so it is possible to run hundreds of databases on a single cluster and share ACUs. However, the encryption keys are tied to the cluster, so you’d only be able to use the one key.

      Hope that helps,
      Jeremy

  18. Jeremy, excellent article. Thanks for writing such a detailed article focused on the practical aspects (which are hard to figure out for anybody).

    I have a quick question: when auto-scaling up or down, does it check the resource usage over some time window? E.g., CPU usage >70% for 5 mins, then go to the next level?

    1. Hi Lohith,

      There is no “cool down” period for scaling up, so your cluster will continue to scale until it satisfies the load. There is a 15 minute cool down period for scaling down, however. This is also based on whether or not it can find a scaling point that will not interrupt transactions and long running queries.

      https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless.how-it-works.html#aurora-serverless.how-it-works.auto-scaling

      Hope that helps,
      Jeremy

  19. Hi Jeremy,

    I am using an Aurora Serverless db with Lambda, and below is the exception I am getting on the first invocation when the db is stopped. However, it runs fine on the second invocation because the db has resumed by then. I am trying to connect using npm mysql and creating a pool connection.

    pool.getConnection(function(error, connection) {
      connection.query(/* ... */)
    })

    Exception is

    TypeError: Cannot read property 'query' of undefined
    at /var/task/main.js:18:16
    at Handshake.onConnect [as _callback] (/opt/nodejs/node_modules/mysql/lib/Pool.js:58:9)
    at Handshake.Sequence.end (/opt/nodejs/node_modules/mysql/lib/protocol/sequences/Sequence.js:88:24)
    at /opt/nodejs/node_modules/mysql/lib/protocol/Protocol.js:398:18
    at Array.forEach ()
    at /opt/nodejs/node_modules/mysql/lib/protocol/Protocol.js:397:13
    at _combinedTickCallback (internal/process/next_tick.js:131:7)
    at process._tickDomainCallback (internal/process/next_tick.js:218:9)

    Please help me with how I can solve this. I understand this is an issue with the cooldown period.

    Thanks,
    Ramesh

    1. Hi Ramesh,

      Sorry for the late reply. If you are letting the database go to sleep, then you will have a ~30 second start-up time. To avoid this in production, you can disable this feature. Also, it is not necessary to create a connection pool, as Lambda needs to create a new connection for every concurrent execution anyway. Take a look at Managing MySQL at Serverless Scale to see how to deal with this issue.

      Thanks,
      Jeremy

  20. Hey Jeremy, thank you for the great article, it’s an informative write-up! I’m considering migrating from an Aurora Provisioned cluster to Aurora Serverless. However, I’m not entirely sure about the claim that Aurora Serverless may rely on multiple nodes to handle requests (and thereby save costs).
    ” It also seems to indicate that it relies on multiple nodes to handle requests, which suggests that the service automatically provides high-availability and failover via multiple AZs.”

    The user guide provided by AWS specifies “An Aurora Provisioned cluster that is configured for fast failover recovers in approximately 60 seconds. Although Aurora Serverless does not support fast failover, it supports automatic multi-AZ failover. Failover for Aurora Serverless takes longer than for an Aurora Provisioned cluster. The Aurora Serverless failover time is currently undefined because it depends on demand and capacity availability in other AZs within the given AWS Region.” –https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless.how-it-works.html

    Since the failover takes longer for Aurora Serverless, and the fact that it is dependent on availability in other AZs, it may indicate that there aren’t other instances running to perform a fast failover, which makes Aurora Serverless significantly more expensive than running an Aurora Provisioned cluster, as pointed out in your article.

    1. Hi Mark,

      In order for fast failover to work with Aurora Provisioned, you need to have replicas running in another Availability Zone, which you are paying for. Aurora Serverless doesn’t require you to run multiple instances, so you are only paying for the ACUs being used in any given AZ at any given moment. I understand that the documentation states that the AZ failover time is “undefined” due to available space in other AZs, but my assumption is that it would be a matter of minutes before additional capacity was added to handle an AZ failure. If your use case requires a maximum of 60 seconds to failover to another AZ, then provisioned would likely be the way to go, but it would be much more expensive since you are running multiple instances.

      Also, as Aurora Serverless becomes more popular and there are more customers using it, the available capacity in different AZs will likely be able to handle a single AZ outage. AWS is very good at planning for these types of scenarios.

      – Jeremy

    1. This minimum of one ACU is only available for MySQL-compatible databases.

      Also, Aurora Serverless for PostgreSQL still isn’t even available in some regions.

      Thanks for the valuable post!

  21. Hi,
    Great article!
    What do you think is the best business scenario for Aurora Serverless?
    It is expensive in comparison to RIs for any sustained workload. We are using it for dev/testing, but we still can’t match it to any of our business cases except some low-traffic web sites. Maybe some web sites that require database transactions occasionally but would cache a lot of reads using Redis or similar (WordPress, …)

    1. Hi Goran,

      In environments with predictable sustained load, reserved instances might be cheaper (even when running a failover cluster). If you have peak load times, however, I really like the flexibility of Aurora Serverless. Scaling down to 2 ACUs during non-peak, and then back up to 32 or 64 to handle high-traffic scenarios, seems like a cost saving opportunity.

      Good luck,
      Jeremy

  22. Am I correct in assuming that if this runs in the same VPC as the Lambdas that connect to it, you do not have the ENI creation wait time?

    1. Hi Ian,

      Lambdas don’t actually run “in” your VPC. They need to create an ENI in order to connect, making connections to Aurora Serverless the same as connecting to any other resource in your VPC from Lambda.

      – Jeremy

  24. Just an update:
    We are using Aurora Serverless with MySQL in production, backing some PHP websites in N. Virginia. They don’t manage scaling of “local” storage properly, and we get the application error “SQLSTATE[HY000]: General error: 1030 Got error 28 from storage engine” often.

    We have an enterprise support plan. Someone from their support team manually goes and empties the local space on the instance in case of an incident. Their development team cannot provide an ETA for the fix.

    More info on local storage:
    – Aurora Local Storage
    This storage is used to store temporary tables, temporary files and logs (general logs, slow query logs, audit logs, error logs). Each ACU has a local SSD storage attached to it, and this local storage does not auto grow like the Aurora cluster volume. Instead, the amount of local storage is limited. The limit is based on the DB instance class/ACU in your DB cluster. In a serverless environment, the local storage increases as the ACU increases i.e. the Aurora cluster scales up.

  25. Hi,
    Thank you for a very informative article, AWS docs were never really good, not to mention for new services, so thank you.

    I have a question regarding that max_connections limit. We are building an ecommerce site, and although we will use caching a lot, we feel like 6,000 connections is not much. Amazon markets this as an enterprise DB, but isn’t 6,000 connections a serious limit for big websites? I mean, if you have a website with, let’s say, 500,000 monthly users, in peak hours you could easily reach 6,000 simultaneous users for 20-30 minutes?

    I also have a technical question, please: is the only way to connect to the db from outside the VPC through EC2? What is the “internet gateway” in a VPC used for?
    And does that mean that I can’t connect to Aurora Serverless from my app that runs on Google Compute Engine, for example?

    Thanks!
    Ziv

    1. Hi Ziv,

      6,000 connections would definitely be enough for large websites if you were using connection pooling or simply maintaining persistent connections. It would not be one connection per user. The problem with accessing it from something like Lambda, however, is that Lambda functions create a new connection for every “concurrent” execution, so you would need to use something like serverless-mysql or Data API to handle that situation.

      As far as VPCs are concerned, Aurora Serverless must be in a VPC, so you would need to use a VPN or some other proxy to route traffic from the Internet. “Internet Gateways” within VPCs, allow your VPC to access the Internet. By default, your VPC cannot access anything outside of your VPC.

      I hope that helps,
      Jeremy

    2. Hi Jeremy ,

      It certainly helps!

      Thanks again, really appreciate your time and efforts , writing these articles and answering questions!
      All the best.

  26. Do you know if it’s STILL required to connect to the serverless DBs from only a VPC? Wanted to start transitioning, but we would be coming from an external (non-AWS) VPS box, so guessing that’s not going to work.

  27. How does Aurora Serverless charge for storage?
    Also, what if I scale down the storage from 100 GB to, let’s say, 50 GB, how will the cost be calculated?
    Thanks

  28. Hello Jeremy,
    First of all, thanks a lot for the article. It is the best place to get information on Aurora Serverless.
    I have tried to use it, as it fits well with an occasional-usage scenario for us.
    One issue that I am facing: I am trying to load a table (truncate & load) from another app (a Spark app) using the Spark SQL JDBC option. But the data load is taking much, much longer compared to our current RDS (Microsoft SQL Server). I have a config of minimum 2 ACUs and max 4 ACUs for Aurora Serverless, and a db.m4.xlarge for our current RDS. In the current RDS the same data takes 10+ minutes to load into the table (same schema), but with Aurora Serverless it takes 3+ hours! When I look at the monitoring tab, I see around 20% CPU usage / 3-4 active connections, and it doesn’t scale to 4 ACUs but stays at 2 ACUs only.
    Have you done any benchmarking on bulk data writes to Aurora Serverless?

  29. Hi Jeremy,
    Thanks for the great article.
    Have you done any benchmarking for bulk data loads into Aurora Serverless?
    In my case I am trying to load data into it from a Spark cluster through JDBC and seeing significantly higher (10x more) data load times (truncate and load) compared to a SQL Server RDS.

    Thanks,
    Saroj

    1. Hi Saroj,

      I have not experimented with bulk loading data, but from your other comment, it seems like an issue. I will run some experiments and see what I find.

      Thanks,
      Jeremy

    2. Saroj, for the BULK INSERT, for better performance you’ll need multiple separate imports running… the “5x” speed of MySQL is when it’s doing stuff in parallel.

  30. Hi Jeremy,

    Thanks a lot for this very well written article. Quick question about the scaling: assuming I don’t have any long-running connections and Aurora will be able to find a scaling point (up or down), is there any time when the database is unavailable? Does scaling happen in the background, and is the database ready to accept connections at any time? Thanks!

    Best regards,
    Theodor

    1. Hi Theodor,

      The database should never be unavailable. Scaling points are only restricted by long running queries and operations. Generally the system will scale immediately when thresholds are met.

      – Jeremy

  31. Hi Jeremy,

    Thanks for the great article!

    One thing I don’t understand. How does the scaling happen? Why does Aurora need to find a scaling point? Is there any downtime / unavailability of the database while the actual scaling is happening?

    Thanks,
    Thomas

    1. Hi Thomas,

      The scaling point is only necessary if there are long running queries or operations that would be disrupted. Under most circumstances, the database will scale as soon as your CPU and/or connections reach the threshold. There is zero downtime even when there is a scaling operation.

      Hope that helps,
      Jeremy

  32. Thanks for the article, man! Impressed how you touched on all aspects of Aurora Serverless. I came here for pricing details but found lots of useful information. I didn’t know you could connect to an Aurora Serverless instance with an SSH tunnel, so a big thanks for that. If you search on StackOverflow, most answers declare connecting to an Aurora Serverless instance from outside the VPC impossible.

    While reading your replies in the comment section, your comment regarding ORMs caught my attention. Currently, I am using Sequelize, a node package built on top of the mysql2 node package, in my Lambda resolvers, which might be worth looking into. I totally agree that ORMs have limitations, but for apps that are just “rolling out” with no or a very small user base, you can develop much faster with ORMs. If the app gets a considerable user base, you can always rewrite the queries fine-tuned to your app, as the schema and queries will have become clearer by that time. In my opinion, ORMs and other such tools aren’t that pointless.

    It did help,
    Talha

  33. Hi Jeremy,

    Very helpful article, thanks for taking the time to write it!
    Did you observe significant cost savings in your production environment, compared to using reserved instances? If so, could you share with us, in percentage, how much you could save?

    Thank you!

    1. Hi Emmanuel,

      I’ve been using Aurora Serverless in a few production environments (mostly small workloads with some spiky traffic), but the cost has been much lower. I typically don’t buy reserved instances for database workloads, because every time I do, I end up needing to change the size of the instance. If I had consistent workloads, then it makes sense, but that is one of the benefits of Aurora Serverless: I don’t have to capacity plan (as much). The cost savings over on-demand has been significant in the long run, but you’d have to work the numbers to see if the RI pricing makes more sense for your use case.

      – Jeremy

  34. Hi Jeremy,
    Thanks for writing a very informative blog on AWS Aurora serverless.
    I have an issue restoring my database file (backup.sql) to Aurora Serverless from the MySQL CLI.
    I have one RDS (MySQL 5.7) stand-alone database. I took a backup of that database (mydbbckup.sql).
    I set up AWS Aurora Serverless and am now trying to import that backup file to Aurora with the MySQL CLI.
    I am getting an error. Can you please help me with this?

    Here is the error:

    ERROR 1709 (HY000) at line 182: Index column size too large. The maximum column size is 767 bytes.
    Then I checked line 182, and here it is below:
    CREATE TABLE pod_cast (
    id int(10) unsigned NOT NULL AUTO_INCREMENT,
    client_uuid varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
    creator_client_uuid varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
    user_id varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
    name varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
    description varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
    image varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
    objective text COLLATE utf8mb4_unicode_ci NOT NULL,
    video_url varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
    uuid varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
    due_date date DEFAULT NULL,
    type varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
    category varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'CXT',
    language_code varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
    published tinyint(3) unsigned NOT NULL DEFAULT '0',
    created_at timestamp NULL DEFAULT NULL,
    updated_at timestamp NULL DEFAULT NULL,
    deleted_at timestamp NULL DEFAULT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY surveys_uuid_unique (uuid)
    ) ENGINE=InnoDB AUTO_INCREMENT=131 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

  35. Thank you Jeremy, this is an amazing article!

    For your production workloads, do you notice increased latency from your app servers during a scaling event? I notice for about 5 minutes our APM reports significant (15x) latency during the scaling event. Basically, is this something that I should be trying to fix, or does everyone experience this?

    Thanks
    Ryan

    1. Hi Ryan,

      I think this is just the scaling behavior. 5 minutes seems a bit high, but I’m not sure what knobs you can tweak to fix this. I’ll investigate a bit more.

      – Jeremy

  36. Hi Jeremy,

    Thanks for the very helpful article.

    Do you know what happens to your indexes on aurora serverless?

    Let’s say you start with 2 GB of RAM and you add indexes that gobble up 4 GB. Does it drop some of the indexes? Or does it scale up?

    And what happens in the opposite case (removing indexes, which in turn frees up 4 GB, presumably)?

    Ty

    1. Hi Matt,

      Great question. In MySQL, your indexes are persisted to disk, and only loaded into memory when needed (and only the blocks that are needed). So I would think that if your queries begin to need more memory/CPU, then Aurora Serverless will scale up. I’m not sure how garbage collection works with Aurora, but it seems to me that eventually that memory would be freed, and the system would scale down again. I’ll send a note to the Aurora Serverless team to get their thoughts as well.

      Thanks,
      Jeremy

  37. Hi Jeremy,

    Thanks for the very helpful article.
    For our production workloads, we noticed increased latency during a scaling event.
    Has anyone experienced this behavior?
    Is this correct behavior during a scaling-up event?
    Thanks
    Salvatore

  38. Thanks for the great post. Question on the limits: “AWS VPN connections and inter-region VPC peering connections are not able to connect to Aurora Serverless clusters”. Is this still the case? I could not find this in the current documentation.

  39. Would definitely be interested in seeing the pricing examples with updated instance sizes (e.g. t3, r5 instead of t2, r4). Great article!

  40. Hi @jeremy, great article!
    We were building a solution for a real-time data processing project using Aurora Serverless, Lambda, Redshift…
    We are not able to decide on ACUs: when we use 32 GB it works fine, but with 16 GB it slows down a bit.
    However, the CPU utilization in both scenarios does not reach the maximum. Is there any way to perform some tests?
    Need guidance…

  41. This is a great article Jeremy. Really detailed and it is amazing that you modified it after AWS updated its new tool. I was looking for a way to compare “apple to apples” and your table is great for that purpose
    The only thing that has kept me thinking is that it would be fair to mention the sleep mechanism when we compare prices as this feature can reduce the costs considerably. What do you think about it?

    Thanks for this amazing article

    1. If you are using it for production, you should NOT use the sleep feature. It takes upwards of 30 seconds to wake up, which would not be good for production workloads. For development workloads, I would definitely recommend using the sleep feature because, yes, it would increase savings even more.

  42. Hi, great article Jeremy
    We use Serverless in production now and set the max ACU to 32. But I think it scales ACUs up not at 70% CPU, but at 50%.
    I checked my server and the max was only 56% CPU (avg 40% for more than 20 minutes) with only 200-300 db connections (at 16 ACUs), and then suddenly the ACUs scaled up to 32.
    It happened several times, so I think it scales up above 50% CPU utilization.
    What do you think?

    1. I’m not entirely sure, but they are probably constantly tweaking the thresholds. That is what I really like about v2 being so incremental, you don’t have to worry about scaling events that double capacity.

  43. Would have been awesome if you actually explained how you setup the DB instead of just saying that you did. That frigging VPC security nonsense holds a good part of the population from using this stuff and fills the stackoverflow boards.

    My take-away after a month of serverless AWS? If you want to reinvent the wheel and spend your time configuring amazon instead of writing code, sign up now.

    1. Hi Rian. My articles tend to get very long, so I try to keep the scope limited. There are lots of articles that explain how to set up a VPC and a database in AWS. I agree that it is more difficult than it needs to be.

  44. Hey Jeremy,

    Thanks for the article, got a lot of insights.

    I have a question: we can’t export our snapshots to S3 with Aurora Serverless. Is there any alternative way to achieve this export, or to download it somewhere?
