Aurora Serverless: The Good, the Bad and the Scalable

Amazon announced the General Availability of Aurora Serverless on August 9, 2018. I have been playing around with the preview of Aurora Serverless for a few months, and I must say that overall, I’m very impressed. There are A LOT of limitations with this first release, but I believe that Amazon will do what Amazon does best, and keep iterating until this thing is rock solid.

The announcement gives a great overview and the official User Guide is chock full of interesting and useful information, so I definitely suggest giving those a read. In this post, I want to dive a little bit deeper and discuss the pros and cons of Aurora Serverless. I also want to dig into some of the technical details, pricing comparisons, and look more closely at the limitations.

What is Aurora Serverless?

Let’s start with Aurora. Aurora is Amazon’s “MySQL and PostgreSQL compatible relational database built for the cloud, that combines the performance and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases.” If you are using MySQL or PostgreSQL, you can basically migrate to Aurora without any changes to your application code. It’s fully-managed, ridiculously fast (up to 5x faster than MySQL), and can scale using shared-storage read replicas and multi availability zone deployments. This extra power (and convenience) comes with an added cost (~23% more per hour), but I’ve found it to be well worth it.

Now let’s talk about the “serverless” part. Yes, serverless has servers (I’m getting sick of typing this 🤦🏻‍♂️), but the general idea is that for something to be “serverless”, it should:

  • Charge you only for what you use
  • Require zero server maintenance
  • Provide continuous scaling
  • Support built-in high availability and fault tolerance

Aurora Serverless provides an on demand, auto-scaling, high-availability relational database that only charges you when it’s in use. While it may not be perfect, I do think that this fits the definition of serverless. All you need to do is configure a cluster, and then all the maintenance, patching, backups, replication, and scaling are handled automatically for you. It almost sounds too good to be true, so let’s jump in and investigate.

Setting up your Aurora Serverless cluster

Setting up an Aurora Serverless cluster is fairly simple. Note that you can only create a cluster in a VPC.

1. Click “Create Database” from the main RDS Console screen or from within “Instances” or “Clusters”

2. Select “Amazon Aurora” and the “MySQL 5.6-compatible” edition (Aurora Serverless only supports this edition) and click “Next”

3. Select “Serverless” as the DB engine, enter the cluster settings including the identifier, master username and password, then click “Next”

4. Select your “Capacity settings” and your “Additional scaling configuration” to control auto-pause settings

5. Configure “Network & Security” by selecting (or creating) a VPC and subnet group as well as choosing (or creating) VPC security groups

6. Expand “Additional configuration” and confirm your parameter group, backup retention setting and your encryption key and then click “Create database”

Note that data at rest in Aurora Serverless clusters appears to be automatically encrypted using KMS. There is no option to disable it.
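If you prefer scripting the setup, the same cluster can be created with the AWS CLI. This is just a sketch — the identifier, credentials, and scaling values below are placeholders for your own:

```shell
# Create an Aurora Serverless cluster (note the "serverless" engine mode).
# Identifier, credentials, and scaling values are placeholders.
aws rds create-db-cluster \
  --db-cluster-identifier test-serverless-db \
  --engine aurora \
  --engine-mode serverless \
  --master-username root \
  --master-user-password YOUR_PASSWORD \
  --scaling-configuration MinCapacity=2,MaxCapacity=16,AutoPause=true,SecondsUntilAutoPause=300
```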

It took just over 2 minutes to create the cluster for me. Once it is available, you can view the cluster information by clicking on “Clusters” in the left menu and then clicking on your cluster name in the list.

This dashboard is quite robust, with lots of CloudWatch metrics (including the ability to compare clusters) and detailed information about your cluster configuration. From here you can also grab your “Database endpoint”.

Aurora Serverless Cluster Details

Connecting to your Aurora Serverless Cluster

From a connection standpoint, your Aurora Serverless cluster is very similar to a regular Aurora cluster. Any VPC resource that has access to your cluster (based on your chosen security groups) can connect to your cluster on port 3306. If you have an existing EC2 instance (with proper security group access and the mysql CLI tools installed), you can connect from your terminal using the standard:

mysql -u root -h test-serverless-db.cluster-XXXXXXXXXX.us-east-1.rds.amazonaws.com -p

If you want to connect to your cluster from your local machine, you can either connect through a VPN, or set up an SSH tunnel. You can configure an SSH tunnel by adding the following to your ~/.ssh/config file:
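For example, an entry along these lines (the host name, key path, and cluster endpoint are placeholders for your own values):

```
Host tunnel
  HostName ec2-XX-XX-XX-XX.compute-1.amazonaws.com
  User ec2-user
  IdentityFile ~/.ssh/my-key.pem
  LocalForward 3306 test-serverless-db.cluster-XXXXXXXXXX.us-east-1.rds.amazonaws.com:3306
```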

Then simply run ssh -Nv tunnel in your terminal. From another terminal window run:

mysql -u root -h 127.0.0.1 -p

And that should connect you. I wrote a detailed post about how to do this with Elasticsearch in a VPC, which should give you more information if you run into issues with the setup above.

Your applications would connect just like they would to any other MySQL database. I was able to easily add my new cluster to phpMyAdmin and connect from Lambda using the mysql node package.
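For a sense of what the Lambda side looks like, here's a minimal sketch using the mysql package. The endpoint, credentials, and database name are placeholders, and in practice you'd pull them from environment variables:

```javascript
// Minimal sketch of querying Aurora Serverless from Lambda
// using the mysql package. Endpoint/credentials are placeholders.
const mysql = require('mysql');

exports.handler = (event, context, callback) => {
  const connection = mysql.createConnection({
    host: 'test-serverless-db.cluster-XXXXXXXXXX.us-east-1.rds.amazonaws.com',
    user: 'root',
    password: process.env.DB_PASSWORD,
    database: 'mydb'
  });

  connection.query('SELECT 1 AS ok', (err, results) => {
    connection.end(); // release the connection so the container can freeze cleanly
    if (err) return callback(err);
    callback(null, results);
  });
};
```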

Limitations of Aurora Serverless

As I mentioned earlier, there are A LOT of limitations to Aurora Serverless. However, depending on your use case, most of these probably won’t matter much and will have little to no impact on your application.

A few top level limitations:

  • Aurora Serverless is only compatible with MySQL 5.6
  • The port number for connections must be 3306
  • Aurora Serverless DB clusters can only be accessed from within a VPC
  • AWS VPN connections and inter-region VPC peering connections are not able to connect to Aurora Serverless clusters

You are also extremely limited with your ability to modify cluster-level parameters in custom Cluster Parameter groups. As of now, you can only modify the following cluster-level parameters:

  • character_set_server
  • collation_server
  • lc_time_names
  • lower_case_table_names
  • time_zone

Modifying other cluster-level parameters would have no effect since Aurora Serverless uses the default values. Instance-level parameters are not supported either. Noticeably absent here is the max_connections parameter. See Max Connections below for more information.

There are also several other features that Aurora Serverless doesn’t currently support. Some of these features are quite useful when using provisioned Aurora, so these might be a dealbreaker if you are relying on any of the following:

  • Loading data from an Amazon S3 bucket
    This is a great feature for bulk loading data into your MySQL database, especially if you’re using Aurora for reporting. Aurora Serverless doesn’t have support yet.
  • Invoking an AWS Lambda function with an Aurora MySQL native function
    This is one of Aurora’s greatest features, essentially giving you the ability to create triggers based on actions in your MySQL database. Comparable to DynamoDB streams.
  • Advanced auditing
    Useful for some applications, but something I can live without.
  • Aurora Replicas
    This is another great feature of Aurora that scales reads by creating new instances that access the same shared cluster volume. I’m sure Amazon thought this through, but I wonder how auto-scaling of Aurora Serverless compares.
  • Backtrack
    Another unique feature of Aurora is the ability to rewind and fast-forward transactions in your cluster. This isn’t available in Aurora Serverless.
  • Database cloning
    If you are using database cloning (especially for things like chaos engineering), you’re out of luck with Aurora Serverless.
  • IAM database authentication
    This is a nice and secure way to protect security credentials, but since Aurora Serverless doesn’t have public endpoints, it probably isn’t a huge deal.
  • Cross-region read replicas
    Helpful for disaster recovery, but I think it is unnecessary given the built-in high availability and distributed nature of Aurora Serverless.
  • Restoring a snapshot from a MySQL DB instance
    Not a huge deal if you are already using Aurora. If you are migrating from MySQL, you could always migrate to an Aurora provisioned cluster and then to Serverless.
  • Migrating backup files from Amazon S3
    Same as above. Useful, but there are workarounds.
  • Connecting to a DB cluster with Secure Socket Layer (SSL)
    Again, useful for security, but not necessary if using username/password to connect.

If none of these limitations affect you, then Aurora Serverless might be right for your use case. Let’s look at the cost comparison between a provisioned and a serverless cluster.

Cost comparisons

Both provisioned and serverless versions of Aurora charge you for storage, I/O, and data transfer. Storage and I/O are flat rates per corresponding unit of measure:

Storage Rate $0.10 per GB-month
I/O Rate $0.20 per 1 million requests

Data transfer rates from Amazon RDS to the Internet shouldn’t apply to Aurora Serverless since it can’t be accessed directly from the Internet. However, depending on where data is being transferred to within AWS, data transfer fees may apply. Below is a small sample of the pricing for data transfer.

Data Transfer OUT From Amazon RDS To
CloudFront $0.00 per GB
US East (N. Virginia) $0.01 per GB
Asia Pacific (Mumbai) $0.02 per GB
Asia Pacific (Sydney) $0.02 per GB
EU (London) $0.02 per GB
Complete pricing https://aws.amazon.com/rds/aurora/pricing/

NOTE: Data transferred between Amazon RDS and Amazon EC2 Instances in the same Availability Zone is free.

The real difference in pricing is based on the way Aurora Serverless scales and how it charges for usage. Provisioned Aurora instances utilize per hour billing based on instance size, like EC2 and other virtual instances. Aurora Serverless, on the other hand, uses ACUs (or Aurora Capacity Units) to measure database capacity. Each ACU has approximately 2 GB of memory with corresponding CPU and networking resources that are similar to provisioned instances of Aurora.

ACUs are billed by the second at a flat rate of $0.06 per hour. Even though you are only billed per second of usage, there is a minimum of 5 minutes billed each time the database is active. There is also a minimum provisioning of 2 ACUs (with 4 GB of memory). You can scale all the way up to 256 ACUs with approximately 488 GB of memory. If you were to keep an Aurora Serverless database running 24 hours per day at 2 ACUs, it would cost you $2.88 ($0.06 * 2 * 24) per day (or roughly $86 per month).
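The arithmetic above is easy to generalize. A quick sketch (the $0.06 rate comes from the pricing above; this ignores the 5-minute billing minimum and the separate storage/I/O charges):

```javascript
// ACUs are billed at a flat $0.06 per ACU-hour, metered per second.
// Sketch only -- ignores the 5-minute minimum and storage/I/O charges.
const ACU_HOURLY_RATE = 0.06;

function serverlessComputeCost(acus, hours) {
  return ACU_HOURLY_RATE * acus * hours;
}

// e.g. 2 ACUs running 24 hours a day comes to about $2.88
serverlessComputeCost(2, 24);
```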

In order to compare apples to apples, I’ve created a chart below that makes some assumptions based on corresponding memory to see the difference in cost between provisioned and serverless. The instance prices below reflect on-demand pricing, which is obviously much higher than reserved instance pricing.

ACUs Memory (GB) Serverless/hr Instance Type Cost/hr Diff/hour Diff/month
2 4 $0.12 db.t2.medium $0.082 $0.038 $27.36
8 16 $0.48 db.r4.large $0.29 $0.190 $136.80
16 32 $0.96 db.r4.xlarge $0.58 $0.38 $273.60
32 64 $1.92 db.r4.2xlarge $1.16 $0.76 $547.20
64 122 $3.84 db.r4.4xlarge $2.32 $1.52 $1,094.40
128 244 $7.68 db.r4.8xlarge $4.64 $3.04 $2,188.80
256 488 $15.36 db.r4.16xlarge $9.28 $6.08 $4,377.60

As you can see, for sustained loads, the pricing for Aurora Serverless quickly becomes extremely expensive as compared to a single provisioned Aurora instance. However, according to the announcement, Aurora Serverless “creates an Aurora storage volume replicated across multiple AZs.” It also seems to indicate that it relies on multiple nodes to handle requests, which suggests that the service automatically provides high-availability and failover via multiple AZs. If you follow AWS’s recommendation to place “at least one Replica in a different Availability Zone from the Primary instance” to maximize availability, then the cost in the above chart would double for on-demand instances. This is a fairer comparison given the inherent high-availability nature of Aurora Serverless. The chart below assumes that each instance has a replica in a different availability zone.

ACUs Memory (GB) Serverless/hr Instance Type Cost/hr Diff/hour Diff/month
2 4 $0.12 db.t2.medium $0.164 ($0.04) ($31.68)
8 16 $0.48 db.r4.large $0.58 ($0.10) ($72.00)
16 32 $0.96 db.r4.xlarge $1.16 ($0.20) ($144.00)
32 64 $1.92 db.r4.2xlarge $2.32 ($0.40) ($288.00)
64 122 $3.84 db.r4.4xlarge $4.64 ($0.80) ($576.00)
128 244 $7.68 db.r4.8xlarge $9.28 ($1.60) ($1,152.00)
256 488 $15.36 db.r4.16xlarge $18.56 ($3.20) ($2,304.00)

From this we can see significant cost savings, even if you maintained comparable capacity for 24 hours each day. Of course, the idea behind Aurora Serverless is to assume unpredictable workloads. If your application doesn’t have extremely long periods of sustained load, only paying for occasional spikes in traffic would actually be significantly cheaper.

With regards to reserved instances, obviously there is a significant price difference. However, I always find it hard to plan database capacity (especially a year out), so buying reserved instances is always a crap shoot (for me anyway). However, the ability to buy “Reserved ACUs” would be a really interesting concept. That way I could prepay for HOURS of capacity at a discounted rate. Something to think about, Amazon. 😉

Autoscaling Aurora Serverless

A cornerstone of serverless architectures is the ability to provide continuous scaling. Aurora Serverless is designed to scale up based on the current load generated by your application. Your cluster will automatically scale ACUs up if either of the following conditions is met:

  • CPU utilization is above 70% OR
  • More than 90% of connections are being used

Your cluster will automatically scale down if both of the following conditions are met:

  • CPU utilization drops below 30% AND
  • Less than 40% of connections are being used

There is also a 3 minute cooldown period after a scale up operation occurs. I haven’t fully tested this, but it seems to restrict the system from autoscaling more than once every 3 minutes. Similarly, there is a cooldown period of 15 minutes for scale down operations, meaning that your scaled capacity will be maintained for at least 15 minutes after a scale-up operation. This makes sense to avoid scaling down too quickly.
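The rules above boil down to two simple predicates. A sketch (thresholds are from the documented conditions; the inputs are fractions between 0 and 1, and the function names are my own):

```javascript
// The documented Aurora Serverless scaling conditions as predicates.
// Inputs are fractions (0.70 = 70% CPU, 0.90 = 90% of connections used).

function shouldScaleUp(cpuUtilization, connectionsUsedPct) {
  // Scale up if EITHER condition is met
  return cpuUtilization > 0.70 || connectionsUsedPct > 0.90;
}

function shouldScaleDown(cpuUtilization, connectionsUsedPct) {
  // Scale down only if BOTH conditions are met
  return cpuUtilization < 0.30 && connectionsUsedPct < 0.40;
}
```

Remember that the 3-minute (up) and 15-minute (down) cooldowns gate these checks, so meeting a condition doesn't trigger an immediate scaling operation.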

As we saw in the setup, it’s also possible to have your cluster automatically pause itself after a period of inactivity. If you are using your Aurora Serverless cluster in a production environment, I strongly suggest disabling this feature. It takes about 30 seconds to “unpause” a database, which is much too long for client-facing applications.

Aurora Serverless also introduces the concept of “scaling points”, which refer to a point in time at which the database can safely initiate a scaling operation. The documentation specifies long-running queries, transactions, temporary tables, and table locks as reasons why it might not be able to scale. Aurora Serverless will try to scale the cluster five times before cancelling the operation.

Manually Setting the Capacity

An extremely handy feature of Aurora Serverless is the ability to manually provision capacity for your cluster. This can be done via the console, AWS CLI, or the RDS API. If you are anticipating an increase in transactions, whether by detecting some type of leading indicator or preparing for a large batch operation, manually setting the ACUs allows you to quickly scale your cluster to handle the extra workload. Note that setting the capacity manually might drop existing connections if they prevent scaling operations. Once capacity is manually scaled, the same cooldown periods apply for autoscaling the cluster up and down.

Manually scaling capacity in the Console

Details for modifying the capacity manually can be found in the API documentation here. The AWS Node.js SDK already provides support for manual scaling as well.
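As a sketch, setting the capacity with the AWS CLI looks like this (the cluster identifier is a placeholder):

```shell
# Manually set the cluster to 16 ACUs. Existing connections may be
# dropped if they block the scaling operation.
aws rds modify-current-db-cluster-capacity \
  --db-cluster-identifier test-serverless-db \
  --capacity 16
```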

Max Connections

A major limitation of relational databases in serverless architectures is the maximum number of concurrent connections allowed by the database engine. While FaaS services like Lambda may scale infinitely (in theory anyway), massive spikes in volume can quickly saturate the number of available connections to the underlying database. There are ways to manage connections in serverless environments, but even with Aurora Serverless, this still appears to be a possible limiting factor.

AWS uses the following formula (where log is base 2) for generating the max_connections value for Aurora instances:

log( ( <Instance Memory> * 1073741824) / 8187281408 ) * 1000 = <Default Max Connections>

A db.r4.xlarge instance with 30.5 GB of memory, for example, would have a default max_connections value of 2,000.

log( (30.5 * 1073741824) / 8187281408 ) * 1000 = 2000

You can see a list of default max_connections settings for provisioned Aurora instances here.
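Working backwards from the documented values, the log in the formula appears to be base 2. A quick sketch to verify:

```javascript
// Default max_connections formula for provisioned Aurora, taking
// log as base 2 (which reproduces the documented values).
function defaultMaxConnections(memoryGB) {
  return Math.log2((memoryGB * 1073741824) / 8187281408) * 1000;
}

// db.r4.xlarge has 30.5 GB of memory: (30.5 * 2^30) / 8187281408 = 4,
// and log2(4) * 1000 = 2000
defaultMaxConnections(30.5);
```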

Aurora Serverless uses a similar formula based on memory allocation. I manually scaled a test cluster and ran select @@max_connections; after each operation to retrieve the actual value. The following chart outlines my results.

ACUs Memory (in GB) Max Connections
2 4 90
4 8 135
8 16 1,000
16 32 2,000
32 64 3,000
64 122 4,000
128 244 5,000
256 488 6,000

If you compare this to the chart of provisioned instance values, you’ll notice that they are essentially identical to their corresponding instance types. However, with provisioned instances, you can manually change the max_connections value and squeeze a few more connections out of it. Aurora Serverless does not allow you to modify this value.

Provisioned clusters also allow for read replicas that can scale the number of available connections as well. Aurora Serverless does not allow for these.

Time to Scale

I ran a few tests to see how quickly the cluster would actually scale. Rather than simulating load (which I assume would have similar results), I manually scaled the cluster and measured the time it took for the max_connections value to change. I’m assuming that at that point, the new capacity was available for use. I also measured the time it took for the cluster status to change from “scaling-capacity” to “available”.

ACUs Time to Capacity Time to Completion
Up to 4 0:48 2:15
Up to 8 1:30 3:00
Up to 16 0:45 1:37
Up to 32 0:50 1:45
Up to 64 1:00 1:40
Up to 128 1:25 2:05
Up to 256 2:30 3:45
Down to 2 0:35 2:21

My tests showed most of the scaling operations taking less than a minute. Assuming similar performance for autoscaling operations, this would seem to be adequate for handling steadily increasing or potentially even sudden traffic bursts.

Monitoring

CloudWatch provides all the same metrics for Aurora Serverless that it does for provisioned clusters. In addition, CloudWatch allows you to monitor the capacity allocated to your cluster using the ServerlessDatabaseCapacity metric. This would make for a great alarm.
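For example, here's a sketch of an alarm that fires when the cluster sits at its maximum capacity (the alarm name, cluster identifier, threshold, and SNS topic ARN are all placeholders):

```shell
# Alert when the cluster has been pinned at 16 ACUs for 15 minutes.
aws cloudwatch put-metric-alarm \
  --alarm-name aurora-serverless-max-capacity \
  --namespace AWS/RDS \
  --metric-name ServerlessDatabaseCapacity \
  --dimensions Name=DBClusterIdentifier,Value=test-serverless-db \
  --statistic Maximum \
  --period 300 \
  --evaluation-periods 3 \
  --threshold 16 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:my-alerts-topic
```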

ServerlessDatabaseCapacity CloudWatch Metric

Final Thoughts

So far I’m very impressed by Aurora Serverless and the new capabilities it introduces. I also think that most of the limitations, such as IAM authentication, S3 integration, and Lambda triggers will eventually make their way into the platform. But even the lack of those features (which didn’t even exist until AWS introduced them with Aurora), isn’t enough to outweigh the tremendous value that Aurora Serverless provides. If you are a PostgreSQL shop, or you really need MySQL 5.7, then you might need to wait, but otherwise, this could be the new way to do relational databases at scale.

I need to run some additional tests to see how well it plays with Lambda, especially with regards to the connection limitation, but after playing with this for a few months, I’m inclined to start using it in production. I already have several clusters that are over provisioned to handle periodic traffic spikes. If this scales as expected, it very well could be the silver bullet I’ve been looking for. 🤔

Are you planning on using Aurora Serverless? Let me know in the comments and describe the problem you plan on solving with it.




12 thoughts on “Aurora Serverless: The Good, the Bad and the Scalable”

  1. Excellent article, to add a note on the approach to connect to the database you could sshuttle as a wrapper of the SSH Tunnel.

    Terminal 1:
    $ sshuttle --dns -r username@jumphost 192.168.1.0/24
    Connected…

    Terminal 2:
    $ mysql -h aurora-serverless -u dbuser -p -P 3306
    Enter password:
    mysql>

  2. Excellent article! A question I have is: how does it help lambda in light of your previous articles? Do we still need to manage a connection pool for lambda functions? or Aurora Serverless connects so fast and renders the connection pool unnecessary? – Thanks.

    1. Hi Ken,

      Unfortunately, Aurora Serverless doesn’t solve the max_connections issue yet. It’s still possible that zombie connections or high concurrency could use all the connections. Aurora Serverless should definitely keep scaling to mitigate it, but there’s no guarantee it will be fast enough. I’m working on a little something to better handle this myself that I hope to share soon.

      Interestingly, there is a session called “SRV301 – Best Practices for Using AWS Lambda with RDS/RDBMS Solutions” at this year’s re:Invent. I plan on attending to get some more insight myself.

      – Jeremy

  3. Jeremy,
    Thanks for your answer. That is what I thought also. The Lambda with RDS is indeed interesting. Along the same lines, do you use ORM in your Lambada function? I am wondering what are your thoughts about Lambda with ORM?
    – Thanks.

    1. I hate ORMs with a passion. I have a project that I inherited that is using Doctrine ORM with Symfony and the amount of unnecessary queries it runs is ridiculous.

      A key component to keeping serverless functions fast and efficient is to minimize dependencies as much as possible. I doubt your Lambda function needs the full power of an ORM to run a few single purpose RDS queries. I just don’t see the benefit to abstracting simple T-SQL queries within serverless environments.

      – Jeremy

  4. I totally agree with you. I feels that ORM is not a good fit for microservices either, even not combined with serverless. I know you work with microservices also. What are your thoughts on microservices with ORM?

    1. Same feeling. I really dislike how much control you give to ORMs. Oftentimes a simple join can accomplish what takes an ORM 5+ queries to do. I avoid them at all costs. I’m unlikely to change my underlying RDBS, so the flexibility it provides in that regard is essentially useless to me.

  5. Glad on the same page as you in this regard. Changing database is always a big effort in reality. I doubt ORM can help a whole lot, unless it is a simple simple CRUD application.

  6. I’d like to say thank you because your articles are so clear and very useful!!
    After I’ve read this I’m really keen to test Aurora Serverless but I’m just wondering…
    Lambda service uses an elastic network interface (ENI) to access resources inside of a private VPC.
    A cold start has to create and attach the ENI to the lambda container.
    This seems to be a very slow process.
    I’m quite scared about that because could have a huge bad impact, especially in development environment context.
    Can I ask you what do you think about that?
    Thank you so much and sorry for my English 🙂

    1. Hi Diego,

      I’m glad you found the article helpful. You’re right that Lambdas inside a VPC need to use ENIs and that does have an impact on cold start times. However, VPC functions stay warm for at least 15 minutes (and much longer if in use), so once you get past the cold start (which usually takes less than a few seconds), subsequent requests will be very fast. Also, Lambda functions smartly reuse ENIs, so a new ENI doesn’t necessarily need to be created on each cold start. I use Aurora heavily in production and I find cold starts to have a very minimal effect on overall performance.

      Hope that helps,
      Jeremy
