AWS re:Invent 2019 is a wrap, but now the real work begins! There are hundreds of session videos now available on YouTube. So when you have a few days (or weeks) of downtime, you can dig in to these amazing talks and learn about whatever AWS topics you fancy.
I was only able to attend a few talks this year, but one that I knew I couldn’t miss in person, was Rick Houlihan’s DAT403: Amazon DynamoDB deep dive: Advanced design patterns. At the last two re:Invents, he gave similar talks that explored how to use single-table designs in DynamoDB… and they blew my mind! 🤯 These videos were so mind-bending, that they inspired me to immerse myself in NoSQL design and write my How to switch from RDBMS to DynamoDB in 20 easy steps post. I was hoping to have a similar experience with this year’s edition, and I WAS NOT DISAPPOINTED.
As expected, it was a 60 minute firehose of #NoSQL knowledge bombs. There was A LOT to take away from this, so after the session, I wrote a Twitter thread that included some really interesting lessons that stuck out to me. The video has been posted, so definitely watch it (maybe like 10 times 🤷♂️), and use it to get started (or continue on) your DynamoDB journey.
Posted in Serverless on October 20, 2019 Last Updated: March 30, 2020
I’m a big fan of following the Single Responsibility Principle when creating Lambda functions in my serverless applications. The idea of each function doing “one thing well” allows you to easily separate discrete pieces of business logic into reusable components. In addition, the Lambda concurrency model, along with the ability to add fine-grained IAM permissions per function, gives you a tremendous amount of control over the security, scalability, and cost of each part of your application.
However, there are several drawbacks with this approach that often attract criticism. These include things like increased complexity, higher likelihood of cold starts, separation of log files, and the inability to easily compose functions. I think there is merit to these criticisms, but I have personally found the benefits to far outweigh any of the negatives. A little bit of googling should help you find ways to mitigate many of these concerns, but I want to focus on the one that seems to trip most people up: function composition.
I posted a thread on Twitter with some thoughts on how to how to switch from RDBMS to DynamoDB. Some people have asked me to turn it into a blog post to make it easier to follow. So here it is… with some bonus steps at the end. Enjoy! 😁
I've been spending a lot of time lately with @dynamodb in my #serverless applications, so I thought I'd share my surefire guide to migrating to it from #RDBMS. So here is…
How to switch from RDBMS to #DynamoDB in *20* easy steps… (a thread)
Developing and testing serverless applications locally can be a challenge. Even with tools like SAM and the Serverless Framework, you often end up mocking your cloud resources, or resorting to tricks (like using pseudo-variables) to build ARNs and service endpoint URLs manually. While these workarounds may have the desired result, they also complicate our configuration files with (potentially brittle) user-constructed strings, which duplicates information already available to CloudFormation.
This is a common problem for me and other serverless developers I know. So I decided to come up with a solution.
In the serverless world, we often get the impression that our applications can scale without limits. With the right design (and enough money), this is theoretically possible. But in reality, many components of our serverless applications DO have limits. Whether these are physical limits, like network throughput or CPU capacity, or soft limits, like AWS Account Limits or third-party API quotas, our serverless applications still need to be able to handle periods of high load. And more importantly, our end users should experience minimal, if any, negative effects when we reach these thresholds.
There are many ways to add resiliency to our serverless applications, but this post is going to focus on dealing specifically with quotas in third-party APIs. We’ll look at how we can use a combination of SQS, CloudWatch Events, and Lambda functions to implement a precisely controlled throttling system. We’ll also discuss how you can implement (almost) guaranteed ordering, state management (for multi-tiered quotas), and how to plan for failure. Let’s get started!
An extremely useful AWS serverless microservice pattern is to distribute an event to one or more SQS queues using SNS. This gives us the ability to use multiple SQS queues to “buffer” events so that we can throttle queue processing to alleviate pressure on downstream resources. For example, if we have an event that needs to write information to a relational database AND trigger another process that needs to call a third-party API, this pattern would be a great fit.
This is a variation of the Distributed Trigger Pattern, but in this example, the SNS topic AND the SQS queues are contained within a single microservice. It is certainly possible to subscribe other microservices to this SNS topic as well, but we’ll stick with intra-service subscriptions for now. The diagram below represents a high-level view of how we might trigger an SNS topic (API Gateway → Lambda → SNS), with SNS then distributing the message to the SQS queues. Let’s call it the Distributed Queue Pattern.
This post assumes you know the basics of setting up a serverless application, and will focus on just the SNS topic subscriptions, permissions, and implementation best practices. Let’s get started!
Posted in Serverless on December 21, 2018 Last Updated: March 30, 2020
I’ve been building serverless applications since AWS Lambda went GA in early 2015. I’m not saying that makes me an expert on the subject, but as I’ve watched the ecosystem mature and the community expand, I have formed some opinions around what it means exactly to be “serverless.” I often see tweets or articles that talk about serverless in a way that’s, let’s say, incompatible with my interpretation. This sometimes makes my blood boil, because I believe that “serverless” isn’t a buzzword, and that it actually stands for something important.
I’m sure that many people believe that this is just a semantic argument, but I disagree. When we refer to something as being “serverless”, there should be an agreed upon understanding of not only what that means, but also what it empowers you to do. If we continue to let marketers hijack the term, then it will become a buzzword with absolutely no discernible meaning whatsoever. In this post, we’ll look at how some leaders in the serverless space have defined it, I’ll add some of my thoughts, and then offer my own definition at the end.
Our serverless applications become a lot more interesting when they interact with third-party APIs like Twilio, SendGrid, Twitter, MailChimp, Stripe, IBM Watson and others. Most of these APIs respond relatively quickly (within a few hundred milliseconds or so), allowing us to include them in the execution of synchronous workflows (like our own API calls). Sometimes we run these calls asynchronously as background tasks completely disconnected from any type of front end user experience.
Regardless how they’re executed, the Lambda functions calling them need to stay running while they wait for a response. Unfortunately, Step Functions don’t have a way to create HTTP requests and wait for a response. And even if they did, you’d at least have to pay for the cost of the transition, which can get a bit expensive at scale. This may not seem like a big deal on the surface, but depending on your memory configuration, the cost can really start to add up.
In this post we’ll look at the impact of memory configuration on the performance of remote API calls, run a cost analysis, and explore ways to optimize our Lambda functions to minimize cost and execution time when dealing with third-party APIs.
Posted in Serverless on December 3, 2018 Last Updated: March 30, 2020
Last week I spent six incredibly exhausting days in Las Vegas at the AWS re:Invent conference. More than 50,000 developers, partners, customers, and cloud enthusiasts came together to experience this annual event that continues to grow year after year. This was my first time attending, and while I wasn’t quite sure what to expect, I left with not just the feeling that I got my money’s worth, but that AWS is doing everything in their power to help customers like me succeed.
Posted in Serverless on November 21, 2018 Last Updated: April 22, 2020
Update June 5, 2019: The Data API team has released another update that adds improvements to the JSON serialization of the responses. Any unused type fields will be removed, which makes the response size 80+% smaller.
Update June 4, 2019: After playing around with the updated Data API, I found myself writing a few wrappers to handle parameter formation, transaction management, and response formatting. I ended up writing a full-blown client library for it. I call it the “Data API Client“, and it’s available now on GitHub and NPM.
Update May 31, 2019: AWS has released an updated version of the Data API (see here). There have been a number of improvements (especially to the speed, security, and transaction handling). I’ve updated this post to reflect the new changes/improvements.
On Tuesday, November 20, 2018, AWS announced the release of the new Aurora Serverless Data API. This has been a long awaited feature and has been at the top of many a person’s #awswishlist. As you can imagine, there was quite a bit of fanfare over this on Twitter.
Ya'll been waiting eagerly for this one! Access an Aurora Serverless database via HTTPS API: https://t.co/OUkqA8U3Y1 still in beta, but will change the way you think about #awsLambda and relational databases! #serverless
Obviously, I too was excited. The prospect of not needing to use VPCs with Lambda functions to access an RDS database is pretty compelling. Think about all those cold start savings. Plus, connection management with serverless and RDBMS has been quite tricky. I even wrote an NPM package to help deal with the max_connections issue and the inevitable zombies 🧟♂️ roaming around your RDS cluster. So AWS’s RDS via HTTP seems like the perfect solution, right? Well, not so fast. 😞 (Update May 31, 2019: There have been a ton of improvements, so read the full post.)