I posted a thread on Twitter with some thoughts on how to switch from RDBMS to DynamoDB. Some people have asked me to turn it into a blog post to make it easier to follow. So here it is… with some bonus steps at the end. Enjoy! 😁
I've been spending a lot of time lately with @dynamodb in my #serverless applications, so I thought I'd share my surefire guide to migrating to it from #RDBMS. So here is…
How to switch from RDBMS to #DynamoDB in *20* easy steps… (a thread)
— Jeremy Daly (@jeremy_daly) June 7, 2019
STEP 1: Accept the fact that Amazon.com can fit 90% of their retail site/system’s workloads into DynamoDB, so you probably can too. 🤔
STEP 2: Create an Entity-Relationship Model, just like you would if you were designing a traditional relational database. 👩‍💻
STEP 3: Create a list of ALL your access patterns. If “search” is an access pattern, don’t worry, we’ll deal with that in STEP 17. 😉
STEP 4: Narrow down your access patterns to the ones *required* by app users. Data analysis and admin tasks, such as analytics, aggregations, behavioral triggers, etc., are all important, but likely not necessary for the end user interacting with your app in real-time. 🚫📊
STEP 5: Determine if your user access patterns require ad hoc queries that need to reshape the data. The answer is likely no. However, if you’re building an OLAP application, NoSQL is not a good choice. Pat yourself on the back for trying, and use another technology. 🤷‍♂️
STEP 6: Put your head in the microwave for 3 seconds (or however long is necessary for you to forget what data normalization and third normal form are). 🤤
STEP 7: Watch Rick Houlihan’s Advanced Design Patterns for Amazon DynamoDB (DAT403-R) talk from AWS re:Invent 2017. 😯
STEP 8: Watch Rick Houlihan’s Amazon DynamoDB Deep Dive: Advanced Design Patterns for DynamoDB (DAT401) talk from AWS re:Invent 2018. 😮
STEP 9: Clean up all the tiny pieces of your brain splattered around the room, then regroup, and watch them again, this time at half speed. Take notes. 🤯
STEP 10: Read the “Best Practices for DynamoDB” guide on the AWS site. Then read it again. 🤓
STEP 11: Design *ONE* DynamoDB table that uses overloaded indexes to store all of your entities using composite Sort Keys (when necessary), adding additional LSIs and GSIs (again, when necessary) to accommodate the aforementioned access patterns. 😳 (this will make more sense once you go through the previous steps)
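To make the single-table idea concrete, here’s a minimal sketch of what overloaded, composite keys can look like. The entities, attributes, and key formats below are hypothetical examples, not a prescription; the point is that one table holds every entity type, and the meaning of the partition key (PK) and sort key (SK) changes per item.

```python
# A sketch of single-table design: one table, overloaded PK/SK.
# Entity names, key formats, and attributes are illustrative only.
items = [
    # A customer item: PK identifies the customer, SK is a static marker.
    {"PK": "CUSTOMER#123", "SK": "PROFILE", "name": "Alice"},
    # Orders share the customer's PK; a composite SK sorts them by date.
    {"PK": "CUSTOMER#123", "SK": "ORDER#2019-06-01#A1", "total": 42},
    {"PK": "CUSTOMER#123", "SK": "ORDER#2019-06-07#A2", "total": 17},
]

def query_orders(items, customer_id):
    """Emulate Query(PK = CUSTOMER#id AND begins_with(SK, 'ORDER#')).

    DynamoDB returns items in SK order within a partition, which is why
    putting the date in the composite SK gives you date-sorted orders
    for free.
    """
    pk = f"CUSTOMER#{customer_id}"
    matches = [i for i in items
               if i["PK"] == pk and i["SK"].startswith("ORDER#")]
    return sorted(matches, key=lambda i: i["SK"])
```

Because every entity lives in the same key space, the one design answers several access patterns (“get a customer’s profile”, “get a customer’s orders by date”) with a single table and no joins.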
STEP 12: Write some sample queries and test your access patterns against your table design. Realize you did it completely wrong this first time, take a breath, drink a beer (or two), and go back to STEP 7. 😞🍺
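A “sample query” for an access pattern can be as simple as writing out the request shape you’d hand to the DynamoDB `Query` API (e.g. via boto3’s `client.query(**request)`). The table name and key formats below are hypothetical; the `KeyConditionExpression` syntax is the real thing.

```python
def build_orders_query(table_name, customer_id):
    """Build a DynamoDB Query request for the access pattern
    'fetch all orders for a customer', assuming the composite-key
    scheme PK=CUSTOMER#<id>, SK=ORDER#<date>#<id> (illustrative)."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "PK = :pk AND begins_with(SK, :sk)",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"CUSTOMER#{customer_id}"},
            ":sk": {"S": "ORDER#"},
        },
    }
```

Writing these request shapes down for every access pattern, before touching real infrastructure, is a cheap way to discover where your key design falls short.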
STEP 13: Test your access patterns against your *NEW* table design. Iterate. Test again. Iterate, and test again. 🤨
STEP 14: Repeat STEP 13 until you are 95% confident that you’ve got your table design right. 😀
STEP 15: You did it (well, the first part anyway)! 🎉 Celebrate your accomplishment (maybe have another beer), and then wire your DynamoDB access patterns into AppSync or build out an API with API Gateway. Remember to use the Transactions API when necessary. 😎
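For the Transactions API, a typical use is writing a new item and updating a related item atomically. Here’s a hedged sketch of a `TransactWriteItems` request shape (the table, key formats, and counter attribute are made up for illustration):

```python
def build_order_transaction(table, customer_id, order_id, total):
    """Build a TransactWriteItems request that inserts an order item
    and bumps an orderCount on the customer's profile item atomically.
    Key scheme and attribute names are hypothetical."""
    return {
        "TransactItems": [
            {"Put": {
                "TableName": table,
                "Item": {
                    "PK": {"S": f"CUSTOMER#{customer_id}"},
                    "SK": {"S": f"ORDER#{order_id}"},
                    "total": {"N": str(total)},
                },
                # Fail the whole transaction if this order already exists.
                "ConditionExpression": "attribute_not_exists(SK)",
            }},
            {"Update": {
                "TableName": table,
                "Key": {
                    "PK": {"S": f"CUSTOMER#{customer_id}"},
                    "SK": {"S": "PROFILE"},
                },
                # ADD is atomic, so concurrent writers don't clobber it.
                "UpdateExpression": "ADD orderCount :one",
                "ExpressionAttributeValues": {":one": {"N": "1"}},
            }},
        ]
    }
```

Either both writes succeed or neither does, which is exactly the property you used to lean on relational transactions for.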
STEP 16: Test, test, and test again. And when you’re done testing, test it again. And make sure you write some tests so you can automatically test it, again and again. ✅
STEP 17: Enable DynamoDB streams and use them to generate/update aggregations and replicate data for reporting, search indexing, and/or other application requirements. (Lambda functions make for really good stored procedures, btw) 🚀
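The core of such a “Lambda as stored procedure” is just routing each stream record to the downstream work it triggers. This sketch (record shapes follow the DynamoDB Streams event format; entity names are invented) returns the actions instead of calling AWS, so the routing logic is trivially testable:

```python
def route_stream_record(record):
    """Decide what downstream work a DynamoDB stream record triggers:
    replicating the item to a search index, updating an aggregate, or
    nothing. The 'order' entity type is a hypothetical example."""
    actions = []
    name = record.get("eventName")
    image = record.get("dynamodb", {}).get("NewImage", {})
    if name in ("INSERT", "MODIFY"):
        # Keep the search index in sync with the latest item image.
        actions.append(("index", image))
    if name == "INSERT" and image.get("type", {}).get("S") == "order":
        # New orders also feed a computed aggregate.
        actions.append(("aggregate", image))
    return actions
```

A real handler would loop over `event["Records"]`, call this router, and then perform the index writes and aggregate updates with the AWS SDK.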
STEP 18: Test your DynamoDB streams *IN THE CLOUD* to ensure data is flowing correctly and downstream resources are being populated/updated correctly. Write some tests to automate this. ☁️
STEP 19: Cross fingers, publish to production. 🤞
STEP 20: Profit! 💰
This was obviously a bit oversimplified, but hopefully it gets you started on your journey to discovering the power of DynamoDB. If you don’t get it at first, just keep at it. It will eventually “click” (unless you left your head in the microwave too long). Good luck! 👍
There were a ton of great comments in the Twitter thread with suggestions for additional DynamoDB learning material. If you want to go even deeper into DynamoDB, here are some additional resources that’ll get your brain smoking even more. 🔥
- Simplify Amazon DynamoDB data extraction and analysis by using AWS Glue and Amazon Athena
- How to perform advanced analytics and build visualizations of your Amazon DynamoDB data by using Amazon Athena
- Alex DeBrie’s DynamoDBGuide.com
- From relational DB to single DynamoDB table: a step-by-step exploration by Forrest Brazeal
Do you know of more amazing DynamoDB resources? Please feel free to send them my way!
Tags: databases, dynamodb, nosql, rdbms, serverless
Did you like this post? 👍 Do you want more? 🙌 Follow me on Twitter or check out some of the projects I’m working on.
5 thoughts on “How to switch from RDBMS to DynamoDB in 20 easy steps…”
Thank you for sharing
Thank you, this has totally changed how I look at DynamoDb (and databases). Especially eye-opening:
– the cost-benefit of flexibility vs. design to access patterns
– the ready-made functionality of DynamoDb — change log (dynamoDB streams), cache (DAX), autoscaling DB access!, “stored procedure” (lambda) scaling independently of DB!
I’m not getting (at all) the use of Lambda for “computed aggregations” written back as “metadata” to the one table… mentioned a few times in the videos. Any pointers that would help would be much appreciated. And your thoughts on this?
Thanks again for making this a blog post; I followed it from the Twitter link.
I’m glad you found it useful. In terms of computed aggregations, you could use a Lambda function attached to the DynamoDB stream to update counts (or averages, sums, etc.) in your DynamoDB table. So for example, every time a new “file download” record is added to your DynamoDB table, you could use the Lambda function listening to the stream to increment a counter on the main “file” item in the table. There’s no need for your application to try to update several items at once: just harden the data you need to capture, then use Lambda as a stored procedure to update all the aggregations.
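As a concrete sketch of that reply (table name, key scheme, and attribute names are hypothetical), the stream-triggered Lambda would turn each new “download” record into an atomic `UpdateItem` request against the parent “file” item:

```python
def download_counter_update(record, table_name="app-table"):
    """Given an INSERT stream record for a 'file download' item, build
    the UpdateItem request that increments downloadCount on the parent
    'file' item. ADD is atomic, so concurrent stream batches can't
    clobber each other's increments. Names are illustrative."""
    new_image = record["dynamodb"]["NewImage"]
    return {
        "TableName": table_name,
        # The download shares the file's PK; SK="FILE" marks the parent.
        "Key": {"PK": new_image["PK"], "SK": {"S": "FILE"}},
        "UpdateExpression": "ADD downloadCount :one",
        "ExpressionAttributeValues": {":one": {"N": "1"}},
    }
```

The application only writes the raw download record; the aggregation happens asynchronously, off the critical path.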
Hope that makes sense,
But DynamoDB still doesn’t track item modifications by default. Streams need to be enabled to track item-level changes.
An interesting point: relational databases have built-in metadata auditing. When a critical issue comes up, audit columns in relational tables help you backtrack to the root cause, which in turn is missing in NoSQL databases.
What’s your view on my point?
Like you said, if you need column auditing, DynamoDB Streams provides that service for you. The benefits of using fully managed DynamoDB for hyper-scale, data isolation, and single-digit-millisecond performance in OLTP and DSS applications are hard to beat, IMO.