How To: Normalize URLs Stored in MySQL

I came across an interesting problem the other day. As part of our URL normalization strategy at AlertMe, we have been adding a trailing slash to URLs without file extensions. We did a lot of research when deciding on this tactic and the general consensus around the web was to use trailing slashes for directories and (obviously) no slashes on filenames. See this article from the official Google Webmasters blog: https://webmasters.googleblog.com/2010/04/to-slash-or-not-to-slash.html (I know it’s old, but the concept is still relevant).

We even tested a number of publisher URLs to see what their redirection strategies were. Every one we tested responded correctly to both the slash and no-slash versions of the URL. Some redirected to a trailing slash, some redirected to no trailing slash, but they all returned (or redirected to) the intended page.

Continue Reading…

How To: Stub AWS Services in Lambda Functions using Serverless, Sinon.JS and Promises

I know the title is a quite a mouthful, but if you are trying to run tests on your Lambda functions that interact with AWS Services using the aws-sdk node module, then you’ve probably run into an issue stubbing or mocking the requests. In this post we’ll learn how to stub different AWS Services with Sinon.JS so that you can properly test your scripts.

UPDATE: AWS Lambda now supports Node v8.10, so we can use async/await instead of promises. The examples below still work with either v6.10 or v8.10, however, I recommend switching to async/await as they are more compact than promises. Read my post How To: Stub “.promise()” in AWS-SDK Node.js to learn how to deal with the .promise() method on aws-sdk services.

Let’s say you have a Lambda function that interacts with AWS’s SQS (Simple Queue Service). v6.10 of Node doesn’t support async/await, so you will most likely use promises if you don’t want to transpile your code or deal with callback hell. This means you need to Promisify an instance of the AWS SQS service. This is easy enough with:

Continue Reading…

How To: Access Your AWS VPC-based Elasticsearch Cluster Locally

AWS recently announced that their Elasticsearch Service now supports VPC, which is awesome, for a number of reasons:

1. No more signing every request

Remember this?

Every request had to be signed with AWS’s SigV4 so that the Elasticsearch endpoint could be properly authorized. That meant additional code to sign all your requests, and additional time for the endpoint to decode it. It might only be a few milliseconds of extra processing time, but those can add up. Now we can call our VPC Elasticsearch endpoint with a simple HTTP request.

Continue Reading…

How To: Reuse Database Connections in AWS Lambda

Update 9/2/2018: I wrote an NPM module that manages MySQL connections for you in serverless environments. Check it out here.

I work with AWS Lambda quite a bit. The ability to use this Functions-as-a-Service (FaaS) has dramatically reduced the complexity and hardware needs of the apps I work on. This is what’s known as a “Serverless” architecture since we do not need to provision any servers in order to run these functions. FaaS is great for a number of use cases (like processing images) because it will scale immediately and near infinitely when there are spikes in traffic. There’s no longer the need to run several underutilized processing servers just waiting for someone to request a large job.

AWS Lambda is event-driven, so it’s also possible to have it respond to API requests through AWS’s API Gateway. However, since Lambda is stateless, you’ll most likely need to query a persistent datastore in order for it to do anything exciting. Setting up a new database connection is relatively expensive. In my experience it typically takes more than 200ms. If we have to reconnect to the database every time we run our Lambda functions (especially if we’re responding to an API request) then we are already adding over 200ms to the total response time. Add that to your queries and whatever additional processing you need to perform and it becomes unusable under normal circumstance. Luckily, Lambda lets us “freeze” and then “thaw” these types of connections.

Update 4/5/2018: After running some new tests, it appears that “warm” functions now average anywhere between 4 and 20ms to connect to RDS instances in the same VPC. Cold starts still average greater than 100ms. Lambda does handle setting up DB connections really well under heavy load, but I still favor connection reuse as it cuts several milliseconds off your execution time.

Continue Reading…

How To: Install phpMyAdmin on Amazon Linux

I’ve been managing a few small MySQL databases lately that often need record updates but certainly don’t warrant building a separate management interface. The easiest way to accomplish this (assuming you don’t have complicated joins and relationships) is to install phpMyAdmin, a robust, web-based admin utility for MySQL that is built in php. In my case, I’m running these mostly on Amazon Linux instances, so after a little poking around, it turns out the installation is just 3 simple steps.

Continue Reading…

How To: Install wkhtmltopdf on Amazon Linux

The other day I was optimizing a reporting engine and needed a quick and easy way to generate PDFs from HTML templates. There are several options out there, but after some research, I decided to use wkhtmltopdf (http://wkhtmltopdf.org). It then took me a few hours of scouring the Internet to find the appropriate steps to install it on Amazon Linux. I kept running into a series of problems with the version, required libraries, and font kerning. Some trial and error, and help from some others, finally got me up and running. Here’s what worked for me.

Continue Reading…