How To: Normalize URLs Stored in MySQL

I came across an interesting problem the other day. As part of our URL normalization strategy at AlertMe, we have been adding a trailing slash to URLs without file extensions. We did a lot of research when deciding on this tactic and the general consensus around the web was to use trailing slashes for directories and (obviously) no slashes on filenames. See this article from the official Google Webmasters blog: (I know it’s old, but the concept is still relevant).

We even tested a number of publisher URLs to see what their redirection strategies were. Every one we tested responded correctly to both the slash and no-slash versions of the URL. Some redirected to a trailing slash, some redirected to no trailing slash, but they all returned (or redirected to) the intended page.

Well, the other day we came across a publisher that returned 200 status codes with an Unhandled request message in the body. This first caused issues with our crawlers as the URLs were technically returning valid pages, but with no content. Then there was the issue of redirecting users to these URLs. They would get the Unhandled request messages instead of the correct page. 🤦🏻‍♂️

Long story short, we had to change our strategy to remove trailing slashes from ALL URLs. As I mentioned previously, we haven’t come across any one who doesn’t support no trailing slashes. This required a bit of cleanup work on the database as well. For other publishers, URLs with the trailing slash wouldn’t matter. But we had a few hundred records from our new partner that needed the trailing slashes removed.

Of course, MySQL doesn’t have a native REGEX replace function, but luckily there was a pretty simple query that took care of the issue. So if you ever need to normalize the URLs in your database, here’s a quick fix.

If you want to check the output first, run this SELECT query:

Not overly exciting, but this was a helpful pure MySQL workaround for cleaning up our URLs.

Tags: , , , ,

Did you like this post? 👍  Do you want more? 🙌  Follow me on Twitter or check out some of the projects I’m working on.

2 thoughts on “How To: Normalize URLs Stored in MySQL”

  1. Hey, thanks for posting, very useful info. Still, I’m struggling with the reverse of your approach, where I want to add a trailing slash to URLs that are missing it (rather than remove it). Would it look like:

    UPDATE post_content
    SET url = if(
    SUBSTRING(url, -1, 1)!=’/’,
    CONCAT(url, ‘/’),
    WHERE url LIKE ‘’

    I’m trying to update the post_content column in the wp_posts table in a WordPress database.

    1. Hey Vlad,

      We actually chose the reverse because of the potential for file extensions in URLs (e.g. ends in .html or something). So you would need to add something that detects a file extension (which could be practically anything) and then ONLY add the trailing slash if it was a directory. Your method above looks fine if you are not worried about file extensions.

      Hope that helps,

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.