Michael Gracie

Rewriting tag underscores to hyphens for WordPress 3.1

When I first moved existing content to this domain, I had a pile of tags coming from Movable Type with underscores (“_”) in them. Upon getting them into the new database, I changed the underscores to hyphens using a SQL script, but then had to worry about redirecting old links to new. I wound up with a very long .htaccess file full of 301 directives that looked like this:

Redirect 301 /tag/tags_with_underscores/ https://michaelgracie.com/tag/tags-with-underscores/

With the upgrade to WordPress 3.1, I started having problems with URL rewrites – the culprit wound up being the Advanced Permalinks plugin. That plugin had been used as a patch, allowing pretty permalinks to function alongside some stray special characters such as periods (“.”). Once I disabled it I was forced to clean up those special characters, and it then dawned on me that this list of redirects was WAY too long. So I set out to conjure another solution for the original underscore issue. After significant research, followed by too much trial and error, this is what I came up with…

If you are using clean URLs, WordPress has inserted this chunk of code into your site’s .htaccess file:

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress

Inserting this additional code in between “RewriteBase /” and “RewriteRule ^index\.php$ - [L]” will change the underscores in individual tags to hyphens:

#begin permanent tag fix
RewriteRule ^tag/([^_]*)_([^_]*)_([^_]*)_([^_]*)$ tag/$1-$2-$3-$4 [L,R=301]
RewriteRule ^tag/([^_]*)_([^_]*)_([^_]*)$ tag/$1-$2-$3 [L,R=301]
RewriteRule ^tag/([^_]*)_([^_]*)$ tag/$1-$2 [L,R=301]
#end permanent tag fix

The above assumes your tag base is “tag”; if yours is different, just replace “tag” in each rule with whatever yours is. Additionally, my fix only replaces up to three underscores within a tag string. If you have tags with more underscores, add an extra line directly below “#begin permanent tag fix” for each additional underscore count, like this one for four underscores…

RewriteRule ^tag/([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)$ tag/$1-$2-$3-$4-$5 [L,R=301]

And so on. When done, that original block of WordPress-generated code might look like this:

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
#begin permanent tag fix
RewriteRule ^tag/([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)$ tag/$1-$2-$3-$4-$5 [L,R=301]
RewriteRule ^tag/([^_]*)_([^_]*)_([^_]*)_([^_]*)$ tag/$1-$2-$3-$4 [L,R=301]
RewriteRule ^tag/([^_]*)_([^_]*)_([^_]*)$ tag/$1-$2-$3 [L,R=301]
RewriteRule ^tag/([^_]*)_([^_]*)$ tag/$1-$2 [L,R=301]
#end permanent tag fix
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress

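I’d keep a backup copy of the modified block, particularly if you are prone to upgrading WordPress manually, since WordPress can rewrite everything between the “# BEGIN WordPress” and “# END WordPress” markers.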
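
For tags with an arbitrary number of underscores, one possible alternative to stacking a rule per underscore count is to loop with mod_rewrite's [N] flag. The following is a sketch only, not what is running on this site. It assumes the same “tag” base and “RewriteBase /” as above, and it would take the place of everything between “#begin permanent tag fix” and “#end permanent tag fix”:

```apache
#begin permanent tag fix (looping sketch, untested on this site)
# While at least two underscores remain, swap the first one for a hyphen
# internally and restart the ruleset via [N] (no redirect is sent yet).
RewriteRule ^tag/([^_]*)_([^_]*_.*)$ tag/$1-$2 [N]
# Exactly one underscore left: swap it and issue a single 301.
RewriteRule ^tag/([^_]*)_([^_]*)$ tag/$1-$2 [L,R=301]
#end permanent tag fix
```

The intent is that a request for /tag/a_b_c_d_e/ is answered with one redirect to /tag/a-b-c-d-e/ rather than a chain of them. Because [N] restarts rule processing from the top, a pattern that keeps matching would loop, so it is worth trying on a staging copy before relying on it.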

MG signing off (with a significantly smaller .htaccess file)

Comments

adrian says:

Hi,

We just moved to WordPress from Joomla and have thousands of links with underscores and ending in .htm that we need changed to hyphens with no .htm. Any ideas on how to fix this, please?

I was thinking of some find-and-replace script in combination with an .htaccess revision.

E.g. architecture_news.htm needs to be changed to architecture-news.

We can’t do find and replace sitewide as our images are a mix of underscores and hyphens!

Michael Gracie says:

How big is the site, i.e. how many posts and pages are you dealing with in total?

adrian says:

Hi, in Joomla we had about 15,000 pages, with maybe 30 links per page on average.

In WordPress we have 822 pages and 13,663 posts.

Michael Gracie says:

Ok, so big. Are the links-per-page you speak of internal or external? Contained entirely in wp_posts/post_content now? Or are you talking categories, tags, and other meta in those counts now?

Probably going to require some analysis and trial-n-error testing, regardless of which way this little Q&A we are having goes.

adrian says:

Links are both internal and external, but the ones we need fixed are internal, which is the majority.

They need to be fixed because we’ve moved from Joomla, where underscores were OK and where pages are appended with .htm.

the stats are just for pages and posts

my assistant David has planned some testing on one of our 5 websites on Thursday, but we’re not sure of how to go about it.

Please note we don’t have a budget, our revenue is pretty limited.

Michael Gracie says:

Nearly impossible to assess from afar. Again, as mentioned earlier, this is going to be analysis and test driven. I’m not sure there is an easy fix post-migration, and as I hope you understand, I am reluctant to start randomly speculating on possibilities considering the delicacy of what you are facing.

Regardless, best wishes on the work.
