You migrate from Blogger to self-hosted WordPress. Your posts move over just fine, but for some reason (or another) your images forget their bus pass. Those pornographic stupid cat, hastily-prepared food, and trying-to-make-people-think-you-are-wealthy-instead-of-deep-in-debt vacation photos still show on the new site as they are properly referenced in the posts, but they actually remain on Google’s servers. You (or your client) don’t like that.
Meanwhile, the two plugins you found to solve this problem, Archive Remote Images and Cache Images, haven’t been updated in years. You take your chances anyway because you are lazy (if it is a personal site), or consistently over-promise and under-deliver (due to the impossibility of getting real work done at coffee shops). Either way, you must now hope you made a full site and database backup beforehand. If you did, your solution is now staring you in the face.
The script I concocted (shown after the jump) will get you a folder full of those images – with clean and pretty naming conventions – that you can upload to your wp-content directory, along with a SQL script to update links in your WordPress posts. Said programmatic wizardry (read: dirty hack) is written in Python – debugged using version 3.5.2 Anaconda custom (x86_64) on macOS 10.12.3 to be precise – and does rely on some SQL prep work. If you do not know Python, SQL and how to navigate directories while a terminal prompt blinks back, you have two choices: Google it (after determining what the definition of “it” is), or inquire about retaining me to do your work for you.
I’ll make the decision whether to continue easy too; if you cannot execute the following block of code sans assistance you are officially deemed “without paddle” …
SELECT * FROM `wp_posts` WHERE `post_content` LIKE "%blogspot%"
INTO OUTFILE '/home/dump/blogspotposts.csv'
FIELDS TERMINATED BY '|';
That look easy? Then proceed.
First, decide whether to run the script on your desktop (for future upload) or directly on the server. Next, create a directory called /bspics underneath the one where the script is located. Lastly, make sure the directory the code is in is writable by all.
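If you would rather let Python do that housekeeping, here is a minimal sketch of the directory prep. The function name prepare_dirs is my own invention for illustration and is not part of the linked script, and "writable by all" is taken literally as mode 777, which you may want to tighten again once you are done.

```python
import os
import stat


def prepare_dirs(script_dir):
    """Create <script_dir>/bspics and make both directories writable by all.

    Hypothetical helper: the name and layout are assumptions based on the
    steps described in the post, not code taken from the original script.
    """
    pics_dir = os.path.join(script_dir, "bspics")
    os.makedirs(pics_dir, exist_ok=True)  # no error if it already exists
    for path in (script_dir, pics_dir):
        # Equivalent to chmod 777: rwx for owner, group, and everyone else.
        os.chmod(path, stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO)
    return pics_dir
```

Calling prepare_dirs(".") from the folder holding the script should leave you with a ./bspics directory ready to receive images.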
The code can be found here -> processblogspotimagelinks.py
Once you have changed the obvious stuff to suit your needs, run it. Your /bspics directory will fill up with those images I promised – you can then place that entire directory underneath /wp-content – and you’ll also have a file called bsreplacescript.sql which you will run against your WordPress database to update image links in the associated posts.
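For the curious, here is a stripped-down sketch of the general idea. It is not the linked script, just an illustration under a few assumptions of mine: the image links live on some *.blogspot.com host and end in jpg, jpeg, png or gif; the new home is /wp-content/bspics; and every function name below is made up for the example. Because the dump above is pipe-delimited with no quoting, the sketch scans the raw text for URLs rather than trying to parse columns.

```python
import os
import re
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit
from urllib.request import urlretrieve

# Assumption: images sit on a *.blogspot.com host. Quotes, pipes and
# backslashes end a match, so matched URLs are safe inside SQL string literals.
URL_RE = re.compile(r"https?://[\w.-]+\.blogspot\.com/[^\s\"'<>|\\]+")
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".gif")


def clean_name(url, taken):
    """Turn the last path segment into a lowercase, dash-separated file name."""
    name = unquote(PurePosixPath(urlsplit(url).path).name)
    stem, _, ext = name.rpartition(".")
    stem = re.sub(r"[^a-z0-9]+", "-", stem.lower()).strip("-") or "image"
    candidate = f"{stem}.{ext.lower()}"
    counter = 2
    while candidate in taken:  # avoid collisions such as two "my-cat.jpg"
        candidate = f"{stem}-{counter}.{ext.lower()}"
        counter += 1
    taken.add(candidate)
    return candidate


def build_plan(text):
    """Map each distinct Blogspot image URL found in text to a clean file name."""
    plan, taken = {}, set()
    for url in URL_RE.findall(text):
        is_image = urlsplit(url).path.lower().endswith(IMAGE_EXTS)
        if is_image and url not in plan:
            plan[url] = clean_name(url, taken)
    return plan


def make_sql(plan, new_base="/wp-content/bspics", table="wp_posts"):
    """Build one UPDATE per image that swaps the old URL for the new path."""
    return [
        f"UPDATE {table} SET post_content = "
        f"REPLACE(post_content, '{url}', '{new_base}/{name}') "
        f"WHERE INSTR(post_content, '{url}') > 0;"
        for url, name in plan.items()
    ]


def run(dump_path, out_dir="bspics", sql_path="bsreplacescript.sql"):
    """Download every image into out_dir and write the replacement SQL."""
    with open(dump_path, encoding="utf-8", errors="replace") as handle:
        plan = build_plan(handle.read())
    os.makedirs(out_dir, exist_ok=True)
    for url, name in plan.items():
        urlretrieve(url, os.path.join(out_dir, name))  # network access needed
    with open(sql_path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(make_sql(plan)) + "\n")
```

With the dump from earlier copied next to the script, run("blogspotposts.csv") would be the whole show. Look over the generated SQL before you point it at a live database, and keep that backup handy.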
Important [final] note: the coding was an iterative process, and some data analysis was done between steps in order to account for string possibilities encountered, generating clean file names, etc. It could be refactored, but wasn’t because 1) the end result works as intended and 2) removing those iterations would handicap attempts to modify it for a different data set.
MG signing off (to solve some not-so-commonplace problems)
Self-reliance is nobody’s fault but my own
Just over a year ago I installed an OpenID provider on this site, and have been using the URL here ever since to harass and harangue other blogging types (mostly fishy ones).
Unfortunately, several months back I made some behind-the-scenes changes. They were merely back-office tweaks, since as you all know the theme/style here is already the most artistic, creative…heck downright gorgeous hunk of web design anywhere on the interwebs. Sadly my flair for technicolor wowza does not extend to my left-brain, and OpenID provision went bust.
At first I pointed fingers at Blogger, and took those I regularly denigrate there to task. But after significant amounts of research and tinkering, I now realize that it is the technology within causing the problems.
I make no apologies, primarily because I know certain denizens of the tubes have expressed sighs of relief during this otherwise difficult period. They are undoubtedly thanking me for my ineptitude. But someday near I will make reparations – I vow that the cynical, ill-humored, irritable commentary certain folks have previously accepted while cussing under their breath will resume.
MG signing off (while Alex, Kyle, Jean-Paul and others tremble in their boots)
March 19, 2010