Category archives for Hosting your own blog

Monday, December 4, 2006

How-to: Import posts and comments from Blogger Beta into WordPress

(And into ExpressionEngine and Movable Type, too!)

Update: If you're migrating to WordPress, you might want to check out a shiny new WordPress plugin before you get into all my manual tomfoolery. I haven't had a chance (or a need) to test it, but Import New Blogger looks fast and painless.

Update 2: Supposedly WordPress 2.2 and higher natively support importing from "new" Blogger. These instructions are likely now completely irrelevant for WordPress, but should still work just fine for migrating to MT or EE.
 

You know I'm a general, all-purpose fix it guy, right? As snarky and bitchy as I can be about my job, it really is a great fit for me. I genuinely like helping people. In that same vein, as I'm sure most of you have noticed, I've kind of positioned myself as "the answer guy" to a lot of people for just about anything blogging related.

One of the questions I'm asked most often is, "How do I get my Blogger Beta posts and comments into WordPress?"* Anyone who's tried this knows that the import script currently available with WP (even in the bleeding edge developer versions) doesn't support Blogger Beta.

There already exist several workarounds that will allow someone to do this very thing. Unfortunately, none of them are ideal. They'll either not import comments, or they'll only import recent posts, or they're far too complex for ordinary mortals. I've always thought there must be a better way. In the absence of a proper WP import script, here then is what I've come up with.** This method works for importing into WordPress, Movable Type and ExpressionEngine. It also works for WordPress.com users, but with a very big caveat in step five.

Step one: tweaking Blogger Beta's settings

Log in to Blogger Beta and visit the Settings tab of the control panel for the blog from which you'll be importing. Select the Formatting page. Change the setting for Timestamp format to mm/dd/yyyy hh:mm:ss like you see in the first image below (all images are hyperlinked to larger versions).

Blogger Beta screencap

While still on the Formatting page, set both Convert line breaks and Enable float alignment to "No."

Blogger Beta screencap

Click the Save Settings button at the bottom of the page to proceed.

Now select the Comments page, and there change the setting for Comments timestamp format to mm/dd/yyyy hh:mm:ss, just as you did for the post timestamps.

Blogger Beta screencap

Click the Save Settings button to move on.

Select the Archiving page. Set Archive Frequency to Monthly.

Blogger Beta screencap

Click Save Settings.

Step two: your Blogger Beta template

Now select the Template tab.

If you are using a Beta-style template, visit the Edit HTML page and there click the link for Revert to Classic Template.

Blogger Beta screencap

You'll be prompted to confirm this and advised that your Beta-style template will be saved.

Now you're ready to continue regardless of what style template you were using. (So classic-style template users should start paying attention again.)

If your classic-style template contains any customizations, make a backup of your template now. Do this by copying the template from the Edit HTML page and pasting it into Windows Notepad or some other similar plain text editor. I can't stress this part enough. We're going to do something irreversible to the template so a manual backup is the only way to preserve your changes. Blaming me for losing your blogroll because you didn't follow the steps will create a curse upon your family.

Once your backup is safely tucked away, kill the entire template (in the Blogger control panel, not in your backup [can't stress that enough]) and replace it with this:

<?php header("Content-type: text/plain"); ?><Blogger>TITLE: <$BlogItemTitle$>
AUTHOR: <$BlogItemAuthorNickname$>
DATE: <$BlogItemDateTime$>
STATUS: publish
-----
BODY:
<$BlogItemBody$>
<BlogItemCommentsEnabled><BlogItemComments>-----
COMMENT:
<?php $comment<$BlogCommentNumber$> = '<$BlogCommentAuthor$>';
$comment<$BlogCommentNumber$>_author_pre = explode('>', $comment<$BlogCommentNumber$>);
$comment<$BlogCommentNumber$>_author = explode('<', $comment<$BlogCommentNumber$>_author_pre[1]);
echo "AUTHOR: " . $comment<$BlogCommentNumber$>_author[0] . "\n";
$comment<$BlogCommentNumber$>_url_pre = explode('href="', $comment<$BlogCommentNumber$>);
$comment<$BlogCommentNumber$>_url = explode('" rel="', $comment<$BlogCommentNumber$>_url_pre[1]);
echo "URL: " . $comment<$BlogCommentNumber$>_url[0] . "\n"; ?>
DATE: <$BlogCommentDateTime$>
<$BlogCommentBody$>
</BlogItemComments></BlogItemCommentsEnabled>--------
</Blogger>

Before going further, a few words about this template. It's essentially an ugly hack of Movable Type's import/export format. Possibly incomplete details on the format can be found here. I say "possibly incomplete" because I can't find any details about the STATUS: publish bit. Since I found that bit in the WordPress Codex, it's possible that field is not part of the official specification and thus may not be supported by the importers built into MT or EE.

And while we're on the subject of the STATUS: field, the other option is "draft." Draft is probably the preferred option, as it will allow you to edit imported entries as required before they become part of your new blog, but manually publishing every new draft could be a massive chore if you have many posts. Replace STATUS: publish with STATUS: draft (or remove that line completely) at your preference.

When you're ready, click Save Template Changes.

Now view your blog. You'll note that it's disgustingly fugly. If you care to try to read any of it, you'll probably also note that none of the PHP has been processed. If you view the source of your page, you'll see the PHP tags sitting there ignored.

Update: A user has reported difficulty with step six when using Internet Explorer. If you're still using IE, it's time to move up to a better browser. If you can't (or won't) use a different browser, you may have better luck on step six if you edit this template to remove <?php header("Content-type: text/plain"); ?> from the first line. I thought manually specifying the output content type would increase compatibility, but it just might be having the opposite effect.

Step three: download the blog

Visit each of your monthly archive pages. Since you don't have a list of clickable links, you'll have to type the addresses manually. The structure of the archive URLs is like so: http://yourblog.blogspot.com/2006_12_01_archive.html. For each month of your archives, you'll need to change the month and year in the URL. This could be a little time consuming if your blog has been around for a while. It would be nice if Blogger had yearly archives, but we've got to roll with what we've got.
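If typing all of those addresses by hand sounds tedious, a few lines of script can print the list for you. This is only a rough sketch in Python, not part of the original method: the function name archive_urls is made up for illustration, and it assumes every month between your first and latest post follows the yyyy_mm_01_archive.html pattern shown above.

```python
from datetime import date


def archive_urls(blog_host, first, last):
    """Return one monthly archive URL per month from `first` to `last`, newest first.

    Assumes the yyyy_mm_01_archive.html pattern described in the post.
    """
    urls = []
    year, month = last.year, last.month
    while (year, month) >= (first.year, first.month):
        urls.append(f"http://{blog_host}/{year:04d}_{month:02d}_01_archive.html")
        # Step back one month, rolling over into December of the previous year.
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return urls


if __name__ == "__main__":
    # Hypothetical blog address and date range; substitute your own.
    for url in archive_urls("yourblog.blogspot.com", date(2006, 9, 1), date(2006, 12, 1)):
        print(url)
```

Paste each printed address into your browser and save the page as described in the next paragraph. Months in which you posted nothing may not have an archive page at all, so don't be alarmed if one of them turns up empty.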

You'll need to save each of your archive pages to your computer. In Firefox, select Save page as… from the File menu. In Opera, select Save as…, also on the File menu. In Internet Explorer 6 and lower, Save as… is found on the File menu but in IE7, you'll find it on the Page menu. Whatever web browser you're using, be sure to save your page as "All files" if it's available, or "text files" if it's not. Replace the .html extension with .php.

Blogger Beta screencap

When you're all finished, you'll have one file for each month of your Blogger archives. Those files should be named like so:

2006_12_01_archive.php
2006_11_01_archive.php
2006_10_01_archive.php
2006_09_01_archive.php

…and so on.

And you're ready for the next step.

Step four: remove Blogger's mess

Open one of your new PHP files with a plain text editor like Windows Notepad.

You've got to get rid of that Blogger bar foolishness. Do this by removing the first four lines and part of the fifth. Everything before the opening <?php tag from the template has got to go.

Blogger Beta screencap

Save the file. Repeat this process for each of your monthly archives.
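If you have a lot of monthly files, the same cleanup can be scripted. What follows is a minimal sketch in Python and not a tested tool: it assumes the opening <?php tag from the template in step two marks where your own content begins, that your saved files are UTF-8 text, and that they all sit in one folder. The function names are invented for this example. Keep copies of the originals before you run anything like this.

```python
from pathlib import Path

# Assumption: the template's first line begins with this tag, so everything
# in front of it is markup that Blogger added on its own.
MARKER = "<?php"


def strip_blogger_prefix(text, marker=MARKER):
    """Drop everything before the first `marker`; return text unchanged if it is absent."""
    index = text.find(marker)
    return text if index == -1 else text[index:]


def clean_archives(folder):
    """Rewrite each *_archive.php file in `folder` in place; return the names that changed."""
    cleaned = []
    for path in sorted(Path(folder).glob("*_archive.php")):
        original = path.read_text(encoding="utf-8")  # assumed encoding
        stripped = strip_blogger_prefix(original)
        if stripped != original:
            path.write_text(stripped, encoding="utf-8")
            cleaned.append(path.name)
    return cleaned
```

If you removed the header line from the template because of the Internet Explorer issue mentioned in step two, the marker will be different (most likely TITLE:), so pass that in instead.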

Step five: upload to your webspace

Now you'll need to get your PHP files onto your webspace. Do this any way with which you are comfortable: FTP, cPanel upload, whatever. I'd recommend placing them right in the root of your webspace for short URLs, but it doesn't really matter where you place them because they won't need to stay there permanently.

For WordPress.com users only: all of the steps in this process are completely compatible with your service, except this one. But all is not lost. If you've got a pal who's willing to host your few files for a while, that's all you really need.

Step six: saving again

Load the URL for each of your uploaded files in your web browser. You'll see your output is dramatically different.

Blogger Beta screencap

Without the Blogger gobbledygook, and after PHP has specified text/plain and parsed comment author links into separate AUTHOR/URL fields (if Blogger had separate template tags for those, this would be a lot simpler) we've got a proper, well-formed mt-import file. Yee-haw.
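For the curious, here is what those explode() calls in the template are doing, redone as a small Python sketch purely for illustration. The sample markup is an assumption about what Blogger emits for a linked comment author, and the fallback for unlinked authors is my own addition; the template itself has no such guard.

```python
def split_comment_author(author_html):
    """Mimic the template's explode() calls: pull name and URL out of a linked author."""
    if 'href="' not in author_html:
        # Added guard (not in the template): an unlinked author has no URL to extract.
        return author_html.strip(), ""
    # Name: the text between the first '>' and the next '<'.
    name = author_html.split(">")[1].split("<")[0]
    # URL: the text between 'href="' and the following '" rel="'.
    url = author_html.split('href="')[1].split('" rel="')[0]
    return name, url


# Hypothetical example of the markup Blogger produces for a comment author.
sample = '<a href="http://example.com/" rel="nofollow">Jane</a>'
name, url = split_comment_author(sample)
print(f"AUTHOR: {name}\nURL: {url}")
```

The approach is brittle by design: it leans on the author link always having its href attribute followed directly by a rel attribute, which is why the output is worth a quick skim before importing.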

Now to save them. For each file, select Save as… again from the File menu (or whatever the command is for your browser). Set the file type to text and give your files a .txt extension. Since this time the content type is set properly, you likely won't need to change anything, but you'll still need to double check.

Blogger Beta screencap

Now you'll have a lovely set of well-formed mt-import files with names like these:

2006_12_01_archive.txt
2006_11_01_archive.txt
2006_10_01_archive.txt
2006_09_01_archive.txt
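Before importing, it can't hurt to sanity check a file against what you expect to be in it. The sketch below is a rough Python helper, not anything official: it simply counts lines that start with TITLE: and lines that read COMMENT:, which is how the template above marks posts and comments. A post whose body happens to contain a line starting with TITLE: would throw the count off.

```python
def summarize_export(text):
    """Count posts and comments in the import text produced by the template above."""
    lines = [line.strip() for line in text.splitlines()]
    # Each entry written by the template opens with a TITLE: line.
    posts = sum(1 for line in lines if line.startswith("TITLE:"))
    # Each comment block written by the template opens with a COMMENT: line.
    comments = sum(1 for line in lines if line == "COMMENT:")
    return posts, comments
```

Compare the numbers against the post and comment counts you see in Blogger for that month. If they're wildly off, go back and check steps four through six for that file.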

And now we're ready to move on.

Step seven: importing (finally!)

We're in the home stretch here. The actual import process is easy as pie.

For WordPress users: In your Dashboard, look for Import as one of the top level options. You'll see "Movable Type and Typepad" on the list. Click that link and the very simple MT import wizard will begin. If you need assistance with this bit, check out the very detailed documentation and examples in the Codex. (Note that WordPress.com and the current 2.1 development versions of WordPress have the importers as a subpanel of the Manage page.)

For ExpressionEngine users: Upload your set of text files to your webspace (EE doesn't upload as part of the import). In your Control Panel, visit Admin -> Utilities -> Import Utilities -> Movable Type Import Utility. I'm really not an EE guy, so if you need assistance, you should visit pMachine's EE forums or check out this page in the EE wiki.

For Movable Type: I don't even have an MT installation anymore, so I really don't know how to direct you. This page in Six Apart's MT documentation would probably be a great place to start.

Step eight: clean up your own mess

Nothing remains now but mop-up. Delete all the temporary files from both your webspace and your own hard drive, although you might want to keep those text files. They're a neat, human readable backup of your blog. Can't hurt to keep.

Put your Blogger template back the way it was. If you made a backup of your classic style template (and you did, right? 'cuz I stressed that… a lot), restore it now. If you were using the Beta style template, "upgrade" from classic back to the template you had before.

Adjust Blogger's settings on the Formatting, Comments and Archiving pages back to their original values, or to whatever other settings you might prefer.

If you elected to replace STATUS: publish with STATUS: draft, you'll still need to review, edit and publish each of your imported posts.

Step nine: drink beer

You're done!

 

*In case you were wondering, the question I'm asked more than any other is, "What happened to my gravatars?"

**Now that I've put forth all this effort, would you care to wager how quickly WordPress will have an update for the built-in importer?

Friday, July 21, 2006

Wow, that was stupid

Hey, guess what? I'm a dumbass!

So I got this new plugin the other day, WP-phpMyAdmin 2.8.2. As the name would suggest (assuming you geek), the plugin is a WordPress implementation of phpMyAdmin. All the power and gooey goodness, and the convenience of a WP admin panel. Cool stuff.

Anyway, I got the plugin installed yesterday and sat down to use it for the first time tonight. I was smart enough to make a full backup of my database before I started fiddling around. With my backup safely stored on my desktop, I dove right in to do all those little things I never get around to adjusting because of the hassle of using cPanel to tweak.

After making a few more changes than I should have, I decided I didn't like the way I'd taken things and went to restore my backup.

And then I couldn't find my backup.

It turns out I accidentally deleted my fresh clean backup while I was still tweaking. I found this out when I inadvertently restored last night's automatic backup. What a dumb ass.

So every database change since 1 a.m. yesterday was lost. I was able to rebuild everything, so nothing was permanently lost. Fortunately, it was kind of a slow day: only one post and about a dozen comments. It's possible that a comment or two was lost in the shuffle somewhere, so if you see something missing, don't think I deleted it out of anything other than stupidity.

Lesson learned: make double dog sure I've got a backup before I do something permanent. Other lesson learned: quit being so fucking dumb.

Tuesday, July 4, 2006

Must. Have.

Have any of you WordPress users heard of ShortStat? I continuously see that plugin mentioned on "best of" lists. I've tried it out once or twice, but I've been disappointed both times.

See, I could never get it to work and I have no idea why. It just wouldn't track anything. I figured it for a broken plugin and just moved on. It turns out that the original ShortStat plugin doesn't work in WP 2.0.2 and 2.0.3. The developer hasn't bothered to release a new version. And then I found this version. Now I see what everyone was talking about.

ShortStat is fantastic. The amount of information it logs is almost enough to make me ditch my hit counter. (Almost.)

For example, I now know that my RSS feed got 41 hits yesterday. And that 3% of my visitors are from China. And that I get three times as many search engine hits from MSN as I do from Google, in spite of the fact that Google crawls my site more thoroughly, etc., etc.

Cool. I loves me some stats. I also love the fact that my Bush Family Porn script nets me around ten search engine hits a day. However, I do not love the fact that everyone seems to be finding it by searching for "family porn." Not "Bush family porn," just "family porn."

So apparently search engines think I've become one of the internet's premier sources of incest porn. That's a stat I could probably do without, but I guess it's still good to know.

Wednesday, June 7, 2006

Notes on hosting your own blog, part 10

So, last week an update to WordPress was released. The new version, 2.0.3, includes a few minor performance enhancements and bug fixes and one big new feature: nonces. Nonces are some moderately complex technobabble that Owen at Asymptomatic does a pretty good job translating into something resembling layman's terms here. The short version is that nonces are a system for verifying that administrative commands come from the right place.

I read the release notes and was very interested in the new version. So I got all of my little ducks lined up in a row and prepared to upgrade. I downloaded the new installation, made an up-to-the-minute backup of my database, and FTP'd a copy of my current install to my hard drive.

I've modified the WP core files a bit, so let's just say that I was using WordPress "2.0.2.fish." I took the new 2.0.3 files and added my mods before uploading them to save me the difficulty of doing it after the files were in place.

I activated the Maintenance Mode plugin to take my blog offline and overwrote my current installation with the new version. After the upload, I ran the one-click database upgrade. All finished, I viewed my blog… and found that for some unfathomable reason all of my content was crammed into the sidebar. The content that should have composed the main column was inexplicably displaying with a width of about half a character.

It was completely unusable.

I wondered, "Why would the theme fail in the new WP? Did the Maintenance Mode plugin mess with the install?" So I deleted my WP install and database, restored my backups, disabled Maintenance Mode and tried again.

No luck. Same problem.

I played around with the templates and found that it was only my custom theme that had a problem. The two themes included with WP worked just fine.

I wondered, "So is it a plugin problem?" I disabled all my plugins and tried the whole thing all over again, only to learn that nothing in my custom theme worked without the plugins the theme used.

Eventually, I mucked around with the blog output enough to learn that there was nothing wrong with my theme, or my plugins, or with the WP update.

The problem is that I'm a dumbass.

It was all in those mods I made to the WP core files. See, it turns out that if a file has two modifications, I better damn well make sure I update both of them, not just one. Some of the code I changed involves how the sidebar displays. I was sloppy when I inserted my changes. Garbage in, garbage out.

So in addition to being reminded yet again of how important it is to pay attention and do things right the first time, I also learned a lesson about using plugin template tags.

Trying to troubleshoot plugins without being able to disable the plugins was a colossal pain in the ass. So I took the time to sort out my template and wrap all plugin calls with function checks. I'm now at a point where I can disable all my plugins and everything will still work. Which is exactly what I need for troubleshooting in the future.

Confused about that function check business? It's simple, yet kind of important, so I'll explain.

The install instructions for a plugin with template features may tell you to insert a line of code like this:

<?php plugin_function('do_something_cool'); ?>

I would strongly recommend you wrap that in a function check, like this:

<?php if (function_exists('plugin_function')) { plugin_function('do_something_cool'); } ?>

The first line of code basically says "do this or else." The second line of code is more like "if you can do this, then do. Otherwise don't worry about it." And that's exactly what one needs to disable a plugin without also removing any matching template tags.
 

It's a good thing that I finally got around to adding those function check wrappers, but it turns out that the whole thing was kind of an exercise in foolishness. Although the new nonce system is cool, WordPress 2.0.3 is no kind of improvement at all. Those nonces don't really work right.

For some reason or another, every edit or delete operation I perform generates an extra warning prompt. And everything I edit suddenly has all single and double quotes escaped. (Like this: \"Let\'s go to John\'s house.\") What a pain in the ass.

Apparently these problems are not unique to me. There's a support thread on the WordPress site where people are talking about it. Mark Jaquith has even developed a plugin to fix it.

But that's a little messed up. Install the new update and then install a third party plugin to get the update to work? That blows. I'd recommend skipping WP 2.0.3 altogether and just waiting for the next revision.

Friday, May 5, 2006

Notes on hosting your own blog, part 9

A reader has an excellent post that should be required reading for anyone considering moving to their own domain. It's a cautionary tale about the perils of reseller hosting. The short version is that resellers are best avoided. Although her story isn't nearly as apocalyptic as it could be, it's an excellent example of why you should go straight to the source.

Thursday, March 23, 2006

Notes on hosting your own blog, part 8

Apparently there is the possibility that WordPress will occasionally fall on its sword. My blog is broken and I have no idea why. My server is up, my database is working, other test installations I've installed are functioning normally.

Doesn't that just suck?

Please stand by while I engage in a little percussive maintenance.

Update: Not my fault, not WordPress' fault! I spent some time this morning beating on things in a semi-orderly sort of way and ran into some curious roadblocks. I opened a support ticket with my web host and was directed to an open thread on their support forums.

Apparently they upgraded 27 servers, including mine, with newer versions of PHP, MySQL and cPanel. And everything worked just great… until they all broke. So I'm sitting tight and watching the progress on the support thread.

Update #2: All better! The tech team at my web host put on their grass skirts and did the six nines dance and everything is up and running just as it should be.

Monday, March 13, 2006

Notes on hosting your own blog, part 7

Just a few days ago, the fine folks at WordPress.org released WordPress 2.0.2. From the inception of this blog's current form I've been using 2.0.1, so this is my first upgrade. The directions in the codex are simple to follow and worked flawlessly. It was just delete this, that and the other thing, upload new files and I was done.

The directions remind you to back up your language packs if you're using WP in a language other than English, but they miss one vital thing. You should also back up your .htaccess file. Failing to do so will change things like the permalink structure and will break addons like Ordered List's Feedburner plugin.
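If you keep a local copy of your install (or have shell access to your server), stashing the file takes a couple of lines. This is just a sketch in Python: the function name, the backup folder and the htaccess.bak file name are all made up for the example, and it assumes .htaccess lives in the root of the WordPress folder you point it at.

```python
import shutil
from pathlib import Path


def back_up_htaccess(wp_root, backup_dir):
    """Copy wp_root/.htaccess into backup_dir; return the copy's path, or None if absent."""
    source = Path(wp_root) / ".htaccess"
    if not source.is_file():
        return None
    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical backup name; a leading dot would hide the copy on some systems.
    target = target_dir / "htaccess.bak"
    shutil.copy2(source, target)
    return target
```

After the upgrade, copy the backup over the new .htaccess and your permalinks should behave the way they did before.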

Fortunately I thought ahead and made a backup of everything. The upgrade from 2.0.1 to 2.0.2 was painless and quick, but only because I took that extra step they didn't bother to mention.

Thursday, March 9, 2006

Notes on hosting your own blog, part 6

The Akismet plugin is a must have for WordPress. A spammer found me the other day and left 250 spam comments. All of those comments were flagged by the internal WordPress spam filters, but those filters still leave something to be desired. For example, each comment is held for moderation, which generates an administrative e-mail for each instance.

I'd read good things about Akismet, but hadn't enabled it because it requires a WordPress.com API key. Getting one is free and simple, but still an extra step that I didn't feel like taking. Now I'm glad I did.

Akismet intelligently analyzes all comments before the internal filters kick in, so all of my spam comments now go to Akismet's own administration panel for moderation. Akismet is worth the effort just to avoid the avalanche of "please moderate" e-mails.

Thursday, February 16, 2006

Notes on hosting your own blog, part 5

So I'm planning on getting off the Blogger tit for good and migrating to WordPress. (Not as big a deal as you might think. More details on that later.)

WordPress is amazingly powerful. I've got a test installation up and running to fiddle with before I "go live." There's quite a learning curve on some of this stuff. For example, a standard Blogger template is XHTML with special markup stored in one file. The default WordPress template is a mix of XHTML and PHP with special markup stored in seventeen files.

Installing WordPress is fast and is a task of "intermediate difficulty." The basic options are as simple to use as the Blogger Dashboard. But when it comes to advanced customization, only ubergeeks need apply.

Saturday, February 11, 2006

Notes on hosting your own blog, part 4

So I was playing around with hotlink* protection today. I thought I was having some kind of problem with my web server's configuration because none of the protected files would load on my home computer. It turns out that hotlink protection uses HTTP referrers to determine if a request should be allowed, and my firewall blocks referrers by default.

This could be problematic. I want to use hotlink protection to keep other people from scamming my bandwidth, but it's not reasonable to expect all of my visitors to add custom rules to their firewall just for me.

*What is a hotlink? A hotlink is an object inserted on one site that is loaded from another site. For example, if someone inserts an image stored on my webspace into their own site, whenever their site is displayed the image loads from my site. Flickr and Photobucket are examples of sites that are designed specifically for hotlinking images. You upload your photo to Flickr's servers and then you use that photo somewhere else.

Removing the ability to do that kind of thing might sound mean, but bandwidth isn't free. Sharing my bandwidth with the world does not also share my bandwidth costs with the world.
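To make the referrer problem concrete, here is roughly the decision a hotlink filter makes, sketched in Python. It is an illustration of the logic only, not how my web server actually implements it, and the host names and function name are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical: the domains allowed to embed your files (normally just your own).
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def allow_request(referrer, allow_empty=False):
    """Decide whether to serve a protected file based on the referrer the browser sent."""
    if not referrer:
        # A firewall that strips referrers lands here, so a strict filter
        # (allow_empty=False) blocks that visitor even on your own site.
        return allow_empty
    host = urlparse(referrer).hostname
    return host in ALLOWED_HOSTS
```

With allow_empty left at False, a visitor whose firewall blocks referrers is turned away exactly like a hotlinker, which is what happened on my home computer. A common compromise is to let requests with no referrer through and block only those that name someone else's site.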