Viewing Category: Gweeping
I've been trying to import my current WordPress database into my staging blog so I can play around with formatting; alas, the wordpress to wordpress importer has presented a few hurdles. Here's what's supposed to happen:
- On the old blog, go to Manage -> Export, download a WXR file with all your posts
- On the new blog, go to Manage -> Import, and upload the WXR file. Easy!!!
Except it's not, when you have more than 2 megabytes of data. You're not allowed to upload more than 2MB, because PHP poops out due to an internal limit.
I tried to work around this by patching WordPress to look for the file on the server instead of requiring you to upload it. That way, you can download your giant WXR file and use FTP to upload it somewhere to your server first. Geeky notes follow so I don't forget this stuff.
The Problem
There are several bottlenecks, many of them related to PHP's built-in limits:
- The post_max_size and upload_max_filesize settings in php.ini are often set to something like 2MB or 8MB. That means you can't upload a file larger than that. If you have a lot of writing in your blog, as I do, you just won't be able to upload a big enough file. Fortunately, I have a dedicated virtual server and can up those limits, but if you're on a shared server you're screwed.
To work around this, I spent a couple of hours trying to modify the WordPress import filter to use a file that had already been uploaded. I eventually hacked it to actually work, but hit another bottleneck related to memory_limit. But first, here are the brutal modifications I made to the WordPress 2.1 files:
In wp-admin/admin-functions.php: wp_import_handle_upload():
Added the following lines between $overrides = array... and $file = wp_handle_upload(...) as follows:
$overrides['test_size'] = false;
$localFile = array('name' => 'import', 'tmp_name' => '/full/path/to/wp-import.xml');
Don't forget to replace /full/path/to/wp-import.xml with the name of your exported WXR file. Next, I modified the $file = wp_handle_upload(... line to read as follows:
$file = wp_handle_upload( $localFile, $overrides );
Next, in wp-admin/admin-functions.php: wp_handle_upload():
Commented out the following around line 1838-9:
// if (! @ is_uploaded_file( $file['tmp_name'] ) )
// return $upload_error_handler( $file, __( 'Specified file failed upload test.' ));
Modified the move_uploaded_file() call to use copy() instead, around line 1879.
Finally, in wp-admin/import/wordpress.php:
Comment out the statements for case 0 around line 324-325, so control flows through to case 1:
case 0 :
// $this->greet();
// break;
The net result of these changes is to bypass the uploading form when you click the import -> wordpress selection. It should automatically attempt to read the WXR file. The modifications above are to bypass the security mechanisms in place that prevent you from using non-uploaded files.
If you find that you're getting a dialog box that asks you to download admin.php, what has probably happened is you've run out of memory (check your PHP error log). The WordPress importer reads the entire file into memory at once, so if you've got a big file you'll need a lot of working memory to process everything. For my blog, I needed about 64MB of working PHP memory, which I could fix by changing memory_limit to 64M in my php.ini file. If you're on a shared server, you're kind of screwed if you can't change these.
You are probably better off exporting piecemeal using Aaron Brazell's WordPress to WordPress importer, which gives you the option to export selected categories. This is what I did the first time. Note that it works best between the same WordPress database versions; importing from WP 2.0 to WP 2.1 (or vice versa), for example, will cause some funny things to happen with Pages. As it is, subpage importing is currently broken, so watch out for that too.
WP-Cache helps my site run smoothly by storing copies of the web pages that are fetched most often from this site, which means that the WordPress system just needs to generate that page once. This has worked fine, except for a mysterious disappearing page problem that happens every once in a while.
A few of you have probably seen it: a visit to a popular page shows just the top-half of the page. I spent a little time debugging this tonight because I saw this problem occur three times with a popular post, which meant that a lot of people were unable to see it when they tried to. Well, no more!
WARNING! Geeky notes follow, so I don't forget what I did.
Occasional Cache Corruption
Every once in a while, someone will send me a nice email telling me that a certain page is coming up "blank", showing just the top-half of the web page with no text. What I usually do in this case is invalidate the cache in the WP-Cache options menu, and this fixes it. Unfortunately, it means all the cached files are also gone, which means the server has to rebuild all the cache files.
I eventually noticed that the problem files could be found by listing the cache contents under the WP-Cache Options panel, seeing which file is unusually small, and then deleting just that file. This tells WP-Cache to rebuild it next time someone requests it. In my case, if I see a posting file that is less than 9K, it's probably screwed up. It appears that the file is truncated, chopped off at a certain point before the rest of WordPress has a chance to populate the page.
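If you have shell access, the same "look for the unusually small file" check can be roughed out from the command line. This is only a sketch: the function name is made up, and it assumes the cached pages live in one directory as files ending in .html, which may not match your WP-Cache setup.

```shell
# List cached pages smaller than a size threshold.
# Assumption: cached pages are *.html files under one cache directory.
find_truncated() {
  # $1 = cache directory, $2 = smallest healthy page size in bytes
  find "$1" -type f -name '*.html' -size "-${2}c"
}

# Example (hypothetical path): find_truncated wp-content/cache 9216
```

Deleting a file it lists has the same effect as deleting it from the options panel: WP-Cache rebuilds it the next time someone requests the page.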
After today's issues with my "Procrastinator's Clock" page going under without warning, I decided to poke through the WP-Cache source code to see if I could insert an automatic check, so I wouldn't have to worry about it.
Modifications to WP-Cache
The file wp-cache-phase1.php is the piece of WP-Cache that checks whether a particular URI has been cached already, serving the cached copy if it exists. Around line 35, I inserted the following:
...
foreach ($meta->headers as $header) {
    header($header);
}
// DS: start hack
$url = $meta->uri;
$size = @filesize($cache_file);
if ( $size < 9216 ) {
    error_log("WPCache: $size < 9216, expiring $url");
    // write problem file
    $myFile = "/path/to/writeable/file/in/htdocs/wpcache.log";
    $fh = fopen($myFile, 'a') or die("can't open file");
    $stringData = file_get_contents($cache_file);
    fwrite($fh, "\n\n##\n## Truncated File: ".$meta->uri." ($size) bytes\n\n");
    fwrite($fh, $stringData);
    fclose($fh);
    // tell WP to recreate cached file
    $file_expired = true;
    return;
}
// DS: end hack
$log = "<!-- Cached page served by WP-Cache -->\n";
if ( !($content_size = @filesize($cache_file)) > 0 || $mtime < @filemtime($cache_file))
...
The code block marked DS: start hack is the new stuff. It grabs the url of the cached file being loaded out of the meta information stored with it and then sees how big the cached copy is. A good page on my site is always bigger than 9K, so if it's LESS than that this means that the cached copy has been screwed up. This error is logged to the PHP Error Log AND the truncated output is written to another file called wpcache.log so I can analyze it later. The hack tells WordPress to re-generate the page by setting the "file expired" flag to true; WP-Cache will then recache it through a different module.
I'm hoping that this ensures that corrupted cache copies don't stick around for hours as they've done in the past. Also, because I'm logging the errors, hopefully I'll be able to figure out what the pattern is that causes this cache corruption to occur.
I got dugg for the first time yesterday, for the Water post of all things, and this was an excellent test of my new Media Temple dedicated virtual (dv) server.
I'm running the very cheapest of (dv) plans ($50/month), which has a "guaranteed" memory allocation of 256MB. It actually can use more, because the (dv) is a virtual server sharing a single machine with others. If you need more memory, and it's available, your server can grab it. Freshly minted, my (dv) was configured to make as much use as possible of this pooled memory, which I suppose encourages people to upgrade to higher-capacity (and more expensive) plans. I can't afford that, so I learned how to modify the MySQL, Apache, and SMTP configuration to run within a 256MB footprint. Then, still seeing esoteric memory allocation failures, I tracked down some significant inefficiencies in my WordPress installation and got rid of them. Just in time too, to handle the unexpected spike in traffic.
It may have been the time of day (2PM), but the peak Digg traffic lasted only a couple of hours. For those first couple of hours, though, the (dv) served 2500-2750 pageloads per hour without breaking a sweat, the server load hovering between 0.5 and 0.7 most of the time. The site remained highly responsive once I turned off the "KeepAlive" web server option. This option allows a web browser connection to fetch multiple chunks of data (like all the graphic files on a web page) in one long transaction; ordinarily it's one chunk per transaction. KeepAlive is sort of like being able to monopolize a shoe salesman at a big shoe warehouse, insisting that he bring you a steady stream of shoes for your convenience exclusively. This isn't a problem until the number of pushy customers exceeds the number of salespeople. Then, anyone who's late to the party will wait a looong time to get any service.
With 2750 page requests, each with 30 chunks of data and only 30 processes maximum to deal with them, I had to turn off KeepAlive so everyone got served in a timely manner instead of timing out. And yes, I did have a short KeepAliveTimeout set (2 seconds). There is probably some interesting formula to calculate the optimal way to serve the most connections with the least resources, but since I didn't know it I just watched the server and made sure it didn't boil over.
When it failed to even get warm, I disabled WP-Cache (remembering to delete the existing cache) to see what kind of increase I'd see. By this time traffic was starting to die off slightly, pulling only 20-40 pageloads per minute, but I saw the load climb to about 1.5 to 2.5. Still not too bad, but I turned the cache back on.
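To put the "2750 page requests, each with 30 chunks" figure in perspective, here is some back-of-the-envelope arithmetic. It is a rough sketch that assumes requests arrive evenly spread over the hour, which real Digg traffic certainly doesn't; the function name is made up.

```shell
# Average requests per second for a given hourly pageload count.
requests_per_second() {
  # $1 = pageloads per hour, $2 = requests ("chunks") per page
  awk -v p="$1" -v c="$2" 'BEGIN { printf "%.1f\n", p * c / 3600 }'
}

requests_per_second 2750 30   # prints 22.9
```

That works out to 82,500 requests in the hour, or roughly 23 per second on average, shared among at most 30 server processes.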
As far as Digg effects go, my experience was relatively mild compared to others. 2750 pageloads/hour is still the record for my site; previously the max I saw was 1600 pageloads/hour, which almost killed the shared host I was on. Of course, the inefficiencies in my WordPress setup (the Mint pepper DLoads, primarily) helped drag the entire server down. I'm starting to keep notes in a new area of the site; if you want a sneak peek, you can read about my experiences with WordPress and shared hosting. I'll be writing up my (dv) experience (and configuration) later.
On a side note, I've been fairly happy with (mt) customer service. They can take a couple days to get back to you via the request system (weekends are especially long), but the quality of support has not been bad. Everyone I've talked with, via email and phone, has been polite and respectful. Of course if you need something done right now or you're experiencing yet another (gs) outage, you probably have a different view of things.
That's it for now!
So here I am on the new server, a dedicated virtual server (dv) from Media Temple. I didn't realize that this (dv) platform is brand new, having launched at the very end of December 2006. Lucky me! My server problems could not have been timed better.
It's only been a few months since I moved to FutureQuest, so the whole How to Move a Working WordPress Installation procedure was relatively fresh in my mind. It worked out a little differently, so I'm documenting this process again.
Geeky notes follow!
The Basic Idea
Since I have a working installation on davidseah.com, I need to do a few things in this order:
- Buy new hosting from Media Temple, keeping the old host active so I can move files.
- Move my email mailboxes over to the new host
- Move my WordPress files and the MySQL database that powers it
- Move any other non-WordPress services that might exist on the old site.
- Change the official name servers for the davidseah.com domain to use the new ones
What is dedicated virtual hosting?
The dedicated virtual server (dv) from Media Temple is different from the usual shared hosting I was working with. For one thing, from your point of view you don't share the server with anyone else. Technically speaking, you're actually running a simulated dedicated server (that's why it's called "virtual"), on hardware that is shared with other virtual servers. The advantage is that you can make any changes you like to the operating system environment, including having full root access. You also gain the economy that comes from sharing hardware resources, with improved isolation from your neighbors' CPU- and memory-hogging hijinx.
On the down side, you need to know something about system administration. The (dv) version 3 package uses an enterprise-level Linux (CentOS) on Virtuozzo, which is controllable to some extent by the Plesk 8.1 control panel. With Plesk, you administer the server and can create clients, domains, and mail users. You can do some limited configuration of common services, but it does assume an understanding of how these services work. If words like daemon, mysqld, cron, xinetd, smtp and ssh mean nothing to you, then Plesk might not be all that easy to understand for ya.
The main advantage of Plesk, from my perspective, is that it allows you to manage your domains and hosting clients from within a pretty GUI that works. Plesk also provides a measure of stability on your server because all the software and operating system components have been tested and made to fit together; sort of like having managed hosting without the expense or the expertise. You can purchase additional "snapshots" of your server configuration, which is really really handy if you're the kind of person to mess around with things and need a way to undo the damage. Especially useful if you have a tweaked configuration.
Where Plesk falls short is lower-level configuration of the services it manages. If you want to change the setting of some internal MySQL or PHP variable, you'll have to get root level access to the server (which you can, since it's dedicated to you). Plesk will restart a service (like mail) for you, but that's about it.
Since the server is dedicated, you can install your own software on it. At Media Temple you need to request to have the developer tools installed first, then you can do things like install Ruby on Rails and compile from sources. Note, though, that if you update the "Plesk managed" parts of your server, you may no longer be eligible for the automatic updates that Media Temple will do for you through their Update Option Program.
Getting situated on the new (dv) 3
There was a problem with my order, and I didn't get my welcome emails that described what I was supposed to do. Normally with shared servers, you would receive an email that tells you the basics of how to move your files and how to set up email. I didn't get any of that, so I had to figure it out from scratch.
The good news: You use Plesk to set that all up. The bad news: you're on your own. There is a Plesk User Administration Manual I just found, which is probably worth reading. Until you set up your accounts to allow FTP and ssh into the system, you're sunk. The quick way to do this:
Set up your domain through Plesk. Then select the domain and click the SETUP icon, choose Account Preferences and enter the FTP Login name. This creates a user that you can use to FTP into the site. You can also optionally allow this user to SSH in, if you assign a shell using the dropdown menu.
FTP your files using the login name you created. Drop 'em in the httpdocs directory.
Create email mailboxes by clicking back on the domain you created (in my case, davidseah.com) and then clicking MAIL under the Services heading. This is where you can add your email boxes and configure aliases for each one. It's actually pretty nice. I used the exact same names and login credentials for my new mailboxes, so theoretically I won't have to change much in my email program setup.
I'm skipping a lot of strange first-time configuration here...you'll be forced to set up your domain and a default Client. Every domain (like davidseah.com) is tied to a Client (for me, I chose Dave Seah). I can create more domains and clients under my dedicated IP address (up to 30 with the basic license) and run a mini hosting business through Plesk, which is pretty cool.
Moving Your WordPress Files
I've got about 500MB of files on my current host that I've got to move, not including the WordPress MySQL database. I could download the files to my computer and re-upload them, but that tends to be slow, so I do a server-to-server FTP transfer. This requires shell access on each host.
Log in to both your new server and your old server via SSH. For the purposes of this description, the new server will be called newserver.com and the old one oldserver.com.
On the old server, use the tar command to compress your wordpress folder, for example tar cvzf wordpress.tar.gz wordpress/*, assuming that you're in the directory that contains your wordpress folder.
On the new server, type ftp oldserver.com into the shell and log in to your account. Make sure binary mode (bin) is set, then do a get wordpress.tar.gz. Since your hosts have a lot more bandwidth than your crappy home cable connection, this will be way faster.
On the new server, uncompress your archive with tar xvzf wordpress.tar.gz. The entire directory structure will be recreated.
There's actually more than this that you have to do usually, because there are probably other directories you'll want to move, including hidden files like the .htaccess file at the root of your installation. Move 'em all! Make sure you have enough disk space on your old server to create the tarfiles...they can get big, even with the compression.
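One thing worth knowing about hidden files: the shell's * doesn't match names that start with a dot, so the wordpress/* form of the tar command skips dotfiles sitting directly inside the wordpress folder, while naming the folder itself picks them up. Here's a small demonstration in a scratch directory. It's only a sketch with made-up placeholder files, and it doesn't touch a real site.

```shell
# Build a pretend "old server" and "new server" inside a temp directory.
work=$(mktemp -d)
mkdir -p "$work/old/wordpress" "$work/new"
echo 'RewriteEngine On' > "$work/old/wordpress/.htaccess"
echo '<?php // placeholder' > "$work/old/wordpress/index.php"

# Archive the directory by name so the dotfiles inside it come along.
( cd "$work/old" && tar czf wordpress.tar.gz wordpress )

# Unpack on the "new server" side and list everything, hidden files included.
( cd "$work/new" && tar xzf ../old/wordpress.tar.gz && ls -A wordpress )
```

The listing should show both .htaccess and index.php. Remove the scratch directory with rm -rf "$work" when you're done poking at it.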
If all this talk of tar and command-line FTP makes you ill, you could of course just re-upload your WordPress files from your home computer using FTP or Dreamweaver or whatever you're using. My cable modem upload speed is about 40K per second, versus the megabyte-per-second bandwidth when talking server-to-server.
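For a sense of what that speed difference means for 500MB of files, here's the arithmetic as a quick sketch. It assumes the quoted speeds are sustained the whole time and takes 1MB as 1024K; the function name is made up.

```shell
# Estimate transfer time in minutes.
transfer_minutes() {
  # $1 = size in MB, $2 = speed in KB per second (1 MB taken as 1024 KB)
  awk -v mb="$1" -v kbps="$2" 'BEGIN { printf "%.1f\n", mb * 1024 / kbps / 60 }'
}

transfer_minutes 500 40     # home cable upload: prints 213.3
transfer_minutes 500 1024   # server-to-server at about 1MB/s: prints 8.3
```

So the home upload is a three-and-a-half hour job, versus under ten minutes between servers.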
Moving your WordPress Database
The database lives in MySQL, the database engine that stores all of the posts and comments in my blog. This isn't stored as a regular file, so you have to use two command-line tools called mysqldump and mysql.
If you're using WP-Cache, disable it before doing the following steps.
Shut down your wordpress installation by renaming your wordpress directory or equivalent. You don't want people hitting your database while you're trying to dump out a copy of the data.
Dump the database on oldserver.com into a file called wordpressdb.sql.gz with the mysqldump command. You can get the values of db_name, db_host, db_user, and db_passwd out of your wordpress/wp-config.php file.
mysqldump db_name -hdb_host -udb_user -pdb_passwd -Q --opt > wordpressdb.sql
gzip wordpressdb.sql
FTP the wordpressdb.sql.gz file to newserver.com.
On newserver.com, make sure that you have a database set up with a database user and password. Usually you can find some tool that will do it for you, like PHPMyAdmin or a control panel of some kind.
There's a good chance that the new server will have different db_name, db_user, db_host, etc. values, so you'll have to update your wp-config.php file to use the new values. Do this now!
Assuming you've got the new database set up, it's time to re-import the database. Using the values you just generated when creating the new database and database user, do the following on the new server:
gunzip wordpressdb.sql.gz
mysql db_name -hdb_host -udb_user -pdb_passwd < wordpressdb.sql
If you're very lucky, everything will import cleanly, and you'll get no errors. I got a couple. The first was a "syntax error" that resulted because the MySQL 4.0.x installation on my old server didn't emit quotes properly, and some tables used reserved keywords for their field names (adding the -Q option to mysqldump fixed that). The second error was due to the max_allowed_packet value being set too small by default on the (dv). It defaults to 1 megabyte, but is usually higher on shared servers. I had to modify the /etc/my.cnf file and restart MySQL, which did the trick. You will need root access to edit the configuration file, so make sure you request it immediately when you buy your (dv). It took (mt) 3 days to respond to my request!!!
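If you'd like a hint about the packet-size problem before you hit it, you can measure the longest line in the dump and compare it against the limit. This is a rough check only: it assumes each SQL statement sits on a single line of the dump, which is how mysqldump usually writes its INSERT statements, and the function names are made up.

```shell
# Report the longest line of a file, in bytes.
longest_line_bytes() {
  # LC_ALL=C makes awk count bytes rather than multibyte characters
  LC_ALL=C awk '{ n = length($0); if (n > max) max = n } END { print max + 0 }' "$1"
}

# Succeed when the longest line is smaller than the given limit.
fits_in_packet() {
  # $1 = dump file, $2 = limit in bytes (1048576 is 1 megabyte)
  [ "$(longest_line_bytes "$1")" -lt "$2" ]
}

# Example: fits_in_packet wordpressdb.sql 1048576 || echo "raise max_allowed_packet first"
```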
You might have import problems too if you are downgrading from one version of MySQL to another, as I did the last time I moved. You might want to read the older post for some hints on how to use mysqldump to get around that.
Ok, you've just moved a copy of your WordPress database to the new server! Unfortunately, you've got to do a little surgery on it now, because your new server doesn't have a domain name yet. WordPress stores the domain name in your database, so you'll have to use a database tool like PHPMyAdmin to edit the wp_options table, specifically the siteurl entry, to point to your temporary server address. In my case, it's the numeric IP address of the new server, which is 64.13.223.31. The value of siteurl from my old site is http://davidseah.com/wordpress, so I need to change it to http://64.13.223.31/wordpress so I can test the site.
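I made this change through PHPMyAdmin, but the same edit can be written as a single SQL statement. Here's a sketch that just prints the statement so you can look it over first. It assumes the default wp_ table prefix and a URL with no quote characters in it, and the function name is made up.

```shell
# Print the SQL that points WordPress's siteurl option at a new address.
siteurl_sql() {
  # $1 = new site URL (assumed to contain no single quotes)
  printf "UPDATE wp_options SET option_value = '%s' WHERE option_name = 'siteurl';\n" "$1"
}

siteurl_sql 'http://64.13.223.31/wordpress'
# To apply it, pipe the output into the mysql client, for example:
#   siteurl_sql 'http://64.13.223.31/wordpress' | mysql db_name -hdb_host -udb_user -pdb_passwd
```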
At this point, I'm ready to test that WordPress has made it over. Visit http://64.13.223.31 in a web browser and cross your fingers. If you get a blank page, that might mean you have to update the WP Cache symbolic link in your wp-content directory (I always forget this). Check your PHP error logs to see what the problem is. You could be missing files from your transfer, or you might need to change permissions for certain folders. My old FutureQuest environment kicked ass, so I had to re-adjust to some of the restrictions in place on the new server.
If everything pops up, yay! You now need to change another option in WordPress temporarily. Log in as a WordPress administrator, go to OPTIONS->GENERAL and change the Blog Address to match the numeric IP.
Also, go back and re-enable WP-Cache. You might have to do some additional configuration based on what it complains about.
There's an additional checklist I follow, which is available in part 1. I have just finished running through it, so now I'm ready for the last step.
Do the Nameserver Switch
I control my domain name via a third-party registry, so the last step is to tell the world that the new home of davidseah.com is Media Temple. Media Temple's name servers are responsible for telling the world this now, so I update my domain registration to make them the "name servers of record." Some notes:
Some plugins, like ones that depend on the Flickr API, may not work until the domain name change switches over. At least I think that's what's going on.
It takes 24-48 hours for the entire world to see the switch. In the meantime, email will probably be going to both servers, so be sure to use webmail to check both.
After the new domain servers are stable, I'll switch the WordPress SiteURL options from the numeric IP address back to davidseah.com.
Here goes nothing...see ya on the flip side!
Pulling the switch!!!
UPDATE: I have started writing a guide to configuring the (dv) Base for WordPress to optimize performance. The notes are quite long, but if you're having problems running out of memory they might be helpful to you.
I had a problem with my Google Sitemap, which was not being recognized by Google because my "404 (file not found) error page returns a status of 200 (Success) in the header." So I dug around to fix my 404 page setup, which never really worked. Geeky notes follow, so I don't have to look this up again.
Setting up a Custom 404 Page
I had noticed some time ago that non-existent pages on my site which should have generated 404 pages were instead delivering "post not found" pages. This was right after I upgraded to WordPress 2.0 from 1.5, so I figured it was just some change to the way it worked.
As I was researching Google's 404 verification requirements and WordPress, I realized the cause was that my custom theme didn't have a custom 404.php page. So I added one, following the directions. Still no go on Google verification. I used a web page header display tool to check that the 404 was being sent. It worked, but then when I told Google to verify the site again, it failed. Weirdness.
Caching
After some digging, I tracked it down to WP-Cache 2.0.17, the plugin I use to reduce the load on my shared server. What happened: when an attempt to access a non-existent page occurred, the first time WordPress properly delivered a 404 page with the right headers set. However, this output was CACHED by WP-Cache, so the next time the bad page was requested, the cached error page was delivered! And of course, that's not a 404, but a successful delivery.
WP-Cache 2.0.19 fixes this by no longer caching 404 errors. Google Sitemaps verified my site, and everything seems to be working again.
Spiffing up the 404 Page
I came across the A Perfect 404 article as I was figuring out what was going on, and cleaned up my 404.php file to be friendlier. If the $_SERVER['HTTP_REFERER'] variable exists, the page emits it as part of the error message, and provides a link back. If it doesn't exist, it prints a more generic message. I was thinking of implementing a check of the referring link to customize the message to search engine traffic, but I'll leave that for another day. The A Perfect 404 article has some instructions if you're interested.
SECURITY UPDATE
In the comments, reader "epc" points out that printing out the value of Referer without some escaping is not a safe practice. I added a test that checked whether the referer value begins with http://davidseah.com or http://www.davidseah.com, and further escaped the output using the htmlspecialchars() function. I'm not sure what can really be done with the 404 page that might be dangerous, but thinking about issues like this is a good habit to get into. This article on Top 7 PHP Security Blunders was helpful in understanding some of the other issues. Thanks epc!
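The logic of that fix is simple enough to sketch out. Note that this is written as shell to show the idea rather than the PHP that actually lives in my 404.php. The function names and sample referer are made up, and the escaping covers only the four characters &, <, > and the double quote.

```shell
# Succeed only when the referer starts with one of the site's own prefixes.
is_own_referer() {
  case "$1" in
    http://davidseah.com*|http://www.davidseah.com*) return 0 ;;
    *) return 1 ;;
  esac
}

# Escape HTML-significant characters; & must be replaced first.
html_escape() {
  printf '%s\n' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' -e 's/"/\&quot;/g'
}

referer='http://davidseah.com/blog/?q="<script>"'
if is_own_referer "$referer"; then
  echo "You came from: $(html_escape "$referer")"
else
  echo "Sorry, that page could not be found."
fi
```

The order matters in both places: the prefix test decides whether the value gets printed at all, and the escaping makes sure whatever does get printed can't be read by the browser as markup.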
This is my first "real" paid review through the ReviewMe service. Today's topic: easy online remote backup! The product in review: Data Deposit Box from Acpana Business Systems Inc., a Canadian company based in Toronto.
Um, Where's my Files?
I'm pretty meticulous about backing up my data. When I was a happy-go-lucky kid in high school, I remember working all night on some paper or program, only to accidentally LOSE HUGE TRACTS OF IT because I forgot to save. Or I would accidentally save over the wrong file, destroying a critical fragment of my personal history. Small disasters of this kind would occasionally crop up even through grad school; I might be working on some Photoshop file on a System 8.x Mac, and then the entire computer would lock up for some unknown reason. Hours of work lost, much cursing and swearing ensued.
So I evolved. I have an automatic "save" habit that kicks in every time I pause to reflect. I automatically save new versions of files, with new filenames, using a versioning system. I archive and copy across multiple disks on multiple computers when possible, and burn to CD. I use version control software. I've gone to great lengths to separate the operating system, applications, and data onto physically separate drives; that way, if I have to restore my operating system due to some catastrophe, my DATA will not be erased in the process.
Frankly, I'm kind of a nut about data archiving and redundancy, but normal people have better things to do with their time. Occasionally, one of them will ask me how they too can not lose all their mail every time they upgrade to a new computer, or perhaps they've experienced for the Nth time some horrible loss of an important file due to a hardware failure. I tell them what I do, and their eyes glaze over. Most people find this to be a chore, but I actually like it, for some perverse reason. It's my version of gardening, I guess.
Which is why I found the prospect of reviewing Data Deposit Box interesting. It claims to be designed specifically for non-technical people, secure via its on-the-fly data encryption, and affordable at 2 bucks per gigabyte a month, paying only for what you use. While I don't have a particular need for it, I know lots of people would rather have a service take care of this for them. Let's take a quick look.
The Basics
Data Deposit Box (DDB) runs on Windows PCs, and installs as a program that monitors certain folders of your choice. Whenever a change is made to the contents of that folder (say, your "My Documents" folder), DDB detects that and then uploads the file to their encrypted server over the Internet. The cool thing is that once you tell DDB which folders to monitor, you don't have to do a darn thing except leave your computer on long enough so it can do its thing. If you're in the habit of turning off your computer every time you are not using it, then this program will probably not work for you. I leave my computer on overnight so it can run its daily virus check, so it works well with me.
Strapping In
Because DDB runs in the background, I was particularly concerned about how well behaved it is with respect to other applications. I'm constantly running big apps like Photoshop, Illustrator, Flash, and Dreamweaver with Excel, Thunderbird, Firefox, and Word open in the background. I am very sensitive to any program that gets slow and bogs the computer. There are several things I check for when installing a new system-level utility like this.
How big is the installer? Smaller is always better. In this case, the installer was about 3.83MB, which is fairly small. Compactness is often a sign of good and lean software engineering, though it's no guarantee. If I'd seen anything greater than maybe 7MB, I would be suspicious...an overzealous marketing department perhaps loading up the software with giant images and video files in an attempt to make their product more consumer friendly.
Create a System Restore Point. Windows XP has the ability to create a "snapshot" of your system before you install something. You can go to PROGRAMS -> ACCESSORIES -> SYSTEM TOOLS -> SYSTEM RESTORE to access the tool and create a restore point. After I'm done experimenting with this, I'll do a restore and rollback to the state my system was in before.
Turn on Performance Monitoring. This is a computer administration tool, available at CONTROL PANEL -> ADMINISTRATIVE TOOLS -> PERFORMANCE, that lets you monitor some of the inner processes of your computer. The things I was watching were % processor utilization, both overall and by the Data Deposit Box program itself, and also network bandwidth used. If DDB was a process hog, I'd see this plotted in real time on my monitor.
Turn on Process Monitoring. I just like to know what's going on, so I installed Winternals Process Monitor to watch as the program did its various things. ProcessMon tells you secret things about the computer, showing you what programs are doing at the operating system level. Perhaps I was being a little bit overzealous in my monitoring, but hey, I'm a nerd.
Then I ran the Installer, and braced myself.
The Particulars
The first thing the installer asks (after making you accept the End User License Agreement) is what folders to watch. By default it will monitor:
- My Documents
- Desktop
- Favorites
- Microsoft Outlook
- Microsoft Outlook Express
- Windows Address Book
That's a pretty good default list for most people, and you can modify it later if you wish. Since I keep all my data elsewhere, I unchecked the defaults and set the folders manually after installation.
DDB installs a task tray icon that allows you to pull up the main dialog. Here's some screen shots:
The main dialog box. The main options I used were OPTIONS (to select which folders to back up and how) and RESTORE (to check that it really worked).
Here are the settings for the program. You can see you have some options on the number of versions to save per file (you can have up to 21, though I was unable to determine how to access any particular one) and how aggressive to be in terms of bandwidth and CPU hogging.
This is where you set which folders to watch. When a file changes, DDB will start uploading the changes in the background using your Internet connection.
In case you're wondering, the Advanced tab allows you to set Proxy Internet settings (if you have a proxy server that sits between you and the Internet, like at work) and where to store temporary files.
Uploading Files
After I set the folders to watch (about 35MB of files), I did some reading while watching the uploads out of the corner of my eye. The first thing DDB does when it activates is do a version check of all the files under its care; this can take a while. Then, it uploads the changed files to the server. On my cable modem connection, my upstream rate (i.e.: the fastest I can upload) is about 50K a second, so 35MB takes a bit of time to sync up. This is the kind of thing you'd probably want to run overnight, or if it's running during the day you would probably want to set the Bandwidth/CPU Allocation on the Options dialog to less than 100%.
The files are all encrypted, though I'm not sure what key it uses to encrypt them.
Restoring Files
After you've uploaded the files, you can click on the RESTORE button and get a list of files you can recover back to your computer. This seemed to work. As I mentioned, I couldn't determine how to access the versions of a file. I edited a text file a few times to see how quickly DDB would pick up on the changes and upload it. I wanted to restore the very first version, but didn't see how to do it immediately. It might be a buried option somewhere.
Sharing Folders and Files
Once your files are uploaded, you can choose to share them with people. This is not a file sharing service for MP3s, mind you (this is expressly forbidden in the EULA), but if you have a client that you'd like to provide a file to, you can do that on either the folder or file level. You can do all this through the online interface.
This part of the experience, actually, was a little less robust than the rest of the application. When I deleted files, the tree view showing all my folders didn't update, so I was unsure if anything had actually happened. I had to force a page refresh to see that it had worked. A couple of times I saw a database connection error, which was less than reassuring. The interface could use a little smoothing out, I think, but it was otherwise pretty usable.
One other note: the file sharing URLs that the system generates are ludicrously long. They're prone to wrapping in an email message, which creates problems when distributing links to people.
The Experience
Because the ReviewMe terms allowed only 2 business days to write this, I can't tell you about the long-term stability of the program. However, I can tell you that the experience was not bad. I didn't experience processor issues or problems with the installation, and the program did seem to work as advertised. I didn't notice anything that would make me uninstall the program immediately, which is actually kind of unusual. Any of the following are grounds for banishment from my system:
- bloatedness relative to function
- sluggishness
- ugliness
- excessive marketing
- bad UI
- excessive bugs
- unclear function
- unclear feedback about operation
I didn't experience any of that, so I would say that the desktop component was surprisingly good. There are a few confusing UI spots, but it behaved well and seemed to do the job. The help button isn't context-sensitive, for example...that's a minor quibble. I was happy that my system didn't get bogged down with this running continuously. And, it's not ugly or filled with questionably-useful graphic imagery, and I didn't get cross-sold on "other wonderful products you might be interested in".
I was a little less impressed with the robustness of the Internet side of things. It looks like a solid "Web 1.0" application, but seeing an ODBC database connection error on the "My Data" page does not fill me with good cheer. The experience is almost there, and may be above average for services of this kind, but my daily web experience revolves around Web 2.0 apps like BaseCamp, and they are shockingly robust. The bar has been raised!
What about the Cost?
This isn't a free service, so you have to weigh the cost/ease of use between their service and just buying a big external hard drive.
Data Deposit Box Pros: Works in the background, and it behaves with the rest of the system (at least, in my limited experience). Don't have to think about it, and data is sharable with other people. Backup is also offsite, so if something happens to your office, your data is safely somewhere else.
Data Deposit Box Cons: Costs money, online interface could use some tweaks, takes a relatively long time to upload data compared to using a local hard drive.
There are some other uses I can think of immediately:
I could set something like this up for family members under a single account (there can be as many users as you want). No relative of mine would ever lose their email again.
If you think of the service as file hosting, $2/gigabyte isn't that bad compared to regular web hosting. Factor in the ease of backup and sharing, and it seems quite reasonable.
So while I can't make any long-term assessments on the product, my initial impression is quite favorable. Certainly worth a look. It lasted on my system way longer than Adobe's Version Cue software, which was so slow I thought my computer had crashed. All that process monitoring software didn't raise a single red flag in my brief session.
There's a 14-day free trial available. I think I'm using it now. That reminds me, it would be nice to see some kind of feedback about how the free trial works before you hand over your credit card number...that is offputting enough for me to not even want to try the service.
Again, this was a paid review booked through ReviewMe.
» Link: Data Deposit Box
I was curious why people were reporting having trouble accessing the Compact Calendar. Usually, my site access problems stem from PHP timing-out while WP-Cache is attempting to build the page, which means that nothing shows up. Usually I clear the cache, and the problem goes away.
This time, though, it was a user error on my part; apparently I'd set the Private flag on that post while making some update. WordPress then continues to show me the post because I'm logged-in to the system, but everyone else sees the message sorry no posts found. Because of the clicky nature of the post editing window, which took a small usability step backwards in 2.0, it's not difficult to accidentally click and set the post status to private without realizing it.
Anyway, the Compact Calendar is back online...I apologize for the trouble in accessing it. And for WordPress users, here's what I did to fix it so this doesn't happen again. WARNING! Geek talk follows!
Making the Invisible Visible
The problem is that I can't tell as a logged-in user that a post is marked private. This is entirely because my aging WordPress template isn't coded to display this information. It's been frankencoded out of bits of the old WordPress 1.1 template with MovableType's old CSS. So the solution is to add some conditional logic to check the following:
- Is a private post being displayed?
- If yes, then display something like "PRIVATE POST " somewhere visible on the page.
- If no, display normally.
The challenge is now to figure out how to modify my template to do this, which means that once again I get to dive into the WordPress API. Which is incompletely documented. The way that I've cobbled together my bits of functionality is by studying other plugins and templates discovered through diligent googling, using search terms that are descriptive of the problem ("wordpress show private posts") and are also a bit technical ("template is_private"). The "is_private" search term is based on a guess I made about what function WordPress might have to help me detect if a particular post is private or not; there are several other 'is_this' and 'is_that' functions that are built-in to the system. Read on.
Checking for a Private Post
There are a number of special functions that you can use in your WordPress Loop to conditionally display information. The Loop is the PHP code that actually pulls your articles one-by-one from the database and displays them in vivid HTML; if you want to modify the appearance of a post based on its category, you can use the is_category() function to check whether it is in a certain one.
There's a whole bunch of such functions available, and I've never found a complete and comprehensive list of them in one place, until I stumbled upon PHPXRef. There's a complete WordPress function list available (click the functions tab at the top of the screen) that lists everything that I need to see for the latest version of WordPress. Alas, there is no additional documentation other than the list, but since WordPress is coded with fairly clear function and variable naming conventions, a programmer with direct experience with the application can scan the list and get an idea of how the application operates just by seeing how elements are named. It's not unlike being able to walk into an office and read, just by looking at the arrangement and details of the space, how things get done (or not).
The specific function I was looking for was something like is_private(), but SURPRISE...there IS no such function! Fortunately, someone else already figured this out and posted a solution. The code just queries the WordPress post directly and checks the post_status flag. I didn't know how the WordPress posts were organized, so this saved me a lot of digging. I took the function and dropped it into my template directory's functions.php file. This makes the function available for use in my template (and ties it directly to my template, which makes it more portable from installation to installation).
Now I can make the changes to my template. There's a line of code in it that looks something like this:
<?php
if ( is_private() )
echo "<span style='color:red'>PRIVATE POST</span>";
?>
Filed under <?php the_category(', ') ?>
The conditional IF statement will prepend PRIVATE POST (in red letters) in front of the normal "Filed under [category]" bit of text, but only if it's a private post. For regular users, this will never happen. For ME, I should see the warning signs pretty clearly next time I bungle the privacy setting on a post.
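For reference, here's a minimal sketch of what such an is_private() helper might look like in functions.php. This is not the exact code from the solution I found. It assumes WordPress has filled in the global $post inside the Loop and that private posts carry a post_status of 'private'. The optional $p argument is an addition of this sketch, so the check can be tried outside WordPress.

```php
<?php
// Hypothetical sketch of an is_private() helper for a theme's functions.php.
// Assumptions: inside the Loop, WordPress sets the global $post, and a
// private post has post_status == 'private'.
if ( ! function_exists( 'is_private' ) ) {
    function is_private( $p = null ) {
        global $post;

        // Fall back to the current post in the Loop when no post is passed in.
        if ( $p === null ) {
            $p = isset( $post ) ? $post : null;
        }

        return is_object( $p )
            && isset( $p->post_status )
            && $p->post_status === 'private';
    }
}
```

Called with no arguments inside the Loop, this works the same way as the is_private() check in the template snippet above.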
Going Further
There are 30-something other simple is_something functions that you can explore at PHPXRef, along with everything else. Unfortunately it's not exactly documentation. I find it useful for just seeing the "big picture" of function availability. If I want to know what the function does, I have to click on the function name, then look at the source code to divine what it actually does. It's not a bad way to learn PHP though, by modifying a piece of software you use every day.
Here's an example of exploring the reference: I was scanning the list and the is_preview() function caught my eye. So I clicked it, then clicked on the defined in line # link on the next page, and this took me to the place in the WordPress source code that defined it. You can follow around such links until you find some comments that tell you what the hell it's supposed to do; in this case, the pertinent comment in wp-includes/classes.php was if we're previewing inside the write screen. Now I know what this function does.
There's a lot of tracing lines of logic and following declarations, so this technique won't work for the novice programmer who's just learning how to code. For more experienced programmers, it's a pretty quick way to explore the code without a heavy-duty integrated development environment (IDE) capable of cross-referencing functions, variables and classes. Doing this on the web is actually a little more convenient for me, since I don't have to worry about messing up my WP source files by accident.
Another cool thing: the PHPXRef site also covers other blogging and CMS systems, so this could be a very useful resource for understanding the architecture of those systems using a "ground-up" approach...which is really your only option when the official documentation fails to provide that picture for you.
So that's that.
First Post, The Second Time Around
Ok, I'm about to embark on my server moving adventure! I'm moving my entire davidseah.com domain from pair.com to futurequest.net, and I may do it yet again with MediaTemple in a few weeks to try them out. Here are my live move notes, which should come in handy later.
Moving the Database
Here's what I did to move the blog:
Disabled Mint stats package on all pages so it doesn't access the database. I'm not going to be using it on the new server, will stick with Google Analytics for a while and see how that goes.
Changed Basecamp's File Upload to use a different server. There are still some problems with old links though, so I'll have to add a redirect later.
Moved all important files over (tar gz'd, ftp server-to-server). Had to nuke some, because the disk space limits on FutureQuest are pretty limiting (another reason to consider MediaTemple).
Recreated mailboxes on the new server, including the dozens of forwarding rules I use.
Locked out comments on the old server by modifying wp-comments-post.php to die() with a "sorry, posting comments is disabled" message.
Used mysqldump wordpress_db --opt > backup.sql to create a backup file. I had to use the --compatible=mysql40 switch, because the version of MySQL on FutureQuest is an older one.
Created database on the new server for WordPress, to restore the backup into.
Used mysql wordpress_db < backup.sql to import the data. No errors! Yay!
Used PHPMyAdmin, which I had installed earlier, to drop the tables I didn't need from wordpress_db. These were mostly old stats from Mint and SpamKarma. Also ran REPAIR and OPTIMIZE just for the hell of it.
Edited the wp_options table in the wordpress_db, specifically the siteurl. I pointed it to the temporary URL for the new server. Then I went to my site's admin panel and set the blogurl in general options to the same. Why do this? If I don't, then clicking on internal links sends me back to the OLD blog, which makes it difficult to test this one. Eventually I will change them back to point to davidseah.com.
I had to restore the auto_increment and eliminate spurious default values of '0' for each table's id, otherwise I got posting errors like Duplicate entry '0' for key 1. There's a bug in mysqldump where it does not preserve auto_increment when the --compatible=mysql40 flag is set. All of the wp_ tables needed the default '0' removed, and the 'auto_increment' flag set. I used phpMyAdmin to reset the value.
Deactivated / Activated Plugins, just for the heck of it.
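The auto_increment repair above was done by hand in phpMyAdmin. The same change could probably be expressed as ALTER TABLE statements. Here's a rough sketch that only builds the SQL text, so it can be reviewed before anything is run. The helper name, the example table/column pairs, and the BIGINT(20) column type are all assumptions; the real column definitions should be copied from each table (SHOW CREATE TABLE will show them), and the database should be backed up first.

```php
<?php
// Hypothetical helper: builds (does not run) the SQL that re-adds
// AUTO_INCREMENT to a table's id column. MODIFY restates the whole column
// definition, so leaving out DEFAULT also drops the stray default of '0'.
// Assumption: the id column is a BIGINT(20) that is already the primary key.
function build_auto_increment_fix( $table, $id_column ) {
    return sprintf(
        'ALTER TABLE `%s` MODIFY `%s` BIGINT(20) NOT NULL AUTO_INCREMENT;',
        $table,
        $id_column
    );
}

// Example table/column pairs only; adjust these to match the actual schema.
$tables = array(
    'wp_posts'    => 'ID',
    'wp_comments' => 'comment_ID',
);

$statements = array();
foreach ( $tables as $table => $id_column ) {
    $statements[] = build_auto_increment_fix( $table, $id_column );
}

// Print the statements for review.
echo implode( "\n", $statements ), "\n";
```

Generating the statements rather than executing them keeps the sketch harmless: the output can be pasted into phpMyAdmin's SQL tab once the column types have been checked.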
Testing the Site
I disabled wp-cache and clicked through the following on the new server:
- Checked blog categories one-by-one
- Checked the "About Dave" links on the sidebar
- Checked all the .htaccess redirects I have, to make sure they worked
- Changed absolute links from http://davidseah.com/archives/... to just /archives/...
- Checked FAlbum links and htaccess redirects
- Checked comment links from the sidebar
- Checked link to the New Media Group posts
- Checked public RSS feed links
- Checked private RSS feed link for Feedburner
- Checked Google SiteMap Plugin
- Checked robots.txt
- Checked The Printable CEO download links
- Checked links in the footer
- Checked services that might be referring to graphics on davidseah.com (forums, etc)
- Posted a comment...works, after fixing the auto_increment and default value issues I mentioned above.
- Posted this post.
- Scanned the directory listing on the old server to see if anything else jogs my memory...nothing that has to happen right away.
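For the absolute-to-relative link change in the checklist above, a simple string replacement is one way it could be done. This is only a sketch: the function name and the sample link are made up for illustration, and it assumes the links appear verbatim with the davidseah.com base URL.

```php
<?php
// Hypothetical helper: turns absolute archive links into root-relative ones.
// Assumption: the links appear verbatim with the given base URL.
function make_archive_links_relative( $content, $base = 'http://davidseah.com' ) {
    return str_replace( $base . '/archives/', '/archives/', $content );
}

// Made-up sample link, for illustration only.
$sample = '<a href="http://davidseah.com/archives/example">post</a>';
echo make_archive_links_relative( $sample ), "\n";
```

Root-relative links resolve against whatever host serves the page, so they keep working on the temporary URL and after the nameserver switch.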
The only thing left to do is switch nameservers and cross my fingers.
I'm going to be moving davidseah.com to a new server starting Friday (October 20). I'm currently using Pair Networks, which has been great. I'm moving to FutureQuest, which has as gold-plated a reputation as Pair. The reason I'm moving is primarily because I'm starting to exceed the "maximum transactions per minute" limit allowed by Pair's database servers during peak times. I'm also curious about FutureQuest, which I've heard good things about, and how the heck one moves a WordPress blog in the first place.
My checklist looks like this:
- Recreate mailboxes and forwarding aliases on the new server.
- Copy files over to new server via server-to-server FTP
- Update dependent services (Basecamp's FTP settings).
- Lock comments on the old server blog
- Clone the WordPress database (which I've already tested last month)
- Switch over the DNS name servers from Pair to FutureQuest, and wait for the new DNS information to propagate over the weekend.
- Cross my fingers
My email will also be up and down as the nameservers sort themselves out on Saturday and Sunday.
RANDOMLY THINKING...
Hm, that MediaTemple Grid-Server package is looking kinda tasty too.
I've been noticing an increase in comment spam yet again. While Bad Behavior and Akismet are doing a good job of keeping the comment spam out of the blog, I'm also starting to hit some server performance issues. I'm not quite ready to move yet, so I thought I'd try an easy trick to see if I could alleviate server load due to heavy spamming activity.
Geeky notes follow.
Bad Behavior and Akismet Revisited
Someone asked me why I was using two spam plugins...didn't they do the same thing? So here's why:
Bad Behavior 2 is the first line of defense. From the Bad Behavior website:
Bad Behavior was designed and built by watching actual spambots which harvested email addresses, posted comment spam, and used fake referrers. By logging their entire HTTP requests and comparing them to HTTP requests of legitimate users, it is possible to detect most spambots.
In other words, it actually acts as a screening program to prevent suspicious software from "seeing" your website. It does this by doing a bit of behavioral profiling; software that does not play nice is not allowed any further. This means that the spambots can't even see the comment form, which means they can't leave spam. Legitimate users with a web browser, though, behave nicely, so they are allowed to see the web page in most cases. Advanced users who have tricked out their Internet to be anonymous, for example, fall into the "spooky" category and probably get bounced.
Despite all the work that BB2 does, some spambots still get through. This is when Akismet comes into play. This plugin analyzes the comment text itself and determines whether it's spammy or not. Bad Behavior, by comparison, just looks at how the entire website is accessed, not the content. If you were to think of Bad Behavior as law enforcement, it's like an officer who arrests anyone that even looks like they might commit a crime. Akismet, however, waits for the act to be committed, then sees if it's criminal by consulting its database of known acts of spam; these are collected and analyzed by the network of Akismet plugin users.
So the combination of Bad Behavior 2, which vigorously beats off the majority of spam attempts, makes for a lighter workload on the server AND for Akismet, which does cleanup on a case-by-case basis. Since installing this combination of plugins several months ago, I've seen maybe 2 or 3 actual spam comments get to the moderation queue where I can give the final OK. A few legitimate comments have been caught, but even that has been pretty rare (maybe 5 or 6).
Nevertheless, I'm still experiencing some server sluggishness, so I thought maybe I'd tweak my setup.
Security through Obscurity
The phrase security through obscurity usually has negative connotations: any security system that relies on secrecy alone is pretty half-assed. You can't rely on secrets to keep things safe; you also need active countermeasures. It occurred to me, though, that in this case security-through-obscurity thinking was appropriate.
Spambots aren't personal; they just blindly go out and try to access the comment posting mechanisms on blogs based on the default settings. For example, the file that does the actual posting of comments to WordPress is wp-comments-post.php, which lives in a certain directory on every installation. A spambot knows this; it's just like an opportunistic thief trying the doorknob on your car or house to see if it's unlocked. It's just the obvious thing to try. It's easy enough to do quickly, and the payoff when it works is enough to justify the effort. It just takes one payoff.
Now, knowing this, imagine that you've hidden your doorknob. Or moved the door behind something. A thief who's in a hurry will just walk right on by your house...there are always easier targets. Spambots are similar; they need to make thousands of posts to be able to turn any kind of profit for their overlords, and there just isn't time to spend cracking your particular blog. There are lots of default installations of WordPress out there, and those are the ones that are being targeted. Spammers aren't out to make a personal connection; they need thousands and thousands of links to make their dark search engine optimization tricks work for dozens of sites. That means they will target high-yield scenarios like "default installations" of blogging software. They will also target sites that are protected by solutions like Akismet and Bad Behavior, if they've figured out a way to get around the countermeasures. Even in this case, however, they're not targeting your specific blog, but a particular plugin.
Filename Switcheroo
If you measure security by not allowing any accesses to your valuables, then security through obscurity is clearly not enough. A determined attacker will find your secret. However, since Spambots aren't determined attackers, and our goal is merely reduced violations, applying a little secret sauce is entirely appropriate.
I found Yami McMoot's write-up on renaming script files on the WordPress Codex, and have changed my comment and trackback files to a random name. Spambots that try to access the default pages will get served a blank page; they'll just appear broken to the spambot, which will move on its merry way to the next target.
Now, this won't stop an attacker who's actually using a browser to access and enter comment spam, and I imagine that there are attempts right now to create an automated browser-based spambot that can automatically find and navigate to the comment box. Spambots that parse HTML would require active countermeasures like Bad Behavior, or perhaps would have to be met with an entirely different comment posting mechanism (Flash, anyone?).
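Picking the random name is the only part of the switcheroo that needs any thought. Here's a small sketch of one way to generate a hard-to-guess filename. It is not taken from the Codex write-up; the helper name and the "comments" prefix are made up, and random_bytes() requires PHP 7 or later.

```php
<?php
// Hypothetical helper: produces a hard-to-guess script name such as
// "comments-3f9a1c0b7d2e4a68.php". Requires PHP 7+ for random_bytes().
function random_script_name( $prefix ) {
    // 8 random bytes become 16 lowercase hex characters.
    return $prefix . '-' . bin2hex( random_bytes( 8 ) ) . '.php';
}

$new_name = random_script_name( 'comments' );
echo $new_name, "\n";

// The comment form's action attribute in the theme would then need to point
// at $new_name instead of the default script.
```

The name only has to be unguessable by a script that tries defaults; regular visitors never see it, because their browser just follows whatever the form points to.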
Summing Up
So I'm hopeful that this will reduce the amount of incoming spam by a bit more. The anti-spam fortifications in place now look like this:
1. Secret Entrance -- For automated spam scripts to access the comment posting mechanisms directly, they'll have to guess what random name I've chosen for these two files. The OLD files are there, but they're blank and will not invoke Bad Behavior. This procedural change is invisible to "regular" users.
2. Bad Behavior -- All visitors, normal people and spambots alike, are subjected to the screening. Spammy behavior gets you booted before you even reach the blog. Everyone else is allowed to see the website.
3. Akismet -- Any comments that are submitted will have been made by people (or spambots) that have made it through the two lines of defense. Akismet now looks at what's been written; spam is marked as such.
4. Moderation -- Any legitimate comment that's made it through steps 1 through 3 is held in moderation until I personally read and approve it. On approval, the comment is posted to the blog.
In my WordPress dashboard, Akismet shows me how much spam it's blocked since it was installed. I added a little hack to Bad Behavior 2 in bad-behavior-wordpress.php to display a similar message:
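// Print Bad Behavior's blocked-access count in the dashboard activity box.
function bb2_dashboard_stats() {
    bb2_insert_stats(true); // true = force the stats line to be printed
}
// activity_box_end is the dashboard hook Akismet uses for its own stats line.
add_action('activity_box_end', 'bb2_dashboard_stats');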
I'm not actually sure if adding countermeasure #1 will make a huge difference; I suspect my laggy server performance is just due to me starting to outgrow it. However, if there are a significant number of spambots just hammering the server that Bad Behavior has to deal with, it might add up. Bad Behavior 2 reports about 150-200 blocked posting attempts a day, and Akismet blocks maybe 20-30 that get through.
We'll see what happens.