Sick of those massive Apache mod_rewrite rules that WordPress generates to give you user-friendly URLs? Annoyed that you need to regenerate all these rules and save them into your already-crowded .htaccess file whenever a “page” is created or renamed in WP? Try the following rules, and they’ll fix your mod_rewrite woes once and for all.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ /index.php/$1 [L,QSA]
</IfModule>
While I was planning to write a plugin to achieve a similar goal, I discovered that WP already parses PATH_INFO according to the permalink rules (at the beginning of wp-blog-header.php). This solution is very neat, and it seems to handle all the cases I have thrown at it (tested briefly on a fresh WP 1.5.1.2 install). I wonder why WP does not generate this rule by default, when all the code is already in place?
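For anyone wondering what each line does, here is the same block again with comments. This is only my reading of the rules, and the /about/ request is a made-up example, not something taken from a real install.
<IfModule mod_rewrite.c>
# Apply these rules only if mod_rewrite is loaded.
RewriteEngine On
RewriteBase /
# Leave requests for existing files alone (images, CSS, index.php itself).
RewriteCond %{REQUEST_FILENAME} !-f
# Leave requests for existing directories alone (for example /wp-admin/).
RewriteCond %{REQUEST_FILENAME} !-d
# Anything else, say a hypothetical /about/, is handed to /index.php/about/,
# so WP sees "/about/" in PATH_INFO. QSA keeps the original query string,
# and L stops processing any further rewrite rules.
RewriteRule ^(.+)$ /index.php/$1 [L,QSA]
</IfModule>
Since nothing in the block depends on your permalink structure, it should not need regenerating when pages are added or renamed.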
Update: A bit of searching on the Internet reveals that Ryan Boren already announced this alternative rewrite rule on his blog last October. Oh man. I am slow.
With regard to our emails, Scott, I was thinking you should throw Apache Bench at your server with and without your new mod_rewrite rules and see how it performs. I’d be very interested to see whether it does in fact perform better or worse. If it is faster, there would be no good reason not to include it in the core as well (pending Matt/Ryan coming up with a reason for using the long method, of course).
Al.
Totally unscientific experiment with ApacheBench 2.0.41-dev. Single thread. 100 requests. Lightly loaded Gentoo Linux server running Apache 2.0.54.
With old mega-complicated mod_rewrite rules: 601.521 ms / request.
With new simplified rules: 621.703 ms / request.
So parsing the mod_rewrite rules in Apache is indeed faster than doing the same in PHP (if the result is accurate). Not by much though.
I haven’t looked into how WP parses the ruleset for the clean URLs. However, I wonder if it’d be worth going through the code and attempting to clean it up and make it leaner. That might bring it back on par with doing it through mod_rewrite?
Al.
My investigation shows that Ryan provided the code for PATH_INFO parsing in Feb 2004. Here’s the CVS diff on Sourceforge:
http://cvs.sourceforge.net/viewcvs.py/cafelog/wordpress/wp-blog-header.php?r1=1.24&r2=1.25
That code has, however, been patched quite a few times according to CVS. I have not set up SVN yet to see whether more editing has taken place in WP 1.5. I suspect the reason for not using this code is that it might still be buggy. I’ll drop Ryan an email on this after he comes back from Alaska.
One thing I came across is that WP ignores any PATH_INFO it does not recognize, so my custom 404 handler, which tries to search for the data, never gets processed.
WordPress still sends back a 404; it is just that Apache’s ErrorDocument does not catch it and redirect to its own handler. However, it is also possible to write your own 404 handler in WordPress by creating a 404.php in your theme directory. You can then add your search script there.
Gack! I remember when Ryan B. put this up on his site, and I replaced all the lines from WP with it. I had the same result today as I did then: “Input File Not Found” page for any internal links. Last time round I didn’t bother to ask the question… so I’m doing it now… Any clue/hint how to fix this issue?
Scott,
Thanks for posting this. I have used it to try to fix a few things, but I still cannot eliminate the index.php version of my blog. I have added to it to fix www versus non-www issues, but I still keep getting the index.php if I actually type it into the browser window.
My .htaccess is as follows:
# BEGIN WordPress
Options +FollowSymlinks
RewriteEngine On
RewriteBase /blog
#
# Redirect to canonical domain
RewriteCond %{HTTP_HOST} !^www\.cadwebsitedesign.com [NC]
RewriteRule ^(.*) http://www.cadwebsitedesign.com/blog/$1 [R=301,L]
#
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ blog/index.php/$1 [L,QSA]
# END WordPress
But if you navigate to cadwebsitedesign.com/blog/index.php, you will see that it is not redirecting to the preferred cadwebsitedesign.com/blog/.
Any suggestions? Is there somewhere else I must make changes for this to be accomplished? I have noticed that your blog front page removes the index.php if I type it in at the browser window…
Where do I put this code?!? Copy and paste it into .htaccess? For great justice, help me!