ForceType for nice URLs with PHP

This has been covered before, but I was just setting up a new ForceType
on our servers and thought I would mention it for the fun of it. You see
lots of stuff about using mod_rewrite to make friendly or SEO-friendly
URLs. But if you are using PHP (and I guess other Apache modules), you
can do it without mod_rewrite. We have been doing this for a while at
dealnews, even before SEO was an issue.

Setting up Apache

From the docs,
the ForceType directive “forces all matching files to be served as the
content type given by media type.” Here is an example configuration:

<Location /deals>
    ForceType application/x-httpd-php
</Location>

Now any URL like
http://dealnews.com/deals/Cubicle-Warfare/186443.html will attempt to
run a file called deals that is in your document root.

Making the script

First, save a file called deals without the .php extension. Modern
editors will look for the <?php tag at the top of the file and highlight
it correctly. Normally you take input to your PHP scripts from
$_SERVER["QUERY_STRING"] or the $_GET variables. But in this case,
those are not filled by the URL above. They will still be filled if
there is a query string, but the path part is not included. We need to
use $_SERVER["PATH_INFO"]. In the case above, $_SERVER["PATH_INFO"]
will be filled with /Cubicle-Warfare/186443.html. So, you will have to
parse the data yourself. In my case, all I need is the numeric ID
toward the end.

$id = (int)basename($_SERVER["PATH_INFO"]);

Here basename() leaves 186443.html, and the (int) cast keeps only the leading digits. Now I have an id that I can use to query a database or whatever to get my content.
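If you want to be a little stricter about what counts as an ID, something like the sketch below might do. The function name deal_id_from_path_info and the "digits followed by .html" pattern are assumptions based on the example URL above, not anything Apache or PHP gives you.

```php
<?php
// Hypothetical helper: pull the numeric ID out of a PATH_INFO value
// such as "/Cubicle-Warfare/186443.html". Returns null when the last
// path segment is not "<digits>.html".
function deal_id_from_path_info(?string $path_info): ?int
{
    if ($path_info === null || $path_info === '') {
        return null;
    }

    // basename() keeps only the last path segment, e.g. "186443.html".
    if (!preg_match('/^([0-9]+)\.html\z/', basename($path_info), $m)) {
        return null;
    }

    return (int) $m[1];
}

// PATH_INFO is not set when the URL has nothing after /deals.
$id = deal_id_from_path_info($_SERVER['PATH_INFO'] ?? null);
```

Unlike the bare cast, this returns null instead of 0 when the last segment has no leading digits, which makes the "not a valid URL" case easier to spot.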

Avoid “duplicate content”

The bad part of my use case is that any URL that starts with /deals/
and ends in 186443.html will work. So, now we have duplicate content on
our site. You may have a more exact URL pattern and not have this
issue. But, to work around this in my case, we should verify that
$_SERVER["PATH_INFO"] is the proper data for the content requested.
This code will vary depending on your URLs. In my code, I generate the
URL for the content and see if it matches. Kind of a reverse lookup on
the URI. If it does not match, I issue a 301 redirect to the proper
location.



header("HTTP/1.1 301 Moved Permanently");
header("Location: $new_url");
exit();
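The reverse lookup could be sketched roughly like this. Both function names and the slug rule (runs of non-alphanumeric characters become a single hyphen) are made up for illustration; your own URL generator will have its own rules.

```php
<?php
// Hypothetical URL builder: turns a title and ID into the one path
// we consider correct for that content. The slug rule is an assumption.
function canonical_deal_path(string $title, int $id): string
{
    // Collapse every run of non-alphanumeric characters into one hyphen.
    $slug = trim(preg_replace('/[^A-Za-z0-9]+/', '-', $title), '-');

    return "/deals/$slug/$id.html";
}

// Returns the path to redirect to, or null when the requested
// PATH_INFO already matches the canonical path.
function redirect_target(string $path_info, string $title, int $id): ?string
{
    $canonical = canonical_deal_path($title, $id);
    $requested = '/deals' . $path_info;

    return $requested === $canonical ? null : $canonical;
}
```

In the deals script, a non-null return value would be what goes into $new_url for the Location header; a null means the request is already at the right address and you can just render the page.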

Returning 404

Now, you have to be careful to always return meaningful data when
using this technique. Search engines won’t like you if you return
status 200 for every possible random URL that falls under /deals. I
know that Yahoo! will put random things on your URLs to see if you are
doing the right thing. So, if you get your id and decide this is not a
valid URL, you can return a 404. In my case, I have a 404 file in my
document root. So, I just send the proper headers and include my
regular 404 page.

header('HTTP/1.1 404 Not Found');
header('Status: 404 Not Found');
include $_SERVER["DOCUMENT_ROOT"]."/404.html";
exit();
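To make the "is this a valid URL" decision concrete, here is one way it might be sketched. The name resolve_status and the $find_deal callable are hypothetical; $find_deal stands in for whatever database lookup you use and is assumed to return null for an unknown ID.

```php
<?php
// Hypothetical: decide between 200 and 404 for a PATH_INFO value.
// $find_deal is any callable that returns the content for a known ID
// or null when there is no such content.
function resolve_status(?string $path_info, callable $find_deal): int
{
    // Only the last path segment matters, e.g. "186443.html".
    $last = basename((string) $path_info);

    if (!preg_match('/^([0-9]+)\.html\z/', $last, $m)) {
        return 404; // no numeric ID in the URL at all
    }

    return $find_deal((int) $m[1]) === null ? 404 : 200;
}

// Usage inside the deals script (commented out so the sketch runs anywhere):
// if (resolve_status($_SERVER['PATH_INFO'] ?? null, $find_deal) === 404) {
//     http_response_code(404);
//     include $_SERVER['DOCUMENT_ROOT'] . '/404.html';
//     exit();
// }
```

The commented usage swaps in http_response_code(), available since PHP 5.4, for the two hand-written header lines; either way works.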


