The amazing adventures of Doug Hughes

A week or so ago I was asked by a reader of my blog to explain how I handle search engine safe URLs on my websites. The process I use is quite simple and has remained more or less unchanged over the last few years.

As you can see on this site, my URLs tend to look like this:

http://www.doughughes.net/index.cfm/page-blogLink/entryId-44

The first part of the URL looks just like any other URL. We all know what this does. If not, don’t bother reading any further. However, after the index.cfm things begin to look a little different:

/page-blogLink/entryId-44

What I’ve done is replace the “?” and “&” characters with forward slashes. Equal signs are changed to hyphens. In other words, the URL above carries the same information as index.cfm?page=blogLink&entryId=44. This creates a list of name-value pairs which we can easily parse. Everything before the first hyphen in each pair is the variable name. Everything after the first hyphen is the value of the variable.

I’ve seen other solutions which have a format where every other slash indicates a variable and value. This style of URL might look like this:

http://www.somedomain.net/index.cfm/page/blogLink/entryId/44

I’m not a fan of this because it’s a little hard to read and seems like it could cause errors if, for whatever reason, a variable doesn’t have a value. In my example, if the page variable didn’t have a value it would simply look like this:

http://www.somedomain.net/index.cfm/page-/entryId-44

What would the other style look like? How would ColdFusion know what to do? It’s my opinion that my way is a little nicer and more reliable. It’s up to you how you choose to do it.

Another cool thing about these URLs is that, if need be, you can tack on additional URL variables in the traditional format like this:

http://www.somedomain.net/index.cfm/page-/entryId-44?anotherVar=anotherVal

I’ve needed this capability in the past and it’s come in quite handy.

If you create any ColdFusion page, request it with a URL formatted the way I described above, and dump the CGI.PATH_INFO variable, you will see something similar to this:

/index.cfm/page-blogLink/entryId-44

As a note, I’m not sure if using CGI variables is the only option here. It’s the only one I know of. The biggest problem with them is that they’re not consistent between platforms and web servers. For instance, on Apache on Linux CGI.PATH_INFO looks like this:

/page-blogLink/entryId-44

On Windows and IIS it looks like this:

/index.cfm/page-blogLink/entryId-44

You may want to use other CGI variables to determine which parts of PATH_INFO contain the variables you want and which parts don’t. For instance, on Windows and IIS there’s a variable CGI.SCRIPT_NAME which holds only the path to the file:

/index.cfm

When I moved this site from Apache to IIS I was a bit confused because on Windows I had an extra variable named “URL.index.cfm” being set to nothing. How odd. A little debugging solved the problem: the leading /index.cfm in PATH_INFO was being parsed as a variable name with no value.
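One way to cope with both behaviours is to strip CGI.SCRIPT_NAME from the front of CGI.PATH_INFO only when it is actually there. The following is a minimal sketch, assuming the two server behaviours described above are the only ones you need to handle. The variable name sesPath is made up for this illustration.

```cfm
<!--- Start with the raw value. On Apache/Linux this may already be just the variables. --->
<cfset sesPath = CGI.PATH_INFO />

<!--- On Windows/IIS the value begins with the script path, so strip it if present. --->
<cfif Len(CGI.SCRIPT_NAME) AND Left(sesPath, Len(CGI.SCRIPT_NAME)) EQ CGI.SCRIPT_NAME>
    <cfset sesPath = RemoveChars(sesPath, 1, Len(CGI.SCRIPT_NAME)) />
</cfif>

<!--- Drop the leading slash so the list starts with the first name-value pair. --->
<cfif Left(sesPath, 1) EQ "/">
    <cfset sesPath = RemoveChars(sesPath, 1, 1) />
</cfif>
```

For the example URL, both of the PATH_INFO values shown above should end up as page-blogLink/entryId-44, so the stray index.cfm never reaches the parsing step.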

Once you’ve isolated the portion of the URL which contains the variable names and values, parsing them out is quite simple. All I do is loop over the list of name-value pairs, split each pair apart, and set a URL variable to the value provided.

Here’s a complete example:

<!---
Make sure that the CGI.PATH_INFO var is longer than CGI.SCRIPT_NAME + 1.
If not, then we don't have any url variables.
I add one to the length of CGI.SCRIPT_NAME because of the / after the
file path. IE: "/index.cfm/"
--->
<cfif Len(CGI.PATH_INFO) GT Len(CGI.SCRIPT_NAME) + 1>
    <!--- we have SES URL vars --->
    <cfset urlString=Right(CGI.PATH_INFO, Len(CGI.PATH_INFO) - Len(CGI.SCRIPT_NAME) - 1)/>
    <!---
    urlString is now a list of name-value pairs (separated by arguments.seperator).
    Loop over the list and extract them.
    --->
    <cfloop delimiters="#arguments.seperator#" index="varAndVal" list="#urlString#">
        <!--- grab the variable name and value --->
        <cfset varName=ListFirst(varAndVal, arguments.equal)/>
        <cfset varValue=ListDeleteAt(varAndVal, 1, arguments.equal)/>
        <!--- set the url variable --->
        <cfset "URL.#varName#"=varValue/>
    </cfloop>
</cfif>

I’ve grouped all of the code above into a CFC which can be downloaded from the attachments section below.

The CFC provides a method parseURL which accepts two arguments, the variable separator and the equals sign. These default to my preferences of “/” and “-” respectively. This allows you to change them to be whatever you want. This could easily be turned into a Mach-II filter too.

(Note: This CFC isn’t as encapsulated as it could be, but it’s simple enough for this example.)
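For readers who just want to see the overall shape, here is a rough sketch of what such a component and its invocation might look like. This is not the downloadable CFC itself. The file name SESUrl.cfc and the layout are assumptions for illustration, and only the parseURL method and its two arguments come from the description above. Like the example earlier, it assumes the IIS-style PATH_INFO that starts with the script name.

```cfm
<!--- SESUrl.cfc (hypothetical file name, not the actual download) --->
<cfcomponent output="false">

    <cffunction name="parseURL" access="public" returntype="void" output="false">
        <cfargument name="seperator" type="string" required="false" default="/" />
        <cfargument name="equal" type="string" required="false" default="-" />

        <cfset var urlString = "" />
        <cfset var varAndVal = "" />
        <cfset var varName = "" />

        <!--- Only parse if there is something after "/index.cfm/". --->
        <cfif Len(CGI.PATH_INFO) GT Len(CGI.SCRIPT_NAME) + 1>
            <cfset urlString = Right(CGI.PATH_INFO, Len(CGI.PATH_INFO) - Len(CGI.SCRIPT_NAME) - 1) />
            <cfloop list="#urlString#" delimiters="#arguments.seperator#" index="varAndVal">
                <!--- Name is everything before the first "equal" character, value is the rest. --->
                <cfset varName = ListFirst(varAndVal, arguments.equal) />
                <cfset URL[varName] = ListDeleteAt(varAndVal, 1, arguments.equal) />
            </cfloop>
        </cfif>
    </cffunction>

</cfcomponent>

<!--- Application.cfm: run the parser before any page code reads the URL scope. --->
<cfset CreateObject("component", "SESUrl").parseURL() />
```

Calling parseURL() with no arguments uses the “/” and “-” defaults. Passing seperator and equal by name would switch to a different pair of characters.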

Have fun! Good luck! Don’t come up higher than me on Google!

Search Engine Safe URL CFC

Comments on: "Search Engine Safe URL How-To" (16)

  1. Trond Ulseth said:

    Great Doug,

    I’ll start playing around with it.

    Thank you
    Trond

  2. Trond Ulseth said:

    Hi Doug,

    I tried to apply your method and found myself running into a problem.

    My index.cfm file consists of one simple cfinclude tag:

    <cfinclude template="view/templates/mainFrameset_1.cfm">

    When I try to open the index.cfm file in my browser it seems to generate some kind of loop.

    The only solution I could make work was to replace the trailing slashes with #

    However the # is usually made to link to an anchor within the page – so I’m not sure if Search Engines will disregard them.

    I don’t want to waste any of your precious time, but if you had any advice off the top of your head I’d be thankful.

    Trond

  3. Doug Hughes said:

    Trond, I wouldn’t suggest using my code without going through the entire process. Please go through the various steps. Check to see what the values of the CGI variables are. Step through the code. I wouldn’t suggest cutting and pasting.

    Give it another day’s work. If at the end of tomorrow you still haven’t solved the problem, then let me know what steps you took and show some code samples and problems.

    Doug

  4. Trond Ulseth said:

    Hello Doug,

    I figured it out.

    index.cfm is a simple cfinclude like this:
    <cfinclude template="view/templates/mainFrameset_1.cfm">

    The url on my local machine is this:
    http://localhost/waterswing/wscrm/index.cfm

    The page I cfinclude from index.cfm is a frameset where one of the frames is declared like this:
    <frame src="view/templates/rightFrame_1.cfm" name="rightFrame" id="rightFrame" />

    This is calling the rightFrame_1.cfm relative to index.cfm

    Somehow, when a trailing slash was included, the call to the frames went into an infinite loop, in the end causing the browser to stall.

    The solution which worked was to use an absolute path for the pages called by the frameset, like this:
    <frame src="/waterswing/wscrm/view/templates/rightFrame_1.cfm" name="rightFrame" id="rightFrame" />

    I can now happily go on playing with SES URLs.

    Thank you very much.
    Trond

  5. Greg Bugaj said:

    Hi, This is the system I am using
    So links like this :
    /index.php?set_language=en&cccpage=index&set_z_index=271#7
    index.php?set_language=en&cccpage=index&set_z_index=195

    Become:
    /template.cfm/1,0,0,0,0,1343,0,0,0.html
    /template.cfm/1,0,0,0,0,abc,0,0,0.html

    SMX URL can be exposed via SMXRewrite Object.
    URL Structure

    URL Consists of 8 Elements. There are two primary elements
    1-Indicates which file to load
    6-Indicates which element has gained focus.
    The other 4 elements are used to pass parameters.
    Example
    /template.cfm/1,0,0,0,0,1343,0,0,0.html
    /template.cfm/180,fileA:dirx,0,0,0,1343,0,0,0.html

    if anyone would like the code please contact me, it works for apache and IIS

  6. Just wondering, but don’t you think this is a bit more complex? It’s certainly harder to read.

  7. Sorry to come in so late on this, but I have your example set up and I know it’s working (I passed two identical separators, for example, and got some error feedback). What is strange is that I have a URL like this one:

    /tdw/sesUrls.cfm/passed-26

    yet when dumping the url scope I get nothing. Am I doing this wrong?

    Great work Doug!

  8. I am new to ColdFusion. Please let me know why this code is returning an error message “Page cannot be found” when I try to parse with the following

  9. This is the following: ()

  10. This is the following: ( “#CGI.SCRIPT_NAME#/webid/#Attributes.WebID#”)

  11. I found this at http://www.forumvadisi.com . Really great! Thanks!

  12. stuart marks said:

    Question: Do you invoke your cfc and the parseURL function on the receiving page? Meaning the page being pointed to, or the page that uses the cgi variables?

    Thanks ahead, Stuart

  14. Doug Hughes said:

    @Stuart – I put invocation in the application.cfm or application.cfc.

  16. Can you please show the code how you are calling the component in the application.cfm and how you are passing parameters to it?

Comments are closed.
