The amazing adventures of Doug Hughes

Archive for September, 2004

Search Engine Safe URL Scare!

I moved this site to a new server last weekend. The server is running ColdFusion
MX 6.1 on JRun to facilitate multiple instances of ColdFusion. As you can tell by
looking at the address for this page, my URLs are "search engine safe", meaning
that some characters which search engines supposedly don’t like are changed
to characters which search engines don’t seem to mind. I had quite a scare on
the new server because I couldn’t initially get these URLs to work!

Before I get into how I fixed this, here’s what a standard URL might look like:

http://www.doughughes.net/index.cfm?page=blogLink&entryId=36

A search engine safe version might look like this:

http://www.doughughes.net/index.cfm/page-blogLink/entryId-36

The problem I had was probably due to changing web servers from Apache to IIS,
and maybe due to using ColdFusion on JRun. (I didn’t nail this down exactly;
it wasn’t really relevant to the fix.) In the end I googled around
and found this link (see item #52942). This pointed me to the web.xml file,
which can be found under this path:

{jrun_root}/servers/{server_name}/cfusion-ear/cfusion-war/WEB-INF/web.xml

This file has a number of nodes which look like this:

<servlet-mapping id="macromedia_mapping_3">
    <servlet-name>CfmServlet</servlet-name>
    <url-pattern>*.cfm</url-pattern>
</servlet-mapping>

These nodes tell the server which URL patterns get handed to ColdFusion for
processing. I simply went to the bottom of this list and added this:

<servlet-mapping id="macromedia_mapping_11">
    <servlet-name>CfmServlet</servlet-name>
    <url-pattern>*.cfm/*</url-pattern>
</servlet-mapping>

The ID was based on adding one to the ID of what was previously the last node.
The *.cfm/* pattern tells ColdFusion to process requests even when the file name
is followed by a slash and extra path information, as in my search engine safe
URLs. After adding this I restarted CF and my URLs worked again! Woo Hoo!
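
Once the request reaches the page, the extra path still has to be turned back into name/value pairs. The sketch below shows one way that could be done. It is written in Python purely as an illustration, parse_ses_path is a hypothetical helper, and the name-value convention with a dash separator is inferred from the example URL above rather than taken from the code that runs this site.

```python
def parse_ses_path(path, script_suffix=".cfm"):
    """Split a search engine safe path into the script path and its parameters.

    Assumes each segment after the script looks like "name-value".
    """
    head, separator, tail = path.partition(script_suffix + "/")
    if not separator:
        # No extra path information after the script name.
        return path, {}
    params = {}
    for segment in tail.split("/"):
        name, dash, value = segment.partition("-")
        if name and dash:
            params[name] = value
    return head + script_suffix, params


script, params = parse_ses_path("/index.cfm/page-blogLink/entryId-36")
print(script, params)
```

Segments without a dash are simply ignored here; a real implementation would need to decide how to treat them.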

Trimming the Whitespace From Mach-II Generated HTML

As most people who use Mach-II know, it inserts a lot of white space at the top of your pages. This can be significant! I was getting several thousand characters of white space at the top of some of my applications and decided that was a bad thing. So, I took action against it.

The first thing I did was go through all of my components and double check that every method had output="false". Some didn’t, so I fixed that. I also added output="no" to the cfcomponent tag itself, and did a few other things to remove the vast quantities of white space. Even after all of my efforts I was still getting between 4000 and 7000 characters of white space in front of all my pages.

Finally, just to see what would happen, I wrapped the call to the Mach-II framework in my index.cfm in a cfsavecontent. I then simply output the trimmed value of that content like this:

<cfsavecontent variable="content">
    <cfinclude template="/MachII/mach-ii.cfm" />
</cfsavecontent>
<cfoutput>#trim(content)#</cfoutput>

Voilà! Much smaller files and almost no leading white space. How happy I am.
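
The same capture-then-trim idea works outside ColdFusion too. As a rough analogy only, here it is in Python: render_page is a made-up stand-in for the framework call, and the output is captured into a buffer and trimmed before it is sent on.

```python
import contextlib
import io


def render_page():
    """Hypothetical stand-in for a framework that emits stray white space."""
    print("\n\n   \n")
    print("<html><body>Hello</body></html>")
    print("   \n")


# Capture everything the "framework" writes instead of sending it straight out.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    render_page()

raw = buffer.getvalue()
trimmed = raw.strip()  # drop the leading and trailing white space only
print(len(raw), len(trimmed))
```

Note that this only removes white space at the very start and end of the output; anything between tags is left alone.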

A Quick Microsoft SQL Gotcha

This entry is a quick MS SQL gotcha. One of my employer’s clients has written a set of rules as fragments of SQL code in their back office database. I was called in the other day because the client insisted that their SQL fragment was correct and that we were doing something wrong. Guess what! Their SQL was flawed. However, their flaw was quite subtle and it was tricky to track down. The client provided a SQL fragment which looked like this:

CASE WHEN (
    SELECT ISNULL(someColumn,0)
    FROM someTable
    WHERE someOtherColumn = 'someOtherValue'
) < 1 THEN 35 ELSE 50 END

The rule was evaluated by dropping the SQL fragment into another query. In general it looked like this:

<cfquery name="runRule" datasource="#mydatasource#">
    SELECT #SqlRule# as value
</cfquery>

The SqlRule variable was the text of the case statement above. When the subquery returned a row whose someColumn was NULL, ISNULL converted it to 0 and we would return 35 in the value column. When the subquery returned a value of 1 or more, we would return 50.

What do you think happened if the subselect returned no rows? If you guessed that we would receive 50 in the value column then you’re much smarter than me. I thought the answer would be 35. From the perspective of SQL, what does a lack of data equate to? NULL. Because there is no row, the ISNULL inside the subquery never runs, so the subquery as a whole yields NULL and we end up comparing NULL to 1. And is NULL less than 1? It turns out that NULL is not less than one. It also turns out that NULL is not greater than one. Try running these two queries:

SELECT 'ah ha!'
WHERE NULL < 1

SELECT 'ah ha!'
WHERE NULL > 1

They both return no rows. Strictly speaking, NULL < 1 evaluates to unknown rather than false, but the effect is the same: the WHEN condition is not true, so SQL returns the ELSE portion of the case statement, or 50!
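
If you want to see the behavior without a SQL Server handy, the following sketch reproduces it with Python and an in-memory SQLite database. That is an assumption on my part: SQLite stands in for MS SQL, and IFNULL stands in for ISNULL. The second query shows one possible fix, which is to move the null check outside the subquery so it also covers the "no rows" case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE someTable (someColumn INTEGER, someOtherColumn TEXT)")

# The client's rule: the null check sits inside the subquery.
flawed = """
SELECT CASE WHEN (
    SELECT IFNULL(someColumn, 0)
    FROM someTable
    WHERE someOtherColumn = 'someOtherValue'
) < 1 THEN 35 ELSE 50 END
"""

# One possible fix: apply the null check to the subquery's result instead.
fixed = """
SELECT CASE WHEN IFNULL((
    SELECT someColumn
    FROM someTable
    WHERE someOtherColumn = 'someOtherValue'
), 0) < 1 THEN 35 ELSE 50 END
"""

# With no matching rows, the subquery yields NULL and the comparison is unknown.
flawed_no_rows = conn.execute(flawed).fetchone()[0]
fixed_no_rows = conn.execute(fixed).fetchone()[0]
null_compare = conn.execute("SELECT NULL < 1").fetchone()[0]

# With a matching row that holds NULL, the inner null check does its job.
conn.execute("INSERT INTO someTable VALUES (NULL, 'someOtherValue')")
flawed_null_row = conn.execute(flawed).fetchone()[0]

print(flawed_no_rows, fixed_no_rows, null_compare, flawed_null_row)
```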

Use the Image Component to Scale Images To a Specific Width or Height Proportionately

I’ve recently had a number of questions from past users of the Alagad MagickTag who were transitioning to the Alagad Image Component regarding scaling of images.

When scaling an image using the MagickTag you would use the <cf_magickaction> tag and provide attributes for width and height. If you were only to provide a width or height but not the other, the image would be scaled proportionately. Unfortunately, the Image Component doesn’t make scaling an image proportionately to a specific width or height quite as easy (yet). However, I’ll tell you how to do this quite easily.

Before I do that, I wanted to announce that later this year I will be releasing the next major version of the Image Component. This upgrade will include a number of additional methods for scaling images and will address this missing functionality. But, I expect you probably need to do this now.

Let’s say that you have a form on your website where users will upload images. You do not know the height or width of these images ahead of time and you want to scale all images proportionately so that their width is 100px.

The algorithm for this is quite easy. Simply use the Image Component’s method to get the uploaded image’s width (as of now, we really don’t care about the height). Let’s say the width is 200px. If you divide the target width (100px) by the source width (200px) you will get the ratio which you need to use to scale the entire image. In this case it’s .5. Once you have the ratio, call the scalePixels method and pass in the image width and height multiplied by the ratio. If the image height were 300px then the resulting image’s width and height would be 100 x 150.

Here’s some sample code:

<!--- create the image component --->
<cfset myImage = CreateObject("Component", "Image") />

<!--- read an image --->
<cfset myImage.readImage(expandPath("test.jpg")) />

<!--- declare the target width to be 100px --->
<cfset targetWidth = 100 />

<!--- get the image's width and height --->
<cfset initialHeight = myImage.getHeight() />

<cfset initialWidth = myImage.getWidth() />

<!--- find out the ratio between the target and initial widths --->
<cfset scaleRatio = targetWidth/initialWidth />

<!--- scale the image --->
<cfset myImage.scalePixels(initialWidth * scaleRatio, initialHeight * scaleRatio) />

<!--- output the new image --->
<cfset myImage.writeImage(expandPath("test_sm.jpg"), "jpg") />

This code can easily be changed to scale to a specific height and the same basic concept could be extended to provide the ability to scale to fit within a width and height or to only scale if the image is bigger than the target size.
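
To make those variations concrete, here is the arithmetic on its own, sketched in Python rather than CFML so it can be run anywhere. The function names are hypothetical and none of this is part of the Image Component; the resulting numbers are what you would pass to scalePixels.

```python
def scale_to_width(width, height, target_width):
    """Scale proportionately so the result is exactly target_width wide."""
    ratio = target_width / width
    return round(width * ratio), max(1, round(height * ratio))


def scale_to_height(width, height, target_height):
    """Scale proportionately so the result is exactly target_height tall."""
    ratio = target_height / height
    return max(1, round(width * ratio)), round(height * ratio)


def fit_within(width, height, max_width, max_height, allow_upscale=False):
    """Scale proportionately so the image fits inside a bounding box.

    Unless allow_upscale is set, images already smaller than the box are
    left at their original size.
    """
    ratio = min(max_width / width, max_height / height)
    if not allow_upscale:
        ratio = min(ratio, 1.0)
    return max(1, round(width * ratio)), max(1, round(height * ratio))


print(scale_to_width(200, 300, 100))   # the 200 x 300 example from above
print(fit_within(200, 300, 120, 120))  # limited by the height
print(fit_within(50, 50, 100, 100))    # already small enough
```

The smaller of the two ratios is used in fit_within because it is the one that keeps both dimensions inside the box.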

Good Luck!

ColdFusion MX 6.1 Updater Introduces New Bug When Looping Over Query In the Application Scope

I ran into a new and exciting issue with ColdFusion today. I was called in to figure out a bizarre problem a coworker of mine was having. The problem was that the navigation bar on a website was occasionally (meaning about 1 out of 25 times or so) not showing all the data, or was showing too much data, or just plain wrong information. If you were to go to the site and reload the page, every once in a while you’d see data that was just wrong. For example:

Working correctly:

[Screenshot: navigation bar working correctly]

Then, randomly, you’d see this:

[Screenshot: navigation bar working incorrectly]

In general, the code that outputs the navigation menu was looping over a query in the application scope. When the application was initially loaded, application.nav_menu would be set to the results of a query. Then, on all page loads, the nav menu would be generated by looping over the query in application.nav_menu.

In addition to that, the problem could be duplicated by creating some code as simple as this and then reloading the page over and over:

<cfoutput query="application.nav_menu">
#CurrentRow#.) #Name#
</cfoutput>

When you ran this code, most of the time you would see rows 1 through 30. However, occasionally, it would skip rows! You might see 1, 2, 4, 5, 7. Notice that 3 and 6 are missing. The next time you’d reload the page you’d see 1, 2, 3, 4, 5 with no missing data.

Correct:

1.) Active_Matter_Navigation_Menu
2.) WEBSITE
3.) Home
4.) ContactUs
5.) SiteMap
6.) LegalNotice

Incorrect (note the missing data):

1.) Home
6.) WEBSITEHIGHLIGHTS
9.) TodaysHeadlines
12.) AskWebsite
16.) MyWEBSITEBio
18.) MyWEBSITEArticles

(and so on)

After beating my head against this problem for a few hours I began to have an idea what was going on. Here are a few more hints:

  • The problem only occurred on the live site.
  • The problem began occurring after upgrading from ColdFusion 5 to ColdFusion MX 6.1 with the 6.1 updater released in August.
  • The ColdFusion server had been restarted at least once.
  • The site gets about 125,000 hits per day. (I assume that’s on more than just .cfm files.)

After some research I was able to reproduce the problem outside of production. It seems that when you have a query in the application scope, all of the query’s metadata is in the application scope too, including the currentrow value. That means that if two people are looping over the same application-scoped query at the same time, then when user one reaches the end of a loop iteration and currentrow is incremented, it’s also incremented for the other user. When the second user reaches the end of their own iteration, they increment currentrow again and carry on. However, at this point they will appear to have jumped two rows, not one.

To test this theory, I created a new folder in my application and wrote a few test files. One was an application.cfm designed to isolate my tests from the application:

<CFAPPLICATION name="test6"
    clientmanagement="No"
    sessionmanagement="Yes"
    applicationtimeout="#CreateTimespan(2,0,0,0)#"
    sessiontimeout="#CreateTimespan(0,2,0,0)#"
    setdomaincookies="true">

I then created a simple file, test1.cfm, which would create and cache the query into the application scope if it didn’t exist and loop over the cached query:

<cfif not isdefined("application.NavMenuTEST")>
    <cfquery name="application.NavMenuTEST" datasource="beta">
        EXEC amsp_NavMenuSetup 0, 'http://www.website.org/', 'http://www.website.org/', 4.1
    </cfquery>
    <h2>Application Query Var Set</h2>
</cfif>

<cfoutput query="application.NavMenuTEST" maxrows="30">
    #CurrentRow#.) #Name#
</cfoutput>

At this point I could load test1.cfm all day and never see the problem. This is good. These tests were not under any load, so this was the expected behavior.

So, I created another file, tDoug.cfm. This file simulated thousands of users looping over the cached query at the same time.

<cfloop from="1" to="50000" index="x">
    <cfoutput query="application.NavMenuTEST">
        <cfset t = "#CurrentRow#.) #Name#" />
    </cfoutput>
</cfloop>

This file looped over the entire query 50000 times. Each time it would perform some unimportant mumbo-jumbo.

Running tDoug.cfm took 15 or so seconds. If, while this file was running, I hopped to another tab and ran test1.cfm, the data would be all out of order with lots of missing rows. It was just the same problem I was having, but to a larger extreme.

Interestingly enough, I was not able to reproduce this on ColdFusion 5 or on ColdFusion MX 6.1 (on Linux), but I was able to reproduce it on a different ColdFusion MX 6.1 server with the 6.1 updater while pointing to a different database and server altogether. It sure sounds to me like a problem with the 6.1 updater.

For those of you who are having this problem, my solution was to duplicate the query into the local variables scope before looping over it. The major drawback is that I’ve got to have a minimum of two copies of the query in memory at any given point in time for this to work (because I’m duplicating the query on each request). In other words, once I updated my test1.cfm as follows I no longer ran into the problem.

<cfset navmenu = duplicate(application.NavMenuTEST) />
<cfoutput query="navmenu" maxrows="30">
    #CurrentRow#.) #Name#
</cfoutput>

This is all I can say at the moment. I let some people at Macromedia know about the problem and the ball is in their court now.

Update 9/23/2004: So, my coworker did send the results of our findings off to Macromedia yesterday. Apparently Macromedia has now confirmed that this is a bug going all the way back to MX 6.0! And no one ever caught this before?! Anyhow, they hadn’t heard of this problem at all before yesterday and then, all of a sudden, five companies call in with the same problem. Very strange.

The Macromedia support engineer is apparently going to suggest a hot fix for this issue. In the meantime, the official Macromedia line appears to be that you should use an exclusive lock around any loops over application queries. The other suggestion we’ve come up with is to use the duplicate method to copy the query into the variables scope before looping over it. Both will hurt application performance in their own ways. If you have this problem, I would suggest trying both solutions under load before choosing one or the other.

I’ll keep this post updated as I learn more.
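
For anyone who finds the shared currentrow hard to picture, here is a toy model of it in Python. It is only an illustration under my own assumptions: the Query class and its current_row attribute are invented for the example and say nothing about how ColdFusion stores queries internally. Two loops are advanced alternately to mimic two overlapping requests, first against one shared object and then against per-request copies, which mirrors the duplicate() workaround.

```python
import copy


class Query:
    """Toy stand-in for a cached query whose cursor lives on the object itself."""

    def __init__(self, rows):
        self.rows = rows
        self.current_row = 0  # shared by every request that holds this object


def loop(query):
    """Walk the query using the query's own cursor, one row per step."""
    query.current_row = 0
    while query.current_row < len(query.rows):
        yield query.rows[query.current_row]
        query.current_row += 1  # end of one loop iteration


def interleave(first, second):
    """Advance two loops alternately, the way two overlapping requests might."""
    seen = ([], [])
    active = [first, second]
    while any(active):
        for i, request in enumerate(active):
            if request is None:
                continue
            try:
                seen[i].append(next(request))
            except StopIteration:
                active[i] = None
    return seen


cached = Query([1, 2, 3, 4, 5, 6])

# Both requests loop over the same cached object and trample each other's cursor.
shared_a, shared_b = interleave(loop(cached), loop(cached))

# Each request loops over its own copy, mirroring the duplicate() workaround.
safe_a, safe_b = interleave(loop(copy.deepcopy(cached)), loop(copy.deepcopy(cached)))

print(shared_a, shared_b)
print(safe_a, safe_b)
```

The copies cost extra memory on every request, which is the same trade-off described above; a lock around the loop would avoid the copy but make requests wait for each other.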

Coercing the Alagad Image Component to Work on Headless Systems

Several users of the Alagad Image Component, myself included, have run into an annoying limitation of Java on Unix platforms. These unfortunate people receive the “This graphics environment can be used only in the software emulation mode” error on most calls to Image Component methods.

The problem is not with the Image Component. It stems from limitations of Java running on systems which do not have a mouse, keyboard or display. These systems are said to be “headless”. On Unix, the java.awt package requires you to have a full and, as far as I can tell, running X server installed.

As of Java 1.4.1, Sun added a work around for headless systems. By starting Java with “-Djava.awt.headless=true” argument you should be able to resolve the error message. Implementing this work around with the Alagad Image Component was originally documented here.

But, what if the work around doesn’t work for you? What then?! Rest assured, even I, the author of the Image Component, had a very hard time getting it to run on my RedHat Linux servers. (It’s a piece of cake on Windows!) I followed the instructions linked above, which were provided by Sun and Macromedia, to no avail. So, I did a bit more research and found that there are three typical solutions to the “headless” problem:

Solution 1: Start Java with “-Djava.awt.headless=true”. I tried this. It didn’t work for me. I tried everything I could think of to get this to work, but I had no luck whatsoever. In every single configuration I tried I still received the same errors in the same places about “software emulation mode”. I have read several accounts of this working for people, but it didn’t work for me. However, if you’re going down the headless road, stop here first and make sure it doesn’t work for you too. Instructions on how to implement this for ColdFusion can be found here.

Solution 2: Install Xvfb, a virtual X server. Xvfb is a virtual X server available from the XFree86 group. It is apparently a reasonably lightweight virtual framebuffer X server and provides the resources needed by Java (and not much more). The major drawback to this is actually having an X server, even if it is a virtual one, running on your server. The good thing is that it supposedly provides all the necessary resources for Java and the Alagad Image Component.

I’ve read good things about it. I’ve also read that the underlying graphics subsystem doesn’t do a very good job with font rendering. However, like the last solution, it didn’t work for me. I do suggest giving it a try though.

To get it to work you’re supposed to install Xvfb. Xvfb can be downloaded from the XFree86.org website. I didn’t download mine from there; I downloaded an RPM file from RedHat. (I’m pretty sure it was also on my RedHat CDs, I simply didn’t think to look there.) Because it was an RPM, installation was a breeze.

On a related note, I tried this same solution on an extremely minimal RedHat installation (as in router-like, but with ColdFusion installed). I didn’t complete the install because the Xvfb RPM has many dependencies on other RPMs which have their own dependencies. It became too much for me to track down and I gave up.

After installing the Xvfb RPM you need to start it. The command I used was:

Xvfb :0 -screen 0 1280x1024x32 &

This is supposed to create a virtual frame buffer on display zero which is 1280×1024 at 32 bit color. I then edited my /etc/rc.local file and added this command so that Xvfb would start up when my server reboots (which is more or less never!).

Once you have Xvfb up and running you need to create an environment variable named “DISPLAY” which holds the display number for the system to use. This is where I think I went wrong. You might want to do more research as to what exactly this means, how to do it, and how to make Java aware you did it. I simply added a line in my /etc/profile file which made the variable permanent and available across all terminal sessions. The line I added was:

DISPLAY=":0.0"

You will then need to call:

export DISPLAY

In my /etc/profile file there is a line which exports several variables and I just tacked DISPLAY on to the end.

After this, I restarted ColdFusion and… I still had the exact same problem as before. Oh well. I don’t know if this will work for you or what it is I was doing wrong, but I do suggest you try it out. If you know something I don’t, please let me know.

Solution 3: Use Pure Java AWT (PJA). The last thing I tried, and the only thing that worked for me, was using Pure Java AWT or PJA. It seems that someone else out there thought that Java’s requirement for an X11 server was a little, well, odd. They decided to fix it and created an implementation of the java.awt package in pure Java. All you need to do to use it is download the package, extract it, point ColdFusion at it, and it just works! At least, it just worked for me.

The first thing I did was download pja_2.4.zip. Don’t do that! The PJA version 2.4 is the current production version but it doesn’t work with ColdFusion because ColdFusion comes with a 1.4.1 JRE. It seems that the java.awt package changed rather significantly from 1.4 to 1.4.1 and PJA 2.4 doesn’t support all the needed methods. However, there is a beta of the 2.5 version available here. I downloaded this and it simply works for me.

After downloading the ZIP version I extracted it to /usr/PJA. This created a directory /usr/PJA/lib which contains pja.jar. To get ColdFusion to use PJA I had to edit the jvm.config file which is located at {cf directory}/bin/jvm.config. There is a line in this file which reads something like (all on one line):

java.args=-server -Xmx512m -Dsun.io.useCanonCaches=false -Xbootclasspath/a:{application.home}/lib/webchartsJava2D.jar -XX:MaxPermSize=128m -XX:+UseParallelGC -Djava.awt.graphicsenv=com.gp.java2d.ExHeadlessGraphicsEnvironment

This needed to be changed. First, and this was a gotcha for me, I had to add the path to the pja.jar file and another file rt.jar onto the -Xbootclasspath attribute. (I honestly don’t know if rt.jar is required. I added it because it was in the instructions I was following without explanation. Now that it works I don’t want to change anything.) To add additional paths, separate them with a colon “:” like this (note the addition of /opt/coldfusionmx/runtime/jre/lib/rt.jar and /usr/PJA/lib/pja.jar) (all on one line):

-Xbootclasspath/a:{application.home}/lib/webchartsJava2D.jar:/opt/coldfusionmx/runtime/jre/lib/rt.jar:/usr/PJA/lib/pja.jar

This allows Java to find the pja.jar and rt.jar files at startup, which it needs in order to work as we want it to. (At first, I tried adding these paths to the Java classpath and it kept not working for me. This was quite frustrating! Don’t let it happen to you!)

After that was added I needed to delete the following chunk:

-Djava.awt.graphicsenv=com.gp.java2d.ExHeadlessGraphicsEnvironment

I replaced it with the following:

-Djava.awt.graphicsenv=com.eteks.java2d.PJAGraphicsEnvironment -Djava.awt.fonts=/usr/java/j2sdk1.4.0_03/jre/lib/fonts

This tells Java to use PJA for calls to the java.awt packages. It also tells java to look in the /usr/java/j2sdk1.4.0_03/jre/lib/fonts directory for fonts. You will probably want to find the fonts directory which is correct for you and use that. As a note, I used a directory from a different JRE on my system than the one provided with ColdFusion. This is because the ColdFusion version does not include a fonts directory. I chose a Java fonts directory because that’s what was in the examples I was following. This directory simply contains a set of TTF files so you might be able to point this somewhere else, if you want to. My ending java.args line looked like this (all on one line):

java.args=-server -Xmx512m -Dsun.io.useCanonCaches=false -Xbootclasspath/a:{application.home}/lib/webchartsJava2D.jar:/usr/PJA/lib/pja.jar -XX:MaxPermSize=128m -XX:+UseParallelGC -Djava.awt.graphicsenv=com.eteks.java2d.PJAGraphicsEnvironment -Djava.awt.fonts=/usr/java/j2sdk1.4.0_03/jre/lib/fonts

After saving my updated jvm.config, I restarted ColdFusion and, voilà, the Alagad Image Component worked! Yay!

Since this, I’ve had no problems with any portion of this implementation. Yet. Granted, as of now, it has been less than one day. Additionally, the site I’m working with is under development so it’s not under any load yet. I do believe that this will work just fine.

If you have any further suggestions or comments on this, please let me know!

FYI: I found several resources. One in particular was a day saver. When I was trying to get Xvfb working with ColdFusion I Googled on “ColdFusion Xvfb” and found a link to an entry on Christian Cantrell’s blog. This talked about the exact problems I was having and linked to this site. I found several other sites which linked to the same location, but they all seemed to be redirected somewhere else. Apparently, the page had been removed from the website. So, I went to the trusty old Way Back Machine and found this link which finally helped me solve my problem. Many thanks to the original author.
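
The hand edits to java.args described above can also be written down as a small script, which makes it easier to see exactly what changes. This is only a sketch in Python: patch_java_args is a hypothetical helper, not something that ships with ColdFusion or PJA, and the paths are the ones from my server, so substitute your own.

```python
def patch_java_args(line, extra_jars, graphics_env, fonts_dir):
    """Return a java.args line with PJA wired in.

    Appends extra_jars to the existing -Xbootclasspath/a entry, swaps the
    java.awt.graphicsenv setting, and adds a java.awt.fonts setting.
    """
    patched = []
    for token in line.split(" "):
        if token.startswith("-Xbootclasspath/a:"):
            # Additional paths are separated with a colon.
            token = ":".join([token] + list(extra_jars))
        elif token.startswith("-Djava.awt.graphicsenv="):
            token = "-Djava.awt.graphicsenv=" + graphics_env
        patched.append(token)
    patched.append("-Djava.awt.fonts=" + fonts_dir)
    return " ".join(patched)


original = (
    "java.args=-server -Xmx512m -Dsun.io.useCanonCaches=false "
    "-Xbootclasspath/a:{application.home}/lib/webchartsJava2D.jar "
    "-XX:MaxPermSize=128m -XX:+UseParallelGC "
    "-Djava.awt.graphicsenv=com.gp.java2d.ExHeadlessGraphicsEnvironment"
)

updated = patch_java_args(
    original,
    extra_jars=["/usr/PJA/lib/pja.jar"],
    graphics_env="com.eteks.java2d.PJAGraphicsEnvironment",
    fonts_dir="/usr/java/j2sdk1.4.0_03/jre/lib/fonts",
)
print(updated)
```

Keep a backup of jvm.config before changing it by script or by hand; a typo in this line can stop ColdFusion from starting.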

Searching with Lucene

As I blogged a couple of days ago, I just recently put up a new site at DougHughes.net. One thing I wanted to do was implement a search tool that used the Verity spider which comes with ColdFusion. After much research and hair-pulling I found out that the Verity spider’s not supported on Linux. D’oh! I pondered several solutions to the problem, including writing my own spider in Java, and then feeding the data into Verity somehow or other.

The problem I kept running into is that Verity, at least as it ships with ColdFusion, is really intended for indexing files on disk and not dynamic web content directly from websites. This meant that to index HTML and binary content I would need to use a spider and cache web documents to disk before using a bulk insert file to index them with Verity.

I tried several different approaches to the problem, but I wasn’t really very happy with any of them. In the end, I did a little more research and found out about a custom tag called Lindex which was distributed on Macromedia’s DRK 3 CD and which used a search engine I’d never heard of to index content. The search engine was Lucene.

Lucene, it seems, is a “high-performance, full-featured text search engine library written entirely in Java”. It also happened to be free, open source, and published by the Apache Foundation. I downloaded it right away and started learning how to use it.

I’ve now built a CFC which uses Lucene to create indexes, index content, and search indexes. Currently, I’m only indexing HTML content; however, I plan to grow that to PDF, DOC, XLS, PPT, and RTF, which should be possible according to the FAQs I found.

Content is fed to Lucene via a half-assed spider I wrote using cfhttp. Aside from some things which could use a little work on my side, I’ve been very happy with the performance and functionality of Lucene and my search component.

It’s just so much more powerful and easier than Verity. It’s exactly what I need. I love the fact that I have an API I can code against. (There’s really not an API for Verity.) If you’re using (or failing to use) Verity, I strongly suggest looking into Lucene. When all is said and done, I’ll probably release the search component as an Alagad product. Maybe by 2005? It could happen!
