The amazing adventures of Doug Hughes

Archive for February, 2008

A Matter Of The Heart – The JVM (UPDATE)

I have blogged and presented on the Java Virtual Machine (JVM) and how it is in my opinion “The Heart Of The Matter”. Those presentations emphasize that at the heart of ColdFusion MX-8 and JRun is the JVM but there’s more. Adobe has a range of Server products that run with a JVM at the heart of there operations; here they are.

  • Acrobat Connect
  • Blaze DS
  • ColdFusion
  • Contribute Publishing Server
  • Flash Media Server
  • Flash Remoting MX
  • JRun
  • LiveCycle DS
  • LiveCycle Enterprise Suite
  • RoboHelp Server

If I have missed any from the list, please don’t hesitate to let me know. I apologize for including Flash Media Server is this list that was an error on my part. FMS does not run on a JVM.

CFWindow Naming Gotcha

I am working on a small application that can be used to demonstrate some of the AJAX capabilities of ColdFusion 8. It is a simple master/detail application for managing information about users. The main page uses to display the list of users. When you want to edit a user, or create a new one, I use the ColdFusion.navigate() JavaScript method to load a URL into a . This page contains the user form.

Whenever I created or edited a user, it would work as expected…the first time. The next time I would try to edit or create a user and submit the form, I would get an error from ColdFusion stating that one of the form fields was missing. Huh? How is that possible since the form just worked?

Using FireBug I was able to look at the POST of the form submission. On the instances where I would receive the error, sure enough, the form fields were not getting posted, but, the was. Now, how did that happen? Well, the answer to that question is quite simple.

Here is what the code looked like for the :

<cfform name="userGridForm" method="post">
	<cfgrid name="userGrid" ......... >
		<cfgridcolumn name="USERID" header="ID" display="false" />
		<cfgridcolumn name="FIRSTNAME" header="First Name" width="200" />
		<cfgridcolumn name="LASTNAME" header="Last Name" width="300" />
		<cfgridcolumn name="EMAILADDRESS" header="Email" width="220" />
		<cfgridcolumn name="PHONE" header="Phone" />
	</cfgrid>
</cfform>

Here is what the code looked like in the user form, which was a seperate file:

<cfform name="userForm">
	<cfinput type="hidden" name="userId"....../>
	... {more form fields} ...
</cfform>

Can you see it? Do you see the problem? As soon as I saw that the was posting instead of the user form, it tipped me off. When the is open, there are 2 forms named userForm” on the page, causing the to be submitted sometimes, and the user form others. Why it did not error out everytime I tried to save a user is beyond me. Changing the name to “userGridForm” on the mian page cleared up the issue.

Remember, even though you are in a different physical file, naming can cause issues, especially when using them as I did in this litle demo.

San Diego Adobe-ColdFusion User Group JVM Presentation

Tonight, Wednesday February 20, 2008, I will be presenting to the San Diego Adobe-ColdFusion User Group on all things relating to the JVM and in particular the Sun JVM that ships with CF. I realize this may sound unappealing and lacking some of the zap that other items might have, such as the Flex-Air Roadshow but the JVM is not just at the heart of ColdFusion, but many other Adobe products such as FlexDS, BlazeDS, Connect, Jrun etc. My experience is a good understanding of the JVM is key to optimizing performance. Here is the official link.

Applying Load To Troubleshoot Application Issues

We are asked to help clients with various issues at various stages. Sometimes we are brought in when an application is simply being planned, sometimes just before launch or enhancement. This blog piece covers the times when we are brought in to pinpoint application slow-downs or failures.

As I have mentioned in previous blog postings, the first thing I always want to do when trying to help a client with application slow-downs or failures is to see the recent Production ColdFusion and JRun logs. The logs I focus on can depend on what kind of install, what operating system and also if there are any custom logging settings. If there is a {server-instance}-out.log I look there first. I never go back further than 7 days unless the client has a specific reason for me to do so.

So, what does this have to with Load-Testing? Please read on, if you wish. The logs I mention above will definitely give us pointers to what is going wrong, if we add Metrics Logging and Verbose Garbage Collection logging we will see more detail on that. All of this will give some overall indications of the system state. For instance, if there are many different templates, events or fuseactions in the logs I would suspect overall environment problems. If it is a just a few templates, events or fuseactions repeating over and over again then that is usually indicative of specific code issues.

As a note point, in frameworks such as FuseBox, Mach-ii or ModelGlue knowing what actually ran is more difficult and we need to capture the URL Query String. Both FusionReactor and SeeFusion are of great assistance in this respect. Hopefully the client can provide a testing environment, and in that case we would want to use a load test tool.

That test environment should be as identical to Production as possible, and it is critical that the code be the same. From the pointers gained from the Production logs and hopefully with some guidance from the client, we then create load-test scripts that exercise the code that we saw causing issues in Production.

We almost always start by tuning the JVM, particularly if there are indications that the site overall is slow. One last critical point; we look very closely at is the errors in the logs on the test environment. We want to see the same or very similar errors to those seen in Production otherwise we have some fundamental differences which will make it more difficult to achieve success.

Model-Glue Training In DC Recap

Today is my first day back in the office after four intense days of teaching my Enterprise Model-Glue with ColdFusion class in Arlington, Virginia. This was the first time I gave the class in front of a general audience, but not the first time Ive given the class. Overall, I was pleased with how it went and I think the participants were too.

The group of attendees had a wide range of experience, expectations, and intentions. For example, there were a few people who already used Model-Glue but wanted some formal instruction. There were others who used other frameworks already, or none at all. A couple of people were there more to learn how to support the frameworks in various capacities rather than learning how to implement them. Most people had at least some level of understanding of what we went over coming in.

The title of the course was Enterprise ColdFusion with Model-Glue. I obviously provided training on Model-Glue. But, most of the attendees were surprised by how much focus was given to other topics such as the usage of CFCs, Object Oriented Programming, Design Patterns, unit testing, ColdSpring, Reactor and much, much, more. Why all this focus on things other than Model-Glue? Well, thats really what makes up the Enterprise in Enterprise ColdFusion.

The hardest lesson taught was that Model-Glue is not only fairly simple; its also a very small part of the picture. Focusing on Model-Glue does not, by itself, do much for you. Where a developer needs to focus is on their applications model. This is, after all, where the magic really happens. By building a cohesive and well-encapsulated model you will ease the maintenance of your application over time. In turn, you save money and lower total cost of ownership.

The feedback I received was overwhelmingly positive and I can’t wait to do it again.

By the way, if youre in a corporation, Im happy to come on-site to provide your team with training. Just send us a message and well get the ball rolling.

In your expert opinion, where should I hold the next general audience class?

The Sales Job at Alagad is Still Unfilled

A couple weeks ago I posted a long-winded blog entry announcing that Alagad is hiring a Senior Sales Representative. It turns out that this is quite a challenging prospect and I was wondering if the blog community could help.

Alagad is looking for a really, really, good sales person. Someone who can sell millions of dollars of custom software a year.

The problem is twofold:

  • The sales people who can produce are already employed and not trolling around career builder looking for my advertisement.
  • The sales people who can produce probably don’t read my blog.

That’s where you come in. If you know of someone who works for a consulting company selling custom software, could you please mention this job to them? I would sincerely appreciate it.

I plan to compensate the right sales person generously in terms of salary and commission. They’ll be coming in at the top and will hopefully be able to work with Alagad to define our entire sales strategy. Over time I think this will become a director-level position where growth is virtually unlimited.

So, if you have any friends or acquaintances you think might be interested, please put them in contact with me:

Doug Hughes, President
Alagad Inc.
dhughes@alagad.com
888-Alagad4

Details of Tuning A ColdSpring, Model Glue, Reactor Application On ColdFusion 8

This is a real world and recent example of tuning a ColdFusion application built in ColdSpring, ModelGlue and Reactor. Here are the details of the system and software this application is running on.

  • Windows Server 2003 – IIS6
  • ColdFusion 8 Enterprise with Sun JVM 1.6
  • ColdSpring
  • ModelGlue 2.0.304(2)
  • Reactor revision 4.29

In the interests of really making this posting interesting I am going to start with two graphs. Both show the performance of the application under load using a load test tool with scripts modeled from user behavior. This first graphic shows response times in our first load tests and before any tuning.

Before

This graphic shows a spectrum of request response times during a 15 Minute, 50 vUser load test with 8 second think times between “clicks. The bottom left access shows request response times with the fastest on the left progressing to the slowest at the bottom right axis point. The bottom right to farthest right shows the duration of the test, from the start at the axis point to the end at the far right.

I realize the numbers are hard to read but by looking at the 3D colored bars we can see that most of the requests response times are in the mid to high range, many of them being in excess of 5 seconds in duration. The next graphic shows the performance results of the application running the same 15 Minute, 50 vUser load test with 8 second think times load test, after JVM, code and SQL tuning.

After

The bottom left access shows request response times with the fastest on the left progressing to the slowest at the bottom right axis point. The bottom right to farthest right shows the duration of the test, from the start at the axis point to the end at the far right. I realize the numbers are hard to read but by looking at the 3D colored bars we can see that most of the requests response times are now much faster, many of them under 1 second in duration. So that is the graphical representation of what we achieved, here is what we achieved and how we achieved it.

The JVM

I firmly believe in starting by optimally tuning the JVM before the application code, SQL etc. My reason is this; in the web application world, most applications consist of objects that are created and hopefully destroyed fairly quickly. That is because the request lifecycle is fairly short lived, as opposed to something like a client-server paradigm where objects can and often will remain referenced for considerable amounts of time. For this reason there are typically some overall good practices that I have found to be beneficial, here they are.

Before making any changes to the JVM I enabled verbose garbage collection logging and referred to that log throughout the testing. Full details of how to do that can be found in This previous blog posting.

  • Set a Start Size for the JVM Memory Space (The Heap). This is done by passing the Xms{number}m (where {number} is a numeric value) argument to the JVM at start up. I would typically set this to 50% of the size set in the Xmx setting. If there is no Xmx setting I have found that safe “rule of thumb” settings of -Xms512m and -Xmx1024m, where a server has a minimum of 2GB of Ram. I like to do this by adding it directly to the jvm.config file in the Arguments to VM. The reason that I favor directly editing the jvm.config file is because there have been occasions where doing this via the CF Admin GUI has caused problems with the jvm.config file not being written correctly. In fairness, this was in earlier versions of ColdFusion. For the server I am reporting on in this test the setting was set to -Xms512m -Xmx1024m.
  • Another good practice is the set a start size for the Permanent Generation of the JVM Heap. On looking at the verbose garbage collection logs I often see this generation, also known as the permgen; fills up. The permanent generation is where the Servlet Container (JRun) has it’s working objects and I like to set both a start and maximum size by passing these settings via the Arguments to VM – -XX:PermSize=128m -XX:MaxPermSize=192m. These are the settings for the server in this test and again are usually fine on a system with a minimum of 2GB of RAM. The permanent generation is in addition to the other maximum and minimum that we set in the paragraph above, not a part of it.
  • Full Garbage Full GC’s) collections are another issue which comes very much down to tuning to find the right balance of enough Full GC’s and not too many. From what I have seen and this is a general observation. the Sun 1.4x and prior JVM’s seem to run Full GC’s at 60 second intervals. The 1.5 and 1.6 JVM’s seem to run too few Full GC’s, I have seen almost out of memory issues and the Xmx (maximum heap size setting) does not always kick in properly on 1.5 and 1.6 JVM’s. Typically I set explicit Full GC’s to occur on a regular basis, I have found that setting them to occur at 10 minute intervals is a good interval. However, in the case of this particular system I was load-testing, I was still seeing the heap running low on memory. So I set Full GC’s to run at 5 minute intervals with the following via the Arguments to VM -Dsun.rmi.dgc.client.gcInterval=300000 -Dsun.rmi.dgc.server.gcInterval=300000. I am still working through what effects the 1.5 and 1.6 Sun JVM’s have on ColdFusion and Flex/Air based applications.

All the settings here were arrived at by testing different settings against both the default CF8 JVM (Sun 1.6) and also the 1.5 as suggested in other postings to be a better JVM for cfc heavy apps; as this one certainly is. In actual fact and after doing lots of load testing the default 1.6 JVM proved to be much more responsive; I will enumerate this shortly.

ColdFusion

The first thing I did in ColdFusion was to establish Metrics Logging and split out the JRun logs by using the log.level modifier. This is really useful in my opinion, as the state of the memory, number of threads in use and if enabled number of current sessions are shown at regular intervals in the metrics log and the out log. This latter log is particularly useful in troubleshooting as all errors end up there. Being able to see problems and the state of threads and memory at or around the same time can be really helpful in pinning down cause and effect.

There were two settings in CF Admin that I also set, the first was to turn on Trusted Cache which is recommended for Production servers. With a formal deployment process in place, Production code should only change as and when needed, so Trusted Cache is a very beneficial setting. In addition I turned off the saving of Class Files to disk. Once again, these changes were made in a controlled fashion and load tested after changing to determine their effects. At this point I had tuned the JVM and ColdFusion and as a result we saw average request response times drop from over 10 seconds to just over 4 seconds; so this was a good start achieved by server side tweaks only.

SeeFusion

This is a ColdFusion 8 installation so I am sure there are those who could rightly question why I would choose to use a third-party tool in addition. I have used SeeFusion since before it was officially launched; I was a beta tester. What I found is that SeeFusion would often continue to respond after ColdFusion had stopped responding altogether. This is because SeeFusion runs at the Java layer and the Servlet Container and the JVM will often continue responding and reporting after the Application server has failed. In addition, it is very easy to set up SeeFusion to log information to a database and I had set this up before starting the load testing. This information enabled us to now start homing in on code level-SQL issues. Here is a screen shot taken during the initial load test of the JVM 1.5 tuning, this may be hard to read but we are almost out of heap memory, indicated by the yellow bar graphic at top left and the bottom left graph; I have found that both 1.5 1nd 1.6 Sun JVM’s have different memory handling characteristics to 1.4 and prior in the sense that Full GC’s occur far less frequently; hence my comments above in the JVM tuning section.

Here are some blow-ups of the “almost-out-of-heap-memory” parts of the SeeFusion monitored load test. Firstly the bar chart showing a yellow warning status…

And here is the bottom right graph showing the rapid rise in memory usage as the tests progressed…

Code and SQL

At this point I consulted the considerable programming-architectural talents of Alagad. By using the SeeFusion data logged into a SQL Server database we began to focus on the slower responding requests. Here are some examples with some pointers to what we interpreted them to be.

In this sort of situation we can see that the “TotalTimeMS” and “TimeToFirstByteMs” are very close but that “TotalQueryTime” is low; these are all milliseconds by the way. What this suggests is that there is either a network-communication issue back to the browser or a lot of ColdFusion churn to produce the GUI.

Here we see a fairly long “TotalTimeMs” with “NumQueries” of 16. We identified this as possibly a lot of queries for a single request and were able to reduce this to 3 queries after investigation.

Here we see “TotalQueryTime” here shows as 19095 which of course is very long so we looked at the SQL involved in this query and were able to reduce the run time down to just over 2 seconds. We carried on identifying long running code and queries, doing this via the SeeFusion data, in reality there are usually not that many different slow requests and by using this methodology we improved performance dramatically by methodically dealing with each prioritized issue.

This was a typical, successful application tuning engagement. At the commencement of the assignment we had slow application response time. In our initial load testing a 50 concurrent vUser (virtual user) test produced average request times in excess of 10 seconds. By the end of the engagement we had average page response times at just over 500MS.

After tuning the JVM and the application, I wanted to go back and run some more tests on both the 1.5 and 1.6 JVM’s just to make sure that the results I had been getting throughout this assignment, were still borne out; that 1.5 was not more efficient than 1.6 in ongoing usage; I had NOT set out to prove that by the way, in fact it was the first thing I tried at the onset of load-testing was to switch to 1.5.

Firstly, here is the summary of the 50 vUser test on the default CF 8 JVM (Sun Java 6 or 1.6):

** Results per URL for complete test **
URL#1 (): Average Click Time 302 ms, 300 Clicks, 0 Errors
URL#2 (): Average Click Time 237 ms, 300 Clicks, 0 Errors
URL#3 (): Average Click Time 244 ms, 300 Clicks, 0 Errors
URL#4 (): Average Click Time 241 ms, 300 Clicks, 0 Errors
URL#5 (): Average Click Time 248 ms, 300 Clicks, 0 Errors
URL#6 (): Average Click Time 191 ms, 300 Clicks, 0 Errors
URL#7 (): Average Click Time 194 ms, 300 Clicks, 0 Errors
URL#8 (): Average Click Time 210 ms, 283 Clicks, 0 Errors
URL#9 (): Average Click Time 254 ms, 270 Clicks, 0 Errors
URL#10 (): Average Click Time 299 ms, 257 Clicks, 0 Errors
URL#11 (): Average Click Time 488 ms, 251 Clicks, 0 Errors
URL#12 (): Average Click Time 4,573 ms, 250 Clicks, 0 Errors
URL#13 (): Average Click Time 972 ms, 250 Clicks, 0 Errors
URL#14 (): Average Click Time 501 ms, 250 Clicks, 0 Errors
URL#15 (): Average Click Time 279 ms, 250 Clicks, 0 Errors
URL#16 (): Average Click Time 249 ms, 250 Clicks, 0 Errors
URL#17 (): Average Click Time 333 ms, 250 Clicks, 0 Errors
URL#18 (): Average Click Time 1,370 ms, 250 Clicks, 0 Errors
URL#19 (): Average Click Time 615 ms, 250 Clicks, 0 Errors
Total Number of Clicks: 5,161 (0 Errors)
Average Click Time of all URLs: 590 ms

Here is the summary of the same 50 vUser test on the Sun Java 5 or 1.5.

** Results per URL for complete test **
URL#1 (): Average Click Time 1,078 ms, 254 Clicks, 0 Errors
URL#2 (): Average Click Time 877 ms, 250 Clicks, 0 Errors
URL#3 (): Average Click Time 1,111 ms, 250 Clicks, 0 Errors
URL#4 (): Average Click Time 1,450 ms, 250 Clicks, 0 Errors
URL#5 (): Average Click Time 1,445 ms, 250 Clicks, 0 Errors
URL#6 (): Average Click Time 888 ms, 250 Clicks, 0 Errors
URL#7 (): Average Click Time 683 ms, 250 Clicks, 0 Errors
URL#8 (): Average Click Time 496 ms, 250 Clicks, 0 Errors
URL#9 (): Average Click Time 485 ms, 250 Clicks, 0 Errors
URL#10 (): Average Click Time 492 ms, 250 Clicks, 0 Errors
URL#11 (): Average Click Time 1,006 ms, 250 Clicks, 0 Errors
URL#12 (): Average Click Time 8,400 ms, 250 Clicks, 0 Errors
URL#13 (): Average Click Time 2,254 ms, 250 Clicks, 0 Errors
URL#14 (): Average Click Time 1,380 ms, 250 Clicks, 0 Errors
URL#15 (): Average Click Time 810 ms, 250 Clicks, 0 Errors
URL#16 (): Average Click Time 709 ms, 242 Clicks, 0 Errors
URL#17 (): Average Click Time 749 ms, 221 Clicks, 0 Errors
URL#18 (): Average Click Time 3,849 ms, 211 Clicks, 0 Errors
URL#19 (): Average Click Time 1,743 ms, 206 Clicks, 0 Errors
Total Number of Clicks: 4,634 (0 Errors)
Average Click Time of all URLs: 1,559 ms

I have found that the only way to really determine the correct garbage collector to use for your applications and what the correct arguments are to pass to the JVM, via the jvm.config file, to effectively manage garbage collections is to first determine how many Full GC’s are occurring and control that if there are too many. Then load-test the application whilst tuning the arguments passed to the JVM, typically via the jvm.config file.

Tag Cloud