In this series of posts, I would like to put forward a hypothetical situation involving poor ColdFusion application performance, the investigative steps to take to isolate the issues, and the remedial steps needed to solve them. I would also really like feedback from readers: hypothesize about possible issues and resolutions, and suggest other tools or methods that might identify or solve the issues we discuss. I hope this series will help you identify and deal with not only ColdFusion issues, but also database, network, and hardware issues as they arise. Note: This hypothetical situation, while drawn from my experience, is not a direct parallel to any of my previous customers; it is a combination of factors from several different projects. Let's call it Project X.
Project X is set up to run across four ColdFusion 8 Enterprise Edition servers in a cluster behind a hardware load balancer. Sticky sessions are configured, so once a user makes a request to a given server, their subsequent requests should continue to go to the same server.
Project X has a single MS SQL 2005 database on a 32-bit Windows 2003 server with 4 GB of RAM. This server has three 15K 75 GB SCSI hard drives in RAID 5, on which the operating system and the MS SQL binaries are installed. An iSCSI-connected device with eight 10K 147 GB SCSI hard drives holds the MS SQL data and log files.
Each Project X web server is a 32-bit Windows 2003 server with 4 GB of RAM and three 15K 75 GB SCSI hard drives in RAID 5. Each web server runs two instances of ColdFusion in a local cluster (using round robin to split requests between instances), and each ColdFusion instance uses the default JVM configuration that ships with ColdFusion 8. A shared folder on the MS SQL server contains all shared page assets (files uploaded by users, PDF documentation, and images). Each web server runs Apache and has an alias pointing to the shared folder on the SQL box (using a UNC path).
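
To make the request path easier to picture, here is a minimal Python sketch of the two routing layers described above: sticky sessions at the hardware load balancer, then round robin between the two ColdFusion instances on each web server. The server names (`web1` to `web4`), the class names, and the way the balancer picks a server for a new session are all assumptions for illustration; the scenario does not specify them. The sketch also assumes the local round robin ignores the session entirely, which is one possible reading of the setup, not a statement about how the real cluster behaves.

```python
from itertools import cycle

# Hypothetical names for the four web servers in the scenario.
WEB_SERVERS = ["web1", "web2", "web3", "web4"]
INSTANCES_PER_SERVER = 2  # two ColdFusion instances per web server


class HardwareLoadBalancer:
    """Sticky sessions: a session's first request picks a server,
    and every later request from that session reuses it."""

    def __init__(self, servers):
        # Assumption: new sessions are handed out in rotation.
        self._next_server = cycle(servers)
        self._assigned = {}

    def pick_server(self, session_id):
        if session_id not in self._assigned:
            self._assigned[session_id] = next(self._next_server)
        return self._assigned[session_id]


class WebServer:
    """Local cluster: plain round robin across the instances,
    with no knowledge of which session a request belongs to."""

    def __init__(self, name, instance_count):
        self.name = name
        self._next_instance = cycle(range(1, instance_count + 1))

    def pick_instance(self):
        return next(self._next_instance)


def route(balancer, servers, session_id):
    """Return (server name, instance number) for one request."""
    server_name = balancer.pick_server(session_id)
    instance = servers[server_name].pick_instance()
    return server_name, instance


if __name__ == "__main__":
    balancer = HardwareLoadBalancer(WEB_SERVERS)
    servers = {name: WebServer(name, INSTANCES_PER_SERVER) for name in WEB_SERVERS}
    for session_id in ["alice", "bob", "alice", "alice"]:
        print(session_id, route(balancer, servers, session_id))
```

In this model a user always lands on the same physical server, but consecutive requests alternate between that server's two ColdFusion instances. Whether the real local cluster behaves that way depends on its session settings, which the scenario leaves open.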
All servers are connected to a 24-port gigabit switch, and the site is hosted on an OC3 line. Project X is a web-based file sharing application which allows users to upload and share files of many types (images, PDFs, office documents, and more). It makes use of Application.cfm to load site variables, and uses several CFCs to encapsulate database queries and user information.
For several years this configuration has worked fine for the customer, with stable servers and acceptable response times. Project X recently ran an ad campaign in the national media, which has increased site traffic by a factor of 2. Since the campaign, users have been complaining of slow response times as well as error messages. The server logs also show that the ColdFusion instances have been crashing with out-of-memory errors.
You are tasked with uncovering the issues that are causing the slow page rendering and the out-of-memory crashes. In my next post I will share both the techniques I use to identify these issues and a selection of ideas from the comments on this post. Thoughts?
Now, to encourage participation, anyone who contributes a relevant comment to this chain of blog posts will be entered into a drawing to win an Alagad backpack (trust me, these are the best backpacks ever; I use mine for travel, school, everything). I will do the drawing for the backpack in a Connect room after this blog series is completed, so let's bring on the ideas!