How is this timeout on CFLOOP possible?

I've been having troubles with a particular hosting provider for a client's site, and rather than going on a rampage about my troubles with the provider, let me get to the really odd error that I get on the hosted site intermittently, but cannot duplicate on my development box.

Here is the offending code:

<cfdirectory name="Dir" filter="*.jpg" directory="/my/path/to/photos/">
<cfloop index="i" from="1" to="#Dir.RecordCount#"><cfset numArray[i]=i></cfloop>
<cfset QueryAddColumn(Dir,"Num",numArray)>

Sometimes this will process in a second and move on, other times the page will sit there for 30 seconds, time out, and then report, "The request has exceeded the allowable time limit Tag: CFLOOP".

Never mind even why I may be doing this or if there is a better way.. I'd like to know how this is even feasible. I know <cfloop> is obviously a candidate for infinite loops, but I'm looping over files in a directory that won't number more than a few hundred. Heck, even if for some crazy reason the record count provides a number in the millions, I'm executing such a simple command that it should take a couple seconds at most.

I don't understand how this error is feasibly possible without there being serious issues with the ColdFusion service on the server, which I actually wouldn't doubt.

CMS Roundup: Introduction

Over the next few weeks, I am going to review a handful of web content management systems for the community college where I work, and I will be blogging my reviews here in a series called "CMS Roundup". This review will cover mainly ColdFusion solutions, since that is my love and specialty, but a few non-CF packages may sneak their way into the mix.

Scale. I will be reviewing open source, entry level, and mid-market packages, which places the price range from free to $150,000. Enterprise-level solutions in the $200,000 to 7-digit numbers will not be considered because they are ridiculously out of our price range.

Weighted features. There are some features that I will be particularly interested in. The system has to be very easy for a non-technical user to understand. The developer should be able to set up various templates and security for different site areas. The marketing department should be able to enforce page approval and workflow to ensure the accuracy, consistency, and propriety of the content. Easy integration with existing external applications should be simple as well. These are just some of the features that will be considered; the strengths and weaknesses specific to each CMS will be reviewed as well. And I'm sure I'll get a feeling for more desirable features as I progress through the reviews.

The players. My list of CMS's for consideration is a fluid one, so please make recommendations while I'm in the middle of this process. The open source ColdFusion players are FarCry, Gerobase, and Katapult; commercial ColdFusion players are CommonSpot, ShadoCMS, HotBanana, and Savvy. I'll look at a couple non-CF contenders, like Estrada and CrownPeak in the commercial space and Plone, Joomla, and Drupal in the open source space.

Likely, this journey will end with a summary of strengths and weaknesses of all CMS's, including winners in each category, and of course, the name of the CMS that ultimately earned a new customer. ;-)

The game is afoot.

Using LIKE in Query of Queries SQL

The Query-of-Queries (QoQ) feature of ColdFusion is especially useful for manipulating data that ColdFusion holds in a Query object but isn't actually a result set from a SQL query. For instance, LDAP results, directory lists, and FTP directory lists are all returned as Query objects by <cfldap>, <cfdirectory>, and <cfftp>.

That said, don't forget that QoQ supports the LIKE conditional for pulling the records you want. But you can search for more than 'StartOfPhrase%' when using LIKE.

I have an app that calls an LDAP query and then uses QoQ to find all User IDs (cn's) that start with "W" or "G". Why didn't I just make this filter in the LDAP call? With effort, I could have, but I was already filtering on different requirements in LDAP, and it was easier at this point to fine-tune the results with a QoQ. Previously, my code looked something like this:

SELECT * FROM LdapResults WHERE cn LIKE 'W%' OR cn LIKE 'G%'

Whereas that worked, it is needlessly verbose; this became more clear when I also had to check for IDs starting with "A", and I realized how uncomely that approach was. It also doesn't enforce the fact that the legit IDs will always be 6 characters in length.

So start using LIKE's other accepted symbols! Brackets [] let you specify a range of acceptable characters, and underscore _ lets you define how many characters to look for, versus the wildcard % which just accepts any length of characters.

SELECT * FROM LdapResults WHERE cn LIKE '[WAG]_____'

That query is more accurate and actually shorter (note: syntax highlighting makes the underscores difficult to see).

Finally, what if the first character should always be a "W", "A", or "G", but the remaining 5 characters should always be digits? The QoQ LIKE doesn't support repetition quantifiers like the curly braces {} do in RegEx. So you will have to manually repeat [0-9] for each character:

SELECT * FROM LdapResults
WHERE cn LIKE '[WAG][0-9][0-9][0-9][0-9][0-9]'

That does start getting more verbose again, but the SQL is now very precise regarding its filter requirements. A little tweaking like this, and QoQ can be extremely handy.

Notes: Are You “Cashing In” on Caching?

Caching is a way of saving data after it has been originally computed or loaded. Alleviates load from server(s). Makes them faster and "healthier". Also improves the users' experience.

On the server level, we can cache templates. ColdFusion has a trusted cache of compiled CFML. We can also cache data objects (queries, structs, arrays, CFCs, etc), and then render HTML content, either entire pages or HTML regions.

What is good to cache? Things that are global to all users, like lists of states, navigation, etc. Even if not for all users, things that are used repeatedly in the application. Things that aren't subject to much change. The more variants of something (based on the number of inputs), the less it will benefit from caching.

So how do we find these things? Examine ColdFusion logs, look for long-running pages. Ugh. Better: Go through debugging output. Look for poor execution times. But usually we look at it in a dev environment where load isn't heavy. Can also use getTickCount(), <cflog>, <cftrace>. Best: SeeFusion, FusionReactor, CF8 Admin will now give us more insight into performance of server and, more specifically, the threads or tasks that might be causing the heavy load.

Ask yourself:

1. How often do changes take place? The more the item is in flux, the less it is a caching candidate.

2. Up-to-the-second data is critical? No good.

3. How many cache variants are possible based on the variable input arguments?

4. What is the average count/size/length of each cache variant? Memory footprint might become an issue.

What is a "Cache Variant"? A single set of data derived from the application code and based on a single combination of values for its arguments. So if we have SQL with userid=#userID and active=#active#, these are two distinct inputs. If you have 1,000 users, and 2 active states, you have 1,000 x 2 = 2,000 variants for the query cache. Obviously, the number of inputs drastically increases the variant pool. But ask yourself: How many likely variants exist? There likely are many permutations that you know are very unlikely to happen. Consider this when thinking about the likely footprint of your caching. Give attention to where you cache as well. Perhaps memory, disk, database.

Measuring effects of caching. A 500ms page executed 60 times per minute is effectively taking 30s/1m. 30s x 60m = 1,800s per hour. If caching that page takes it down to 20ms, this will be 12s/1m x 60m = 72s per hour. You can see how drastic the caching is on overall system performance.

Trusted Cache. Only have this turned on in the production environment. Can be problematic for a dev server since files are changing constantly. Will recompile or depend on compiled cache based on date/time stamps of the files. This could cause problems when templates are deployed via FTP or source control. For instance, if you deploy an older version to roll back some code, if may not recompile the new code! It will use the cache! So be cognizant of this and clear the cache if that occurs.

Restarting entire cache: Restart Application Server. Ugh. But in CFMX7, can "Clear Template Cache Now" in ColdFusion Administrator. Brian Szoszorek have an article for clearing just specific caches.

"Save Class Files" option. In the CF Admin "Caching" settings. When enabled, generates and saves a *.class file for each CFML template executed in WEB-INF/cfclasses/ directory. Can save a small bit of load when server is restarted. Server-wide setting. This option often doesn't actually give any decent speed gain, because it uses File I/O.

Query Caching. Adding cachedWithin or cachedAfter to <cfquery> tag. Note that CF Admin has a "Maximum number of cached queries" setting. Should be set high for a server-wide multi-application environment to be useful. Query caching is driven by: Query name, SQL statement, datasource, username/password, input arguments to SQL statement. All of these must be the same to use the cache. Even the tabbing/spacing of the SQL statement will cause it to not reference the cache if different!

Good caching examples: CachedWithin is good if something isn't changing for a short period of time. Especially great when no inputs that would cause variants. Show top 10 news articles, cache for 15 minutes. CachedAfter is good when running a query on data that won't change after a certain time. Show records for month of 6/2007. You could do cachedAfter with a date of 6/30/2007.

Pros of query caching: Built-in. Debugging output will show that it was cached. But many cons: No control of where queries "go" when being cached. In one big shared resource pool. Max number of cached queries is hard to set reasonably. Difficult to clear particular items unless clear whole cache. Cannot cache queries with <cfqueryparam> (until CF8). Cannot cache with <cfstoredproc>, however you can invoke the stored procs inside of <cfquery> with the EXEC command. No way to track or view the whole cache. Service/instance specific, not application specific.

Alternatives to data caching. Take queries and put them in Server, Application, or Session scope. But you will have to programmatically start managing that. This obviously can be done with not just queries but also structs, arrays, objects, simple vars.

Content Caching. You can use the <cfcache> tag. Allows you to cache entire contents of a page. Like <cfquery>, you give it a cachedWithin value. Stores the pushed HTML to disk. Decision to use cache is based on the URL requested.

Alternatives for content caching. You can use <cfsavecontent> then put it in a shared scope (server, application, session, etc).  Various custom solutions (cf_superCache, cf_accelerate, cf_cacheOmatic, cf_turboCache, etc).

  Theme Brought to you by Directory Journal and Elegant Directory.