ColdFusion Muse

List Delimiters and ColdFusion Magic

Mark Kruger | March 20, 2008 10:56 PM | ColdFusion Tips and Techniques

Here is one of those finicky nuances that might surprise you about ColdFusion. Many languages have list functions or something similar. In many of these languages there is some version of a split function that allows you to specify any string as a delimiter, regardless of length. This might lead you to believe that you can use a multi-character string as a delimiter in ColdFusion list functions. Not only is this not the case, but the way delimiters behave can make you believe it is working when in fact it is not. Let me explain.

Let's start with some sample code. Consider this simple little list parsing scriptlet:

<cfset itemlist = 'joe3bob3harry3mary3Ann4jo-bob'/>

<cfloop list="#itemlist#" index="x" delimiters="3">

<cfoutput>#x#<br></cfoutput>

</cfloop>
If you run this code it will display a list that looks like this:
joe
bob
harry
mary
Ann4jo-bob

The 3 is the delimiter and the list is exactly what you think it should be. So far there is nothing surprising. But now let's attempt a multi-character delimiter. Let's use "3-4" as our delimiter in the following example.
<cfset itemlist = 'joe3-4bob3-4harry3-4mary3-4Ann4jo-bob'/>

<cfloop list="#itemlist#" index="x" delimiters="3-4">

<cfoutput>#x#<br></cfoutput>

</cfloop>
You would expect this to show:
joe
bob
harry
mary
Ann4jo-bob

But instead you get a very peculiar result. It will look like this:
joe
bob
harry
mary
Ann
jo
bob

What in the ham sandwich is going on here? As it happens, ColdFusion treats each of the three characters as a delimiter in its own right. Any time a 3, a 4 or a dash (-) shows up in the string, it ends the current list item. Runs of adjacent delimiters do not produce extra items because the list functions skip empty elements, which is why "3-4" appears to split cleanly. This will fool you because your compound delimiter will often look like it is working correctly. Why? Because the first character of your delimiter will always trigger a list item. If your delimiter characters rarely occur in your data, it is often going to look right. But you will end up with a hard-to-find bug that only rears its head when your tested code starts handling real-world data.
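
To make the behavior concrete, here is a minimal sketch. It counts the items ColdFusion finds with the "3-4" delimiter string, then shows one possible workaround: swapping the whole sequence for a single stand-in character before looping. The variable names sep and safelist and the choice of chr(31) are assumptions for illustration; use any character that can never appear in your data.

```cfml
<!--- Sample data: "3-4" is meant to be ONE separator --->
<cfset itemlist = 'joe3-4bob3-4harry3-4mary3-4Ann4jo-bob'/>

<!--- "3", "-" and "4" each count as a delimiter, so this reports 7 items --->
<cfoutput>#listLen(itemlist, "3-4")#<br></cfoutput>

<!--- Workaround sketch: replace the whole sequence with a single character.
      chr(31) is an assumed stand-in; pick one that cannot occur in your data.
      The "all" scope matters: replace() only changes the first match by default. --->
<cfset sep = chr(31)/>
<cfset safelist = replace(itemlist, "3-4", sep, "all")/>

<!--- Loop on the single-character delimiter instead --->
<cfloop list="#safelist#" index="x" delimiters="#sep#">
<cfoutput>#x#<br></cfoutput>
</cfloop>
```

With the sample data, listLen reports 7 items, while the loop over safelist prints the five you actually wanted: joe, bob, harry, mary and Ann4jo-bob.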

Please note, the code above was tested on ColdFusion 7. ColdFusion 8 introduces lots of additional functionality (looping over a file a line at a time, for example) that may change this behavior. Oddly, I have never actually seen this list nuance blogged or discussed anywhere, but that could be an anomaly. In any case, I will try to test it on CF 8 if I have the time and post an update.


9 Comments

  • Darrell | 3/20/08 10:41 PM
    Jeff Peters's book “ColdFusion Lists, Arrays and Structures” covers examples like this one, plus lots of other behavior you might not expect.
  • JC | 3/21/08 6:02 AM
    Hmm... I never even considered it could mean anything else. Every ColdFusion function that has a "delimiters" attribute defines it as a list of delimiters. Some of them even have an "includeEmptyFields" attribute that would show the problem pretty clearly. Maybe setting your multi-character delimiter as a variable and then using that variable would work, but I doubt it... Seems like you'd end up having to use a user-defined function like Split() http://www.cflib.org/udf.cfm/split
  • JC | 3/21/08 6:05 AM
    CF8 results are identical to CF7. I just finished upgrading my last server to it, so thanks for the excuse to play. lol.

    joe
    bob
    harry
    mary
    Ann4jo-bob

    joe
    bob
    harry
    mary
    Ann
    jo
    bob
  • Tom Mollerus | 3/21/08 8:17 AM
    Yeah, I too think it's common understanding that list functions deal with single-character delimiters. As JC says, the "delimiters" attribute is plural, indicating that each character is considered a delimiter by itself.
  • Dan Roberts | 3/21/08 8:35 AM
    <cfset equation = "1*9+4-3/2.5">
    <cfset operands = listToArray(equation,"+-*/")>
    <cfset operators = listToArray(equation,"0123456789.")>

    that's pretty cool
  • Mark Kruger | 3/21/08 8:47 AM
    @Dan,

    Wow... that is a really useful example that I had not thought of - a way to get all the operators out of an equation... neato.
  • Dan Roberts | 3/21/08 8:53 AM
    In the past I would have checked each character individually or looped over a regular expression. This is so much better.

    I looked at Java's docs and String.split() accepts a regular expression. That is just awesome and would eliminate a lot of looping over regular expressions in parsing.
  • Dan Roberts | 3/21/08 9:19 AM
    The same using String.split() would be the following. Heck, this may be exactly how CF is doing it under the hood. This also allows for using multiple lengths of delimiters.

    <cfset equation = "1*9+4-3/2.5">
    <cfset operands = equation.split("[+\-*/]")>
    <cfset operators = equation.split("[0-9.]+")>

    It does produce an extra empty element at the start of the operators array, though that is probably my fault somewhere in the expression. Also it is my understanding that it returns a slightly different variable type than CF generally uses for arrays.
  • David Sirr | 3/26/08 6:34 AM
    Hi Mark, this has caused me heartache since way back in the day... The easiest way I found to work with this scenario was to do a replace on your multiple-character delimiter with, say, a pipe '|' character just before you need to run list functions on it, such as replace(list,'multidelim','|','all')