ColdFusion Muse

UUID Magic With Java - When Speed is Critical

Mark Kruger January 13, 2011 2:32 PM ColdFusion Comments (6)

Note, this post is compiled from information foraged and provided generously by the inestimable Brian Meloche whose ColdFusion skills are (quite obviously) legendary. The Muse gives praise where praise is due - and with a nice dose of hyperbole to boot.

Many folks use UUIDs for various reasons. CF has a nice built-in function that handles UUID creation - createUUID(). Try it out - use <cfoutput>#createUUID()#</cfoutput>. You should see a funky 35 character string that looks like this - 7C286425-CA3D-1B10-16A3CCD259C21FEE. If you are new to ColdFusion or programming in general you might not realize what's special about the UUID. It is guaranteed to be unique - at least within reason. A duplicate is statistically possible, but your chance of finding one is about the same as being kissed by Anna Paquin or being intellectually stimulated by the show "Jersey Shore". If you want to know more about the uniqueness of UUIDs check out this interesting (or mind numbing - depending on your perspective) article on Wikipedia.
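A quick way to see the layout for yourself is the sketch below. It assumes nothing beyond built-in functions: createUUID() returns 35 characters in an 8-4-4-16 layout, which isValid() treats as the "uuid" type, while the standard 36 character 8-4-4-4-12 layout is the "guid" type. The variable name id is just for illustration.

```cfml
<!--- Generate one ColdFusion-style UUID and inspect its shape --->
<cfset id = createUUID() />
<cfoutput>
    Value: #id#<br />
    Length: #len(id)#<br />
    <!--- "uuid" is ColdFusion's own 8-4-4-16 layout --->
    Valid CF UUID: #yesNoFormat(isValid("uuid", id))#<br />
    <!--- "guid" is the standard 8-4-4-4-12 layout, which createUUID() does not produce --->
    Valid standard GUID: #yesNoFormat(isValid("guid", id))#
</cfoutput>
```

The one-hyphen difference between the two layouts matters as soon as you mix ColdFusion UUIDs with ones generated by Java or by your database.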

Just how would you use a UUID? There are a myriad of ways. You could store it as a cookie to identify a "unique" visitor. You could use it to tie into your custom "roll your own" session management. You could use it as part of a seed for encryption. But probably the way most folks use it is as a primary key to the DB. Now I know that most RDBMSs include a built-in UUID function. If you are planning on programming for one DB (and there are often very good reasons to do so) then I recommend using the built-in function. It's typically faster to have your DB server create something like a UUID than to pass a 35 character string to it via the JDBC driver. However, if you wish your code to be portable then it's likely you will be creating your own UUIDs using ColdFusion's built-in functionality.
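As a rough illustration of the primary key case, here is a minimal sketch. The datasource name (myDSN), the table (visitors) and its columns are all hypothetical, so substitute your own; the only assumption about the schema is a CHAR(35) key column.

```cfml
<!--- Generate the key in ColdFusion so the same code runs against any database --->
<cfset visitorId = createUUID() />

<!--- myDSN, visitors, visitor_id and created_at are hypothetical names --->
<cfquery datasource="myDSN">
    INSERT INTO visitors (visitor_id, created_at)
    VALUES (
        <cfqueryparam value="#visitorId#" cfsqltype="cf_sql_char" maxlength="35" />,
        <cfqueryparam value="#now()#" cfsqltype="cf_sql_timestamp" />
    )
</cfquery>
```

Because the key is created before the insert, the page already knows the new row's ID and does not need to ask the database for it afterwards.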

Slight Problem

If you are using a version of CF prior to version 9 then ColdFusion has a particular way of generating UUIDs that is tied to the clock and MAC address. It is capable of generating about 100 per second. There are times when that number might become a bottleneck on a high traffic site. For example, suppose you are at the peak of your traffic with 700 or 800 concurrent connections and suddenly an aggressive bot starts crawling your site. ColdFusion may not be able to quite keep up with that number of generated UUIDs (assuming your bots are generating UUIDs through your code somehow – logging, sessions or whatever).

Fortunately there is a "Java" way to do UUIDs that is able to get around this issue. Actually (quoting Brian) there are 5 "versions" of UUID, and the java.util.UUID class can generate the random kind through its randomUUID() method. CF apparently uses a slower one of the 5 (presumably more compatible with disparate environments or whatever). The following code calls randomUUID() instead. Java returns a 36 character string with one more hyphen than ColdFusion's format, so removeChars() drops the hyphen at position 24 and uCase() matches the upper case output of createUUID(). In most cases you could simply drop this code in to replace your createUUID() code:

<cfset uuid = createobject("java", "java.util.UUID") />
<cfset newUUID = uCase(removeChars(uuid.randomUUID().toString(), 24, 1)) />
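If you call this from more than one place, it may be handy to wrap it in a function. The sketch below is one way to do that, written with tags so it should also run on versions before CF9. Both function names (createFastUUID and toStandardUUID) are made up for this example and are not built-in functions; the second one reverses the conversion for cases where you need to hand the value to something expecting the standard 36 character layout.

```cfml
<!--- Hypothetical helper: a random UUID from Java in ColdFusion's 35 character layout --->
<cffunction name="createFastUUID" returntype="string" output="false">
    <!--- randomUUID() is a static method, so the class reference is enough --->
    <cfset var standard = createObject("java", "java.util.UUID").randomUUID().toString() />
    <!--- The standard layout is 8-4-4-4-12; character 24 is the fourth hyphen --->
    <cfreturn uCase(removeChars(standard, 24, 1)) />
</cffunction>

<!--- Hypothetical helper: convert a ColdFusion-style UUID back to the standard layout --->
<cffunction name="toStandardUUID" returntype="string" output="false">
    <cfargument name="cfUUID" type="string" required="true" />
    <!--- insert() adds the hyphen after character 23, so it lands at position 24 --->
    <cfreturn insert("-", arguments.cfUUID, 23) />
</cffunction>

<cfset id = createFastUUID() />
<cfoutput>#id# / #toStandardUUID(id)#</cfoutput>
```

Keeping the conversion in one place also means you can switch back to createUUID() later by changing a single function body.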

Another Brian, Brian Ghidinelli, published this post where he tests the speed of createUUID() against the Java code above and finds a nearly thousandfold increase in speed using the Java UUID code.

Meanwhile Brian Meloche wanted to verify that ColdFusion 9 improves the performance of the original CreateUUID() so he ran the following test on both platforms:


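<!--- Time 1000 calls to the built-in createUUID() --->
<cfset timeIs = getTickCount()>
<cfloop from="1" to="1000" index="i">
    <cfset uuid = createUUID() />
</cfloop>
<cfset timeIs = getTickCount() - timeIs>
<cfoutput>createUUID(): #timeIs# ms</cfoutput>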


<!--- Time 1000 calls to the Java version --->
<cfset timeIs = getTickCount()>
<cfloop from="1" to="1000" index="i">
    <cfset uuid = createobject("java", "java.util.UUID") />
    <cfset newUUID = uCase(removeChars(uuid.randomUUID().toString(), 24, 1)) />
</cfloop>
<cfset timeIs = getTickCount() - timeIs>
<cfoutput>java.util.UUID: #timeIs# ms</cfoutput>

What did he find? While ColdFusion 8 benefited tremendously from the Java version with a 200 to 1000 times improvement, ColdFusion 9 saw only a modest 10 times improvement. From this test I think we could reasonably conclude that using ColdFusion 9 you are not likely to run into any problems with UUID bottlenecks regardless of whether you use createUUID() or the Java version of the code.
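For anyone still on ColdFusion 8 or earlier, one variation on the Java test may be worth trying. This is my own sketch and not part of Brian's runs: it looks up the java.util.UUID class once, outside the loop, so the timing covers only UUID generation and string handling. The variable name uuidClass is an assumption for illustration, and no timings are claimed here since they will vary by server.

```cfml
<!--- Look up the Java class once; randomUUID() is static so the reference can be reused --->
<cfset uuidClass = createObject("java", "java.util.UUID") />

<cfset timeIs = getTickCount() />
<cfloop from="1" to="1000" index="i">
    <!--- Same conversion as before: drop the fourth hyphen and upper case the result --->
    <cfset newUUID = uCase(removeChars(uuidClass.randomUUID().toString(), 24, 1)) />
</cfloop>
<cfset timeIs = getTickCount() - timeIs />

<cfoutput>java.util.UUID, class reference reused: #timeIs# ms</cfoutput>
```

In a real application the same idea applies: keep the class reference somewhere long-lived rather than calling createObject() on every request.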

  • Posted By Paul Nielsen | 1/13/11 4:46 PM
    When considering Primary Keys, the database is certainly a part of the performance equation. Most folks put a clustered index on the primary key (in fact SQL Server does this by default). The data is organized physically in the table (clustered index) in the order of the clustered index key. Inserting non-sequential rows into a clustered index causes a ton of page splits and reorganization of the data pages. This is why inserting rows using a UUID (or MSFT GUID) primary key is about 400 times slower than inserting sequential (Identity col) primary keyed data.
    If uniqueness across the universe is important to your data there is an option. SQL Server has the ability to generate GUIDs that are always larger than the largest existing GUID in the table. Instead of using NewID() in your code, use NewSequentialID(), but this ONLY works in the column default.
  • Posted By Mark Kruger | 1/13/11 5:01 PM

    Great great comment - thanks. Remind me ... what version of SQL started that "newSequentialID()"?

  • Posted By Peter Boughton | 1/14/11 6:40 AM
    >> But probably the way most folks use it is as a primary key to the DB <<

    Not here. If the unlikely situation of merging two sets of data arises, I'd write a script to do it anyway, and would handle potential collisions with that.
    Then, in the regular day to day debugging and maintenance, I can deal with simple numbers for primary keys, and avoid unnecessary headaches.


    >> uCase(removeChars(uuid.randomUUID().toString(), 24, 1)) <<

    Given this is a UUID, why bother uppercasing it?

    Similarly, wouldn't it be simpler and faster to increase the database length by one instead of removing the extra hyphen (at least I'm assuming it's an extra hyphen being removed, and not actual data)?

    The actual value of a UUID is generally irrelevant; other than it being a unique string that doesn't contain any fancy characters, it doesn't really matter what it looks like.
  • Posted By Mark Kruger | 1/14/11 10:22 AM

    Good comments - not strictly on point, but good :)

    Personally I find the upper case useful - easier to read than jumble case. But to each his own.

    As for ints for primary keys - I use that approach as well in many cases - but there is security to consider. If you are passing values back and forth ints make it easy to probe your data. Yes, I know you should inoculate your app and harden it against injection and secure your data in other ways. But there are cases where UUID is the proper choice - indeed there are some schools of thought that dictate UUIDs for all primary keys for this very reason.

  • Posted By Sean Corfield | 4/16/11 9:02 PM
    This post cropped up in my recent research into UUIDs as primary keys. See some of my investigation here:
  • Javin @ classpath Java tutorial
    Indeed great comments.