There has been a lot of talk recently about design patterns aimed at circumventing the overhead ColdFusion imposes on us when creating CFCs. I'm not sure who coined the term "Object Instantiation Penalty", but the first reference to it I can find in the CF community was over at the Dot Matrix blog. Everyone seems to agree that object creation in ColdFusion leaves something to be desired, but I haven't seen anyone really quantify the price yet. I decided some line charts were in order.

Here comes the disclaimer: these are nothing more than simple iterative tests performed on my home PC (Windows XP, 1 GB RAM, 1.8 GHz AMD) using CF 8 Developer. They do not emulate load, they are not official, and they are not comprehensive. In fact, you should probably stop reading now and ignore this entire post.

OK, now that the disclaimer is out of the way, here is what I did. I simulated what you might do with a result set of 1, 5, 10, 25, 50, 100, 500, 1000, 2500, 5000, and 10000 records if you were to prepare them as an array of structs compared to an array of components.

The most common scenario, of course, is probably just to pass back the result set directly. Unless you are using duplicate(), the overhead is probably negligible regardless of the size of the query, since query objects are passed by reference and only a pointer gets created in memory. I assume an IBO (Iterating Business Object) would fall into this category, since only one object and a pointer to the result set get created.

I didn't count the database call itself in the test, since that is standard across all tests. I ran garbage collection before each pass, then waited 700 ms to give it a chance to finish in case it collects asynchronously. I measured how long it took to process each set of records, as well as how much memory usage climbed during the test. Here is the code for my test harness:
[code]<cfsetting enablecfoutputonly="Yes">
<cfset arrObjects = arrayNew(1)>
<cfset i = 0 />

<cfset runtime = CreateObject("java","java.lang.Runtime").getRuntime()>
<cfset thread = CreateObject("java","java.lang.Thread")>

<!--- Run the test 11 times with the following result set sizes --->
<cfloop list="1,5,10,25,50,100,500,1000,2500,5000,10000" index="num_records">
	<!--- This will get the number of records contained in the num_records variable --->
	<cfinclude template="qry.cfm">
	<!--- Run garbage collection --->
	<cfset runtime.gc()>
	<!--- Wait to make sure it is finished --->
	<cfset thread.sleep(700)>
	<!--- Get starting memory usage --->
	<cfset start_mem = runtime.totalMemory() - runtime.freeMemory()>
	<!--- Get start time --->
	<cfset start_time = gettickcount()>
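	<!--- Process each record (build a struct or an object, depending on the test) --->
	<cfloop from="1" to="#num_records#" index="i">
		<cfinclude template="process_records.cfm">
	</cfloop>
	<!--- Elapsed time in seconds --->
	<cfset total_time = (gettickcount() - start_time) / 1000>
	<!--- Growth in used heap, in megabytes --->
	<cfset mem_increase = ((runtime.totalMemory() - runtime.freeMemory()) - start_mem) / 1024 / 1024>
	<cfoutput>#num_records#	#total_time#	#mem_increase#
</cfoutput>
</cfloop>
[/code]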
For my test data, I chose a table at random out of my database. This table just so happened to hold calendar events. It seemed like a good sample since it had varchars, dates, and integers. The qry.cfm file simply has a cfquery in it, with the maxrows attribute set to whatever was passed in for num_records. I also had each of my tests append its results to an array as it went along.

I created the structs three different ways. The first way was just to create completely empty structs and not even bother populating them from the result set.
	tmpStruct = structnew();
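As a rough sketch of how the pieces described above could fit together for this first test, the two include files might look something like the following. The query name qry_calendar_events and the arrObjects array come from the code in this post; the datasource name my_dsn and the table name calendar_events are placeholders assumed purely for illustration.

[code]<!--- qry.cfm (sketch): datasource and table names are hypothetical --->
<cfquery name="qry_calendar_events" datasource="my_dsn" maxrows="#num_records#">
	SELECT *
	FROM calendar_events
</cfquery>

<!--- process_records.cfm (sketch) for the empty-struct test --->
<cfscript>
	// Create an empty struct and append it to the shared results array
	tmpStruct = structnew();
	arrayAppend(arrObjects, tmpStruct);
</cfscript>
[/code]

Swapping in a different process_records.cfm would then be enough to switch between the tests without touching the harness.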
The second struct version created a struct and used simple set statements to populate it from that record of the query.
	tmpStruct = structnew();
	tmpStruct.event_id = qry_calendar_events.event_id;
	tmpStruct.calendar_id = qry_calendar_events.calendar_id;
	tmpStruct.title = qry_calendar_events.title;
	tmpStruct.start_date = qry_calendar_events.start_date;
	tmpStruct.start_time = qry_calendar_events.start_time;
	tmpStruct.end_date = qry_calendar_events.end_date;
	tmpStruct.end_time = qry_calendar_events.end_time;
	tmpStruct.location = qry_calendar_events.location;
	tmpStruct.contact_person_id = qry_calendar_events.contact_person_id;
	tmpStruct.more_info = qry_calendar_events.more_info;
	tmpStruct.registration = qry_calendar_events.registration;
	tmpStruct.reg_start_date = qry_calendar_events.reg_start_date;
	tmpStruct.reg_end_date = qry_calendar_events.reg_end_date;
The last struct method used a function found in the Illudium PU-36 Code Generator called queryRowToStruct().
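For readers who have not seen a function like this, here is a minimal sketch of what a generic row-to-struct helper can look like. This is not the actual Illudium PU-36 source; the name queryRowToStructSketch and its body are assumptions written only to illustrate the idea.

[code]<cfscript>
// Hypothetical sketch, not the Illudium PU-36 implementation.
// Copies one row of a query into a new struct keyed by column name.
function queryRowToStructSketch(qry, row) {
	var result = structNew();
	var cols = listToArray(qry.columnList);
	var c = 0;
	for (c = 1; c lte arrayLen(cols); c = c + 1) {
		// Dynamic lookup: column name first, then row number
		result[cols[c]] = qry[cols[c]][row];
	}
	return result;
}
</cfscript>
[/code]

A helper like this loops over the column list and resolves each key dynamically, which is extra work compared with the hard-coded assignments in the second struct test.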
Next, I moved on to creating actual CFCs. For this example, I turned to Illudium PU-36 again to generate a generic bean for my calendar class with getters and setters already in it. Perhaps I'll try a version with synthesized getters and setters later. To follow my trend, my first object example simply created empty objects without populating them with any data.
	tmpObj = createObject("component","calendar");
My next test created an empty object and then called the setters one by one to populate it.
	tmpObj = createObject("component","calendar");
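The snippet above shows only the createObject() call. As a sketch of what the setter calls for this test might look like, assuming the generated bean exposes one setter per column named after that column (the exact setter names are an assumption, not taken from the generated code):

[code]<cfscript>
	// Sketch only: setter names are assumed to follow a set<Column> pattern
	tmpObj = createObject("component","calendar");
	tmpObj.setEvent_id(qry_calendar_events.event_id);
	tmpObj.setCalendar_id(qry_calendar_events.calendar_id);
	tmpObj.setTitle(qry_calendar_events.title);
	// ...one call per remaining column...
	arrayAppend(arrObjects, tmpObj);
</cfscript>
[/code]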
My third object example passed a struct in during object creation and let the init() method call all the setters.
	tmpObj = createObject("component","calendar").init(argumentCollection=queryRowToStruct(qry_calendar_events,i));
And finally, I commented out all of the cfarguments and setters in the init() and replaced them with this.
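[code]<cffunction name="init" access="public" returntype="calendar" output="false">
	<!--- Copy every passed argument straight into the instance struct, bypassing the setters --->
	<cfset structappend(variables.instance,arguments)>
	<cfreturn this />
</cffunction>
[/code]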
Then I ran the same code as the third test (passing in a struct to the init() method for population upon object creation). I ran each test a couple of times to make sure everything was compiled and cached before I recorded the results. When I was finished, I pointed ColdFusion's JVM to Java 1.5 (down from 1.6) and did the whole thing all over again.

[Figure: graphs of processing time and memory usage on Java 1.6]

[Figure: graphs of processing time and memory usage on Java 1.5]

So, what did I find? Here are some notes on the comparisons of my methods:
  • Structs are faster to create than objects (duh)
  • Empty structs and empty objects are faster to create than populated ones. (duh)
  • It is faster to manually populate a struct than to use queryRowToStruct
  • It is faster to populate your object with setters AFTER creating it than to pass the arguments to a bunch of setters in the init().
  • The exception to the above is if your init() uses structappend to cram the arguments collection into the variables scope. This obviously precludes the option of custom setters, though.
  • Java 5 is slightly faster at creating objects than Java 6. It's not much, but there is a difference. Struct creation time was the same on both.
Here are some overall conclusions I came to:
  • You can instantiate up to 100 simple objects with effectively zero penalty.
  • You can create up to 1000 simple objects and probably not notice the penalty (~200 ms - 800 ms).
  • Once you get to 4000 to 5000 objects, the penalty starts to turn sharply upward. I don't know if the effect is actually exponential, since the gaps between my result set sizes were uneven.
  • For an example where you were only dealing with 20 to 30 records, I think it might be perfectly acceptable to deal with your records as objects.
  • One caveat to the above: object inheritance and concurrent load might affect performance more than I am guessing.
  • I am absolutely positive your mileage will vary.
For future tests, if anyone is interested, I would like to try creating synthesized objects and/or Transfer objects, which wouldn't have as many methods like getters and setters. I would also like to conduct the same test in Java as a baseline from another OO language.