I’ve been working with XML feeds lately, and I’ve stumbled across some irritating issues with ColdFusion, or perhaps it all revolves around programmer error... nah! Things were rolling along smoothly with XML. I pull in the feed, XmlParse(), then XmlSearch(). XPath was making life easy. Then came that overgrown batch of XML. ColdFusion coughed, farted, and fell over dead. XmlParse couldn’t handle the larger XML string, about 40,000 characters long. While that is a long string, in terms of XML it is not all that gigantic. Or maybe it flaked out on some type of content.

So now off to the next resort: regular expressions. Since I only needed one tag name in the XML, this was an easy solution, at first. Later versions of ColdFusion give us ReMatch(), which is handy since ReFind by default only returns the position of the match, not the actual string found. I whipped up my regex:

matches = ReMatchNoCase("\<c\:mytagname.*?\>(.+?)\<\/c\:mytagname\>", xmlstr);
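
To make the difference between the two functions concrete, here is a minimal sketch. The `sample` string and the variable names are made up for illustration and are not taken from the real feed. It also shows that ReFind can hand back subexpression positions when its fourth argument is true, which is one way to get at the inner text of the first match.

```cfml
<cfscript>
// Hypothetical sample data standing in for the real feed
sample = '<c:mytagname id="1">first</c:mytagname><c:mytagname>second</c:mytagname>';
re = "<c:mytagname.*?>(.+?)</c:mytagname>";

// reFindNoCase: numeric position of the first match only
pos = reFindNoCase(re, sample);

// reMatchNoCase: array of the full matched strings, tags included
matches = reMatchNoCase(re, sample);

// With returnsubexpressions = true, reFindNoCase returns pos/len arrays;
// index 1 is the whole match, index 2 is the first parenthesized group
found = reFindNoCase(re, sample, 1, true);
inner = mid(sample, found.pos[2], found.len[2]);

writeDump(pos);
writeDump(matches);
writeDump(inner);
</cfscript>
```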

To my surprise, this breezed through that same large XML string and found all occurrences of the tag I was looking for. But I didn’t want to return the actual tags, just the text inside, so I made another tweak, wrapping the tags in a lookbehind and a lookahead:

// forewarning, this won't work
matches = ReMatchNoCase("(?<=\<c\:mytagname.*?\>)(.+?)(?=\<\/c\:mytagname\>)", xmlstr);

As my luck goes, another letdown. ColdFusion’s built-in regex functions don’t support lookbehinds, though lookaheads seemed to work OK. There is, however, a small workaround according to this Stack Exchange answer, which is to use Java’s regex engine. There is also a cfRegex library.

jrex = createObject('component','jre-utils').init();
matches = jrex.match("(?<=\<c\:mytagname.*?\>)(.+?)(?=\<\/c\:mytagname\>)", xmlstr);
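
For comparison, the Java route can also be sketched without any extra library by calling java.util.regex directly from cfscript. This is only an illustration under a few assumptions: the sample string and variable names are invented, and the pattern uses a capture group instead of a lookbehind. As far as I know, Java's engine rejects lookbehinds that have no obvious maximum length, so an unbounded `.*?` inside `(?<=...)` may well fail there too.

```cfml
<cfscript>
// Hypothetical sample data standing in for the real feed
xmlstr = '<feed xmlns:c="urn:example"><c:mytagname id="1">first</c:mytagname><c:mytagname>second</c:mytagname></feed>';

// Static access to java.util.regex.Pattern (no init() needed)
Pattern = createObject("java", "java.util.regex.Pattern");

// Case-insensitive, and let "." span line breaks
flags = bitOr(Pattern.CASE_INSENSITIVE, Pattern.DOTALL);

// Capture group 1 holds just the text between the tags
compiled = Pattern.compile("<c:mytagname[^>]*>(.+?)</c:mytagname>", javaCast("int", flags));
matcher = compiled.matcher(javaCast("string", xmlstr));

values = [];
while (matcher.find()) {
    // group(1) is the inner text, so no tag stripping is needed afterwards
    arrayAppend(values, matcher.group(javaCast("int", 1)));
}
writeDump(values);
</cfscript>
```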

While that certainly improves the power of the regex, I didn’t feel the need to install cfRegex or use the jre-utils. I went on my way and just chopped out the tags myself. However, if I were doing a lot of processing, I would definitely make use of cfRegex. Moreover, I would much prefer to use XmlParse and XmlSearch if they could handle the data. My final solution consisted of a try/catch: try to use XmlParse and XmlSearch, and if that fails, fall back to string parsing.
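
A rough sketch of that try/catch approach might look like the following. The function name `getTagValues` is hypothetical, and the `local-name()` XPath and the `[^>]*` form of the pattern are illustrative choices, not the exact code from the project. The fallback branch strips the opening and closing tags from each regex match by hand.

```cfml
<cfscript>
// Hypothetical helper: returns the text of every <c:mytagname> element
function getTagValues(required string xmlstr) {
    var values = [];
    var item = "";
    try {
        // Preferred path: real XML parsing plus XPath.
        // local-name() avoids depending on how the c: prefix is declared.
        var nodes = xmlSearch(xmlParse(arguments.xmlstr), "//*[local-name()='mytagname']");
        for (item in nodes) {
            arrayAppend(values, item.xmlText);
        }
    } catch (any e) {
        // Fallback: plain string parsing when XmlParse or XmlSearch throws
        values = [];
        var matches = reMatchNoCase("<c:mytagname[^>]*>.+?</c:mytagname>", arguments.xmlstr);
        for (item in matches) {
            // Chop the opening tag off the front and the closing tag off the end
            arrayAppend(values, reReplaceNoCase(item, "^<c:mytagname[^>]*>|</c:mytagname>$", "", "all"));
        }
    }
    return values;
}
</cfscript>
```

Calling `getTagValues(xmlstr)` then returns an array of strings either way, so the rest of the code does not need to know which path was taken.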

I hope that ColdFusion improves its support for regex and XML. I don’t have much experience with Open BlueDragon or Railo; they may have these issues covered. Are there any other ways to tackle this issue?