JavaScript RegExp Match - Named Captures
Anyone who does much JavaScript development and is familiar with regular expressions knows that JavaScript is missing one fun little part of regular expression matching: named captures. JavaScript works fine without them; it just returns the results of a match in a standard array. For convenience's sake, it would be nice if these were returned in a hash or, for global matches, in an array of hashes keyed by capture name. JavaScript is very extensible, so why not create the method?
The next problem is JavaScript's syntax for regular expressions: the literal /^\w+$/g and the constructor RegExp("regex string"). The engine will complain if you give it anything other than the native syntax. That leaves only one thing to do: create our own syntax, parse it out to native syntax, then put it all back together for the result. It sounds like a lot of work, but it's really not that bad. Let's get going.
First, this isn't re-inventing the wheel entirely. There is already a Perl syntax for this: (?P<name>...), which is the form the code below looks for.
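Before the full method, here is a minimal sketch of just the translation step, assuming the (?P<name>...) form and assuming every capturing group in the pattern is named. The helper name toNativePattern is hypothetical and only for illustration; it is not part of the method in this post.

```javascript
// Hypothetical helper: split a (?P<name>...) pattern into a native
// pattern string plus the list of capture names, in group order.
// Assumption: every capturing group is named.
function toNativePattern(pattern) {
    var keys = [];
    var source = pattern.replace(/\(\?P<(\w+)>/g, function (whole, name) {
        keys.push(name); // remember the name for this group position
        return '(';      // leave a plain capturing group behind
    });
    return { source: source, keys: keys };
}

var parsed = toNativePattern('(?P<num1>\\d+)\\-(?P<num2>\\d+)');
// parsed.source holds the text (\d+)\-(\d+)
// parsed.keys holds ['num1', 'num2']
```

The native source can then be handed to RegExp, and the keys array says which name belongs to which numbered group.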
String.prototype.keyMatch = function(re, flags){
    var is_global = false,
        results = [],
        keys = [],
        native_re = null,
        str = String(this),
        tmpstr = str,
        i, l;
    if(flags === undefined)
        flags = "";
    // find the named groups inside the re, e.g. ['(?P<key1>', '(?P<key2>', ...]
    var tmpkeys = re.match(/\(\?P<\w+>/g);
    if(!tmpkeys){ // no keys, do a regular match (honoring the flags)
        return str.match(RegExp(re, flags));
    }
    // keep just the names, in group order: ['key1', 'key2', ...]
    for(i=0,l=tmpkeys.length; i<l; i++){
        keys[i] = tmpkeys[i].slice(4, -1); // drop the leading '(?P<' and trailing '>'
    }
    // remove keys from regexp leaving standard regexp
    native_re = re.replace(/\(\?P<\w+>/g,'(');
    if(flags.indexOf('g') >= 0){
        is_global = true;
        flags = flags.replace('g',''); // the loop below handles global matching itself
    }
    native_re = RegExp(native_re, flags);
    do{
        // parse string
        var tmpmatch = tmpstr.match(native_re),
            tmpkeymatch = {},
            tmpsubstr = "";
        if(tmpmatch){
            // get the entire string found
            tmpsubstr = tmpmatch[0];
            tmpkeymatch[0] = tmpsubstr;
            // map them back out (assumes every capturing group is named)
            for(i=1,l=tmpmatch.length; i<l; i++){
                tmpkeymatch[keys[i-1]] = tmpmatch[i];
            }
            // add to results
            results.push(tmpkeymatch);
            // continue after this match; step at least one character so an
            // empty match cannot loop forever
            tmpstr = tmpstr.slice(tmpmatch.index + Math.max(tmpsubstr.length, 1));
        }
        else{
            tmpstr = "";
        }
    } while(is_global && tmpstr.length > 0); // if global loop until end of str, else do once
    return results;
};
OK, that's a good bit of code, but following the comments will lead you right along. I've also added it as a method on the String prototype. So how do we call it? Right off of a string variable:
var results = str.keyMatch('(?P<num1>\\d+)\\-(?P<num2>\\d+)', "g");
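To make the shape of the result concrete, here is a small stand-alone sketch that builds the same kind of array of hashes by hand. It does not call keyMatch: the key list, the native pattern, and the sample string are written out as assumptions, and it walks the matches with RegExp exec rather than by slicing the string.

```javascript
// Stand-alone illustration; keys, pattern and sample are hand-written assumptions.
var keys = ['num1', 'num2'];
var nativeRe = /(\d+)\-(\d+)/g;   // native form of the named pattern
var sample = '10-20 and 30-40';
var shaped = [];
var m;
// exec with the g flag resumes from lastIndex on each call
while ((m = nativeRe.exec(sample)) !== null) {
    var hash = { 0: m[0] };       // slot 0 keeps the entire match
    for (var i = 1; i < m.length; i++) {
        hash[keys[i - 1]] = m[i]; // group i belongs to the (i-1)th key
    }
    shaped.push(hash);
}
// shaped is:
// [ { 0: '10-20', num1: '10', num2: '20' },
//   { 0: '30-40', num1: '30', num2: '40' } ]
```

Each hash carries the whole match under 0 and one entry per named capture, so a value can be read as shaped[1].num2 instead of by position.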
The only ugly thing left is needing to double every backslash inside the pattern string. Besides that, we have a quick, fairly well-working named capture for the JavaScript RegExp object. However, if you need a much more advanced collection of methods, you may want to take a look at XRegExp. Enjoy!
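One possible way around the backslash issue, assuming an engine that supports ES2015 template literals, is the built-in String.raw tag, which keeps backslashes as written. This is a sketch of the idea rather than part of the method above.

```javascript
// Both variables hold the same pattern text: (?P<num1>\d+)\-(?P<num2>\d+)
var escaped = '(?P<num1>\\d+)\\-(?P<num2>\\d+)';   // ordinary string, doubled backslashes
var raw = String.raw`(?P<num1>\d+)\-(?P<num2>\d+)`; // raw template, single backslashes

// raw === escaped is true, so either one could be passed
// as the pattern argument shown earlier.
```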