JavaScript RegExp Match - Named Captures

Anyone who does much JavaScript development and is familiar with regular expressions knows that JavaScript has long been missing one fun little part of regular expression matching: named captures. (Native support only arrived with ES2018.) Without this functionality JavaScript still works fine; it just returns the results of a match in a standard array. For convenience's sake, it would be nice if they were returned in a hash or, for global matches, in an array of hashes keyed by capture name. JavaScript is very extensible, so why not create the method?
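For reference, here is a small sketch of what the native match gives back. The sample string and pattern are made up for illustration; the point is that captures are positional only, and that the g flag drops them altogether.

```javascript
// Sample input and pattern, chosen only for illustration.
var sample = "10-20 30-40";

// Without the g flag: one match, captures available by position only.
// single[0] is the whole match, single[1] and single[2] are the captures.
var single = sample.match(/(\d+)-(\d+)/);

// With the g flag: every whole match, but the captures are dropped.
var all = sample.match(/(\d+)-(\d+)/g);

console.log(single[1], single[2]); // 10 20
console.log(all);                  // [ '10-20', '30-40' ]
```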

The next problem is JavaScript's syntax for regular expressions: the literal /^\w+$/g and the constructor RegExp("regex string"). The engine will complain if you give it anything other than the native syntax. That leaves only one thing to do: create our own syntax, translate it to native syntax, then put it all back together for the result. It sounds like a lot of work, but it's really not that bad. Let's get going.
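As a quick check of that claim, the sketch below feeds both a native pattern and a (?P<name>...) pattern to the RegExp constructor. The compiles() helper is hypothetical and exists only for this illustration; it assumes the engine reports the unknown group syntax as a SyntaxError.

```javascript
// compiles() is a hypothetical helper, not part of the method in this post.
// It returns true when the engine accepts the pattern, false on a SyntaxError.
function compiles(pattern) {
    try {
        new RegExp(pattern);
        return true;
    } catch (e) {
        if (e instanceof SyntaxError) return false;
        throw e; // anything else is unexpected, so let it surface
    }
}

var nativeOk = compiles("(\\d+)\\-(\\d+)");
var namedOk = compiles("(?P<num1>\\d+)\\-(?P<num2>\\d+)");

console.log(nativeOk, namedOk);
```

The first pattern should be accepted and the second rejected, which is why the custom syntax has to be translated before it ever reaches RegExp.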

First, this isn't re-inventing the wheel entirely. There is already a syntax for this, originally from Python and also accepted by Perl: (?P<name>\w+). We will pass in a string representing a regular expression that uses that syntax. Then we'll simply strip out the ?P<name> token, leaving what should be a native JavaScript regular expression. We will match against our string one match at a time so we can map each capture back to its name. At the end we should have an array with one hash of captures per match. Sounds complex? Let's take a look:
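Before the full method, the translation step on its own can be sketched as follows. toNative() is a hypothetical helper used only here, and it assumes every capturing group in the pattern is a named one.

```javascript
// toNative() is a hypothetical helper that shows only the translation step.
// Assumption: every capturing group in the pattern is a named one.
function toNative(pattern) {
    var names = [],
        finder = /\(\?P<(\w+)>/g,
        m;

    // collect each name in the order its group opens
    while ((m = finder.exec(pattern)) !== null) {
        names.push(m[1]);
    }

    // drop the ?P<name> part, leaving a plain capturing group
    return {
        names: names,
        source: pattern.replace(/\(\?P<\w+>/g, "(")
    };
}

var parts = toNative("(?P<num1>\\d+)\\-(?P<num2>\\d+)");
console.log(parts.names);  // [ 'num1', 'num2' ]
console.log(parts.source); // (\d+)\-(\d+)
```

The full method below applies the same idea and then maps each captured value back to its name.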

String.prototype.keyMatch = function(re, flags){
    var is_global = false,
        results = [],
        keys = [],
        str = String(this),
        tmpstr = str,
        key_re = /\(\?P<(\w+)>/g,
        native_re, tmpkey;

    if(flags === undefined)
        flags = "";

    // find the keys inside the re, in order: ['key1', 'key2', ...]
    // (assumes every capturing group is named; an unnamed group would shift the mapping)
    while((tmpkey = key_re.exec(re)) !== null){
        keys.push(tmpkey[1]);
    }
    if(!keys.length){  // no keys, do a regular match
        return str.match(RegExp(re, flags));
    }

    // remove keys from regexp leaving standard regexp
    native_re = re.replace(/\?P<\w+>/g,'');

    if(flags.indexOf('g') >= 0)
        is_global = true;
    flags = flags.replace('g','');

    native_re = RegExp(native_re, flags);

    do{
        // parse string
        var tmpmatch = tmpstr.match(native_re),
            tmpkeymatch = {},
            tmpsubstr = "";

        if(tmpmatch){
            // get the entire string found
            tmpsubstr = tmpmatch[0];

            tmpkeymatch[0] = tmpsubstr;

            // map them back out
            for(var i=1,l=tmpmatch.length; i<l; i++){
                tmpkeymatch[keys[i-1]] = tmpmatch[i];
            }

            // add to results
            results.push(tmpkeymatch);

            // continue after this match; step at least one character
            // so an empty match cannot loop forever
            tmpstr = tmpstr.slice( tmpmatch.index + (tmpsubstr.length || 1) );
        }
        else{
            // nothing (more) to find, stop the loop
            tmpstr = "";
        }
    } while(is_global && tmpstr.length > 0) // if global loop until end of str, else do once

    return results;
};

OK, that's a good bit of code, but following the comments will lead you right along. I've also added it as a method on the String prototype. So how do we call it? Right off of a string variable:

var results = str.keyMatch('(?P<num1>\\d+)\\-(?P<num2>\\d+)', "g");

The only ugly thing left is that every backslash has to be doubled, because the pattern is passed as a string rather than a regex literal. Besides that, we have a quick, fairly well-working named capture on top of the JavaScript RegExp object. However, if you need a much more advanced collection of methods, you may want to take a look at XRegExp. Enjoy!
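On the backslash point: if the target environment supports ES2015 template literals (an assumption, since the method above is written in older-style JavaScript), String.raw is one way to write the pattern without escaping each backslash. A minimal sketch:

```javascript
// The string-literal form: each backslash has to be escaped.
var escaped = "(?P<num1>\\d+)\\-(?P<num2>\\d+)";

// String.raw leaves backslashes untouched, so the pattern reads like a regex literal.
var raw = String.raw`(?P<num1>\d+)\-(?P<num2>\d+)`;

console.log(raw === escaped); // true
```

Either string can be passed as the first argument to keyMatch.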