Gist of the Day: Named Capture in Perl Regular Expressions (Briefly)

One of the most common criticisms I see of regular expressions is that they lack readability. Well, Perl 5.10 added named capture (http://perldoc.perl.org/perlretut.html), which I think adds an awful lot of readability to Perl regular expressions.

The Caveats

  • I’m using UTF-8 in this demo. I am not going to go into all the details of working with UTF-8 since it isn’t the point of this gist.
  • There are a number of ways to capture matches in a regular expression, and this is one of them. I’m not going to weigh all of the pros and cons of the different methods (especially since most of it comes down to personal preference).
  • This is going to be just an introduction, a very brief run-through.
  • The demo is intended to simulate a plausibly realistic scenario, not an actual real-life one. I was trying to come up with a simple scenario where the benefits of this feature would be apparent. I would agree with most arguments that this might not be the best or most common approach to this problem. Keep in mind that I’m demonstrating a specific feature of regular expressions, not trying to come up with the best way to solve this specific problem.

The Demo


We’re going to focus entirely on lines 22 through 33 of the demo; that’s where the magic happens. Take special note of how (?<symbol>.) and the other (?<...>...) groups name a match. This syntax stores the match under that name, so you can refer to it as \g{name} (or \k<name>) for a backreference inside the pattern, or as $+{name} once the match has succeeded. That’s what you see in the assignments: I’m pulling the values from the %+ hash.
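To make the syntax concrete, here is a minimal sketch of both uses. This is not the code from the gist: the input format, the parse_price and repeated_word subs, and the item and amount group names are all made up for illustration. Only (?<symbol>.) comes from the demo.

```perl
use strict;
use warnings;
use utf8;                              # this source file contains a UTF-8 literal
binmode STDOUT, ':encoding(UTF-8)';    # so the currency symbol prints cleanly

# Hypothetical helper: split a line like "Widget €12.50" into named parts.
sub parse_price {
    my ($line) = @_;
    if ( $line =~ /^(?<item>\w+)\s+(?<symbol>.)(?<amount>\d+\.\d{2})$/ ) {
        # After a successful match, each named group is a key in %+.
        return {
            item   => $+{item},
            symbol => $+{symbol},
            amount => $+{amount},
        };
    }
    return;    # no match
}

# Hypothetical helper: \g{word} refers back to the named group inside the pattern.
sub repeated_word {
    my ($text) = @_;
    return $text =~ /\b(?<word>\w+)\s+\g{word}\b/ ? $+{word} : undef;
}

my $price = parse_price('Widget €12.50');
print "$price->{item} costs $price->{symbol}$price->{amount}\n" if $price;

my $dup = repeated_word('the the cat');
print "repeated word: $dup\n" if defined $dup;
```

Copying the values out of %+ right away is a deliberate choice here: %+ always reflects the last successful match in the current scope, so it is safest to grab what you need before running another regular expression.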

Why’s This Useful?

This is useful primarily for reasons of readability and maintainability. If you use the traditional $1 and your pattern later changes so that a new capture group goes at the beginning, you have to change your $1 into a $2. If there are more capture variables, you’ll need to update those as well. Named capture really helps in this situation: you just name your new capture group, the existing names keep working, and you’re good to go.
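A small sketch of that renumbering problem, using a made-up log line (the format and the sub names are assumptions for illustration, not part of the demo):

```perl
use strict;
use warnings;

# Numbered capture, version 1: the log level is the first group.
sub first_group_v1 {
    my ($line) = @_;
    return $line =~ /^\S+\s+(\w+)/ ? $1 : undef;
}

# Numbered capture, version 2: a date group was added in front,
# so $1 now silently holds the date instead of the level.
sub first_group_v2 {
    my ($line) = @_;
    return $line =~ /^(\S+)\s+(\w+)/ ? $1 : undef;
}

# Named capture, version 1: the level is looked up by name.
sub level_v1 {
    my ($line) = @_;
    return $line =~ /^\S+\s+(?<level>\w+)/ ? $+{level} : undef;
}

# Named capture, version 2: adding (?<date>...) does not disturb $+{level}.
sub level_v2 {
    my ($line) = @_;
    return $line =~ /^(?<date>\S+)\s+(?<level>\w+)/ ? $+{level} : undef;
}

my $entry = '2024-03-15 ERROR disk full';    # hypothetical log line

print "numbered v1: ", first_group_v1($entry), "\n";
print "numbered v2: ", first_group_v2($entry), "\n";
print "named v1:    ", level_v1($entry), "\n";
print "named v2:    ", level_v2($entry), "\n";
```

The code that reads $1 did not change between the two numbered versions, yet its meaning did. The code that reads $+{level} keeps returning the level in both named versions.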

Conclusion

I like named capture. I think it’s useful, it’s easy, and it solves some real problems with regular expressions. Let me know what you think, and let me know if you have any other requests for gists.