VS2005 Regular Expression Search Rules!

One of the things I had to do to eliminate a few thousand bugs as part of this C++ to C# conversion is replace the text transmission functions.

Nevermind how they work internally, the important thing for the sake of the current conversion is that they look completely different.

The old functions looked something like:

send_to_char( “Words.\n\r”, ch );

While the new functions are supposed to look like:

ch.SendText( “Words.” );

With around 4000 or so calls to that function, it would be a dauntingly huge project to retype every reference to send_to_char. Because it’s a bit more complicated than a simple word replace, we would be hosed if not for Visual Studio 2005’s regular expression search and replace.

If you Google regular expression search-and-replace, most likely you’ll come up with a lot of people complaining about it. Ignore them — those people are whiny idiots. It is easily one of the most useful things ever added to Visual Studio and it takes about 10-15 minutes to get the hang of.

To make the above change, all I had to do was do a search for:

send_to_char[(] {:q}, {:i} [)];

And replace it with:

\2.SendText( \1 );

As a basic explanation:

1. Anything in [] brackets means “any of these characters”. I had to bracket the parenthesis to keep the parser from evaluating them as an expression.
2. Anything in {} brackets means that it’s assigned an “expression tag”. It’s the equivalent of the scripting language act of assigning it to %0, %1, etc and they’re numbered in the order they are found.
3. :q means match a “quoted expression”.
4. :i means match a C++/C# identifier (i.e. a variable name).
5. \1 means “insert the first tagged expression”. \2 inserts the second, etc.

This is enough to get the basic idea going, and it works like a charm, provided the functions are spaced EXACTLY as indicated. If you have something like (notice spaces):

send_to_char(“Something” , ch );

It will not work. In any codebase that has had more than one person’s fingers in it, you’ll have inconsistent spacing. Some people will put spaces before/after every variable, some won’t, and some will be mixed. That’s why we have to set it to ignore spaces anywhere they will be a concern. We do this by inserting [ ]* which means “match anywhere from 0 to infinity spaces”. The search expression now looks like:

send_to_char[(][ ]*{:q},[ ]*{:i}[ ]*[)];

And now that I’ve figured out how to use regular expressions, all of the references to the Diku send_to_char function have been replaced with our shiny new code.

The error count is now down to 24,414.