VS2005 Regular Expression Search Rules!

One of the things I had to do to eliminate a few thousand bugs as part of this C++ to C# conversion is replace the text transmission functions.

Never mind how they work internally; the important thing for the sake of the current conversion is that they look completely different.

The old functions looked something like:

send_to_char( "Words.\n\r", ch );

While the new functions are supposed to look like:

ch.SendText( "Words." );

With around 4000 or so calls to that function, it would be a dauntingly huge project to retype every reference to send_to_char. Because it’s a bit more complicated than a simple word replace, we would be hosed if not for Visual Studio 2005’s regular expression search and replace.

If you Google regular expression search-and-replace, most likely you’ll come up with a lot of people complaining about it. Ignore them — those people are whiny idiots. It is easily one of the most useful things ever added to Visual Studio and it takes about 10-15 minutes to get the hang of.

To make the above change, all I had to do was do a search for:

send_to_char[(] {:q}, {:i} [)];

And replace it with:

\2.SendText( \1 );

As a basic explanation:

1. Anything in [] brackets means “any of these characters”. I had to bracket the parentheses to keep the parser from evaluating them as an expression.
2. Anything in {} braces is assigned an “expression tag”. It’s the equivalent of a scripting language assigning a match to %0, %1, etc. Tags are numbered in the order they are found.
3. :q means match a “quoted expression”.
4. :i means match a C++/C# identifier (i.e. a variable name).
5. \1 means “insert the first tagged expression”. \2 inserts the second, etc.

This is enough to get the basic idea going, and it works like a charm, provided the functions are spaced EXACTLY as indicated. If you have something like (notice spaces):

send_to_char("Something" , ch );

It will not work. In any codebase that has had more than one person’s fingers in it, you’ll have inconsistent spacing. Some people will put spaces before/after every variable, some won’t, and some will be mixed. That’s why we have to set it to ignore spaces anywhere they will be a concern. We do this by inserting [ ]* which means “match zero or more spaces”. The search expression now looks like:

send_to_char[(][ ]*{:q},[ ]*{:i}[ ]*[)];
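For anyone without VS2005 handy, roughly the same tolerant match can be sketched with std::regex. This is only an approximation of the expression above, not a translation of the Visual Studio syntax: the helper name ConvertSendToChar is made up for illustration, the first group only handles double-quoted literals, and \s matches tabs and newlines as well as spaces.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: rewrites  send_to_char( "text", who );
// into  who.SendText( "text" );  regardless of spacing around the arguments.
std::string ConvertSendToChar(const std::string& source)
{
    // Group 1: a double-quoted string literal, escapes allowed (stands in for :q).
    // Group 2: a C-style identifier (stands in for :i).
    static const std::regex pattern(
        R"re(send_to_char\s*\(\s*("(?:[^"\\]|\\.)*")\s*,\s*([A-Za-z_]\w*)\s*\)\s*;)re");

    // $1 and $2 play the role of \1 and \2 in the Visual Studio replace box.
    return std::regex_replace(source, pattern, "$2.SendText( $1 );");
}
```

Feeding it the badly spaced call from earlier, send_to_char("Something" , ch );, gives ch.SendText( "Something" ); and anything that does not match is passed through untouched.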

And now that I’ve figured out how to use regular expressions, all of the references to the Diku send_to_char function have been replaced with our shiny new code.

The error count is now down to 24,414.

Skwish Skwish Skwish Go The Bugs

Now we’re down to 30,975. Nothing interesting is happening, but we’re making progress.

Most of these errors are related to one of four things:

1) References to functions moved into classes have to be changed.
2) Pointer-based functions and comparisons that need to be rewritten to be reference-based.
3) Public vs. private data.
4) Different handling of arrays in C#.

The majority are #2, with a large quantity of each of the others. When you have 30,000+ bugs to fix, everyone gets their share.

Just Past Halfway

The compiler tells me:

“You have just passed the halfway point in the journey toward your next codebase.”

Errors are now quite a bit less than the halfway mark (which would be 37,571).

We now have 35,542 errors to resolve. We’re making pretty good progress so far.

Under 40,000

We’re down to 39,610 errors now. Even though that’s a stunningly huge amount, it’s still progress.

My brain’s starting to hurt, I should probably work on a different project for a few days.

Significant Progress

We’re down to 44,773 errors. That’s 12,486 fixed for today.

Visual Studio 2005 has become noticeably more responsive. It’s still a little slow and clunky, but each error corrected is a slight speed increase.

New Personal Error Correction Record

Yesterday I mentioned that I had 75,142 second-pass compile errors in the codebase. That’s a lot no matter how you describe it.

Aside from spending a couple hours reading Heinlein’s Revolt in 2100, all I’ve really done today is hammer away at the code trying to reduce that number. This has been a little slow because Visual Studio 2005 slows down A LOT when it has to store tens of thousands of error messages in memory and check every change you make to code against those errors to see whether it can remove them from the error list.

I really like intellisense and realtime error checking, but not when it pretty much causes the IDE to grind to a halt. As the number of errors decreases, VS is gradually becoming more responsive, but it’s still too much of a mess to be a very pleasant editing experience.

So, the number of errors is now 58,264. That’s still a ton, but I’ve now set a new personal record of fixing 16,878 compile errors in a single day. My previous record was about 8500, basically due to some weird recursive STL/templating issues with a C++ app I was writing a year or two ago.

Even better, the day’s still not completely over yet, so I could probably add a few more to that total.

I Broke The Compiler

Seriously.

I cleared up all the first-pass compiler errors (the remaining 1,447). It was tedious, but not too terrible. That means that the compiler made it to the second pass.

It churned away for a while, and then gave me a “build failed” message, which I expected. It then proceeded to show me every last little error that it could find.

There were seventy five thousand one hundred and forty-two errors.

After displaying that many errors, Visual Studio 2005 promptly crashed in a horrible flaming fireball of death.

There are only 116,000 lines of code in the whole codebase. That means two-thirds of the code is broken. Nice, eh? I wonder how long it will take me to fix 75k errors…

1,447 Errors

That’s a good thing. Down from 7,899 errors. Everything that could be fixed via search-and-replace (about 5,000 errors) has been, and the rest (about 1,400) have been edited by hand. There’s still a lot more work to do in stage 2, but the number of compile errors is steadily decreasing. Soon we will have a C# version.

C#: Stage 1 of 5 Complete

Stage 1 is done.

Q: What is stage one?

A: Hammering the code into C# syntax so that Visual Studio gives no complaint before trying to compile the code.

Q: What’s that involve?

A: Placing all of the functions into classes, eliminating all global variables, converting #define statements into variables (in the case of constants) or class methods (in the case of macros), changing all char[x] variable declarations to strings, rephrasing all array declaration and initialization, and removing stray const keywords.

Q: How many changes did you have to make?

A: I had to change about 4,000 lines of code, mostly by hand. I love search-and-replace, but it couldn’t do much here.

Q: Did you completely break the heck out of anything in the process?

A: ABSOLUTELY! I know for a fact that I will have to completely rewrite mob tracking (which has always been one of the things I intended to do anyhow). Some of the changes to correct syntax also just pushed the errors later in the process — they won’t show up in a syntax check and will instead show up in a compile.

Q: What are the other 4 stages of this conversion?

2 = Compiling without errors. This is a huge step. After just running the first initial build, there were 7,899 compile errors. A large percentage of these will be pointer errors because I didn’t change any pointer-based code (“Pointers and fixed size buffers may only be used in an unsafe context.”)

3 = Building an executable. This may be moderately complex, but since C# pushes so much of the work onto the syntax checker and compiler, chances are that if it compiles without errors then the program will build and run.

4 = Booting without immediate crashes. This will involve firing up the debugger and making sure that all of the changes didn’t have any unintended consequences. They will; errors are unavoidable in a rewrite of this magnitude. Hopefully debugging will be fairly quick.

5 = Logging a character in and walking around without crashes. Most of the code lies dormant until a player is logged in. This includes things like network code, combat routines, hit/mana/move regen, informational screens, skills, etc.

Completing all of this will get us ready to put up an alpha port of the game so that we can actively start building content.

The Idea Evolution of Basternae: Technical

Over the past few days I’ve spent a while working on rewriting the Basternae source code in C# (even though the original code is not completely object-oriented yet).  Ideally I’d like to have it run as a standalone application linked to a SQL server for data storage.  This is doable in C++, but in C# it’s far easier.

I still have the C++ version, but after some experimentation it looks like I’m going to stick with the idea of migrating to C#. It’s been my favorite language lately, and it offers a bit more power than C++ does. What better way to fully master a language than to port a 115k-line project to it? I will still have the old code to fall back on, but I hope to have the new code working in less than a month (which is the timeframe I had planned for the original string conversion anyway).

For now I’m pretty sure that I’m going to stick to C++ for the client, especially since it’s already written and just needs some slight modifications to work with the new engine and to work with Linux (ideally it will be able to run on Linux, Windows, and MacOS, but I have no access to MacOS so that’s a “maybe eventually” thing).

As a techie I usually focus on the tech stuff first, but sometime soonish I’ll probably be posting about the gameplay changes I have in mind.

Global Domination

Or more like, “getting owned by globals”.

As part of the conversion to C++ one of the major tasks is moving all of the global functions into classes. There are a lot of them – each spell, command, skill, bard song, et cetera has been handled by a global in the past. Even with all that’s been moved, I still count 1,467 global functions that need to be moved into classes. This is down from about 2,000 when I started.

Some of this is easy. It’s obvious that a get_object_weight function belongs to the object class and an initialize_mob function belongs to the mobile class. Other things might not be so easy, like give_object_to_char, which involves both a character and an object and could just as easily belong to either one. It’s not a big deal if I just make an arbitrary decision and stick it somewhere, but it is something that I have to think about.
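To make the choice concrete, here is a minimal sketch of where two of those globals might land. The class names Object and Character, the member names, and the decision to hang the give function on the character are all assumptions for illustration, not the layout of the actual codebase.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the codebase's object and character types.
class Object
{
public:
    explicit Object(int weight) : weight_(weight) {}

    // Was a global like get_object_weight( obj ); the owner is obvious.
    int GetWeight() const { return weight_; }

private:
    int weight_;
};

class Character
{
public:
    // Was a global like give_object_to_char( obj, ch ). It touches both
    // types, so putting it on the receiving character is an arbitrary pick.
    void GiveObject(Object* obj) { inventory_.push_back(obj); }

    // Sums the weight of everything this character is carrying.
    int CarriedWeight() const
    {
        int total = 0;
        for (std::size_t i = 0; i < inventory_.size(); ++i)
            total += inventory_[i]->GetWeight();
        return total;
    }

private:
    std::vector<Object*> inventory_;
};
```

Putting the function on Object instead (something like obj.GiveTo( ch )) would work just as well; the important part is picking one home and being consistent about it.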

String Conversion Update

I’m still working on converting char * strings to std::string strings. Here’s the recent progress:

Reference            5-27-07  5-29-07   6-1-07   6-4-07  6-10-07  6-15-07
strncat                  772      723      641      605      606      581
snprintf                1199     1166     1096     1079     1064     1032
const char *             343      287      317      330      341      330
MAX_STRING_LENGTH       2404     2313     2062     2011     2000     1955
MAX_INPUT_LENGTH         471      446      153      153      142      108

Since 5/27 we’ve gone from about 5200 references down to 4000. There’s still PLENTY to do.

Because of the way strings are handled, the number of references to “const char *” will fluctuate during the conversion and probably not decrease much until we’re nearly complete.

Inheritance = A Good Thing

No, I didn’t just have some rich relative kick off and leave my name in the will. The closest I have to a rich relative is an uncle who can afford to buy a new pair of shoes every two years.

In the original MUD code and in most C-based codebases I’ve seen, mobs have one set of data and players have a similar but different set of data. They all have things like hitpoints and movement points, but mobs have things like AI scripts and behavior flags while players have extra things like skill values and guild memberships.

There are two ways to handle this in C: either have completely different sets of data for each, or have a core set of data and a pointer to the extended data depending on which type it is (player or mob). The first way is sloppy and dangerous, while the second is sort of a “poor man’s inheritance”.

I’ve rewritten this to use parent and derived classes and all of a sudden some things that were very hard to do are incredibly easy. It’s also become incredibly easy to have things work differently for mobs and players by being able to use overridden functions for the player versions of code.

For example, since mobs don’t have skill training, almost every check for something like a bash or headbutt is based on a combination of that mob’s level and racial statistics. For players the check uses skill values modified by actual attributes like dexterity and strength.

Instead of writing a huge function riddled with lots of “if” statements, I can now just write different versions for each. The code is cleaner, easier to write, and automatically smart enough to “do the right thing”.
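A stripped-down sketch of the idea, using bash as the example. The class layout, member names, and the formulas are invented for illustration and are not the real game math; the point is only that the caller no longer needs to ask whether it is dealing with a mob or a player.

```cpp
#include <algorithm>

// Shared data lives in the parent class (names here are hypothetical).
class CharData
{
public:
    explicit CharData(int level) : level_(level) {}
    virtual ~CharData() {}

    // Each derived class supplies its own version of the check.
    virtual int BashChance() const = 0;

protected:
    int level_;
};

// Mobs have no trained skills: level plus a racial modifier.
class MobData : public CharData
{
public:
    MobData(int level, int racialBonus)
        : CharData(level), racialBonus_(racialBonus) {}

    int BashChance() const override
    {
        return std::min(95, level_ * 2 + racialBonus_);  // made-up formula
    }

private:
    int racialBonus_;
};

// Players use a trained skill value adjusted by an attribute.
class PlayerData : public CharData
{
public:
    PlayerData(int level, int bashSkill, int strength)
        : CharData(level), bashSkill_(bashSkill), strength_(strength) {}

    int BashChance() const override
    {
        return std::min(95, bashSkill_ + (strength_ - 50) / 5);  // made-up formula
    }

private:
    int bashSkill_;
    int strength_;
};

// Calling code never checks the type; virtual dispatch picks the version.
int ResolveBashChance(const CharData& ch)
{
    return ch.BashChance();
}
```

The “if it’s a mob do this, otherwise do that” branching disappears from the call site and ends up as one small function per class.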

I love C++. Even though I’m getting a bit addicted to C# lately, it still rocks.

I never could have made this change without massive use of search-and-replace, since the way to access most of the data members of players has changed completely. I did not want to change 2000 data references by hand.

Visual Studio 2005

I’ve been using Visual Studio .Net 2003 for a long time. I’ve finally upgraded to 2005, and some of the changes are interesting.

One of the things I’ve been doing is converting a lot of the C-string functions to STL std::string. It turns out that the old string functions I’m gradually eliminating have been deprecated:

_snprintf: “This function or variable may be unsafe. Consider using _snprintf_s instead.”
strncat: “This function or variable may be unsafe. Consider using strncat_s instead.”
stricmp: “The POSIX name for this item is deprecated. Instead, use the ISO C++ conformant name: _stricmp”
fopen: “This function or variable may be unsafe. Consider using fopen_s instead.”
strncpy: “This function or variable may be unsafe. Consider using strncpy_s instead.”

I will gladly avoid using those functions. I hate them.
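For a sense of what the conversion looks like at a single call site, here is a small before/after sketch. The function name DamageMessage and the message text are made up for illustration; the old-style code is shown only as a comment.

```cpp
#include <sstream>
#include <string>

// Old style, for comparison (fixed buffer, manual length bookkeeping):
//   char buf[MAX_STRING_LENGTH];
//   snprintf( buf, MAX_STRING_LENGTH, "%s hits you for %d damage.", name, dam );
//   strncat( buf, "\n\r", MAX_STRING_LENGTH - strlen( buf ) - 1 );

// New style: the string grows as needed, so there is no buffer size to track.
std::string DamageMessage(const std::string& attacker, int damage)
{
    std::ostringstream out;
    out << attacker << " hits you for " << damage << " damage.";

    std::string text = out.str();
    text += "\n\r";  // replaces the strncat call
    return text;
}
```

Every call site converted this way removes one snprintf, usually one strncat, and one MAX_STRING_LENGTH buffer, which is why those counts tend to fall together.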

Combat Bug Fixed

It was far easier to fix than I had expected.

Here’s what I was doing:

std::list<CharData *>::iterator it;
CharData * wch;
for( it = CharList.begin(); it != CharList.end(); )
{
    wch = *it;
    // Stuff happens to wch here; during combat wch could be killed and deleted.
    ++it; // undefined behavior if wch was deleted above
}

When there was a death in combat, the iterator would get corrupted, because in the destructor for CharData an iterator would remove the CharData from the CharList, corrupting the iterator in the violence update. Due to the chain of function calls between the violence update and where the CharData was actually deleted, there was no way to update the original iterator or do any sort of “safe” removal.

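After tinkering around a bit and reading some message boards, it turns out that using a second iterator saves me. One iterator is advanced to the next element before anything dangerous happens; the other stays on the current element and may be damaged/destroyed. Erasing from a std::list only invalidates iterators to the erased element, so things move on happily because we never touch the broken iterator again.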

std::list<CharData *>::iterator it;
std::list<CharData *>::iterator jt;
CharData * ch;
for( it = CharList.begin(); it != CharList.end(); )
{
    jt = it++; // advance it first; jt keeps the current position
    ch = *jt;
    // Dangerous deletion stuff happens here; only jt can be invalidated.
}

Pretty strange, but it works.
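Here is a small self-contained version of the same pattern that can be compiled and poked at. CharData is cut down to a single field, and KillCharacter and ViolenceUpdate are simplified stand-ins I made up for the real call chain, so treat it as a sketch of the technique rather than the game code.

```cpp
#include <list>

// Cut-down stand-in for the real character structure.
struct CharData
{
    int hitpoints;
};

// Removes the character from the list and frees it, roughly what the
// destructor-driven removal ends up doing deep inside combat.
void KillCharacter(std::list<CharData*>& chars, CharData* victim)
{
    chars.remove(victim);  // invalidates only iterators to this node
    delete victim;
}

// Damages everyone in the list and returns the number of deaths.
int ViolenceUpdate(std::list<CharData*>& chars, int damage)
{
    int deaths = 0;
    std::list<CharData*>::iterator it = chars.begin();
    while (it != chars.end())
    {
        // it moves on to the next node before anything can be deleted;
        // jt stays behind on the current one.
        std::list<CharData*>::iterator jt = it++;
        CharData* ch = *jt;

        ch->hitpoints -= damage;
        if (ch->hitpoints <= 0)
        {
            KillCharacter(chars, ch);  // jt is now dead; it is still valid
            ++deaths;
        }
    }
    return deaths;
}
```

One caveat: this only protects against the current character being removed. If the loop body can also delete the next character in the list (the one it already points to), it ends up invalid and the same crash comes back.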

SourceMonitor Update

Ahh, the joy of code metrics.

Files: 132
Lines: 113,584
Statements: 58,853
% Branches: 29.2
% Comments: 8.8
Class Definitions: 52
Methods/Class: 6.86
Average Statements/Method: 14.7
Max Complexity: 477
Max Depth: 8
Average Depth: 1.86
Average Complexity: 11.47

For the first time, the number of lines of code has gone down. This is because MobProgs were removed. We also lost about another 400 or so lines because the change to saving objects using XML allowed us to eliminate some duplicate code (repeat after me, kids: duplicate code is BAAAD!). The average complexity also seems to be gradually decreasing a little. In general that’s a good thing, since maintainability of code is generally thought to vary inversely with its complexity.

XML Objects!

It’s done – objects save and load as XML data rather than some ad-hoc text format.  I’m sure there will be a few extra details to work out, but the saving and loading of basic objects works now.  The change cleared up a bug or two that would come up once in a while due to formatting inconsistencies.

I still have to fix the violence update problem that I mentioned on the 1st.  I’ve tried a few minor changes in the hope that the problem would be resolved, but it really does look like a full combat-process rewrite is in order.  The original method is fundamentally flawed, so we need to work out a better way.

More XML Conversion

I’ve started tackling the conversion of all object saving to XML. Player saving was easy, since players tend to be pretty much the same and have all the same data fields. However, with all the different types of objects, nesting, affects, extra descriptions, etc. objects are a bit more of a project to convert. Objects are also saved in more than one type of file — corpse data file, player files, storage chests, etc.

So far I have them saving to XML the way I want them to. The next step is loading, which will take a good solid afternoon or evening of codework.

I’ve been mulling over the idea of moving data files over to an SQL database rather than XML, since XML is DREADFULLY SLOW. It’s not dreadfully slow on this fast machine, nor will it be on any server that I use, but it could become an issue at some point.

SQL would be a good thing and a bad thing. I’d get the automatic field matching and data integrity at the expense of greater complexity. Although there are tools to edit SQL databases directly, it’s not as easy as editing a text file if I needed to change a value in a file. Backups would also be a bit more complicated.

One thing at a time — I need to finish this first.

Goodbye Mobprogs

Mobprogs – a neat idea, but mostly useless.

The idea was to have a scripting language that could be used to write actions and triggers for mobs and objects that would give them a little more life. They weren’t ever used much, and I think I may have been the only one to write one. At least, I only see the ones I wrote on the old backups I have.

The code, unused as it is, has been lying around and hindering development efforts because it amounts to little more than extra files that have to be modified every time a major change is made.

Today I’ve finally dropped them.

I’m pondering the idea of setting up Python scripting, even though I know it’ll likely not be used by anyone except maybe me. It wouldn’t be for anything other than to figure out how to do it, but it would probably be fun.