Lost at C

This week I've learned a few things (always the mark of a good week in my book), the foremost of which is that I don't know very much about C.

I expect this post will mostly result in comments such as "well, duh..." and the like :)

How I spent an afternoon chasing a star...

After a fairly relaxing bank holiday weekend, I came back to work on Tuesday to find myself in the position of needing a to write a library for a client to plug into their software and to have it ready by Friday.

Though I'd written some (very bad) C++ while at uni, I've fairly recently written a couple of very small utilities in C (the library they use is written in C and I fancied a challenge) and wanted to learn some more, so I chose C as the language to write in.

This afternoon, with the library and a small demo application written, I handed the code over to my colleague who'd promised to do all the necessary wrapping up to take my developed-in-linux code and produce a windows DLL from it. After a short while, he'd compiled the library and the demo, BUT... the demo app crashed every time.

At first, it looked like I'd forgotten to free() some malloc()ed memory. I had; but even after doing so, the code was still crashing in windows. The search continued for quite some time until I eventually found what was wrong.

There was an asterisk where there shouldn't have been, FFS!

It turns out that I'd carried some pre-conceptions with me from my previous life as a Java developer and various other places. I'm so used to pretty much every language passing things around by value when the data is small (ints, chars, etc.) and by reference when it's not (objects, etc.). I was completely unprepared for the fact that C deals only in values.

I'm not one of those who are scared of pointers, I'm quite comfortable with pointer arithmetic, allocating and freeing memory and the like. What I had was some code like this:

typedef struct {
    int a;
    int b;
} AB;

void do_some_stuff(int *a, int *b, int num_records, AB **out) {
    int i;
    AB ab[num_records];

    for(i=0; i<num_records; i++) {
        ab[i].a = a[i];
        ab[i].b = b[i];
    }

    *out = ab;
}

void get_stuff() {
    int a[2] = {1, 2};
    int b[2] = {3, 4};

    AB *ab;

    do_some_stuff(a, b, 2, &ab);

    // Do some stuff with ab;
}

Although the real code actually did useful things :P

After handing the code over however, it transpired that MSVC doesn't support all of C99 (why pick a standard and implement part of it?!) specifically, variable-length arrays; so the AB ab[num_records] line had to go.

Here's where my preconception came in:

So, I want to allocate space for an array of these structs...

Ok, so structs are kinda like objects...

That array will be an array of pointers to those then, yep...

So that array declaration became AB *ab = malloc(sizeof(AB*) * num_records) and a corresponding free(ab) in get_stuff().

Yep, nothing in C is a reference unless you really, really say it is. Arrays of structs are just like arrays of any other type: a sequence of those things laid end to end in memory. sizeof(AB*) needed to be sizeof(AB) and that was it.

The. Entire. Afternoon.

Consider that my lesson learned.

Luckily, I seem to have ended up quite fond of C, pleasantly more aware of how it works, and quite keen to write some more.