How to read C and C++ variable declarations
A few days ago I started wondering about the weirdness around variable and function declarations in C (and, naturally, C++). Every few months I need to look up how to declare something fancy like functions returning function pointers, but I forget the syntax the minute it compiles.
Probably quite a few people were introduced to C variable declarations the way I was: as a small set of recipes for getting this or that variable type. For example, putting a * after the type is how you declare a pointer. As it turns out, that’s not a coincidence but a logical consequence of the way variables in C are declared.
Maybe you know all this already, at least subconsciously, but realizing the simplicity of it all helped me a great deal in reading and especially writing C.
Starting with the basics
A variable declaration in C has the following form:
[typename] [expression containing variable name]
which can be read as an assertion that if you use a variable in the same way as in the expression (right side), it will yield the specified type (left side).
This is obviously an oversimplification (e.g. in use, square brackets accept any valid index, while in a declaration they hold the array’s size), but it’s the basic concept behind the C syntax for variable declarations.
Let’s start with the simplest example – just pretend you see something like this for the first time:
int i;
Simple enough: every time the compiler encounters our variable i, it will be of type int.
Now let’s look at a pointer to that type:
int *i; // pointer to int
This basically translates to: If you see i anywhere, and apply the dereference operator * to it, you will, again, get an int.
That’s the reason some people write int *i and others int* i. It’s a matter of preference but you’ll need the first version as soon as you have more than one pointer in a declaration:
int* i, j; // whooops! i = int pointer, j = int
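To make the pitfall concrete, here is a minimal sketch that asks the compiler which types it actually assigned. It assumes a C11 compiler (for _Generic and static_assert); the macro TYPE_CODE and the function name are made up for this illustration.

```c
#include <assert.h>

/* Hypothetical helper: yields 2 for int*, 1 for int, 0 for anything else. */
#define TYPE_CODE(x) _Generic((x), int *: 2, int: 1, default: 0)

/* Returns 1 when i is a pointer to int and j is a plain int. */
int star_belongs_to_name(void)
{
    int* i, j;   /* parsed as: int *i; int j; */

    /* Checked at compile time: only i picked up the star. */
    static_assert(TYPE_CODE(i) == 2, "i is a pointer to int");
    static_assert(TYPE_CODE(j) == 1, "j is a plain int");

    j = 42;
    i = &j;      /* fine: i points to the int j */
    return TYPE_CODE(i) == 2 && TYPE_CODE(j) == 1 && *i == 42;
}
```

The star is part of the expression on the right side of the declaration, not part of the type name on the left, which is why it only affects i.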
I’m sure we can skip pointer to pointer and move on to arrays.
Arrays
Let’s warm up a little more with array declarations, and pointers to arrays:
int i[5]; // array of 5 int
is an array of 5 int. Here again, the rule is: If you see i anywhere, and apply the index operator [] to it, you will get an int.
The same goes for multi-dimensional arrays, for example an array of 3 times 5 int (3 rows, 5 columns) is declared
int i[3][5]; // array of 3 [ array of 5 int ]
The [] operators are applied left to right, so with the first index operator you choose any of the 3 int arrays. Applying the second operator then lets you choose out of 5 int.
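The row/column reading can be checked with sizeof. This is only a sketch: the function names are invented for the example, and static_assert assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements selected by the first index operator. */
size_t grid_rows(void)
{
    int i[3][5];
    /* After one [] we are left with an array of 5 int. */
    static_assert(sizeof i[0] == 5 * sizeof(int), "i[0] is an array of 5 int");
    return sizeof i / sizeof i[0];
}

/* Number of elements selected by the second index operator. */
size_t grid_cols(void)
{
    int i[3][5];
    /* After two [] we are left with a single int. */
    return sizeof i[0] / sizeof i[0][0];
}
```

Here grid_rows() evaluates to 3 and grid_cols() to 5, matching the order in which the brackets are written.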
Then what is
int *i[5];
? Just apply the operators backwards! Note: The index operator is applied before the dereference operator.
First, reverse the dereference: if applying a * gives us an int, then what we had before was an int pointer. Now we’re left with the square brackets, meaning it’s an array with 5 of them. An array of 5 int pointers!
How do you declare a pointer to an array then? Like this:
int (*i)[4]; // pointer to array of 4 int
We change the order in which the operators are applied with the parentheses, i.e. first * and then []. Reversing the operations in our head again, we have something that yields an int when [] is applied. So, right before applying [] we must have an int array. Then what type must i have so that applying * to it results in an int array? A pointer to an int array.
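The two declarations are easy to mix up, so here is a small side-by-side sketch. The function and variable names are invented for the illustration; the access expressions mirror the declarations exactly.

```c
#include <stddef.h>

/* int *p[5]: array of 5 pointers to int. Use: *p[k] */
int sum_via_pointer_array(void)
{
    int a = 1, b = 2, c = 3, d = 4, e = 5;
    int *p[5] = { &a, &b, &c, &d, &e };
    int sum = 0;
    for (size_t k = 0; k < 5; ++k)
        sum += *p[k];        /* index first, then dereference */
    return sum;
}

/* int (*q)[4]: pointer to an array of 4 int. Use: (*q)[k] */
int sum_via_array_pointer(void)
{
    int arr[4] = { 10, 20, 30, 40 };
    int (*q)[4] = &arr;
    int sum = 0;
    for (size_t k = 0; k < 4; ++k)
        sum += (*q)[k];      /* dereference first, then index */
    return sum;
}
```

In both functions the expression inside the loop is the declaration with the size replaced by an index, and in both cases it yields an int.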
Function pointers
There’s not much more to it than that once you’ve realized that pointer or array definitions aren’t fancy ways of explicitly defining a type, but rather implicit descriptions in the form of type equality.
Logically, function declarations work the same:
int f(int, float); // function (int, float) -> int
Take any occurrence of f, put a (, an int, a float and a closing ) after it, and you’ll end up with an int.
Function pointers are essentially the same, except for the need for another indirection: we first have to dereference the pointer.
A function pointer to a function like the one above looks like this:
int (*f)(int, float); // pointer to function (int, float) -> int
All we had to change was to add the *. As with the pointer to array before, the () binds more tightly than the *, so we have to change the order using parentheses around *f so that the pointer is dereferenced first, and then the resulting function is called. If we hadn’t done that, we’d have declared a function returning an int pointer instead.
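A short sketch of the difference the parentheses make. The functions add and g, and the variable storage, are hypothetical and exist only for this example.

```c
static int storage;

/* Without parentheses: g is a function returning int*,
   because () is applied before *. */
int *g(int a, float b)
{
    storage = a + (int)b;
    return &storage;
}

/* An ordinary function for the pointer below to point at. */
int add(int a, float b)
{
    return a + (int)b;
}

int call_both(void)
{
    /* With parentheses: f is a pointer to a function returning int. */
    int (*f)(int, float) = &add;

    int via_pointer  = (*f)(1, 2.0f);   /* dereference, then call: 3 */
    int via_function = *g(10, 5.0f);    /* call, then dereference: 15 */
    return via_pointer + via_function;
}
```

C also lets you write f(1, 2.0f) without the explicit dereference, but the (*f)(...) form is the one that mirrors the declaration.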
Let’s change the pointer so that it points to a function returning an int*. The only difference is that you get the int by applying an additional * after calling the function pointer, so our definition looks just like that:
int *(*f)(int, float); // pointer to function (int, float) -> int*
It’s about time to get to where I was going with this whole article: functions returning function pointers. How about a function that returns a function pointer like the one we declared above?
We have the same frame as above, but we have to modify f to reflect the fact that there is another level of indirection. The *f simply becomes *f():
int *(*f())(int, float); // function () -> [ pointer to function (int, float) -> int* ]
because we have to call f first to get the function pointer, then we dereference that pointer, then call it, and finally apply * to get the int.
And a pointer to that function we just declared? Just turn the f into (*f) to dereference the pointer, the rest is the same:
int *(*(*f)())(int, float); // pointer to function () -> [ pointer to function (int, float) -> int* ]
The cure for the common break fest
As you can see, things get very tricky very fast, so you shouldn’t be doing this for anything beyond 1 or 2 levels of indirection.
That’s where typedefs come in: they allow you to express a complicated type with a single, new typename. The syntax is straightforward:
typedef [typename] [expression containing new typename]
It looks similar to a variable declaration. Basically, all you have to do is declare a variable named x and put typedef before the declaration, and x becomes a new type name instead; any variable declared with type x behaves like the variable in the original declaration.
A small example:
int (*a1)[10] = 0; // pointer to array of 10 int
typedef int (*ax)[10]; // definition of new type ax
ax a2 = 0;
a1 = a2; // this works, a1 and a2 are of the same type
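As a slightly fuller sketch of the same idea, the typedef can be used wherever the hand-written declaration would go, including parameter lists. The names last_of and typedef_demo are made up for this example.

```c
/* ax: pointer to array of 10 int, as in the example above. */
typedef int (*ax)[10];

/* Hypothetical helper: returns the last element of the array p points to. */
int last_of(ax p)
{
    return (*p)[9];   /* used exactly like the declaration: (*p)[index] */
}

int typedef_demo(void)
{
    int data[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    int (*a1)[10] = &data;   /* spelled out by hand */
    ax a2 = a1;              /* same type, so no cast is needed */
    return last_of(a1) + last_of(a2);
}
```

Both a1 and a2 can be passed to last_of, since the typedef is only another name for the same type.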
This makes the whole business a lot easier than hacking together the declarations every single time. Take for example this small snippet:
#include <stdio.h>

// returns pointer to int
int *func2(int a, float b)
{
    static int x = 0;
    return &x;
}

// returns pointer to function returning pointer to int
int *(*func1())(int, float)
{
    return &func2;
}

int main()
{
    // pointer to function returning pointer to function returning pointer to int
    int *(*(*ptr1)())(int, float) = &func1;
    // pointer to function returning pointer to int
    int *(*ptr2)(int, float) = (*ptr1)();
    int val = *(*ptr2)(0, 0.0f);
    printf("%i\n", val);
    return 0;
}
Those are the same declarations as above, but deciphering and especially writing them from scratch takes time and is fairly error-prone.
The same code using typedef, a lot easier on the eye:
#include <stdio.h>

int *func2(int a, float b)
{
    static int x = 0;
    return &x;
}

// func2_type == pointer to function (int, float) -> int*
typedef int *(*func2_type)(int, float);

func2_type func1()
{
    return &func2;
}

// func1_type == pointer to function () -> func2_type
typedef func2_type (*func1_type)();

int main()
{
    func1_type ptr1 = &func1;
    func2_type ptr2 = (*ptr1)();
    int val = *(*ptr2)(0, 0.0f);
    printf("%i\n", val);
    return 0;
}
You only have to do the work once, and simply use the new type as a return type of functions. Things become a lot more readable and maintainable.
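Once the typedef exists, it also works for arrays of function pointers and for return types, without any further declarator puzzles. The following is a sketch only: func2_type is the typedef from the listing above, while keep_first, keep_sum, pick and run_picked are hypothetical names introduced for the illustration.

```c
/* Same typedef as in the text: pointer to function (int, float) -> int* */
typedef int *(*func2_type)(int, float);

/* Two hypothetical functions matching that signature. */
static int *keep_first(int a, float b)
{
    static int slot;
    (void)b;               /* b is intentionally unused */
    slot = a;
    return &slot;
}

static int *keep_sum(int a, float b)
{
    static int slot;
    slot = a + (int)b;
    return &slot;
}

/* A function returning a function pointer, chosen from a small table. */
func2_type pick(int use_sum)
{
    static const func2_type table[2] = { keep_first, keep_sum };
    return table[use_sum != 0];
}

/* Call pick, call the returned pointer, then dereference the int*. */
int run_picked(int use_sum)
{
    return *pick(use_sum)(3, 4.0f);
}
```

Without the typedef, the declaration of pick alone would read int *(*pick(int use_sum))(int, float), and the table would be harder still.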
Finito
That’s it for now. I hope this helps a few people get a better understanding of the C syntax, even if this probably isn’t new to most intermediate and expert C programmers.
Either way it was fun to write up and test. If you have any questions or feedback, let me know.