C/C++ developers please use -Wcast-align

Have you ever read the Warning Options section of the GCC documentation? If not, you probably still know the -Wall option. It enables a lot of warnings, but not literally all of them. In my opinion -Wall should be the minimum in every project, but there is more. You can also set -Wextra, which enables even more warnings but, as you might guess by now, still not all. One option in particular is missing from both, and this post explains why I consider it important to set: -Wcast-align

So what does the GCC doc say about it?

Warn whenever a pointer is cast such that the required alignment of the target is increased. For example, warn if a char * is cast to an int * on machines where integers can only be accessed at two- or four-byte boundaries.
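To make the quoted sentence concrete, here is a minimal sketch of the kind of cast it talks about. The helper name as_words is made up for illustration. As far as I know, plain -Wcast-align only reports this on targets that require strict alignment, while GCC 8 and later also accept -Wcast-align=strict, which warns regardless of the target.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from any real code base.
 * The cast raises the required alignment from that of unsigned char (1)
 * to that of uint32_t (typically 4). This is the pattern -Wcast-align
 * is meant to report. The function only converts the pointer; it is the
 * later dereference of a misaligned result that goes wrong. */
const uint32_t *as_words(const unsigned char *bytes)
{
    return (const uint32_t *)bytes;
}
```

The conversion is harmless as long as bytes happens to be word aligned. The warning exists because the compiler cannot know that from the type alone.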

In case you cannot imagine what this means, let me explain. There are 32-bit CPUs out there which access memory correctly only at 32-bit boundaries. To my knowledge this is by design. Let’s say you have some byte stream starting at an arbitrarily aligned memory offset, and it contains bytes starting from 0, followed by 1, then 2 and so on, like this:
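A small sketch of that byte stream in C, together with a check for whether an address sits on a 32-bit boundary. The function names fill_counting and is_word_aligned are assumptions chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill buf with the counting pattern 0, 1, 2, ... described in the text. */
void fill_counting(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)i;
}

/* Return 1 if p sits on a 4-byte (32-bit) boundary, 0 otherwise.
 * Converting a pointer to uintptr_t is implementation-defined, but on the
 * usual flat-memory platforms the low bits of the result are the low bits
 * of the address. */
int is_word_aligned(const void *p)
{
    return ((uintptr_t)p & 3u) == 0;
}
```

If the buffer itself starts on a word boundary, only every fourth byte offset passes the check. The three offsets in between are the ones that cause trouble.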

Now you point a uint32_t pointer at an unaligned address and dereference it. What would you expect? To help you a little, I have a tiny code snippet for demonstration:

The naïve assumption would be the following output:

Note that the last three values contain some random bytes from the memory behind our buffer! This is the output you get on a standard amd64 PC in little endian format (compiled on Debian GNU/Linux with some GCC 4.9.x).
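The values such a little endian machine produces can be predicted without dereferencing an unaligned pointer at all. The following sketch assembles them byte by byte; le32_at is a hypothetical helper name, and the buffer is assumed to hold the counting pattern 0, 1, 2, ... from above.

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble the 32-bit little-endian value that starts at buf[off],
 * one byte at a time. The caller must guarantee that off + 4 does not
 * exceed the buffer length. */
uint32_t le32_at(const unsigned char *buf, size_t off)
{
    return (uint32_t)buf[off]
         | (uint32_t)buf[off + 1] << 8
         | (uint32_t)buf[off + 2] << 16
         | (uint32_t)buf[off + 3] << 24;
}
```

For the counting pattern, offset 1 yields 0x04030201 and offset 2 yields 0x05040302. The length requirement in the comment is the reason the last three reads in the demonstration pick up foreign bytes: they start less than four bytes before the end of the buffer.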

Now look at this output:

This comes from an embedded Linux target with an AT91SAM9G20 Arm CPU, which belongs to the ARM9E family and implements the ARMv5TEJ architecture, or let’s just say an ARMv5 or older Arm CPU. Here it runs in little endian mode, and the code was compiled with a GCC 4.7.x cross compiler.

Well, those 32-bit integers look somehow reordered, as if the CPU shuffled the bytes of the word we point into. If you’re not aware of this, it means silent data corruption on older Arm platforms! You can set the -Wcast-align option to let the compiler warn you. Try it yourself with the above snippet and your favorite cross compiler. Note: the warning does not solve the corruption issue, it just tells you to fix your code.
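The shuffling can be modelled in portable C. The sketch below assumes the behavior Arm documents, to my knowledge, for word loads on cores before ARMv6: the two low address bits are ignored when fetching, and the fetched aligned word is rotated right by eight bits per ignored byte. Treat it as a model under that assumption, not as a capture of real hardware. The names ror32 and armv5_load_model are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit value right by n bits, with n in {0, 8, 16, 24}. */
static uint32_t ror32(uint32_t v, unsigned n)
{
    return n == 0 ? v : (v >> n) | (v << (32u - n));
}

/* Model of an unaligned word load on a pre-ARMv6 core in little-endian
 * mode (assumed behavior, see the text above). 'off' is relative to a
 * buffer that itself starts on a word boundary. */
uint32_t armv5_load_model(const unsigned char *buf, size_t off)
{
    size_t base = off & ~(size_t)3;   /* aligned word that is really fetched */
    uint32_t word = (uint32_t)buf[base]
                  | (uint32_t)buf[base + 1] << 8
                  | (uint32_t)buf[base + 2] << 16
                  | (uint32_t)buf[base + 3] << 24;
    return ror32(word, 8u * (unsigned)(off & 3));
}
```

With the counting pattern in the buffer, offset 1 gives 0x00030201 in this model instead of the 0x04030201 a PC delivers. The load never leaves the aligned word, so byte 4 does not show up at all.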

Arm’s own FAQ on this topic does not make quite clear what the intended behavior is, but one thing is clear: unaligned access is not supported on older Arm CPUs, up to the ARM9 family or the ARMv5 architecture.

Another point is interesting: even if the CPU supports unaligned access, whether it’s hard coded or an optional thing you must switch on first, it comes with a performance penalty. And coming back to my PC: this is also true for other processor families like Intel or AMD, although on recent processors it might not be that bad.

So what could or should we do as software developers? There are still a lot of old processors out there, as well as architectures you might not know, and you never know where your code will end up. So design your data structures and network protocols with word alignment in mind! If you have to deal with legacy stuff or bad protocols you cannot change, you still have some other possibilities: have a look at The ARM Structured Alignment FAQ or search the net on how to let your kernel handle this.
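As a sketch of what designing with alignment in mind can look like, here is a made-up message header. The struct and its field names are purely illustrative. Fields are ordered so that each one starts at a multiple of its own size, and compile-time checks make sure the compiler agrees.

```c
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical message header, invented for this example.
 * Largest field first, so every field is naturally aligned and the
 * compiler has no reason to insert padding. */
struct msg_header {
    uint32_t sequence;   /* offset 0 */
    uint16_t type;       /* offset 4 */
    uint16_t length;     /* offset 6 */
};

/* Fail the build if the layout ever stops matching the intent. */
_Static_assert(offsetof(struct msg_header, sequence) % 4 == 0, "sequence misaligned");
_Static_assert(offsetof(struct msg_header, type) % 2 == 0, "type misaligned");
_Static_assert(offsetof(struct msg_header, length) % 2 == 0, "length misaligned");
_Static_assert(sizeof(struct msg_header) == 8, "unexpected padding");
```

If such a header is followed by payload, keeping the header size a multiple of four means the payload starts on a word boundary too, provided the header itself does.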

If you want to handle it in code, memcpy() is one possibility. Assuming we want to access a 32-bit integer at offset 2, we could do it like this:
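A minimal sketch of the memcpy() approach. The helper name load_u32 is an assumption for this example, and the caller is assumed to have at least four readable bytes at the source address.

```c
#include <stdint.h>
#include <string.h>

/* Copy four bytes from an arbitrarily aligned address into a properly
 * aligned local variable. memcpy() works byte-wise as far as the language
 * is concerned, so no unaligned word access is requested.
 * The result is in host byte order. */
uint32_t load_u32(const void *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

Reading the integer at offset 2 of some byte buffer then becomes load_u32(buffer + 2). Compilers commonly turn the fixed-size memcpy() into a plain load on targets where that is safe, so the helper should not cost much there.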

And as said in the title and above: turn on the -Wcast-align option!

(If you don’t want to be too scared about silent data corruption on your new IoT devices with those cheap old processors and your freshly compiled board support package, you might not want to turn it on for all that existing software out there. You might get a little depressed … )

Update: There’s a chapter in the Linux kernel documentation on this: Unaligned Memory Accesses.