When I asked about common undefined behavior in C, souls more enlightened than I referred me to the strict aliasing rule.
What are they talking about?
Strict aliasing is the rule that pointers of different (incompatible) types are not allowed to refer to the same data, so the compiler may assume they never do.
This article should help you understand the issue in full detail.
Type punning via pointer casts (as opposed to using a union) is a major example of breaking strict aliasing.
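To make the contrast concrete, here is a minimal sketch of reading the bits of a float. It assumes a 32-bit IEEE 754 float, and the function names are made up for illustration. The pointer-cast version is compiled out because it breaks the rule; the memcpy version should be fine in both C and C++, while the union version is generally accepted in C (C99 and later) but not in C++.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float and uint32_t have the same size on this platform. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

#if 0
/* Breaks strict aliasing: a float object is read through a uint32_t lvalue.
   Compiled out on purpose; shown only for contrast. */
uint32_t float_bits_cast(float f)
{
    return *(uint32_t *)&f;
}
#endif

/* Copies the object representation byte by byte; no aliasing involved. */
uint32_t float_bits_memcpy(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Type punning through a union: write one member, read the other. */
uint32_t float_bits_union(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}
```

With an IEEE 754 float, both functions return 0x3F800000 for 1.0f. Compilers typically turn the memcpy call into a single register move, so it usually costs nothing at run time.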
The best explanation I have found is by Mike Acton, Understanding Strict Aliasing. It's focused a little on PS3 development, but that's basically just GCC.
From the article:
So basically if you have an
The exception to the rule is a
A typical situation where you encounter strict aliasing problems is when overlaying a struct (like a device/network msg) onto a buffer of the word size of your system (like a pointer to
So in this kind of setup, if I want to send a message to something I'd have to have two incompatible pointers pointing to the same chunk of memory. I might then naively code something like this:
The strict aliasing rule makes this setup illegal: dereferencing a pointer that aliases another of an incompatible type is undefined behavior. Unfortunately, you can still code this way, maybe get some warnings, have it compile fine, only to have weird unexpected behavior when you run the code.
(GCC appears somewhat inconsistent in its ability to give aliasing warnings, sometimes giving us a friendly warning and sometimes not.)
To see why this behavior is undefined, we have to think about what the strict aliasing rule buys the compiler. Basically, with this rule, it doesn't have to think about inserting instructions to refresh the contents of
If you think the example is contrived, keep in mind that the same thing can happen when you pass the buffer to another function that does the sending for you. Suppose instead you have:
And suppose we rewrote our earlier loop to take advantage of this convenient function:
The compiler may or may not be able, or smart enough, to inline SendMessage, and it may or may not decide to load buff again. If
So how do I get around this?
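One approach that should be portable is to never point a struct pointer at the word buffer at all: fill in a real struct object and copy its bytes into the buffer with memcpy. This is only a sketch; the Msg layout and the pack_msg helper are hypothetical and stand in for whatever message format you actually use.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical message layout, for illustration only. */
typedef struct {
    uint32_t a;
    uint32_t b;
} Msg;

/* Number of 32-bit words needed to hold one Msg. */
enum { MSG_WORDS = sizeof(Msg) / sizeof(uint32_t) };

/* Build the message in a real Msg object, then copy its bytes into the
   word buffer. The buffer is only ever accessed as uint32_t (or as raw
   bytes by memcpy), so no incompatible pointers alias each other.
   buff must have room for at least MSG_WORDS words. */
void pack_msg(uint32_t *buff, uint32_t a, uint32_t b)
{
    Msg msg;
    msg.a = a;
    msg.b = b;
    memcpy(buff, &msg, sizeof msg);
}
```

After pack_msg(buff, 1, 2), buff[0] holds 1 and buff[1] holds 2 on platforms where Msg has no padding, which is the usual case for two same-sized members. The sending code can then walk buff word by word without any cast.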
This is only one potential minefield when overlaying two types onto each other. You should also learn about endianness, word alignment, and how to deal with alignment issues through packing structs correctly.
Strict aliasing doesn't refer only to pointers; it affects references as well. I wrote a paper about it for the boost developer wiki, and it was so well received that I turned it into a page on my consulting web site: Strict Aliasing White Paper. It explains what strict aliasing is, why it confuses people so much, and what to do about it. In particular, it explains why unions are risky behavior for C++, and why using memcpy is the only fix portable across both C and C++. Hope this is helpful.
This is the strict aliasing rule, found in section 3.10 of the C++03 standard (other answers provide good explanation, but none provided the rule itself):
C++11 and C++14 wording (changes emphasized):
Two changes were small: glvalue instead of lvalue, and clarification of the aggregate/union case.
The third change makes a stronger guarantee (it relaxes the strict aliasing rule): it introduces the new concept of similar types, which are now safe to alias.
Also the C wording (C99; ISO/IEC 9899:1999 6.5/7; the exact same wording is used in ISO/IEC 9899:2011 §6.5 ¶7):
As an addendum to what Doug T. already wrote, here is a simple test case which probably triggers it with gcc:
For those who are interested, here is the x64 assembly code, produced by gcc 4.6.3 running on Ubuntu 12.04.2 for x64:
So the if condition is completely gone from the assembler code.
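A minimal sketch of the kind of function where this happens is shown below. The name store_then_read is made up for illustration. Because uint32_t and uint16_t are incompatible types, an optimizing compiler is allowed to assume that the store through k cannot change *h, and so it may treat the final read of *h as the constant 5.

```c
#include <stdint.h>

/* Hypothetical example function.
   The compiler may assume h and k point to different objects because
   their pointee types are incompatible. It is therefore free to skip
   reloading *h after the store through k. */
uint32_t store_then_read(uint32_t *h, uint16_t *k)
{
    *h = 5;
    *k = 6;
    return *h;   /* may be folded to "return 5" when optimizing */
}
```

Called with two distinct objects, this is well defined and returns 5. Calling it with both pointers aimed at the same memory is exactly the rule violation discussed here, and the result then depends on the compiler and the optimization level.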
According to the C89 rationale, the authors of the Standard did not want to require that compilers given code like:
should be required to reload the value of
The authors of the Standard identified a few cases where aliasing might be used in code that should be nearly 100% portable, and mandated that compilers allow for aliasing in at least those cases. They made no attempt to pass judgment upon what constructs should be usable within code that is only intended to be usable on specific platforms, nor what constructs should be supported by quality implementations that claim to be suitable for systems programming on those platforms.
If a compiler for a particular platform indicates that it is intended for high-end number crunching applications, and a piece of operating system code for that platform malfunctions when fed to that compiler, that does not mean that the compiler is defective, nor does it mean that the code is defective. It merely means that the compiler and operating system code are not suitable for use with each other.
Unfortunately, some compiler writers point to the fact that the Standard doesn't require that all compilers recognize certain aliasing constructs as implying a judgment that all code using such constructs should be considered defective, even if the code does things which could not be done as efficiently any other way. If such compiler writers would recognize that the authors of the Standard have never tried to enumerate all the features and guarantees needed to make a compiler suitable for any particular purpose, they could shift their efforts toward figuring out how to make their compiler as useful as possible for a wide range of purposes, rather than trying to argue that the Standard doesn't require them to do so.
After reading many of the answers, I feel the need to add something:
Strict aliasing (which I'll describe in a bit) is important because:
Since two pointers can point to the same location in the memory, this could result in complex code that handles possible collisions.
This extra code is slow and hurts performance since it performs extra memory read / write operations which are both slower and (possibly) unnecessary.
The Strict aliasing rule allows us to avoid redundant machine code in cases in which it should be safe to assume that two pointers don't point to the same memory block (see also the
The strict aliasing rule states that it's safe to assume that pointers to different types point to different locations in memory.
If a compiler notices that two pointers point to different types (for example, an
Let's assume the following function:
In order to handle the case in which
Step 3 is very slow because it needs to access the physical memory. However, it's required to protect against instances where
Strict aliasing would allow us to prevent this by telling the compiler that these memory addresses are distinctly different (which, in this case, will allow even further optimization which can't be performed if the pointers share a memory address).
Now, by satisfying the Strict Aliasing rule, step 3 can be avoided and the code will run significantly faster.
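As a rough illustration, here are two versions of a small function with hypothetical names. In the first, both parameters have the same type, so they may legally point to the same object and the compiler has to reload *a after writing *b. In the second, the types differ, so the compiler may assume the two pointers never overlap and can keep the first value of *a in a register.

```c
/* Same pointee type: a and b may alias, so *a has to be read again
   after the store to *b. */
void add_both_same_type(int *a, int *b)
{
    *b += *a;
    *a += *b;
}

/* Different pointee types: under strict aliasing the compiler may
   assume a and b refer to different objects, so the value of *a loaded
   for the first statement can be reused in the second. */
void add_both_diff_type(int *a, long *b)
{
    *b += *a;
    *a += (int)*b;
}
```

With a = 1 and b = 2 in separate objects, both versions leave b == 3 and a == 4. The first version also stays well defined when both arguments point to the same int, in which case the value is doubled twice.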
In fact, by adding the
This optimization couldn't have been done before, because of the possible collision (where