Writing efficient overflow checks in C is very challenging (but good brain exercise). Even libraries written by security experts, such as Microsoft’s SafeInt and CERT’s IntegerLib, could contain broken overflow checks.
For another example, I have seen a lot of bit tricks for detecting
int overflow, many of which are actually questionable
because they rely on undefined behavior. At C-language level one
cannot compute the signed addition first before overflow checking
since signed integer overflow is undefined. Guess what C compilers
would do to the following code.
1 2 3 4 5 6 7 8 9 10
GCC will optimize away the check
c < 0, while Clang keeps
it. You may try another (broken) form
c < a:
this time GCC keeps the check, while Clang removes it.
I am proposing an overflow detection library, libo. Unlike previous ones, it consists of assembly code, and due to the fact that I am lazy, the assembly code is generated automatically from a short program.
Let’s start with a simple task: writing an array allocator
malloc_array(n, size), where
n is the number of elements
size is the element size
(i.e., the non-zeroing version of
Here’s a popular way.
1 2 3 4 5 6
Unfortunately neither GCC nor Clang is able to optimize away the division (dunno why though; looks like a straightforward transformation).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
If you are crazy about performance, try this trick from the Boehm GC guys.
1 2 3 4 5
Another way is to promote the integer operation to a wider type,
such as 128-bit (assuming
size_t is 64-bit); just don’t
forget to do the type cast before multiplication.
I am not sure how portable this is. The
grsecurity patch uses something similar
1 2 3 4 5 6 7
With the new libo,
malloc_array could be implemented as follows.
1 2 3 4 5 6 7
With link-time optimization, Clang produces very nice code:
compared to the baseline code (without overflow checking),
it has only one more
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Here is the trick. LLVM internally supports arithmetic with overflow operations, based on which the libo generator builds up functions like this:
1 2 3 4 5 6 7 8 9
To use libo with Clang (and link-time optimization), one can just use the IR. To use libo with GCC, one can invoke LLVM’s backends to generate assembly. Assuming LLVM is implemented correctly, we get a correct and efficient libo implementation for all architectures LLVM supports.
One downside is that it may be difficult to do link-time optimization
when using GCC; this would end up with having a call to
rather than inlining it. How to ``inline’’ a function purely written
in assembly code using GCC? Or is it possible to have LLVM generate
inline assembly instead? Of course it would be nicer if compilers
provide built-in functions for overflow detection.