I’d like to talk here about two not-so-new features of C++ (auto types and lambda functions) that managed to bite me recently, even though I thought I knew them well enough (I still do!).
In my opinion, this doesn’t mean we should ditch them into oblivion and never use them (though some people make their lives easier by doing so), but it clearly shows that they should be handled with care: they more or less act as syntactic sugar over the C++ type system, which can unfortunately be both overly rigid (which is why we use them in the first place) and overly flexible (which is usually how bugs get in our way).
Let’s get to the point.
« auto » acts as any other type: don’t assume too much about it
I myself rarely use auto. I tend to think that with a proper static analyzer with working type suggestion, even long type names aren’t such a pain to write anymore.
I mainly use it in two situations: long iterator names and lambda functions. In the first case, it’s usually because I’m plain lazy; in the other, it seems to me that lambdas were designed to be declared as auto anyway, since their actual type cannot be spelled out and wrapping them in something else would tend to obfuscate the code rather than make it clearer (a personal opinion).
So what’s the deal with auto? I will sum it up like this: unwanted silent object copying!
The thing is that if we’re not careful, we tend to treat auto as the exact type of the expression used to initialize the variable. In most cases it does what we expect, but plain auto deduces its type by value: it drops references (and top-level const), so problems can arise with references to copy-constructible objects.
Consider the following:
```cpp
#include <iostream>

class HugeObject
{
public:
    HugeObject() {}
    HugeObject(const HugeObject & toto) { std::cout << "copy!" << std::endl; }
    // here be huge data...
};

HugeObject & GetHugeObject()
{
    static HugeObject toto;
    return toto;
}

int main()
{
    HugeObject& hugetoto = GetHugeObject();
    auto hugetoto_copy = GetHugeObject(); // auto = HugeObject -> implicit copy !!
    auto& hugetoto_ref = GetHugeObject(); // OK: auto& = HugeObject&
    return 0;
}
```
If you compile and run it, you will see in horror that the second call to GetHugeObject does indeed make a copy of the returned object. That is perfectly valid: even if what you meant by using auto was « get me a reference », you actually declared a new object, so it gets copy-constructed from the returned reference instead! The third call does what you probably meant in this case.
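To make the deduction rules concrete, here is a minimal sketch. It uses a stripped-down stand-in for HugeObject, and the helper function names are my own, not part of the example above. It checks at compile time which type each declaration gets, and at run time whether the variable is the original object or a copy. The decltype(auto) variant assumes a C++14 compiler.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical, stripped-down stand-in for the HugeObject above.
struct HugeObject {};

HugeObject& GetHugeObject()
{
    static HugeObject instance;
    return instance;
}

// Plain auto drops the reference, so a distinct object (a copy) is created.
bool auto_makes_a_copy()
{
    auto copy = GetHugeObject();
    static_assert(std::is_same<decltype(copy), HugeObject>::value,
                  "auto deduces HugeObject, not HugeObject&");
    return &copy != &GetHugeObject();
}

// auto& keeps the reference, so it aliases the original object.
bool auto_ref_aliases_original()
{
    auto& ref = GetHugeObject();
    static_assert(std::is_same<decltype(ref), HugeObject&>::value,
                  "auto& deduces HugeObject&");
    return &ref == &GetHugeObject();
}

// decltype(auto) (C++14) keeps the exact return type, reference included.
bool decltype_auto_aliases_original()
{
    decltype(auto) exact = GetHugeObject();
    static_assert(std::is_same<decltype(exact), HugeObject&>::value,
                  "decltype(auto) deduces HugeObject&");
    return &exact == &GetHugeObject();
}

// Runs all three checks; call it from main() to try the sketch.
void run_deduction_checks()
{
    assert(auto_makes_a_copy());
    assert(auto_ref_aliases_original());
    assert(decltype_auto_aliases_original());
}
```

Calling run_deduction_checks() from main() should pass silently: only the plain auto declaration produces a second object.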
One way of avoiding this scenario, aside from being careful and always wondering « do I want a reference? », is to make the copy constructor explicit:
```cpp
explicit HugeObject (const HugeObject & toto)
```
so this kind of problematic syntax won’t compile and you’ll have to explicitly (no sh#t Sherlock!) declare an object that will be copied from the returned reference:
```cpp
auto hugetoto_copy = GetHugeObject(); // doesn't compile if copy constructor is explicit
HugeObject tototo(GetHugeObject());   // OK: totally explicit
auto totototo(GetHugeObject());       // OK: auto used but still explicit
```
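The difference between the two forms can also be checked with standard type traits. The sketch below is only an illustration: ImplicitCopy and ExplicitCopy are hypothetical class names, and the constexpr flags are mine. std::is_convertible models copy-initialization (the `T x = expr;` form), while std::is_constructible models direct-initialization (the `T x(expr);` form).

```cpp
#include <type_traits>

// Hypothetical classes: same shape, only the copy constructor differs.
class ImplicitCopy
{
public:
    ImplicitCopy() {}
    ImplicitCopy(const ImplicitCopy&) {}
};

class ExplicitCopy
{
public:
    ExplicitCopy() {}
    explicit ExplicitCopy(const ExplicitCopy&) {}
};

// "T x = ref;" (copy-initialization) is what auto x = GetHugeObject() boils down to.
constexpr bool implicit_allows_copy_init =
    std::is_convertible<ImplicitCopy&, ImplicitCopy>::value;
constexpr bool explicit_allows_copy_init =
    std::is_convertible<ExplicitCopy&, ExplicitCopy>::value;

// "T x(ref);" (direct-initialization) still works with an explicit copy constructor.
constexpr bool explicit_allows_direct_init =
    std::is_constructible<ExplicitCopy, ExplicitCopy&>::value;

static_assert(implicit_allows_copy_init, "a regular copy constructor allows T x = ref;");
static_assert(!explicit_allows_copy_init, "an explicit copy constructor rejects T x = ref;");
static_assert(explicit_allows_direct_init, "an explicit copy constructor allows T x(ref);");
```

Keep in mind that an explicit copy constructor also rejects other implicit copies, such as passing the object by value to a function, so this trick has a cost of its own.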
As for the static analyzers I mentioned earlier, here is a small rant (skip to the next section for more C++ 🙂 ):
Unfortunately, as I write this, I find « proper static analyzers with working type suggestion » to be much scarcer than they should be these days, even in mainstream IDEs. To my mind, Visual Studio’s IntelliSense, for example, is a real shame in this respect: even on small projects it always seems to be lagging behind, unable to perform the simplest of tasks (e.g. creating declaration/definition pairs, renaming…) without taking several seconds, and updating and rescanning solutions (making it lag further behind…) virtually all the time (why?), thus regularly failing to perform what should be one of its core features: type suggestion! It’s no wonder third-party tools such as Visual Assist X or ReSharper get to eat cake on Visual Studio’s back, since they’re so much superior to the built-in features of the IDE.
To be fair on this particular issue, it seems that the Enterprise Edition of Visual Studio does better (a nerf of the Community Edition? Perhaps, but hey, we shouldn’t complain too much about something we haven’t paid for, right? 😉 ), but it still fails to properly manage large projects (which are fairly common on C++ ground) on its own.
Lambdas can implicitly copy data too, if you use them wrong
As a matter of fact, this is the other side of the same coin, but this time with lambdas.
This has to do with the capture syntax of lambdas. Remember how the capture list of a lambda can be declared:
- [a,&b] where a is captured by copy and b is captured by reference.
- [this] captures the current object (*this) by reference
- [&] captures all automatic variables used in the body of the lambda by reference, and the current object by reference if it exists
- [=] captures all automatic variables used in the body of the lambda by copy, and the current object by reference if it exists (this implicit capture of this is deprecated since C++20)
- [] captures nothing
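As a quick illustration of the capture forms listed above, here is a minimal sketch. The function and variable names are made up for the example. Each function changes a variable after the lambda has been created, which shows whether the lambda holds its own copy or an alias.

```cpp
#include <cassert>

// [a, &b]: a is copied when the lambda is created, b is aliased.
int mixed_capture()
{
    int a = 1;
    int b = 1;
    auto add_a_to_b = [a, &b]() { b += a; };
    a = 10;        // too late: the lambda already holds its own copy of a (1)
    add_a_to_b();  // b = 1 + 1
    return b;      // 2
}

// [=]: every variable used in the body is copied at creation time.
int all_by_copy()
{
    int x = 2;
    int y = 3;
    auto sum = [=]() { return x + y; };
    x = 100;       // not seen by the lambda
    return sum();  // 2 + 3 = 5
}

// [&]: every variable used in the body is captured by reference.
int all_by_reference()
{
    int x = 2;
    int y = 3;
    auto add_y_to_x = [&]() { x += y; };
    y = 10;        // seen by the lambda, since y is aliased
    add_y_to_x();  // x = 2 + 10
    return x;      // 12
}

// Runs the three cases; call it from main() to try the sketch.
void run_capture_checks()
{
    assert(mixed_capture() == 2);
    assert(all_by_copy() == 5);
    assert(all_by_reference() == 12);
}
```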
In other words, when declaring a lambda you have to decide carefully which surrounding variables you want to capture and, for each of them, whether you want to capture it by copy or by reference.
Then, consider the following:
```cpp
#include <iostream>
#include <vector>
#include <algorithm>

class HugeObject
{
public:
    HugeObject() {}
    explicit HugeObject(const HugeObject & toto)
    {
        std::cout << "copy!" << std::endl;
        a = toto.a;
    }

    int a = 42;
};

HugeObject& GetHugeObject()
{
    static HugeObject toto;
    return toto;
}

int main()
{
    std::vector<int> test(3, 42); // 3 ints with value 42
    HugeObject& toto = GetHugeObject();

    std::for_each(test.begin(), test.end(),
        [toto](int a){
            std::cout << a << std::endl;
        });

    std::cout << std::endl;
    return 0;
}
```
Can you see the error?
The problem here is that, without being careful, it’s extremely easy to capture the HugeObject reference variable by copy, and, without giving much thought to it, one could expect that a reference variable captured by copy would capture the reference, right?
Wrong! It will still copy the whole object and store it in a brand new HugeObject toto inside the lambda (effectively shadowing the reference from outside, and const within the body unless the lambda is declared mutable), copy-constructed from the reference! That case may be even nastier than the auto one since, as you can see, it compiles and runs smoothly without any warning or error even though the copy constructor is explicit: variables captured by copy are direct-initialized, so explicit does not get in the way.
Of course, to capture toto as a reference (which is what we probably want), you have to capture it as such (don’t forget the ampersand, even for variables that already are references!):
```cpp
std::for_each(test.begin(), test.end(), [&toto](int a){ ...
```
In this case, we declare a lambda on the fly to use within a for-each loop. The good news is that in such a case, the object won’t get copied on each loop iteration: the capture should happen only once, when the lambda is created.
But funnily enough, running this code on my computer showed that the captured HugeObject toto gets copied twice, regardless of the number of iterations: once before the loop and once again after the end of the loop. As always with lambdas, the call stack isn’t very helpful. My guess is that it is tied to the for_each algorithm itself: std::for_each returns its function object by value, and since HugeObject declares a copy constructor (and therefore has no implicit move constructor), returning the lambda copies the captured object once more.
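If you want to measure this yourself rather than read « copy! » lines in a console, here is a minimal sketch that counts copies. CountedObject and the helper functions are hypothetical names standing in for HugeObject and the main function above. The exact count in the by-copy case depends on the compiler and standard library, so the sketch only assumes it is at least one. The by-reference case, on the other hand, should never copy the object.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for HugeObject that counts its copies.
struct CountedObject
{
    static int copies;
    CountedObject() {}
    explicit CountedObject(const CountedObject&) { ++copies; }
};
int CountedObject::copies = 0;

CountedObject& GetCountedObject()
{
    static CountedObject instance;
    return instance;
}

// Number of copies made when the reference variable is captured by copy.
int copies_with_value_capture()
{
    std::vector<int> test(3, 42);
    CountedObject& toto = GetCountedObject();
    CountedObject::copies = 0;
    std::for_each(test.begin(), test.end(), [toto](int) { (void)toto; });
    return CountedObject::copies;
}

// Number of copies made when the variable is captured by reference.
int copies_with_reference_capture()
{
    std::vector<int> test(3, 42);
    CountedObject& toto = GetCountedObject();
    CountedObject::copies = 0;
    std::for_each(test.begin(), test.end(), [&toto](int) { (void)toto; });
    return CountedObject::copies;
}

// Runs both measurements; call it from main() to try the sketch.
void run_copy_count_checks()
{
    assert(copies_with_value_capture() >= 1);      // at least the capture itself
    assert(copies_with_reference_capture() == 0);  // only a reference is stored
}
```

Printing the value returned by copies_with_value_capture() on your own toolchain is an easy way to see whether the extra copy on return shows up there too.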
As of today, I can’t think of any silver bullet for this one: you need to be very careful with your capture lists and really think about how and why you capture variables in your lambda. I personally caught this one by trying to use a non-const method of the captured variable, which didn’t compile since it created a const copy. I was lucky since otherwise I would probably never have noticed my error.
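The « const copy » symptom described above can be reproduced with a small sketch. Counter and the function names are hypothetical. A lambda’s call operator is const by default, so a non-const member function cannot be called on a variable captured by copy unless the lambda is declared mutable, and even then only the copy is modified.

```cpp
#include <cassert>

// Hypothetical class with a non-const member function.
struct Counter
{
    int value = 0;
    void increment() { ++value; }
};

// Captures by copy: the outer object is left untouched.
int outer_value_after_copy_capture()
{
    Counter counter;
    // auto broken = [counter]() { counter.increment(); };
    // ^ does not compile: the captured copy is const inside the body.
    auto bump_copy = [counter]() mutable { counter.increment(); };
    bump_copy();           // increments the lambda's private copy only
    return counter.value;  // still 0
}

// Captures by reference: the outer object is modified.
int outer_value_after_reference_capture()
{
    Counter counter;
    auto bump = [&counter]() { counter.increment(); };
    bump();
    return counter.value;  // 1
}

// Runs both cases; call it from main() to try the sketch.
void run_mutable_checks()
{
    assert(outer_value_after_copy_capture() == 0);
    assert(outer_value_after_reference_capture() == 1);
}
```

Note that reaching for mutable just to silence the compiler error would hide exactly the bug discussed here, since the changes would land on the copy.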
So the bottom line is really this:
When using tools such as auto types and lambdas, you have to be absolutely sure you understand the level of control you’re giving up to the type system, or soon, problems can arise!
I find this kind of « new age of C++ » problem particularly tricky, as even seasoned programmers may not be familiar with it, and that’s assuming you spot the problem in the first place: these bugs can be particularly silent, they usually produce the expected result (while unnecessarily copying data around), and you may never even notice them.
Then again, this kind of problem could be filed under the « performance problems » list, which, as anyone knows, are usually the last ones you should deal with (if performance is ever to be an issue, « Premature optimization » yadda yadda…).
But still, these typically are bugs in the strictest sense (even if the code works, it may not behave the way you want it to), and it is quite frightening to know that such innocent-looking features, which will get more and more common in the future, may cost you more than you expect without you ever realizing it.
Sometimes, C++ reminds you how shitty of a language it can be…