вторник, 19 июля 2011 г.

Is it reasonable to use the prefix increment operator ++it instead of postfix operator it++ for iterators?

I decided to find out if there is practical sense in writing ++iterator instead of iterator++ when handling iterators. My interest in this question arouse far not from my love to art but from practical reasons. We have intended for a long time to develop PVS-Studio not only in the direction of error search but in the direction of prompting tips on code optimization. A message telling you that you'd better write ++iterator is quite suitable in the scope of optimization.

But how much relevant is this recommendation nowadays? In ancient times, for instance, it was advised not to repeat calculations. It was a good manner to write:

TMP = A + 10;

X = TMP + B;

Y = TMP + C;

instead of

X = A + 10 + B;

Y = A + 10 + C;

Such subtle manual optimization is meaningless now. The compiler would handle this task as well. It's just unnecessary complication of code.

Note for pedantic ones. Yes, you'd better not repeat calculations and calculate long expressions, which are used several times, separately. What I'm talking about is that there is no reason in optimizing simple cases like the one I have mentioned.

Well, we have digressed from our main point which is the question if the advice to use the prefix increment instead of postfix increment for operators is obsolete nowadays; if we should store our mind with one more subtle thing. Perhaps the compiler learned to optimize prefix increments long ago.

A bit of theory at first for those who are not familiar with the topic. All the rest may scroll the text a bit.

The prefix increment operator changes an object's state and returns itself in the changed form. The prefix increment operator in the iterator class to handle std::vector may look this way:

_Myt& operator++()
{ // preincrement
  ++_Myptr;
  return (*this);
}

The situation with the postfix increment is more complicated. The object's state must change but it is the previous state which is returned. An additional temporary object is created:

_Myt operator++(int)
{ // postincrement
  _Myt _Tmp = *this;
  ++*this;
  return (_Tmp);
}

If we want to just increment the iterator's value, it turns out that the prefix operator is more preferable. That is why, here you are one of the tips concerning software micro-optimization: write "for (it = a.begin(); it != a.end; ++it)" instead of "for (it = a.begin(); it != a.end; it++)". In the latter case, an unnecessary temporary object is created, which reduces performance.

You may read about all this in detail in the book by Scott Meyers "Efficient use of C++. 35 new recommendations on improving your programs and projects" (Rule 6. Distinguish between prefix increment and decrement operators) [1].

The theory is over. Now practice. Is there sense in replacing the postfix increment with the prefix one in code?

size_t Foo(const std::vector<size_t> &arr)
{
  size_t sum = 0;
  std::vector<size_t>::const_iterator it;
  for (it = arr.begin(); it != arr.end(); it++)
    sum += *it;
  return sum;
}

I know that we may wander into the depths of philosophy now. Say, it may turn out that some other class would become the container instead of vector and iterators in this new class would be very complex and heavy; when copying the iterator, we would have to establish a new connection to the database and so on. So, you must always write ++it.

But it's theory. But in practice, when we encounter such a loop in our code, is it reasonable to replace it++ with ++it? Cannot we rely on the compiler that will guess itself that it can throw away an unnecessary iterator?

The answers are strange but the reason why we give them will become evident through further experiments.

Yes, we must replace it++ with ++it.

Yes, the compiler will optimize the code and it won't matter what increment we use.

I chose an "average compiler" and created a test project for Visual Studio 2008. It has two functions which calculate the sum using it++ and ++it and also estimates their running time. You may download the project here. Here is the code of functions the speed of which was measured:

1) Postfix increment. iterator++.

std::vector<size_t>::const_iterator it;
for (it = arr.begin(); it != arr.end(); it++)
  sum += *it;

2) Prefix increment. ++iterator.

std::vector<size_t>::const_iterator it;
for (it = arr.begin(); it != arr.end(); ++it)
  sum += *it;

Working time in the Release version:

iterator++. Total time : 0.87779

++iterator. Total time : 0.87753

This is the answer to the question if the compiler can optimize the postfix increment. Sure it can. If you study the implementation (assembler code), you will see that the both functions are implemented with the same instruction set.

Now let's answer the question what for we should replace it++ with ++it then. Let's measure the speed of functions in the Debug version:

iterator++. Total time : 83.2849

++iterator. Total time : 27.1557

There is practical sense in writing the code so that it slows down only 30 times and not 90 times.

Of course, the speed of Debug versions is not so crucial for many programmers. But if a program does something for a long time, sure such a large slow-down might be crucial, for instance, from the viewpoint of unit-tests. So, it is reasonable to optimize the speed of the Debug version.

I carried out one more experiment to find out what I would get using the good old size_t for indexing. I know it doesn't relate to the topic we are discussing and I understand that we cannot compare iterators with indexes and that the former are higher-level entities. But still I wrote and measured the speed of the following functions just out of curiosity:

1) Classic index of the size_t type. i++.

for (size_t i = 0; i != arr.size(); i++)
  sum += arr[i];

2) Classic index of the size_t type. ++i.

for (size_t i = 0; i != arr.size(); ++i)
  sum += arr[i];

The speed in the Release version:

iterator++. Total time : 0.18923

++iterator. Total time : 0.18913

The speed in the Debug version:

iterator++. Total time : 2.1519

++iterator. Total time : 2.1493

As we had expected, the speeds of i++ and ++i coincided.

Note. Code with size_t works faster in comparison to iterators due to absence of array overrun check. We can make the loop with iterators as fast in the Release version by adding the line "#define _SECURE_SCL 0".

To make it easier for you to evaluate the results of speed measurements, I present them in the table (Figure 1). I have converted the results taking the running time of Release version with iterator++ for a unit. I also rounded them off a bit to make them clearer.

Figure 1. The running time of sum calculation algorithms.

Figure 1. The running time of sum calculation algorithms.

Each one of you can draw your own conclusions. They depend upon tasks you are solving. Personally I came to the following conclusions:

  1. I made sure that it is reasonable to perform such micro-optimization. We should implement the search of postfix increment iterators in PVS-Studio when their previous states are not used. Some programmers will find this functionality useful. All the rest can disable it in the settings if they don't need it.
  2. I will always write ++it. I did so before but I did it "just in case". Now I can see how useful it is because I regularly launch debug versions. In general, of course, ++it has a very slight influence on the running time. But if I don't make such small optimizations in different places of the code, it will be too late and the profiler won't help me. Bottlenecks will be spread throughout the code.
  3. I notice that the PVS-Studio analyzer is spending more and more time inside various functions of std::vector, std::set, std::string classes and the like. This time is growing more and more because new diagnostic rules appear - and it is quite convenient for us to write them using STL. So, I think - hasn't that frightful time come when the program acquires its own specialized string classes, array classes and so on. Well, it is just my cares... Don't listen to me! I tell people seditious things... Sh!..

P. S.

Someone will say now that untimely optimization is evil [2]; when you need optimization, you take the profiler and search for bottlenecks. I know this. And I got rid of certain bottlenecks long ago. But when I'm waiting for the tests to finish for 4 hours, I start thinking that it is a very good idea to gain at least 20% of speed. Such optimization is comprised of iterators, structure sizes, avoiding using STL or Boost in some fragments and so on. I believe that some developers agree with me.

References

  1. Meyers, Scott. More Effective C++: 35 New Ways to Improve Your Programs and Designs. Addison-Wesley, Reading, Mass., 1996. ISBN-10: 020163371X. ISBN-13: 9780201633719.
  2. Randall Hyde. The Fallacy of Premature Optimization. http://www.viva64.com/go.php?url=660

Комментариев нет:

Отправить комментарий