Friday 20 July 2018

Iterator Invalidation

"Iterator Invalidation" -- quite common and misunderstood term with people who use c++ STL library, as for most of them it just means dereferencing an iterator whose underlying memory is deleted and therefore such a scenario will lead to program crash.However this is just one of the use case because Iterator Invalidation also means the element that is expected from iterator dereferencing is not coming in output.So let's work on this with below code :

#include <iostream>
#include <vector>
int main()
{
  std::vector<std::string> cities{"NewYork","Singapore","HongKong","Mumbai","London"};
  auto secondElementIterator = cities.begin() + 2;
  //Before any Deletion Iterator to second element will give HongKong 
  //as index starts from 0
  std::cout<<"secondElementIterator pointing to "<<*secondElementIterator<<std::endl;
  //Let's Delete  0th element
  cities.erase(cities.begin());
  //After Deletion Iterator to second element will give Mumbai as shifiting 
  //occured and this is also use case of Iterator Invalidation
  std::cout<<"secondElementIterator pointing to "<<*secondElementIterator<<std::endl;
  return 0;
}

So as observed in above code we have not deleted second element from vector but then also iterator to second element is giving different element after deletion so we can conclude that deleting any element from vector will invalidate all iterators after the deleted element irrespective of whether program crashed or not.

Now let's take an example of set and run through same scenario :

#include <iostream>
#include <set>
int main()
{
  std::set<std::string> cities{"NewYork","Singapore","HongKong","Mumbai","London"};
  
  //You cannot directly perform arithmetic operation on set iterator since it is 
  //bidirectional and not random iterator
  //So we can have iterator to second element using std::advance
  auto secondElementIterator = cities.begin();
  std::advance(secondElementIterator,2);
  
  //Before any Deletion Iterator to second element will give Mumbai 
  //as by default set elements are ordered lexicographically
  std::cout<<"secondElementIterator pointing to "<<*secondElementIterator<<std::endl;
  
  //Let's Delete  0th element which will delete HongKong
  //(first element in dicitionary order)
  cities.erase(cities.begin());
  
  //After Deletion Iterator to second element will give Mumbai 
  //as in case of set standard guarantees that only deleted element is invalidated 
  //and rest all element iterator remain valid
  std::cout<<"secondElementIterator pointing to "<<*secondElementIterator<<std::endl;
  return 0;
}

So as we analyzed with above example that in case of set (same applies to map also) only the iterator to element that is getting deleted is invalidated but all the other iterators remain intact and this is guaranteed by standard.

Now let's take one final example where while iterating over map we need to delete some element based on condition :

#include <iostream>
#include <map>
int main()
{
  std::map<std::string,unsigned int> cityRankMap{{"NewYork",1},{"Singapore",3},
                                           {"HongKong",4},{"Mumbai",5},{"London",2}};
  //C++ 98/03 Style 
  for(auto itr=cityRankMap.begin();itr!=cityRankMap.end();/*Do not increment here*/)
  {
     if(itr->second == 3)
     {
         cityRankMap.erase(itr++);
     }
     else
     {
         ++itr;
     }
  }
  
  //C++11/14 Style where erase now returns iterator
  for(auto itr=cityRankMap.begin();itr!=cityRankMap.end();/*Do not increment here*/)
  {
     if(itr->second == 2)
     {
         itr = cityRankMap.erase(itr);
     }
     else
     {
         ++itr;
     }
  } 
  return 0;
}

As we understood from our set example that in case of associative containers erase/delete will only invalidate that iterator so in use case (described in above code) where we need to delete element based on some condition while iterating over container then we cannot increment the deleted iterator as it is invalid now so what we do prior to C++11 is to follow associative container erase idiom which means while erasing do post increment of that iterator so that after erase iterator is incremented to proper next location but why post increment and the reason for that is post increment operation makes a copy of variable in some temporary variable so even after erase this temporary variable is valid.However with C++11/14 this is made little easier as erase is now returning an iterator  so after current iterator invalidation due to erase we can capture erase return value in that as demonstrated in above code.

So with this example this post of iterator invalidation is completed and the key takeaways from this is as follows:


  • When we delete element from vector all the iterators after that deleted element are invalidated.
  • When we delete element from associative container (set/map) only the iterator to that deleted element is invalidated
We will talk some other day about iterator invalidation use case in insert scenario because as per me that is very specific use case for vectors which happens when that much big contiguous block is not available for vector and it has to do reallocation.


Thanks for reading and do post your comments specifically if you find something incorrect.

No comments:

Post a Comment