Friday, 20 July 2018

Iterator Invalidation

"Iterator Invalidation" -- quite common and misunderstood term with people who use c++ STL library, as for most of them it just means dereferencing an iterator whose underlying memory is deleted and therefore such a scenario will lead to program crash.However this is just one of the use case because Iterator Invalidation also means the element that is expected from iterator dereferencing is not coming in output.So let's work on this with below code :

#include <iostream>
#include <vector>
int main()
{
  std::vector<std::string> cities{"NewYork","Singapore","HongKong","Mumbai","London"};
  auto secondElementIterator = cities.begin() + 2;
  //Before any Deletion Iterator to second element will give HongKong 
  //as index starts from 0
  std::cout<<"secondElementIterator pointing to "<<*secondElementIterator<<std::endl;
  //Let's Delete  0th element
  cities.erase(cities.begin());
  //After Deletion Iterator to second element will give Mumbai as shifiting 
  //occured and this is also use case of Iterator Invalidation
  std::cout<<"secondElementIterator pointing to "<<*secondElementIterator<<std::endl;
  return 0;
}

So as observed in above code we have not deleted second element from vector but then also iterator to second element is giving different element after deletion so we can conclude that deleting any element from vector will invalidate all iterators after the deleted element irrespective of whether program crashed or not.

Now let's take an example of set and run through same scenario :

#include <iostream>
#include <set>
int main()
{
  std::set<std::string> cities{"NewYork","Singapore","HongKong","Mumbai","London"};
  
  //You cannot directly perform arithmetic operation on set iterator since it is 
  //bidirectional and not random iterator
  //So we can have iterator to second element using std::advance
  auto secondElementIterator = cities.begin();
  std::advance(secondElementIterator,2);
  
  //Before any Deletion Iterator to second element will give Mumbai 
  //as by default set elements are ordered lexicographically
  std::cout<<"secondElementIterator pointing to "<<*secondElementIterator<<std::endl;
  
  //Let's Delete  0th element which will delete HongKong
  //(first element in dicitionary order)
  cities.erase(cities.begin());
  
  //After Deletion Iterator to second element will give Mumbai 
  //as in case of set standard guarantees that only deleted element is invalidated 
  //and rest all element iterator remain valid
  std::cout<<"secondElementIterator pointing to "<<*secondElementIterator<<std::endl;
  return 0;
}

So as we analyzed with above example that in case of set (same applies to map also) only the iterator to element that is getting deleted is invalidated but all the other iterators remain intact and this is guaranteed by standard.

Now let's take one final example where while iterating over map we need to delete some element based on condition :

#include <iostream>
#include <map>
int main()
{
  std::map<std::string,unsigned int> cityRankMap{{"NewYork",1},{"Singapore",3},
                                           {"HongKong",4},{"Mumbai",5},{"London",2}};
  //C++ 98/03 Style 
  for(auto itr=cityRankMap.begin();itr!=cityRankMap.end();/*Do not increment here*/)
  {
     if(itr->second == 3)
     {
         cityRankMap.erase(itr++);
     }
     else
     {
         ++itr;
     }
  }
  
  //C++11/14 Style where erase now returns iterator
  for(auto itr=cityRankMap.begin();itr!=cityRankMap.end();/*Do not increment here*/)
  {
     if(itr->second == 2)
     {
         itr = cityRankMap.erase(itr);
     }
     else
     {
         ++itr;
     }
  } 
  return 0;
}

As we understood from our set example that in case of associative containers erase/delete will only invalidate that iterator so in use case (described in above code) where we need to delete element based on some condition while iterating over container then we cannot increment the deleted iterator as it is invalid now so what we do prior to C++11 is to follow associative container erase idiom which means while erasing do post increment of that iterator so that after erase iterator is incremented to proper next location but why post increment and the reason for that is post increment operation makes a copy of variable in some temporary variable so even after erase this temporary variable is valid.However with C++11/14 this is made little easier as erase is now returning an iterator  so after current iterator invalidation due to erase we can capture erase return value in that as demonstrated in above code.

So with this example this post of iterator invalidation is completed and the key takeaways from this is as follows:


  • When we delete element from vector all the iterators after that deleted element are invalidated.
  • When we delete element from associative container (set/map) only the iterator to that deleted element is invalidated
We will talk some other day about iterator invalidation use case in insert scenario because as per me that is very specific use case for vectors which happens when that much big contiguous block is not available for vector and it has to do reallocation.


Thanks for reading and do post your comments specifically if you find something incorrect.

Saturday, 14 July 2018

Curiously Recurring Template Pattern (CRTP)

As a C++ developer we all are very comfortable with shape hierarchy example that we all study while learning virtual functions and run-time polymorphism and this post will just be leveraging that example to explain CRTP. So first things first "What is CRTP ?" .

CRTP or Curiously Recurring Template Pattern is C++ principle according to which Derived Class will inherit from template class and template parameter in that class will be derived class(curiously recurring !).

So with this definition in mind we will see how we can apply this principle to our classic Shape Class hierarchy in which all our derived classes namely Circle,Square and Rectangle will inherit from templated Shape class with respective derived class as template parameter. It seems everything is quite confusing till now so let's go look at below code snippet which should clear all our doubts :

#include <iostream>
class Rectangle;
class Circle;
class Square;
template<typename T>
class Shape
{
 public:
  Shape() = default;
  template<typename... Args>
  Shape(Args... args):ptr(new T(args...)){};
  double area()
  {
     return ptr->area();
  }
 protected:
  T* ptr;
 
};

class Rectangle:public Shape<Rectangle>
{
 public:
  Rectangle(unsigned int arg_nLength,unsigned int arg_nBreadth):m_nLength(arg_nLength),m_nBreadth(arg_nBreadth)
  {
  }
  double area()
  {
   return m_nLength*m_nBreadth;
  }
 private:
  unsigned int m_nLength;
  unsigned int m_nBreadth;
};

class Square:public Shape<Square>
{
 public:
  Square(unsigned int arg_nside):m_nside(arg_nside)
  {
  }
  double area()
  {
   return m_nside*m_nside;
  }
 private:
  unsigned int m_nside;
};
class Circle:public Shape<Circle>
{
 public:
  Circle(unsigned int r):R(r)
  {
  }
  double area()
  {
   return 3.14*R*R;
  }
 private:
  unsigned int R;

};
int main()
{
 Shape<Circle> C(3);
 Shape<Square> S(3);
 Shape<Rectangle> R(3,4);
 std::cout<<"Area of circle is "<<C.area()<<std::endl;
 std::cout<<"Area of square is "<<S.area()<<std::endl;
 std::cout<<"Area of square is "<<R.area()<<std::endl;
 return 0;
}
So with this example CRTP should be more clearer and it is also understood now that CRTP cannot replace virtual mechanism completely as we cannot have single pointer that can point to all the classes Shape<Circle>, Shape<Square> and Shape<Rectangle> as each is distince type in itself, but it seems we can get closer if our main idea is to have consistent interface for distinct types but given that type is known at compile time.Below is sample code with updated main for this approach :

template<typename T>
double calculateArea(Shape<T>& obj)
{
    return obj.area();
}
int main()
{
 Shape<Circle> C(3);
 Shape<Square> S(3);
 Shape<Rectangle> R(3,4);
 std::cout<<"Area of circle is "<<calculateArea(C)<<std::endl;
 std::cout<<"Area of square is "<<calculateArea(S)<<std::endl;
 std::cout<<"Area of rectangle is "<<calculateArea(R)<<std::endl;
 return 0;
}