| ====================================================== | 
 | How to set up LLVM-style RTTI for your class hierarchy | 
 | ====================================================== | 
 |  | 
 | .. contents:: | 
 |  | 
 | Background | 
 | ========== | 
 |  | 
 | LLVM avoids using C++'s built-in RTTI. Instead, it pervasively uses its | 
 | own hand-rolled form of RTTI, which is much more efficient and flexible, | 
 | although it requires a bit more work from you as a class author. | 
 |  | 
 | A description of how to use LLVM-style RTTI from a client's perspective is | 
 | given in the `Programmer's Manual <ProgrammersManual.html#isa>`_. This | 
 | document, in contrast, discusses the steps you need to take as a class | 
 | hierarchy author to make LLVM-style RTTI available to your clients. | 
 |  | 
 | Before diving in, make sure that you are familiar with the object-oriented | 
 | programming concept of "`is-a`_". | 
 |  | 
 | .. _is-a: http://en.wikipedia.org/wiki/Is-a | 
 |  | 
 | Basic Setup | 
 | =========== | 
 |  | 
 | This section describes how to set up the most basic form of LLVM-style RTTI | 
 | (which is sufficient for 99.9% of the cases). We will set up LLVM-style | 
 | RTTI for this class hierarchy: | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |    class Shape { | 
 |    public: | 
 |      Shape() {} | 
 |      virtual double computeArea() = 0; | 
 |    }; | 
 |  | 
 |    class Square : public Shape { | 
 |      double SideLength; | 
 |    public: | 
 |      Square(double S) : SideLength(S) {} | 
 |      double computeArea() override; | 
 |    }; | 
 |  | 
 |    class Circle : public Shape { | 
 |      double Radius; | 
 |    public: | 
 |      Circle(double R) : Radius(R) {} | 
 |      double computeArea() override; | 
 |    }; | 
 |  | 
 | The most basic working setup for LLVM-style RTTI requires the following | 
 | steps: | 
 |  | 
 | #. In the header where you declare ``Shape``, you will want to ``#include | 
 |    "llvm/Support/Casting.h"``, which declares LLVM's RTTI templates. That | 
 |    way your clients don't even have to think about it. | 
 |  | 
 |    .. code-block:: c++ | 
 |  | 
 |       #include "llvm/Support/Casting.h" | 
 |  | 
 | #. In the base class, introduce an enum which discriminates all of the | 
 |    different concrete classes in the hierarchy, and stash the enum value | 
 |    somewhere in the base class. | 
 |  | 
 |    Here is the code after introducing this change: | 
 |  | 
 |    .. code-block:: c++ | 
 |  | 
 |        class Shape { | 
 |        public: | 
 |       +  /// Discriminator for LLVM-style RTTI (dyn_cast<> et al.) | 
 |       +  enum ShapeKind { | 
 |       +    SK_Square, | 
 |       +    SK_Circle | 
 |       +  }; | 
 |       +private: | 
 |       +  const ShapeKind Kind; | 
 |       +public: | 
 |       +  ShapeKind getKind() const { return Kind; } | 
 |       + | 
 |          Shape() {} | 
 |          virtual double computeArea() = 0; | 
 |        }; | 
 |  | 
 |    You will usually want to keep the ``Kind`` member encapsulated and | 
 |    private, but make the enum ``ShapeKind`` public and provide a | 
 |    ``getKind()`` method. This is convenient for clients because they can | 
 |    then do a ``switch`` over the enum. | 
 |  | 
 |    A common naming convention is that these enums are "kind"s, to avoid | 
 |    ambiguity with the words "type" or "class" which have overloaded meanings | 
 |    in many contexts within LLVM. Sometimes there will be a natural name for | 
 |    it, like "opcode". Don't bikeshed over this; when in doubt use ``Kind``. | 
 |  | 
 |    You might wonder why the ``ShapeKind`` enum doesn't have an entry for | 
 |    ``Shape``. The reason is that since ``Shape`` is abstract | 
 |    (``computeArea() = 0;``), you will never have objects whose dynamic | 
 |    type is exactly ``Shape`` (only subclasses). See `Concrete Bases | 
 |    and Deeper Hierarchies`_ for information on how to deal with | 
 |    non-abstract bases. It's worth mentioning here that unlike | 
 |    ``dynamic_cast<>``, LLVM-style RTTI can be used (and is often used) for | 
 |    classes that don't have vtables. | 
 |  | 
 | #. Next, you need to make sure that the ``Kind`` gets initialized to the | 
 |    value corresponding to the dynamic type of the object. Typically, you | 
 |    will want to have it be an argument to the constructor of the base class, | 
 |    and then pass in the respective ``SK_*`` value from subclass constructors. | 
 |  | 
 |    Here is the code after that change: | 
 |  | 
 |    .. code-block:: c++ | 
 |  | 
 |        class Shape { | 
 |        public: | 
 |          /// Discriminator for LLVM-style RTTI (dyn_cast<> et al.) | 
 |          enum ShapeKind { | 
 |            SK_Square, | 
 |            SK_Circle | 
 |          }; | 
 |        private: | 
 |          const ShapeKind Kind; | 
 |        public: | 
 |          ShapeKind getKind() const { return Kind; } | 
 |  | 
 |       -  Shape() {} | 
 |       +  Shape(ShapeKind K) : Kind(K) {} | 
 |          virtual double computeArea() = 0; | 
 |        }; | 
 |  | 
 |        class Square : public Shape { | 
 |          double SideLength; | 
 |        public: | 
 |       -  Square(double S) : SideLength(S) {} | 
 |       +  Square(double S) : Shape(SK_Square), SideLength(S) {} | 
 |          double computeArea() override; | 
 |        }; | 
 |  | 
 |        class Circle : public Shape { | 
 |          double Radius; | 
 |        public: | 
 |       -  Circle(double R) : Radius(R) {} | 
 |       +  Circle(double R) : Shape(SK_Circle), Radius(R) {} | 
 |          double computeArea() override; | 
 |        }; | 
 |  | 
 | #. Finally, you need to inform LLVM's RTTI templates how to dynamically | 
 |    determine the type of an object (i.e. whether the ``isa<>``/``dyn_cast<>`` | 
 |    should succeed). The default way to accomplish this, which covers nearly | 
 |    all use cases, is a small static member function ``classof``. To give | 
 |    the explanation proper context, we show the code first and then | 
 |    describe each part below: | 
 |  | 
 |    .. code-block:: c++ | 
 |  | 
 |        class Shape { | 
 |        public: | 
 |          /// Discriminator for LLVM-style RTTI (dyn_cast<> et al.) | 
 |          enum ShapeKind { | 
 |            SK_Square, | 
 |            SK_Circle | 
 |          }; | 
 |        private: | 
 |          const ShapeKind Kind; | 
 |        public: | 
 |          ShapeKind getKind() const { return Kind; } | 
 |  | 
 |          Shape(ShapeKind K) : Kind(K) {} | 
 |          virtual double computeArea() = 0; | 
 |        }; | 
 |  | 
 |        class Square : public Shape { | 
 |          double SideLength; | 
 |        public: | 
 |          Square(double S) : Shape(SK_Square), SideLength(S) {} | 
 |          double computeArea() override; | 
 |       + | 
 |       +  static bool classof(const Shape *S) { | 
 |       +    return S->getKind() == SK_Square; | 
 |       +  } | 
 |        }; | 
 |  | 
 |        class Circle : public Shape { | 
 |          double Radius; | 
 |        public: | 
 |          Circle(double R) : Shape(SK_Circle), Radius(R) {} | 
 |          double computeArea() override; | 
 |       + | 
 |       +  static bool classof(const Shape *S) { | 
 |       +    return S->getKind() == SK_Circle; | 
 |       +  } | 
 |        }; | 
 |  | 
 |    The job of ``classof`` is to dynamically determine whether an object of | 
 |    a base class is in fact of a particular derived class. In order to | 
 |    downcast an object of type ``Base`` to type ``Derived``, there needs to | 
 |    be a ``classof`` in ``Derived`` which accepts a pointer to ``Base``. | 
 |  | 
 |    To be concrete, consider the following code: | 
 |  | 
 |    .. code-block:: c++ | 
 |  | 
 |       Shape *S = ...; | 
 |       if (isa<Circle>(S)) { | 
 |         /* do something ... */ | 
 |       } | 
 |  | 
 |    After template instantiation and some other machinery, the ``isa<>`` | 
 |    test in this code eventually boils down to a check roughly like | 
 |    ``Circle::classof(S)``. For more information, see | 
 |    :ref:`classof-contract`. | 
 |  | 
 |    The argument to ``classof`` should always be an *ancestor* class because | 
 |    the implementation has logic to allow and optimize away | 
 |    upcasts/up-``isa<>``'s automatically. It is as though every class | 
 |    ``Foo`` automatically has a ``classof`` like: | 
 |  | 
 |    .. code-block:: c++ | 
 |  | 
 |       class Foo { | 
 |         [...] | 
 |         template <class T> | 
 |         static bool classof(const T *, | 
 |                             typename ::std::enable_if< | 
 |                               ::std::is_base_of<Foo, T>::value | 
 |                             >::type * = nullptr) { return true; } | 
 |         [...] | 
 |       }; | 
 |  | 
 |    Note that this is the reason that we did not need to introduce a | 
 |    ``classof`` into ``Shape``: all relevant classes derive from ``Shape``, | 
 |    and ``Shape`` itself is abstract (has no entry in the ``Kind`` enum), | 
 |    so this notional inferred ``classof`` is all we need. See `Concrete | 
 |    Bases and Deeper Hierarchies`_ for more information about how to extend | 
 |    this example to more general hierarchies. | 
 |  | 
 | Although for this small example setting up LLVM-style RTTI seems like a lot | 
 | of "boilerplate", if your classes are doing anything interesting then this | 
 | will end up being a tiny fraction of the code. | 
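
Putting the four steps together, the following is a minimal, self-contained
sketch of the finished hierarchy. So that it builds without LLVM, it uses
hypothetical stand-ins ``sketch::isa`` and ``sketch::dyn_cast`` in place of
the real templates from ``llvm/Support/Casting.h``; the stand-ins, the
``kindName`` helper, ``getRadius()``, the virtual destructor, and the inline
``computeArea()`` bodies are assumptions made for illustration only.

.. code-block:: c++

   #include <cassert>

   // Hypothetical stand-ins for llvm::isa<> and llvm::dyn_cast<>. The real
   // templates do considerably more (references, const, automatic upcasts).
   namespace sketch {
   template <class To, class From> bool isa(const From *V) {
     return To::classof(V);
   }
   // Only handles non-const pointers; enough for this illustration.
   template <class To, class From> To *dyn_cast(From *V) {
     return isa<To>(V) ? static_cast<To *>(V) : nullptr;
   }
   } // namespace sketch

   class Shape {
   public:
     /// Discriminator for LLVM-style RTTI.
     enum ShapeKind { SK_Square, SK_Circle };

   private:
     const ShapeKind Kind;

   public:
     explicit Shape(ShapeKind K) : Kind(K) {}
     virtual ~Shape() = default;
     ShapeKind getKind() const { return Kind; }
     virtual double computeArea() = 0;
   };

   class Square : public Shape {
     double SideLength;

   public:
     explicit Square(double S) : Shape(SK_Square), SideLength(S) {}
     double computeArea() override { return SideLength * SideLength; }
     static bool classof(const Shape *S) { return S->getKind() == SK_Square; }
   };

   class Circle : public Shape {
     double Radius;

   public:
     explicit Circle(double R) : Shape(SK_Circle), Radius(R) {}
     double getRadius() const { return Radius; }
     double computeArea() override { return 3.14159265358979 * Radius * Radius; }
     static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }
   };

   // Because ShapeKind is public, clients can also switch over it directly.
   inline const char *kindName(const Shape &S) {
     switch (S.getKind()) {
     case Shape::SK_Square: return "square";
     case Shape::SK_Circle: return "circle";
     }
     return "unknown";
   }

With these definitions, given a ``Shape *S`` that points at a ``Circle``,
``sketch::isa<Circle>(S)`` is true, ``sketch::dyn_cast<Square>(S)`` yields a
null pointer, and ``sketch::dyn_cast<Circle>(S)->getRadius()`` reaches the
derived-class member.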
 |  | 
 | Concrete Bases and Deeper Hierarchies | 
 | ===================================== | 
 |  | 
 | For concrete bases (i.e. non-abstract interior nodes of the inheritance | 
 | tree), the ``Kind`` check inside ``classof`` needs to be a bit more | 
 | complicated. The situation differs from the example above in that | 
 |  | 
 | * Since the class is concrete, it must itself have an entry in the ``Kind`` | 
 |   enum because it is possible to have objects with this class as a dynamic | 
 |   type. | 
 |  | 
 | * Since the class has children, the check inside ``classof`` must take them | 
 |   into account. | 
 |  | 
 | Say that ``SpecialSquare`` and ``OtherSpecialSquare`` derive | 
 | from ``Square``, and so ``ShapeKind`` becomes: | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |     enum ShapeKind { | 
 |       SK_Square, | 
 |    +  SK_SpecialSquare, | 
 |    +  SK_OtherSpecialSquare, | 
 |       SK_Circle | 
 |     }; | 
 |  | 
 | Then in ``Square``, we would need to modify the ``classof`` like so: | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |    -  static bool classof(const Shape *S) { | 
 |    -    return S->getKind() == SK_Square; | 
 |    -  } | 
 |    +  static bool classof(const Shape *S) { | 
 |    +    return S->getKind() >= SK_Square && | 
 |    +           S->getKind() <= SK_OtherSpecialSquare; | 
 |    +  } | 
 |  | 
 | The reason that we need to test a range like this instead of just equality | 
 | is that a ``SpecialSquare`` and an ``OtherSpecialSquare`` each "is-a" | 
 | ``Square``, and so ``classof`` needs to return ``true`` for them. | 
 |  | 
 | This approach can be made to scale to arbitrarily deep hierarchies. The | 
 | trick is that you arrange the enum values so that they correspond to a | 
 | preorder traversal of the class hierarchy tree. With that arrangement, all | 
 | subclass tests can be done with two comparisons as shown above. If you just | 
 | list the class hierarchy like a list of bullet points, you'll get the | 
 | ordering right:: | 
 |  | 
 |    | Shape | 
 |      | Square | 
 |        | SpecialSquare | 
 |        | OtherSpecialSquare | 
 |      | Circle | 
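
As a minimal sketch of the preorder arrangement, the hierarchy above could be
written as follows. The protected ``Square(ShapeKind)`` constructor, which
lets subclasses pass their own kind up to ``Shape``, is an assumption of this
sketch and not something the text above prescribes. The classes are
deliberately non-virtual to show that no vtable is needed.

.. code-block:: c++

   #include <cassert>

   class Shape {
   public:
     // Enumerators follow a preorder walk of the class hierarchy.
     enum ShapeKind {
       SK_Square,
       SK_SpecialSquare,
       SK_OtherSpecialSquare,
       SK_Circle
     };
     explicit Shape(ShapeKind K) : Kind(K) {}
     ShapeKind getKind() const { return Kind; }

   private:
     const ShapeKind Kind;
   };

   class Square : public Shape {
   public:
     Square() : Shape(SK_Square) {}
     // Accepts Square itself and every class derived from it.
     static bool classof(const Shape *S) {
       return S->getKind() >= SK_Square &&
              S->getKind() <= SK_OtherSpecialSquare;
     }

   protected:
     // Hypothetical hook so that subclasses can record their own kind.
     explicit Square(ShapeKind K) : Shape(K) {}
   };

   class SpecialSquare : public Square {
   public:
     SpecialSquare() : Square(SK_SpecialSquare) {}
     static bool classof(const Shape *S) {
       return S->getKind() == SK_SpecialSquare;
     }
   };

   class OtherSpecialSquare : public Square {
   public:
     OtherSpecialSquare() : Square(SK_OtherSpecialSquare) {}
     static bool classof(const Shape *S) {
       return S->getKind() == SK_OtherSpecialSquare;
     }
   };

   class Circle : public Shape {
   public:
     Circle() : Shape(SK_Circle) {}
     static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }
   };

Here ``Square::classof`` returns ``true`` for all three square classes and
``false`` for a ``Circle``, while each leaf class still uses a plain equality
test.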
 |  | 
 | A Bug to be Aware Of | 
 | -------------------- | 
 |  | 
 | The example just given opens the door to bugs where the ``classof``\s are | 
 | not updated to match the ``Kind`` enum when classes are added to or | 
 | removed from the hierarchy. | 
 |  | 
 | Continuing the example above, suppose we add a ``SomewhatSpecialSquare`` as | 
 | a subclass of ``Square``, and update the ``ShapeKind`` enum like so: | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |     enum ShapeKind { | 
 |       SK_Square, | 
 |       SK_SpecialSquare, | 
 |       SK_OtherSpecialSquare, | 
 |    +  SK_SomewhatSpecialSquare, | 
 |       SK_Circle | 
 |     }; | 
 |  | 
 | Now, suppose that we forget to update ``Square::classof()``, so it still | 
 | looks like: | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |    static bool classof(const Shape *S) { | 
 |      // BUG: Returns false when S->getKind() == SK_SomewhatSpecialSquare, | 
 |      // even though SomewhatSpecialSquare "is a" Square. | 
 |      return S->getKind() >= SK_Square && | 
 |             S->getKind() <= SK_OtherSpecialSquare; | 
 |    } | 
 |  | 
 | As the comment indicates, this code contains a bug. A straightforward and | 
 | non-clever way to avoid this is to introduce an explicit ``SK_LastSquare`` | 
 | entry in the enum when adding the first subclass(es). No class uses this | 
 | entry as its ``Kind``; it only marks the end of the ``Square`` range. For | 
 | example, we could rewrite the example at the beginning of `Concrete Bases | 
 | and Deeper Hierarchies`_ as: | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |     enum ShapeKind { | 
 |       SK_Square, | 
 |    +  SK_SpecialSquare, | 
 |    +  SK_OtherSpecialSquare, | 
 |    +  SK_LastSquare, | 
 |       SK_Circle | 
 |     }; | 
 |    ... | 
 |    // Square::classof() | 
 |    -  static bool classof(const Shape *S) { | 
 |    -    return S->getKind() == SK_Square; | 
 |    -  } | 
 |    +  static bool classof(const Shape *S) { | 
 |    +    return S->getKind() >= SK_Square && | 
 |    +           S->getKind() <= SK_LastSquare; | 
 |    +  } | 
 |  | 
 | Then, adding new subclasses is easy: | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |     enum ShapeKind { | 
 |       SK_Square, | 
 |       SK_SpecialSquare, | 
 |       SK_OtherSpecialSquare, | 
 |    +  SK_SomewhatSpecialSquare, | 
 |       SK_LastSquare, | 
 |       SK_Circle | 
 |     }; | 
 |  | 
 | Notice that ``Square::classof`` does not need to be changed. | 
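
The sentinel approach can be sketched as a compilable fragment. The
protected ``Square(ShapeKind)`` constructor and the trailing ``static_assert``
are additions of this sketch, not part of the scheme described above; the
assertion is one possible extra guard against placing a new square kind after
the sentinel.

.. code-block:: c++

   #include <cassert>

   class Shape {
   public:
     enum ShapeKind {
       SK_Square,
       SK_SpecialSquare,
       SK_OtherSpecialSquare,
       SK_SomewhatSpecialSquare, // added later; no classof had to change
       SK_LastSquare,            // sentinel: marks the end of the Square range
       SK_Circle
     };
     explicit Shape(ShapeKind K) : Kind(K) {}
     ShapeKind getKind() const { return Kind; }

   private:
     const ShapeKind Kind;
   };

   class Square : public Shape {
   public:
     Square() : Shape(SK_Square) {}
     // The upper bound is the sentinel, so new subclasses are covered
     // automatically as long as they are listed before it.
     static bool classof(const Shape *S) {
       return S->getKind() >= SK_Square && S->getKind() <= SK_LastSquare;
     }

   protected:
     // Hypothetical hook so that subclasses can record their own kind.
     explicit Square(ShapeKind K) : Shape(K) {}
   };

   class SomewhatSpecialSquare : public Square {
   public:
     SomewhatSpecialSquare() : Square(SK_SomewhatSpecialSquare) {}
     static bool classof(const Shape *S) {
       return S->getKind() == SK_SomewhatSpecialSquare;
     }
   };

   class Circle : public Shape {
   public:
     Circle() : Shape(SK_Circle) {}
     static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }
   };

   // Optional guard: fails to compile if this kind is moved past the sentinel.
   static_assert(Shape::SK_SomewhatSpecialSquare < Shape::SK_LastSquare,
                 "Square subclasses must precede SK_LastSquare");

Since no object ever carries ``SK_LastSquare`` as its kind, the comparison
could equally use ``<`` instead of ``<=``; the sketch keeps ``<=`` to match
the code shown above.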
 |  | 
 | .. _classof-contract: | 
 |  | 
 | The Contract of ``classof`` | 
 | --------------------------- | 
 |  | 
 | To be more precise, let ``classof`` be inside a class ``C``.  Then the | 
 | contract for ``classof`` is "return ``true`` if the dynamic type of the | 
 | argument is-a ``C``".  As long as your implementation fulfills this | 
 | contract, you can tweak and optimize it as much as you want. | 
 |  | 
 | For example, LLVM-style RTTI can work fine in the presence of | 
 | multiple-inheritance by defining an appropriate ``classof``. | 
 | An example of this in practice is | 
 | `Decl <http://clang.llvm.org/doxygen/classclang_1_1Decl.html>`_ vs. | 
 | `DeclContext <http://clang.llvm.org/doxygen/classclang_1_1DeclContext.html>`_ | 
 | inside Clang. | 
 | The ``Decl`` hierarchy is set up very similarly to the example | 
 | demonstrated in this tutorial. | 
 | The key part is how to then incorporate ``DeclContext``: all that is needed | 
 | is in ``bool DeclContext::classof(const Decl *)``, which asks the question | 
 | "Given a ``Decl``, how can I determine if it is-a ``DeclContext``?". | 
 | It answers this with a simple switch over the set of ``Decl`` "kinds", | 
 | returning true for the ones that are known to be ``DeclContext``\s. | 
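
The same idea can be sketched on a toy scale. Everything below is
hypothetical and is not Clang's actual code: ``Node`` plays the role of
``Decl``, ``Scope`` plays the role of ``DeclContext``, and ``asScope`` is an
invented helper showing one way to obtain the second base once ``classof``
has answered the question.

.. code-block:: c++

   #include <cassert>

   class Node {
   public:
     enum NodeKind { NK_Variable, NK_Function, NK_Namespace };
     explicit Node(NodeKind K) : Kind(K) {}
     NodeKind getKind() const { return Kind; }

   private:
     const NodeKind Kind;
   };

   // A second, independent base class: it does not derive from Node.
   class Scope {
   public:
     int NumMembers = 0;

     // "Given a Node, is it also a Scope?" Answered by switching over the
     // Node kinds and returning true for those known to inherit from Scope.
     static bool classof(const Node *N) {
       switch (N->getKind()) {
       case Node::NK_Function:
       case Node::NK_Namespace:
         return true;
       case Node::NK_Variable:
         return false;
       }
       return false;
     }
   };

   class Variable : public Node {
   public:
     Variable() : Node(NK_Variable) {}
     static bool classof(const Node *N) { return N->getKind() == NK_Variable; }
   };

   class Function : public Node, public Scope {
   public:
     Function() : Node(NK_Function) {}
     static bool classof(const Node *N) { return N->getKind() == NK_Function; }
   };

   class Namespace : public Node, public Scope {
   public:
     Namespace() : Node(NK_Namespace) {}
     static bool classof(const Node *N) { return N->getKind() == NK_Namespace; }
   };

   // Node and Scope are unrelated types, so the conversion has to go through
   // the concrete class, which lets the compiler adjust the pointer.
   inline Scope *asScope(Node *N) {
     switch (N->getKind()) {
     case Node::NK_Function:
       return static_cast<Function *>(N);
     case Node::NK_Namespace:
       return static_cast<Namespace *>(N);
     case Node::NK_Variable:
       return nullptr;
     }
     return nullptr;
   }

The point of the sketch is only that ``Scope::classof`` fulfills the contract
("the dynamic type of the argument is-a ``Scope``") even though ``Scope`` is
not an ancestor or descendant of ``Node``.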
 |  | 
 | .. TODO:: | 
 |  | 
 |    Touch on some of the more advanced features, like ``isa_impl`` and | 
 |    ``simplify_type``. However, those two need reference documentation in | 
 |    the form of doxygen comments as well. We need the doxygen so that we can | 
 |    say "for full details, see http://llvm.org/doxygen/..." | 
 |  | 
 | Rules of Thumb | 
 | ============== | 
 |  | 
 | #. The ``Kind`` enum should have one entry per concrete class, ordered | 
 |    according to a preorder traversal of the inheritance tree. | 
 | #. The argument to ``classof`` should be a ``const Base *``, where ``Base`` | 
 |    is some ancestor in the inheritance hierarchy. The argument should | 
 |    *never* be a derived class or the class itself: the template machinery | 
 |    for ``isa<>`` already handles this case and optimizes it. | 
 | #. For each class in the hierarchy that has no children, implement a | 
 |    ``classof`` that checks only against its ``Kind``. | 
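 | #. For each class in the hierarchy that has children, implement a | 
 |    ``classof`` that checks a range of ``Kind`` values: from the class's own | 
 |    ``Kind`` (or its first child's ``Kind`` if the class is abstract) through | 
 |    the ``Kind`` of its last descendant in the preorder. | 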
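
These rules also cover an abstract class in the middle of the hierarchy: it
gets no enum entry of its own, so its range starts at its first child's
``Kind``. A minimal sketch, using a hypothetical hierarchy (``Polygon``,
``Triangle``, ``Rectangle``, ``Ellipse``) that does not appear elsewhere in
this document:

.. code-block:: c++

   #include <cassert>

   class Shape {
   public:
     // One entry per concrete class, in preorder. Shape and Polygon are
     // never instantiated directly, so they have no entry.
     enum ShapeKind { SK_Triangle, SK_Rectangle, SK_Ellipse };
     ShapeKind getKind() const { return Kind; }
     virtual ~Shape() = default;

   protected:
     explicit Shape(ShapeKind K) : Kind(K) {}

   private:
     const ShapeKind Kind;
   };

   class Polygon : public Shape {
   public:
     virtual int numSides() const = 0;
     // Abstract class with children: range from first to last child.
     static bool classof(const Shape *S) {
       return S->getKind() >= SK_Triangle && S->getKind() <= SK_Rectangle;
     }

   protected:
     explicit Polygon(ShapeKind K) : Shape(K) {}
   };

   class Triangle : public Polygon {
   public:
     Triangle() : Polygon(SK_Triangle) {}
     int numSides() const override { return 3; }
     // Leaf class: plain equality check.
     static bool classof(const Shape *S) { return S->getKind() == SK_Triangle; }
   };

   class Rectangle : public Polygon {
   public:
     Rectangle() : Polygon(SK_Rectangle) {}
     int numSides() const override { return 4; }
     static bool classof(const Shape *S) { return S->getKind() == SK_Rectangle; }
   };

   class Ellipse : public Shape {
   public:
     Ellipse() : Shape(SK_Ellipse) {}
     static bool classof(const Shape *S) { return S->getKind() == SK_Ellipse; }
   };

Every ``classof`` takes a ``const Shape *``, an ancestor of the class it
belongs to, in line with the second rule.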