How "super" Works In Python
This is from Guido's paper "Unifying types and classes in Python 2.2".
Cooperative methods and "super"
One of the coolest, but perhaps also one of the most unusual features of the new classes is the possibility to write "cooperative" classes. Cooperative classes are written with multiple inheritance in mind, using a pattern that I call a "cooperative super call". This is known in some other multiple-inheritance languages as "call-next-method", and is more powerful than the super call found in single-inheritance languages like Java or Smalltalk. C++ has neither form of super call, relying instead on an explicit mechanism similar to that used in classic Python. (The term "cooperative method" comes from "Putting Metaclasses to Work".)
As a refresher, let's first review the traditional, non-cooperative super call. When a class C derives from a base class B, C often wants to override a method m defined in B. A "super call" occurs when C's definition of m calls B's definition of m to do some of its work. In Java, the body of m in C can write super(a, b, c) to call B's definition of m with argument list (a, b, c). In Python, C.m writes B.m(self, a, b, c) to accomplish the same effect. For example:
    class B:
        def m(self):
            print("B here")

    class C(B):
        def m(self):
            print("C here")
            B.m(self)

We say that C's method m "extends" B's method m. The pattern here works well as long as we're using single inheritance, but it breaks down with multiple inheritance. Let's look at four classes whose inheritance diagram forms a "diamond" (the same diagram was shown graphically in the previous section):

    class A(object): ...
    class B(A): ...
    class C(A): ...
    class D(B, C): ...

Suppose A defines a method m, which is extended by both B and C. Now what is D to do? It inherits two implementations of m, one from B and one from C. Traditionally, Python simply picks the first one found, in this case the definition from B. This is not ideal, because this completely ignores C's definition. To see what's wrong with ignoring C's m, assume that these classes represent some kind of persistent container hierarchy, and consider a method that implements the operation "save your data to disk". Presumably, a D instance has both B's data and C's data, as well as A's data (a single copy of the latter). Ignoring C's definition of the save method would mean that a D instance, when requested to save itself, only saves the A and B parts of its data, but not the part of its data defined by class C!
C++ notices that D inherits two conflicting definitions of method m, and issues an error message. The author of D is then supposed to override m to resolve the conflict. But what is D's definition of m supposed to do? It can call B's m followed by C's m, but because both definitions call the definition of m inherited from A, A's m ends up being called twice! Depending on the details of the operation, this is at best an inefficiency (when m is idempotent), at worst an error. Classic Python has the same problem, except it doesn't even consider it an error to inherit two conflicting definitions of a method: it simply picks the first one.
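The two failure modes described above can be reproduced in a few lines. This is only a sketch in current Python 3 syntax: the calls list is a hypothetical stand-in for "save to disk", and D1 and D2 are illustrative names for the two ways D might be written.

```python
calls = []  # hypothetical log standing in for "save your data to disk"

class A:
    def m(self):
        calls.append("A")

class B(A):
    def m(self):
        calls.append("B")
        A.m(self)          # explicit, non-cooperative call to the base class

class C(A):
    def m(self):
        calls.append("C")
        A.m(self)

class D1(B, C):
    pass                   # inherits m from B; C.m is never reached

class D2(B, C):
    def m(self):
        calls.append("D2")
        B.m(self)          # runs B's part, then A's
        C.m(self)          # runs C's part, then A's a second time

D1().m()
inherited = list(calls)    # C is missing from this log

calls.clear()
D2().m()
explicit = list(calls)     # A appears twice in this log

print(inherited)
print(explicit)
```

With D1, C's part is silently skipped; with D2, every part runs but A's runs twice.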
The traditional solution to this dilemma is to split each derived definition of m into two parts: a partial implementation _m, which only saves the data that is unique to one class, and a full implementation m, which calls its own _m and that of the base class(es). For example:
    class A(object):
        def m(self): "save A's data"

    class B(A):
        def _m(self): "save B's data"
        def m(self): self._m(); A.m(self)

    class C(A):
        def _m(self): "save C's data"
        def m(self): self._m(); A.m(self)

    class D(B, C):
        def _m(self): "save D's data"
        def m(self): self._m(); B._m(self); C._m(self); A.m(self)

There are several problems with this pattern. First of all, there is the proliferation of extra methods and calls. But perhaps more importantly, it creates an undesirable dependency in the derived classes on details of the dependency graph of their base classes: the existence of A can no longer be considered an implementation detail of B and C, since class D needs to know about it. If, in a future version of the program, we want to remove the dependency on A from B and C, this will affect derived classes like D as well; likewise, if we want to add another base class AA to B and C, all their derived classes must be updated as well.
The "call-next-method" pattern solves this problem nicely, in combination with the new method resolution order. Here's how:
    class A(object):
        def m(self): "save A's data"

    class B(A):
        def m(self): "save B's data"; super(B, self).m()

    class C(A):
        def m(self): "save C's data"; super(C, self).m()

    class D(B, C):
        def m(self): "save D's data"; super(D, self).m()

Note that the first argument to super is always the class in which it occurs; the second argument is always self. Also note that self is not repeated in the argument list for m.
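To see the cooperative version run, the docstring placeholders can be replaced with something observable. The sketch below assumes a hypothetical calls list that records which class's m ran; everything else follows the pattern above and uses the same two-argument form of super.

```python
calls = []  # hypothetical log; each class records its name instead of saving data

class A(object):
    def m(self):
        calls.append("A")          # original definition: no super call here

class B(A):
    def m(self):
        calls.append("B")
        super(B, self).m()         # hand off to whatever follows B in the MRO

class C(A):
    def m(self):
        calls.append("C")
        super(C, self).m()

class D(B, C):
    def m(self):
        calls.append("D")
        super(D, self).m()

D().m()
print(calls)
```

Each class's m runs exactly once, in the order D, B, C, A. In Python 3 the arguments may be omitted inside a method body, so super().m() is accepted there as a shorthand for the explicit form.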
Now, in order to explain how super works, consider the MRO for each of these classes. The MRO is given by the __mro__ class attribute:
    A.__mro__ == (A, object)
    B.__mro__ == (B, A, object)
    C.__mro__ == (C, A, object)
    D.__mro__ == (D, B, C, A, object)

The expression super(C, self).m should only be used inside the implementation of method m in class C. Bear in mind that while self is an instance of C, self.__class__ may not be C: it may be a class derived from C (for example, D). The expression super(C, self).m, then, searches self.__class__.__mro__ (the MRO of the class that was used to create the instance in self) for the occurrence of C, and then starts looking for an implementation of method m following that point.
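The search just described can be approximated by hand. The helper next_method below is hypothetical and deliberately simplified: it handles only plain methods and ignores the error handling of the real super object. It is meant to illustrate the lookup rule, not to describe how super is implemented.

```python
def next_method(cls, obj, name):
    """Hypothetical helper approximating getattr(super(cls, obj), name)."""
    mro = type(obj).__mro__
    start = mro.index(cls) + 1           # position just after cls in the MRO
    for klass in mro[start:]:
        if name in klass.__dict__:       # only what klass itself defines
            # Bind the plain function to obj, as attribute lookup would.
            return klass.__dict__[name].__get__(obj, type(obj))
    raise AttributeError(name)

class A(object):
    def m(self): return "A.m"

class B(A):
    def m(self): return "B.m"

class C(A):
    def m(self): return "C.m"

class D(B, C):
    def m(self): return "D.m"

for cls, obj in [(C, C()), (B, B()), (B, D()), (C, D())]:
    found = next_method(cls, obj, "m")()
    # The hand-rolled search agrees with the built-in super object.
    assert found == getattr(super(cls, obj), "m")()
    print(cls.__name__, type(obj).__name__, "->", found)
```

Starting from B, the search lands on A.m for a B instance but on C.m for a D instance, because the MRO being walked belongs to the instance's class, not to B.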
For example, if self is a C instance, super(C, self).m will find A's implementation of m, as will super(B, self).m if self is a B instance. But now consider a D instance. In D's m, super(D, self).m() will find and call B.m(self), since B is the first base class following D in D.__mro__ that defines m. Now in B.m, super(B, self).m() is called. Since self is a D instance, the MRO is (D, B, C, A, object) and the class following B is C. This is where the search for a definition of m continues. This finds C.m, which is called, and in turn calls super(C, self).m(). Still using the same MRO, we see that the class following C is A, and thus A.m is called. This is the original definition of m, so no super call is made at this point.
Note how the same super expression finds a different class implementing a method depending on the class of self! This is the crux of the cooperative super mechanism.
Quite cool indeed.