The difference between __getattribute__ and __getattr__

The other day I was teaching Python meta-programming to a workmate. I think it’s a good way to learn about higher-order functions, since meta-programming makes extensive use of closures, function builders, decorators… I was trying to build a proof of concept of a very, very generic REST connector. Here is my first (and wrong) attempt:

class RESTConn(object):

  def __init__(self, entry_point):
    self.entry_point = entry_point

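  def method_builder(self, method_name):
    # 'get_from_articles' -> verb 'get', collection 'articles'
    verb, _, collection = method_name.split('_', 2)
    def do_verb(payload=None, **kwargs):
      uri = self.make_uri(collection)
      querystring = self.make_querystring(kwargs)
      # A parenthesized single argument prints the same in Python 2 and 3
      print('%s %s' % (verb.upper(), self.combine(uri, querystring)))
      if payload:
        print(payload)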

    return do_verb

  def make_uri(self, collection):
    return '/'.join([self.entry_point, collection])

  def make_querystring(self, kwargs):
    # '%s=%s' also accepts non-string values such as page=2
    return '&'.join(['%s=%s' % pair for pair in kwargs.items()])

  def combine(self, uri, querystring):
    if querystring:
      # The query string is separated from the path by '?', not '&'
      return '?'.join([uri, querystring])

    return uri

  def __getattribute__(self, name):
    if not hasattr(self, name):
      method = self.method_builder(name)
      setattr(self, name, method)

    return super(RESTConn, self).__getattribute__(name)

Try this example by instantiating a new connector and calling something like:

c = RESTConn('unoyunodiez.com')
c.get_from_articles()

The program falls into an infinite recursion and does nothing useful before crashing. Why?

There are three problems here. First and most important is using __getattribute__(), second is using hasattr() and third is accessing self.method_builder().

The object’s __getattribute__() method is used to retrieve an attribute from an instance. It intercepts every attempt to access an instance attribute, whether through dot notation or the getattr() built-in function. Unless it was overridden, the expression c.get_from_articles is translated into object.__getattribute__(c, 'get_from_articles'). The default implementation first checks the class and its bases for a data descriptor with that name, then looks into the instance’s namespace, then into the class namespace, then into each base’s namespace and so on. Finally, if the name is not found, it raises AttributeError; Python then calls the fallback __getattr__() method if the class defines one, and otherwise the AttributeError propagates to the caller.
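A small experiment makes the lookup order visible. This is only a sketch, written for Python 3, and the names Demo, locked and kind are made up for the illustration. It relies on one detail worth stating explicitly: a data descriptor defined on the class (here a property) is consulted before the instance’s own namespace, while a plain class attribute is not.

```python
class Demo:
    kind = 'class attribute'

    @property
    def locked(self):
        # A property is a data descriptor, so it wins over the instance dict.
        return 'property'

    def __getattr__(self, name):
        # Reached only after the normal lookup raised AttributeError.
        return 'fallback for ' + name


d = Demo()
# Write straight into the instance namespace (the property has no setter).
d.__dict__['locked'] = 'instance value'
d.__dict__['kind'] = 'instance value'

results = {
    'locked': d.locked,     # data descriptor beats the instance dict
    'kind': d.kind,         # instance dict beats a plain class attribute
    'missing': d.missing,   # nothing found, so __getattr__ answers
}
print(results)
```

The three values come back as 'property', 'instance value' and 'fallback for missing', one for each stage of the lookup.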

This is not a problem by itself but if you pay enough attention you’ll notice we are trying to create the new method only if the object does not have it yet. That is semantically the same as overriding __getattr__(), because __getattr__() is called only when the attribute was not found. So, even if we cannot explain the infinite recursion error yet, we can fix the class by replacing:

  def __getattribute__(self, name):
    if not hasattr(self, name):
      method = self.method_builder(name)
      setattr(self, name, method)

    return super(RESTConn, self).__getattribute__(name)

with:

  def __getattr__(self, name):
    method = self.method_builder(name)
    setattr(self, name, method)
    return getattr(self, name)

So, the difference between __getattribute__() and __getattr__() is that the first one is called unconditionally when an attribute is being retrieved from an instance while the second is called only when the attribute was not found.

But what about the infinite recursion? Why was the first example failing?

Ok, I said the problems were using __getattribute__(), hasattr() and self.method_builder(), but the real problem is using hasattr() and self.method_builder() inside __getattribute__(). According to the Python documentation, hasattr(self, name) is implemented by trying getattr(self, name) and catching exceptions. If an exception is caught, hasattr() returns False. But getattr() is an attempt to access an instance attribute, so __getattribute__() is called again. This leads to another call to hasattr(), which uses getattr(), which is another attempt to find an attribute, so it calls __getattribute__(), which calls hasattr(), which uses getattr(), and on and on and on… until a RuntimeError exception (maximum recursion depth exceeded) is raised. In Python 2, hasattr() swallows that exception too, so the hasattr() calls return False and the program can continue with the next statement. (Since Python 3.2, hasattr() only catches AttributeError, so there the recursion error propagates straight to the caller.)

The next statement is even more dangerous. It tries to access ‘method_builder’ on the instance self, so __getattribute__() is called again, which involves calling hasattr() again and then moving on to the next attempt to access ‘method_builder’. Can you see what is next? Of course, another call to __getattribute__()!
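The loop can be reproduced with a much smaller class. The sketch below assumes Python 3.5 or later, where hasattr() only swallows AttributeError and the interpreter reports the overflow as RecursionError, so the failure shows up at once instead of after a long series of retries. The class name Broken is made up.

```python
class Broken:
    def __getattribute__(self, name):
        # hasattr() calls getattr(), which calls this very method again.
        if not hasattr(self, name):
            return None
        return object.__getattribute__(self, name)


try:
    Broken().anything
    outcome = 'no error'
except RecursionError:
    outcome = 'RecursionError'
print(outcome)
```

Note that the recursion starts before any attribute is actually looked up: the first line of __getattribute__() is enough to trigger it.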

So, what is the correct way to implement this by using __getattribute__() instead of __getattr__()? The answer, below:

  def __getattribute__(self, name):
    try:
      method = super(RESTConn, self).__getattribute__(name)
    except AttributeError:
      method = super(RESTConn, self).__getattribute__('method_builder')(name)
      setattr(self, name, method)

    return method

As you can see, we first try to retrieve the method by calling the default implementation of __getattribute__(). If it is not found, we build a new method by fetching ‘method_builder’ through the default implementation as well, without going through RESTConn’s __getattribute__(), and so we avoid the infinite recursion. Finally we return the method. Simple, once you know what __getattribute__ is actually doing. ;)
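For reference, here is how the whole connector could look with that fix applied. Take it as a sketch under a few assumptions of mine rather than the definitive version: it targets Python 3, it returns the request line instead of printing it so the result can be checked, and it adds a hypothetical VERBS whitelist so that names which do not look like verb_from_collection raise AttributeError instead of silently becoming methods.

```python
class RESTConn:
    VERBS = ('get', 'post', 'put', 'delete')  # assumed whitelist, not in the original

    def __init__(self, entry_point):
        self.entry_point = entry_point

    def method_builder(self, method_name):
        parts = method_name.split('_', 2)
        if len(parts) != 3 or parts[0] not in self.VERBS:
            # Keeps hasattr() honest for names that are not REST calls.
            raise AttributeError(method_name)
        verb, _, collection = parts

        def do_verb(payload=None, **kwargs):
            uri = '/'.join([self.entry_point, collection])
            query = '&'.join('%s=%s' % pair for pair in sorted(kwargs.items()))
            target = uri + '?' + query if query else uri
            # Returned rather than printed (assumption, to make it testable).
            return '%s %s' % (verb.upper(), target)

        return do_verb

    def __getattribute__(self, name):
        default = super().__getattribute__
        try:
            return default(name)
        except AttributeError:
            # Fetch method_builder through the default lookup to avoid recursion.
            method = default('method_builder')(name)
            setattr(self, name, method)  # cache it for the next access
            return method


c = RESTConn('unoyunodiez.com')
print(c.get_from_articles(page=2))
print(hasattr(c, 'nonsense'))
```

The guard in method_builder() is a design choice: without it every name "exists" on the connector, which makes hasattr() always answer True.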

And that’s all. You can read more about the attribute lookup process in the Python data model. As a result of this little research I updated my presentations about meta-programming in Python and my former articles in Spanish.

Hope it helps!

One comment on “The difference between __getattribute__ and __getattr__”

  1. Hello, nice article.
    Thanks to it I now understand better how descriptors work.

    I want to clarify only one thing regarding how the __getattribute__ method is executed. You stated that:

    «The object’s method __getattribute__() is used to retrieve an attribute from an instance. It captures every attempt to access an instance attribute by using dot notation or getattr() built-in function. Unless it was overridden, the former expression is translated into object.__getattribute__(self, ‘get_from_article’). The default implementations looks into the instance’s namespace, then looks into the class namespace, then into each base’s namespace and so on. Finally, if not found, the default implementation calls the fallback __getattr__() method of the instance and it raises an AttributeError exception as default implementation.»

    If you use descriptors in your class things change; the lookup chain becomes:
    – Check whether a data descriptor with the same name (the attribute name) is defined in any class in the MRO (method resolution order)
    – Then look into the instance’s namespace
    – Then look into the class namespace
    – Then into each base’s namespace and so on.
    – Finally, if not found, AttributeError is raised, and the fallback __getattr__() method is called if the class defines one.

    This is the documentation of the C function behind the object.__getattribute__ method:

    «.. c:function:: PyObject* PyObject_GenericGetAttr(PyObject *o, PyObject *name)
    Generic attribute getter function that is meant to be put into a type
    object’s ``tp_getattro`` slot. It looks for a descriptor in the dictionary
    of classes in the object’s MRO as well as an attribute in the object’s
    :attr:`~object.__dict__` (if present). As outlined in :ref:`descriptors`,
    data descriptors take preference over instance attributes, while non-data
    descriptors don’t. Otherwise, an :exc:`AttributeError` is raised. «
