An Apprentice Experiment in Python Programming, Part 3

post by gilch, konstell (parsley) · 2021-08-16T04:42:31.948Z · LW · GW · 10 comments

Contents

  Registry
  Register, Part 2
  Python Objects in Memory
  Constants (Only One Copy in Memory)
  A More Difficult Puzzle
  Attempt 1
  Attempt 2
  Attempt 3
  Wasted Motion
  Passing in Fixtures in Any Order
  Dictionary Comprehension
  Docstrings
  Going Back to @run_with.fixture
  Magic Methods
  Solution 2
  More Magic Methods
  Loose Threads
  Observations

[Note to readers: We've converted this post into a Jupyter notebook so you can code along! (The notebook works on Firefox and Chrome, but may not work on all browsers.)] Let us know what you think!

[Epistemic Status: this post is reconstructed from primarily chat logs, so I am more confident in the accuracy of this post than the previous one which was reconstructed from more fragmented record.]

Previously: https://www.lesswrong.com/posts/jkaCF3yrfKvFQL4ym/an-apprentice-experiment-in-python-programming-part-2

Since the last pair-programming session, gilch and I have done some more Python programming.

Registry

After we talked about the solutions for the previous puzzle, gilch sent me the next one (gilch later clarified that I could put any code above the provided code):

registry = []
@register
def alice():
    print("Alice")
@register
def bob():
    print("Bob")
@register
def charlie():
    print("Charlie")
>>> for name in registry: name()
Alice
Bob
Charlie

This one was pretty straightforward after I had figured out the previous one. I saw that all the decorator register had to do was put the functions in the list:

def register(f):
    registry.append(f)

registry = []
@register
def alice():
    print("Alice")
@register
def bob():
    print("Bob")
@register
def charlie():
    print("Charlie")

While the output was exactly what I expected, I was a bit surprised that register could refer to registry before it was defined. Then I realized that the variable was not actually used until the first @register ran, by which point registry had already been assigned. Gilch confirmed that globals don't need to exist until they are actually used. We moved on to the next puzzle:

Register, Part 2

registry = {}
@register
def alice():
    print("Alice")
@register
def bob():
    print("Bob")
@register
def charlie():
    print("Charlie")
>>> for name, func in registry.items():
...     print(f"{name}() does:")
...     func()
...     print()
... 
alice() does:
Alice

bob() does:
Bob

charlie() does:
Charlie

>>> 

I played around in the Python shell and noticed the function name was in the string representation of the function object:

>>> alice
<function alice at 0x7fb5ad0db040>

So I used the first way I could think of to parse the function name in my solution:

def register(f):
    function_name = f.__str__().split(" ")[1]
    registry[function_name] = f
    return f

This gave me the output I wanted, but I asked gilch if there was a simpler way to get the function names. Gilch's solution was:

def register(f):
    registry[f.__name__] = f
    return f

I didn't know __name__ existed.

Gilch remarked,

Technically, you don't need the return f to make the tests pass, but it's an important point that a decorator could leave the function as-is and just have a side effect. If we didn't have the return line, they'd all be set to None in the module, but the registry would still have the originals.

Me:

Oh, it took me a moment to get why this was the case. At first I thought the functions were being passed in by value, but that didn't seem right. Then I thought they were passed in by reference, and the variables alice, bob, and charlie would still point to the functions when the decorator was run, but that still didn't explain why we were able to use the functions in the test case. I visualized the code on Python Tutor and realized that, upon reassignment, while alice, bob, and charlie were no longer pointing at the functions, the function objects themselves still existed, and were (only) accessible through the dictionary.
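The behavior gilch describes can be reproduced in a few lines. This is only a minimal sketch: register_no_return is a hypothetical variant of the decorator that deliberately omits return f, and the function returns a string instead of printing so the result is easy to check.

```python
registry = {}

def register_no_return(f):
    # Hypothetical variant of `register` that forgets `return f`.
    registry[f.__name__] = f

@register_no_return
def alice():
    return "Alice"

# The decorator returned None, so the module-level name is rebound to None...
print(alice)                 # None
# ...but the function object is still alive, reachable through the dict.
print(registry["alice"]())   # Alice
```

The function object survives because the dictionary still holds a reference to it, even though the name alice no longer does.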

Python Objects in Memory

Noticing my confusion, gilch started talking about how objects were stored in memory in Python:

Thinking about pass-by-value/reference in Python like you do in C will confuse you. Python has a simpler and more consistent model. The only things on the call stack are references to objects on the heap. There are no stack objects in Python. Calls always copy the references used as arguments to the next stack frame.

When asked about whether this also applied to small types like boolean or int:

Yes. In Java terminology, these are boxed. Python has no "primitives". The closest you can get to unboxed objects are numpy arrays.

I was surprised to learn this, because some pieces of my mental model had come from Python Tutor, which by default displays complex objects such as functions and arrays as references on the stack, but integers and strings as values on the stack.

Gilch noted that "Python Tutor has an option to 'render all objects on the heap (Python/Java)'."
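The "calls copy references" model can be checked with a short sketch. The function names mutate, rebind, and same_object are made up for illustration; the point is that mutation through a copied reference is visible to the caller, rebinding a local name is not, and even an int argument is the same object on both sides of the call.

```python
def mutate(items):
    # `items` is a copy of the caller's reference: same list object.
    items.append("mutated")

def rebind(items):
    # Rebinding the local name points it at a new object;
    # the caller's reference is untouched.
    items = ["rebound"]
    return items

shared = ["original"]
mutate(shared)
result = rebind(shared)

print(shared)   # ['original', 'mutated']
print(result)   # ['rebound']

def same_object(arg, expected_id):
    # Compare the identity seen inside the call with the caller's.
    return id(arg) == expected_id

n = 1000
print(same_object(n, id(n)))  # True
```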

Constants (Only One Copy in Memory)

Gilch:

I may be fuzzy on all of the details of Python's internals. But True and False are constants. There are only ever one of them each per interpreter session. You can get the memory address of a reference using the id() builtin.

>>> id(True)
140735313696872
>>> x = True
>>> id(x)
140735313696872
>>> def foo(z):
    print(id(z))
    def bar(y):
        print(id(y))
    bar(z)

>>> foo(x)
140735313696872
140735313696872

A More Difficult Puzzle

As before, the format of the puzzles is that I'm allowed to add any code before the given code snippet to produce the specified output.

@run_with.fixture
def foo():
    return [42,'eggs']

@run_with.fixture
def bar():
    return {'z':'Q', 'foo':2, 42:'forty-two', 'eggs':'spam'}

@run_with
def test1(foo, bar):
    while foo:
        del bar[foo.pop()]
    print(bar)

@run_with
def test2(foo, bar):
    for k in foo:
        print(bar[k])
>>> test1()
{'z': 'Q', 'foo': 2}
>>> test2()
forty-two
spam

Attempt 1

I saw @run_with.fixture, so I thought "class method", and when I saw @run_with, I thought I'd need to re-purpose a constructor.

class run_with:
    def __init__(self, f):
        self.fixtures = {}
        return f

    def fixture(self, f):
        self.fixtures[f.__name__] = f()

This did not work:

Traceback (most recent call last):
  File "code.py", line 11, in <module>
    def foo():
TypeError: fixture() missing 1 required positional argument: 'f'

So I thought, I'd just make everything a class method instead of an instance method:

class run_with:
    fixtures = {}  # took this out of __init__
    def __init__(f):  # attempted to make __init__ a class method by removing `self`
        return f

    def fixture(f):  #  attempted to make fixture a class method by removing `self`
        fixtures[f.__name__] = f()

Didn't work either:

Traceback (most recent call last):
  File "code.py", line 11, in <module>
    def foo():
  File "code.py", line 7, in fixture
    fixtures[f.__name__] = f()
NameError: name 'fixtures' is not defined

I think I was not making the class variable the right way. I also couldn't return anything from a constructor.

Gilch:

__init__ is not a constructor. That's a common misconception. By the time __init__ is called, there is already a self, so the object has been constructed already. __init__ is instead the default initializer. The constructor is actually __new__. __init__ is required to return None. I see that you're not using the terms "class method" and "instance method" correctly. We'll need to cover that later. That, or you understand what they mean, but not how to actually implement them. Either way. Making a class is one of the ways to solve this one, but you might be missing an important piece to do it that way.
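A small sketch of the point about __new__ and __init__, under the assumption that we only want to observe the order of the two hooks. Tracked and BadInit are hypothetical classes, not part of the puzzle.

```python
calls = []

class Tracked:
    def __new__(cls, *args):
        # Runs first and receives the class; it builds the instance.
        calls.append("__new__")
        return super().__new__(cls)

    def __init__(self, value):
        # Runs second, on the instance that already exists.
        calls.append("__init__")
        self.value = value

t = Tracked(7)
print(calls)    # ['__new__', '__init__']
print(t.value)  # 7

class BadInit:
    def __init__(self):
        return 42   # __init__ must return None

error = None
try:
    BadInit()
except TypeError as exc:
    error = str(exc)
print(error)
```

The second class shows why return f inside __init__ could not work: returning anything other than None from the initializer raises a TypeError.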

I had another idea:

class run_with:
    fixtures = {}
    def __new__(f):  # used __new__ instead of __init__
        return f

    def fixture(f):
        run_with.fixtures[f.__name__] = f()

This time we ran into problems when decorating the tests:

Traceback (most recent call last):
  File "/home/jas/code_stuff/python_scratch/02-run-with.py", line 19, in <module>
    def test1(foo, bar):
TypeError: __new__() takes 1 positional argument but 2 were given

I asked, "I thought __new__ was only taking in the function it's decorating, why does python tell me it's given 2 arguments?"

To which gilch answered,

__new__ is special-cased as a classmethod without having to be declared as such. Thus, its first argument is cls, rather than self. Of course, you can name the cls or self parameters anything you want. These names are just (very strong) conventions. In this case, you named it f. The interpreter doesn't care though.

Me:

So, if we're interpreting the . in @run_with.fixture as a method, then I can also make run_with an object, however that still doesn't solve the problem of run_with also being a callable that takes in a function. The other possibility I can think of is to use modules, like functools.partial. But still, I don't see how to make the module itself a callable.

Gilch:

There are some other possibilities. I do know of a way to make a module callable, but if you know how to do that, you don't need a module. I didn't expect the .fixture to be the hard part, although it is one of the concepts you'd have to get. If you're out of ideas about this part, we can try a slightly easier puzzle.

Attempt 2

The modified puzzle that we're now solving:

@fixture  # replaced "run_with.fixture" with "fixture" so we're making separate decorators
def foo():
    return [42,'eggs']

@fixture  # same as the first line
def bar():
    return {'z':'Q', 'foo':2, 42:'forty-two', 'eggs':'spam'}

@run_with
def test1(foo, bar):
    while foo:
        del bar[foo.pop()]
    print(bar)

@run_with
def test2(foo, bar):
    for k in foo:
        print(bar[k])

The expected output stays the same:

>>> test1()
{'z': 'Q', 'foo': 2}
>>> test2()
forty-two
spam

Gilch also commented that this strategy is generalizable:

By the way, coming up with a simplified problem and solving that first like this is a useful technique in real-world programming, not just for toy problems like this one. Often the solution to the easy problem makes a solution for the harder one easier to discover.

I came up with a first-pass solution:

from functools import partial

fixtures = {}

def fixture(f):  # defining `fixture` as a standalone function
    fixtures[f.__name__] = f()
    return f

def run_with(test):  # rewriting `run_with` as a function rather than a class
    def wrapper():
        return partial(test, *fixtures.values())()
    return wrapper

While I was able to get the output I wanted from the tests, I could only run one test in a session. Running both tests in the same session didn't work:

$ python3.9 -i code.py 
>>> test1()
{'z': 'Q', 'foo': 2}
>>> test2()
>>> 
$ python3.9 -i code.py 
>>> test2()
forty-two
spam
>>> 

Also, since I used *fixtures.values() to unpack the fixtures as positional arguments, this solution would stop working if the order of the test parameters changed. As for the failure to run both tests in one session, I suspected it was because the two tests were referencing the same fixtures object, so I tried to make a copy of fixtures:

 def run_with(test):
     def wrapper():
-        return partial(test, *fixtures.values())()
-    return wrapper
+        def replacement_test():
+            fixtures_copy = fixtures.copy()
+            partial(test, *fixtures_copy.values())()
+            fixtures_copy = fixtures.copy()
+        return replacement_test
+    return wrapper()
from functools import partial

fixtures = {}

def fixture(f):
    fixtures[f.__name__] = f()
    return f

def run_with(test):
    def wrapper():
        def replacement_test():  # added a new layer so I could copy the dictionary before calling `partial`
            fixtures_copy = fixtures.copy()
            partial(test, *fixtures_copy.values())()
            fixtures_copy = fixtures.copy()  # this line doesn't do anything! Why?
        return replacement_test
    return wrapper()

This didn't work either. I was still unable to run both tests in the same session.

Gilch asked,

What, exactly, do you expect that line to do?

To which I replied,

So we run the test by using partial. test1 modifies fixtures_copy, and I want to restore the state of fixtures_copy so it can be used by the next test.

Gilch:

Your mental model does not match what the code is doing. This is map-and-territory stuff. Have you tried stepping through it with the debugger? Or smaller pieces interactively? You might still be thinking in terms of pass-by-reference. Python doesn't work that way. Were you writing a lot of C or C++ before? C#? You must unlearn what you have learned. Python is a different animal.

I ran the code with a debugger and noticed that the original fixtures was still modified. I was not expecting this:

-> fixtures_copy = fixtures.copy()
(Pdb) fixtures_copy
{'foo': [], 'bar': {'z': 'Q', 'foo': 2}}
(Pdb) fixtures
{'foo': [], 'bar': {'z': 'Q', 'foo': 2}}

When I wrote the line fixtures_copy = fixtures.copy(), I was specifically trying to make a deep copy of the dictionary, but I still ended up with a shallow copy.

Gilch:

Had you checked, you might have seen this:

>>> help({}.copy)
Help on built-in function copy:

copy(...) method of builtins.dict instance
    D.copy() -> a shallow copy of D

from copy import deepcopy might be what you are looking for. It's important to know that exists, but you can solve this puzzle without it.

Let's try a modification.
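The shallow-versus-deep distinction from the exchange above can be sketched as follows. The dictionary here is a made-up stand-in for fixtures; only dict.copy and copy.deepcopy from the standard library are used.

```python
from copy import deepcopy

original = {"foo": [42, "eggs"]}

shallow = original.copy()     # new dict, but the same inner list object
deep = deepcopy(original)     # new dict and a new inner list

original["foo"].pop()         # mutate the list held by `original`

print(shallow["foo"])                     # [42]
print(deep["foo"])                        # [42, 'eggs']
print(shallow["foo"] is original["foo"])  # True
```

The shallow copy only duplicates the outer dictionary, so a test that pops from the list also empties it for every other holder of that list.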

Attempt 3

Once again, we decided to solve an easier problem first.

 @fixture
 def foo():
+    print('made a foo')
     return [42,'eggs']
 
 @fixture
 def bar():
+    print('made a bar')
     return {'z':'Q', 'foo':2, 42:'forty-two', 'eggs':'spam'}
 
 @run_with
-def test2(foo, bar):
+def test2(bar, foo):
     for k in foo:
         print(bar[k])

The modified problem:

@fixture
def foo():
    print('made a foo')  # added print statement
    return [42,'eggs']

@fixture
def bar():
    print('made a bar')  # added print statement
    return {'z':'Q', 'foo':2, 42:'forty-two', 'eggs':'spam'}

@run_with
def test1(foo, bar):
    while foo:
        del bar[foo.pop()]
    print(bar)

@run_with
def test2(bar, foo):  # order switched to address a weakness of my previous solution
    for k in foo:
        print(bar[k])
made a foo
made a bar
made a foo
made a bar
>>> test1()
{'z': 'Q', 'foo': 2}
>>> test2()
forty-two
spam
>>> 

I decided to move fixture generation into run_with, so that fresh fixtures were made each time a test was decorated:

-def fixture(f):
-    fixtures[f.__name__] = f()
+def fixture(f, fixture_store=fixtures):
+    fixture_store[f] = f()
     return f
 
 def run_with(test):
+    new_fixture_store = {}
+    for fixture_fn in fixtures.keys():
+        fixture(fixture_fn, new_fixture_store)
+
     def wrapper():
         def replacement_test():
-            fixtures_copy = fixtures.copy()
-            partial(test, *fixtures_copy.values())()
-            fixtures_copy = fixtures.copy()
+            partial(test, *new_fixture_store.values())()
         return replacement_test
     return wrapper()

-def test2(foo, bar):
+def test2(bar, foo):
     for k in foo:
         print(bar[k])
from functools import partial

fixtures = {}

def fixture(f, fixture_store=fixtures):
    fixture_store[f] = f()
    return f

def run_with(test):
    new_fixture_store = {}  # make a new copy of fixtures inside `run_with`
    for fixture_fn in fixtures.keys():
        fixture(fixture_fn, new_fixture_store)

    def wrapper():
        def replacement_test():  # no longer make copies of `fixtures` here
            partial(test, *new_fixture_store.values())()
        return replacement_test
    return wrapper()


@fixture
def foo():
    print('made a foo')
    return [42,'eggs']

@fixture
def bar():
    print('made a bar')
    return {'z':'Q', 'foo':2, 42:'forty-two', 'eggs':'spam'}

@run_with
def test1(foo, bar):
    while foo:
        del bar[foo.pop()]
    print(bar)

@run_with
def test2(foo, bar):  # I switched back the order for now because this seems to be a separate issue
    for k in foo:
        print(bar[k])
$ python3.9 -i code.py 
made a foo
made a bar
made a foo
made a bar
made a foo
made a bar
>>> test1()
{'z': 'Q', 'foo': 2}
>>> test2()
forty-two
spam

I noticed that the extra lines were printed because the fixture decorator itself was calling the fixture functions. So I fixed that:

-fixtures = {}
+fixtures = []
 
 def fixture(f, fixture_store=fixtures):
-    fixture_store[f] = f()
+    fixture_store.append(f)
     return f
 
 def run_with(test):
     new_fixture_store = {}
-    for fixture_fn in fixtures.keys():
-        fixture(fixture_fn, new_fixture_store)
-
+    for fixture_fn in fixtures:
+        new_fixture_store[fixture_fn.__name__] = fixture_fn()
     def wrapper():
         def replacement_test():
             partial(test, *new_fixture_store.values())()
from functools import partial

fixtures = []  # changed from {} to []

def fixture(f, fixture_store=fixtures):
    fixture_store.append(f)  # avoid calling `f` so nothing gets printed here
    return f

def run_with(test):
    new_fixture_store = {}
    for fixture_fn in fixtures:
        new_fixture_store[fixture_fn.__name__] = fixture_fn()  # store fixtures by name as key instead of function objects as key

    def wrapper():
        def replacement_test():
            partial(test, *new_fixture_store.values())()
        return replacement_test
    return wrapper()


@fixture
def foo():
    print('made a foo')
    return [42,'eggs']

@fixture
def bar():
    print('made a bar')
    return {'z':'Q', 'foo':2, 42:'forty-two', 'eggs':'spam'}

@run_with
def test1(foo, bar):
    while foo:
        del bar[foo.pop()]
    print(bar)

@run_with
def test2(foo, bar):
    for k in foo:
        print(bar[k])
$ python3.9 -i code.py 
made a foo
made a bar
made a foo
made a bar
>>> test1()
{'z': 'Q', 'foo': 2}
>>> test2()
forty-two
spam

Wasted Motion

Gilch asked me if I could find any wasted motion in the code I wrote above. Upon reading the code again, I realized that the replacement_test layer was not necessary, so run_with could be simplified to:

def run_with(test):
    new_fixture_store = {}
    for fixture_fn in fixtures:
        new_fixture_store[fixture_fn.__name__] = fixture_fn()

#     def wrapper():
#         def replacement_test():
#             partial(test, *new_fixture_store.values())()
#         return replacement_test
#     return wrapper()

    def wrapper():  # removed the `replacement_test` layer
        partial(test, *new_fixture_store.values())()
    return wrapper

The feedback I got from gilch was that I had not eliminated all of the waste. Then I realized that wrapper was also not necessary:

def run_with(test):
    new_fixture_store = {}
    for fixture_fn in fixtures:
        new_fixture_store[fixture_fn.__name__] = fixture_fn()

    return partial(test, *new_fixture_store.values())  # removed `wrapper` altogether

Gilch remarked that the fixture_store=fixtures part is also not required.
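Why the wrapper layers were wasted motion can be seen in a small sketch: a partial object is already callable, so wrapping it in another function that immediately calls it adds nothing. The names pair, wrapper, and direct are hypothetical.

```python
from functools import partial

def pair(foo, bar):
    return (foo, bar)

def wrapper():
    # Extra layer: builds a partial and calls it right away.
    return partial(pair, 1, 2)()

# The partial by itself is already a zero-argument callable.
direct = partial(pair, 1, 2)

print(wrapper())  # (1, 2)
print(direct())   # (1, 2)
```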

Passing in Fixtures in Any Order

Gilch gave me a hint, "do you remember all the types of packing and unpacking we discussed before?" I answered, "there's unpacking an iterable of arguments, like *args, and there's unpacking a dictionary of keyword arguments, like *kwargs." "Not right. **kwargs. Two stars. If you unpack a dict with one star, you just get the keys. This is because dicts are both mappings and iterables," gilch corrected me.
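The difference between one star and two can be checked directly. In this sketch, show and needs_both are hypothetical helpers, and the dictionary values are arbitrary numbers rather than real fixtures.

```python
def show(*args, **kwargs):
    # Report what arrived positionally and what arrived by keyword.
    return args, kwargs

fixtures = {"foo": 1, "bar": 2}

print(show(*fixtures))    # (('foo', 'bar'), {})
print(show(**fixtures))   # ((), {'foo': 1, 'bar': 2})

def needs_both(bar, foo):
    # Parameter order differs from the dict order on purpose.
    return foo - bar

# Keyword unpacking matches by name, so parameter order does not matter.
print(needs_both(**fixtures))  # -1
```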

Then I realized that I could pass in the fixtures as keyword arguments in any order:

-def fixture(f, fixture_store=fixtures):
-    fixture_store.append(f)
+def fixture(f):
+    fixtures.append(f)
     return f

 def run_with(test):
     new_fixture_store = {}
     for fixture_fn in fixtures:
         new_fixture_store[fixture_fn.__name__] = fixture_fn()
-    def wrapper():
-        def replacement_test():
-            partial(test, *new_fixture_store.values())()
-        return replacement_test
-    return wrapper()
+
+    return partial(test, **new_fixture_store)

 @run_with
-def test2(foo, bar):
+def test2(bar, foo):
     for k in foo:
         print(bar[k])
from functools import partial

fixtures = []

def fixture(f):  # removed default parameter `fixture_store=fixtures`
   fixtures.append(f)
   return f

def run_with(test):
   new_fixture_store = {}
   for fixture_fn in fixtures:
       new_fixture_store[fixture_fn.__name__] = fixture_fn()

   return partial(test, **new_fixture_store)  # no more calling `new_fixture_store.values()`, also added keyword unpacking


@fixture
def foo():
   print('made a foo')
   return [42,'eggs']

@fixture
def bar():
   print('made a bar')
   return {'z':'Q', 'foo':2, 42:'forty-two', 'eggs':'spam'}

@run_with
def test1(foo, bar):
   while foo:
       del bar[foo.pop()]
   print(bar)

@run_with
def test2(bar, foo):  # now the code works regardless of the order of parameters
   for k in foo:
       print(bar[k])
$ python3.9 -i code.py 
made a foo
made a bar
made a foo
made a bar
>>> test1()
{'z': 'Q', 'foo': 2}
>>> test2()
forty-two
spam
>>> 

Dictionary Comprehension

Gilch asked me if I knew about list comprehensions, then asked me to rewrite the decorator with a dict comprehension. So run_with can be abbreviated to one line:

def run_with(test):
    return partial(test, **{f.__name__: f() for f in fixtures})
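For comparison, here is a sketch showing that the dict comprehension builds the same mapping as the earlier explicit loop. The fixture functions are simplified stand-ins without the print calls.

```python
def foo():
    return [42, "eggs"]

def bar():
    return {"z": "Q"}

fixtures = [foo, bar]

# Explicit loop, as in the earlier version of run_with.
by_loop = {}
for f in fixtures:
    by_loop[f.__name__] = f()

# Equivalent dict comprehension.
by_comprehension = {f.__name__: f() for f in fixtures}

print(by_loop == by_comprehension)  # True
```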

Docstrings

Next up, gilch made a small tweak:

@run_with
def test1(foo, bar):
    "The first test."
    while foo:
        del bar[foo.pop()]
    print(bar)

@run_with
def test2(foo, bar):
    "The one after that."
    for k in foo:
        print(bar[k])

"The test output should be the same, but in addition, help(test1) should show the documentation."

Naturally, I thought of the wraps decorator we had seen earlier:

from functools import partial, wraps

def run_with(test):
    @wraps(test)
    def test_with_fixtures():  # in order to use the decorator with `@` I added another layer of function
        return partial(test, **{f.__name__: f() for f in fixtures})
    return test_with_fixtures()

This did not work as I expected:

Help on partial object:

class partial(builtins.object)
 |  partial(func, *args, **keywords) - new function with partial application
:

"Aaah, you put the waste back in! A natural thing to try though," gilch was amused.

"I didn't figure out how to use @wraps without a function definition, " I replied.

"Try desugaring the wraps, and then tell me where the waste is."

"test_with_fixtures = (wraps(test))(test_with_fixtures)" I finally managed to desugar a decorator correctly [LW(p) · GW(p)].

"Yes, and?"

"We could just do (wraps(test))(partial(test, **{f.__name__: f() for f in fixtures}))?" I asked, unsure where this was going.

"Yes!"

"It did not occur to me that we could use decorators in the desugared way too, but this looks obvious in hindsight."

"Very important not to forget it. Decorators are just higher-order functions."

With this, we were able to get the docstrings working:

def run_with(test):
    return (wraps(test))(partial(test, **{f.__name__: f() for f in fixtures}))
Help on partial in module __main__:

test1 = functools.partial(<function test1 at 0x7fee604c0... 'Q', 'foo': 2, 42: 'forty-two', 'eggs': 'spam'})
    The first test.
:
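The desugaring step from the exchange above can be sketched on a smaller example. Here greet, sugared, and desugared are hypothetical names; the sketch assumes only that functools.wraps returns an ordinary function that copies metadata onto whatever object it is given.

```python
from functools import partial, wraps

def greet(name, punctuation):
    "Return a greeting."
    return f"Hello, {name}{punctuation}"

# Sugared form: `@` needs a `def` (or class) to attach to.
@wraps(greet)
def sugared(name):
    return greet(name, punctuation="!")

# Desugared form: call the decorator like any other function,
# here on a partial object instead of a new `def`.
desugared = wraps(greet)(partial(greet, punctuation="!"))

print(sugared("Ada"))      # Hello, Ada!
print(desugared("Ada"))    # Hello, Ada!
print(desugared.__doc__)   # Return a greeting.
print(desugared.__name__)  # greet
```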

I was still confused, because the extra layer of functions I had added didn't explain why the docstring was not working. "Can I not use wraps on a partial function or something?"

"Because the one you modified with wraps was not the one the decorator returned. Look closely. Which function did you modify with wraps? Which function did you return?"

"Oh I see, I returned the partial, not test_with_fixtures."

"Yes. And that's also why it was a waste. test_with_fixtures was superfluous. It didn't do anything useful."

Going Back to @run_with.fixture

I didn't understand how class methods worked very well, but I thought I'd just try again at implementing the __new__ method to take in a parameter.

-from functools import partial
+from functools import partial, wraps
 
-fixtures = []
+class run_with:
+    fixtures = []
+    def __new__(cls, test):
+        return (wraps(test))(partial(test, **{f.__name__: f() for f in run_with.fixtures}))
 
-def fixture(f):
-    fixtures.append(f)
-    return f
+    def fixture(f):
+        run_with.fixtures.append(f)
+        return f
 
-def run_with(test):
-    new_fixture_store = {}
-    for fixture_fn in fixtures:
-        new_fixture_store[fixture_fn.__name__] = fixture_fn()
 
-    return partial(test, **new_fixture_store)
-
-
-@fixture
+@run_with.fixture
 def foo():
     print('made a foo')
     return [42,'eggs']
 
-@fixture
+@run_with.fixture
 def bar():
     print('made a bar')
     return {'z':'Q', 'foo':2, 42:'forty-two', 'eggs':'spam'}
 
 @run_with
 def test1(foo, bar):
+    "The first test."
     while foo:
         del bar[foo.pop()]
     print(bar)
 
 @run_with
 def test2(bar, foo):
+    "The one after that."
     for k in foo:
         print(bar[k])

It worked!

from functools import partial, wraps

class run_with:  # defining `run_with` as a class instead of function
    fixtures = []  # this is now a class variable instead of global variable
    def __new__(cls, test):
        return (wraps(test))(partial(test, **{f.__name__: f() for f in run_with.fixtures}))

    def fixture(f):  # at first this line was "def fixture(cls, f)" but it didn't work
        run_with.fixtures.append(f)  # using the class variable instead of global variable
        return f


@run_with.fixture  # using these decorators as specified in the original problem
def foo():
    print('made a foo')
    return [42,'eggs']

@run_with.fixture
def bar():
    print('made a bar')
    return {'z':'Q', 'foo':2, 42:'forty-two', 'eggs':'spam'}

@run_with
def test1(foo, bar):
    "The first test."
    while foo:
        del bar[foo.pop()]
    print(bar)

@run_with
def test2(bar, foo):
    "The one after that."
    for k in foo:
        print(bar[k])
$ python3.9 -i code.py 
made a foo
made a bar
made a foo
made a bar
>>> test1()
{'z': 'Q', 'foo': 2}
>>> test2()
forty-two
spam
>>> help(test1)
Help on partial in module __main__:

test1 = functools.partial(<function test1 at 0x7f596979c3a0>, foo=[], bar={'z': 'Q', 'foo': 2})
    The first test.

Magic Methods

There were multiple ways to solve this problem. Alluding to other approaches, gilch gave me an introduction to magic methods:

In Python, certain double-underscore (or "dunder") names are special cases with "magic" behaviors. You generally aren't supposed to call these directly, but you do implement them pretty often. They serve as user-definable hooks into certain processes. __init__ and __new__ are examples of these. Many so-called builtin functions just call the corresponding magic method. str() calls .__str__(), repr() calls .__repr__(), len() calls .__len__(), and so forth. Sometimes the magic methods only implement part of the process, so it's usually better to use the normal process rather than calling the dunder methods yourself. Being familiar with what these are and how they work is important to advanced Python programming.

>>> class Foo:
    def __len__(self):
        return 42

    
>>> Foo()
<__main__.Foo object at 0x000001E5097AD910>
>>> len(_)
42

Here, the Foo instance is certainly not any kind of collection or iterable, but it implements the required protocol for len() to work. Thus, it has a "length".

Here's a more interesting case:

>>> class Foo:
    def __add__(self, other):
        return 42

    
>>> Foo() + object()
42
>>> object() + Foo()
Traceback (most recent call last):
  File "<pyshell#61>", line 1, in <module>
    object() + Foo()
TypeError: unsupported operand type(s) for +: 'object' and 'Foo'

One of the ways to solve the puzzle using a class is with the "callable" protocol. You can implement that with __call__. The ability to easily implement operators for custom types like this is one of the major reasons Python is big in data science. Numpy arrays, for example, implement a lot of these magic methods. I don't know if you ever got to "operator overloading" in C++, but this is Python's equivalent. The () as in foo() or Foo.foo() is the call operator. It is possible for a class to implement this operation for its instances, just like I did with __add__ and +. It's actually one of the most straightforward operators to implement, because it's a lot like writing a normal function. Can you solve this one by using __call__ instead of __new__?
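Two of the hooks mentioned here can be illustrated on unrelated toy classes, so as not to give away the puzzle. Meters and Scale are hypothetical. The __radd__ method is the reflected counterpart of __add__, which Python tries when the left operand does not know how to handle the addition; its absence is why object() + Foo() failed above.

```python
class Meters:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Handles Meters + number.
        return Meters(self.value + other)

    def __radd__(self, other):
        # Handles number + Meters, tried after the left operand gives up.
        return Meters(other + self.value)

class Scale:
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        # Makes instances usable with the call operator ().
        return x * self.factor

print((Meters(2) + 3).value)  # 5
print((3 + Meters(2)).value)  # 5

double = Scale(2)
print(double(21))             # 42
print(callable(double))       # True
```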

Solution 2

My first attempt at implementing __call__ didn't work:

class run_with:
    fixtures = []
    def __call__(self, test):  # replaced `__new__(cls, test):`
        return (wraps(test))(partial(test, **{f.__name__: f() for f in run_with.fixtures}))

    def fixture(f):  # I wasn't sure whether to make this a class method or instance method
        run_with.fixtures.append(f)
        return f
$ python3.9 -i code.py 
Traceback (most recent call last):
  File "code.py", line 25, in <module>
    def test1(foo, bar):
TypeError: run_with() takes no arguments

Gilch pointed out that, "unlike the class method __new__, which takes the class as its first argument cls, __call__ is a normal instance method that requires a self instance. When did you construct an instance?"

So I constructed an instance and it worked:

-class run_with:
+class RunWith:
     fixtures = []
     def __call__(self, test):
         return (wraps(test))(partial(test, **{f.__name__: f() for f in run_with.fixtures}))
 
-    def fixture(f):
+    def fixture(self, f):
         run_with.fixtures.append(f)
         return f
 
+run_with = RunWith()
from functools import partial, wraps

class RunWith:
    fixtures = []
    def __call__(self, test):
        return (wraps(test))(partial(test, **{f.__name__: f() for f in run_with.fixtures}))

    def fixture(self, f):  # now an instance method
        run_with.fixtures.append(f)
        return f

run_with = RunWith()

@run_with.fixture
def foo():
    print('made a foo')
    return [42,'eggs']

@run_with.fixture
def bar():
    print('made a bar')
    return {'z':'Q', 'foo':2, 42:'forty-two', 'eggs':'spam'}

@run_with
def test1(foo, bar):
    "The first test."
    while foo:
        del bar[foo.pop()]
    print(bar)

@run_with
def test2(bar, foo):
    "The one after that."
    for k in foo:
        print(bar[k])
$ python3.9 -i code.py 
made a foo
made a bar
made a foo
made a bar
>>> test1()
{'z': 'Q', 'foo': 2}
>>> test2()
forty-two
spam
>>> help(test1)
Help on partial in module __main__:

test1 = functools.partial(<function test1 at 0x7f6bd7e673a0>, foo=[], bar={'z': 'Q', 'foo': 2})
    The first test.
(END)

More Magic Methods

There was another way to solve this problem. Gilch started to talk about __dict__: "__dict__ is an important protocol to understand. It corresponds to the builtin vars(). See help(vars)."

I stated that the attributes of an object were stored in a dict, and variables in a given scope were also in a dict. Gilch corrected me: "Some kinds of objects, especially the more primitive kind, do not implement the .__dict__ protocol, and thus don't work with vars. But they do implement the __dir__ protocol, and so can still list their attributes. Technically, there are some scopes where variables are not stored in a dict."

So I tried dir():

>>> dir(42)
['__abs__', '__add__', '__and__', '__bool__', '__ceil__', '__class__', '__delattr__', '__dir__', '__divmod__', '__doc__', '__eq__', '__float__', '__floor__', '__floordiv__', '__format__', '__ge__', '__getattribute__', '__getnewargs__', '__gt__', '__hash__', '__index__', '__init__', '__init_subclass__', '__int__', '__invert__', '__le__', '__lshift__', '__lt__', '__mod__', '__mul__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__', '__rmod__', '__rmul__', '__ror__', '__round__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__', '__trunc__', '__xor__', 'as_integer_ratio', 'bit_length', 'conjugate', 'denominator', 'from_bytes', 'imag', 'numerator', 'real', 'to_bytes']

Gilch: "Note that __dict__ is not in that list. Ints are one of the types that don't have one. Thus, vars() doesn't work on ints. You could still make one yourself with something like {k:getattr(42, k) for k in dir(42)}. It's possible to implement __dir__ to return anything. It's supposed to be a list of all the object's attributes, but in some cases it doesn't list all of them. With neither __dir__ reporting it nor a __dict__, it's possible for a secret attribute to exist. These are hard to find."
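
To make the "secret attribute" idea concrete, here is a small sketch. The class `Secretive` and its attribute names are hypothetical, invented only for illustration. It has no per-instance __dict__, its __dir__ under-reports, and __getattr__ serves an attribute that neither mechanism reveals. The last lines build the dict gilch describes for 42.

```python
class Secretive:
    """Hypothetical class whose __dir__ does not list everything."""
    __slots__ = ()  # no per-instance __dict__, so vars() cannot help

    def __dir__(self):
        # __dir__ may return any iterable of names; dir() sorts the result.
        return ["visible"]

    def __getattr__(self, name):
        # Called only when normal lookup fails; serves attributes on demand.
        if name in ("visible", "hidden"):
            return name.upper()
        raise AttributeError(name)

s = Secretive()
print(dir(s))                  # ['visible']
print(s.hidden)                # HIDDEN
print(hasattr(s, "__dict__"))  # False

# The do-it-yourself stand-in for vars() on an int, as gilch suggested.
int_attrs = {k: getattr(42, k) for k in dir(42)}
print(int_attrs["numerator"])  # 42
```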

I asked some tangential questions on the implementation of __dir__, and eventually the topic came back to, "so, given all of that, can you solve the puzzle without declaring a new class?"

At this point, I was still not entirely clear about how I'd do this:

So we have the previous function implementation of run_with:

def run_with(test):
    return (wraps(test))(partial(test, **{f.__name__: f() for f in fixtures}))
>>> dir(run_with)
['__annotations__', '__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__get__', '__getattribute__', '__globals__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__kwdefaults__', '__le__', '__lt__', '__module__', '__name__', '__ne__', '__new__', '__qualname__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']

And I'll need to think about overloading some of these...

Gilch: "What does __dict__ do?"

Me:

>>> run_with.__dict__
{}

Gilch: "What about vars(run_with)?"

Me:

>>> vars(run_with)
{}
>>> run_with
<function run_with at 0x7fab0a1c6040>

I think the error on the line @run_with.fixture is preventing the function definition from running.

Gilch: "I'll rephrase: what is the __dict__ for?"

Me: "__dict__ returns attributes of an object."

Gilch: "Why is it empty? Doesn't run_with have any attributes?"

I was quite baffled:

>>> run_with.__name__
'run_with'

It does have attributes. I don't know why run_with.__dict__ is empty...

Gilch: "How about vars(type(run_with))? type uses the __class__ protocol, btw. So run_with.__class__.__dict__ should be the same."

I tried vars(type(run_with)):

>>> vars(type(run_with))
mappingproxy({'__repr__': <slot wrapper '__repr__' of 'function' objects>, '__call__': <slot wrapper '__call__' of 'function' objects>, '__get__': <slot wrapper '__get__' of 'function' objects>, '__new__': <built-in method __new__ of type object at 0x948de0>, '__closure__': <member '__closure__' of 'function' objects>, '__doc__': <member '__doc__' of 'function' objects>, '__globals__': <member '__globals__' of 'function' objects>, '__module__': <member '__module__' of 'function' objects>, '__code__': <attribute '__code__' of 'function' objects>, '__defaults__': <attribute '__defaults__' of 'function' objects>, '__kwdefaults__': <attribute '__kwdefaults__' of 'function' objects>, '__annotations__': <attribute '__annotations__' of 'function' objects>, '__dict__': <attribute '__dict__' of 'function' objects>, '__name__': <attribute '__name__' of 'function' objects>, '__qualname__': <attribute '__qualname__' of 'function' objects>})
>>> type(run_with)
<class 'function'>

Gilch: "Or how about vars(type(run_with)).keys()? That might make it easier to see."

I was still pretty confused and was just thinking out loud at this point:

So '__dict__': <attribute '__dict__' of 'function' objects> does exist

>>> vars(type(run_with)).keys()
dict_keys(['__repr__', '__call__', '__get__', '__new__', '__closure__', '__doc__', '__globals__', '__module__', '__code__', '__defaults__', '__kwdefaults__', '__annotations__', '__dict__', '__name__', '__qualname__'])
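
A short sketch of why run_with.__name__ works even though run_with.__dict__ is empty. The function `greet` is a hypothetical stand-in. The entries in the class's namespace are descriptors: the class holds the machinery, and the value for a particular function is produced when the attribute is looked up on that function.

```python
def greet():
    return "hi"

# type() reports the object's class, the same object as __class__.
assert type(greet) is greet.__class__

# vars() on a class returns a read-only mappingproxy rather than a plain dict.
class_ns = vars(type(greet))
print(type(class_ns).__name__)  # mappingproxy

# The '__name__' entry is a descriptor stored on the class; asking it for
# the value belonging to `greet` gives the same result as greet.__name__.
name_from_descriptor = class_ns["__name__"].__get__(greet, type(greet))
print(name_from_descriptor)  # greet

# Nothing has been stored on the function object itself.
print(vars(greet))  # {}
```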

Gilch:

Do those keys look familiar? Is that all of them?

len(vars(type(run_with)))

len(dir(run_with))

Me:

so dir(run_with) has a lot more methods than the function base class protocols (not sure if I'm wording this correctly):

>>> dir(run_with)
['__annotations__', '__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__get__', '__getattribute__', '__globals__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__kwdefaults__', '__le__', '__lt__', '__module__', '__name__', '__ne__', '__new__', '__qualname__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']

Gilch:

A protocol is an informal interface. A set of methods for a particular purpose.

The run_with function object has more attributes than those defined by its class. Where are the other ones coming from then?

set(dir(run_with)) - set(vars(type(run_with))).

list(sorted(_))

Me:

>>> set(dir(run_with)) - set(vars(type(run_with)))
{'__setattr__', '__str__', '__sizeof__', '__getattribute__', '__format__', '__init__', '__reduce__', '__eq__', '__le__', '__ge__', '__hash__', '__lt__', '__delattr__', '__dir__', '__subclasshook__', '__reduce_ex__', '__class__', '__gt__', '__init_subclass__', '__ne__'}
>>> list(sorted(_))
['__class__', '__delattr__', '__dir__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__reduce__', '__reduce_ex__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']

So these look like what I'd get from dir(object)

Gilch:

set(dir(run_with)) - set(vars(type(run_with))) - set(dir(object))

I was utterly confused:

(wraps(test))(partial(test, **{f.__name__: f() for f in fixtures})) is an object? But everything in Python is an object?

I also tried the command gilch suggested:

>>> set(dir(run_with)) - set(vars(type(run_with))) - set(dir(object))
set()

Gilch:

That accounts for all of them. Expressions in Python always evaluate to objects.
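
The same accounting can be sketched by walking the method resolution order directly. `sample` is a hypothetical function used only for this check; the sketch assumes the default dir() behaviour, which gathers names from the object's own __dict__ and from each class along the MRO.

```python
def sample():
    pass

# The method resolution order lists the classes searched for attributes.
print([cls.__name__ for cls in type(sample).__mro__])  # ['function', 'object']

# Collect every name defined along the MRO, plus the object's own __dict__.
names = set(vars(sample))
for cls in type(sample).__mro__:
    names |= set(vars(cls))

# Nothing reported by dir() is left unexplained.
leftover = set(dir(sample)) - names
print(leftover)  # set()
```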

Me:

So I have the thought of adding a new object attribute fixture, but traditionally the way I'd do that was through instance methods, which we have already done before.

Gilch:

Where do attributes live?

Then I had the idea of modifying the __dict__ to make an attribute:

-class RunWith:
-    fixtures = []
-    def __call__(self, test):
-        return (wraps(test))(partial(test, **{f.__name__: f() for f in run_with.fixtures}))
+fixtures = []
 
-    def fixture(self, f):
-        run_with.fixtures.append(f)
-        return f
+def run_with(test):
+    return (wraps(test))(partial(test, **{f.__name__: f() for f in fixtures}))
+
+run_with.__dict__["fixture"] = lambda f: fixtures.append(f)
 
-run_with = RunWith()

The full file after that change:

from functools import partial, wraps

fixtures = []

def run_with(test):
    return (wraps(test))(partial(test, **{f.__name__: f() for f in fixtures}))

run_with.__dict__["fixture"] = lambda f: fixtures.append(f)  # defining `fixture` via __dict__


@run_with.fixture
def foo():
    print('made a foo')
    return [42,'eggs']

@run_with.fixture
def bar():
    print('made a bar')
    return {'z':'Q', 'foo':2, 42:'forty-two', 'eggs':'spam'}

@run_with
def test1(foo, bar):
    "The first test."
    while foo:
        del bar[foo.pop()]
    print(bar)

@run_with
def test2(bar, foo):
    "The one after that."
    for k in foo:
        print(bar[k])
$ python3.9 -i code.py 
made a foo
made a bar
made a foo
made a bar
>>> test1()
{'z': 'Q', 'foo': 2}
>>> test2()
forty-two
spam
>>> 

Gilch also gave me their version:

from functools import partial, wraps

def run_with(f):
    return wraps(f)(partial(f,**{k: v() for k, v in run_with._args.items()}))

run_with._args = {}

def fixture(f):
    run_with._args[f.__name__] = f
    return f

run_with.fixture = fixture
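
The two versions differ in one detail worth seeing in isolation. Plain attribute assignment and a direct write to __dict__ store the attribute in the same place, but a lambda built on list.append returns None, so a name decorated with it is rebound to None. The sketch below uses hypothetical names (`run_with_demo`, `keeping`, `lossy`, `registered`) rather than the real puzzle code.

```python
registered = []

def run_with_demo(test):
    """Hypothetical stand-in for run_with; its body does not matter here."""
    return test

def keeping(f):
    registered.append(f)
    return f  # like gilch's fixture: hand the function back

# Attribute assignment and a direct __dict__ write end up in the same place.
run_with_demo.keeping = keeping
run_with_demo.__dict__["lossy"] = lambda f: registered.append(f)  # returns None
print(sorted(vars(run_with_demo)))  # ['keeping', 'lossy']

@run_with_demo.keeping
def foo():
    return [42, "eggs"]

@run_with_demo.lossy
def bar():
    return {"z": "Q"}

print(foo())            # [42, 'eggs']
print(bar)              # None: the name was rebound to the lambda's result
print(registered[1]())  # {'z': 'Q'}: the original function is still in the list
```

Because the attribute is found in the function's own __dict__, it is returned as-is and not turned into a bound method, which is why these decorators take only `f` and no `self`.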

Loose Threads

While I figured out a solution to the problem, I still don't have a full understanding of vars, dir and __dict__ and how they interact with each other. We'll pick this up next time!

Observations

  1. These posts are basically code and dialogues. I find it hard to incorporate the quotes nicely into a post using dialogue punctuation rules that are usually for describing a scene. I'm also having trouble making the "I said" "gilch said" less repetitive when there's a lot of back and forth. I've noticed that I can usually think of different dialogue tags when I'm paraphrasing (which happens if I'm writing about something we've talked about over a call) but it becomes harder when I'm basically turning a chat log into a post.

10 comments

comment by Yoav Ravid · 2021-08-17T17:58:27.158Z · LW(p) · GW(p)

The Jupyter Notebook was cool! I only did some of the exercises but that's already more than the zero I did in the previous posts. Convenience sure does matter.

comment by Zac Hatfield-Dodds (zac-hatfield-dodds) · 2021-08-16T05:58:12.275Z · LW(p) · GW(p)

I appreciate reading these logs, but I'm also a little confused about what you're aiming to learn!

I personally work with decorators, higher-order functions, and complicated introspection all the time... but that's because I'm building testing tools like Hypothesis (and to a lesser extent Pytest). If that's the kind of project you want to work on I'd be happy to point you at some nice issues, but it's definitely a niche.

Replies from: gjm, parsley
comment by gjm · 2021-08-16T10:52:57.792Z · LW(p) · GW(p)

It looks to me as if in the process of learning about decorators, konstell is learning a bunch of other things too, and very likely those things will be of greater long-term benefit than the specific knowledge of decorators.

Replies from: gilch
comment by gilch · 2021-08-28T19:43:38.972Z · LW(p) · GW(p)

Knowing how to sugar and desugar the syntax is fundamentally all there is to know about decorators per se. But to use them well, one has to know a lot more Python than that. Everything else I'm teaching konstell could be done without the sugar, but decorators are a convenient focus for now.

comment by konstell (parsley) · 2021-08-16T07:26:52.990Z · LW(p) · GW(p)

I'm also a little confused about what you're aiming to learn!

There are lots of gaps in my Python knowledge (this applies to my CS knowledge in general as well) and I'm trying to close those gaps. I asked gilch about decorators because I encountered them in pytest and was very confused about how they worked.

I didn't have a project in mind when I signed up for this apprenticeship. I just saw gilch offering to teach Python [LW(p) · GW(p)] and thought I wanted to get better and learning from a mentor could be great.

I have attempted to contribute to open source in the past but failed (I ran into issues building things locally and didn't know how to get help). I would love to try again.

Replies from: zac-hatfield-dodds
comment by Zac Hatfield-Dodds (zac-hatfield-dodds) · 2021-08-16T08:23:21.589Z · LW(p) · GW(p)

Leaning in to current confusions on e.g. decorators makes sense :-)

To ask a slightly different question - what kind of thing do you want to do with Python? It's a large and flexible language, and you'd be best served focussing on somewhat different topics depending on whether you want to use Python for {scientific computing, executable pseudocode, web dev, async stuff, OSS libraries, ML research, desktop apps, etc}.

I'll also make the usual LW suggestion of learning from a good textbook - Fluent Python is the usual intermediate-to-advanced suggestion. After that I learned mostly by following, and contributing to, various open source projects - the open logs and design documents are an amazing resource, as is feedback from library maintainers.

For open-source contributions, you should expect most of the learning curve for your first few patches to be about the contribution process, culture, and tools, and just navigating a large and unfamiliar codebase. These are very useful skills though! If you need someone to help get you unstuck, I'm on the Pytest core team and would be happy to help you (or another LWer) with #3426 or #8820 if you're interested.

Replies from: parsley
comment by konstell (parsley) · 2021-08-16T16:53:19.502Z · LW(p) · GW(p)

what kind of thing do you want to do with Python?

Out of the things you listed, scientific computing & OSS libraries are things I want to explore more. I also don't just want to learn Python - although I have chosen Python to be the language to try to get pretty good at - my goal is to get myself a proper CS education. I think it would be difficult to truly get good at a language without understanding how things work underneath.

Also, what gjm said [LW(p) · GW(p)].

Replies from: zac-hatfield-dodds
comment by Zac Hatfield-Dodds (zac-hatfield-dodds) · 2021-08-17T01:45:58.019Z · LW(p) · GW(p)

The skills of 'working on an existing project' I mentioned above are not usually covered as part of a CS education, but are complementary skills for most things you might want to do once you have one. I also agree entirely with gjm; you'll learn a lot any time you get hands-on practice with close feedback from a mentor.

For OSS libraries, those pytest issues would be a great start. Scientific computing varies substantially by domain - largely with the associated data structures, being some combination of large arrays, sequences, or graphs. Tools like Numpy, Scipy, Dask, Pandas, or Xarray are close to universal though, and their developers are also very friendly.

comment by purge · 2021-08-16T19:52:16.038Z · LW(p) · GW(p)
>>> x = True
>>> id(x)
[etc...]

Due to Python's style of reference passing, most of these print statements will show matching id values even if you use any kind of object, not just True/False.  Try to predict the output here, then run it to check:

def compare(x, y):
  print(x == y, id(x) == id(y), x is y)
a = {"0": "1"}
b = {"0": "1"}
print(a == b, id(a) == id(b), a is b)
compare(a, b)
c = a
d = a
print(c == d, id(c) == id(d), c is d)
compare(c, d)
Replies from: parsley