Models, myths, dreams, and Cheshire cat grins
post by Stuart_Armstrong · 2020-06-24
"she has often seen a cat without a grin but never a grin without a cat"
- Alice in Alice in Wonderland, about the Cheshire cat (also known as the Unitary Authority of Warrington Cat).
Let's have a very simple model. There's a boolean, $C$, which measures whether there's a cat around. There's a natural number, $N$, which counts the number of legs on the cat, and a boolean, $G$, which checks whether the cat is grinning (or not).
There are a few obvious rules in the model, to make it compatible with real life:
$\neg C \to (N = 0)$
$\neg C \to \neg G$
Or, in other words, if there's no cat, then there are zero cat legs and no grin.
And that's true about reality. But suppose we have trained a neural net to automatically find the values of $C$, $N$, and $G$. Then it's perfectly conceivable that something might trigger the outputs $\neg C$ and $G$ simultaneously: a grin without any cat to hang it on.
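To make the rules concrete, here is a minimal Python sketch of the model as a consistency check on a set of outputs. The names `CatReading`, `cat_present`, `leg_count`, `grinning`, and `violated_rules` are made up for illustration; nothing here is a real neural net, just the kind of triple such a net might emit.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CatReading:
    """One set of outputs from a hypothetical cat detector."""
    cat_present: bool  # is there a cat around?
    leg_count: int     # number of legs on the cat
    grinning: bool     # is the cat grinning?


def violated_rules(reading: CatReading) -> List[str]:
    """Return the real-life rules that this reading breaks."""
    violations = []
    # Rule 1: no cat implies zero cat legs.
    if not reading.cat_present and reading.leg_count != 0:
        violations.append("legs without a cat")
    # Rule 2: no cat implies no grin.
    if not reading.cat_present and reading.grinning:
        violations.append("grin without a cat")
    return violations


ordinary = CatReading(cat_present=True, leg_count=4, grinning=False)
cheshire = CatReading(cat_present=False, leg_count=0, grinning=True)

print(violated_rules(ordinary))  # []
print(violated_rules(cheshire))  # ['grin without a cat']
```

The point of the sketch is that the rules live outside the detector: nothing stops the three outputs from being set independently, so the check has to be applied afterwards.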
Adversarial examples often seem to behave this way. Take for example this adversarial example of a pig classified as an airliner:
Imagine that the neural net was not only classifying "pig" and "airliner", but other things like "has wings" and "has fur".
Then the "pig-airliner" doesn't have wings, and has fur, which are features of pigs but not airliners. Of course, you could build an adversarial model that also breaks "has wings" and "has fur", but, hopefully, the more features that need to be faked, the harder it would become.
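As a rough sketch of that idea, assume a classifier that emits a label plus a handful of boolean features. The table `EXPECTED_FEATURES` and the function `inconsistent_features` are hypothetical names for illustration, and the feature values are simply the ones used in the text (pigs have fur and no wings, airliners the reverse).

```python
# Hypothetical table of which features each class is expected to have.
EXPECTED_FEATURES = {
    "pig": {"has_wings": False, "has_fur": True},
    "airliner": {"has_wings": True, "has_fur": False},
}


def inconsistent_features(label, features):
    """List the predicted features that contradict the predicted label."""
    expected = EXPECTED_FEATURES[label]
    return sorted(
        name
        for name, value in features.items()
        if name in expected and expected[name] != value
    )


# The adversarial image: labelled "airliner", but the feature outputs
# still describe a pig.
pig_airliner = {"has_wings": False, "has_fur": True}

print(inconsistent_features("airliner", pig_airliner))  # ['has_fur', 'has_wings']
print(inconsistent_features("pig", pig_airliner))       # []
```

Every extra feature in the table is one more output that an attacker would have to flip at the same time as the label, which is the sense in which faking becomes harder.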
This suggests that, as algorithms get smarter, they will become more adept at avoiding adversarial examples - as long as the ultimate question is clear. In our real world, the categories of pigs and airliners are pretty sharply distinct.
We run into problems, though, if the concepts are less clear - such as what might happen to pigs and airliners if the algorithm optimises them, or how the algorithm might classify underdefined concepts like "human happiness".
Myths and dreams
Define the following booleans: $HH$ detects the presence of a living human head, $HB$ a living human body, $JH$ a living jackal head, $JB$ a living jackal body.
In our real world we generally have $HH = HB$ and $JH = JB$. But set the following values:
$HH = 0$, $HB = 1$, $JH = 1$, $JB = 0$
and you have the god Anubis.
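One way to picture this is to enumerate every combination of the four booleans and see how few are compatible with the real world. The sketch below assumes the real-world rule is simply "heads and bodies of the same species come together"; the names `is_realistic` and `anubis` are illustrative only.

```python
from itertools import product


def is_realistic(human_head, human_body, jackal_head, jackal_body):
    """Assumed real-world rule: a head is present exactly when the
    matching body is present."""
    return human_head == human_body and jackal_head == jackal_body


# All 2**4 ways of setting the four detectors.
all_combinations = list(product([False, True], repeat=4))
realistic = [combo for combo in all_combinations if is_realistic(*combo)]

# Anubis: jackal head on a human body.
# Order: (human_head, human_body, jackal_head, jackal_body)
anubis = (False, True, True, False)

print(len(all_combinations), len(realistic))  # 16 4
print(anubis in all_combinations)             # True
print(is_realistic(*anubis))                  # False
```

Only 4 of the 16 settings pass the check; myths draw on the other 12.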
Similarly, what is a dragon? Well, it's an entity such that the following are all true:
And, even though those features never go together in the real world, we can put them together in our imagination, and get a dragon.
Note that "is flying" seems more fundamental to a dragon than "has wings", hence all the wingless dragons that fly "by magic". Our imaginations seem comfortable with such combinations.
Dreams are always bewildering upon awakening, because they also combine contradictory assumptions. But these combinations are often beyond what our imaginations are comfortable with, so we get things like meeting your mother - who is also a wolf - and handing Dubai to her over the tea cups (that contain milk and fear).
"Alice in Wonderland" seems to be in between the wild incoherence of dream features, and the more restricted inconsistency of stories and imagination.
Not that any real creature that size could fly with those wings anyway.