Comments

Comment by Alex Khripin (alex-khripin) on Deep Learning Systems Are Not Less Interpretable Than Logic/Probability/Etc · 2022-06-05T00:32:59.055Z · LW · GW

Obviously, it's an exaggerated failure mode. But all systems have failure modes and are meant to be used over some range of inputs. A more realistic version of the requirement might be night versus day images: a tree detector that only works in daylight is perfectly usable.

The capabilities and limitations of a deep-learned network are partly hidden in the training data. The autumn example is an exaggeration, but there may well be species of tree that are not well represented in that data. How can you tell how well they will be recognized? And if a particular sequoia is deemed not a tree -- can you tell why?

Comment by Alex Khripin (alex-khripin) on Deep Learning Systems Are Not Less Interpretable Than Logic/Probability/Etc · 2022-06-04T20:01:34.857Z · LW · GW

You use the word robustness a lot, but interpretability is related to the opposite of robustness.

When your tree detector says a tree is a tree, nobody will complain. The value of interpretability is in understanding why it might be wrong, in either direction, whether before or after the fact.

If your hand-written tree detector relies on identifying green pixels, then you can say up front that it won't work in deciduous forests in autumn and winter. That's not robust, but it is interpretable. You can analyze causality from inputs to outputs (though this gets progressively more difficult as the system grows). You may also be able to say with confidence that changing a single pixel will have limited or no effect.
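
A minimal sketch of what such a hand-written detector might look like -- the threshold and channel logic are illustrative assumptions, not anything from the post:

```python
import numpy as np

def looks_like_tree(image: np.ndarray, green_fraction_threshold: float = 0.3) -> bool:
    """Toy hand-written 'tree detector': flag images dominated by green pixels.

    image: H x W x 3 uint8 RGB array.
    A pixel counts as 'green' when its green channel exceeds red and blue by a
    margin, and the image counts as a 'tree' when enough pixels are green.
    Every failure mode (autumn foliage, night shots, green cars) can be read
    directly off this definition.
    """
    r = image[..., 0].astype(int)
    g = image[..., 1].astype(int)
    b = image[..., 2].astype(int)
    green_pixels = (g > r + 20) & (g > b + 20)
    return green_pixels.mean() > green_fraction_threshold
```

Note that a single changed pixel moves the green fraction by at most 1/(H*W), so its effect on the decision is bounded by construction.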

The extreme of this is safety systems, where the effect of the input state (both expected and unexpected, like broken sensors) on the output is supposed to be bounded and well characterized.
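
As a rough illustration of that kind of bounded, characterized behavior -- the temperatures and the proportional rule here are invented for the example, not taken from any real system:

```python
import math

def heater_command(temp_reading_c: float) -> float:
    """Map a temperature reading to a heater duty cycle in [0, 1].

    The output is bounded and the response to bad input is explicit: NaN or
    out-of-spec readings (e.g. a broken sensor) fall back to a documented
    safe value instead of whatever a learned model happens to extrapolate.
    """
    SAFE_DUTY = 0.0  # fail safe: heater off
    if math.isnan(temp_reading_c) or not (-40.0 <= temp_reading_c <= 120.0):
        return SAFE_DUTY
    # Simple proportional rule, clamped so the output can never leave [0, 1].
    duty = (20.0 - temp_reading_c) / 20.0
    return max(0.0, min(1.0, duty))
```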

I can offer a very minimal example from my own experience training a relatively simple neural net to approximate a physical system. The system was simple enough that a purely linear model was a decent starting point. For the linear model -- and for a variety of simple nonlinear models I could use -- it would be trivial to guarantee that, for example, the behavior of the approximation would be smooth and monotonic in areas where I didn't have data. A sufficiently complex network, on the other hand, needed considerable effort to guarantee such behavior.
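
A toy version of that contrast, with invented data standing in for the physical system:

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=200)               # data only covers [0, 1]
y_train = 2.0 * x_train + 0.1 * rng.normal(size=200)    # noisy linear response

# Linear fit: y ~ w*x + b. Smoothness and monotonicity everywhere -- including
# far outside [0, 1], where there is no data -- follow from the sign of the
# single weight. The guarantee is global and takes one number to inspect.
w, b = np.polyfit(x_train, y_train, deg=1)
assert w > 0

# For a deep network (e.g. sklearn.neural_network.MLPRegressor) there is no
# such one-line certificate: you have to probe the fitted model on
# out-of-distribution inputs, or constrain the architecture, to recover the
# guarantee the linear model gives for free.
```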

You're correct that the labels in a "simple" decision tree are hiding a lot of complexity -- but, for classical systems, they usually come from simple labeling methods themselves. "Season" may come from a lookup in a calendar. "Sprinkler" may come from a landscaping schedule or a sensor attached to the valve controlling the sprinkler.
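
For example, the labeling methods in question might be no more complicated than the following -- the exact rules are made up, but the point is that they fit in a few transparent lines:

```python
from datetime import date

def season(today: date) -> str:
    """'Season' label from a calendar lookup -- trivially auditable."""
    month_to_season = {12: "winter", 1: "winter", 2: "winter",
                       3: "spring", 4: "spring", 5: "spring",
                       6: "summer", 7: "summer", 8: "summer",
                       9: "autumn", 10: "autumn", 11: "autumn"}
    return month_to_season[today.month]

def sprinkler_on(valve_sensor_high: bool, schedule_says_on: bool) -> bool:
    """'Sprinkler' label from a valve sensor or a landscaping schedule.

    The rule for combining the two sources is written down, so you can state
    exactly what happens when either one is wrong.
    """
    return valve_sensor_high or schedule_says_on
```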

Deep learning is encouraged to break this sort of compartmentalization in the interest of greater accuracy. The learned sprinkler label may override the valve signal, which promises to make the system robust to a bad sensor -- but how it chooses to do so, based on training data, may be hard to discern. The opposite may be true as well -- the hand-written system may anticipate failure modes of the sensor for which there is no training data.

If you look at any well-written networking code, for example, it handles network errors that may be extremely unlikely. It may even respond to error codes that cannot currently happen -- for example, one specified by the OS but not yet implemented. When the vendor announces that the new feature is supported, you can look at the code and verify that the behavior will still be correct.
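
A sketch of that style using the standard library -- the specific error codes are just plausible examples of rare conditions, not anything the comment names:

```python
import errno
import socket

def connect_with_diagnostics(host: str, port: int) -> socket.socket:
    """Connect and translate rare-but-specified OS errors into clear behavior.

    Each branch is written against a documented error code, so if the OS
    starts returning one it never returns today, the response is already
    visible and reviewable in the source.
    """
    try:
        return socket.create_connection((host, port), timeout=5.0)
    except OSError as exc:
        if exc.errno == errno.ECONNREFUSED:
            raise RuntimeError(f"{host}:{port} is not listening") from exc
        if exc.errno == errno.ENETUNREACH:
            raise RuntimeError("no route to that network") from exc
        if exc.errno == errno.EPROTONOSUPPORT:
            # Rare on common platforms, but the handling is spelled out here
            # rather than learned from examples that may never include it.
            raise RuntimeError("protocol not supported by this OS build") from exc
        raise
```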

To summarize -- interpretability is about debugging and verification. Those are stepping stones towards robustness, but robustness is a much higher bar.