Previous Work on Recreating Neural Network Input from Intermediate Layer Activations
post by bglass · 2022-10-12T19:28:39.058Z · LW · GW · No comments
This is a question post.
Contents
Answers
5 the gears to ascenscion
3 Garrett Baker
Recently I've been experimenting with reconstructing a neural network's input from its intermediate-layer activations.
If this works, it has implications for interpretability. For example, if certain neurons activate on certain inputs, you know those neurons are 'about' that type of input.
My question is: Does anyone know of prior work/research in this area?
I'd appreciate pointers to even distantly related work. I may write a blog post about my experiments if there is interest and if there isn't already adequate research in this area.
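For concreteness, here is a minimal sketch of what "reconstructing the input from activations" can mean. It is not the setup from my experiments: it assumes a toy one-layer tanh network with made-up weights, and the names `forward` and `invert` are hypothetical. It recovers the input by gradient descent on the squared difference between the target activations and the activations of the current guess.

```python
import math

def forward(W, b, x):
    """Activations h = tanh(W x + b) of a toy one-layer network."""
    return [math.tanh(sum(w * xj for w, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def invert(W, b, target, steps=2000, lr=0.1):
    """Gradient descent on the input x to minimise 0.5 * ||forward(x) - target||^2."""
    n_in = len(W[0])
    x = [0.0] * n_in  # start from an all-zero guess
    for _ in range(steps):
        h = forward(W, b, x)
        # Gradient w.r.t. pre-activations: (h - target) * tanh'(z), with tanh'(z) = 1 - h^2
        delta = [(hi - ti) * (1.0 - hi * hi) for hi, ti in zip(h, target)]
        # Chain rule back to the input: grad_j = sum_i W[i][j] * delta[i]
        grad = [sum(W[i][j] * delta[i] for i in range(len(W))) for j in range(n_in)]
        x = [xj - lr * gj for xj, gj in zip(x, grad)]
    return x

# Hypothetical weights: 3 hidden units, 2 inputs (more units than inputs,
# so this toy layer loses no information about x).
W = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]]
b = [0.0, 0.0, 0.0]
x_true = [0.3, -0.2]
activations = forward(W, b, x_true)
x_rec = invert(W, b, activations)
print(x_rec)  # compare against x_true
```

The toy case is deliberately easy: the layer is wider than the input and tanh is one-to-one. With ReLUs, pooling, or layers narrower than the input, many inputs can map to the same activations, so the reconstruction is generally not unique.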
Answers
search quality: skimmed the abstracts
search method: semantic scholar + browsing
note that many of these results are kind of old
- https://www.semanticscholar.org/paper/Explaining-Neural-Networks-by-Decoding-Layer-Schneider-Vlachos/0de6c8de9154a0db199aa433fc19cdfef2a62076
- ... is cited by https://www.semanticscholar.org/paper/Toward-Transparent-AI%3A-A-Survey-on-Interpreting-the-Raukur-Ho/108a4000b32e3f6eb566151790bfea69c1f3a9db (fun: it cites the EA forum for one of its 300 cites)
- ... which cites https://www.semanticscholar.org/paper/Understanding-deep-image-representations-by-them-Mahendran-Vedaldi/4d790c8fae40357d24813d085fa74a436847fb49
- ... which is heavily cited, eg by https://www.semanticscholar.org/paper/Inverting-Visual-Representations-with-Convolutional-Dosovitskiy-Brox/125f7b539e89cd0940ff89c231902b1d4023b3ba
- ... https://www.semanticscholar.org/paper/Inverting-face-embeddings-with-convolutional-neural-Zhmoginov-Sandler/e44fc62f9fba4c9ad276544901fd1e82caaf7baa
- ... https://www.semanticscholar.org/paper/Inverting-Convolutional-Networks-with-Convolutional-Dosovitskiy-Brox/993c55eef970c6a11ec367dbb1bf1f0c1d5d72a6
- ... hmm interesting, here's a branch off into doing it on the human visual system apparently https://www.semanticscholar.org/paper/Using-deep-learning-to-reveal-the-neural-code-for-Kindel-Christensen/e79b56303a29114762f458d338d0f3b03348d618
- ... https://www.semanticscholar.org/paper/Visualizing-and-Comparing-AlexNet-and-VGG-using-Yu-Bai/dae981902b1f6d869ef2d047612b90cdbe43fd1e
- ... https://www.semanticscholar.org/paper/Understading-Image-Restoration-Convolutional-Neural-Protas-Bratti/0c807815ceaa186e99519f59ae6c3ff1ac7defdd
- https://www.semanticscholar.org/paper/Towards-Understanding-the-Invertibility-of-Neural-Gilbert-Zhang/487489253b03948a1b1c581986c086d577222e0a
- https://www.semanticscholar.org/paper/Analysis-of-Invariance-and-Robustness-via-of-Behrmann-Dittmer/0c11435e0b97b90dfc3928ce242c68289bc757f2
- https://www.semanticscholar.org/paper/Deep-Neural-Networks-are-Surprisingly-Reversible%3A-A-Dong-Yin/e8e5f0db724d65f761bd2d415ee46281f8ba751a
- https://www.semanticscholar.org/paper/Large-capacity-Image-Steganography-Based-on-Neural-Lu-Wang/d1485d298906364c4434454d25c0ed4389420892
- https://www.semanticscholar.org/paper/Robust-Invertible-Image-Steganography-Xu-Mou/786736d89d5bbfa674fabe42ecec32ed8f67901e
- https://www.semanticscholar.org/paper/Understanding-and-mitigating-exploding-inverses-in-Behrmann-Vicol/8c0b75099f577cc009065e985cae6986cf755d4d
- https://www.semanticscholar.org/paper/The-Effects-of-Invertibility-on-the-Complexity-of-Pareek-Risteski/7bb65e9167e5d21f04ebaacdd7bc59f7c4972bb7
- https://www.semanticscholar.org/paper/Evaluating-generalization-through-interval-based-Adam-Likas/f7843d212ddd65de3dc376bb6c146ce78eacf3e0
- https://www.semanticscholar.org/paper/Landscape-Learning-for-Neural-Network-Inversion-Liu-Mao/5dad3748e8d4d8c659005903062e5d8e855fa86c <= bold claims, might even read this one properly to see if they hold up
↑ comment by the gears to ascension (lahwran) · 2022-10-13T01:04:25.807Z · LW(p) · GW(p)
interesting to me but not what you asked for
- https://www.semanticscholar.org/paper/The-learning-phases-in-NN%3A-From-Fitting-the-to-a-Schneider/f0c5f3e254b3146199ae7d8feb888876edc8ec8b
- https://www.semanticscholar.org/paper/Deceptive-AI-Explanations%3A-Creation-and-Detection-Schneider-Handali/54560c7bce50e57d2396cbf485ff66e5fda83a13
- https://www.semanticscholar.org/paper/TopKConv%3A-Increased-Adversarial-Robustness-Through-Eigen-Sadovnik/fd5a74996cc5ef9a6b866cb5608064218d060d16
- https://www.semanticscholar.org/paper/This-Looks-Like-That...-Does-it-Shortcomings-of-in-Hoffmann-Fanconi/78396cda15041dda05c5a21c1417683bee2a070b (does this one limit the applicability of "natural abstraction"/"everything's connected"/relative representations?)
- https://www.semanticscholar.org/paper/Self-explaining-AI-as-an-Alternative-to-AI-Elton/301c4c7df87f728e2589a384001e2a2755c5072c
- https://www.semanticscholar.org/paper/Pruning-by-Explaining%3A-A-Novel-Criterion-for-Deep-Yeom-Seegerer/ebbe984d3d7bc7edfe0cda0f1fcf49d1533bc3c3
- https://www.semanticscholar.org/paper/Pruning-for-Interpretable%2C-Feature-Preserving-in-Hamblin-Konkle/370ee88bb8207651675a8fa5c93de7de4d79db36
- https://www.semanticscholar.org/paper/"Will-You-Find-These-Shortcuts"-A-Protocol-for-the-Bastings-Ebert/efe376f566e5ab6113fe8e215abc7ed5149a3848
- https://www.semanticscholar.org/paper/Interpreting-Deep-Learning%3A-The-Machine-Learning-Charles/b7488a0ac799a2c62882a5b40f4ea4b1c88f04c4
- https://www.semanticscholar.org/paper/Minimizing-Control-for-Credit-Assignment-with-Meulemans-Farinha/0bb32a1b9a8702a38f54b64ca08df8abffc097a8
- https://www.semanticscholar.org/paper/The-Union-of-Manifolds-Hypothesis-and-its-for-Deep-Brown-Caterini/3c0a4afc8f430f32442a8efa306f898d9198d7c5
Some others and I did some work looking at the mutual information between the intermediate layers of a network and its input here [LW · GW].
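As a rough illustration of the quantity involved (not the method used in the linked work), here is a sketch of a plug-in mutual information estimate between discretized inputs and discretized activations. The function name `mutual_information` and the toy data are assumptions made for the example.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of (x, y) pairs
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written in terms of counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

# Toy data: one input bit per sample, and two hypothetical binarized units.
inputs = [0, 1, 0, 1, 1, 0, 1, 0]
unit_a = list(inputs)        # a unit that copies the input bit
unit_b = [0] * len(inputs)   # a unit that is constant
mi_a = mutual_information(inputs, unit_a)
mi_b = mutual_information(inputs, unit_b)
print(mi_a, mi_b)
```

A unit that carries the input bit shares a full bit with the input, while a constant unit shares none. For real-valued, high-dimensional activations a simple count-based estimate like this becomes unreliable, so treat it only as a picture of what is being measured.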