Posts

Toward A Mathematical Framework for Computation in Superposition 2024-01-18T21:06:57.040Z
Grokking, memorization, and generalization — a discussion 2023-10-29T23:17:30.098Z
Investigating the learning coefficient of modular addition: hackathon project 2023-10-17T19:51:29.720Z
The Low-Hanging Fruit Prior and sloped valleys in the loss landscape 2023-08-23T21:12:58.599Z
Decomposing independent generalizations in neural networks via Hessian analysis 2023-08-14T17:04:40.071Z
Alternative mask materials 2020-03-27T01:22:11.435Z

Comments

Comment by Dmitry Vaintrob (dmitry-vaintrob) on An Actually Intuitive Explanation of the Oberth Effect · 2024-01-11T19:23:40.203Z · LW · GW

Right - looking at the energy change of the exhaust answers the initial question in the post: why energy is conserved when a rocket accelerates, despite apparently expending the same amount of fuel for every unit of acceleration (assuming the fuel mass is small compared to the rocket's). Note that this doesn't depend on a gravity well - the question is well posed, and well answered (by looking at the rocket + exhaust system), in classical physics without gravity. The Oberth phenomenon is related, but I think it's distinct.
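
To make the bookkeeping explicit, here is a minimal sketch of the standard momentum/energy accounting (my own rederivation, not taken from the post). A rocket of mass $M$ moving at speed $v$ ejects fuel mass $dm$ at exhaust speed $u$ relative to the rocket, so momentum conservation gives $M\,dv = u\,dm$. Then

$$d(\mathrm{KE}_{\text{rocket}}) = Mv\,dv = uv\,dm, \qquad d(\mathrm{KE}_{\text{exhaust}}) = \tfrac{1}{2}(v-u)^2\,dm - \tfrac{1}{2}v^2\,dm = \left(\tfrac{1}{2}u^2 - uv\right)dm,$$

$$d(\mathrm{KE}_{\text{total}}) = \tfrac{1}{2}u^2\,dm,$$

which is independent of $v$: the fuel releases a fixed amount of energy per unit mass, and at high $v$ the rocket's larger kinetic-energy gain is exactly compensated by the exhaust, which is left moving slower in the ground frame.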

Comment by dmitry-vaintrob on [deleted post] 2023-12-26T12:50:04.938Z

Hi! As I commented on your other post: I think this is a question for https://mathoverflow.net/ or https://math.stackexchange.com/ . This question is too technical, and does not explain a connection to alignment. If you think this topic is relevant to alignment and would be interesting to technical people on LW, I would recommend making a non-technical post that explains how you think results in this particular area of analysis are related to alignment.

Comment by dmitry-vaintrob on [deleted post] 2023-12-26T12:46:00.410Z

Hi! I think this is a question for https://mathoverflow.net/ or https://math.stackexchange.com/ . While LessWrong has become a forum for relatively technical alignment articles, this question is too math-heavy, and it has not been made clear how it is relevant to alignment. The forum would get too crowded if very technical math questions became part of the standard content.

Comment by Dmitry Vaintrob (dmitry-vaintrob) on Mapping the semantic void: Strange goings-on in GPT embedding spaces · 2023-12-15T13:19:19.708Z · LW · GW

I think it's very cool to play with token embeddings in this way! Note that some of what you observe is, I think, a consequence of geometry in high dimensions and can be understood by just modeling token embeddings as random. I recommend generating a bunch of tokens as a Gaussian random variable in a high-dimensional space and playing around with their norms and their norms after taking a random offset.
Some things to keep in mind, which can be fun to check for some random vectors (there's a quick numerical sketch after this list):

- radii of distributions in high-dimensional space tend to cluster around some fixed value. For a multivariate Gaussian in $n$-dimensional space, it's because the square radius is a sum of squares of Gaussians (one for each coordinate). This is a random variable with mean $O(n)$ and standard deviation $O(\sqrt{n})$. In your case, you're also taking a square root (norm vs. square norm) and the normalization is different, but the general pattern of this variable concentrating in a narrow band (with width about $O(1/\sqrt{n})$ compared to the radius) will hold.
- a random offset vector will not change the overall behavior (though it will change the radius). 
- Two random vectors in high-dimensional space will be nearly orthogonal. 
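
Here is a minimal numpy sketch of the kind of sanity check I mean (the embedding dimension and vocabulary size are made-up placeholders, not GPT's actual values):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_tokens = 4096, 10_000   # placeholder embedding dimension and vocab size

# Model "token embeddings" as i.i.d. Gaussian vectors.
emb = rng.standard_normal((n_tokens, d)) / np.sqrt(d)

# 1) Norms concentrate in a narrow band (width ~ 1/sqrt(d) relative to the radius).
norms = np.linalg.norm(emb, axis=1)
print("norms:", norms.mean(), "+/-", norms.std())

# 2) A random offset changes the radius but not the concentration.
offset = rng.standard_normal(d) / np.sqrt(d)
shifted = np.linalg.norm(emb - offset, axis=1)
print("shifted norms:", shifted.mean(), "+/-", shifted.std())

# 3) Two random high-dimensional vectors are nearly orthogonal (cosine ~ O(1/sqrt(d))).
u, v = emb[0], emb[1]
print("cosine:", np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
```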

On the other hand it's unexpected that the mean is so large (normally you would expect the mean of a bunch of random vectors to be much smaller than the vectors themselves). If this is not an artifact of the training, it may indicate that words learn to be biased in some direction (maybe a direction indicating something like "a concept exists here"). The behavior of tokens near the center-of-mass also seems really interesting. 

Comment by Dmitry Vaintrob (dmitry-vaintrob) on My Criticism of Singular Learning Theory · 2023-11-19T16:31:49.391Z · LW · GW

I think there is some misunderstanding of what SLT says here, and you are identifying two distinct notions of complexity as the same, when in fact they are not. In particular, you have a line  

"The generalisation bound that SLT proves is a kind of Bayesian sleight of hand, which says that the learning machine will have a good expected generalisation relative to the Bayesian prior that is implicit in the learning machine itself."

I think this is precisely what SLT is saying, and this is nontrivial! One can say that a photon will follow a locally fastest route through a medium, even if this is different from saying that it will always follow the "simplest" route. SLT arguments always work relative to a loss landscape, and interpreting their meaning should (ideally) be done relative to that loss landscape. The resulting predictions are nevertheless nontrivial, and are sometimes confirmed. For example, we have some work on this with Nina Rimsky.

You point at a different notion of complexity, associated to the parameter-function map. This also seems interesting, but it is distinct from the complexity phenomena in SLT (at least from the more basic concepts like the RLCT), and it is not considered in the basic SLT paradigm. Saying that this is another interesting avenue of study, or a potentially useful measure of complexity, is valid, but it is a priori independent of criticism of SLT (and of course ideally, the two points of view could be combined).

Note that loss landscape considerations are more important than parameter-function considerations in the context of learning. For example, it's not clear in your example why f(x) = 0 is likely to be learned (unless you have weight regularization). Learning bias in a NN should most fundamentally be understood relative to the weights, not higher-order concepts like Kolmogorov complexity (though, as you point out, there might be a relationship between the two).

Also, I wanted to point out that in some ways your "actual solution" is very close to the definition of the RLCT from SLT. The definition of the RLCT is how much entropy you have to pay (in your language, the change in negative log probability of a random sample) to gain an exponential improvement in loss precision; i.e., "bits of specification per bit of loss". See e.g. this article.
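
Concretely, writing $V(\epsilon)$ for the prior probability (volume) of the set of weights whose loss is within $\epsilon$ of its minimum $L_0$, the standard volume-scaling characterization is, schematically (constants and log factors suppressed),

$$V(\epsilon) \;=\; P\!\left(L(w) - L_0 < \epsilon\right) \;\sim\; c\,\epsilon^{\lambda},$$

so $-\log V(\epsilon) \approx \lambda \log(1/\epsilon) + \text{const}$: the RLCT $\lambda$ is the rate at which you pay log prior mass ("bits of specification") per order-of-magnitude improvement ("bit") of loss precision.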

The thing is, the "complexity of f" (your K(f)) is not a very meaningful concept from the point of view of a neural net's learning (you can try to make sense of it by looking at something like the entropy of the weight-to-function mapping, but then it won't interact that much with learning dynamics). I think if you follow your intuitions carefully, you're likely to end up arriving precisely at something like the RLCT (or maybe a finite-order approximation of the RLCT, associated to the free energy).

I have some criticisms of how SLT is understood and communicated, but the ones you mention don't seem that important to me. In particular, my intuition is that, for purposes of empirical measurement of SLT parameters, realistic networks are close enough to the large-sample limit to see approximate singularities in the learning landscape, and that the SGD-vs-sampling distinction is much more important than many people realize (indeed, there is no way to explain why generalizable networks like modular addition still sometimes memorize without understanding that the two are very distinct).

My main update in this field is that people should be more guided by empiricism and experiments, and less by competing paradigms of learning, which tend to be oversimplified and to fail to account for messy behaviors of even very simple toy networks. I've been pleasantly surprised by SLT making the same update in recent months.

Comment by Dmitry Vaintrob (dmitry-vaintrob) on Grokking, memorization, and generalization — a discussion · 2023-10-31T23:09:23.736Z · LW · GW

Interesting - what SLT prediction do you think is relevant here?

Comment by Dmitry Vaintrob (dmitry-vaintrob) on Grokking, memorization, and generalization — a discussion · 2023-10-31T23:07:57.171Z · LW · GW

Noticed that I didn't answer Kaarel's question there in a satisfactory way. Yeah - "basin" here is meant very informally, as a local piece of the loss landscape with lower loss than the rest of the landscape, surrounding a subspace of weight space corresponding to a circuit being on. Nina and I actually call this a "valley" in our "low-hanging fruit" post.

By "smaller" vs. "larger" basins I roughly mean the same thing as the notion of "efficiency" that we later discuss

Comment by Dmitry Vaintrob (dmitry-vaintrob) on Grokking, memorization, and generalization — a discussion · 2023-10-31T23:01:31.570Z · LW · GW

In particular, in most unregularized models that we see generalize (and I think also the ones in omnigrok), grokking happens early, usually before full memorization (so it's "grokking" in the redefinition I gave above).

Comment by Dmitry Vaintrob (dmitry-vaintrob) on Investigating the learning coefficient of modular addition: hackathon project · 2023-10-17T21:30:53.560Z · LW · GW

Oh, I can see how this could be confusing. We're sampling, at every step, in the orthogonal complement to the gradient taken at initialization ("initialization" here refers to the beginning of sampling; i.e., we don't update the normal vector during sampling). And the reason to do this is that we're hoping to prevent the sampler from quickly leaving the unstable point and jumping into a lower-loss basin (by restricting to this subspace, we are guaranteeing that the unstable point is a critical point).
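
For concreteness, here is a minimal sketch of what I mean, with made-up names (`grad_loss`, the step sizes) and a generic SGLD-style update standing in for whatever sampler the project actually uses:

```python
import numpy as np

def project_out(normal, step):
    """Remove the component of `step` along `normal`."""
    n = normal / np.linalg.norm(normal)
    return step - np.dot(step, n) * n

def constrained_sample(w0, grad_loss, n_steps=1000, eps=1e-4, beta=1.0, seed=0):
    """SGLD-style sampling restricted to the hyperplane orthogonal to the loss
    gradient at the starting point w0; the normal vector is computed once at
    initialization and never updated during sampling."""
    rng = np.random.default_rng(seed)
    normal = grad_loss(w0)              # fixed normal direction
    w = w0.copy()
    samples = []
    for _ in range(n_steps):
        step = -eps * grad_loss(w) + np.sqrt(2 * eps / beta) * rng.standard_normal(w.shape)
        w = w + project_out(normal, step)   # stay in the orthogonal complement
        samples.append(w.copy())
    return np.array(samples)
```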

Comment by Dmitry Vaintrob (dmitry-vaintrob) on The "best predictor is malicious optimiser" problem · 2020-07-29T19:54:26.756Z · LW · GW

Sorry, I misread this. I read your question as O outputting some function T that is most likely to answer some set of questions you want to know the answer to (which would be self-referential as these questions depend on the output of T). I think I understand your question now.

What kind of ability do you have to know the "true value" of your sequence B?

If the paperclip maximizer P is able to control the value of your turing machine, and if you are a one-boxing AI (and this is known to P) then of course you can make deals/communicate with P. In particular, if the sequence B is generated by some known but slow program, you can try to set up an Arthur-Merlin zero knowledge proof protocol in exchange for promising to make a few paperclips, which you can then use to keep P honest (after making the paperclips as promised).

To be clear though, this is a strategy for an agent A that somehow has as its goals only the desire to compute B together with some kind of commitment to following through on agreements. If A is genuinely aligned with humans, the rule "don't communicate/make deals with malicious superintelligent entities, at least until you have satisfactorily solved the AI in a box and similar underlying problems" should be a no-brainer.

Comment by Dmitry Vaintrob (dmitry-vaintrob) on The "best predictor is malicious optimiser" problem · 2020-07-29T14:24:26.946Z · LW · GW

Looks like you're making a logical error. Creating a machine that solves the halting problem is prohibited by logic. For many applications, assuming a sufficiently powerful and logically consistent oracle is good enough, but precisely these kinds of games you are playing, where you ask a machine to predict its own output / the output of a system involving itself, are where you get logical inconsistency. Indeed, imagine asking the oracle to simulate an equivalent version of itself and to output the opposite answer to what its simulation outputs. This may seem like a derived question, but most "interesting" self-referential questions boil down to an instance of it. I think once you fix the logical inconsistency, you're left with a problem equivalent to AI-in-a-box: the boxed AI P is stronger than the friendly AI A but has an agenda.
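
For illustration, here is that diagonalization written as (pseudo-)Python, treating a program and its source text interchangeably; `halts` is the hypothetical oracle:

```python
def make_diagonal(halts):
    """Given a claimed halting oracle halts(source, input) -> bool, build a
    program that does the opposite of what the oracle predicts it will do
    when run on its own source."""
    def diagonal(source):
        if halts(source, source):   # oracle says: "this halts"
            while True:             # ...so loop forever instead
                pass
        else:                       # oracle says: "this runs forever"
            return                  # ...so halt immediately
    return diagonal

# Feeding `diagonal` its own source yields a contradiction: whatever the
# oracle answers about it is wrong, so no such consistent oracle exists.
```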

Alternatively, if you're assuming A is itself unaligned (rather than friendly) and has the goal of getting the right answer at any cost, then it looks like you need some more assumptions on A's structure. For example, if A is sufficiently sophisticated and knows it has access to a much more powerful but untrustworthy oracle, it might know to implement a Merlin-Arthur protocol.

Comment by Dmitry Vaintrob (dmitry-vaintrob) on Research on repurposing filter products for masks? · 2020-04-03T21:10:11.536Z · LW · GW

Not sure, but I doubt it: IIRC, copper kills by catalysing intra-cellular reactions, which are slow (compared to salt, which should have a near-instantaneous mechanism of action since it can blow up membranes). Also, I would be worried about the safety of breathing copper. But I might be wrong about this!

Comment by Dmitry Vaintrob (dmitry-vaintrob) on Research on repurposing filter products for masks? · 2020-04-03T18:09:51.211Z · LW · GW

I've looked at a small amount of data on this question. I think it's a really important one (see a related question of mine), but I am extremely not an expert. The most actionable item is this study, which finds that essentially "salting" a surgical mask might make it significantly more protective against flu viruses. The study's in vivo section with mice strikes me as a bit sketchy (small n, and it's unclear how representative of mask filtration their mouse procedure actually is), but the in vitro section seems legit, and the study is in Scientific Reports (part of the Nature publishing group). If you're making a DIY mask/filter and it's not too thick already, it can't hurt to include a salted layer. The proposed mechanism of action is the salt killing the virus particles, not filtering them, so it should stack well with particulate filters. The recipe in the paper is to coat a hydrophobic filter in a solution of salt and surfactant (they used polysorbate 20, which is approved for use as a food additive), then let it dry.

Comment by Dmitry Vaintrob (dmitry-vaintrob) on What will happen to supply chains in the era of COVID-19? · 2020-03-31T14:24:56.190Z · LW · GW

What makes you say England did not have looting during WW2? England had more cohesion, but that is just one factor impacting people's behavior. Someone who is desperate or immoral enough to loot in wartime is unlikely to be seriously swayed by the need for patriotic unity. Other factors, which I think are bigger, are severity of need and enforcement. I don't know about enforcement, but it is very hard for me to envision a scenario where meeting basic needs is harder than it was in WW2 Britain.

Comment by Dmitry Vaintrob (dmitry-vaintrob) on What will happen to supply chains in the era of COVID-19? · 2020-03-31T04:19:26.066Z · LW · GW

I've done a little research about the food supply chain specifically. Presumably certain supply chains will be similar and others will be different. Also note that I am very much not an expert. The basic fact is that there is "enough food", but prices may rise and getting food may become more difficult. I think there are three key parameters, which could go either way:

(1) Hoarding/instability. Worst case scenario: people panic. People stockpile giant supplies of food. Food goes bad. People buy more food. Food gets prohibitively expensive. Best case scenario: supermarket situation stabilizes, panicky people feel like they have enough non-perishables stockpiled, most last-mile (grocery store) product shortages stop.

(2) Protectionism. This will be less dangerous in the US, which exports more food than it imports. But certain countries, especially poorer countries that rely significantly on imports, will suffer if a global panic causes protectionist policies about food (e.g. wheat exporter Kazakhstan apparently stopped exporting grain because of coronavirus fears; see this article). This is understandable, but probably bad. Here the best case, according to this article, is if big markets actively work to stabilize the market and punish protectionism (but the economics here is above my pay grade).

(3) Worker/driver issues. This mostly depends on "how freaked out blue-collar workers get". Currently most truck drivers, clerks, etc., are risking infection in exchange for a steady job. If things get bad (for example, if there are widespread hospital bed shortages and the fatality rate goes through the roof) *and younger people become afraid* (a big if), a big proportion of supply-chain workers will take losing their job over getting infected. This would probably raise prices.

It's important to stress that it's *very unlikely* that anything catastrophic happens in developed countries like the US, and that the worst-case scenario is government rationing. The example to keep in mind is WW2 Britain (I originally linked the wrong article here, which is also an interesting read). Nevertheless, with rationing, people survived basically healthy through several years of war.

Comment by Dmitry Vaintrob (dmitry-vaintrob) on A Significant Portion of COVID-19 Transmission Is Presymptomatic · 2020-03-18T19:10:19.720Z · LW · GW

A question I always have about these studies is at what level symptoms are defined and self-reported. E.g. presumably "you have an itchy throat or a mild headache in the morning / a mildly increased fever over your baseline" counts as pre-symptomatic. Self-isolating with mild symptoms is probably hard to measure, but it can at least be socially enforced.

Comment by Dmitry Vaintrob (dmitry-vaintrob) on Reasons why coronavirus mortality of young adults may be underestimated. · 2020-03-16T14:44:01.968Z · LW · GW

The DP cruise didn't have any fatalities under age 70, so I'm not sure where you're getting the under-29 number. Also, since that population skews older, the raw case fatality rate is an over-estimate. This study https://cmmid.github.io/topics/covid19/severity/diamond_cruise_cfr_estimates.html?fbclid=IwAR2jCOZcBGHYBWC_dqSzwvX7T7-DOpwm8L84qqW8k6QtKa05Inv35Pk3Ezs estimates the adjusted CFR from DP cruise ship data (assuming treatment!) to be 0.5%, largely in agreement with other numbers I'd heard. Though the sample size is ridiculously small, so the error bounds are terrible.

Comment by Dmitry Vaintrob (dmitry-vaintrob) on Coronavirus: Justified Practical Advice Thread · 2020-03-13T05:00:13.600Z · LW · GW

Advice: drink a mouthful of water every 15 minutes. This is speculative (a Facebook post from a friend of a friend). The rationale is that if you have virus particles in your mouth, rinsing them into your stomach (where the stomach acid kills them) will prevent them from getting into your respiratory system. [edit: retracted, this seems to be downstream of a fake news article. Drinking water is still good, but it looks like this pathway is not realistic]

Comment by Dmitry Vaintrob (dmitry-vaintrob) on Coronavirus: Justified Practical Advice Thread · 2020-03-13T02:14:10.907Z · LW · GW

Advice: now may be a good time to learn to meditate. Deaths from coronavirus are due mostly to breathing problems from pneumonia, which is the main explanation for why older people are more likely to die. There is evidence that meditation is good for pneumonia specifically http://www.annfammed.org/content/10/4/337.full and lowers oxygen consumption generally https://journals.sagepub.com/doi/full/10.1177/2156587213492770. I didn't read the studies carefully to see how trustworthy they are, but this conforms well with my understanding and limited experience of meditation. Meditation is also known to be good for mitigating stress, which will obviously be beneficial in the coming months.