There are, but what does having a length below 10^90 have to do with the Solomonoff prior? There's no upper bound on the length of programs.
Yes, you are missing something.
Any DEADCODE that can be added to a 1kb program can also be added to a 2kb program. The net effect is a wash, and you end up with the same ratio between the priors as before.
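To spell out the counting sketch (roughly, ignoring prefix-free coding details): writing $2^{-\ell}$ for the weight of a length-$\ell$ program and $d$ for a given piece of DEADCODE,
$$\frac{\sum_d 2^{-(|p_1| + |d|)}}{\sum_d 2^{-(|p_2| + |d|)}} = \frac{2^{-|p_1|} \sum_d 2^{-|d|}}{2^{-|p_2|} \sum_d 2^{-|d|}} = 2^{-(|p_1| - |p_2|)},$$
so the padded versions of the 1kb and 2kb programs keep the same prior ratio as the unpadded ones.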
Thirder here (with acknowledgement that the real answer is to taboo 'probability' and figure out why we actually care)
The subjective indistinguishability of the two Tails wakeups is not a counterargument - it's part of the basic premise of the problem. If the two wakeups were distinguishable, being a halfer would be the right answer (for the first wakeup).
Your simplified examples/analogies really depend on that fact of distinguishability. Since you didn't specify whether the wakeups are distinguishable in your examples, the payoff structure is ambiguous.
I'll also note you are being a little loose with your notion of 'payoff'. You are calculating the payoff for the entire experiment, whereas I define the 'payoff' in terms of the odds offered at each wakeup (since there's no rule saying that Beauty has to bet the same way each time!).
To be concise, here's my overall rationale:
Upon each (indistinguishable) wakeup, you are given the following offer:
- If you bet H and win, you get $N$ dollars.
- If you bet T and win, you get $1$ dollar.
If you believe T yields a higher EV, then you have a credence $P(T) > \frac{N}{N+1}$.
You get a higher expected payout betting T for all $N$ up to 2, so $P(T) = \frac{2}{3}$. Thus you should be a thirder.
Here's a clarifying example where this interpretation becomes more useful than yours:
The experimenter flips a second coin. If the second coin is Heads (H2), then $N = 1.50$ on Monday and $N = 2.50$ on Tuesday. If the second coin is Tails, then the order is reversed.
I'll maximize my EV if I bet T when $N = 1.50$, and H when $N = 2.50$. Both of these fall cleanly out of 'thirder' logic.
What's the 'halfer' story here? Your earlier logic doesn't allow for separate bets on each awakening.
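To make the policy comparison concrete, here's a quick Monte Carlo sketch of the second-coin variant. The per-awakening payouts ($N$ for a winning H bet, $1 for a winning T bet) are my reading of the offer above:

```python
import random

def run(trials=100_000):
    """Monte Carlo sketch of the second-coin variant described above.
    Assumed payouts: a winning bet on H pays N, a winning bet on T pays 1, per awakening."""
    thirder_policy = 0.0   # bet T when N == 1.50, bet H when N == 2.50
    always_heads = 0.0
    always_tails = 0.0
    for _ in range(trials):
        heads = random.random() < 0.5        # coin that decides the wakeups
        h2 = random.random() < 0.5           # second coin: decides the payout order
        days = ["Mon"] if heads else ["Mon", "Tue"]
        for day in days:
            if h2:
                n = 1.50 if day == "Mon" else 2.50
            else:
                n = 2.50 if day == "Mon" else 1.50
            if n == 2.50:                    # thirder policy: bet H at the better odds...
                thirder_policy += n if heads else 0.0
            else:                            # ...and T at the worse odds
                thirder_policy += 0.0 if heads else 1.0
            always_heads += n if heads else 0.0
            always_tails += 0.0 if heads else 1.0
    print("thirder policy:", thirder_policy / trials)   # ~1.125 per experiment
    print("always bet H  :", always_heads / trials)     # ~1.0
    print("always bet T  :", always_tails / trials)     # ~1.0

run()
```

The awakening-by-awakening policy beats either fixed bet, which is the point: the per-wakeup odds are doing real work.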
Thanks for sharing that study. It looks like your team is already well-versed in this subject!
You wouldn't want something that's too hard to extract, but I think restricting yourself to a single encoder layer is too conservative - LLMs don't have to be able to fully extract the information from a layer in a single step.
I'd be curious to see how much closer a two-layer encoder would get to the ITO results.
Here's my longer reply.
I'm extremely excited by the work on SAEs and their potential for interpretability. However, I think there is a subtle misalignment between the SAE architecture and loss function on the one hand, and the actual desired objective function on the other.
The SAE loss function is:
$$\mathcal{L}_{\text{SAE}} = \lVert x - W_d z \rVert_2^2 + \lambda \lVert z \rVert_1, \qquad z = \mathrm{ReLU}(W_e x + b_e),$$
where $\lVert \cdot \rVert_1$ is the $L_1$-norm (or $\lVert \cdot \rVert_0$, depending on the variant).
However, I would argue that what you are actually trying to solve is the sparse coding problem:
$$\min_{W_d} \sum_{x} \Big[ \min_{z} \; \lVert x - W_d z \rVert_2^2 + \lambda \lVert z \rVert_1 \Big],$$
where, importantly, the inner optimization is solved separately (including at runtime).
Since $W_d$ is an overcomplete basis, finding the $z^*$ that minimizes the inner loop (also known as basis pursuit denoising[1]) is a notoriously challenging problem, one which a single-layer encoder is underpowered to solve. The SAE's encoder thus introduces a significant error $\epsilon(x) = z_{\text{enc}}(x) - z^*(x)$, which means that your actual loss function is:
$$\mathcal{L} = \lVert x - W_d\,(z^* + \epsilon) \rVert_2^2 + \lambda \lVert z^* + \epsilon \rVert_1.$$
The magnitude of the error would have to be determined empirically, but I suspect it is large enough to be a significant source of reconstruction error.
There are a few things you could do to reduce the error:
- Ensuring that $W_d$ obeys the restricted isometry property[2] (i.e. a cap on the cosine similarity of decoder weights), or barring that, adding a term to your loss function that at least minimizes the cosine similarities.
- Adding extra layers to your encoder, so it's better at solving for $z^*$.
- Empirical studies to see how large the feature error is / how much reconstruction error it is adding.
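To make the encoder-error point concrete, here's a minimal sketch of how you could measure $\epsilon$ empirically: compare the one-shot encoder output against an iterative solver (plain ISTA here) for the inner basis-pursuit-denoising problem. The toy dimensions, random untrained weights, and tied encoder are all just assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, lam = 64, 256, 0.1                      # input dim, dictionary size, L1 penalty

W_d = rng.normal(size=(d, m)) / np.sqrt(d)    # decoder / dictionary (columns ~ features)
W_e, b_e = W_d.T.copy(), -0.05                # single-layer encoder (tied weights, untrained)
x = W_d[:, rng.choice(m, 5, replace=False)] @ rng.uniform(0.5, 1.5, 5)  # 5-sparse input

def encoder(x):
    """One-shot SAE encoder: ReLU(W_e x + b_e)."""
    return np.maximum(W_e @ x + b_e, 0.0)

def ista(x, n_steps=500):
    """Iterative shrinkage-thresholding: approximately solves
    min_z ||x - W_d z||^2 + lam * ||z||_1 (non-negative variant)."""
    L = np.linalg.norm(W_d, 2) ** 2            # spectral norm squared (step-size control)
    z = np.zeros(m)
    for _ in range(n_steps):
        grad = W_d.T @ (W_d @ z - x)
        z = np.maximum(z - grad / L - lam / (2 * L), 0.0)   # gradient step + shrinkage
    return z

z_enc, z_star = encoder(x), ista(x)
print("encoder reconstruction error:", np.linalg.norm(x - W_d @ z_enc))
print("ISTA    reconstruction error:", np.linalg.norm(x - W_d @ z_star))
print("encoder error ||eps||       :", np.linalg.norm(z_enc - z_star))
```

Running the same comparison on a trained SAE and real activations would give you a direct estimate of how much reconstruction error the one-shot encoder is leaving on the table.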
This is great work. My recommendation: add a term in your loss function that penalizes features with high cosine similarity.
I think there is a strong theoretical underpinning for the results you are seeing.
I might try to reach out directly - some of my own academic work is directly relevant here.
This is one of those cases where it might be useful to list out all the pros and cons of taking the 8 courses in question, and then thinking hard about which benefits could be achieved by other means.
Key benefits of taking a course (vs. independent study) beyond the signaling effect might include:
- precommitting to learning a certain body of knowledge
- curation of that body of knowledge by an experienced third party
- additional learning and insight from partnerships / teamwork / office hours
But these depend on the courses and your personality. The precommitment might be unnecessary due to your personal work habits, the curation might be misaligned with what you are interested in learning, and the other students or TAs may not have useful insights that you couldn't figure out on your own.
Hope that helps.
Instead of demanding orthogonal representations, just have them obey the restricted isometry property.
Basically, instead of requiring $\langle f_i, f_j \rangle = 0$ for all $i \neq j$, we just require $|\langle f_i, f_j \rangle| \leq \epsilon$.
This would allow a polynomial number of sparse shards while still allowing full recovery.
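As a quick numerical illustration (random unit vectors and arbitrary dimensions, not a claim about any particular model), you can fit many more nearly-orthogonal directions than the ambient dimension while keeping pairwise cosine similarity small:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 128, 1024                                  # ambient dimension, number of "shard" directions

V = rng.normal(size=(n, d))
V /= np.linalg.norm(V, axis=1, keepdims=True)     # random unit vectors

G = np.abs(V @ V.T)                               # pairwise |cosine similarity|
np.fill_diagonal(G, 0.0)
print(f"{n} directions in R^{d}: max |cos| = {G.max():.3f}")  # well below 1,
                                                              # with far more directions than dimensions
```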
I think the success or failure of this model really depends on the nature and number of the factions. If interfactional competition gets too zero-sum (this might help us, but it helps them more, so we'll oppose it) then this just turns into stasis.
During ordinary times, vetocracy might be tolerable, but it will slowly degrade state capacity. During a crisis it can be fatal.
Even in America, we only see this factional veto in play in a subset of scenarios - legislation under divided government. Plenty of action at the executive level or in state governments doesn't have to worry about this.
You switch positions throughout the essay, sometimes in the same sentence!
"Completely remove efficacy testing requirements" (Motte) "... making the FDA a non-binding consumer protection and labeling agency" (Bailey)
"Restrict the FDA's mandatory authority to labeling" logically implies they can't regulate drug safety, and can't order recalls of dangerous products. Bailey! "... and make their efficacy testing completely non-binding" back to Motte again.
"Pharmaceutical manufactures can go through the FDA testing process and get the official “approved’ label if insurers, doctors, or patients demand it, but its not necessary to sell their treatment." Again implies the FDA has no safety regulatory powers.
"Scott’s proposal is reasonable and would be an improvement over the status quo, but it’s not better than the more hardline proposal to strip the FDA of its regulatory powers." Bailey again!
This is a Motte and Bailey argument.
The Motte is 'remove the FDA's ability to regulate drugs for efficacy'
The Bailey is 'remove the FDA's ability to regulate drugs at all'
The FDA doesn't just regulate drugs for efficacy, it regulates them for safety too. This undercuts your arguments about off-label prescriptions, which were still approved for use by the FDA as safe.
Relatedly, I'll note you did not address Scott's point on factory safety.
If you actually want to make the hardline position convincing, you need to clearly state and defend that the FDA should not regulate drugs for safety.
The differentiation between CDT as a decision theory and FDT as a policy theory is very helpful at dispelling confusion. Well done.
However, why do you consider EDT a policy theory? It's just picking actions with the highest conditional utility. It does not model a 'policy' in the optimization equation.
Also, the ladder analogy here is unintuitive.
This doesn't make sense to me. Why am I not allowed to update on still being in the game?
I noticed that in your problem setup you deliberately removed n=6 from being in the prior distribution. That feels like cheating to me - it seems like a perfectly valid hypothesis.
After seeing the first chamber come up empty, that should definitively update me away from n=6. Why can't I update away from n=5 ?
Counterpoint, robotaxis already exist: https://www.nytimes.com/2023/08/10/technology/driverless-cars-san-francisco.html
You should probably update your priors.
Nope.
According to the CDC pulse survey you linked (https://www.cdc.gov/nchs/covid19/pulse/long-covid.htm), the long covid metrics are trending down. This includes the 'currently experiencing', 'any limitations', and 'significant limitations' categories.
How is this in the wrong place?
Nice. This also matches my earlier observation that the epistemic failure is in not anticipating one's change in values. If you do anticipate it, you won't agree to this money pump.
I agree that the type of rationalization you've described is often practically rational. And it's at most a minor crime against epistemic rationality. If anything, the epistemic crime here is not anticipating that your preferences will change after you've made a choice.
However, I don't think this case is what people have in mind when they critique rationalization.
The more central case is when we rationalize decisions that affect other people; for example, Alice might make a decision that maximizes her preferences and disregards Bob's, but after the fact she'll invent reasons that make her decision appear less callous: "I thought Bob would want me to do it!"
While this behavior might be practically rational from Alice's selfish perspective, she's being epistemically unvirtuous by lying to Bob, degrading his ability to predict her future behavior.
Maybe you can use specific terminology to differentiate your case from the more central one - "preference rationalization", say?
I can use a laptop to hammer in a nail, but it's probably not the fastest or most reliable way to do so.
I don't see how this is more of a risk for a shutdown-seeking goal, than it is for any other utility function that depends on human behavior.
If anything, the right move here is for humans to commit to immediately complying with plausible threats from the shutdown-seeking AI (by shutting it down). Sure, this destroys the immediate utility of the AI, but on the other hand it drives a very beneficial higher level dynamic, pushing towards better and better alignment over time.
That assumption literally changes the nature of the problem, because the offer to bet is information that you are using to update your posterior probability.
You can repair that problem by always offering the bet and ignoring one of the bets on tails. But of course that feels like cheating - I think most people would agree that if the odds makers are consistently ignoring bets on one side, then the odds no longer reflect the underlying probability.
Maybe there's another formulation that gives 1:1 odds, but I can't think of it.
To the second point, because humans are already general intelligences.
But more seriously, I think the monolithic AI approach will ultimately be uncompetitive with modular AI for real life applications. Modular AI dramatically reduces the search space. And I would contend that prediction over complex real life systems over long-term timescales will always be data-starved. Therefore being able to reduce your search space will be a critical competitive advantage, and worth the hit from having suboptimal interfaces.
Why is this relevant for alignment? Because you can train and evaluate the AI modules independently, individually they are much less intelligent and less likely to be deceptive, you can monitor their communications, etc.
I take issue with the initial supposition:
- How could the AI gain practical understanding of long-term planning if it's only trained on short time scales?
- Writing code, how servers work, and how users behave seem like very different types of knowledge, operating with very different feedback mechanisms and learning rules. Why would you use a single, monolithic 'AI' to do all three?
My weak prediction is that adding low levels of noise would change the polysemantic activations, but not the monosemantic ones.
Adding L1 to the loss allows the network to converge on solutions that are more monosemantic than otherwise, at the cost of some estimation error. Basically, the network is less likely to lean on polysemantic neurons to make up small errors. I think your best bet is to apply the L1 loss on the hidden layer and the output layer activations.
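For concreteness, here's a minimal sketch of what I mean (PyTorch-style; the toy architecture and coefficients are placeholders, not recommended values):

```python
import torch
import torch.nn as nn

class ToyAutoencoder(nn.Module):
    def __init__(self, d_in=64, d_hidden=256):
        super().__init__()
        self.enc = nn.Linear(d_in, d_hidden)
        self.dec = nn.Linear(d_hidden, d_in)

    def forward(self, x):
        h = torch.relu(self.enc(x))       # hidden-layer activations
        return self.dec(h), h

model = ToyAutoencoder()
x = torch.randn(32, 64)
x_hat, h = model(x)

l1_coeff = 1e-3
loss = (
    torch.mean((x - x_hat) ** 2)          # reconstruction error
    + l1_coeff * h.abs().mean()           # L1 on hidden-layer activations
    + l1_coeff * x_hat.abs().mean()       # L1 on output-layer activations
)
loss.backward()
```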
I've been thinking along very similar lines, and would probably generalize even further:
Hypothesis: All DNNs thus far developed are basically limited to system-1 like reasoning.
Great stuff!
Do you have results with noisy inputs?
The negative bias lines up well with previous sparse coding implementations: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=JHuo2D0AAAAJ&citation_for_view=JHuo2D0AAAAJ:u-x6o8ySG0sC
Note that in that research, the negative bias has a couple of meanings/implications:
- It should correspond to the noise level in your input channel.
- Higher negative biases directly contribute to the sparsity/monosemanticty of the network.
Along those lines, you might be able to further improve monosemanticity by using the lasso loss function.
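To illustrate the connection (a toy single-feature example with made-up numbers): the lasso solution for one unit-norm feature is a soft-threshold of its projection, which is exactly a ReLU with a negative bias set by the penalty / assumed noise level:

```python
import numpy as np

def lasso_single_feature(x, w, lam):
    """Solve min_a 0.5*||x - a*w||^2 + lam*|a| for a unit-norm feature w, with a >= 0.
    The closed-form solution is a shifted ReLU: max(w.x - lam, 0)."""
    return max(w @ x - lam, 0.0)

rng = np.random.default_rng(0)
w = rng.normal(size=16)
w /= np.linalg.norm(w)

x_signal = 0.8 * w + 0.05 * rng.normal(size=16)   # feature present, small noise
x_noise  = 0.05 * rng.normal(size=16)             # noise only

lam = 0.2   # plays the role of the negative bias / assumed noise level
print(lasso_single_feature(x_signal, w, lam))     # ~0.6: feature detected
print(lasso_single_feature(x_noise,  w, lam))     # typically 0.0: noise falls below the threshold
```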
Yes, but that was decades ago, when Yeltsin was president! The 'union state' has been moribund since the early aughts.
I have some technical background in neuromorphic AI.
There are certainly things that the current deep learning paradigm is bad at which are critical to animal intelligence: e.g. power efficiency, highly recurrent networks, and complex internal dynamics.
It's unclear to me whether any of these are necessary for AGI. Something, something executive function and global workspace theory?
I once would have said that feedback circuits used in the sensory cortex for predictive coding were a vital component, but apparently transformers can do similar tasks using purely feedforward methods.
My guess is that the scale and technology lead of DL is sufficient that it will hit AGI first, even if a more neuro way might be orders of magnitude more computationally efficient.
Where neuro AI is most useful in the near future is for embodied sensing and control, especially with limited compute or power. However, those constraints would seem to drastically curtail the potential for AGI.
If the world's governments decided tomorrow that RL was top-secret military technology (similar to nuclear weapons tech, for example), how much time would that buy us, if any? (Feel free to pick a different gateway technology for AGI, RL just seems like the most salient descriptor).
In my model, Chevron and the US military are probably open to AI governance, because: 1 - they are institutions traditionally enmeshed in larger cooperative/rule-of-law systems, AND 2 - their leadership is unlikely to believe they can do AI 'better' than the larger AI community.
My worry is instead about criminal organizations and 'anti-social' states (e.g. North Korea) because of #1, and big tech because of #2.
Because of location, EA can (and should) make decent connections with US big tech. I think the bigger challenge will be tech companies in other countries, especially China.
I published an article on induction https://www.lesswrong.com/posts/7x4eGxXL5DMwRwzDQ/commensurable-scientific-paradigms-or-computable-induction of decent length/complexity that seems to have gotten no visibility at all, which I found very discouraging for my desire to ever do so again. I could only find it by checking my user profile!
I'm downvoting this, not because it's wrong or because of weak epistemics, but because politics is the mind killer, and this article is deliberately structured to make that worse.
I believe politically sensitive topics like this can be addressed on LessWrong, but the inflammatory headline and first sentence here are just clickbait.
Articles are hard! I was lucky enough to be raised bilingual, so I'm somewhat adept at navigating between different article schemes. I won't claim these are hard and fast rules in English, but:
1 - 'Curiosity' is an abstract noun (e.g. liberty, anger, parsimony). These generally don't have articles, unless you need some reason to distinguish between subcategories (e.g. 'the liberty of the yard' vs. 'the liberty of the French')
2 - 'Context' can refer to either a specific context (e.g. 'see in the proper context'), in which case the articles are included, or the broad category (e.g. 'context is everything'). 'See in the context' is not ungrammatical, but it's usually awkward, because without an adjective it's unclear which context you are talking about. (And if you were referring to one that was previously established, you would use 'that context' or 'this context'.) However, in the particular case of the button, 'see in the context' would be acceptable, because the identity of 'the context' is clear! I doubt a native English speaker would say that, though, because it's not idiomatic.
3 - 'hide the previous comment' is actually correct here! However, in human-machine interfaces, articles, prepositions, and pronouns are often omitted to save space/mental effort.
I'm confused.
In the counterfactual where lesswrong had the epistemic and moderation standards you desire, what would have been the result of the three posts in question, say three days after they were first posted? Can you explain why, using the standards you elucidated here?
(If you've answered this elsewhere, I apologize).
Full disclosure: I read all three of those posts, and downvoted the third post (and only that one), influenced in part by some of the comments to that post.
"However there’s definitely an additional problem, which is that the fees are going to the city."
Money which the city could presumably use to purchase scarce and vital longshoreman labor.
The city is getting a windfall because it owns a scarce resource. Would you consider this a problem if the port were privately owned?
What Ryan is calling punishment is just an ECON 101 cost increase.
I'm actually ok with the social pressures inherent in the activity. It's a subtle reminder of the real influence of this community. The fact that this community would enforce a certain norm makes me more likely to be a conscientious objector in contexts with the opposite norm. (This is true of historical C.O.s, who often come from religious communities).
I'd highly recommend 'The Bomber Mafia' by Malcolm Gladwell on this subject, which details the internal debates of the US Army Air Corps generals during WWII.
One of the key questions was whether to use the bombers to target strategic industries, or just for general attrition (i.e. firebombing of civilians). Obviously the first one would have been preferable from a humanitarian perspective (and likely would have ended the European War sooner), but it was very difficult to execute in practice.
I think the Bob example is very informative! I think there's an intuitive and logical reason why we think Bob and Edward are worse off. Their happiness is contingent on the masquerade continuing, which has a probability less than one in any plausible setup.
(The only exception to this would be if we're analyzing their lives after they are dead)
Yes, I was completely turned off from 'debate' as a formal endeavor as a high schooler, despite my love for informal debate.
One of the main problems is that debate contests are usually formulated as zero sum, whereas the typical informal debate I engage in is not.
Do you know of any formats for nonzero sum debate competitions where the competitors argue points they actually believe in? e.g. both debaters get more points if they identify a double-crux, and you win by having more points in the tournament as a whole, not by beating your opponent.
I believe that determinism and free will are both good models of reality, albeit at different conceptual levels.
Human brains are high dimensional chaotic systems. I believe that if you put a very smart human in a task that demands creativity and insight, it will be extremely difficult to predict what they'll do, even if you precisely knew their connectome and data inputs. Maybe that's not the same thing as a philosophical "free will", but I don't see how it would result in a different end experience.
This chapter would make a great movie.
Russia's' has an extra quote.
Alice's explanation of the Bayesian model sounds like technobabble. Unless that was the intent, it could use a bit more elaboration.
Depends on the environment. My assumption is that the venue is sufficiently crowded that the tamperer would never be alone with the drink, and the main protection is their risk of being spotted.
A tamper proof solution would likely be far more costly to implement.
Lids and straws. Presumably this would make slipping a drug in way more obvious.
"Miriam placed poker her hand against" should be "Miriam placed her hand" or "poked her hand"
I think I agree. I hadn't realized the UK vaccination rates were so high. In that case I'll lean towards the pockets of unvaccinated reaching herd immunity + shorter incubation period hypothesis.
I agree that this seems to explain it, but it raises a new question: how did the antibody rate get so high? Is it possible that part of Delta's contagiousness is that it has a lot more carriers who don't get sick?
Good point! I'll edit my fermi analysis to reflect that.
Even in a scenario where all unvaccinated people were infected with covid, I would expect none of the Georgetown undergraduates to die from covid or get covid longer than 12 weeks.
Here's my fermi analysis:
- in your 20s, covid CFR is .0001, compared to .01 for population as a whole.
- covid lasting longer than 12 weeks is .03 for the covid population as a whole.
- assume really long covid scales similarly to death and hospitalization
- mRNA reduces these both by .9.
That gives us .03 x .01 x .1 = .00003 for the rate of really long covid per case (where .01 is the age scaling implied by the CFR ratio above, and .1 is the vaccine factor), and .0001 x .1 = .00001 for the CFR.
.00003 x 6532 = .2 expected cases of really long covid
.00001 x 6532 = .07 expected deaths
And given that you are primarily interacting with other unvaccinated, young individuals, you are less likely to be infected than the average vaccinated person. So the real number is probably less than .1 person getting covid beyond 12 weeks.
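For concreteness, the same arithmetic as a short script (6,532 being the undergraduate count used above):

```python
cfr_20s           = 0.0001   # case fatality rate in your 20s
cfr_all           = 0.01     # case fatality rate, whole population
long_covid_all    = 0.03     # covid lasting >12 weeks, whole population
vaccine_reduction = 0.9      # assumed mRNA risk reduction
undergrads        = 6532

age_scaling = cfr_20s / cfr_all                    # 0.01, assuming really long covid scales like death
long_covid_rate = long_covid_all * age_scaling * (1 - vaccine_reduction)   # ~0.00003
death_rate      = cfr_20s * (1 - vaccine_reduction)                        # ~0.00001

print(long_covid_rate * undergrads)   # ~0.2 expected really-long-covid cases
print(death_rate * undergrads)        # ~0.07 expected deaths
```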
Let me know if you see errors in my reasoning.
He recommends that for communities, which presumably include significant numbers of unvaccinated folks. Which, if targeted to N95 or better masks, and actually enforced, could have a substantial effect!
But having members of the least infectious subpopulation voluntarily mask is pretty much useless.
As to your second point, there is strong evidence that is not the case: https://pubmed.ncbi.nlm.nih.gov/34250518/ Vaccinated individuals who get infected have substantially lower viral loads, and thus are substantially less contagious.
You reach the opposite conclusion from Tomas Pueyo (who seems to be your primary reference):
"If you’re vaccinated, you’re mostly safe, especially with mRNA vaccines. Keep your guard up for now, avoid events that might become super-spreaders, but you don’t need to worry much more than that."
Checking your math, I think your biggest error is equating long covid (at least one symptom still present after 28 days) with lifelong CFS. The vast majority seem to clear up in the next 8 weeks: https://www.nature.com/articles/s41591-021-01292-y
I believe the 64% reduction in symptomatic infections is an outlier (compare with the UK data, e.g.), and if you've had an mRNA vaccine the number is much higher.
Finally, not accounting for age in your long covid statistics is a mistake. Young people make up a large percentage of the infected because they are disproportionately unvaccinated. Those who are young and vaccinated are quite well protected from severe infection. And while some long covid comes from mild cases, it's highly correlated with severe cases.