LessWrong 2.0 Reader
Two things lead me to think human content online will soon become way more valuable.
The implication: make tons of digital stuff. Write, draw, voice-record, etc.
leogao on leogao's Shortform: when i was new to research, i wouldn't feel motivated to run any experiment that wouldn't make it into the paper. surely it's much more efficient to only run the experiments that people want to see in the paper, right?
now that i'm more experienced, i mostly think of experiments as something i do to convince myself that a claim is correct. once i get to that point, actually getting the final figures for the paper is the easy part. the hard part is finding something unobvious but true. with this mental frame, it feels very reasonable to run 20 experiments for every experiment that makes it into the paper.
minusgix on evhub's Shortform: 2b/2c. I think I would say that we should want a tyranny of the present to the extent that is in our values upon reflection. If, for example, Rome still existed and took over the world, their CEV should depend on their ethics and population. I think it would still be a very good utopia, but it may also have things we dislike.
Other considerations, like nearby Everett branches... well they don't exist in this branch? I would endorse game theoretical cooperation with them, but I'm skeptical of any more automatic cooperation than what we already have. That is, this sort of fairness is a part of our values, and CEV (if not adversarially hacked) should represent those already?
I don't think this would end up as a tyranny in anything like the usual sense of the word if we're actually implementing CEV. We have values for people being able to change and adjust over time, and so those are in the CEV.
There may very well be limits to how far we want humanity to change in general, but that's perfectly allowed to be in our values. Like, as a specific example, some have said that they think global status games will be vastly important in the far future, and thus that status will be a zero-sum resource. I find it decently likely that an AGI implementing CEV would discourage this, because humans wouldn't endorse it on reflection, even if it is a plausible default outcome.
Like, essentially my view is: optimize our branch's humanity's values as hard as possible; this contains desires for other people's values to be satisfied, and thus they're represented. Other forms of fairness, toward things we aren't completely fans of, can be bargained for (locally, or acausally between branches/whatever).
So that's my argument against the tyranny and Everett branches part. I'm less skeptical of considering whether to include the recently dead, but I also don't have a great theory of how to weight them. Those about to be born wouldn't have a notable effect on CEV, I'd believe.
The option you suggest in #3 is nice, though I think it runs some risks of being dominated or notably influenced by "humans in other very odd branches", and so we're outweighed by them despite them not locally existing. I think it is less that you want a human predicate, and more of a "human who has values compatible with this local branch". This is part of why I advocate just bargaining between branches: if the humans in an AGI-made New Rome want us to instantiate their constructed friendly/restricted AGI Gods locally to proselytize, they can trade for it rather than that faction being automatically divvied out a star by our AGI's CEV.
"Human who has values compatible with this local branch" feels weak as a definition, arbitrary, but I'm not sure we can do better than that. I imagine we'd even have weightings, because we likely legitimately value baby's in special ways that don't entail maxing out reward centers or boosting them to megaminds soon after birth, we have preferences about that. Then of course there's minds that are sortof humanish, which is why you'd have a weighting.
(This is kinda rambly, but I do think a lot of this can be avoided with just plain CEV because I think most people on reflection would end up with "reevaluate whether the deal was fair with reflection and then adjust the deal and reference class based on that".)
martin-randall on The Case Against AI Control Research: I like Wentworth's toy model, but I want it to have more numbers, so I made some up. This leads me to the opposite conclusion to Wentworth.
I think (2-20%) is pretty sensible for successful intentional scheming of early AGI.
Assume the Phase One Risk is 10%.
Superintelligence is extremely dangerous by (strong) default. It will kill us or at least permanently disempower us, with high probability, unless we solve some technical alignment problems before building it.
Assume the Phase Two Risk is 99%. Also:
The justification for these numbers is that each billion dollars buys us a "dignity point" aka +1 log-odds of survival. This assumes that both research fields are similarly neglected and tractable.
Therefore:
We should therefore spend billions of dollars on both AI control and AI alignment; both are very cost-efficient. This conclusion is robust to many different assumptions, provided that overall P(Doom) < 100%. So this model is not really a "case against AI control research".
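To make the arithmetic concrete, here is a minimal sketch of that toy model in Python. The 10%/99% phase risks and the "+1 log-odds of survival per billion dollars" assumption are taken from above; the log-odds base (natural log) and the specific spending levels are illustrative assumptions of mine, not part of the comment.

```python
# Minimal sketch of the toy model above. Each billion dollars spent on a
# phase is assumed to add +1 log-odds (natural log) of surviving that phase.
import math

def survival_after_spending(base_risk, billions):
    """Survival probability for one phase after spending `billions` dollars."""
    odds = (1 - base_risk) / base_risk       # odds of surviving the phase
    new_odds = odds * math.exp(billions)     # +1 log-odds per billion
    return new_odds / (1 + new_odds)

phase_one_risk = 0.10   # early-AGI scheming, addressed by control research
phase_two_risk = 0.99   # superintelligence, addressed by alignment research

for spend in (0, 1, 2, 5):
    p1 = survival_after_spending(phase_one_risk, spend)   # spend on control
    p2 = survival_after_spending(phase_two_risk, spend)   # spend on alignment
    print(f"${spend}B on each: P(survive phase 1)={p1:.3f}, "
          f"P(survive phase 2)={p2:.3f}, P(survive both)={p1 * p2:.3f}")
```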
nathan-helm-burger on Mikhail Samin's Shortform: It has been pretty clearly announced to the world by various tech leaders that they are explicitly spending billions of dollars to produce "new minds vastly smarter than any person, which pose double-digit risk of killing everyone on Earth". This pronouncement has not yet incited riots. I feel like discussing whether Anthropic should be on the riot-target-list is a conversation that should happen after the OpenAI/Microsoft, DeepMind/Google, and Chinese datacenters have been burnt to the ground.
Once those datacenters have been reduced to rubble, and the chip fabs also, then you can ask things like, "Now, with the pressure to race gone, will Anthropic proceed in a sufficiently safe way? Should we allow them to continue to exist?" I think that, at this point, one might very well decide that the company should continue to exist with some minimal amount of compute, while the majority of the compute is destroyed. I'm not sure it makes sense to have this conversation while OpenAI and DeepMind remain operational.
raemon on Raemon's Shortform: Someone just noted that the Review Voting widget might imply that the "Jan 5" end time is meant to be inclusive of a full 24 hours from now, which wasn't the intent. But given that people may have been expecting that, and that the consequences for Not Extending aren't particularly bad, I'm going to give people another 24 hours.
Meanwhile, I said it elsewhere but will say it again here: if there were any posts you were blocked from commenting on that you want to write a review of... I forgot to fix that in the code and it'll take a little while to fix, but meanwhile you can write a shortform comment and DM me (and I will manually set it to be a review of that post), and/or write a top level post and tag it with 2023 Longform Reviews [? · GW] (which will cause it to appear in the Frontpage Review widget).
nathan-helm-burger on Chris_Leong's Shortform: People have said that to get a good prompt it's better to have a discussion with a model like o3-mini, o1, or Claude first, and clarify various details about what you are imagining, then give the whole conversation as a prompt to OA Deep Research.
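A rough sketch of that workflow in Python, using the OpenAI chat API for the clarifying conversation. The model name, the canned answers, and the final hand-off step (flattening the transcript and pasting it into Deep Research) are illustrative assumptions rather than anything specified in the comment.

```python
# Sketch: iterate with a chat model to pin down details, then flatten the
# whole conversation into one prompt for Deep Research. Model name and the
# example answers below are assumptions; adapt to whatever you have access to.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set
messages = [
    {"role": "user", "content": (
        "I want a deep-research report on topic X. "
        "Ask me clarifying questions, one at a time, before writing anything."
    )},
]

# A few rounds of answering the model's clarifying questions (hypothetical answers).
for answer in ["Audience: technical readers", "Time frame: last five years", "Focus: empirical results"]:
    reply = client.chat.completions.create(model="o3-mini", messages=messages)
    messages.append({"role": "assistant", "content": reply.choices[0].message.content})
    messages.append({"role": "user", "content": answer})

# Flatten the exchange into a single prompt to paste into Deep Research.
full_prompt = "\n\n".join(f"{m['role'].upper()}: {m['content']}" for m in messages)
print(full_prompt)
```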
knight-lee on Mikhail Samin's Shortform: Even if building intelligence requires solving many many problems, preventing that intelligence from killing you may just require solving a single very hard problem. We may go from having no idea to having a very good idea.
I don't know. My view is that we can't be sure of these things.
jimrandomh on Nick Land: Orthogonality: Downvotes don't (necessarily) mean you broke the rules, per se, just that people think the post is low quality. I skimmed this, and it seemed like... a mix of edgy dark politics with poetic obscurantism?
cstinesublime on Learn to Develop Your Advantage: The second half of this post was rather disappointing. You certainly changed my mind on the seemingly orderly progression of learning from simple to harder with your example about chess. This reminds me of an explanation Ruby on Rails creator David Heinemeier Hansson gave about intentionally putting himself into a class of motor racing above his (then) abilities[1].
However, there was little detail or actionable advice about how to develop advantages, such as where to identify situations that are good for learning, least of all from perceived losses or weaknesses. For example:
...where we genuinely have all the necessary resources (including internal ones). At the very least, it’s useful to develop the skill of finishing tasks quickly and decisively when nothing is actually preventing us from doing so.
I would be hard-pressed to list any situations where I do have the necessary resources, internal or external, to finish the task but just not the inclination to do so promptly. Clean my bedroom, maybe? Certainly, if I gave you a list of things found on my bughunt [LW · GW], none of the high-value bugs would fit this criterion.
I also find the "Maximizing the Effective Use of Resources" section feels very much like "How to draw an owl: draw a circle, now draw the rest of the owl". I am aware that often the first idea we have isn't the best.
Except for me... it often is the best. I know because I have a tendency to commit quota filling. What I mean is, the first idea isn't great, but it's the best I have. All the subsequent ideas, even when I use creativity techniques like "saying no-nos" or removing all internal censors and not allowing myself to feel any embarrassment or shame for posing alternatives - none of them are demonstrably better than the first. In fact they devolve into assemblages of words, like a word salad, that seem to exist only for the purpose of ticking the box of "didn't come up with just one idea and use that, thought of other ideas."
Similarly, role-playing often doesn't work for me, because if I ask myself something like
“What resources and strategies would Harry James Potter-Evans-Verres / Professor Quirrell use in this situation?” If the answer is obvious, why not apply it?
There is never an obvious answer which is applicable to me. For example, I might well ask myself when on a music video set "How would Stanley Kubrick shoot this?" - and then remember that he had 6 days at his disposal to shoot a single lateral dolly track with an 18mm lens, and could do 50 takes if he wanted. I have 6 hours to shoot the rest of the entire video, only portrait-length lenses (55mm and 77mm), and don't have enough track to run a dolly move long enough to shoot it like Kubrick.
I suspect though that this needs to go further upstream - okay, how would Stanley Kubrick get resources to have the luxury of that shot? How would he get the backing of a major studio? Or perhaps more appropriately how would a contemporary music video director like Dave Myers or Hannah Lux Davis get their commissions?
But if I knew that, I'd be doing it. I don't know how they do it. That would involve drawing the rest of the owl.
With this in mind, how can I, like Heinemeier Hansson or your hypothetical chess student, push myself into higher classes and learn strategies to win?
And if his 2013 Le Mans results are anything to go by, it worked: his car came 8th overall and 1st in its class. He beat many ex-Formula One drivers, including race winner Giancarlo Fisichella (21st), podium placer and future WEC champion Kamui Kobayashi (20th), Karun Chandhok and Brendan Hartley (12th), and even Indy 500 winner Alessandro Rossi (23rd).