I'd give this a +9 if I could*. I've been using this technique for 7 years. I think it's clearly paid off in "clear, legible lessons about how to think." But the most interesting question is "did the subtler benefits pay off, in 7 years of practice?"
Let's start with the legible
This was essentially the first step on the path towards Feedbackloop-first Rationality. The basic idea here is "Watch your thoughts as they do their thinking. Notice where your thoughts could be better, and notice where they are particularly good. Do more of that."
When I've run this exercise for groups of 20 people, typically 1/4 of them report a noticeable effect size of "oh, that showed me an obvious way to improve my thinking." (I've done this 3x. I've also run it ~3 times for smaller groups where most people didn't seem to get it, which led me to eventually write Scaffolding for "Noticing Metacognition", which people seemed to have an easier time with)
I've picked up a lot of explicit cognitive tricks, via this feedbackloop. Some examples:
- "oh, I'm having trouble thinking because the problem is too complex, but that problem goes away when I get better working memory aids"
- "oh, I just spent 30 minutes planning out an elaborate series of tests. But, then the very first test failed in the dumbest way possible. If there are cheap tests, just do those first."
But, the essay promises more:
A small tweak to how your brain processes information in general is worth more than a big upgrade to your conscious repository of cognitive tricks.
[...] More creativity and good ideas just "popping into your head". There's no magic to it! Once you understand how the process works, it can be optimized for any purpose you choose.
Most people already have a thinking style built on top of excessive conscious cognitive effort. This often involves relying on side-effects of verbal and conscious thoughts, while mistakenly assigning the full credit for results to those effortful thoughts.
When you already have some conscious/verbal thoughts, it is tempting to imagine they are the only result of your thinking, and then try to pick up from there. But this is limiting, because the most power is in whatever generated that output.
It's not overwhelming enough to be obvious to others at this point (I did ask a few people "hey, uh, do I seem smarter to you in the past couple years?" and they said "a bit maybe, but, like not obviously? But I don't know that I would have really noticed"). But, I am subjectively fairly sure I've seen real progress here.
Here, at least, is my self-story, make of it what you will.
14 years ago, thinking strategically was generally hard for me (5 minutes of trying to think about a chess board or complex problem would give me a headache). I also didn't respond to crises very well in the moment. For my first several years in the rationalist community, I felt like I got dumber, because I learned the habit of "go ask the smarter people around me whenever I couldn't figure something out."
8 years ago, I began "thinking for real", for various reasons. One piece of that was doing the Tuning Your Cognitive Strategies exercise for the first time, and then sporadically practicing at the skill "notice my thoughts as they're happening, and notice when particularly good thoughts are happening."
6 years ago, a smart colleague I respected did tell me "hey, you seem kinda smarter than you used to." (They brought this up in response to some comments of mine that made it a more reasonable thing to say)
More recently, I've noticed at the workshops I've run that, although there are people around who are, in many senses, smarter and more knowledgeable than me, they found certain types of metacognitive thoughts more effortful and unnatural than they seemed to me. It was pretty common for me to spend 5 minutes directing my attention at a problem and have approaches just sort of naturally occur to me, where some participants would have to struggle for 30-60 minutes to get there.
The way this plays out feels very similar to how it's described in SquirrelInHell's essay here.
But, also, I think the style of thinking here is pretty normal for Lightcone core staff, and people in our nearby network. So this may have more to do with "just generally making a habit of figuring out how to deal with obstacles" that comes up naturally in our work. I think most of us have gotten better at that over the past few years, and most of us don't explicitly do this exercise.
(Jacob Lagerros did explicitly invent and train at the Babble challenge and apply it to problemsolving, which is a different exact mechanism but feels at least adjacent to this exercise, and which I also attribute to improving my own generativity. Maybe that's a better exercise than this one, though it's at least a point towards "deliberately practice generativity." During the pandemic, I tried out a "Babble and Tune" variant that combined the two exercises, which didn't obviously work at the time but I think is essentially what I actually do most of the time)
Most recently, in November, I spent... basically two whole weeks thinking strategically ~all the time, and I did eventually get a headache that lasted for days, but only after 1.5 weeks instead of 5 minutes.
When I asked John Wentworth recently if I seemed smarter to him, he said "not obviously, but I'm not sure I'd notice." I said "fair, but I (somewhat defensively) wanna flag – a few years ago when you first met me/read my stuff, most of what I was writing was basically summarizing/distilling the work of other people, and nowadays most of what you hear me say is more like original work."
So, idk, that's my story. Take the self-report with a grain of salt.
The Cautionary Tale
It's annoying that whenever I bring up this technique, I either need to disclaim "uh, the person who invented this later killed themselves," or, not disclaim it but then have someone else bring it up.
I do think there's an important cautionary tale there, but it's a bit subtler. Copying my warning from Subskills of "Listening to Wisdom":
I believe Tuning Your Cognitive Strategies is not dangerous in a way that was causal in that suicide[4], except that it's a kind of a gateway drug into weird metacognitive practices and then you might find yourself doing weirder shit that either explicitly hurts you or subtly warps you in a way you don't notice or appreciate.
I think the way SquirrelInHell died was essentially (or, at least, analogous to) absorbing some Tacit Soulful Ideas, which collapsed a psychologically load-bearing belief in a fatal way.[5]
I do think there are people for whom Tuning Your Cognitive Algorithms is overwhelming, and people for whom it disrupts a coping mechanism that depends on not noticing things. If anything feels off while you try it, definitely stop. I think my post Scaffolding for "Noticing Metacognition" presents it in a way that probably helps the people who get overwhelmed but not the people who had a coping mechanism depending on not-noticing-things.
I also think neither of these would result in suicide in the way that happened to SquirrelInHell.
* it's a bit annoying I can't give this my own +9, since I crossposted it, even though I didn't write it.
Whenever this comes up, I note: I think this is only a problem for a certain kind of nerd/geek who wants particularly intense stakes.
Sitcoms and soap operas have plenty of interesting stories that are mostly about low-stakes interpersonal drama.
(I guess this cached annoyance of mine is more about people complaining about utopian fiction rather than science fiction. But I think the same principles apply)
I haven't really explicitly checked this. I only use caffeine and (questionably counting) wellbutrin. I'll keep an eye out, especially if there's particular evidence about something to look out for.
I have observed people on modafinil who seem to get more tunnel visioned and have a harder time reorienting but I haven't used it myself.
I'm curious to hear more about how this went.
I'm curious how this seems to have gone for you 14 years later.
I'm not really sure what goal you were trying to achieve by branching off into so many different topics in a single post instead of creating separate posts
I think in my ideal world this was a series of blogposts that I actually expected people to read all of. Part of the reason it's all one post is that I didn't expect people to reliably get all of them.
Partly, I think each individual piece is necessary. Also, kind of the point of pieces like this is to be sort of guided meditations on a topic that let you sit with it long enough, and approach it from enough different angles, that a foreign way of thinking has time to seep into your brain and get digested.
I expected people would mostly not believe me without the concrete practical examples, but the concrete examples are (necessarily) meandering because that's what the process was actually like (you should expect the process of transmitting soulful knowledge to feel some-kind-of-meandering, at least a fair amount of the time).
I wanted to make sure people got the warnings at the same time that they got the "how to" manual – if I separated the warnings into a separate post, people might only read the more memetically successful "how to" posts.
I do suspect I could write a much shorter version that gets across the basic idea, but I don't expect the basic idea to actually be very useful because each of the 20 skills is pretty deep, and conveying what it's like to use them all at once is just necessarily complicated.
I will say I think there are a few different things people mean by burnout, but, they are each individually pretty real. Three examples that come to mind easily:
"Overworked" burnout.
If I've been working 60 hour weeks for months on end, eventually I'm just like "I can't do this anymore." My brain gets foggy. I feel exhausted. My body/mind start to rebel at the prospect of doing more of that type of work.
In my experience, this lasts 1-3 weeks (if I am able to notice and stop and switch to a more relaxed mode). When I do major projects, I have a decent sense of when Overworked Burnout is coming, and I time the projects such that I work up until my limit, then take a couple weeks to recover.
"Overworked + Trapped" burnout.
As above, except for some reason I don't have the ability to stop – people are depending on me, or future me is depending on me, and if I were to take a break a whole bunch of projects or relationships would come crashing down and destroy a lot of stuff I care about.
Something about this has a horrible coercive feeling that is qualitatively different from being tired/overworked. Some kind of "sick to my stomach" feeling, wanting to curl up and hide but you can't curl up and hide. This can happen because your boss is making excessive demands on you (or firing you), or simply because I volunteered myself into the position. Each of those feels differently bad. The former because you maybe really can't escape without losing resources that you need. The latter because if I've put myself in this situation, then something about my self-image and how others will relate to me will have to change if I were to escape.
"Things are deeply fucked burnout."
This feels similar to the Overworked+Trapped but it's some other kind of trapped other than just "needing to put in a lot of hours." Like, maybe there's conflict at work, or in a close relationship, and there are parts of it you can't talk about with anyone, and the people you can easily talk about it with have some perspective that feels wrong to you and it's hard to hold onto your own sense of sanity.
In some (many?) cases the right move here is to walk away, but that might be hard either because you need money/resources from the group, or you've invested so much of your identity into it that letting go requires reorganizing how you conceptualize yourself and your goals and your social scene.
This can cause a number of things other than burnout, e.g. various trauma responses. But I think a "burnout" flavored version of it can come when you have to live in this state for months or years. I haven't had this quite happen to me, but "conflict-based burnout" or "no longer really believing in your job/mission/relationship" flavored burnout can leave people struggling to do much of anything on purpose for months.
+9. Fatebook has been a game changer for me, in terms of how practical it is to weave predictions into my decisionmaking. I donated $1000 to Sage to support it.
It's not listed here, but one of the most crucial things is the Fatebook Chrome Extension, which makes it possible to frictionlessly integrate it into my normal orienting process (which I do in google docs. You can also do it in the web version of Roam).
I've started work on an "Enriched Fatebook" poweruser view that shows your calibration at a more granular level. I have several ideas for how to build additional poweruser UI for it but I'm not sure if I'll get around to it. https://raemon.github.io/fatebook-enriched
One weakness is that the Slack Integration produces pretty bulky predictions, which makes it feel awkward to make a ton of predictions in a channel (and usually, when we're having a discussion where it'd be appropriate to make a prediction, it's useful to make like 3 predictions that tackle the question from different angles). Trimming off a few lines from the Slack UI would be good, i.e. see here:
I don't know what the limitations of the Slack integration are but you should be able to shave at least one line off that.
...
I'm currently thinking that a thing Quick Forecasting is missing is "qualia-based predictions." I.e. before I know "what probability do I assign here?" I often know things like "I don't believe in this, in my gut" or "I believe in this but in a loopy way where I'm the one driving the actions and I'm inhabiting a confident mode which is hard to be objective about." Right now Fatebook has tags for Questions, but not tags for "predictions."
Longterm, I think the Philosophically/Practically Correct Typing for an individual prediction should let you either put a number (if you have one), or a "prediction tag" which is some kind of metadata other than the raw probability. (But, admittedly I don't expect anyone other than me to use that for the near future so it's not obviously a priority)
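To gesture at what I mean by that typing, here's a minimal sketch (Python, with made-up names – this is not Fatebook's actual schema, just my guess at the shape of it):

```python
from dataclasses import dataclass
from typing import Optional

# A qualitative stance you can record before (or instead of) a number.
# The example tag strings are hypothetical.
PredictionTag = str  # e.g. "gut says no", "loopy/self-driven", "confident mode"

@dataclass
class Prediction:
    question_id: str
    probability: Optional[float] = None   # in [0, 1], if you have a number
    tag: Optional[PredictionTag] = None   # metadata other than the raw probability

    def __post_init__(self):
        # A prediction is allowed to carry a number, a tag, or both -- but not neither.
        if self.probability is None and self.tag is None:
            raise ValueError("A prediction needs a probability, a tag, or both.")

# Usage: record the gut-level reaction now, attach a number later if one shows up.
p = Prediction(question_id="ship-workshop-by-march", tag="gut says no")
p.probability = 0.35
```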
“Right now”, which includes figuring out the different ways things can be the most important thing right now.
Did this work?
i.e. the question "what sort of community institutions are good to build?" is a timeless question. Why should we artificially limit our ability to reflect on that sort of thing during the Review, given that we set the Review up in an openended way that allows us to do that on the margin?
Fwiw I disagree, I think the Review is deliberately openended.
Yes, there's a specific goal of finding the top 50 posts, and identifying important timeless intellectual contributions. But, part of the whole point of the review (as I originally envisioned it) is also to help reflect in a more general sense on "what happened on LessWrong and what can we learn from it?".
I think rather than trying to say "no, don't reflect on particular things that don't fit the most central use case of the Review", it seems actively good to me to take advantage of the openended nature of it to think about less central things. We can learn timeless lessons from posts that weren't, themselves, particularly timeless.
My Current Metacognitive Engine
Someday I might work this into a nicer top-level post, but for now, here's the summary of the cognitive habits I try to maintain (and reasonably succeed at maintaining). Some of these are simple TAPs, some of them are more like mindsets.
- Twice a day, asking “what is the most important thing I could be working on and why aren’t I on track to deal with it?”
- you probably want a more specific question (“important thing” is too vague). Three example specific questions (but, don’t be a slave to any specific operationalization)
- what is the most important uncertainty I could be reducing, and how can I reduce it fastest?
- what’s the most important resource bottleneck I could gain (or contribute to the ecosystem), and what would gain me that resource the fastest?
- what’s the most important goal I’m backchaining from?
- Have a mechanism to iterate on your habits that you use every day, and frequently update in response to new information
- for me, this is daily prompts and weekly prompts, which are:
- optimized for being the efficient metacognition I obviously want to do each day
- include one skill that I want to level up in, that I can do in the morning as part of the meta-orienting (such as operationalizing predictions, or “think it faster”, or whatever specific thing I want to learn to attend to or execute better right now)
- The five requirements each fortnight:
- be backchaining
- from the most important goals
- be forward chaining
- through tractable things that compound
- ship something
- to users every fortnight
- be wholesome
- (that is, do not minmax in a way that will predictably fail later)
- spend 10% on meta (more if you’re Ray in particular but not during working hours. During working hours on workdays, meta should pay for itself within a week)
- Correlates:
- have a clear, written model of what you’re backchaining from
- have a clear, written model of how you’re compounding
- The general problem solving approach:
- breadth first
- identify cruxes
- connect inner-sim to cruxes / predictions
- follow your heart
- see how your predictions went
- Random ass skills
- napping
- managing working memory; innovating on and applying working memory tools
- grieving
- Generalizing
Skill I’m working on that hasn’t paid off yet but I think you should try anyway:
- At least once a day or so, when you notice a mistake or surprise, spend a couple minutes asking “how could I have thought that faster” (and periodically do deeper dives)
- each day/week, figure out what you’re confused about or predictably going to tackle in a dumb way, and think in advance about how to be smart about it the first time
It sounds like there's actually like 3-5 different object level places where we're talking about slightly different things. I also updated on the practical aspect from Ryan's comment. So, idk here's a bunch of distinct points.
1.
Ryan Greenblatt's comment updated me that the energy requirements here are minimal enough that "eating the sun" isn't really going to come up as a consideration for astronomical waste. (Eating the Earth or most of the solar system seems like it still might be. But, I agree we shouldn't Eat the Earth)
2.
I'd interpreted most past comments about nearterm (i.e. measured in decades) crazy shit to be about building Dyson spheres, not Star Lifting. (i.e. I expected the '20 years from now in some big ol' computer' in the solstice song to be about dyson spheres and voluntary uploads). I think many people will still freak out about Dyson Sphering the sun (not sure if you would). I would personally argue "it's just pretty damn important to Dyson Sphere the sun even if it makes people uncomfortable (while designing it such that Earth still gets enough light)."
3.
I agree in 1000 years it won't much matter whether you Starlift, for astronomical waste reasons. But I do expect in 1000 years, even assuming a maximally consent-oriented / conservative-with-regards-to-bio-human-values, and all around "good" outcome, most people will have shifted to running on computronium and experienced much more than 1000 years of subjective time and their intuitions about what's good will just be real different. There may be small groups of people who continue living in bio-world but most of them will still probably be pretty alien by our lights.
I think I do personally hope they preserve the Earth as sanctuary and/or historical relic. But I think there's a lot of compromises like "starlift a lot of material out of the sun, but move the Earth closer to the sun to compensate" (I haven't looked into the physics here, the details are obviously cruxy).
When I imagine any kind of actual realistic future that isn't maximally conservative (i.e. the bio humans are < .1% of the solar system's population and just don't have that much bargaining power), it seems even more likely that they'll at least work on compromise solutions that preserve a reasonable Earth experience but eat a bunch of the sun, if there turn out to be serious tradeoffs there. (Again I don't actually know enough physics here and I'm recently humbled by remembering the Eternity in Six Hours paper, maybe there's literally no tradeoffs here, but, I'd still doubt it)
4.
It sounds like it's not particularly cruxy anymore, but, I think the "0.00000004% of the Earth's current population" analogy is just quite different. 80 trillion suns involves more value than has ever been had before; 3 lives is (relatively) insignificant compared to many political compromises we've made, even going back thousands of years. Maybe whatever descendants get to reap that value are so alien that they just don't count as valuable by today's lights, and it's reasonable to have some extreme time discounting here, but, if any values-you-care-about survived it would be huge.
I agree both morally and practically with "it's way more important to make sure we have good global coordination systems that don't predictably either descend into a horrible totalitarianism, or trigger a race for power that causes horrible wars or other bad things, than to get those 80 trillion suns." But, like, the 80 trillion suns are still a big deal.
5.
I'll note it's also not a boolean whether we "bulldoze the earth" or "bulldoze the rest of the solar system" for rushing to build a dyson sphere. You can start the process with a bunch of mining in some remote mountain regions or whatever without eating the whole earth. (But I think it might be bad to do this because "don't harvest Earth" is just a nice simple Schelling rule and once you start haggling over the details I do get a lot more worried)
6.
I recall reading it's actually maybe cheaper to use asteroids than Mercury to make a dyson sphere because you don't need to expensively lift things out of the gravity well. It is appealing to me if there are no tradeoffs involved with deconstructing any of the charismatic astronomical objects until we've had more time to think/orient/grow-as-a-people.
7.
Part of my outlook here is that I spent the last 14 years being personally uninterested in and scared by the sorts of rapid/crazy/exponential change you're wary of. In the past few years, I've adjusted to be more personally into it. I don't think I would have wanted to rush that grieving/orienting process for Past Me, even though it cost me a lot of important time and resources (I'm referring here more to stuff like The God of Humanity, and the God of the Robot Utilitarians)
But I do wish I had somehow sped along the non-soulfully-traumatic parts of the process (i.e. some of the updates were more simple/straightforward and if someone had said the right words to me, I think I'd have gotten a strictly better outcome by my original lights).
I expect most humans, given opportunity to experiment on their own terms, will gradually have some kind of perspective shift here (maybe on a longer timescale than Me Among the Rationalists, but, like, <500 years). I don't want people to feel rushed about it, but I think there will be some societal structures that will lend themselves to dallying more and accumulating serious tradeoffs, or less.
In addition to being hauntingly beautiful, this story helped me adjust to the idea of the trans/posthuman future.
14 years ago, I very much did not identify with the Transhuman Vision. It was too alien, too much, and I didn't feel ready for it. I also didn't actively oppose it. I knew that, as I hung out around rationalists, I would probably slowly come to identify more with humanity's longterm future.
I have indeed come to identify more with the longterm future and all of its weirdness. It was mostly not because of this story, but I did particularly resonate with the framing here – in large part because it met me where I am, instead of jumping into Future Shock. It presents the increasing alienness in gentle increments, from multiple perspectives, and from the perspective of someone currently living in a more Ancestral Human perspective.
This story doesn't tell you what sort of choices are good to make, but it makes it feel easier to wrap my brain around how I (or others) might eventually make such choices.
I think it might have been important to the process for this post to exist (so you can tell Elizabeth did the work), but for there still to be a shorter post that just gets the point across.
I have my own actual best guesses for what happens in reasonably good futures, which I can get into. (I'll flag for now I think "preserve Earth itself for as long as possible" is a reasonable Schelling point that is compatible with many "go otherwise quite fast" plans)
I doubt that totally dismantling the Sun after centuries would significantly accelerate the time we reach the cosmic event horizon.
Why do you doubt this? (To be clear, it depends on exact details. But, my original query was about a 2 year delay. Proxima Centauri is 4 lightyears away. What is your story for how only taking 0.1% of the sun's energy while we spin up doesn't slow us down by at least 2 years?)
I have more to say but maybe should wait on your answer to that.
Mostly, I think your last comment still had its own missing mood of horror, and/or seemed to be assuming away any tradeoffs.
(I am with you on "many rationalists seem gung ho about this in a way I find scary")
This site is a cool innovation but is missing pieces required to be really useful. I’m giving it +1. I might give it +4 to subsidize ‘actually build shit’.
I think this site is on-the-path to something important but a) the UI isn't quite there and b) there's this additional problem where, well, most news doesn't matter (in that it doesn't affect my decisions).
During Ukraine nuclear scares, I looked at BaseRateTimes sometimes to try and orient, but I think it was less helpful than other compilations of prediction markets that Lightcone made specifically to help orient to nuclear threats.
I do think combining multiple prediction platforms into a single page is a pretty useful innovation and still seems likely to be a key piece of the final usable product I hope exists some day.
Things that feel missing:
- good UI for tailoring the page to suit my interests and decisionmaking
- ...more context, or something, for each market?
I assume one problem here is that matching up markets from different platforms with similar enough resolution criteria is hard. (Seems like something AI may be able to automate now?)
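For what it's worth, here's a crude sketch of what automating that matching might look like (the question titles are made up, and plain string similarity is standing in for whatever embedding/LLM-judge approach you'd actually want, especially for checking that resolution criteria really line up):

```python
from difflib import SequenceMatcher

# Hypothetical question titles pulled from two platforms (made-up data).
platform_a = ["Will Russia use a nuclear weapon in Ukraine before 2026?",
              "Will the S&P 500 close above 6000 in 2025?"]
platform_b = ["Russia detonates a nuclear weapon in Ukraine by Jan 1, 2026",
              "US enters a recession in 2025"]

def similarity(a: str, b: str) -> float:
    """Crude lexical similarity; a real version would compare resolution criteria too."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Surface candidate pairs above a threshold for human (or LLM) review.
for q_a in platform_a:
    for q_b in platform_b:
        score = similarity(q_a, q_b)
        if score > 0.5:
            print(f"{score:.2f}  {q_a!r}  <->  {q_b!r}")
```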
Richard said "I don't think my priors on that are very different from yours but the thing that would have made this post valuable for me is some object-level reason to upgrade my confidence in that." He didn't say it'd be a longterm project, I think he just meant he didn't change his beliefs about it due to this post.
So, I'm with you on "hey guys, uh, this is pretty horrifying, right? Uh, what's with the missing mood about that?".
The issue is that not-eating-the-sun is also horrifying. I.e. see also All Possible Views About Humanity's Future Are Wild. To not eat the sun is to throw away orders of magnitude more resources than anyone has ever thrown away before. Is it, percentage-wise, "a small fraction of the cosmos"? Sure. But, (quickly checks Claude, which wrote up a fermi code snippet before answering – I can share the work if you want to double-check it yourself), a two year delay would be... 0.00000004% of the universe lost beyond the lightcone horizon, which doesn't sound like much except that's 200 galaxies lost.
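(If you want to sanity-check that arithmetic yourself, here's the fermi redone in a few lines of Python – this is just my own consistency check on the two numbers as quoted, not Claude's original snippet:)

```python
# Take the two quoted figures as given and back out what they jointly imply.
fraction_lost = 0.00000004 / 100        # 0.00000004% expressed as a fraction (4e-10)
galaxies_lost = 200                     # quoted loss for a two-year delay

implied_reachable = galaxies_lost / fraction_lost
print(f"Implied total reachable galaxies: {implied_reachable:.1e}")   # ~5e11
print(f"Galaxies slipping past the horizon per year of delay: {galaxies_lost / 2:.0f}")
```

(This only checks that the two quoted figures are consistent with each other; it doesn't independently verify the cosmology.)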
When you compare against "the Amish get a Sun Replica that doesn't change their experience", the question is "Is it worth throwing away 80 trillion stars for the Amish to have the real thing?" It does not seem obviously worth it.
IMO there isn't an option that isn't at least a bit horrifying in some sense that one could have a missing mood about. And while I still feel unsettled about it, I think if I have to grieve something, it makes more sense to grieve in the direction of "don't throw away 80 trillion stars worth of resources."
I think you're also maybe just not appreciating how much would change in 10,000 years? Like, there is no single culture that has survived 10,000 years. (Maybe one of those small tribes in the Amazon? I'd still bet on there having been a lot of cultural drift there, but not confidently.) The Amish are only a few hundred years old. I can imagine doing a lot of moral reflection and coming to the conclusion the sun shouldn't be eaten until all human cultures have decided it's the right thing to do, but I do really doubt that process takes 10,000 years.
What do you think Richard Ngo claimed about this?
This seemed like a nice explainer post, though it's somewhat confusing who the post is for – if I imagine being someone who didn't really understand any arguments about superintelligence, I think I might bounce off the opening paragraph or title because I'm like "why would I care about eating the sun."
There is something nice and straightforward about the current phrasing, but I suspect there's an opening paragraph that would do a better job explaining why you might care about this.
(But I'd be curious to hear from people who weren't really sold on any singularity stuff who read it and can describe how it was for them)
FYI, I think by the time I wrote Optimistic Assumptions, Longterm Planning, and "Cope", I had updated on the things you criticize about it here (but I had started writing it awhile ago from a different frame and there is something disjointed about it)
But, like, I did mean both halves of this seriously:
I think you should be scared about this, if you're the sort of theoretical researcher who's trying to cut at the hardest parts of the alignment problem (whose feedback loops are weak or nonexistent)
I think you should be scared about this, if you're the sort of Prosaic ML researcher who does have a bunch of tempting feedback loops for current generation ML, but a) it's really not clear whether or how those apply to aligning superintelligent agents, b) many of those feedback loops also basically translate into enhancing AI capabilities and moving us toward a more dangerous world.
...
Re:
For the last few weeks, I’ve been working on trying to find plans for AI safety. They should cover the whole problem, including the major hurdles after intent alignment.
I strongly disagree with this being a good thing to do! We're not going to have a good, end-to-end plan about how to save the world from AGI.
I think in some sense I agree with you – the actual real plans won't be end-to-end. And I think I agree with you about some kind of neuroticism that unhelpfully bleeds through a lot of rationalist work. (Maybe in particular: actual real solutions to things tend to be a lot messier than the beautiful code/math/coordination-frameworks an autistic idealist dreams up)
But, there's still something like "plans are worthless, but planning is essential." I think you should aim for the standard of "you have a clear story for how your plan fits into something that solves the hard parts of the problem." (or, we need way more people doing that sort of thing, since most people aren't really doing it at all)
Some ways that I think about End to End Planning (and, metastrategy more generally)
Because there are multiple failure modes, I treat myself as having multiple constraints I have to satisfy:
- My plans should backchain from solving the key problems I think we ultimately need to solve
- My plans should forward chain through tractable near-term goals with at least okay-ish feedback loops. (If the okay-ish feedback loops don't exist yet, try to be inventing them. Although don't follow that off a cliff either – I was intrigued by Wentworth's recent note that overly focusing on feedback loops led him to predictably waste some time)
- Ship something to external people, fairly regularly
- Be Wholesome (that is to say, when I look at the whole of what I'm doing it feels healthy, not like I've accidentally min-maxed my way into some brittle extreme corner of optimization space)
And for end to end planning, have a plan for...
- up through the end of my current OODA loop
- (maybe up through a second OODA loop if I have a strong guess for how the first OODA loop goes)
- as concrete a plan as I can, assuming no major updates from the first OODA loop, up through the end of the agenda.
- as concrete a visualization of the followup steps after my plan ends, for how it goes on to positively impact the world.
End to End plans don't mean you don't need to find better feedbackloops or pivot. You should plan that into the plan (and also expect to be surprised about it anyway). But, I think if you don't concretely visualize how it fits together, you're likely to go down some predictably wasteful paths.
Not sure I quite parsed that, but things it makes me think of:
- first, if you're bottlenecked on health (physical or mental), it may be that finding medication that helps is more important than your mindset.
- try success spiralling – start doing small things, build up both a habit/muscle of doing things, and momentum in doing things, escalate to bigger things
- if getting started is hard, maybe find a friend or pay a colleague to just sit with you and constantly be like "are you doing stuff?" and spray you with a water bottle if you look like you're overthinking stuff, until you build up a success spiral / muscle of doing things.
- try doing doing doing just fucking do it man, and when your brain is like "idk that seems like a whole lotta doing, what if we're doing the wrong thing?" be like "it's okay Thinky Brain, this is an experiment we will learn from later so we eventually can calibrate on Optimal Think-to-Do Ratio"
when you attempt to switch from thinking to doing, what happens instead?
Sort of inspired by Erik Jenner's post:
If you've been vaguely following me and thought "man, I wish Ray hurried up and finished making the update that <X>", what are some values of X?
Great writeup.
- Alice's team develops a major product without first checking to see if it's something people actually want -- after a year and a half of development, the product works great, but it turns out there isn't much of any demand. (I would consider this an observation failure -- failure to observe critical information leads to lots of wasted time.)
FYI I'd classify this more as a decision failure. They really would have had to take different actions in order to get this data, so this was more at the point when they were like "do I start building this product, or do I find some random representative users and see what they think of the idea?"
Also, since decision can flow pretty directly from orientation, you may find these two similar enough that you want to group them as one; I'm undecided on whether to make that change to this technique "more formally" and probably need to test it with more participants to see!
I actually normally combine/conflate Observe and Orient.
I think the actual takeaway here is: any two adjacent steps can kind of blend into each other.
You might be in a microloop where you're observing and orienting (and then maybe looking for more observations and then orienting on the new ones).
Then, when you're eventually like "okay I have enough observations", you may be in a loop where you're evaluating decisions, and then looking at your confused model and trying to wrangle the information into a form that's useful for decisionmaking, then look at your decision options again, be dissatisfied with your current ability to make-sense-of-things, and do more orienting.
Then eventually you're in a state where you know how to think about the situation, and you pretty much know what the options are, but as you start thinking about "Acting", your brain starts to see the consequences of each decision in near mode, which changes your guesses about which actions are best.
Then, as you start acting in earnest, each action comes with some immediate observations.
But, you can't really move from "Observe" to "Decide" without having gone through at least a little bit of an orient step on how to classify your observations.
I did a session yesterday with @moonlight, which went pretty well. I ended up consolidating some notes that seemed good to share with new assistants, and then he wrote the introduction he'd personally have preferred.
I generally work out of google docs that serve as shared-external-working-memory, with multiple tabs.
Moonlight's Intro
[Written by the first thinking assistant working with Ray, writing here what I’d have liked to read first]
Important things:
- Ray is currently sick, so put some effort into speaking more softly and slowly.
- There is no interview or anything similar, you’ll begin assisting him straight away.
- By default, just watch him work (coding/planning/writing/operations), and occasionally give signs you’re still attentive, without interrupting.
- Write moment to moment observations which feel useful to you, as well as general thoughts, down in the Assistant Notes tab. This helps you feel more proactively involved and makes you focused on noticing patterns and ways in which you could be more useful as an assistant.
- The Journal tab is for his plans and thoughts about what to generally do. Read it as an overview.
- This Context tab is for generally useful information about what you should do and about relevant strategies and knowledge Ray has in mind. Reading this helps you get a more comprehensive view on what his ideal workflow looks like, and what your ideal contributions look like.
For the structure of this document:
- Collapse sections when reading. It helps traverse the document.
- This part has my thoughts for onboarding. Under it, you can find Ray’s onboarding section. Read these two first.
- The “Ray Facts” section has important information about logistics and operations. Currently it has only his work location. [edited out in this comment]
- In “Ray’s Metacognitive Engine” and below, you can find the strategies and knowledge I’ve mentioned above. You can read these after, they’re not mandatory at the very start.
Ray’s First Draft Intro Materials
Strategic Overview
Goal: End the acute risk period, and ensure a flourishing human future.
I’ve recently finished a bunch of grieving necessary to say “all right, I’m ready to just level up into an Elon-Musk-but-with-empathy-and-cyborg-tools type”, as well as the minimum necessary pieces of a cognitive engine that (I think) is capable of doing so.
I want to be growing in capacity at an exponential rate, both in terms of my personal resources, and the resources available to the x-risk ecosystem that are accomplishing things I think need accomplishing.
This means having a number of resources that are compounding, that are synergistic, which include:
- Money (either mine, or ability to spend Lightcone’s)
- Skills
- Meta personal skills, like ability to learn, and understand things, or be strategic
- Meta interpersonal skills, such as the ability to outsource labor or make use of assistants,
- Object level skills like programming, UI design, Event running
- Ability to work with employees who can take on tasks I want done
- Capital
- Relationships with people I work well with
- Tools I can re-use
Things I actually do most days:
- Coding on LessWrong
- Coding on random other projects
- Planning my Cybercognition Agenda, which includes workshops, cybernetic tools, and upskilling people around me.
- UI design, trying to figure out important complex things I want people to interact with in a way that feels simple to them.
- Thinking strategically about what needs to be done next
Instructions for Thinking Assistants
Things I would like you to do:
- By default, be quiet and attentive and just help me focus by being a real human who’s staring at me
- Develop skills for tackling sort of arbitrary ops or research or coding tasks, such that I can outsource small things to you.
- Advice
- This is tricky because I have a good enough model of myself that a lot of advice isn’t that helpful. It’s still useful to have my blindspots pointed out. But, if I interrupt you (either with words or with a hand gesture) that probably means I want to move on to a different thread. (Ideally, you feel comfortable bringing up ideas, with no hard feelings if it doesn’t work out)
I would like to end up with a series of if-then habits you can help me execute. I will mostly write these myself, but as you get to know me well enough to say useful things, you can make suggested-edits
From “Hire or become a Thinking Assistant”
- By default, be quietly but visibly attentive.
- Every now and then (~5-10 minutes, or when I look actively distracted), briefly check in (where if I'm in-the-zone, this might just be a brief "Are you focused on what you mean to be?" from them, and a nod or "yeah" from me).
- When I need to think something through, they rubber duck (i.e. listen as I talk out loud about it, and ask clarifying questions)
- Build a model of my thought process (partly by me explaining it to them, partly by observing, partly by asking questions)
- Ideally, notice when my thought process seems confused/disoriented/inefficient.
- Ideally, have a large repertoire of cognitive tools they can suggest if I seem to be missing them.
- Intelligent enough that they can pretty easily understand the gist of what I'm working on.
- Ability to pick things up from context so I don't need to explain things in too much detail.
- Ideally, when my bottlenecks are emotional, also be at least fairly emotionally attuned (i.e. project a vibe that helps me work through it, or at least doesn't add extra friction or emotional labor demands from me), and ideally, basically be a competent therapist.
- In general, own the metacognition. i.e. be taking responsibility for keeping track of things, both on a minute-to-minute timescale, and the day-to-day or week-to-week timescale.
- Ability to get out of the way / quickly drop things if it doesn't turn out to be what I need, without it being a big deal.
There are also important outside-the-container skillsets, such as:
- Be responsive in communication, so that it's easy to schedule with them. If it's too much of a pain to schedule, it kinda defeats the point.
- Potentially: proactively check in remotely during periods where I'm not actively hiring them (i.e. be a professional accountability buddy, maybe paid some base rate to briefly check in each day, with the ability to upsell into "okay, today is a day that requires bigger metacognitive guns than Raemon has at the moment")
Even the minimum bar (i.e. "attentive body double") here is a surprisingly skilled position. It requires gentleness/unobtrusiveness, attentiveness, a good vibe.
The skill ceiling, meanwhile, seems quite high. The most skilled versions of this are the sort of therapist or executive coach who would charge hundreds of dollars an hour. The sort of person who is really good at this tends to quickly find their ambitions outgrowing the role (same with good executive assistants, unfortunately).
Pitfalls
Common problems I've run into:
- Having trouble scheduling with people. If you want to specialize in this role, it's often important for people to be able to contact you on a short timeline (i.e. I might notice I'm in a brainfoggy state and want someone to assist me like right now, or tomorrow), so it helps to have a communication channel you check regularly so people can ping you about a job.
- Asking questions in a way that is annoying instead of helpful. Since the point is to be giving me more time, if I have to spend too much time explaining the situation to someone, it undoes the value of it. This requires either them being good at picking things up quickly without much explanation, or good at reading nonverbal cues that the current thread isn't worth it and we should move on.
- Spending too much time on unhelpful advice. Sometimes an assistant will have ideas that don't work out, and maybe push them more than appropriate. There's a delicate balance here because sometimes I am being avoidant or something and need advice outside of my usual wheelhouse, but generally if advice isn't feeling helpful, I think the assistant should back off and observe more and try to have a few other hypotheses about what to suggest if they feel that the assistee is missing something.
- Navigating weird dynamics around "having someone entirely optimized to help another person." Having this run smoothly, in a net helpful way, means actually prioritizing my needs/goals in a way that would normally be pretty rude. If I constantly feel like there's social awkwardness / wariness about whether I'm making them feel bad, the whole thing is probably net negative. I think doing a good job of navigating this requires some nuance/emotional-skill from both parties, in terms of striking a vibe where it feels like you are productively collaborating.
- (I think this likely works best when the person is really actively interested in the job "be a thinking assistant", as opposed to something they're doing because they haven't gotten traction on their real goals).
Ray’s Metacognitive Engine
- Twice a day, asking “what is the most important thing I could be working on and why aren’t I on track to deal with it?”
- you probably want a more specific question (“important thing” is too vague). Three example specific questions (but, don’t be a slave to any specific operationalization)
- what is the most important uncertainty I could be reducing, and how can I reduce it fastest?
- what’s the most important resource bottleneck I could gain (or contribute to the ecosystem), and what would gain me that resource the fastest?
- what’s the most important goal I’m backchaining from?
- Have a mechanism to iterate on your habits that you use every day, and frequently update in response to new information
- for me, this is daily prompts and weekly prompts, which are:
- optimized for being the efficient metacognition I obviously want to do each day
- include one skill that I want to level up in, that I can do in the morning as part of the meta-orienting (such as operationalizing predictions, or “think it faster”, or whatever specific thing I want to learn to attend to or execute better right now)
- The five requirements each fortnight:
- be backchaining
- from the most important goals
- be forward chaining
- through tractable things that compound
- ship something
- to users every fortnight
- be wholesome
- (that is, do not minmax in a way that will predictably fail later)
- spend 10% on meta (more if you’re Ray in particular but not during working hours. During working hours on workdays, meta should pay for itself within a week)
- Correlates:
- have a clear, written model of what you’re backchaining from
- have a clear, written model of how you’re compounding
- The general problem solving approach:
- breadth first
- identify cruxes
- connect inner-sim to cruxes / predictions
- follow your heart
- see how your predictions went
- Random ass skills
- napping
- managing working memory; innovating on and applying working memory tools
- grieving
- Generalizing
Skill I’m working on that hasn’t paid off yet but I believe in:
- At least once a day or so, when you notice a mistake or surprise, spend a couple minutes asking “how could I have thought that faster” (and periodically do deeper dives)
- each day/week, figure out what you’re confused about or predictably going to tackle in a dumb way, and think in advance about how to be smart about it the first time
This is the sort of thing I find appealing to believe, but I feel at least somewhat skeptical of. I notice a strong emotional pull to want this to be true (as well as an interesting counterbalancing emotional pull for it to not be true).
I don't think I've seen output from people aspiring in this direction who weren't already visibly quite smart that would make me think "okay yeah, it seems like it's on track in some sense."
I'd be interested in hearing more explicit cruxes from you about it.
I do think it's plausible that the "smart enough, creative enough, strong epistemics, independent, willing to spend years without legible output, exceptionally driven, and so on" qualities are sufficient (if you're at least moderately-but-not-exceptionally-smart). Those are rare enough qualities that it doesn't necessarily feel like I'm getting a free lunch, if they turn out to be sufficient for groundbreaking pre-paradigmatic research. I agree the x-risk pipeline hasn't tried very hard to filter for and/or generate people with these qualities.
(well, okay, "smart enough" is doing a lot of work there, I assume from context you mean "pretty smart but not like genius smart")
But, I've only really seen you note positive examples, and this seems like the sort of thing that'd have a lot of survivorship bias. There can be tons of people obsessed, but not necessarily with the right things, and if you're not naturally in the right cluster of obsessed + smart-in-the-right-way, I don't know whether trying to cultivate the obsession on purpose will really work.
I do nonetheless overall probably prefer people who have all your listed qualities, and who also either can:
a) self-fund to pursue the research without having to make it legible to others
b) somehow figure out a way to make it legible along the way
I probably prefer those people to tackle "the hard parts of alignment" over many other things they could be doing, but not overwhelmingly obviously (and I think it should come with a background awareness that they are making a gamble, and if they aren't the sort of person who must make that gamble due to their personality makeup, they should be prepared for the (mainline) outcome that it just doesn't work out)
I'd sort of naively guess doing it with a stranger (esp. one not even in your circles) would be easier on the "feeling private/anxious about your productivity" – does that feel like it wouldn't work?
Okay a few people have DMd me, and I'm feeling some kind of vague friction that feels currently on track to be a dealbreaker so let's think that through here.
Problems:
- I can't tell offhand who's good at this, and while I think this is something someone with little experience could turn out to be good at, they often won't be, and it's kind of costly to spend a slot on them, especially if I really need someone competent at it.
- I often need someone "right now", and need a way to contact a bunch of people quickly, such that most of them will get the message and one of them will reply quickly, in a way that isn't too annoying for them but works.
I have a vision of a whole-ass website dedicated to facilitating this but right now want a quick hacky solution.
A group DM would work, but that feels like it'll produce weird competitive dynamics around who replies first but maybe isn't as good as the person who replies second.
DMing a bunch of people individually I guess is fine, but then I need to go find them.
A requirement for everyone participating as an assistant is that they have a way of being contacted that they'll respond to quickly.
I've added lyrics to this post for now (if you expand each section)
Clarification (I'll add this to the OP):
The ideal that I'm looking for is problems that will take a smart researcher (like a 95th percentile alignment researcher, i.e. there are somewhere between 10-30 people who might count) at least 30 minutes to solve, and that most alignment researchers would maybe have a 50% chance of figuring out in 1-3 hours.
The ideal is that people have to:
a) go through a period of planning, and replanning
b) spend at least some time feeling like the problem is totally opaque and they don't have traction.
c) have to reach for tools that they don't normally reach for.
It may be that we just don't have evals at this level yet, and I might take what I can get, but, it's what I'm aiming for.
I'm not trying to make an IQ test – my sense from the literature is that you basically can't raise IQ through training. So many people have tried. This is very weird to me – subjectively it is just really obvious to me that I'm flexibly smarter in many ways than I was in 2011 when I started the rationality project, and this is due to me having a lot of habits I didn't used to have. The hypotheses I currently have are:
- You just have to be really motivated to do transfer learning, and have a genuinely inspiring / good teacher, and it's just really hard to replicate this sort of training scientifically
- IQ is mostly measuring "fast intelligence", because that's what cost-effective to measure in large enough quantities to get a robust sample. i.e. it measures whether you can solve questions in like a few minutes which mostly depends on you being able to intuitively get it. It doesn't measure your ability to figure out how to figure something out that requires longterm planning, which would allow a lot of planning skills to actually come into play.
Both seem probably at least somewhat true, but the latter one feels like a clearer story for why there would be potential (at least theoretically) in the space I'm exploring – IQ tests take a few hours to take. It would be extremely expensive to do the statistically valid version of the thing I'm aiming at.
My explicit goal here is to train researchers who are capable of doing the kind of work necessary in worlds where Yudkowsky is right about the depth/breadth of alignment difficulty.
(my guess is you took more like 15-25 minutes per question? Hard to tell from my notes, you may have finished early but I don't recall it being crazy early)
(This seems like more time than Buck was taking – the goal was to not get any wrong so it wasn't like people were trying to crank through them in 7 minutes)
The problems I gave were (as listed in the csv for the diamond problems)
- #1 (Physics) (1 person got right, 3 got wrong, 1 didn't answer)
- #2 (Organic Chemistry), (John got right, I think 3 people didn't finish)
- #4 (Electromagnetism), (John and one other got right, 2 got wrong)
- #8 (Genetics) (3 got right including John)
- #10 (Astrophysics) (5 people got right)
I at least attempted to be filtering the problems I gave you for GPQA diamond, although I am not very confident that I succeeded.
(Update: yes, the problems John did were GPQA diamond. I gave 5 problems to a group of 8 people, and gave them two hours to complete however many they thought they could complete without getting any wrong)
I like all these questions. "Maybe you should X" is least likely to be helpful but still fine so long as "nah" wraps up the thread quickly and we move on. The first three are usually helpful (at least filtered for assistants who are asking them fairly thoughtfully)
I imagined "FocusMate + TaskRabbit" specifically to address this issue.
Three types of workers I'm imagining here:
- People who are reasonably skilled types, but who are youngish and haven't landed a job yet.
- People who actively like doing this sort of work and are good at it
- People who have trouble getting/keeping a fulltime job for various reasons (which would land them in the "unreliable" sector), but... it's FocusMate/TaskRabbit, they don't need to be reliable all the time, there just needs to be one of them online who responds to you within a few hours, who is at least reasonably competent when they're sitting down and paying attention.
And then there are reviews (which I somehow UI-design to elicit honest reactions, rather than just slapping on a 0-5 stars rating which everyone feels obligated to rate "5" all the time unless something was actively wrong), and they have profiles about what they think they're good at and what others thought they were good at.
(where the expectation is: if you don't have active endorsements and haven't yet been rated, you will probably charge a low rate)
Meanwhile if you're actively good and actively reliable, people can "favorite" you and work out deals where you commit to some schedule.
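To make the shape of this concrete, here's a minimal sketch of the kind of data model the above implies. It's purely illustrative: every class, field, and name is my own assumption about a hypothetical "FocusMate + TaskRabbit" service, not an actual design or spec:

```python
# Illustrative data model for the marketplace described above: worker profiles,
# free-form reviews (instead of a star rating everyone maxes out), and standing
# deals with "favorited" reliable workers. All names/fields are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Review:
    reviewer: str
    # Free-form prompts, aiming to elicit honest reactions rather than a 0-5
    # star field that everyone feels obligated to rate "5".
    what_went_well: str
    what_was_off: str
    would_book_again: bool


@dataclass
class WorkerProfile:
    name: str
    self_described_skills: list[str]    # what they think they're good at
    client_endorsed_skills: list[str]   # what others thought they were good at
    hourly_rate: float                  # expected to start low before any endorsements
    reviews: list[Review] = field(default_factory=list)


@dataclass
class StandingDeal:
    # A client "favorites" a reliably good worker and commits to a schedule.
    client: str
    worker: WorkerProfile
    schedule: str                       # e.g. "Tue/Thu mornings"
```

The Review shape is where the interesting design work would be; free-form prompts are one possible way to get at the "elicit honest reactions" goal mentioned above.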
(Quick note to people DMing me: I'm doing holidays right now and will follow up in a week or so. I won't necessarily have slots/need for everyone expressing interest)
Can you say more details about how this works (in terms of practical steps) and how it went?
I actually meant to say "x-risk focused individuals" there (not particularly researchers), and yes, I was coming from the impact side of things. (i.e. if you care about x-risk, one of the options available to you is to become a thinking assistant).
I’d like to hire cognitive assistants and tutors more often. This could (potentially) be you, or people you know. Please let me know if you’re interested or have recommendations.
By “cognitive assistant” I mean a range of things, but the core thing is “sit next to me, and notice when I seem like I’m not doing the optimal thing, and check in with me.” I’m interested in advanced versions who have particular skills (like coding, or Applied Quantitivity, or good writing, or research taste) and who can also tutor me as we go.
I’d like a large rolodex of such people, both for me, and other people I know who could use help. Let me know if you’re interested.
I was originally thinking "people who live in Berkeley" but upon reflection this could maybe be a remote role.
Yep, endorsed. One thing I would add: the "semi-official" dress code I've been promoting explicitly includes black (for space/darkness), silver (for stars), gold (for the sun), and blue (for the earth).
(Which is pretty much what you have here, I think the blue works best when it is sort of a minority-character distributed across people, such that it's a bit special when you notice it)
The complaints I remember about this post seem mostly to be objecting to how some phrases were distilled into the opening short "guideline" section. When I go reread the details it mostly seems fine. I have suggestions on how to tweak it.
(I vaguely expect this post to get downvotes that are some kind of proxy for vague social conflict with Duncan, and I hope people will actually read what's written here and vote on the object level. I also encourage more people to write up versions of The Basics of Rationalist Discourse as they see them)
The things I'd want to change are:
1. Make some minor adjustments to the "Hold yourself to the absolute highest standard when directly modeling or assessing others' internal states, values, and thought processes." (Mostly, I think the word "absolute" is just overstating it. "Hold yourself to a higher standard" seems fine to me. How much higher-a-standard depends on context)
2. Somehow resolve an actual confusion I have with the "...and behave as if your interlocutors are also aiming for convergence on truth" clause. I think this is doing important, useful work, but a) it depends on the situation, b) it feels like it's not quite stating the right thing.
Digging into #2...
Okay, so when I reread the detailed section, I think I basically don't object to anything. I think the distillation sentence in the opening paragraphs conveys a thing that a) oversimplifies, and b) some people have a particularly triggered reaction to.
The good things this is aiming for that I'm tracking:
- Conversations where everyone trusts that each other are converging on truth are way less frictiony than ones where everyone is mistrustful and on edge about it.
- Often, even when the folk you're talking to aren't aiming for convergence on truth, proactively acting as if they are helps make it more true. Conversational vibes are contagious.
- People are prone to see others' mistakes as more intense than their own mistakes, and if most humans aren't specifically trying to compensate for this bias, there's a tendency to spiral into a low-trust conversation unnecessarily (and then have the wasted motion/aggression of a low-trust conversation instead of a medium-or-high one).
I think maybe the thing I want to replace this with is more like "aim for about 1-2 levels more trusting-that-everyone-is-aiming-for-truth than currently feel warranted, to account for your own biases, and to lead by example in having the conversation focus on truth." But I'm not sure if this is quite right either.
...
This post came a few months before we created our New User Reject Template system. It should have at least occurred to me to use some of the items here as some of the advice we have easily on hand to give to new users (either as part of a rejection notice, or just "hey, welcome to LW, but it seems like you're missing some of the culture here").
If this post were voted into the Top 50, and a couple of points were resolved, I'd feel good making a fork with minor context-setting adjustments and then linking to it as a moderation resource, since I'd feel like The People had a chance to weigh in on it.
The context-setting I'm imagining is not "these are the official norms of LessWrong", but, if I think a user is making a conversation worse for reasons covered in this post, be more ready to link to this post. Since this post came out, we've developed better Moderator UI for sending users comments on their comments, and it hadn't occurred to me until now to use this post as reference for some of our Stock Replies.
(Note: I currently plan to make it so that, during the Review, anyone can write Reviews on a post even if they're normally blocked from commenting. Ideally I'd make it so they can also comment on Review comments. I haven't shipped this feature yet but hopefully will soon)
Previously, I think I had mostly read this through the lens of "what worked for Elizabeth?" rather than actually focusing on which parts of this might be useful to me. I think that's a tradeoff on the "write to your past self" vs "attempt to generalize" spectrum – generalizing in a useful way is more work.
When I reread it just now, I found "Ways to Identify Fake Ambition" the most useful section (both for the specific advice of "these emotional reactions might correspond to those motivations", and the meta-level advice of "check for your emotional reactions and see what they seem to be telling you").
I'd kinda like to see a post that is just that section, with a bit of fleshing out to help people figure out when/why they should check for fake ambition (and how to relate to it). I think literally a copy-paste version would be pretty good, and I think there's a more (well, um) ambitious version that does more interviewing with various people and seeing how the advice lands for them.
I might incorporate this section more directly into my metastrategy workshops.
Well to be honest in the future there is probably mostly an AI tool that just beams wisdom directly into your brain or something.
I wrote about 1/3 of this myself fyi. (It was important to me to get it to a point where it was not just a weaksauce version of itself but where I felt like I at least might basically endorse it and find it poignant as a way of looking at things)
One way I parse this is "the skill of being present (may be) about untangling emotional blocks that prevent you from being present, more than some active action you take."
It's not like untangling emotional blocks isn't tricky!
I don't have a strong belief that this experience won't generalize, but I want to flag the jump between "this worked for me" and an implied "this'll work for everyone/most-people." (I expect most people would benefit from hearing this suggestion; I just generally have a yellow flag about some of the phrasings you have here.)