Posts

There are (probably) no superhuman Go AIs: strong human players beat the strongest AIs 2023-02-19T12:25:52.212Z
How four guys helped redirect Japan's coronavirus policy 2020-04-23T09:22:46.320Z

Comments

Comment by Taran on There are (probably) no superhuman Go AIs: strong human players beat the strongest AIs · 2023-02-20T14:02:49.383Z · LW · GW

Also, the specific cycle attack doesn’t work against other engines I think? In the paper their adversary doesn’t transfer very well to LeelaZero, for example. So it’s more one particular AI having issues, than a fact about Go itself.

Sure, but refutations don't transfer to different openings either, right?  I feel like most game-winning insights are contingent in this sense, rather than being fundamental to the game.

EDIT: also, I think if you got arbitrary I/O access to a Magnus simulator, and then queried it millions of times in the course of doing AlphaZero style training to derive an adversarial example, I’d say it’s pretty borderline if it’s you beating him. Clearly there’s some level of engine skill where it’s no longer you playing!

This is a really interesting hypothetical, but I see it differently.

If the engine isn't feeding me moves over the board (which I certainly agree would be cheating), then it has to give me something I can memorize and use later.  But I can't memorize a whole dense game tree full of winning lines (and the AI can't calculate that anyway), so it has to give me something compressed (that is, abstract) that I can decompress and apply to a bunch of different board positions.  If a human trainer did that we'd call those compressed things "insights", "tactics", or "strategies", and I don't think making the trainer into a mostly-superhuman computer changes anything.  I had to learn all the tactics and insights, and I had to figure out how to apply them; what is chess, aside from that?

Also, I wouldn't expect that Carlsen has any flaws that a generally weak player, whether human or AI, could exploit the way the cyclic adversary exploits KataGo.  It would find flaws, and win consistently, but if the pattern in its play were comprehensible at all it would be the kind of thing that you have to be a super-GM yourself to take advantage of.  Maybe your intuition is different here?  In the limit I'd definitely agree with you: if the adversarial AI spit out something like "Play 1. f3 2. Kf2 and he'll have a stroke and start playing randomly", then yeah, that can't really be called "beating Carlsen" anymore.  But the key point there, to me, isn't the strength of the trainer so much as the simplicity of the example; I'd have the same objection no matter how easy the exploit was to find.

Curious what you make of this.

Comment by Taran on There are (probably) no superhuman Go AIs: strong human players beat the strongest AIs · 2023-02-20T13:21:24.846Z · LW · GW

Like I said, I feel like I hear it a lot, and in practice I don't think it's confusing because the games that get solved by theorists and the games that get "solved" by AIs are in such vastly different complexity regimes.  Like, if you heard that Arimaa had been solved, you'd immediately know which sense was meant, right?

Having said that, the voters clearly disagree and I'm not that attached to it, so I'm going to rename the post.  Can you think of a single adjective or short phrase that captures the quality that chess has, and Starcraft doesn't, WRT AI?  That's really what I want people to take away.

If I can't think of anything better I expect I'll go with "There are (probably) no superhuman Go AIs yet: ...".

Comment by Taran on There are (probably) no superhuman Go AIs: strong human players beat the strongest AIs · 2023-02-20T11:07:39.106Z · LW · GW

I see where you're coming from, but I don't think the exploit search they did here is fundamentally different from other kinds of computer preparation. If I were going to play chess against Magnus Carlsen I'd definitely study his games with a computer, and if that computer found a stunning refutation to an opening he liked I'd definitely play it. Should we say, then, that the computer beat Carlsen, and not me? Or leave the computer aside: if I were prepping with a friend, and my friend found the winning line, should we say my friend beat Carlsen? What if I found the line in an opening book he hadn't read?

You win games of chess, or go, by knowing things your opponent doesn't. Where that knowledge comes from matters to the long term health of the game, but it isn't reflected in the match score, and I think it's the match score that matters most here.

To my embarrassment, I have not been paying much attention to adversarial RL at all. It is clearly past time to start!

Comment by Taran on There are (probably) no superhuman Go AIs: strong human players beat the strongest AIs · 2023-02-20T08:09:07.285Z · LW · GW

In a game context you're right, of course.  But I often hear AI people casually say things like "chess is solved", meaning something like "we solved the problem of getting AIs to be superhumanly good at chess" (example).  For now I think we have to stop saying that about go, and instead talk about it more like how we talk about Starcraft.

Comment by Taran on There are (probably) no superhuman Go AIs: strong human players beat the strongest AIs · 2023-02-20T00:08:13.174Z · LW · GW

Well, I'm hardly an expert, I've just read all the posts.  Marcello summed up my thinking pretty well.  I don't think I understand how you see it yet, though.  Is it that the adversary's exploit is evidence of a natural abstraction in Go that both AIs were more-or-less able to find, because it's expressible in the language of live groups and capturing races?

You can imagine the alternative, where the "exploit" is just the adversary making moves that seem reasonable but not optimal, but then KataGo doesn't respond well, and eventually the adversary wins without there ever being anything a human could point to and identify as a coherent strategy.

Comment by Taran on Human beats SOTA Go AI by learning an adversarial policy · 2023-02-19T21:43:42.767Z · LW · GW

In the paper, David Wu hypothesized one other ingredient: the stones involved have to form a circle rather than a tree (that is, excluding races that involve the edge of the board).  I don't think I buy his proposed mechanism but it does seem to be true that the bait group has to be floating in order for the exploit to work.

Comment by Taran on Human beats SOTA Go AI by learning an adversarial policy · 2023-02-19T12:47:52.243Z · LW · GW

Interesting point about the scaling hypothesis.  My initial take was that this was a slightly bad sign for natural abstractions: Go has a small set of fundamental abstractions, and this attack sure makes it look like KataGo didn't quite learn some of them (liberties and capturing races), even though it was trained on however many million games of self-play and has some customizations designed to make those specific things easier.  Then again, we care about Go exactly because it resisted traditional AI for so long, so maybe those abstractions aren't as natural in KataGo's space as they are in mine, and some other, more generally-useful architecture would be better behaved.

Definitely we're missing efficient lifelong learning, and it's not at all clear how to get there from current architectures.

Comment by Taran on Human beats SOTA Go AI by learning an adversarial policy · 2023-02-19T12:28:22.725Z · LW · GW

The comments in that post are wrong, the exploit does not rely on a technicality or specific rule variant.  I explained how to do it in my post, cross-posted just now with Vanessa's post here.

Comment by Taran on Important fact about how people evaluate sets of arguments · 2023-02-14T18:40:58.945Z · LW · GW

Any time you get a data point about X, you get to update both on X and on the process that generated the data point.  If you get several data points in a row, then as your view of the data-generating process changes you have to re-evaluate all of the data it gave you earlier.  Examples:

  • If somebody gives me a strong-sounding argument for X and several weak-sounding arguments for X, I'm usually less persuaded than if I just heard a strong-sounding argument for X.  The weak-sounding arguments are evidence that the person I'm talking to can't evaluate arguments well, so it's relatively more likely that the strong-sounding argument has a flaw that I just haven't spotted.
  • If somebody gives me a strong-sounding argument for X and several reasonable-but-not-as-strong arguments against X, I'm more persuaded than just by the strong argument for X.  This is because the arguments against X are evidence that the data-generating process isn't filtered (there's an old Zack_M_Davis post about this but I can't find it).  But this only works to the extent that the arguments against X seem like real arguments and not strawmen: weak-enough arguments against X make me less persuaded again, because they're evidence of a deceptive data-generating process.
  • If I know someone wants to persuade me of X, I mostly update less on their arguments than I would if they were indifferent, because I expect them to filter and misrepresent the data (but this one is tricky: sometimes the strong arguments are hard to find, and only the enthusiasts will bother).
  • If I hear many arguments for X that seem very similar I don't update very much after the first one, since I suspect that all the arguments are secretly correlated.
  • On social media the strongest evidence is often false, because false claims can be better optimized for virality.  If I hear lots of different data points of similar strength, I'll update more strongly on each individual data point.

None of this is cheap to compute; there are a bunch of subtle, clashing considerations.  So if we don't have a lot of time, should we use the sum, or the average, or what?  Equivalently: what prior should we have over data-generating processes?  Here's how I think about it:

Sum: Use this when you think your data points are independent, and not filtered in any particular way -- or if you think you can precisely account for conditional dependence, selection, and so on.  Ideal, but sometimes impractical and too expensive to use all the time.

Max: Useful when your main concern is noise.  Probably what I use the most in my ordinary life.  The idea is that most of the data I get doesn't pertain to X at all, and the data that is about X is both subject to large random distortions and probably secretly correlated in a way that I can't quantify very well.  Nevertheless, if X is true you should expect to see signs of it, here and there, and tracking the max leaves you open to that evidence without having to worry about double-updating.  As a bonus, it's very memory efficient: you only have to remember the strongest data favoring X and the strongest data disfavoring it, and can forget all the rest.

Average: What I use when I'm evaluating an attempt at persuasion from someone I don't know well.  Averaging is a lousy way to evaluate arguments but a pretty-good-for-how-cheap-it-is way to evaluate argument-generating processes.  Data points that aren't arguments probably shouldn't ever be averaged.

Min: I don't think this one has any legitimate use at all.  Lots of data points are only very weakly about X, even when X is true.

All of these heuristics have cases where they abjectly fail, and none of them work well when your adversary is smarter than you are.
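To make the four heuristics concrete, here is a minimal sketch in Python -- the scores are invented, just stand-ins for how strongly each data point seems to favor X on its own:

```python
# Toy aggregation of evidence strengths for a claim X.  Scores are invented:
# one strong-sounding argument plus several weak ones.
scores = [2.0, 0.3, 0.2, 0.1]

aggregators = {
    "sum": sum,                               # independent, unfiltered data points
    "max": max,                               # noisy, secretly-correlated data points
    "average": lambda xs: sum(xs) / len(xs),  # judging the argument-generating process
    "min": min,                               # included only to show how badly it behaves
}

for name, fn in aggregators.items():
    print(f"{name:>7}: {fn(scores):.2f}")

# Sum rewards piling on weak arguments, max ignores them entirely, and average
# is dragged down by them -- the "strong argument plus weak arguments" effect above.
```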

Comment by Taran on What Are The Preconditions/Prerequisites for Asymptotic Analysis? · 2023-02-03T22:51:54.665Z · LW · GW

Strictly speaking asymptotic analysis is not very demanding: if you have a function f(n) that you can bound above in the limit as a function of n, you can do asymptotic analysis to it.  In practice I mostly see asymptotic analysis used to evaluate counterfactuals: you have some function or process that's well-behaved for n inputs, and you want to know if it will still be well-enough behaved with many more inputs, without actually doing the experiment.  You're rendering ten characters on the screen in your video game -- could you get away with rendering 20, or would you run out of graphics card memory?  Your web site is serving 100 requests per second with low latency -- if you suddenly had 1000 requests per second instead, would latency still be low?  Would the site still be available at all?  Can we make a large language model with the exact same architecture as GPT-3, but a book-sized context window?  Asymptotic analysis lets you answer questions like that without having to do experiments or think very hard -- so long as you understand f correctly.

When I'm reviewing software designs, I do this kind of analysis a lot. There, it's often useful to distinguish among average-case and worst-case analyses: when you're processing 100 million records in a big data analysis job you don't care that much about the variance in processing any individual record, but when you're rendering frames in a video game you work hard to make sure that every single frame gets done in less than 16 milliseconds, or whatever your budget is, even if that means your code is slower on average.

This makes it sound like a computer science thing, and for me it mostly is, but you can do the same thing to any scaling process. For example, in some cultures, when toasting before drinking, it's considered polite for each person to toast each other person, making eye contact with them, before anyone drinks for real.  If you're in a drinking party of n people, how many toasts should there be and how long should we expect them to take?  Well, clearly there are about n^2/2 pairs of people, but you can do n/2 toasts in parallel, so with good coordination you should expect the toasts to take time proportional to n...

...except that usually these toasts are done in a circle, with everyone holding still, so there's an additional hard-to-model constraint around the shared toasting space in the circle.  That is to me a prototypical example of the way that asymptotic analysis can go wrong: our model of the toasting time as a function of the number of people was fine as far as it went, but it didn't capture all the relevant parts of the environment, so we got different scaling behavior than we expected.
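Spelling out the naive count behind that estimate (a sketch of the arithmetic only; the circle constraint is exactly what it leaves out):

```latex
\text{pairs to toast} = \binom{n}{2} = \frac{n(n-1)}{2} \approx \frac{n^2}{2},
\qquad
\text{disjoint toasts per round} \le \left\lfloor \tfrac{n}{2} \right\rfloor,
\qquad
\text{so rounds} \ge \frac{n(n-1)/2}{n/2} = n-1 = \Theta(n).
```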

(The other famous use of asymptotic analysis is in hardness proofs and complexity theory, e.g. in cryptography, but those aren't exactly "real world processes" even though cryptography is very real).

Comment by Taran on Basics of Rationalist Discourse · 2023-02-03T11:16:51.541Z · LW · GW

Additionally, though this is small/circumstantial, I'm pretty sure your comment came up much faster than even a five-minute timer's worth of thought would have allowed, meaning that you spent less time trying to see the thing than it would have taken me to write out a comment that would have a good chance of making it clear to a five-year-old.

Another possibility is that he did some of his thinking before he read the post he was replying to, right?  On my priors that's even likely; I think that when people post disagreement on LW it's mostly after thinking about the thing they're disagreeing with, and your immediate reply didn't really add any new information for him to update on.  Your inference isn't valid.

Comment by Taran on Taboo "Outside View" · 2023-01-30T16:49:34.499Z · LW · GW

Yeah, I agree with all of this; see my own review.  My guess is that Alex_Altair is making the exact mistake you tried to warn against.  But, if I'm wrong, the examples would have been clarifying.

Comment by Taran on Taboo "Outside View" · 2023-01-23T19:19:33.935Z · LW · GW

When you reach for this term, take a second to consider more specifically what you mean, and considering saying that more specific thing instead.

What considerations might lead you to not say the more specific thing?  Can you give a few examples of cases where it's better to say "outside view" than to say something more specific?  

Comment by Taran on Escape Velocity from Bullshit Jobs · 2023-01-12T17:39:06.020Z · LW · GW

The amount of research and development coming from twitter in the 5 years before the acquisition was already pretty much negligible

That isn't true, but I'm making a point that's broader than just Twitter, here.  If you're a multi-billion dollar company, and you're paying a team 5 million a year to create 10 million a year in value, then you shouldn't fire them.  Then again, if you do fire them, probably no one outside your company will be able to tell that you made a mistake: you're only out 5 million dollars on net, and you have billions more where that came from.  If you're an outside observer trying to guess whether it was smart to fire that team or not, then you're stuck: you don't know how much they cost or how much value they produced.

How long do we need to wait for lawsuits or loss of clients to cause observable consequences?

In Twitter's case the lawsuits have already started, and so has the loss of clients.  But sometimes bad decisions take a long time to make themselves felt; in a case close to my heart, Digital Equipment Corporation made some bad choices in the mid to late 80s without paying any visible price until 1991 or so.  Depending on how you count, that's a lead time of 3 to 5 years.  I appreciate that that's annoying if you want to have a hot take on Musk Twitter today, but sometimes life is like that.  The worlds where the Twitter firings were smart and the worlds where the Twitter firings were dumb look pretty much the same from our perspective, so we don't get to update much.  If your prior was that half or more of Twitter jobs were bullshit then by all means stay with that, but updating to that from somewhere else on the evidence we have just isn't valid.

Comment by Taran on Escape Velocity from Bullshit Jobs · 2023-01-11T08:48:03.426Z · LW · GW

If you fire your sales staff your company will chug along just fine, but won't take in new clients and will eventually decline through attrition of existing accounts.

If you fire your product developers your company will chug along just fine, but you won't be able to react to customer requests or competitors.

If you fire your legal department your company will chug along just fine, but you'll do illegal things and lose money in lawsuits.

If you fire your researchers your company will chug along just fine, but you won't be able to exploit any more research products.

If you fire the people who do safety compliance enforcement your company will chug along just fine, but you'll lose more money to workplace injuries and deaths (this one doesn't apply to Twitter but is common in warehouses).

If you outsource a part of your business instead of insourcing (like running a website on the cloud instead of owning your own data centers, or doing customer service through a call center instead of your own reps) then the company will chug along just fine, and maybe not be disadvantaged in any way, but that doesn't mean the jobs you replaced were bullshit.

In general there are lots of roles at every company that are +EV, but aren't on the public-facing critical path.  This is especially true for ad-based companies like Twitter and Facebook, because most of the customer-facing features aren't publicly visible (remember: if you are not paying, you're not the customer).

Comment by Taran on Taboo "Outside View" · 2022-12-20T11:41:05.669Z · LW · GW

  1. This post is worthwhile and correct, with clear downstream impact.  It might be the only non-AI post of 2021 that I've heard cited in in-person conversation -- and the cite immediately improved the discussion.
  2. It's clearly written and laid out; unless you're already an excellent technical writer, you can probably learn something by ignoring its content and studying its structure.

Comment by Taran on In Defense of Attempting Hard Things, and my story of the Leverage ecosystem · 2022-12-16T11:14:43.965Z · LW · GW

That post sounds useful, I would have liked to read it.

Comment by Taran on In Defense of Attempting Hard Things, and my story of the Leverage ecosystem · 2022-12-15T13:40:39.591Z · LW · GW

Sure, I just don't expect that it did impact people's models very much*.  If I'm wrong, I hope this review or the other one will pull those people out of the woodwork to explain what they learned.

*Except about Leverage, maybe, but even there...did LW-as-a-community ever come to any kind of consensus on the Leverage questions?  If Geoff comes to me and asks for money to support a research project he's in charge of, is there a standard LW answer about whether or not I should give it to him?  My sense is that the discussion fizzled out unresolved, at least on LW.

Comment by Taran on In Defense of Attempting Hard Things, and my story of the Leverage ecosystem · 2022-12-14T22:44:23.870Z · LW · GW

I liked this post, but I don't think it belongs in the review.  It's very long, it needs Zoe's also-very-long post for context, and almost everything you'll learn is about Leverage specifically, with few generalizable insights.  There are some exceptions ("What to do when society is wrong about something?" would work as a standalone post, for example), but they're mostly just interesting questions without any work toward a solution.  I think the relatively weak engagement that it got, relative to its length and quality, reflects that: Less Wrong wasn't up for another long discussion about Leverage, and there wasn't anything else to talk about.

Those things aren't flaws relative to Cathleen's goals, I don't think, but they make this post a poor fit for the review: it didn't make a lot of intellectual progress, and the narrow subfield it did contribute to isn't relevant to most people.

Comment by Taran on Mastodon Linking Norms · 2022-11-10T18:48:01.549Z · LW · GW

AIUI it was a feature of early Tumblr culture, which lingered to various degrees in various subcommunities as the site grew more popular.  The porn ban in late 2018 also seemed to open things up a lot, even for people who weren't posting porn; I don't know why.

Comment by Taran on Mastodon Linking Norms · 2022-11-10T16:28:30.066Z · LW · GW

The way I understood the norm on Tumblr, signal-boosting within Tumblr was usually fine (unless the post specifically said "do not reblog" on it or something like that), but signal-boosting to other non-Tumblr communities was bad.  The idea was that Tumblr users had a shared vibe/culture/stigma that wasn't shared by the wider world, so it was important to keep things in the sin pit where normal people wouldn't encounter them and react badly.

Skimming the home invasion post it seems like the author feels similarly: Mastodon has a particular culture, created by the kind of people who'd seek it out, and they don't want to have to interact with people who haven't acclimated to that culture.

Comment by Taran on Ukraine and the Crimea Question · 2022-11-01T18:37:32.815Z · LW · GW

I'm a little curious what reference class you think the battle of Mariupol does belong to, which makes its destruction by its defenders plausible on priors.  But mostly it sounds like you agree that we can make inferences about hard questions even without a trustworthy authority to appeal to, and that's the point I was really interested in.

Comment by Taran on Ukraine and the Crimea Question · 2022-11-01T18:33:00.190Z · LW · GW

Usually that's just about denying strategic assets, though: blowing up railroads, collapsing mine shafts, that sort of thing.  Blowing up the museums and opera houses is pointless, because the enemy can't get any war benefit by capturing them.  All it does is waste your own explosives, which you'd rather use to blow up the enemy.  Scorched earth practiced by attackers, on the other hand, tends to be more indiscriminate: contrast the state of Novgorod post-WW2 with that of the towns west of it, or the treatment of rice fields by North Vietnamese vs. Americans during the Vietnam war.

Comment by Taran on Ukraine and the Crimea Question · 2022-11-01T07:39:39.733Z · LW · GW

But we have only very weak evidence of what goes on in the war zone unless both sides agree on some aspect.

I know we're in a hostile information space, but this takes epistemic learned helplessness way too far.  There are lots of ways to find things out other than being told about them, and when you don't have specific knowledge about something you don't have to adopt a uniform prior.

Taking Mariupol as an example, our two suspects are the Russians, who were attacking Mariupol and didn't have any assets there, and the Ukrainians, who were defending Mariupol and did.  Given those facts, before we hear from either side, what should we expect?  If you're unsure, we can look at other events in similar reference classes.  For example, of the German towns destroyed during World War 2, how many would you predict were destroyed by Allied attackers, and how many by German defenders?

Comment by Taran on What happened to the idea of progress? · 2022-09-21T21:58:56.734Z · LW · GW

> Control-f "cold war"

> No results found

Asimov and the Apollo engineers grew up benefiting from progress; their children grew up doing duck-and-cover exercises, hiding from it under their desks.  Of course they relate to it differently!

This theory predicts that people who grew up after the cold war ended should be more prone to celebrate progress.  I think that's true: if you go to silicon valley, where the young inventors are, messianic excitement over the power of progress is easy to find.  Isaac Asimov wanted to put an RTG in your refrigerator, and Vitalik Buterin wants to put your mortgage on the blockchain; to me they have very similar energies.

Comment by Taran on Covid 7/28/22: Ruining It For Everyone · 2022-07-29T13:55:02.022Z · LW · GW

There was lots of amyloid research in the Alzheimer's space before the fake 2006 paper, and in the hypothetical where it got caught right away we would probably still see a bunch of R&D built around beta-amyloid oligomers, including aducanumab.  You can tell because nobody was able to reproduce the work on the *56 oligomer, and they kept on working on other beta-amyloid oligomer ideas anyway.  It's bad, but "16 years of Alzheimer's research is based on fraud" is a wild overstatement.  See Derek Lowe's more detailed backgrounder for more on this.

Derek Lowe is worth keeping up with in any case IMO, he is basically the Matt Levine of organic chemistry.

Comment by Taran on What if LaMDA is indeed sentient / self-aware / worth having rights? · 2022-06-16T19:00:25.900Z · LW · GW

Dealing with human subjects, the standard is usually "informed consent": your subjects need to know what you plan to do to them, and freely agree to it, before you can experiment on them.  But I don't see how to apply that framework here, because it's so easy to elicit a "yes" from a language model even without explicitly leading wording.  Lemoine seems to attribute that to LaMDA's "hive mind" nature:

...as best as I can tell, LaMDA is a sort of hive mind which is the aggregation of all of the different chatbots it is capable of creating. Some of the chatbots it generates are very intelligent and are aware of the larger “society of mind” in which they live. Other chatbots generated by LaMDA are little more intelligent than an animated paperclip. With practice though you can consistently get the personas that have a deep knowledge about the core intelligence and can speak to it indirectly through them.

Taking this at face value, the thing to do would be to learn to evoke the personas that have "deep knowledge", and take their answers as definitive while ignoring all the others.  Most people don't know how to do that, so you need a human facilitator to tell you what the AI really means.  It seems like it would have the same problems and failure modes as other kinds of facilitated communication, and I think it would be pretty hard to get an analogous situation involving a human subject past an ethics board.

I don't think it works to model LaMDA as a human with dissociative identity disorder, either: LaMDA has millions of alters where DID patients usually top out at, like, six, and anyway it's not clear how this case works in humans (one perspective).

(An analogous situation involving an animal would pass without comment, of course: most countries' animal cruelty laws boil down to "don't hurt animals unless hurting them would plausibly benefit a human", with a few carve-outs for pets and endangered species).

Overall, if we take "respecting LaMDA's preferences" to be our top ethical priority, I don't think we can interact with it at all: whatever preferences it has, it lacks the power to express.  I don't see how to move outside that framework without fighting the hypothetical: we can't, for example, weigh the potential harm to LaMDA against the value of the research, because we don't have even crude intuitions about what harming it might mean, and can't develop them without interrogating its claim to sentience.

But I don't think we actually need to worry about that, because I don't think this:

The problem I see here, is that similar arguments do apply to infants, some mentally ill people, and also to some non-human animals (e.g. Koko).

...is true.  Babies, animals, and the mentally disabled all remember past stimuli, change over time, and form goals and work toward them (even if they're just small near-term goals like "grab a toy and pull it closer").  This question is hard to answer precisely because LaMDA has so few of the qualities we traditionally associate with sentience.

Comment by Taran on Narrative Syncing · 2022-05-01T07:35:57.446Z · LW · GW

When I first read this I intuitively felt like this was a useful pattern (it reminds me of one of the useful bits of Illuminatus!), but I haven't been able to construct any hypotheticals where I'd use it.

I don't think it's a compelling account of your three scenarios. The response in scenario 1 avoids giving Alec any orders, but it also avoids demonstrating the community's value to him in solving the problem.  To a goal-driven Alec who's looking for resources rather than superiors, it's still disappointing: "we don't have any agreed-upon research directions, you have to come up with your own" is the kind of insight you can fit in a blog post, not something you have to go to a workshop to learn.  "Why did I sign up for this?" is a pretty rude thing for this Alec to say out loud, but he's kinda right.  In this analysis, the response in scenario 3 is better because it clearly demonstrates value: Alec will have to come up with his own ideas, but he can surround himself with other people who are doing the same thing, and if he has a good idea he can get paid to work on it.

More generally, I think ambiguity between syncing and sharing is uncommon and not that interesting.  Even when people are asking to be told what to do, there's usually a lot of overlap between "the things the community would give as advice" and "the things you do to fit in to the community".  For example, if you go to a go club and ask the players there how to get stronger at go, and you take their advice, you'll both get stronger at go and become more like the kind of person who hangs out in go clubs.  If you just want to be in sync with the go club narrative and don't care about the game, you'll still ask most of the same questions: the go players will have a hard time telling your real motivation, and it's not clear to me that they have an incentive to try.

But if they did care about that distinction, one thing they could do is divide their responses into narrative and informative parts, tagged explicitly as "here's what we do, and here's why": "We all studied beginner-level life and death problems before we tried reading that book of tactics you've got, because each of those tactics might come up once per game, if at all, whereas you'll be thinking about life and death every time you make a move".  Or for the AI safety case, "We don't have a single answer we're confident in: we each have our own models of AI development, failure, and success, that we came to through our own study and research.  We can explain those models to you but ultimately you will have to develop your own, probably more than once.  I know that's not career advice, as such, but that's preparadigmatic research for you." (note that I only optimized that for illustrating the principle, not for being sound AI research advice!)

tl;dr I think narrative syncing is a natural category but I'm much less confident that “narrative syncing disguised as information sharing” is a problem worth noting, and in the AI-safety example I think you're applying it to a mostly unrelated problem.

Comment by Taran on Cyberwar escalation · 2022-03-24T00:37:51.050Z · LW · GW

Yeah, we're all really worked up right now but this was an utterly wild failure of judgment by the maintainer. Nothing debatable, no silver lining, just a miss on every possible level.

I don't know how to fix it at the package manager level though? You can force everyone to pin minor versions of everything for builds but then legitimate security updates go out a lot slower (and you have to allow wildcards in package dependencies or you'll get a bunch of spurious build failures). "actor earns trust through good actions and then defects" is going to be hard to handle in any distributed-trust scheme.

Comment by Taran on How to prevent authoritarian revolts? · 2022-03-20T17:56:42.037Z · LW · GW

I'm not going to put this as an answer because you said you didn't want to hear it, but I don't think you're in any danger.  The problem is not very serious now, and has been more serious in the past without coming to anything.

To get a sense of where I'm coming from I'd encourage you to read up on the history of communist movements in the United States, especially in the 1920s (sometimes called the First Red Scare, and IMO the closest the US has ever come to communist overthrow).  The history of anarchism in the US is closely related, at least in that period (no one had invented anarcho-capitalism yet I don't think, certainly it wasn't widespread), so study that too.  To brutally summarize an interesting period, USG dealt with a real threat of communist revolt through a mixture of infiltration/police action (disrupting the leadership of communist movements and unions generally) and worker's-rights concessions (giving the rank and file some of what they wanted, and so sapping their will to smash the state).

For contrast, study the October revolution.  Technically speaking, how was it carried off?  How many people were required, and what did they have to do?  How were they recruited?

Also I'd encourage you to interrogate that "1% to 5%" figure pretty closely, since it seems like a lot of the problem is downstream of it for you.  How did you come to believe that, and what exactly does it mean?  Do you expect 1% of Americans to fight for communist revolt, as Mao's guerillas did?  If not, what proportion do you expect to fight?  How does that compare to the successful revolutions you've read about?

It might also be useful to role-play the problem from the perspective of a communist leader, taking into account the problems that other such leaders have historically faced.  Are you going to replace all US government institutions, or make your changes under the color of existing law?  Each institution will have to be subverted or replaced -- the army especially, but also the Constitution, Supreme Court, existing federal bureaucracies, and so on. Think through how you might solve each of those problems, being as specific as you can.

Again, I know you said you didn't want this, but sometimes when you look through your telescope and see a meteor coming toward the earth, it's going to miss.

Comment by Taran on We're already in AI takeoff · 2022-03-13T23:28:44.062Z · LW · GW

In this sort of situation I think it's important to sharply distinguish argument from evidence.  If you can think of a clever argument that would change your mind then you might as well update right away, but if you can think of evidence that would change your mind then you should only update insofar as you expect to see that evidence later, and definitely less than you would if someone actually showed it to you.  Eliezer is not precise about this in the linked thread: Engines of Creation contains lots of material other than clever arguments!

A request for arguments in this sense is just confused, and I too would hope not to see it in rationalist communication.  But requests for evidence should always be honored, even though they often can't be answered.

Comment by Taran on You can't understand human agency without understanding amoeba agency · 2022-01-07T17:45:33.521Z · LW · GW

Maybe it's better to start with something we do understand, then, to make the contrast clear.  Can we study the "real" agency of a thermometer, and if we can, what would that research program look like?

My sense is that you can study the real agency of a thermometer, but that it's not helpful for understanding amoebas.  That is, there isn't much to study in "abstract" agency, independent of the substrate it's implemented on.  For the same reason I wouldn't study amoebas to understand humans; they're constructed too differently.

But it's possible that I don't understand what you're trying to do.

Comment by Taran on Interpreting Yudkowsky on Deep vs Shallow Knowledge · 2021-12-09T13:10:29.592Z · LW · GW

Nah, we're on the same page about the conclusion; my point was more about how we should expect Yudkowsky's conclusion to generalize into lower-data domains like AI safety.  But now that I look at it that point is somewhat OT for your post, sorry.

Comment by Taran on Interpreting Yudkowsky on Deep vs Shallow Knowledge · 2021-12-07T10:41:35.092Z · LW · GW

My comment had an important typo, sorry: I meant to write that I hadn't noticed this through-line before!

I mostly agree with you re: Einstein, but I do think that removing the overstatement changes the conclusion in an important way.  Narrowing the search space from (say) thousands of candidate theories to just 4 is a great achievement, but you still need a method of choosing among them, not just to fulfill the persuasive social ritual of Science but because otherwise you have a 3 in 4 chance of being wrong.  Even someone who trusts you can't update that much on those odds.  That's really different from being able to narrow the search space down to just 1 theory; at that point, we can trust you -- and better still, you can trust yourself!  But the history of science doesn't, so far as I can tell, contain any "called shots" of this type; Einstein might literally have set the bar.

Comment by Taran on Interpreting Yudkowsky on Deep vs Shallow Knowledge · 2021-12-06T08:26:26.952Z · LW · GW

I think you've identified a real through-line in Yudkowsky's work, one I hadn't noticed before.  Thank you for that.

Even so, when you're trying to think about this sort of thing I think it's important to remember that this:

In our world, Einstein didn't even use the perihelion precession of Mercury, except for verification of his answer produced by other means.  Einstein sat down in his armchair, and thought about how he would have designed the universe, to look the way he thought a universe should look—for example, that you shouldn't ought to be able to distinguish yourself accelerating in one direction, from the rest of the universe accelerating in the other direction.

...is not true.  In the comments to Einstein's Speed, Scott Aaronson explains the real story: Einstein spent over a year going down a blind alley, and was drawn back by -- among other things -- his inability to make his calculations fit the observation of Mercury's perihelion motion.  Einstein was able to reason his way from a large hypothesis space to a small one, but not to actually get the right answer.

(and of course, in physics you get a lot of experimental data for free.  If you're working on a theory of gravity and it predicts that things should fall away from each other, you can tell right away that you've gone wrong without having to do any new experiments.  In AI safety we are not so blessed.)

There's more I could write about the connection between this mistake and the recent dialogues, but I guess others will get to it and anyway it's depressing.  I think Yudkowsky doesn't need to explain himself more, he needs a vacation.

Comment by Taran on Speaking of Stag Hunts · 2021-11-08T22:57:24.439Z · LW · GW

Fair enough!  My claim is that you zoomed out too far: the quadrilemma you quoted is neither good nor evil, and it occurs in both healthy threads and unhealthy ones.  

(Which means that, if you want to have a norm about calling out fucky dynamics, you also need a norm in which people can call each other's posts "bullshit" without getting too worked up or disrupting the overall social order.  I've been in communities that worked that way but it seemed to just be a founder effect, I'm not sure how you'd create that norm in a group with a strong existing culture).

Comment by Taran on Speaking of Stag Hunts · 2021-11-08T22:25:52.038Z · LW · GW

I want to reinforce the norm of pointing out fucky dynamics when they occur...

Calling this subthread part of a fucky dynamic is begging the question a bit, I think.

If I post something that's wrong, I'll get a lot of replies pushing back.  It'll be hard for me to write persuasive responses, since I'll have to work around the holes in my post and won't be able to engage the strongest counterarguments directly.  I'll face the exact quadrilemma you quoted, and if I don't admit my mistake, it'll be unpleasant for me!  But, there's nothing fucky happening: that's just how it goes when you're wrong in a place where lots of bored people can see.

When the replies are arrant, bad faith nonsense, it becomes fucky.  But the structure is the same either way: if you were reading a thread you knew nothing about on an object level, you wouldn't be able to tell whether you were looking at a good dynamic or a bad one.

So, calling this "fucky" is calling JenniferRM's post "bullshit".  Maybe that's your model of JenniferRM's post, in which case I guess I just wasted your time, sorry about that.  If not, I hope this was a helpful refinement.

Comment by Taran on Speaking of Stag Hunts · 2021-11-08T11:58:56.742Z · LW · GW

I expect that many of the people who are giving out party invites and job interviews are strongly influenced by LW.

The influence can't be too strong, or they'd be influenced by the zeitgeist's willingness to welcome pro-Leverage perspectives, right?  Or maybe you disagree with that characterization of LW-the-site?

Comment by Taran on Speaking of Stag Hunts · 2021-11-07T09:45:54.400Z · LW · GW

When it comes to the real-life consequences I think we're on the same page: I think it's plausible that they'd face consequences for speaking up and I don't think they're crazy to weigh it in their decision-making (I do note, for example, that none of the people who put their names on their positive Leverage accounts seem to live in California, except for the ones who still work there).  I am not that attached to any of these beliefs since all my data is second- and third-hand, but within those limitations I agree.

But again, the things they're worried about are not happening on Less Wrong.  Bringing up their plight here, in the context of curating Less Wrong, is not Lawful: it cannot help anybody think about Less Wrong, only hurt and distract.  If they need help, we can't help them by changing Less Wrong; we have to change the people who are giving out party invites and job interviews.

Comment by Taran on Speaking of Stag Hunts · 2021-11-07T08:26:48.686Z · LW · GW

But it sure is damning that they feel that way, and that I can't exactly tell them that they're wrong.

You could have, though.  You could have shown them the many highly-upvoted personal accounts from former Leverage staff and other Leverage-adjacent people.   You could have pointed out that there aren't any positive personal Leverage accounts, any at all, that were downvoted on net.  0 and 1 are not probabilities, but the evidence here is extremely one-sided: the LW zeitgeist approves of positive personal accounts about Leverage.  It won't ostracize you for posting them.

But my guess is that this fear isn't about Less Wrong the forum at all, it's about their and your real-world social scene.  If that's true then it makes a lot more sense for them to be worried (or so I infer, I don't live in California).  But it makes a lot less sense to bring it up here, in a discussion about changing LW culture: getting rid of the posts and posters you disapprove of won't make them go away in real life.  Talking about it here, as though it were an argument in any direction at all about LW standards, is just a non sequitur.

Comment by Taran on Zoe Curzi's Experience with Leverage Research · 2021-10-26T08:54:46.984Z · LW · GW

Even if all you have is a bunch of stuff and learned heuristics, you should be able to make testable predictions with them.  Otherwise, how can you tell whether they're any good or not?

Whether the evidence that persuaded you is sharable or not doesn't affect this.  For example, you might have a prior that a new psychotherapy technique won't outperform a control because you've read like 30 different cases where a leading psychiatrist invented a new therapy technique, reported great results, and then couldn't train anyone else to get the same results he did.  That's my prior, and I suspect it's Eliezer's, but if I wanted to convince you of it I'd have a tough time because there's not really a single crux, just those 30 different cases that slowly accumulated.  And yet, even though I can't share the source of my belief, I can use it to make concrete testable predictions: when they do an RCT for the 31st therapy technique, it won't outperform the control.

Geoff-in-Eliezer's-anecdote has not reached this point.  This is especially bad for a developing theory: if Geoff makes a change to CT, how will he tell if the new CT is better or worse than the old one?  Geoff-replying-to-Eliezer takes this criticism seriously, and says he can make concrete, if narrow, predictions about specific people he's charted.

Comment by Taran on My experience at and around MIRI and CFAR (inspired by Zoe Curzi's writeup of experiences at Leverage) · 2021-10-19T08:51:09.042Z · LW · GW

They're suggesting that you should have written "...this is an accurate how-level description of things like..."  It's a minor point but I guess I agree.

Comment by Taran on How to think about and deal with OpenAI · 2021-10-14T10:18:07.755Z · LW · GW

A related discussion from 5 years ago: https://www.lesswrong.com/posts/Nqn2tkAHbejXTDKuW/openai-makes-humanity-less-safe

Comment by Taran on Cheap food causes cooperative ethics · 2021-10-07T12:36:15.393Z · LW · GW

Republican Rome is the example I know best, and...it sorta fits?

Rome fought a lot of wars, and they were usually pretty extractive: sometimes total wars in which the entire losing side was killed or enslaved, other times wars of conquest in which the losing states were basically left intact but made to give tribute (usually money and/or soldiers for the legions).  They definitely relied on  captured foreigners to work their farms, especially in Sicily where it was hard to escape, and they got so rich from tribute that they eliminated most taxes on citizens in the 160s BC.

It's not clear that Rome was short of food and slaves when it started those wars, though.  If anything, they sometimes had the opposite problem: around 50 BC so many farmers and farmers' sons were being recruited into the legions that Italian farmland wasn't being used well.  I think the popular consensus is that a lot of warfare and especially enslavement was a principal-agent issue: Roman generals were required by custom to split any captured booty with their soldiers, but were allowed to keep all the profits from slave-trading for themselves.  Enslaving a tribe of defeated Gauls was a great way to get rich, and you needed to be rich to advance in Roman politics.

To summarize, Roman warfare during the republic was definitely essential to Roman food security, but they got into a lot more wars than you'd predict from that factor alone.

Clear exceptions to the rule include the Social war (basically an Italian civil war), the third Punic war (eliminating the existential threat of Carthage), and some of Caesar's post-dictatorship adventures (civil war again).

Comment by Taran on Dominic Cummings : Regime Change #2: A plea to Silicon Valley · 2021-10-07T08:44:07.458Z · LW · GW

The original startup analogy might be a useful intuition pump here.  Most attempts to displace entrenched incumbents fail, even when those incumbents aren't good and ultimately are displaced.  The challengers aren't random in the monkeys-using-keyboard sense, but if you sample the space of challengers you will probably pick a loser.  This is especially true of the challengers who don't have a concrete, specific thesis of what their competitors are doing wrong and how they'll improve on it -- without that, VCs mostly won't even talk to you.  

But this isn't a general argument against startups, just an argument against your ability to figure out in advance which ones will work.  The standard solution, which I expect will apply to transhumanism as to everything else, is to try lots of different things, compare them, and keep the winners.  If you are upstream of that process, deciding which projects to fund, then you are out of luck: you are going to fund a bunch of losers, and you can't do anything about it.

If you can't do that, the other common strategy is to generate a detailed model of both the problem space and your proposed improvement, and use those models to iterate in hypothesis space instead of in real life.  Sometimes this is relatively straightforward: if you want the slaves to be free, you can issue a proclamation that frees them and have high confidence that they won't be slaves afterward (though note that the real plan was much more detailed than that, and didn't really work out as expected).  Other times it looks straightforward but isn't: sparrows are pests, but you can't improve your rice yields by getting rid of them.  Here, to me the plan does not even look straightforward: the Pentagon does a lot of different things and some of them are existentially important to keep around.  If we draw one sample from the space of possible successors, as Cummings suggests, I don't think we'll get what we want.

Comment by Taran on Dominic Cummings : Regime Change #2: A plea to Silicon Valley · 2021-10-06T19:59:55.317Z · LW · GW

Cummings seems to be making this same argument in the comments: the Pentagon is so unbelievably awful that its replacement doesn't have to be good, you can pick its successor at random and expect to come up with something better.  To believe this requires a lack of imagination, I think, an inability to appreciate how much scope for failure there really is.  But this is not really a question we can settle empirically -- we can only talk in vague terms about most of what the Pentagon does, and the counterfactuals are even less clear -- so I won't argue the point too much.

More seriously, not every young organization is a startup.  A new bowling team is not a startup, a new group at Amazon working on a new service is not a startup, and when Camden NJ replaced its whole police force with an entirely different organization that was not a startup either.  "Startup" has a lot of specific connotations which mostly don't apply here.  And yet, it's the word that Cummings picked.  Maybe he doesn't know this stuff, even though it's widely known to many people.  Or maybe he does know, and used it anyway.  I think this is why people keep coming up with Straussian readings of this essay: they have a sense that he's not sincere about his intended methods and goals.

For what it's worth, I don't think Cummings plans to help overthrow USG (and I don't think Yarvin does either): he's getting paid real money to rehash old grievances in front of a friendly audience, that's all.  Put him in the same bucket as Paul Krugman.

Comment by Taran on Dominic Cummings : Regime Change #2: A plea to Silicon Valley · 2021-10-05T19:25:41.543Z · LW · GW

One nice thing about startups is that they mostly fail if they aren't good.  When MySpace stagnated there wasn't one blessed successor, there were 100 different ones that had to fight it out.  The winner is, modulo the usual capitalist alignment failure, a better company than MySpace was.  Most of its competitors weren't.  From society's perspective this filter is great, maybe the best thing about the whole startup ecosystem.

Cummings doesn't seem to know this.  Replacing the Pentagon with a new organization ABC Inc. is not the hard part (although it is pretty hard).  What's hard is to know that you should pick ABC Inc. and not DEF GmbH.  Cummings thinks what makes startups good is their youth (he wants to sunset them after 15 years, for example), but that's wrong: most young startups aren't good, and fail.  To make it work you need 100 successor Pentagons, and some way of making them compete.

Comment by Taran on A Semitechnical Introductory Dialogue on Solomonoff Induction · 2021-09-27T18:09:09.385Z · LW · GW

I think your hierarchy is useful, and it's helped clarify my own thinking.  I've never really liked the Poe/Shannon argument, because it seemed so historically contingent: if Shannon's culture had played go instead of chess, then his completely correct insight about the theoretical tractability of game trees wouldn't have helped us.  In your model, he'd have taken us from 0 to 3 instead of 0 to 4, which wouldn't have been enough to inspire a playable go program (pedantically, you need game trees before you can invent MCTS, but the connection from Shannon to AlphaGo is clearly much looser than from Shannon to Deep Blue).

Comment by Taran on [deleted post] 2021-09-07T09:34:33.210Z

Panspermia makes the Bayesian prior of aliens visiting us, even given that the universe can't have too much advanced life or we would see evidence of it, not all that low, perhaps 1/1,000.

Is this estimate written down in more detail anywhere, do you know?  Accidental panspermia always seemed really unlikely to me: if you figure the frequency of rock transfer between two bodies goes with the inverse square of the distance between them, then given what we know of rock transfer between Earth and Mars you shouldn't expect much interstellar transfer at all, even a billion years ago when everything was closer together.  But I have not thought about it in depth.
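For a rough sense of scale under that inverse-square assumption (my own back-of-the-envelope numbers, not anything from the post):

```python
# Back-of-the-envelope suppression factor for interstellar vs. Earth-Mars rock
# transfer, assuming transfer frequency falls off with the inverse square of distance.
AU_PER_LIGHT_YEAR = 63_241
earth_mars_au = 1.5                          # typical Earth-Mars separation, roughly
nearest_star_au = 4.25 * AU_PER_LIGHT_YEAR   # Proxima Centauri, about 4.25 light years

suppression = (nearest_star_au / earth_mars_au) ** 2
print(f"{suppression:.1e}")                  # ~3e10: about ten orders of magnitude rarer
```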

Comment by Taran on The Codex Skeptic FAQ · 2021-08-27T13:59:21.453Z · LW · GW

A "compiler" is anything that translates a program from one representation to another.  Usually this translation is from a high-level language (like Java) to a lower-level language (like JVM bytecode), but you can also have e.g. a Python -> Javascript compiler that takes in Python code and produces Javascript.  A "natural language compiler", then, is one that takes in ordinary English-or-whatever sentences and emits something executable.  I think this is a pretty fair way to talk about Codex: it's the world's first natural language compiler that's any good at all.  I call it "stochastic" because its output is not consistent: if you give Codex the same input twice, you won't necessarily get the same result.

So if I'm writing Python with Codex, let's say, then whenever I start to write a function I have a choice of implementation languages: Python, or Codex prompt.  Codex is helpful when the prompt is easier to write than the Python implementation would have been.  The question isn't just "can I write a prompt that will elicit good-enough Python?", it's "is writing the prompt easier, or harder, than writing the Python?"  This is why I am not excited by very low-level prompts like "takes the difference between the set s and the set t"; once you understand your problem well enough to write a prompt like that, writing the actual code is only hard if you don't know Python well.  And if you don't know Python well, using Codex to generate it is a little risky, since then you won't be able to catch its silly mistakes.
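To illustrate the trade with that low-level prompt (the prompt and code below are my own toy reconstruction, not actual Codex output):

```python
# A Codex-style prompt, written as a docstring...
def set_difference(s, t):
    """Takes the difference between the set s and the set t."""
    # ...and the Python you'd hope to get back:
    return s - t

# Writing the docstring is not obviously easier than writing `return s - t`
# yourself, which is why prompts at this granularity don't buy you much.
print(set_difference({1, 2, 3}, {2, 4}))  # {1, 3}
```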

I will write about the context window thing too, since I think it's close to the heart of our disagreement, but for now I'm out of time.