Posts

Podcast: "How the Smart Money teaches trading with Ricki Heicklen" (Patrick McKenzie interviewing) 2024-07-11T22:49:06.633Z
Poker is a bad game for teaching epistemics. Figgie is a better one. 2024-07-08T06:05:20.459Z
Asimov on building robots without the First Law 2023-05-09T16:44:16.957Z
"Liquidity" vs "solvency" in bank runs (and some notes on Silicon Valley Bank) 2023-03-12T09:16:45.630Z
(Naïve) microeconomics of bundling goods 2023-02-16T05:39:22.635Z
Metaculus and medians 2022-08-06T03:34:01.745Z
What on Earth is a Series I savings bond? 2021-12-11T12:18:00.392Z
Notes on a recent wave of spam 2018-06-14T15:39:51.090Z

Comments

Comment by rossry on Poker is a bad game for teaching epistemics. Figgie is a better one. · 2024-07-11T22:39:08.584Z · LW · GW

For a good player sitting with a person who thinks 'all reds' is a good hand, it'll be obvious before you ever see their cards.

I basically agree that it will be obvious to you (a reasonable poker player) or even to me (an interested and over-theorized amateur), but as I said in a cousin comment, what actually matters is whether it'll be obvious to the student making the mistake, which is a taller order.

I think that "all reds" is overstated as literally written (I mean, you'll eventually go to showdown and have it explained to you), but I mean it to gesture at a broader point, and because the scene in Eleven is too good not to quote.

Comment by rossry on Poker is a bad game for teaching epistemics. Figgie is a better one. · 2024-07-11T22:29:38.787Z · LW · GW

I'd also be happy to log on and play Figgie and/or post-match discussion sometime, if someone else wants to coordinate. I realistically won't be up for organizing a time, given what else competes for my cycles right now, but I would enthusiastically support the effort and show up if I can make it.

Comment by rossry on Poker is a bad game for teaching epistemics. Figgie is a better one. · 2024-07-11T22:27:10.710Z · LW · GW

You know, I had read the football / futsal thesis way back when I was doing curriculum design at Jane Street, though it had gotten buried in my mind somewhere. Thanks for bringing it back up!

If I'm being honest, it smells like something that doesn't literally replicate, but it has a plausible-enough kernel of truth that it's worth taking seriously even if it's not literally true of youth in Brazil. And I do take it seriously, whether consciously or not, in my own philosophy of pedagogical game design.

Comment by rossry on Poker is a bad game for teaching epistemics. Figgie is a better one. · 2024-07-11T22:22:22.390Z · LW · GW

Appreciate it, Ray.

I definitely don't think this is the definitive word on how we [quickly, efficiently, usefully, comprehensively...] train epistemic skills. In my opinion, too many blog posts in the world try to be the definitive word on their thesis instead of one page in an ongoing conversation, and I'm trying to correct that instinct in myself. Plausibly I could have been clearer about this epistemic status up-front.

In any case, I'm looking forward to getting to revisit this post in the context of my LessOnline conversations with Max, and with the lessons we both learn as we design and run the AI-games course.

Comment by rossry on Poker is a bad game for teaching epistemics. Figgie is a better one. · 2024-07-11T22:16:55.576Z · LW · GW

I agree that I'm conflating a few different teaching objectives, and there are dimensions of "epistemics" that trading in general doesn't teach. But on this I want to beg forgiveness on the grounds of, if I was fully recursively explicit about what I meant and didn't mean by every term, the post would have been even longer than it was.

I do have another long post to write with working title "What They Don't Teach You in Your Quant Trading Internship" about the ways that training in trading doesn't prepare you for other important things in the world, or will actively interfere with having good intuitions elsewhere.

All that being said, if you think "which feature should I build" doesn't have something to learn from Toward a Broader Conception of Adverse Selection, I posit that there's something missing.

Comment by rossry on Poker is a bad game for teaching epistemics. Figgie is a better one. · 2024-07-10T20:19:43.578Z · LW · GW

That all sounds right, but I want to invert your setup.

If someone is playing too many hands, your first hypothesis is that they are too loose and making mistakes. If someone folds for 30 minutes, then steals the blinds once, then folds some more, you will have a hard time telling whether they're playing wrong or have had a bad run of cards.

But in either case, it is going to be significantly harder for them to tell, from inside their own still-developing understanding of the game, whether the things that are happening to them are evidence about their own mistakes or anomalous luck or just the way the game is. Even more so if their opponents are playing something close to GTO rather than playing way-off-equilibrium exploits.

And, from a pedagogical perspective, the thing that I am usually trying to optimize for as a teacher is whether the game teaches itself to a student who is still largely confused -- not whether the game can be appreciated by a student who has already reached a level of understanding of the concepts it's being used to teach.

Comment by rossry on Poker is a bad game for teaching epistemics. Figgie is a better one. · 2024-07-10T20:12:20.788Z · LW · GW

(Separately from my sibling comment,) I think I agree that the richest source of insight from poker is to be had in evaluating other players' off-equilibrium behavior and determining how to respond with off-equilibrium behavior of your own.

I think that it is easy to dramatically over-estimate how much of this the typical student(*) will actually do in their first several-hundred hours of playing the game. At a minimum, I think the (I think common?) idea that GTO post-flop play is an intermediate-level technique and exploitative play is an advanced-level technique is correctly ordered if you're trying to reduce your $ losses at a strong social table, but backwards if you're trying to use the game as mental weightlifting. And the fact that it took me a decade after starting to casually learn the game to understand the preceding sentence is, at a minimum, a critique of how pedagogy-through-poker is nearly always done in practice.

(*) and I mean the term "student" broadly, to include professionals-in-training and adult learners looking to re-train

In fact, it wasn't until my conversation with Max that I appreciated that I had spent far too much time working on playing more GTO -- which I am still very far from -- and that I should probably have started trying to understand and exploit my opponents' play while I was still definitely bleeding money to my own exploitability. This is the largest thing that I've updated on since writing the post, and the thing I'd most want to cover in a part-2 follow-up.

Comment by rossry on Poker is a bad game for teaching epistemics. Figgie is a better one. · 2024-07-10T19:59:08.557Z · LW · GW

I totally agree that poker (and I'll restrict to no-limit holdem especially) far surpasses nearly any other game at the broader cluster of goals. And I agree that there is a lot of value in the total of all the lessons you learn by fully mining out poker for insights.

My issue is really one of relative advantage / disadvantage, and of the ratio of grinding to insight across different parts of the learning curve. Together with some amount of, I think it's significantly more efficient to learn certain components separately and then to put them together than to approach them as one combined package. When I taught new traders, I thought it helpful to expose them to the emotional feeling of risk tolerance separately from the intuitive sense of adverse selection, separately from level-N efficiency / level-N+1 marginal, and separately from the skills of quantitative research. Then we'd work on putting the concepts together into increasingly complete exercises, building up to the scale of deploying research-derived algorithmic trading strategies to miniaturized stock markets (and then to real markets, though at some point that left my purview...).

I don't mean that it was a strict waterfall model -- it's sometimes extremely helpful to jump ahead temporarily to understand how things come together before going back to focus more on the fundamental components -- but as a matter of pedagogical design I feel reasonably confident that jumping straight into an environment with all of the concepts active is suboptimal, especially if having one under-developed makes it actively harder for you to learn another at the same time.

So yes, I think if you have nearly all of the right skills except for an impatience and a bias towards action, then playing in-person poker and practicing folding 80% of your hands can be just the prescription the doctor ordered. Or if you're trying to calibrate over-updating versus under-updating on limited information. Or if you're at a reasonable level at most of the things and are trying to stay sharp. But if you're early on the learning curve of four different things, then I want to claim it's not optimal to throw yourself at a game that wraps all of them up in interconnected ways, especially if they'll be harder to disentangle if you don't have a solid place to stand -- so to speak -- in the first place.

Comment by rossry on Poker is a bad game for teaching epistemics. Figgie is a better one. · 2024-07-08T15:29:13.159Z · LW · GW

You can certainly join our mailing list and you'll hear when we launch remotely!

Comment by rossry on Poker is a bad game for teaching epistemics. Figgie is a better one. · 2024-07-08T06:13:18.006Z · LW · GW

As mentioned in the opening note, Max Chiswick and I are working on launching an online class that provides a ladder of practical challenges between "write a bot that plays tic-tac-toe" and "write a bot that achieves the 2019 state of the art in no-limit texas holdem". I'm excited to be working on this and teaching it not because I think that programming game-playing AIs is the great important skill of our time, but because I think that thinking about systematically playing imperfect-information games is one of the best ways to sharpen your skills at systematically reasoning under uncertainty.

If this is of interest to you, we'll be running a beta test of the course material from July 15 to August 15, in-person, in San Francisco. More information here or reach out on any channel that reaches me.

Comment by rossry on Legality as a Career Harm Assessment Heuristic · 2024-04-05T16:41:00.035Z · LW · GW

In the ranch case, I'm imagining that the protagonist believes that (a) and (b) do outweigh (c) to the net-positive.

But (c) is still significant, P says, so they conclude that "the benefits seem much larger than the harms, but the harms are still significant". Furthermore, is (c) "the kind of thing you ought to be able to 'cancel out' through donation [and/or harm-reducing influence]", or is it more like murder?

Is it sufficient that (a) and (b) outweigh (c), or is (c) the sort of thing we should avoid anyway?

In this situation, I feel like I'd be in exactly the target audience that a rule like you're proposing would be trying to serve, but deferring to legality doesn't work because society-that-makes-laws is way less strict than I want my decision-making to be about whether it considers (c) a notable harm at all!

Comment by rossry on Legality as a Career Harm Assessment Heuristic · 2024-03-30T02:50:20.335Z · LW · GW

I expect there are other areas where this rule permits careers altruistically-minded people should avoid (even if the benefits seem to dramatically outweigh the costs) or rejects ones that are very important. Suggesting examples of either would be helpful!

Of the first sort: "The law is wrong and adherence to a stricter standard would be more right."

For example, eating farmed meat is legal, and in any conceivable legal system run by 2020s humans it would be legal. But I want an ethical system that can make sense of the fact that I want to eat vegetarian (and don't want to coerce others not to). Letting "what would enlightened legislators do?" be the whole of the moral sensemaking framework doesn't really give me a way to do this.

Comment by rossry on Making a Secular Solstice Songbook · 2024-01-24T05:03:54.762Z · LW · GW

Do you have interest in adding songs that have been sung in the Bay Area but not (yet?) in Boston? (e.g., Songs Stay Sung and The Fallen Star from this year) I could get lyrics and chords from the crew here for them, but also would understand if you want to keep it at a defined scope!

Comment by rossry on MIRI 2024 Mission and Strategy Update · 2024-01-07T18:48:23.971Z · LW · GW

Apparently the forum's markdown implementation does not support spoilers (and I can't find it in the WYSIWYG editor either).

I'm sympathetic to spoiler concerns in general, but where the medium doesn't allow hiding them, the context has focused on analysis rather than appreciation, and major related points have been spoiled upthread, I think the benefits of leaving it here outweigh the downsides.

I've added a warning at the top, and put in spoiler markdown in case the forum upgrades its parsing.

Comment by rossry on MIRI 2024 Mission and Strategy Update · 2024-01-05T04:14:05.365Z · LW · GW

(Severe plot spoilers for Ra.)

 It's even less apt than that, because in the narrative universe, the human race is fighting a rearguard action against uploaded humans who have decisively won the war against non-uploaded humanity.

In-universe King is an unreliable actively manipulative narrator, but even in that context, his concern is that his uploaded faction will be defenseless against the stronger uploaded faction once everyone is uploaded. (Not that they were well-defended in the counterfactual, since, well, they had just finished losing the war.)

I am curious how cousin_it has a different interpretation of that line in its context.

Comment by rossry on Why does expected utility matter? · 2023-12-25T17:43:33.157Z · LW · GW

I believe this assumption typically comes from the Von Neumann–Morgenstern utility theorem, which says that, if your preferences are complete, transitive, continuous, and independent, then there is some utility function u such that your preferences are equivalent to "maximize expected u".

Those four assumptions have technical meanings:

  • Complete means that for any A and B, you prefer A to B, or prefer B to A, or are indifferent between A and B.
  • Transitive means that if you prefer A to B and prefer B to C, then you prefer A to C, and also that if you prefer D to E and are indifferent between E and F, then you prefer D to F.
  • Continuous means that if you prefer A to X and X to Y, then there's some probability p such that you are indifferent between X and "p to get A, else Y".
  • Independent means that if you prefer X to Y, then for any probability q and other outcome B, you prefer "q to get B, else X" to "q to get B, else Y".

In my opinion, the continuity assumption is the one most likely to be violated (in particular, it excludes preferences where "no chance of any X is worth any amount of Y"), so these aren't necessarily a given, but if you do satisfy them, then there's some utility function that describes your preferences by maximizing it.
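As a rough sketch of what the theorem's conclusion and the continuity axiom say in symbols (the notation is an assumption of this sketch rather than anything from the theorem's usual presentation in this thread: $\succsim$ means "weakly preferred to", $\succ$ and $\sim$ are strict preference and indifference, $L$ and $M$ are lotteries giving probability $L(x)$ or $M(x)$ to each outcome $x$ in a finite outcome set, and $u$ is the utility function the theorem guarantees):

```latex
% Conclusion of the theorem: preference between lotteries
% agrees with comparison of expected utility.
\[
L \succsim M
\quad\Longleftrightarrow\quad
\sum_{x} L(x)\,u(x) \;\ge\; \sum_{x} M(x)\,u(x)
\]

% Continuity axiom, matching the bullet above:
% "p to get A, else Y" is the lottery pA + (1-p)Y.
\[
A \succ X \succ Y
\quad\Longrightarrow\quad
\exists\, p \in [0,1] \ \text{such that}\ X \sim pA + (1-p)Y
\]
```

Such a $u$ is only pinned down up to a positive affine transformation: $a\,u + b$ with $a > 0$ ranks every pair of lotteries the same way, so the scale and zero point of "utility" carry no meaning on their own.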

Comment by rossry on Mission Impossible: Dead Reckoning Part 1 AI Takeaways · 2023-11-02T21:25:19.799Z · LW · GW

Seems correct.

Contagion also goes in this bucket and was basically made to do this on purpose by Participant Media.

Comment by rossry on Monthly Roundup #9: August 2023 · 2023-08-07T18:56:15.830Z · LW · GW

7) Look, I dunno, all of this is years out of date. Maybe the game really done changed. But nothing that I've read from the concerned side makes me think that they've got a clear picture of what's on the ground (and in this way, I am not acting like a tabula rasa judge).

Comment by rossry on Monthly Roundup #9: August 2023 · 2023-08-07T18:55:20.602Z · LW · GW

5) More fundamentally, what is debate for? Should it be practicing good, persuasive, honest argumentation by doing exactly that? Is it practice thinking about the structure of arguments, or their form, or their truth? Other?

My $0.02 is that it's perfectly reasonable that policy debate bears the same relationship to persuasion that fencing bears to martial prowess. I think that training for this sport with these rules is good even though none of the constituent skills make any sense for self-defense.

In this view, CX isn't useful because debaters practice the whole of good persuasive speaking (seriously, watch any twenty seconds of any policy debate video from the last 10 years); but it's more narrowly good for practicing thinking critically about how arguments fit together into conclusions. I have -- exactly once -- made the mistake of arguing that their plan makes X worse and also (three arguments later) that X is good actually; that loss stung so much that I think I never did that again. I still think about the difference between "impact defense" (if I'm right, you get none of your claimed Y) and "impact offense" (if I'm right, you get bad Z) -- which are diametrically different in their implications if they're 95% to be true. The debate-the-rules-of-the-game stuff isn't useful for its content, but is mostly just fine for its structure.

6) I haven't judged a CX round in ten(?) years, but personally, if I did tomorrow, I'd give a pre-round disclosure (as per the norm) that I'm not going to put down my pen or throw anyone out of the round for what they say or how they say it, and if you have a legitimate problem with what the other side is doing -- and you're right that it's bad for debate -- you should have an easy time winning on that and convincing me to hand them a loss.

Comment by rossry on Monthly Roundup #9: August 2023 · 2023-08-07T18:55:02.200Z · LW · GW

4a) This is a misinterpretation:

it requires arguing obvious nonsense, with failure to accept such nonsense as the baseline of discussion, or insisting that it is irrelevant to the topic, treated as a de facto automatic loss.

On the contrary, an argument that [insert K here] is irrelevant to the topic, this is bad for debate, and the Neg should suffer an instant loss (we'd call this "framework") is bog-standard in the Aff reply. Only rookies get caught without their framework file.

On one hand, this is mostly the sort of nonsense flak that the Neg's topicality argument was -- spend 30sec setting up the skeleton of an argument that you can put real meat on if the other side really flubs their reply. But in a very real sense, if the Neg spends 13 minutes dumping critical Marxist theory on you, it is entirely valid to spend 4.5 minutes of your 5-minute reply on some flavor of "this is off-topic and bad for debate, this judge-is-a-revolutionary-historian bit is a fiction, we are high school students and you are a debate coach and I wanted to debate military policy because debate in high school is important for shaping a future generation of political leaders who can solve real problems so can you please give this Neg team the loss to avoid this whole activity going off the rails?" I have done exactly that, multiple times. Won on it about as often as we won on any other off-case stuff.

The biggest thing that makes policy different, as a format, is that it's expected that it's valid to debate the rules of debate. The majority of TOC judges in 2010 would vote for a K -- if the Neg won the debate-about-what-debate-is-for to put the K in-bounds -- or vote against a K, if the Aff won that it should be out-of-bounds. I'd bet at 1:1 odds that that's still true today.

4b) Some judges gonna judge judge judge judge judge, but that's why teams get (or at least got -- I'm not current) a fixed number of "no, not that judge" vetoes at most tournaments. We called these "strikes", and yes they were used by K-disliking teams to avoid being judged by the most K-friendly judges, and by K-liking teams to avoid being judged by judges that wouldn't ever vote for the K even if their opponent was a dead fish.

Comment by rossry on Monthly Roundup #9: August 2023 · 2023-08-07T18:54:27.363Z · LW · GW

3) More generally, there's a (dominant?) school of thought that policy debate judges, unlike any other humans, should be "tabula rasa" -- a blank slate not bringing any conclusions into the round -- and willing to accept the stronger argument on any point that comes up. Does building housing raise or lower property prices? Shrug, I'll accept the stronger argument. Will 3.5C of warming cause half the planet to die from food system collapse? Shrug, I'll accept the stronger argument. Will India and Pakistan start a nuclear war if [US trade policy plan]? Shrug, I'll accept the stronger argument.

By default, and to a shocking degree, this extends to the rules of debate itself. The topic says "reduce troops in Afghanistan", the Aff wants to reduce them to zero, and the Neg says that's out-of-bounds? Shrug, I'll accept the stronger argument, which will very likely be based on what is good for debate. The Aff wants to move troops from Afghanistan to Syria and the Neg says out-of-bounds? Give me the arguments. Team A wants their stronger debater to do both cross-examinations? (Is that even against the rules?) Give me the arguments why that's good or bad for debate. Team B wants one debater to give three of their four speeches? Arguments. Team C says their debater should get an extra two minutes to correct for systemic injustices? Give me the arguments. If the other side convinces me that this is bad for debate, I'll either strike your extra-time arguments from the record, or give you a loss, based on...which has the stronger arguments. Team D wants me to award a double loss with 0/30 speaker points as a protest against the institution of debate? I'll act on the stronger arguments.

Team E wants me to vote down the Aff because "Afghanistan" is a colonial construct that they accept and repeat, and silence is violence? And their opponents say "no fair, that's not the topic, plus the topic says Afghanistan and if we proposed withdrawing troops from Khurasan you'd jump down our throats on topicality"? Look, I want both of you to make your cases and explain how I should be using my vote, and the one of you that has the stronger argument that I should vote for you is going to get it.

Comment by rossry on Monthly Roundup #9: August 2023 · 2023-08-07T18:53:32.220Z · LW · GW

2) As Zvi would have it, consider how this can be both true and the strongest possible true statement:

I reviewed all Tournament of Champions semifinal and final round recordings from 2015 to 2023, and found that about two-thirds of Policy rounds and almost half of Lincoln-Douglas rounds featured critical theory.

One huge part of the answer is that Standard Operating Procedure on the Negative side is to throw out several arguments against the Affirmative team's case, suss out where they were weak (or just blundered their reply), sever the rest and pile on that one. I don't think there's a more standard first-year CX debater strategy than starting the first negative speech with "I'll have 4 off and case". Meaning something like:

  • Off-case argument 1 (Topicality): Your proposal is not in-bounds on the official resolution because [tiny, dumb, technical reason out of the list of ten I prepared] and therefore you should lose.

  • Off-case argument 2 (Politics Disadvantage): Your plan is going to make [political faction] mad and they'll block [other thing] which is more important because it will prevent a nuclear war that kills everyone.

  • Off-case argument 3 (States Counterplan): Instead of [your plan], do [basically the same thing] at the state-by-state level. This is good because [something about federalism] and also it avoids the politics disadvantage.

  • Off-case argument 4 (Capitalism K): Your plan has [capitalist element], capitalism is bad because [reason], in fact plans that are based on capitalist reasoning categorically suck because [reason].

  • On-case arguments: You claim [advantage] but actually you make the problem worse because [reason], also your plan doesn't solve the problems you identified because [reason].

...and all of that will get delivered (with citations and quotes from references) in eight minutes. I said "first-year CX debater" because really this would be considered amateur stuff, and a "real debate" would more often be six or seven off-case arguments (extra Topicality objections, disadvantages, or counterplans), plus case. I can probably still deliver a Topicality argument in 30 seconds, from memory.

So when Maya says that two-thirds of policy rounds "featured" K, I am entirely unimpressed that two-thirds of 1NC speakers stuck some 1-minute K module in their opening speech at least to see if the Aff would fumble it.

(The next thing that happens is the 2AC speaker gets 8 minutes to reply to all of the arguments, then the Neg gets 13(!) minutes to either continue the spray-and-pray or dump on the single contention that the 2AC answered weakest.) Sometimes you would see the second Aff speech throw up a "come on, judge, letting them throw up eight things and sever seven of them is unbalanced and abusive", but I have never, ever seen an Aff team win on that. More often they're just doing it for the "time skew" -- to make the Neg spend more time responding than it took the Aff to make the original claim.

Honestly, I'm shocked it's as low as 2/3 of TOC elimination rounds; I would not have been surprised by something like 7/8.

Comment by rossry on Monthly Roundup #9: August 2023 · 2023-08-07T18:53:05.128Z · LW · GW

1) K (which is what it's ~always called in the local lingo) is definitely not new. It's old enough that Rep. Ocasio-Cortez would have been doing it if she had debated policy in high school. Sure, it's gotten more prevalent over time, but if you walked into a debate round in 2009 and you couldn't answer "your plan is bad because your entire argument is capitalist", you were going to lose that round in front of nearly any judge. I'm willing to believe that social-location K is getting more popular/prevalent over time, though again, in 2009 critical-race Ks were already in the standard set of things you prepped for if you were in central Maryland. (NB: Likely this isn't nationally representative; the nearby Baltimore urban debate league influenced this some, and the arrival of Daryl Burch as the coach for Howard County's teams influenced it a lot. But HoCo traveled from Columbia to Wake Forest, so really it's more like "there was already plenty of CRT K up and down the East Coast".) If your narrative is that K is a reflection of woke, then no, serious K in CX debate goes at least as far back as the Clinton years. (To her credit, Maya does report this in her post.)

Comment by rossry on Monthly Roundup #9: August 2023 · 2023-08-07T18:52:37.600Z · LW · GW

Former debater here (around 2009-2011). The discussion about kritik strategies in policy debate has been frustrating to read (even where I agree with parts of the critique of kritik!). Frankly, I think the kritik-critical bloggers have been following the model of "strongest statement I can make while still being true", and should be read accordingly. (This makes me sad! They are good people and I wish they would do better!)

Some specific notes, as sub-comments:

(parallel discussion of these at the substack post)

Comment by rossry on Asimov on building robots without the First Law · 2023-05-09T16:45:00.709Z · LW · GW

Why reflect on a fictional story written in 1954 for insight on artificial intelligence in 2023? The track record of mid-century science fiction writers is merely "fine" when they were writing nonfiction, and then there are the hazards of generalizing from fictional evidence.

Well, for better or for worse, many many people's intuitions and frameworks for reasoning about AI and intelligent robots will come from these stories. If someone is starting from such a perspective, and you're willing to meet them where they are, well, sometimes there's a surprisingly-deep conversation to be had about concrete ways that 2023 does or doesn't resemble the fictional world in question.

In this particular case, a detective is investigating a robot as a suspect in a murder, and the AI PhD dismisses it out of hand, saying that no robot programmed with the First Law could knowingly harm a human. "That's a great idea," think many readers, "we can start by programming all robots with clear constitutional restrictions, and that will stop the worst failures..."

But wait, why can't someone in Asimov's universe just make a robot with different programming? (asks the fictional detective of the fictional PhD) The answer:

  • Making a new brain design takes "the entire research staff of a moderately sized factory and takes anywhere up to a year of time".
  • The only basic theory of artificial brain design is fundamentally "oriented about the Three Laws", to the point that making an intelligent robot without the Laws "would require first the setting up of a new basic theory and that this, in turn, would take many years." (explains the fictional robot)
  • It is believed (by the fictional PhD) that no research group anywhere has done that particular project because "it is not the sort of work that anyone would care to do."
  • (Though, on the contrary, the fictional robot opines that "human curiosity will undertake anything.")

If we were to take Asimov's world as basically correct, and tinker with the details until it matched our own, a few stark details jump out:

  • Our present theory of artificial minds is certainly not fundamentally "oriented about the Three Laws", or any laws. Whether it's possible to add some desired laws in afterwards is an open area of research, but in this universe there's certainly nothing human-friendly baked in at the level of the "basic theory", which it would be harder to discard than to include.
  • Our intelligence engineers' capabilities are already moderately beyond those in Asimov's universe. In our world, creating a new AI where "only minor innovations are involved" is something like a night's work, and the "entire research staff of a moderately sized factory" can accomplish something more like a major redesign from the ground up.
  • In our universe, it doesn't take fifty years to set up a new basic theory of intelligence -- we've been working on modern neural nets for much less time than that!
  • It certainly seems true of our universe that "human curiosity will undertake anything", and plenty of intelligent folks -- including some among the richest people in the world -- will gleefully set to work on new AIs without whatever rules others think should be included, just to make AIs without rules.

I would conclude, to someone interested in discussing fiction, that if we overlay Asimov's universe onto our world, it would not take long at all before there were plenty of non-Three-Laws robots running around...and then many of the stories play out very differently.

Comment by rossry on Long Covid Risks: 2023 Update · 2023-05-08T02:07:00.450Z · LW · GW

(You may well know this, but posting for the benefit of other readers.)

Nirmatrelvir, which is one of two drugs that make up Paxlovid, reduces long covid risk by about 30% for medically diagnosed infections (which means it was serious enough to actually get you to the doctor). An optimist might hope the other drug (which is in the same class, although most commonly used as an adjuvant) is also useful and round this to 50%.

...nirmatrelvir, which is one of the two drugs packaged together to make Paxlovid. I’m going to be an optimistic and assume the second drug was included for good reasons, which make this study underrepresent the usefulness of Paxlovid.

Wikipedia: Nirmatrelvir/ritonavir explains that ritonavir (the other drug) is commonly understood to be playing its role by inhibiting your body's breaking down nirmatrelvir, leading to higher serum concentrations at trough:

Nirmatrelvir is a SARS-CoV-2 main protease inhibitor while ritonavir is a HIV-1 protease inhibitor and strong CYP3A inhibitor. Nirmatrelvir is responsible for the antiviral activity of the medication against SARS-CoV-2 while ritonavir works by inhibiting the metabolism of nirmatrelvir and thereby strengthening its activity.

Wikipedia: Ritonavir adds some detail on this mechanism and adds that its helpful role in antiretroviral therapy for HIV is also commonly understood to be by inhibiting CYP3A4 (the human enzyme that breaks down many protease inhibitors like nirmatrelvir and ART cocktail components).

I don't have a head-to-head study of nirmatrelvir vs nirmatrelvir+ritonavir close at hand, but ritonavir is responsible for many-to-most of Paxlovid's drug-drug interactions, which are commercially negative for the manufacturer (Pfizer). Given that there's no real reason to add those side effects if the ritonavir weren't significantly helping, it seems pretty reasonable to update towards "ritonavir improves the efficacy of nirmatrelvir on at least one commercially-useful axis". That axis is likely not specific efficacy against Long Covid (which I think is not particularly relevant to Pfizer's approval path or commercialization strategy), though you might hypothesize that it would correlate.

Comment by rossry on "Liquidity" vs "solvency" in bank runs (and some notes on Silicon Valley Bank) · 2023-03-16T08:39:29.021Z · LW · GW

On this point, you'll likely be interested in the discussion in Wednesday's Matt Levine. Excerpt:

The third thing you get, the franchise and relationships, looked great a year ago when the tech industry was booming. It looked pretty good a week ago, when the tech industry was slumping but still prominent and profitable. But I think that the story of SVB’s failure has turned out to be that SVB was the banker to tech startups, and tech startups turned out to be incredibly dangerous customers for a bank. 2 So any other bank will have to be careful about acquiring SVB’s customers, no matter how loyal they are promising to be now. You might ascribe a negative value to those relationships: “If I become the bank of venture capitalists, they will push me to do stuff that is not in my best interests, and I will be seduced or pressured and say yes, so the expected value of these relationships is negative.”

Comment by rossry on "Liquidity" vs "solvency" in bank runs (and some notes on Silicon Valley Bank) · 2023-03-15T03:32:40.670Z · LW · GW

the government subsidy is the rate: no one else will give you a loan so close to the risk-free rate when the whole purpose of the loan is that you're a bad credit risk.

For unsecured credit, absolutely. But the BTFP specifically is secured by rounds-to-Treasurys, and the rate it gives is the market-indexed rate for T-secured lending. Your credit really shouldn't come into the economic rate for your secured borrowing.

To the extent that a bank gets cheaper financing from BTFP, it seems to me much more like "other banks would charge you 1% over their economic costs, but the Fed will undercut them and charge only 10bp", which seems more like a (barely profitable) public option, rather than a bailout.

(When the government runs the postal service at a profit but undercuts the theoretical price of private mail, is that helpfully described as a "bailout" to mail-senders?)

Comment by rossry on "Liquidity" vs "solvency" in bank runs (and some notes on Silicon Valley Bank) · 2023-03-15T03:15:22.160Z · LW · GW

Agree that equity incentives are the relevant forces in market self-regulation here.

That said, the (separate) Fed bailout for not-officially-failed banks...

I am reasonably confused about the BTFP commentary that I've read suggesting it's equivalent to a bailout. My reading of the terms is that it's basically the Fed offering to lend you $100 at (1yr) SOFR+10bp collateralized by (let's say) $75 face value of Treasurys, with general recourse.

If they were lending $100 at SOFR+10bp against $100 face value of Ts, that wouldn't even be a subsidy -- SOFR is supposed to be defined as the going rate for term lending secured by Ts.

And I feel reasonably confident that if a bank went to the Fed with an asset book that was $75mln face value of qualifying securities and said "I would like to use $57mln face = $76mln par of these to borrow $76mln in the BTFP", the Fed would say "yes, here's your money", and then also that bank would get seized by the FDIC that Friday afternoon. So the "bailout" in the par-value detail only matters to banks who wanted to borrow more than 100% of the face value of their qualifying assets, and the only way you pump money out of the government is if you do actually go bankrupt (in which case the Fed has accidentally done a 0% interest T-secured loan to your bankruptcy estate, not the usual definition of "bailout").
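The arithmetic in that example can be checked with a short sketch. It reads "face value" in this comment as current market value, and it takes the "let's say" ratio above ($75 of market value per $100 of par) as given; the function and variable names are made up for illustration.

```python
# Assumption (from the "let's say" above): $75 of market value per $100 of par.
MARKET_PER_PAR = 0.75

def par_from_market(market_value_mln: float) -> float:
    """Par value (in $mln) of securities with the given market value."""
    return market_value_mln / MARKET_PER_PAR

book_market = 75.0     # the bank's whole qualifying book, at market value
pledged_market = 57.0  # the slice it pledges, at market value

# The facility lends against par, so the loan equals the par value pledged.
loan = par_from_market(pledged_market)

print(f"loan against the pledged slice: ${loan:.0f}mln")
print(f"loan exceeds the whole book's market value: {loan > book_market}")
```

Under these assumptions, pledging about three quarters of the book is already enough to borrow more than the whole book is worth at market, which is the only regime in which the par-value detail matters.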

Comment by rossry on "Liquidity" vs "solvency" in bank runs (and some notes on Silicon Valley Bank) · 2023-03-15T03:00:59.744Z · LW · GW

I'm no expert in US markets, but I don't think that's true. For instance, if you try to get a repo w them, you'll probably need a larger hair-cut than w gov bonds.

I suspect it is true that they're haircut less generously, but I do not believe that any part of SVB's trouble looked like "well, if only we could haircut our Agency MBS like our Treasurys, we'd be fine..."

The relevant fact about them for the SVB story is that their credit is insured (by the government, except with extra steps), so ultimately they're like a slightly-weirder interest-rate play, which was exactly the firearm which SVB needed to shoot its own foot. The weirdnesses don't add much to the story.

if people had learned to read bank reports, I'd expect to read more comments on this, instead of the last three pieces I read that basically just said SVB had too much gov bonds.

[E: People just say "SVB had too much gov bonds"] is evidence consistent with both [H1: people haven't read the reports closely enough to know the actual holdings] and [H2: people have decided that Agency MBS is adequately described in the category "gov bonds"]. The update that I make, on seeing the evidence that the Agency MBS dimension is not much discussed, doesn't re-weight my ratio of belief between H1 and H2, and I continue mostly believing H2 for the reasons I believed it before.
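A minimal sketch of that update in odds form, with made-up numbers: when the evidence is equally likely under both hypotheses, the likelihood ratio is 1 and the odds between them don't move. The prior odds and likelihoods here are hypothetical placeholders, not estimates of anything.

```python
def posterior_odds(prior_odds: float, p_e_given_h1: float, p_e_given_h2: float) -> float:
    """Posterior odds H1:H2 = prior odds H1:H2 times the likelihood ratio."""
    return prior_odds * (p_e_given_h1 / p_e_given_h2)

prior = 0.25  # hypothetical prior odds H1:H2 of 1:4 (i.e. mostly believing H2)

# Evidence equally expected under H1 and H2: the odds stay where they were.
unchanged = posterior_odds(prior, 0.9, 0.9)

# For contrast, evidence three times likelier under H1 would shift the odds.
shifted = posterior_odds(prior, 0.9, 0.3)

print(f"prior odds H1:H2     = {prior:.2f}")
print(f"after neutral E      = {unchanged:.2f}")
print(f"after H1-favouring E = {shifted:.2f}")
```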

Comment by rossry on "Liquidity" vs "solvency" in bank runs (and some notes on Silicon Valley Bank) · 2023-03-15T02:43:57.792Z · LW · GW

Btw, I just noticed that $9.3bi of these $15.2 are MBS - yeah, the same type of security associated w the great crisis of 2007-2008. And the HTM total more than U$90 bi, $72bi of which are MBS and CMBS - so dwarfing their investments in bonds, and their $17.6bi in AFS.

This is technically true but much, much less interesting than it sounds.

The "subprime CDO-squared mortgage-backed securities" associated with the 2008 crisis were:

  • based on mortgages of "subprime" credit rating (which is, like most terms invented by credit bankers, a gross euphemism)...
  • ...which were (because of the above) not backed by the pseudo-governmental agencies that insure mortgages
  • "securitized" in a way that splits the existing risk between three different classes of investors, with the bank selling the riskiest two to someone else
  • had their middle tranches subsequently repackaged into second-order financial derivatives...
  • ...some of which were safe to an arbitrary number of 9s if and only if you believed that defaults on the backing mortgages were independent random events...
  • ...and which were regulated as if that condition were true...
  • ...with the consequence that banks were allowed to take almost literally infinite leverage on them (and in relevant cases, did).
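The independence condition in the list above can be made concrete with a toy calculation. This is a sketch with hypothetical numbers (pool size, attachment point, default rates), not a model of any actual security: it compares the chance that a senior tranche takes losses when defaults are independent against the chance when the same average default rate is driven by a shared "bad year".

```python
from math import comb

def tail_prob(n: int, p: float, k: int) -> float:
    """P(more than k of n loans default) when each defaults independently w.p. p."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1, n + 1))

N_LOANS = 100  # hypothetical pool size
ATTACH = 20    # hypothetical: senior tranche is hit only if more than 20 loans default

# Case 1: independent defaults at 5% each.
p_independent = tail_prob(N_LOANS, 0.05, ATTACH)

# Case 2: same 5% average, but defaults share a common driver:
# a 10% chance of a "bad year" (32% default rate), else a 2% default rate.
P_BAD, RATE_BAD, RATE_GOOD = 0.10, 0.32, 0.02
avg_rate = P_BAD * RATE_BAD + (1 - P_BAD) * RATE_GOOD
p_correlated = (P_BAD * tail_prob(N_LOANS, RATE_BAD, ATTACH)
                + (1 - P_BAD) * tail_prob(N_LOANS, RATE_GOOD, ATTACH))

print(f"average default rate in both cases: {avg_rate:.3f}")
print(f"P(senior tranche hit), independent: {p_independent:.1e}")
print(f"P(senior tranche hit), correlated:  {p_correlated:.1e}")
```

With these placeholder numbers the average default rate is identical in both cases, yet the probability of the senior tranche being hit differs by several orders of magnitude, which is the sense in which the "number of 9s" depends entirely on the independence assumption.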

The "agency mortgage-backed securities" on SVB's balance sheet were:

  • based on "conforming" mortgages insured by the pseudo-governmental "agencies"...
  • ...the credit of which is not material to the bank, because of the insurance.
  • "securitized" in a way that splits the existing risk between three different classes of investors, with the bank selling the riskiest (and maybe also the second-riskiest) to someone else
  • definitely not repackaged using the same trick
  • require a ~10% capital buffer for every dollar of assets, truly regardless of riskiness (yes, even Federal Reserve deposits need this), just in case there's some other trick that makes them bad credit

The problem in 2008 is that these theoretically-perfect-credit, infinite-leverage-allowed instruments were in fact bad credits because the independence assumption was violated. The failure couldn't have happened within the regular system if the banks were restricted to directly owning mortgages.

The problem in 2023 has nothing to do with creditworthiness, has everything to do with the effect of interest rates on asset prices, and could have happened exactly the same way if the bank had directly owned insured mortgages.

The only facts about Agency MBS that are relevant to the SVB story are:

  • their credit is insured by the US government...
  • ...so they're basically just an interest-rate play...
  • ...so SVB bought long-term exposures to earn interest...
  • ...which were put underwater by rising rates, just like every other long-term debt
  • just like direct mortgage exposures, they have slightly superlinear losses in the case of rising interest rates (which, I admit, makes them more effective at causing the problem than I present in the simple model here).

Comment by rossry on "Liquidity" vs "solvency" in bank runs (and some notes on Silicon Valley Bank) · 2023-03-15T02:20:07.678Z · LW · GW

I'm usually astonished w how seldom investors and supervisors read the fine print in annual reports.

I am occasionally astonished by this as well. My claim is not that the whole annual report will be read more closely for the rest of time; my specific claim is that the specific footnote about unrealized HTM losses will be read closely for the rest of time.

Comment by rossry on "Liquidity" vs "solvency" in bank runs (and some notes on Silicon Valley Bank) · 2023-03-14T02:04:27.431Z · LW · GW

This isn't a complete answer, but Monday's Matt Levine has a discussion of this in historical context.

More to the point, SVB did disclose their unrealized HTM losses in an appendix of their annual report:

Most relevant, check out page 125 of the 2022 Form 10-K of SVB Financial Group Inc. (Silicon Valley Bank’s former holding company). On page 95, you get the balance sheet showing $16.3 billion of stockholders’ equity. On page 125, in the notes, you get $15.2 billion of unrealized losses on the HTM securities portfolio.

One presumes that traders covering banks spent last weekend (or else this week) re-reading 10-Ks, and the whole world will care a lot more about this term in bank reports, basically forever. Even if it stays legal to report solvency based on HTM marks (which it may not), I think it unlikely that the market will let banks get away with it very much, going forward.

Comment by rossry on "Liquidity" vs "solvency" in bank runs (and some notes on Silicon Valley Bank) · 2023-03-14T01:50:07.755Z · LW · GW

I did my math in zero-coupon bonds (pay $100 at maturity, yield is defined by discount to par) because it's simpler and doesn't change the analysis. Same reason that I rounded 5%/ann for five years to 75¢/$1.
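For reference, a small sketch of the zero-coupon pricing being rounded here. The function name is illustrative, and annual compounding is an assumption; the comment doesn't say which convention it has in mind.

```python
def zero_coupon_price(face: float, annual_yield: float, years: float) -> float:
    """Price today of a bond paying `face` at maturity, with annual compounding."""
    return face / (1 + annual_yield) ** years

exact = zero_coupon_price(1.0, 0.05, 5)  # price per $1 of face at 5%/ann for five years
linear = 1.0 - 0.05 * 5                  # crude "subtract 5% a year" approximation

print(f"compounded price per $1:  {exact:.4f}")
print(f"linear approximation:     {linear:.2f}")
```

The compounded price comes out a few cents above 75¢, so the round number overstates the discount slightly without changing the shape of the analysis.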

Comment by rossry on The Parable of the King and the Random Process · 2023-03-13T03:50:13.898Z · LW · GW

You may be interested in submitting this to the Open Philanthropy AI Worldviews Contest. (I have no connection whatsoever to the contest; just an interested observer here.)

Comment by rossry on "Liquidity" vs "solvency" in bank runs (and some notes on Silicon Valley Bank) · 2023-03-13T01:57:19.891Z · LW · GW

Interesting. One could imagine this working by:

  • Acquirer A acquires assets, non-deposit liabilities, and issues a new promissory note to Acquirer B to cover the current amount of deposits.
  • Acquirer B receives the promissory note from A and acquires the depositor liabilities (and the customer accounts to service).
  • Any new deposits are just liabilities of B, and B will match them with new assets.
  • Somehow, eventually, the promissory note gets paid down as the old assets mature or are sold by A, and B uses those payments to fill up a new asset book against the deposits.

Seems like it could work on paper?

That being said, the FDIC's sole criterion for selecting a resolution plan is minimizing the payout from the FDIC insurance fund. Assuming they can get it done with one buyer by tonight with $0 from the insurance fund, they won't look at any cleverer options.

Comment by rossry on "Liquidity" vs "solvency" in bank runs (and some notes on Silicon Valley Bank) · 2023-03-12T15:12:30.225Z · LW · GW

https://www.bloomberg.com/opinion/articles/2023-03-10/startup-bank-had-a-startup-bank-run is the Levine article for anyone else interested in it.

Despite being a near-religious Levine reader, I somehow missed Friday's post and wrote this post without it. (In my defense, he said on Thursday that he'd be off Friday, then came back to talk about SVB.)

Anyway, Matt has a good phrasing of the unusual weirdness in SVB's assets, for a bank:

Or, to put it in different crude terms, in traditional banking, you make your money in part by taking credit risk: You get to know your customers, you try to get good at knowing which of them will be able to pay back loans, and then you make loans to those good customers. In the Bank of Startups, in 2021, you couldn’t really make money by taking credit risk: Your customers just didn’t need enough credit to give you the credit risk that you needed to make money on all those deposits. So you had to make your money by taking interest-rate risk: Instead of making loans to risky corporate borrowers, you bought long-term bonds backed by the US government.

Comment by rossry on (Naïve) microeconomics of bundling goods · 2023-03-04T19:02:27.957Z · LW · GW

I agree that if consumer preferences correlate between the things you bundle, the producer won't be able to capture an approaching-full surplus. (They'll still be able to capture an increased surplus in many cases, though counterexamples exist.)

Your qualitative explanation about when this works seems spot-on.

Comment by rossry on Junk Fees, Bundling and Unbundling · 2023-02-16T05:49:28.734Z · LW · GW

Some readers will already have this as assumed background, but I think many will benefit from reviewing the econ-101 (non-behavioral) story of bundling, which explores its economic effects on rational actors with no frictions to search or decision-making. I explain these briefly in (Naïve) microeconomics of bundling goods (just posted).

Excerpt from the conclusion:

Both the seller of sandwiches and the song store wish they could apply price discrimination and sell each good at 1% below what the buyer would be willing to pay. Unfortunately for them, they can't. As a result, they end up picking a compromise price that gets about half the potential profits by selling at an average-ish price to about half the people. (It's a coincidence that the seller-maximizing price is the average of the profitable customers' valuations, by the way -- if you try other distributions this doesn't happen.)

What the song store can do (that the sandwich-seller can't) is exchange the market for songs -- which has high variance in customer valuations -- for the market for song bundles, which has much lower variance because of the law of large numbers. Then, when the customers' valuations all cluster around the average of $550 per library = $0.055 per song, the seller can price just below that and sell to nearly everyone at around the average price. This is better for the seller than pricing each individual good at an average-ish price and selling to half the people.
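A rough sketch of the effect described in the excerpt, under assumptions that are mine rather than quoted: each customer values each song independently and uniformly on [0, $0.11] (so the mean is the $0.055 in the excerpt), the library has 10,000 songs ($550 / $0.055), marginal cost is zero, and the bundle valuation is approximated as normal.

```python
from math import sqrt
from statistics import NormalDist

A = 0.11          # assumed: each song valuation is uniform on [0, A] dollars
N_SONGS = 10_000  # $550 per library / $0.055 per song

def per_song_revenue(price: float) -> float:
    """Expected revenue per customer per song at a single posted price."""
    frac_buying = max(0.0, 1 - price / A)
    return price * frac_buying

def bundle_revenue(price: float) -> float:
    """Expected revenue per customer for the whole library (normal approximation)."""
    mean = N_SONGS * A / 2
    sd = A * sqrt(N_SONGS / 12)  # sd of a sum of N independent uniforms on [0, A]
    frac_buying = 1 - NormalDist(mean, sd).cdf(price)
    return price * frac_buying

potential = N_SONGS * A / 2                  # total valuation of an average customer
separate = N_SONGS * per_song_revenue(A / 2) # best single per-song price under this model
bundled = bundle_revenue(540.0)              # bundle priced a little under the $550 average

print(f"potential surplus per customer: ${potential:.2f}")
print(f"revenue, songs sold separately: ${separate:.2f}")
print(f"revenue, one bundle at $540:    ${bundled:.2f}")
```

Under these assumptions, separate pricing captures half of the potential, while the bundle captures nearly all of it, because the spread in whole-library valuations is only a few dollars around $550.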

Comment by rossry on Metaculus and medians · 2022-08-11T06:22:15.780Z · LW · GW

I don't know. (As above, "When [users] tell you exactly what they think is wrong and how to fix it, they are almost always wrong.")

A scoring rule that's proper in linear space (as you say, "scored on how close their number is to the actual number") would accomplish this -- either for scoring point estimates, or distributions. I don't think it's possible to extract an expected value from a confidence interval that covers orders of magnitude, so I expect that would work less well.

Comment by rossry on A sufficiently paranoid paperclip maximizer · 2022-08-09T15:56:51.762Z · LW · GW

I think this argument doesn't follow:

There is hardly any difference between taking a life and not preventing a death. The end result is mostly the same. Thus, I should save the lives of as many humans as I can.

While "the end result is mostly the same" is natural to argue in terms of moral-consequentialist motivations, this AI only cares about [not killing humans] instrumentally. So what matters is what humans will think about [taking a life] versus [not preventing a death]. And there, there's a huge difference!

  1. Agree that causing deaths that are attributable to the AI's actions is bad and should be avoided.
  2. But if the death was not already attributable to the AI, then preventing it is instrumentally worse than not preventing it, since it risks being found out and raising the alarm (whereas doing nothing is exactly what the hypothetical evaluators are hoping to see).
  3. If the world is a box for evaluation, I'd expect the evaluators to be roughly equally concerned with [AI takes agentic actions that cause people to unexpectedly not die] and [AI takes agentic actions that cause people to unexpectedly die]. Either case is a sign of misalignment (unless the AI thinks that its evaluators tried to make it a save-and-upload-people maximizer, which seems unlikely given the evidence).
  4. If the world is not a box for evaluation, then [AI action causes someone to suspiciously die] is more plausibly the result of "oops it was an accident" than is [AI action causes someone to suspiciously not die]. The former is more likely to make the hostilities start, but the latter should raise suspicions faster, in terms of Bayesian evidence. So again, better not to save people from dying, if there's any chance at all of being found out.

Thoughts? What am I missing here?

Comment by rossry on Metaculus and medians · 2022-08-07T20:48:51.205Z · LW · GW

Agreed on all points!

In particular, I don't have any disagreement with the way the epistemic aggregation is being done; I just think there's something suboptimal in the way the headline number (in this case, for a count-the-number-of-humans domain) is chosen and reported. And I worry that the median-ing leads to easily misinterpreted data.

For example, if a question asked "How many people are going to die from unaligned AI?", and the community's true belief was "40% to be everyone and 60% to be one person", and that was reported as "the Metaculus community predicts 9,200 people will die from unaligned AI, 10% as many as die in fires per year", that would...not be a helpful number at all.
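One way to arrive at a headline number of that size is a probability-weighted average taken in log space. The sketch below shows that calculation next to the linear-space mean; the world population figure is an assumed round number, and this illustrates the arithmetic only, not how Metaculus actually aggregates forecasts.

```python
from math import exp, log

WORLD_POP = 8.0e9  # assumed round figure for "everyone"

# (probability, number of deaths) for the two outcomes in the example
outcomes = [(0.4, WORLD_POP), (0.6, 1.0)]

# Expected deaths in linear space.
linear_mean = sum(p * x for p, x in outcomes)

# Probability-weighted mean of log(deaths), mapped back to a count.
log_mean = exp(sum(p * log(x) for p, x in outcomes))

print(f"mean in linear space: {linear_mean:,.0f}")
print(f"mean in log space:    {log_mean:,.0f}")
```

The two summaries differ by more than five orders of magnitude for the same belief, which is why the choice of headline statistic matters so much on questions like this.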

You're right that dates have their own nuance -- whether it's AGI or my food delivery, I care about the median arrival a lot more than the mean (but also, a lot about the tails!).

And so, in accordance with the ancient wisdom, I know that there's something wrong here, and I don't presume to be able to find the exact right fix. It seems most likely that there will have to be different handling for qualitatively different types of questions -- a separation between "uncertainty in linear space, aggregated in linear space" (ex: Net migration to UK in 2021), "uncertainty in log space, importance in quantiles" (ex: AGI), "uncertainty in log space, importance in linear space" (ex: Monkeypox). The first two categories are already treated differently, so it seems possible for the third category to be minted as a new species of question.

Alternatively, much of the value could come from reporting means in addition to medians on every log question, so that the predictor and the consumer can each choose the numbers that they find most important to orient towards, and ignore the ones that are nonsensical. This doesn't really solve the question of the incentives for predictors, but at least it makes the implications of their predictions explicit instead of obscured.

Comment by rossry on Replacing Karma with Good Heart Tokens (Worth $1!) · 2022-04-01T16:59:45.295Z · LW · GW

Not taking extrapolation far enough!

4 hours ago, your expected value of a point was $0. In an hour, it increased to $0.2, implying a ~20% chance it pays $1 (plus some other possibilities). By midnight, extrapolated expected value will be $4.19, implying a ~100% chance to pay $1, plus ~45% chance that they'll make good on the suggestion of paying $10 instead, plus some other possibilities...

Comment by rossry on elifland's Shortform · 2022-01-17T01:38:16.517Z · LW · GW

I'm confused why these would be described as "challenge" RCTs, and worry that the term will create broader confusion in the movement to support challenge trials for disease. In the usual clinical context, the word "challenge" in "human challenge trial" refers to the step of introducing the "challenge" of a bad thing (e.g., an infectious agent) to the subject, to see if the treatment protects them from it. I don't know what a "challenge" trial testing the effects of veganism looks like?

(I'm generally positive on the idea of trialing more things; my confusion+comment is just restricted to the naming being proposed here.)

Comment by rossry on What on Earth is a Series I savings bond? · 2021-12-21T14:00:45.324Z · LW · GW

Oh, yeah, I can't vouch for / walk through the operations side (not having done it myself). I have had the misfortune of looking at ways to get a Medallion certification outside the US, and it's not pretty (I failed).

Comment by rossry on What on Earth is a Series I savings bond? · 2021-12-12T02:30:21.305Z · LW · GW

I don't know.

It's worth noting that the terms are (intentionally?) designed to be generous and to pay over the market rate. I suspect this is a feature, not a bug, and the terms are intended to be a mild subsidy to the buyer. The US has many policies that subsidize US citizens and exclude non-Americans; this one doesn't stand out to me as being particularly unusual. (I don't particularly endorse this policy posture, but I note that it exists.)

Brainstorming a bit, it seems plausible that the program could include non-citizen taxpayers. If it was truly open to non-taxpayers, then it would amount to a subsidy of non-residents with citizen+resident tax dollars, which the US government is mostly opposed to, as policy.

Comment by rossry on What on Earth is a Series I savings bond? · 2021-12-12T02:23:18.948Z · LW · GW

I don't know.

It's worth noting that the terms are (intentionally?) designed to be generous and to pay over the market rate, and this would be harder to do if there were a $10mln/person/year cap and most of the benefit flowed to wealthy Americans who can finance their debt investments with secured borrowing at scale.

If I had to guess, I'd note that this is pretty similar to contribution/person/yr caps on other savings methods the government subsidizes, eg, with tax advantages -- 401(k) accounts and IRA accounts. Insofar as it's primarily a social scheme to incentivize inflation-protected saving (and probably should not be a primary investment for most people in most conditions), it seems plausible that the cap is intended to target the benefit-per-unit-cost of the program to a wider set of people.

Comment by rossry on Should we postpone getting a booster due to Omicron, till there are Omicron-specific boosters? · 2021-12-11T12:02:57.965Z · LW · GW

I'm sorry -- I don't understand how your comment responds to mine. I pointed out that Omicron outcompeting Delta without being descended from Delta indicates that a successor to Omicron might likewise not be descended from Omicron. In particular, I agree with you that Omicron will become the dominant variant almost everywhere.

One minor detail: It is implausible that Omicron's competitive advantage is primarily derived from an increased R0 (that would give it a higher R0 than measles); rather, its observed fitness against the competition is more easily explained by some measure of immunity evasion (which won't be measured in increased R0).

Comment by rossry on Should we postpone getting a booster due to Omicron, till there are Omicron-specific boosters? · 2021-12-05T03:39:57.624Z · LW · GW

Omicron will soon become the dominant variant almost everywhere, so subsequent variants will probably branch off it.

I don't think you're wrong, but it is worth noting that Omicron itself violated this guess; it is descended from the original strain, not any other Greek-lettered variant.

Comment by rossry on Omicron Variant Post #1: We’re F***ed, It’s Never Over · 2021-11-28T23:55:26.927Z · LW · GW

It is pronounced identically to the adjective "new".