Posts

Which things were you surprised to learn are not metaphors? 2024-11-21T18:56:18.025Z
Seven lessons I didn't learn from election day 2024-11-14T18:39:07.053Z
Research update: Towards a Law of Iterated Expectations for Heuristic Estimators 2024-10-07T19:29:29.033Z
Implications of China's recession on AGI development? 2024-09-28T01:12:36.443Z
My thesis (Algorithmic Bayesian Epistemology) explained in more depth 2024-05-09T19:43:16.543Z
My hour of memoryless lucidity 2024-05-04T01:40:56.717Z
Eric Neyman's Shortform 2024-04-25T05:58:02.862Z
My PhD thesis: Algorithmic Bayesian Epistemology 2024-03-16T22:56:59.283Z
How much do you believe your results? 2023-05-06T20:31:31.277Z
[Crosspost] ACX 2022 Prediction Contest Results 2023-01-24T06:56:33.101Z
Solving for the optimal work-life balance with geometric rationality 2022-11-28T17:02:53.777Z
Three questions about mesa-optimizers 2022-04-12T02:58:00.497Z
Can group identity be a force for good? 2021-07-04T17:16:32.761Z
Social behavior curves, equilibria, and radicalism 2021-06-05T01:39:22.063Z
An exploration of exploitation bias 2021-04-03T23:03:22.773Z
Pseudorandomness contest: prizes, results, and analysis 2021-01-15T06:24:15.317Z
Grading my 2020 predictions 2021-01-07T00:33:38.566Z
Overall numbers won't show the English strain coming 2021-01-01T23:00:34.905Z
Predictions for 2021 2020-12-31T21:12:47.184Z
Great minds might not think alike 2020-12-26T19:51:05.978Z
Pseudorandomness contest, Round 2 2020-12-20T08:35:09.266Z
Pseudorandomness contest, Round 1 2020-12-13T03:42:10.654Z
An elegant proof of Laplace’s rule of succession 2020-12-07T22:43:33.593Z

Comments

Comment by Eric Neyman (UnexpectedValues) on o3 · 2024-12-20T20:55:48.982Z · LW · GW

Yeah, I agree that that could work. I (weakly) conjecture that they would get better results by doing something more like the thing I described, though.

Comment by Eric Neyman (UnexpectedValues) on o3 · 2024-12-20T20:46:20.544Z · LW · GW

My random guess is:

  • The dark blue bar corresponds to the testing conditions under which the previous SOTA was 2%.
  • The light blue bar doesn't cheat (e.g. doesn't let the model run many times and then see if it gets it right on any one of those times) but spends more compute than one would realistically spend (e.g. more than how much you could pay a mathematician to solve the problem), perhaps by running the model 100 to 1000 times and then having the model look at all the runs and try to figure out which run had the most compelling-seeming reasoning.
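
To make that second guess concrete, here is a toy sketch of that kind of "sample many runs, then have a judge pick one" setup. The functions `run_model` and `judge` are hypothetical stand-ins; nothing here is based on a published description of how o3 was actually run.

```python
import random

# Toy sketch of the guessed "sample many runs, then judge" setup. `run_model` and
# `judge` are passed in as functions because the real ones are unknown to me; the
# toy stand-ins below are purely illustrative.
def best_of_n(problem, run_model, judge, n=256):
    runs = [run_model(problem) for _ in range(n)]   # independent attempts
    return judge(problem, runs)                     # e.g. the model picks the most compelling run

toy_run = lambda p: {"answer": random.choice(["A", "B"]), "reasoning": "..."}
toy_judge = lambda p, runs: max(runs, key=lambda r: r["answer"] == "A")
print(best_of_n("toy problem", toy_run, toy_judge, n=10))
```
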
Comment by Eric Neyman (UnexpectedValues) on leogao's Shortform · 2024-12-18T17:36:18.816Z · LW · GW

What's your guess about the percentage of NeurIPS attendees from anglophone countries who could tell you what AGI stands for?

Comment by Eric Neyman (UnexpectedValues) on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-15T21:56:52.893Z · LW · GW

I just donated $5k (through Manifund). Lighthaven has provided a lot of value to me personally, and more generally it seems like a quite good use of money in terms of getting people together to discuss the most important ideas.

More generally, I was pretty disappointed when Good Ventures decided not to fund what I consider to be some of the most effective spaces, such as AI moral patienthood and anything associated with the rationalist community. This has created a funding gap that I'm pretty excited about filling. (See also: Eli's comment.)

Comment by Eric Neyman (UnexpectedValues) on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-09T19:48:34.081Z · LW · GW

Consider pinning this post. I think you should!

Comment by Eric Neyman (UnexpectedValues) on Which things were you surprised to learn are not metaphors? · 2024-11-24T08:29:36.663Z · LW · GW

It took until I was today years old to realize that reading a book and watching a movie are visually similar experiences for some people!

Comment by Eric Neyman (UnexpectedValues) on Which things were you surprised to learn are not metaphors? · 2024-11-23T01:04:39.033Z · LW · GW

Let's test this! I made a Twitter poll.

Comment by Eric Neyman (UnexpectedValues) on Which things were you surprised to learn are not metaphors? · 2024-11-22T22:08:22.598Z · LW · GW

Oh, that's a good point. Here's a freehand map of the US I drew last year (just the borders, not the outline). I feel like I must have been using my mind's eye to draw it.

Comment by Eric Neyman (UnexpectedValues) on Which things were you surprised to learn are not metaphors? · 2024-11-22T19:33:50.923Z · LW · GW

I think very few people have a very high-fidelity mind's eye. I think the reason that I can't draw a bicycle is that my mind's eye isn't powerful/detailed enough to be able to correctly picture a bicycle. But there's definitely a sense in which I can "picture" a bicycle, and the picture is engaging something sort of like my ability to see things, rather than just being an abstract representation of a bicycle.

(But like, it's not quite literally a picture, in that I'm not, like, hallucinating a bicycle. Like it's not literally in my field of vision.)

Comment by Eric Neyman (UnexpectedValues) on Which things were you surprised to learn are not metaphors? · 2024-11-22T02:51:55.688Z · LW · GW

Huh! For me, physical and emotional pain are two super different clusters of qualia.

Comment by Eric Neyman (UnexpectedValues) on Which things were you surprised to learn are not metaphors? · 2024-11-22T00:09:47.871Z · LW · GW

My understanding of Sarah's comment was that the feeling is literally pain. At least for me, the cringe feeling doesn't literally hurt.

Comment by Eric Neyman (UnexpectedValues) on Seven lessons I didn't learn from election day · 2024-11-15T21:14:56.074Z · LW · GW

I don't really know, sorry. My memory is that 2023 was already pretty bad for incumbent parties (e.g. the right-wing ruling party in Poland lost power), but I'm not sure.

Comment by Eric Neyman (UnexpectedValues) on Seven lessons I didn't learn from election day · 2024-11-15T18:58:54.473Z · LW · GW

Fair enough, I guess? For context, I wrote this for my own blog and then decided I might as well cross-post to LW. In doing so, I actually softened the language of that section a little bit. But maybe I should've softened it more, I'm not sure.

[Edit: in response to your comment, I've further softened the language.]

Comment by Eric Neyman (UnexpectedValues) on Seven lessons I didn't learn from election day · 2024-11-15T17:11:41.271Z · LW · GW

Yeah, if you were to use the neighbor method, the correct way to do so would involve post-processing, like you said. My guess, though, is that you would get essentially no value from it even if you did that, and that the information you get from normal polls would pretty much screen off any information you'd get from the neighbor method.

Comment by Eric Neyman (UnexpectedValues) on Seven lessons I didn't learn from election day · 2024-11-15T17:06:28.768Z · LW · GW

I think this just comes down to me having a narrower definition of a city.

Comment by Eric Neyman (UnexpectedValues) on Seven lessons I didn't learn from election day · 2024-11-15T07:27:12.072Z · LW · GW

If you ask people who their neighbors are voting for, they will make their best guess about who their neighbors are voting for. Occasionally their best guess will be to assume that their neighbors will vote the same way that they're voting, but usually not. Trump voters in blue areas will mostly answer "Harris" to this question, and Harris voters in red areas will mostly answer "Trump".

Comment by Eric Neyman (UnexpectedValues) on Seven lessons I didn't learn from election day · 2024-11-15T06:37:41.208Z · LW · GW

Ah, I think I see. Would it be fair to rephrase your question as: if we "re-rolled the dice" a week before the election, how likely was Trump to win?

My answer is probably between 90% and 95%. Basically the way Trump loses is to lose some of his supporters or have way more late deciders decide on Harris. That probably happens if Trump says something egregiously stupid or offensive (on the level of the Access Hollywood tape), or if some really bad news story about him comes out, but not otherwise.

Comment by Eric Neyman (UnexpectedValues) on Seven lessons I didn't learn from election day · 2024-11-14T21:14:01.251Z · LW · GW

It's a little hard to know what you mean by that. Do you mean something like: given the information known at the time, but allowing myself the hindsight of noticing facts about that information that I may have missed, what should I have thought the probability was?

If so, I think my answer isn't too different from what I believed before the election (essentially 50/50). Though I welcome takes to the contrary.

Comment by Eric Neyman (UnexpectedValues) on Seven lessons I didn't learn from election day · 2024-11-14T21:02:56.991Z · LW · GW

I'm not sure (see footnote 7), but I think it's quite likely, basically because:

  • It's a simpler explanation than the one you give (so the bar for evidence should probably be lower).
  • We know from polling data that Hispanic voters -- who are disproportionately foreign-born -- shifted a lot toward Trump.
  • The biggest shifts happened in places like Queens, NY, which has many immigrants but (I think?) not very much anti-immigrant sentiment.

That said, I'm not that confident and I wouldn't be shocked if your explanation is correct. Here are some thoughts on how you could try to differentiate between them:

  • You could look at the precinct level rather than the county level. Some precincts will be very high-% foreign-born (above 50%). If those precincts shifted more than surrounding precincts, that would be evidence in favor of my hypothesis. If they shifted less, that would be evidence in favor of yours. (See the sketch after this list.)
  • If someone did a poll with the questions "How did you vote in 2020", "How did you vote in 2024", and "Were you born in the U.S.", that could more directly answer the question.
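
Here is a minimal sketch of the precinct-level check from the first bullet above. The file name and columns are hypothetical, and "rest of the county" is a crude stand-in for "surrounding precincts":

```python
import pandas as pd

# Sketch of the precinct-level check: compare the 2020->2024 shift toward Trump
# in very high-% foreign-born precincts with the shift in the rest of their county.
# "precincts.csv" and its columns (county, pct_foreign_born, trump_shift_pts) are
# hypothetical placeholders for whatever precinct-level data one can get.
df = pd.read_csv("precincts.csv")

def high_minus_rest(county):
    high = county.loc[county["pct_foreign_born"] > 50, "trump_shift_pts"].mean()
    rest = county.loc[county["pct_foreign_born"] <= 50, "trump_shift_pts"].mean()
    return high - rest

gaps = df.groupby("county").apply(high_minus_rest).dropna()
# A positive average gap would be evidence that high-foreign-born precincts shifted
# more than their neighbors; a negative gap would be evidence for the other hypothesis.
print(gaps.mean())
```
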
Comment by Eric Neyman (UnexpectedValues) on Should CA, TX, OK, and LA merge into a giant swing state, just for elections? · 2024-11-07T00:20:04.401Z · LW · GW

An interesting thing about this proposal is that it would make every state besides CA, TX, OK, and LA pretty much irrelevant for the outcome of the presidential election. E.g. in this election, whichever candidate won CATXOKLA would have enough electoral votes to win the election, even if the other candidate won every swing state.

 

...which of course would be unfair to the non-CATXOKLA states, but like, not any more unfair than the current system?

Comment by Eric Neyman (UnexpectedValues) on Research update: Towards a Law of Iterated Expectations for Heuristic Estimators · 2024-10-07T23:06:49.158Z · LW · GW

Yeah, that's right -- see this section for the full statements.

Comment by Eric Neyman (UnexpectedValues) on Implications of China's recession on AGI development? · 2024-09-28T07:01:05.162Z · LW · GW

Since no one is giving answers, I'll give my super uninformed take. If anyone replies with a disagreement, you should presume that they are right.

During a recession, countries want to spend their money on economic stimulus programs that create jobs and get their citizens to spend more. China seems to be doing this.

Is spending on AI development good for these goals? I'm tempted to say no. One exception is building power plants, which China would maybe need to eventually do in order to build sufficiently large models.

At the same time, China seems to have a pretty big debt problem. Its debt-to-GDP ratio was 288% in 2023 (I think this number accounts not only for national debt but also for local government debt and maybe personal debt, which I think China has a lot of compared to other countries like the United States). This might in practice constrain how much it can spend.

So China is in a position of wanting to spend, but not spend too much, and AI probably isn't a great place for it to spend in order to accomplish its immediate goals.

In other words, I think the recession makes AGI development a lower priority for the Chinese government. It seems quite plausible to me that the recession might delay the creation of a large government project for building AGI by a few years.

 

(Again, I don't know stuff about this. Maybe someone will reply saying "Actually, China has already created a giant government project for building AGI" with a link.)

Comment by Eric Neyman (UnexpectedValues) on Should Sports Betting Be Banned? · 2024-09-21T15:01:53.395Z · LW · GW

Thanks! This makes me curious: is sports betting anomalous (among forms of consumption) in terms of how much it substitutes for financial investing?

Comment by Eric Neyman (UnexpectedValues) on Proveably Safe Self Driving Cars [Modulo Assumptions] · 2024-09-15T16:16:54.000Z · LW · GW

I think the "Provably Safe ML" section is my main crux. For example, you write:

One potential solution is to externally gate the AI system with provable code. In this case, the driving might be handled by an unsafe AI system, but its behavior would have “safety in the loop” by having simpler and provably safe code restrict what the driving system can output, to respect the rules noted above. This does not guarantee that the AI is a safe driver - it just keeps such systems in a provably safe box.

I currently believe that if you try to do this, you will either have to restrict the outputs so much that the car wouldn't be able to drive well, or else fail to prove that the actions allowed by the gate are safe. Perhaps you can elaborate on why this approach seems like it could work?

(I feel similarly about other proposals in that section.)

Comment by Eric Neyman (UnexpectedValues) on Akash's Shortform · 2024-09-03T17:44:35.374Z · LW · GW

For what it's worth, I don't have any particular reason to think that that's the reason for her opposition.

Comment by Eric Neyman (UnexpectedValues) on Akash's Shortform · 2024-09-03T06:00:11.187Z · LW · GW

But it seems like SB1047 hasn't been very controversial among CA politicians.

I think this isn't true. Concretely, I bet that if you looked at the distribution of Democratic No votes among bills that reached Newsom's desk, this one would be among the highest (7 No votes and a bunch of not-voting, which I think is just a polite way to vote No; source). I haven't checked and could be wrong!

 

My take is basically the same as Neel's, though my all-things-considered guess is that he's 60% or so to veto. My position on Manifold is in large part an emotional hedge. (Otherwise I would be placing much smaller bets in the same direction.)

Comment by Eric Neyman (UnexpectedValues) on Akash's Shortform · 2024-09-03T05:58:38.118Z · LW · GW

I believe that Pelosi had never once spoken out against a state bill authored by a California Democrat before this.

Comment by Eric Neyman (UnexpectedValues) on Arjun Panickssery's Shortform · 2024-07-30T17:16:46.107Z · LW · GW

Probably no longer willing to make the bet, sorry. While my inside view is that Harris is more likely to win than Nate Silver's 72%, I defer to his model enough that my "all things considered" view now puts her win probability around 75%.

Comment by Eric Neyman (UnexpectedValues) on "AI achieves silver-medal standard solving International Mathematical Olympiad problems" · 2024-07-26T16:24:44.306Z · LW · GW

[Edit: this comment is probably retracted, although I'm still confused; see discussion below.]

I'd like clarification from Paul and Eliezer on how the bet would resolve, if it were about whether an AI could get IMO silver by 2024.

Besides not fitting in the time constraints (which I think is kind of a cop-out because the process seems pretty parallelizable), I think the main reason that such a bet would resolve no is that problems 1, 2, and 6 had the form "find the right answer and prove it right", whereas the DeepMind AI was given the right answer and merely had to prove it right. Often, finding the right answer is a decent part of the challenge of solving an Olympiad problem. Quoting more extensively from Manifold commenter Balasar:

The "translations" to Lean do some pretty substantial work on behalf of the model. For example, in the theorem for problem 6, the Lean translation that the model is asked to prove includes an answer that was not given in the original IMO problem.

theorem imo_2024_p6 (IsAquaesulian : (ℚ → ℚ) → Prop)
    (IsAquaesulian_def : ∀ f, IsAquaesulian f ↔ ∀ x y, f (x + f y) = f x + y ∨ f (f x + y) = x + f y) :
    IsLeast {(c : ℤ) | ∀ f, IsAquaesulian f →
      {(f r + f (-r)) | (r : ℚ)}.Finite ∧ {(f r + f (-r)) | (r : ℚ)}.ncard ≤ c} 2

The model is supposed to prove that "there exists an integer c such that for any aquaesulian function f there are at most c different rational numbers of the form f(r)+f(−r) for some rational number r, and find the smallest possible value of c".

The original IMO problem does not include that the smallest possible value of c is 2, but the theorem that AlphaProof was given to solve has the number 2 right there in the theorem statement. Part of the problem is to figure out what 2 is.

Link: https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/imo-2024-solutions/P6/index.html

Comment by Eric Neyman (UnexpectedValues) on Arjun Panickssery's Shortform · 2024-07-25T20:02:46.079Z · LW · GW

I'm now happy to make this bet about Trump vs. Harris, if you're interested.

Comment by Eric Neyman (UnexpectedValues) on Arjun Panickssery's Shortform · 2024-07-22T07:04:15.929Z · LW · GW

Looks like this bet is voided. My take is roughly that:

  • To the extent that our disagreement was rooted in a difference in how much to weight polls vs. priors, I continue to feel good about my side of the bet.
  • I wouldn't have made this bet after the debate. I'm not sure to what extent I should have known that Biden would perform terribly. I was blindsided by how poorly he did, but maybe shouldn't have been.
  • I definitely wouldn't have made this bet after the assassination attempt, which I think increased Trump's chances. But that event didn't update me on how good my side of the bet was when I made it.
  • I think there's like a 75-80% chance that Kamala Harris wins Virginia.
Comment by Eric Neyman (UnexpectedValues) on Eric Neyman's Shortform · 2024-06-24T18:05:46.292Z · LW · GW

I frequently find myself in the following situation:

Friend: I'm confused about X
Me: Well, I'm not confused about X, but I bet it's because you have more information than me, and if I knew what you knew then I would be confused.

(E.g. my friend who knows more chemistry than me might say "I'm confused about how soap works", and while I have an explanation for why soap works, their confusion is at a deeper level, where if I gave them my explanation of how soap works, it wouldn't actually clarify their confusion.)

This is different from the "usual" state of affairs, where you're not confused but you know more than the other person.

I would love to have a succinct word or phrase for this kind of being not-confused!

Comment by Eric Neyman (UnexpectedValues) on Arjun Panickssery's Shortform · 2024-06-13T19:59:49.057Z · LW · GW

Yup, sounds good! I've set myself a reminder for November 9th.

Comment by Eric Neyman (UnexpectedValues) on Arjun Panickssery's Shortform · 2024-06-13T01:53:43.882Z · LW · GW

I'd have to think more about 4:1 odds, but definitely happy to make this bet at 3:1 odds. How about my $300 to your $100?

(Edit: my proposal is to consider the bet voided if Biden or Trump dies or isn't the nominee.)

Comment by Eric Neyman (UnexpectedValues) on Arjun Panickssery's Shortform · 2024-06-12T19:11:38.612Z · LW · GW

I think the FiveThirtyEight model is pretty bad this year. This makes sense to me, because it's a pretty different model: Nate Silver owns the former FiveThirtyEight model IP (and will be publishing it on his Substack later this month), so FiveThirtyEight needed to create a new model from scratch. They hired G. Elliott Morris, whose 2020 forecasts were pretty crazy in my opinion.

Here are some concrete things about FiveThirtyEight's model that don't make sense to me:

  • There's only a 30% chance that Pennsylvania, Michigan, or Wisconsin will be the tipping point state. I think that's way too low; I would put this probability around 65%. In general, their probability distribution over which state will be the tipping point state is way too spread out.
  • They expect Biden to win by 2.5 points; currently he's down by 1 point. I buy that there will be some amount of movement toward Biden in expectation because of the economic fundamentals, but a 3.5-point swing seems like too much as an average case.
  • I think their Voter Power Index (VPI) doesn't make sense. VPI is a measure of how likely a voter in a given state is to flip the entire election. Their VPIs are way too similar. To pick a particularly egregious example, they think that a vote in Delaware is 1/7th as valuable as a vote in Pennsylvania. This is obvious nonsense: a vote in Delaware is less than 1% as valuable as a vote in Pennsylvania. In 2020, Biden won Delaware by 19%. If Biden wins 50% of the vote in Delaware, he will have lost the election in an almost unprecedented landslide.
    • I claim that the following is a pretty good approximation to VPI: (probability that the state is the tipping state) * (number of electoral votes) / (number of voters). If you use their tipping-point state probabilities, you'll find that Pennsylvania's VPI should be roughly 4.3 times larger than New Hampshire's. Instead, FiveThirtyEight has New Hampshire's VPI being (slightly) higher than Pennsylvania's. I retract this: the approximation should instead be (tipping point state probability) / (number of voters). Their VPI numbers now seem pretty consistent with their tipping point probabilities to me, although I still think their tipping point probabilities are wrong.
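
As a concrete illustration of the corrected approximation, here is a small sketch. The tipping-point probabilities and voter counts are rough placeholders of my own, not FiveThirtyEight's figures:

```python
# Sketch of the corrected approximation:
#   VPI(state) ≈ P(state is the tipping-point state) / (number of voters in the state).
# The probabilities and voter counts below are rough placeholders, not FiveThirtyEight's numbers.
def approx_vpi(tipping_prob, num_voters):
    return tipping_prob / num_voters

states = {
    "Pennsylvania":  (0.25, 7_000_000),   # (placeholder tipping-point prob., placeholder turnout)
    "New Hampshire": (0.02,   800_000),
}
pa_vpi = approx_vpi(*states["Pennsylvania"])
for name, (p, n) in states.items():
    print(f"{name}: VPI relative to Pennsylvania = {approx_vpi(p, n) / pa_vpi:.2f}")
```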

The Economist also has a model, which gives Trump a 2/3 chance of winning. I think that model is pretty bad too. For example, I think Biden is much more than 70% likely to win Virginia and New Hampshire. I haven't dug into the details of the model to get a better sense of what I think they're doing wrong.

Comment by Eric Neyman (UnexpectedValues) on Eric Neyman's Shortform · 2024-06-05T20:20:31.762Z · LW · GW

One example of (2) is disapproving of publishing AI alignment research that may advance AI capabilities. That's because you're criticizing the research not on the basis of "this is wrong" but on the basis of "it was bad to say this, even if it's right".

Comment by Eric Neyman (UnexpectedValues) on Eric Neyman's Shortform · 2024-06-05T20:16:26.205Z · LW · GW

People like to talk about decoupling vs. contextualizing norms. To summarize, decoupling norms encourage assessing arguments in isolation from their surrounding context, while contextualizing norms consider the context around an argument to be really important.

I think it's worth distinguishing between two kinds of contextualizing:

(1) If someone says X, updating on the fact that they are the sort of person who would say X. (E.g. if most people who say X in fact believe Y, contextualizing norms are fine with assuming that your interlocutor believes Y unless they say otherwise.)

(2) In a discussion where someone says X, considering "is it good for the world to be saying X" to be an importantly relevant question.

I think these are pretty different and it would be nice to have separate terms for them.

Comment by Eric Neyman (UnexpectedValues) on Ilya Sutskever and Jan Leike resign from OpenAI [updated] · 2024-05-17T17:07:38.243Z · LW · GW

My Manifold market on Collin Burns, lead author of the weak-to-strong generalization paper

Comment by Eric Neyman (UnexpectedValues) on My thesis (Algorithmic Bayesian Epistemology) explained in more depth · 2024-05-10T16:51:25.773Z · LW · GW

Indeed! This is Theorem 9.4.2.

Comment by Eric Neyman (UnexpectedValues) on My hour of memoryless lucidity · 2024-05-09T00:32:57.333Z · LW · GW

Update: the strangely-textured fluid turned out to be a dentigerous cyst, which was the best possible outcome. I won't need a second surgery :)

Comment by Eric Neyman (UnexpectedValues) on My hour of memoryless lucidity · 2024-05-09T00:31:17.302Z · LW · GW

I just asked -- it was a combination of midazolam (as you had hypothesized), propofol, fentanyl (!), and ketamine.

Comment by Eric Neyman (UnexpectedValues) on Some Experiments I'd Like Someone To Try With An Amnestic · 2024-05-05T20:10:34.726Z · LW · GW

Yeah, that's my best guess. I have other memories from that period (which was late into the hour), so I think it was the drug wearing off, rather than learning effects.

Comment by Eric Neyman (UnexpectedValues) on Eric Neyman's Shortform · 2024-04-25T16:49:49.528Z · LW · GW

I'm curious what disagree votes mean here. Are people disagreeing with my first sentence? Or that the particular questions I asked are useful to consider? Or, like, the vibes of the post?

(Edit: I wrote this when the agree-disagree score was -15 or so.)

Comment by Eric Neyman (UnexpectedValues) on Eric Neyman's Shortform · 2024-04-25T05:58:03.525Z · LW · GW

I think that people who work on AI alignment (including me) have generally not put enough thought into the question of whether a world where we build an aligned AI is better by their values than a world where we build an unaligned AI. I'd be interested in hearing people's answers to this question. Or, if you want more specific questions:

  • By your values, do you think a misaligned AI creates a world that "rounds to zero", or still has substantial positive value?
  • A common story for why aligned AI goes well goes something like: "If we (i.e. humanity) align AI, we can and will use it to figure out what we should use it for, and then we will use it in that way." To what extent is aligned AI going well contingent on something like this happening, and how likely do you think it is to happen? Why?
  • To what extent is your belief that aligned AI would go well contingent on some sort of assumption like: my idealized values are the same as the idealized values of the people or coalition who will control the aligned AI?
  • Do you care about AI welfare? Does your answer depend on whether the AI is aligned? If we built an aligned AI, how likely is it that we will create a world that treats AI welfare as an important consideration? What if we build a misaligned AI?
  • Do you think that, to a first approximation, most of the possible value of the future happens in worlds that are optimized for something that resembles your current or idealized values? How bad is it to mostly sacrifice each of these? (What if the future world's values are similar to yours, but is only kinda effectual at pursuing them? What if the world is optimized for something that's only slightly correlated with your values?) How likely are these various options under an aligned AI future vs. an unaligned AI future?
Comment by Eric Neyman (UnexpectedValues) on My PhD thesis: Algorithmic Bayesian Epistemology · 2024-04-11T17:59:09.275Z · LW · GW

Yeah, there's definitely value in experts being allowed to submit multiple times, allowing them to update on other experts' submissions. This is basically the frame taken in Chapter 8, where Alice and Bob update their estimate based on the other's estimate at each step. This is generally the way prediction markets work, and I think it's an understudied perspective (perhaps because it's more difficult to reason about than if you assume that each expert's estimate is static, i.e. does not depend on other experts' estimates).

Comment by Eric Neyman (UnexpectedValues) on My PhD thesis: Algorithmic Bayesian Epistemology · 2024-04-11T17:47:42.122Z · LW · GW

Thanks! I think the reason I didn't give those expressions is that they're not very enlightening. See here for l = 2 on (0, 1/2] and here for l = 4 on [1/2, 1).

Comment by Eric Neyman (UnexpectedValues) on My PhD thesis: Algorithmic Bayesian Epistemology · 2024-03-28T21:02:39.481Z · LW · GW

Thanks! Here are some brief responses:

From the high level summary here it sounds like you're offloading the task of aggregation to the forecasters themselves. It's odd to me that you're describing this as arbitrage.

Here's what I say about this anticipated objection in the thesis:

For many reasons, the expert may wish to make arbitrage impossible. First, the principal may wish to know whether the experts are in agreement: if they are not, for instance, the principal may want to elicit opinions from more experts. If the experts collude to report an aggregate value (as in our example), the principal does not find out whether they originally agreed. Second, even if the principal only seeks to act based on some aggregate of the experts' opinions, their method of aggregation may be different from the one that experts use to collude. For instance, the principal may have a private opinion on the trustworthiness of each expert and wishes to average the experts' opinions with corresponding weights. Collusion among the experts denies the principal this opportunity. Third, a principal may wish to track the accuracy of each individual expert (to figure out which experts to trust more in the future, for instance), and collusion makes this impossible. Fourth, the space of collusion strategies that constitute arbitrage is large. In our example above, any report in [0.546, 0.637] would guarantee a profit; and this does not even mention strategies in which experts report different probabilities. As such, the principal may not even be able to recover basic information about the experts' beliefs from their reports.

 

For example, when I worked with IARPA on geopolitical forecasting, our forecasters would get financial rewards depending on what percentile they were in relative to other forecasters.

This would indeed be arbitrage-free, but likely not proper: it wouldn't necessarily incentivize each expert to report their true belief; instead, an expert's optimal report is going to be some sort of function of the expert's belief about the joint probability distribution over the experts' beliefs. (I'm not sure how much this matters in practice -- I defer to you on that.)

It's surprising to me that you could disincentivize forecasters from reporting the aggregate as their individual forecast.

In Chapter 4, we are thinking of experts as having immutable beliefs, rather than beliefs that change upon hearing other experts' beliefs. Is this a silly model? If you want, you can think of these beliefs as each expert's belief after talking to the other experts a bunch. In theory(?) the experts' beliefs should converge (though I'm not actually clear what happens if the experts are computationally bounded); but in practice, experts often don't converge (see e.g. the FRI adversarial collaboration on AI risk).

It seems to me that under sufficiently pessimistic conditions, there would be no good way to aggregate those two forecasts.

Yup -- in my summary I described "robust aggregation" as "finding an aggregation strategy that works as well as possible in the worst case over a broad class of possible information structures." In fact, you can't do anything interesting in the worst case over all information structures. The assumption I make in the chapter in order to get interesting results is, roughly, that experts' information is substitutable rather than complementary (on average over the information structure). The sort of scenario you describe in your example is the type of example where Alice and Bob's information might be complementary.

Comment by Eric Neyman (UnexpectedValues) on My PhD thesis: Algorithmic Bayesian Epistemology · 2024-03-26T06:49:56.773Z · LW · GW

Great questions!

  1. I didn't work directly on prediction markets. The one place that my thesis touches on prediction markets (outside of general background) is in Chapter 5, page 106, where I give an interpretation of QA pooling in terms of a particular kind of prediction market called a cost function market. This is a type of prediction market where participants trade with a centralized market maker, rather than having an order book (see the sketch after this list). QA pooling might have implications in terms of the right way to structure these markets if you want to allow multiple experts to place trades at the same time, without having the market update in between. (Maybe this is useful in blockchain contexts if market prices can only update every time a new block is created? I'm just spitballing; I don't really understand how blockchains work.)
  2. I think that for most contexts, this question doesn't quite make sense, because there's only one question being forecast. The one exception is where I talk about learning weights for experts over the course of multiple questions (in Chapter 5 and especially 6). Since I talk about competing with the best weighted combination of experts in hindsight, the problem doesn't immediately make sense if some experts don't answer some questions. However, if you specify a "default thing to do" if some expert doesn't participate (e.g. take all the other experts' weights and renormalize them to add to 1), then you can get the question to make sense again. I didn't explore this, but my guess is that there are some nice generalizations in this direction.
  3. I don't! This is Question 4.5.2, on page 94 :) Unfortunately, I would conjecture (70%) that no such contract function exists.
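
For readers unfamiliar with the term in point 1, here is a minimal sketch of the standard example of a cost function market: a logarithmic market scoring rule (LMSR) market maker. This is generic background, not code from the thesis:

```python
import math

# Minimal sketch of a cost-function prediction market: the logarithmic market
# scoring rule (LMSR). Traders buy and sell outcome shares against a centralized
# market maker, and the price of a trade is a difference of cost-function values,
# so there is no order book.
class LMSRMarket:
    def __init__(self, num_outcomes: int, b: float = 100.0):
        self.b = b                      # liquidity parameter
        self.q = [0.0] * num_outcomes   # outstanding shares of each outcome

    def cost(self, q) -> float:
        return self.b * math.log(sum(math.exp(qi / self.b) for qi in q))

    def price(self, i: int) -> float:
        # Instantaneous price of outcome i, which is also its implied probability.
        z = sum(math.exp(qi / self.b) for qi in self.q)
        return math.exp(self.q[i] / self.b) / z

    def buy(self, i: int, shares: float) -> float:
        # Returns the amount the trader pays the market maker for `shares` of outcome i.
        new_q = list(self.q)
        new_q[i] += shares
        payment = self.cost(new_q) - self.cost(self.q)
        self.q = new_q
        return payment

market = LMSRMarket(num_outcomes=2)
print(market.price(0))      # 0.5 before any trades
print(market.buy(0, 50.0))  # cost of buying 50 shares of outcome 0
print(market.price(0))      # price has moved up in response to the trade
```
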
Comment by Eric Neyman (UnexpectedValues) on Lying is Cowardice, not Strategy · 2023-10-25T09:17:36.649Z · LW · GW

(Note: I work with Paul at ARC theory. These views are my own and Paul did not ask me to write this comment.)

I think the following norm of civil discourse is super important: do not accuse someone of acting in bad faith, unless you have really strong evidence. An accusation of bad faith makes it basically impossible to proceed with discussion and seek truth together, because if you're treating someone's words as a calculated move in furtherance of their personal agenda, then you can't take those words at face value.

I believe that this post violates this norm pretty egregiously. It begins by saying that hiding your beliefs "is lying". I'm pretty confident that the sort of belief-hiding being discussed in the post is not something most people would label "lying" (see Ryan's comment), and it definitely isn't a central example of lying. (And so in effect it labels a particular behavior "lying" in an attempt to associate it with behaviors generally considered worse.)

The post then confidently asserts that Paul Christiano hides his beliefs in order to promote RSPs. It presents very little evidence that this is what's going on, and Paul's account seems consistent with the facts (and I believe him).

So in effect, it accuses Paul and others of lying, cowardice, and bad faith on what I consider to be very little evidence.

Edited to add: What should the authors have done instead? I think they should have engaged in a public dialogue with one or more of the people they call out / believe to be acting dishonestly. The first line of the dialogue should maybe have been: "I believe you have been hiding your beliefs, for [reasons]. I think this is really bad, for [reasons]. I'd like to hear your perspective."

Comment by Eric Neyman (UnexpectedValues) on Lying is Cowardice, not Strategy · 2023-10-25T08:39:20.612Z · LW · GW