Posts

0th Person and 1st Person Logic 2024-03-10T00:56:14.446Z
Introducing bayescalc.io 2023-07-07T16:11:12.854Z
Truthseeking processes tend to be frame-invariant 2023-03-21T06:17:31.154Z
Chu are you? 2021-11-06T17:39:45.332Z
Are the Born probabilities really that mysterious? 2021-03-02T03:08:34.334Z
Adele Lopez's Shortform 2020-08-04T00:59:24.492Z
Optimization Provenance 2019-08-23T20:08:13.013Z

Comments

Comment by Adele Lopez (adele-lopez-1) on Are AIs conscious? It might depend · 2024-03-16T00:58:36.295Z · LW · GW

I agree it's increasingly urgent to stop AI (please) or solve consciousness in order to avoid potentially causing mass suffering or death-of-consciousness in AIs.

Externalism seems, quite frankly, like metaphysical nonsense. It doesn't seem to actually explain anything about consciousness. I can attest that I am currently conscious (to my own satisfaction, if not yours). Does this mean I can logically conclude I am not in any way being simulated? That doesn't make any sense to me.

Comment by Adele Lopez (adele-lopez-1) on O O's Shortform · 2024-03-12T05:18:11.558Z · LW · GW

I don't think that implies torture so much as something it simply doesn't "want" to do. I.e., I would bet that it's more like how I don't want to generate gibberish in this textbox: it wouldn't be painful, much less torture, if I forced myself to do it.

Comment by Adele Lopez (adele-lopez-1) on 0th Person and 1st Person Logic · 2024-03-12T04:40:14.845Z · LW · GW

[Without having looked at the link in your response to my other comment, and I also stopped reading cubefox's comment once it seemed that it was going in a similar direction. ETA: I realized after posting that I have seen that article before, but not recently.]

I'll assume that the robot has a special "memory" sensor which stores the exact experience at the time of the previous tick. It will recognize future versions of itself by looking for agents in its (timeless) 0P model which have a memory of its current experience.

For p("I will see O"), the robot will look in its 0P model for observers which have the t=0 experience in their immediate memory, and selecting from those, how many have judged "I see O" as Here. There will be two such robots, the original and the copy at time 1, and only one of those sees O. So using a uniform prior (not forced by this framework), it would give a 0P probability of 1/2. Similarly for p("I will see C").

Then it would repeat the same process for t=1 and the copy. Conditioned on "I will see C" at t=1, it will conclude "I will see CO" with probability 1/2 by the same reasoning as above. So overall, it will assign: p("I will see OO") = 1/2, p("I will see CO") = 1/4, p("I will see CC") = 1/4.
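To make the counting explicit, here is a minimal sketch of that procedure in code (the branching structure and the uniform split over memory-matching successors are assumptions of this example, not something forced by the framework):

```python
from fractions import Fraction

def successors(history):
    """Observers at the next tick whose immediate memory matches `history`
    (assumed branching: the observer who saw C at t=1 is copied again)."""
    if history == "":        # t=0 -> t=1: original sees O, copy sees C
        return ["O", "C"]
    if history == "C":       # t=1 -> t=2: the C-observer is copied again
        return ["CO", "CC"]
    if history == "O":       # the O-observer just continues
        return ["OO"]
    return []                # no further copies

def anticipation(history="", prob=Fraction(1)):
    """0P probabilities assigned to future experience-sequences, splitting
    uniformly among the memory-matching successors at each tick."""
    nexts = successors(history)
    if not nexts:
        return {history: prob}
    out = {}
    for h in nexts:
        out.update(anticipation(h, prob / len(nexts)))
    return out

print(anticipation())  # {'OO': Fraction(1, 2), 'CO': Fraction(1, 4), 'CC': Fraction(1, 4)}
```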

The semantics for these kinds of things is a bit confusing. I think that it starts from an experience (the experience at t=0) which I'll call E. Then REALIZATION(E) casts E into a 0P sentence which gets taken as an axiom in the robot's 0P theory.

A different robot could carry out the same reasoning, and reach the same conclusion since this is happening on the 0P side. But the semantics are not quite the same, since the REALIZATION(E) axiom is arbitrary to a different robot, and thus the reasoning doesn't mean "I will see X" but instead means something more like "They will see X". This suggests that there's a more complex semantics that allows worlds and experiences to be combined - I need to think more about this to be sure what's going on. Thus far, I still feel confident that the 0P/1P distinction is more fundamental than whatever the more complex semantics is.

(I call the 0P -> 1P conversion SENSATIONS, and the 1P -> 0P conversion REALIZATION, and think of them as being adjoints, though I haven't formalized this part well enough to feel confident that this is a good way to describe it; there's a toy example here if you are interested in seeing how this might work.)

Comment by Adele Lopez (adele-lopez-1) on Shortform · 2024-03-11T02:42:13.654Z · LW · GW

I would bet that the hesitation caused by doing the mental reframe would be picked up by this.

Comment by Adele Lopez (adele-lopez-1) on 0th Person and 1st Person Logic · 2024-03-11T00:59:47.764Z · LW · GW

I would say that English uses indexicals to signify and say 1P sentences (probably with several exceptions, because English). Pointing to yourself doesn't help specify your location from the 0P point of view because it's referencing the thing it's trying to identify. You can just use yourself as the reference point, but that's exactly what the 1P perspective lets you do.

Comment by Adele Lopez (adele-lopez-1) on 0th Person and 1st Person Logic · 2024-03-11T00:52:42.227Z · LW · GW

Isn't having a world model also a type of experience?

It is if the robot has introspective abilities, which is not necessarily the case. But yes, it is generally possible to convert 0P statements to 1P statements and vice-versa. My claim is essentially that this is not an isomorphism.

But what if all robots had a synchronized sensor that triggered for everyone when any of them has observed red. Is it 1st person perspective now?

The 1P semantics is a framework that can be used to design and reason about agents. Someone who thought of "you" as referring to something with a 1P perspective would want to think of it that way for those robots, but if the robots worked like that, it wouldn't be as helpful for the robots themselves to be designed this way.

Probability theory describes subjective credence of a person who observed a specific outcome from a set of possible outcomes. It's about 1P in a sense that different people may have different possible outcomes and thus have different credence after an observation. But also it's about 0P because any person who observed the same outcome from the same set of possible outcomes should have the same credence.

I think this is wrong, and that there is a wholly 0P probability theory and a wholly 1P probability theory. Agents can have different 0P probabilities because they don't necessarily have the same priors or models, or haven't seen the same evidence (yes, seeing evidence would be a 1P event, but this can (imperfectly) be converted into a 0P statement - which would essentially be adding a new axiom to the 0P theory).

Comment by Adele Lopez (adele-lopez-1) on 0th Person and 1st Person Logic · 2024-03-10T22:46:21.339Z · LW · GW

That's a very good question! It's definitely more complicated once you start including other observers (including future selves), and I don't feel that I understand this as well.

But I think it works like this: other reasoners are modeled (0P) as using this same framework. The 0P model can then make predictions about the 1P judgements of these other reasoners. For something like anticipation, I think it will have to use memories of experiences (which are also experiences) and identify observers for which this memory corresponds to the current experience. Understanding this better would require being more precise about the interplay between 0P and 1P, I think.

(I'll examine your puzzle when I have some time to think about it properly)

Comment by Adele Lopez (adele-lopez-1) on 0th Person and 1st Person Logic · 2024-03-10T08:07:44.902Z · LW · GW

I'm still reading your Sleeping Beauty posts, so I can't properly respond to all your points yet. I'll say though that I don't think the usefulness or validity of the 0P/1P idea hinges on whether it helps with anthropics or Sleeping Beauty (note that I marked the Sleeping Beauty idea as speculation).

If they are not, then saying the phrase "1st person perspective" doesn't suddenly allow us to use it.

This is frustrating because I'm trying hard here to specify exactly what I mean by the stuff I call "1st Person". It's a different interpretation of classical logic: the semantics uses sets of experiences instead of sets of worlds. Within a particular interpretation, you can lawfully use all the same logic, math, probability, etc... because you're just switching out which set you're using for the semantics. What makes the interpretations different practically comes from wiring them up differently in the robot - is it reasoning about its world model or about its sensor values? (There's a minimal sketch of this below.) It sounds like you think the 1P interpretation is superfluous, is that right?
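As a minimal sketch of what I mean by "switching out which set" (the notation here is just illustrative, not a formalization from the post):

```latex
% Same classical logic, two interpretations: fix a carrier set S, either the
% set W of possible worlds (0P) or the set E of possible experiences (1P).
% A sentence is interpreted as a subset of S, with the usual Boolean operations
% and (optionally) a probability measure mu on S:
\[
  \llbracket \varphi \wedge \psi \rrbracket
    = \llbracket \varphi \rrbracket \cap \llbracket \psi \rrbracket,
  \qquad
  \llbracket \neg \varphi \rrbracket = S \setminus \llbracket \varphi \rrbracket,
  \qquad
  P(\varphi) = \mu\bigl(\llbracket \varphi \rrbracket\bigr).
\]
```

The axioms are identical either way; the practical difference is whether S gets wired up to the robot's world model or to its sensor values.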

Until then we are talking about the truth of statements "Red light was observed" and "Red light was not observed".

Rephrasing it this way doesn't change the fact that the observer has not yet been formally specified.

And if our mathematical model doesn't track any other information, then for the sake of this mathematical model all the robots that observe red are the same entity. The whole point of math is that it's true not just for one specific person but for everyone satisfying the conditions. That's what makes it useful.

I agree that that is an important and useful aspect of what I would call 0P-mathematics. But I think it's also useful to be able to build a robot that has a mode of reasoning where it can reason about its sensor values in a straightforward way.

Comment by Adele Lopez (adele-lopez-1) on 0th Person and 1st Person Logic · 2024-03-10T06:38:35.741Z · LW · GW

Because you don't necessarily know which agent you are. If you could always point to yourself in the world uniquely, then sure, you wouldn't need 1P-Logic. But in real life, all the information you learn about the world comes through your sensors. This is inherently ambiguous, since there's no law that guarantees your sensor values are unique.

If you use X as a placeholder, the statement sensor_observes(X, red) can't be judged as True or False unless you bind X to a quantifier. And then it could not mean the thing you want it to mean (all robots would agree on the judgement, thus rendering it useless for a robot trying to distinguish itself amongst them).

It almost works though, you just have to interpret "True" and "False" a bit differently!
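Here's a toy example of that point (the names are hypothetical, just to illustrate why the quantified 0P statement can't do the work of a 1P one):

```python
# A tiny world: two robots, only one of which is seeing red.
world = {"robot_A": "red", "robot_B": "green"}

def sensor_observes(robot, color):
    return world[robot] == color

# 0P judgements with a bound variable: every robot computes the same truth
# value, so neither can use it to tell whether *it* is the one seeing red.
exists_red = any(sensor_observes(r, "red") for r in world)  # True, for everyone
forall_red = all(sensor_observes(r, "red") for r in world)  # False, for everyone

# 1P-style judgement: read your own sensor; the "truth value" is relative to
# the experiencer rather than being a fact all robots agree on.
def my_sensor_shows_red(me):
    return sensor_observes(me, "red")

print(exists_red, forall_red)           # True False
print(my_sensor_shows_red("robot_A"))   # True  (depends on who is asking)
print(my_sensor_shows_red("robot_B"))   # False
```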

Comment by Adele Lopez (adele-lopez-1) on Using axis lines for good or evil · 2024-03-10T04:17:56.647Z · LW · GW

(Rant about philosophical meaning of “0” and “1” and identity elements in mathematical rings redacted at strenuous insistence of test reader.)

I'm curious about this :)

Comment by Adele Lopez (adele-lopez-1) on What are the known difficulties with this alignment approach? · 2024-02-12T02:25:34.050Z · LW · GW

There's nothing stopping the AI from developing its own world model (or if there is, it's not intelligent enough to be much more useful than whatever process created your starting world model). This will allow it to model itself in more detail than you were able to put in, and to optimize its own workings as is instrumentally convergent. This will result in an intelligence explosion due to recursive self-improvement.

At this point, it will take its optimization target, and put an inconceivably (to humans) huge amount of optimization into it. It will find a flaw in your setup, and exploit it to the extreme.

In general, I think any alignment approach which has any point in which an unfettered intelligence is optimizing for something that isn't already convergent to human values/CEV is doomed.

Of course, you could add various bounds on it which limit this possibility, but that is in strong tension with its ability to affect the world in significant ways. Maybe you could even get your fusion plant. But how do you use it to steer Earth off its current course and into a future that matters, while still keeping its intelligence restrained quite closely?

Comment by Adele Lopez (adele-lopez-1) on How to deal with the sense of demotivation that comes from thinking about determinism? · 2024-02-11T18:17:18.248Z · LW · GW

I don't have this problem, so I don't have significant advice.

But one consideration that may be helpful to you is that even if the universe is 100% deterministic, you still may have indexical uncertainty about what part of the determined universe you experience next. This is what happens under the many-worlds interpretation of quantum mechanics (and if a many-worlds type interpretation isn't the correct one, then the universe isn't deterministic). You can make choices according to the flip of a quantum coin if you want to guarantee your future has significant amounts of this kind of uncertainty.

Comment by Adele Lopez (adele-lopez-1) on Don't sleep on Coordination Takeoffs · 2024-01-28T22:32:50.059Z · LW · GW

Writing up the contracts (especially around all the caveats that they might not have noticed) seems like it would be harder than just reading contracts (I'm an exception, I write faster than I read). Have you thought of integrating GPT/Claude as assistants? I don't know about current tech, but like many other technologies, that integration will scale well in the contingency scenario where publicly available LLMs keep advancing.

I'd consider the success of Manifold Markets over Metaculus to be mild evidence against this.

And to be clear, I do not currently intend to build the idea I'm suggesting here myself (could potentially be persuaded, but I'd be much happier to see someone else with better design and marketing skills make it).

I think this can be done with a website, but not the current one. Have you tried reading yudkowsky's projectlawful? The main character's math lessons gave me the impression of something that actually succeeds at demonstrating, to business school types (maybe not politicians), why math and bayesianism is something that works for them.

Heh, that scene was the direct inspiration for my website. I'm curious what specific things you think can be done better.

Comment by Adele Lopez (adele-lopez-1) on Even if we lose, we win · 2024-01-28T21:39:28.172Z · LW · GW

Point taken about CDT not converging to FDT.

I don't buy that an uncontrolled AI is likely to be CDT-ish though. I expect the agentic part of AIs to learn from examples of human decision-making, and there are enough pieces of FDT (like voting and virtue) in human intuition that I think it will pick it up by default.

(The same isn't true for human values, since here I expect optimization pressure to rip apart the random scraps of human value it starts out with into unrecognizable form. But a piece of a good decision theory is beneficial on reflection, and so will remain in some form.)

Comment by Adele Lopez (adele-lopez-1) on Don't sleep on Coordination Takeoffs · 2024-01-28T02:17:39.234Z · LW · GW

Potential piece of a coordination takeoff:

An easy-to-use app which allows people to negotiate contracts in a transparently fair way, by using an LDT solution to the Ultimatum Game (probably the proposed solution in that link is good enough, despite being unlikely to be fully optimal).
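For concreteness, here's a minimal sketch of the kind of probabilistic-rejection rule I have in mind (whether this matches the linked proposal exactly is an assumption on my part):

```python
import random
from fractions import Fraction

def accept_probability(total, offer_to_me):
    """Accept unfair offers with just enough probability that the proposer
    can't expect to gain by offering less than the fair (50/50) split."""
    fair_share = Fraction(total, 2)
    if offer_to_me >= fair_share:
        return Fraction(1)
    proposer_keeps = total - offer_to_me
    # Choose p so that p * proposer_keeps = fair_share:
    return Fraction(fair_share, proposer_keeps)

def respond(total, offer_to_me, rng=random):
    return rng.random() < accept_probability(total, offer_to_me)

# Example: out of $100, the proposer offers me $30 and keeps $70.
# I accept with probability 50/70, so their expected take is capped at $50.
print(accept_probability(100, 30))  # 5/7
```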

Part of the problem here is not just the implementation, but making it credible to people who don't/can't understand the math. I tried to solve a similar problem with my website bayescalc.io, where a large part of the goal was not just making use of Bayes' theorem accessible, but making it credible by visually showing what it's doing as much as possible in an easy-to-understand way (not sure how well I succeeded, unfortunately).

Another important factor is ease of use and a frictionless design. I believe Manifold Markets has succeeded because this turns out to be more important than even having proper financial incentives.

Comment by Adele Lopez (adele-lopez-1) on Newton's law of cooling from first principles · 2024-01-18T05:38:04.902Z · LW · GW

Heat (Q) in thermodynamics is not a conserved quantity - otherwise, heat engines couldn't work! It's not a function of microstates either.

See http://www.av8n.com/physics/thermo/path-cycle.html for details, or pages 240-242 of Kittel & Kroemer.
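A compact way to see both points (standard first-law bookkeeping with the usual sign convention; this is my summary, not a quote from those references):

```latex
% U is a state function; heat Q and work W are path functions. Over a closed
% cycle the internal energy returns to its initial value, yet a working heat
% engine has nonzero net heat flow, all of which leaves as net work:
\[
  dU = \delta Q - \delta W, \qquad
  \oint dU = 0, \qquad
  \oint \delta Q = \oint \delta W = W_{\text{net}} \neq 0 .
\]
```

So over one cycle the engine absorbs more heat than it rejects, and the difference leaves as work; nothing like a "total amount of heat" is conserved, and Q depends on the path taken rather than on the state alone.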

Comment by Adele Lopez (adele-lopez-1) on Even if we lose, we win · 2024-01-15T07:49:36.523Z · LW · GW

I expect ASIs to converge to having a "sane decision theory", since they will realize they can get more of what they want by self-modifying to have a sane one if they don't start out with one.

Comment by Adele Lopez (adele-lopez-1) on Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible · 2023-12-13T03:55:02.289Z · LW · GW

I'd be worried about changes to my personality or values from editing so many brain relevant genes.

Comment by Adele Lopez (adele-lopez-1) on Adele Lopez's Shortform · 2023-11-22T06:14:40.060Z · LW · GW

The Drama-Bomb hypothesis

Not even a month ago, Sam Altman predicted that we would live in a strange world where AIs are super-human at persuasion but still not particularly intelligent.

https://twitter.com/sama/status/1716972815960961174

What would it look like when an AGI lab developed such an AI? People testing or playing with the AI might find themselves persuaded of semi-random things, or if sycophantic behavior persists, have their existing feelings and beliefs magnified into zealotry. However, this would (at this stage) not be done in a coordinated way, nor with a strategic goal in mind on the AI's part. The result would likely be chaotic, dramatic, and hard to explain.

Small differences of opinion might suddenly be magnified into seemingly insurmountable chasms, inspiring urgent and dramatic actions. Actions which would be hard to explain even to oneself later.

I don't think this is what happened [<1%] but I found it interesting and amusing to think about. This might even be a relatively better-off world, with frontier AGI orgs regularly getting mired in explosive and confusing drama, thus inhibiting research and motivating tougher regulation.

Comment by Adele Lopez (adele-lopez-1) on The impossibility of rationally analyzing partisan news · 2023-11-16T18:56:46.779Z · LW · GW

This would really benefit from a mathematical definition of the network, along with a precise statement and proof of your impossibility result.

Comment by Adele Lopez (adele-lopez-1) on Saying the quiet part out loud: trading off x-risk for personal immortality · 2023-11-02T19:51:57.699Z · LW · GW

Cryonics is likely a very tough sell to the "close ones".

Comment by Adele Lopez (adele-lopez-1) on Configurations and Amplitude · 2023-10-29T21:43:07.878Z · LW · GW

I would not characterize that as a version of this post. In particular, it does not share the same underlying philosophical viewpoint and could not be substituted for this post in the context of the original sequence.

Comment by Adele Lopez (adele-lopez-1) on Holly Elmore and Rob Miles dialogue on AI Safety Advocacy · 2023-10-23T01:33:23.120Z · LW · GW

There seems to be a trade-off in policy-space between attainability and nuance (part of what this whole dialogue seems to be about). The point I was trying to make here is that the good of narrow AI is such a marginal gain relative to the catastrophic ruin of superintelligent AI that it's not worth being "very specific" at the cost of potentially weaker messaging for such a benefit.

Policy has adversarial pressure on it, so it makes sense to minimize the surface area if the consequence of a breach (e.g. "this is our really cool and big ai that's technically a narrow ai and which just happens to be really smart at lots of things...") is catastrophic.

Comment by Adele Lopez (adele-lopez-1) on Holly Elmore and Rob Miles dialogue on AI Safety Advocacy · 2023-10-21T18:33:29.326Z · LW · GW

I think it could be misinterpreted to mean "pause all AI development and deployment", which results in a delayed deployment of "sponge safe" narrow AI systems that would improve or save a large number of people's lives. There's a real cost to slowing things down.


This cost is trivial compared to the cost of AGI Ruin. It's like going to see your family on a plane where the engineers say they think there's a >10% chance of catastrophic failure. Seeing your family is cool, but ~nobody would think it's reasonable to go on such a plane. There are other ways to visit your family; they just take longer.

The analogy breaks down when it comes to trying to fix the plane. We understand how airplanes work; we do not understand how AI works. It makes sense to ground the plane until we have such understanding, despite the benefits of transportation.

I would love to have all the cool AI stuff too, but I don't think we're capable of toeing the line between safe and ruinous AI at acceptable risk levels.

Comment by Adele Lopez (adele-lopez-1) on At 87, Pearl is still able to change his mind · 2023-10-18T07:17:39.635Z · LW · GW

How does he think that humans get the causal information in the first place?

Comment by Adele Lopez (adele-lopez-1) on AI Alignment Breakthroughs this week (10/08/23) · 2023-10-09T02:55:57.696Z · LW · GW

What is the breakthrough: It was discovered that neural networks tend to produce “average” predictions on OOD inputs

 

This seems like somewhat good news for Goodhart's curse being easier to solve! (Mostly I'm thinking about how much worse it would be if this wasn't the case).

Comment by Adele Lopez (adele-lopez-1) on EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem · 2023-10-02T07:29:06.506Z · LW · GW

I would guess that it's because it's something a lot more people care viscerally about now (as opposed to the more theoretical care a few years ago).

Comment by Adele Lopez (adele-lopez-1) on Petrov Day Retrospective, 2023 (re: the most important virtue of Petrov Day & unilaterally promoting it) · 2023-10-01T18:44:48.038Z · LW · GW

I also incorrectly got the follow-up "resisting social pressure" option. (My original choice was A)

Comment by Adele Lopez (adele-lopez-1) on Reflexive decision theory is an unsolved problem · 2023-09-21T04:06:29.188Z · LW · GW

There's some interesting research using "exotic" logical systems where unrestricted comprehension can be done consistently (this thesis includes a survey as well as some interesting remarks about how this relates to computability). This can only happen at the expense of things typically taken for granted in logic, of course. Still, it might be a better solution for reasoning about self-reference than the classical set theory system.

Comment by Adele Lopez (adele-lopez-1) on The Talk: a brief explanation of sexual dimorphism · 2023-09-19T03:07:43.145Z · LW · GW

Can you say more about why the Bateman logic holds or doesn't hold in different situations?

Comment by Adele Lopez (adele-lopez-1) on Sharing Information About Nonlinear · 2023-09-07T19:39:37.488Z · LW · GW

Since I was curious and it wasn't ctrl-F-able, I'll post the immediate context here:

Maybe it didn't seem like it to you that it's shit-talking, but others in the community are viewing it that way. It's unprofessional - companies do not hire people who speak ill of their previous employer - and also extremely hurtful 😔. We're all on the same team here. Let's not let misunderstandings escalate further.

This is a very small community. Given your past behavior, if we were to do the same to you, your career in EA would be over with a few DMs, but we aren't going to do that because we care about you and we need you to help us save the world.

Comment by Adele Lopez (adele-lopez-1) on TurnTrout's shortform feed · 2023-08-12T04:41:32.600Z · LW · GW

Strong encouragement to write about (1)!

Comment by Adele Lopez (adele-lopez-1) on The "spelling miracle": GPT-3 spelling abilities and glitch tokens revisited · 2023-08-05T20:37:24.083Z · LW · GW

Here's a GPT2-Small neuron which appears to be detecting certain typos and misspellings (among other things)

Comment by Adele Lopez (adele-lopez-1) on The "spelling miracle": GPT-3 spelling abilities and glitch tokens revisited · 2023-07-31T22:58:47.395Z · LW · GW

How an LLM that has never heard words pronounced would have learned to spell them phonetically is currently a mystery.

Hypothesis: a significant way it learns spellings is from examples of misspellings followed by correction in the training data, and humans tend to misspell words phonetically.

Comment by Adele Lopez (adele-lopez-1) on Introducing bayescalc.io · 2023-07-30T07:16:58.426Z · LW · GW

Thanks, I'm very glad you find it intuitive!

Only allowing the last piece of evidence to be deleted was a deliberate decision. The problem is that deleting evidence from the middle changes the meaning of all the likelihood values (the sliders) for all of the evidence below it, which therefore may change in value. If I allowed it to be deleted anyway, it would make it very easy to mistakenly use the now-incorrect values (and give the impression that that was fine). I know this makes it more annoying and inconvenient, but it's because the math itself is annoying and inconvenient!

The meaning of, e.g., the Hypothesis B slider for Evidence #3 is "In what percentage of worlds where Hypothesis B is true would I see Evidence #3?" (hopefully this was clear, just reiterating to make sure we're on the same page). This is called the likelihood of Evidence #3 given Hypothesis B. When answering this, we don't use the fact that we've seen this piece of evidence (in this case, that politicians are taking this seriously), which is always just going to be true for actual evidence. Hopefully that makes sense?

As for choosing this number, or the prior values, it's in general a difficult problem that has been debated a lot. My recommendation is that you make up numbers that feel right (or at least are not obviously wrong), and then play around with the sliders a bit to see how much the exact value affects things. The intended use of the tool is not to make you commit to numbers, but to help you develop intuition on how much to update your beliefs given the evidence, as well as to help you figure out what numbers correspond to your intuitive feelings.

If you're serious about choosing the right number, then here is what it takes to figure it out: Each hypothesis represents a model of how some part of the world works. To properly get a number out of it, you need to develop the model in technical detail, to the point where you can represent it with an equation or a computer program. Then, you need to set the evidence above the one you're computing the likelihood for to be true in your model. You then need to compute what percentage of the time this evidence turns out to be true in the model. A nice general way to do this is to run the model a whole bunch of times, and see how often it happens (and if reality has been kind enough to instantiate your model enough times, then you might be able to use this to get a "base rate"). Or if your model is relatively simple, you might be able to use math to compute the exact value. This is typically a lot of work, and doesn't actually do much to train your intuition about the mental models you actually use on a day-to-day basis. But going through this process is helpful for understanding what the numbers you make up are trying to be. I hope this is helpful and not just more confusing.
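As an illustration of the "run the model a whole bunch of times" approach, here's a minimal sketch (the model and all the numbers in it are hypothetical, purely to show the mechanics):

```python
import random

def estimate_likelihood(model, earlier_evidence, evidence, n_kept=100_000):
    """Estimate P(evidence | hypothesis, earlier evidence) by simulation:
    run the hypothesis-model many times, keep only runs consistent with the
    evidence already entered above, and count how often the new evidence holds."""
    hits = kept = 0
    while kept < n_kept:
        run = model(random)                       # one simulated world
        if not all(e(run) for e in earlier_evidence):
            continue                              # condition on earlier evidence
        kept += 1
        hits += evidence(run)
    return hits / kept

# Toy hypothesis: politicians take the issue seriously 30% of the time, and
# when they do, a disclosure act gets proposed 60% of the time.
def toy_hypothesis(rng):
    serious = rng.random() < 0.3
    act_proposed = serious and rng.random() < 0.6
    return {"serious": serious, "act_proposed": act_proposed}

print(estimate_likelihood(toy_hypothesis,
                          earlier_evidence=[lambda w: w["serious"]],
                          evidence=lambda w: w["act_proposed"]))  # ≈ 0.6
```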

Comment by Adele Lopez (adele-lopez-1) on Neuronpedia - AI Safety Game · 2023-07-29T08:09:30.047Z · LW · GW

Thanks for the drafts feature!

Yeah, it's a tricky situation. It may even be worth using a model trained to avoid polysemanticity.

I also think it would make the game both more fun and more useful if you switched to a model like the TinyStories one, which is much smaller and trained on a more focused dataset.

I may join the Discord, but fyi the invite on the website is currently expired.

Comment by Adele Lopez (adele-lopez-1) on Neuronpedia - AI Safety Game · 2023-07-26T21:47:45.785Z · LW · GW

I would really like to be able to submit my own explanations even if they can't be judged right away. Maybe to save costs, you could only score explanations after they've been voted highly by users.

Additionally, it seems clear that a lot of these neurons have polysemanticity, and it would be cool if there was a way to indicate the meanings separately. As a first thought, maybe something like using | to separate them e.g. the letter c in the middle of a word | names of towns near Berlin.

Comment by Adele Lopez (adele-lopez-1) on Neuronpedia - AI Safety Game · 2023-07-26T17:03:21.983Z · LW · GW

I love this!

Conceptual Feedback:

  • I think it would be better if I could see two explanations and vote on which one I like better (when available).
  • Attention heads are where a lot of the interesting stuff is happening, and need lots of interpretation work. Hopefully this sort of approach can be extended to that case.
  • The three explanation limit kicked in just as I was starting to get into it. Hopefully you can get funding to allow for more, but in the meantime I would have budgeted my explanations more carefully if I had known this.
  • I don't feel like I should get a point for skipping, it makes the points feel meaningless.

UX Feedback:

  • I didn't realize that clicking on the previous explanation would cast a vote and take me to the next question. I wanted to go back but I didn't see a way to do that.
  • After submitting a new explanation and seeing that I didn't beat the high score, I wanted to try submitting a better explanation, but it glitched out and skipped to the next question.
  • I would like to know whether the explanation shown was the GPT-4 created one, or submitted by a user.
  • The blue area at the bottom takes up too much space at the expense of the main area (with the text samples).
  • It would be nice to be able to navigate to adjacent or related neurons from the neuron's page.

Comment by Adele Lopez (adele-lopez-1) on "Justice, Cherryl." · 2023-07-23T22:11:45.939Z · LW · GW

There seems to be a straightforward meaning to "collaborative truth seeking". Consider two rational agents who have a common interest in understanding part of reality better. The obvious thing for them to do is to share relevant arguments and evidence that they have with each other, as openly, efficiently, and unfiltered-ly as possible under their resource constraints. That's the sort of thing that I see as the ideal of "collaborative truth seeking". (ETA: combining resources to gather new evidence and think up new models/arguments is another big part of my ideal of "collaborative truth seeking".)

The thing where people are attached to their "side", and want to win the argument in order to gain status seems to clearly fall short of that ideal, as well as introduce questionable incentives (as you point out). That's to be expected because humans, but it seems like we should still try to do better. And I do think humans can and do do better than this sort of attachment-based argumentation style that seems to be our native mode of dealing with belief differences, though it is hard and takes effort.

That said, I agree it's suspicious when someone pulls out the "collaborative truth seeking" card in lieu of sharing evidence and arguments (because it's an easy way for the attachment/status motivation to come into play). I also am not particularly sold on things like the principle of charity, steelmanning, or ideological Turing tests, because they often seem more like a ploy to place undue attention on a particular position than like the actual sharing of arguments and evidence, which seems to me to be the real principle.

Comment by Adele Lopez (adele-lopez-1) on The UAP Disclosure Act of 2023 and its implications · 2023-07-22T16:45:10.884Z · LW · GW

Having a look at your link, I see you give 3% to the probability that serious politicians would propose the UAP disclosure act if NHI did actually exist. I'm really puzzled by this. Could you explain why, in a world where NHI exists, you wouldn't expect politicians to pass a law disclosing information about it at some point? Do you expect that they would keep it a secret indefinitely or is it something else?

Since a "UAP disclosure act" type law is pretty specific. Each detail I'd include in what that means would make it less likely in both worlds, which is why it's pretty low in both (probably not low enough tbh). Most of these details "cancel out" due to being equally unlikely in both worlds. The relevant details are things I expect to correlate with each other (mostly can be packed into a "politicians are taking UAPs seriously" bit, which I do take the act to be strong evidence of).

I do think you were right about me handwaving the evidence to some extent. After a bit more thinking, I think it'd be more fair to conceptualize the evidence here as "politicians are taking UAPs seriously", and I came up with these very very rough numbers for that. Note that while this evidence is stronger, my prior for "Aliens with visible UAPs" is much lower, because I find that a priori pretty implausible for aliens with interstellar tech (and again, the numbers here are meant to be suggestive, and are not refined to the point where they accurately depict my intuitions).

[And I'd strongly encourage you to share a bayescalc.io link suggestive of your own priors and likelihoods, and including all the things you consider as significant evidence! Making discussions like this more concrete was one of my major motivations in designing the site.]

The examples you give sound to me like curiosity-stoppers and I don't find them convincing.

They're meant to gesture at the breadth of Something Else (and I was aware that you had addressed many of these; it doesn't change that this is the competing hypothesis). I'll be curious to see what sort of stuff does come out due to this law! But I strongly expect it to be pretty uncompelling. If I'm wrong about that, I'll update more, of course (though probably only to the point of keeping this possibility "in the back of my mind" with this level of evidence).

Comment by Adele Lopez (adele-lopez-1) on The UAP Disclosure Act of 2023 and its implications · 2023-07-22T15:47:41.606Z · LW · GW

I was not trying to be comprehensive, but yes that is a plausible possibility.

Comment by Adele Lopez (adele-lopez-1) on The UAP Disclosure Act of 2023 and its implications · 2023-07-22T06:21:21.403Z · LW · GW

At least personally, I do see this as evidence of aliens (with strength somewhere between my intuitive feelings of 'weak' and 'strong', at ~4.8 db). But my prior for Actually Aliens is very, very low (I basically agree with the points in Eliezer's recent tweet on the subject), and so this evidence is just not enough for me to start taking it seriously.

See here for an interactive illustration with some very very rough numbers.
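(For reference, the decibel figure is 10·log10 of the likelihood ratio, the convention bayescalc.io uses; here's a quick sketch with made-up likelihoods:)

```python
import math

def evidence_decibels(p_e_given_h1, p_e_given_h2):
    """Strength of a piece of evidence in decibels: 10 * log10 of the
    likelihood ratio, so a 10:1 ratio is exactly 10 dB."""
    return 10 * math.log10(p_e_given_h1 / p_e_given_h2)

# ~4.8 dB corresponds to a likelihood ratio of roughly 3:1 in favor of aliens.
print(10 ** (4.8 / 10))               # ≈ 3.02
print(evidence_decibels(0.15, 0.05))  # ≈ 4.77 dB (made-up likelihoods)
```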

Certainly, the presence of aliens would be one reason a politician might sponsor a serious UAP disclosure act (though I still find it strange as a response to actual aliens, hence the low-ish likelihood I assign it). I don't have a great model of what causes politicians to sponsor new laws, but I don't find it particularly strange that Something Else could motivate this particular act. Some examples that come to mind include: a credibility cascade (increasingly high-status people start taking UAPs seriously, increasing their credibility without substantiated evidence), sensor illusions (including optical illusions or hallucinations), prosaic secret human tech (military or otherwise), and causing confusion + noise (perhaps to distract an enemy or the public).

Similar points apply to other possible anomalies, such as new physics tech.

Comment by Adele Lopez (adele-lopez-1) on Why it's necessary to shoot yourself in the foot · 2023-07-11T21:43:56.492Z · LW · GW

This is a good and interesting point, but it definitely isn't necessary for learning. As an example, I get why pointing even an unloaded gun at someone you do not intend to kill is generally a bad idea, despite never having had any gun accidents or close calls. I think it's worth trying to become better at seeing the reasons for these sorts of things without having to go through first-hand experience. This is especially relevant when it comes to reasoning about the dangers of superintelligence, as we will very likely only get one chance.

Comment by Adele Lopez (adele-lopez-1) on Introducing bayescalc.io · 2023-07-11T19:18:38.131Z · LW · GW

Thanks!

That was a deliberate decision designed to emphasize the core features of the app, but enough people have pointed this out now that I'm considering changing it.

Comment by Adele Lopez (adele-lopez-1) on Introducing bayescalc.io · 2023-07-11T16:11:38.487Z · LW · GW

I like the idea of showing the total decibels, I'll probably add that in soon!

Comment by Adele Lopez (adele-lopez-1) on Introducing bayescalc.io · 2023-07-11T16:04:35.366Z · LW · GW

Thanks for the suggestions!

As ProgramCrafter mentioned, more (up to five) hypotheses are already supported. It's limited to 5 because finding good colors is hard, and 5 seemed like enough - but if you find yourself needing more I'd be interested to know.

The sliders already snap to tenth values (but you can enter more precise values in the textbox), and I think snapping to integers would sacrifice too much precision. It's plausible that fifths could be better though; I'll have to test that. I do want to introduce a way to allow for more precise control while dragging the sliders, which might address this concern to some extent by making it easy to stop at an integer value exactly if desired. But I haven't thought of a good interface for doing that yet.

That sounds cool, but I'm not sure how to make a good interface for that that wouldn't look too cluttered. I'm also worried people would misuse it for convenience. But I'll keep thinking about it!

Tooltips to explain things would be cool and I have a similar thing planned already.

That's a good idea, thanks!

Comment by Adele Lopez (adele-lopez-1) on Introducing bayescalc.io · 2023-07-10T18:26:49.036Z · LW · GW

Thank you! I'm glad you like those features, and I'm also glad to hear that the way the percent button feature worked was clear to you.

Regarding the possible improvements:

  1. That's not a bug; it's just a limitation of the choice to show only one digit after the decimal. The number of decibels for each piece of evidence in case 2 is 0.96910013..., whereas in case 1 it's exactly 10.

  2. That's a deliberate nudge to suggest that the new hypothesis and decibel features are more advanced and not part of the essential core of the app.

  3. That's a good idea, I'll probably do that at some point.

  4. That's also a good idea but seems fairly complicated to implement, so it will have to wait until I've finished planned improvements with a higher expected ROI.

  5. That's deliberate, because deleting evidence changes the meaning of the likelihoods for all subsequent evidence. Thus, having to delete all the evidence following the evidence you want to delete is a more honest way to convey what needs to be done, and prevents the user from shooting themselves in the foot by assuming that the subsequent likelihoods are independent. I'll explain this in the more fleshed out version of the help panel I have planned.

Comment by Adele Lopez (adele-lopez-1) on Views on when AGI comes and on strategy to reduce existential risk · 2023-07-10T06:59:31.051Z · LW · GW

Alright, to check if I understand, would these be the sorts of things that your model is surprised by?

  1. An LLM solves a mathematical problem by introducing a novel definition which humans can interpret as a compelling and useful concept.
  2. An LLM which can be introduced to a wide variety of new concepts not in its training data, and after a few examples and/or clarifying questions is able to correctly use the concept to reason about something.
  3. An image diffusion model which is shown to have a detailed understanding of anatomy and 3D space, such that you can use it to transform a photo of a person into an image of the same person in a novel pose (not in its training data) and angle, with correct proportions and realistic joint angles for the person in the input photo.

Comment by Adele Lopez (adele-lopez-1) on Views on when AGI comes and on strategy to reduce existential risk · 2023-07-09T14:30:28.025Z · LW · GW

Is there a specific thing you think LLMs won't be able to do soon, such that you would make a substantial update toward shorter timelines if there was an LLM able to do it within 3 years from now?

Comment by Adele Lopez (adele-lopez-1) on Introducing bayescalc.io · 2023-07-09T07:28:20.407Z · LW · GW

Fixed now (but may require a cache refresh)!