Is a random box of gas predictable after 20 seconds? 2024-01-24T23:00:53.184Z
Will quantum randomness affect the 2028 election? 2024-01-24T22:54:30.800Z
Vote in the LessWrong review! (LW 2022 Review voting phase) 2024-01-17T07:22:17.921Z
AI Impacts 2023 Expert Survey on Progress in AI 2024-01-05T19:42:17.226Z
Originality vs. Correctness 2023-12-06T18:51:49.531Z
The LessWrong 2022 Review 2023-12-05T04:00:00.000Z
Open Thread – Winter 2023/2024 2023-12-04T22:59:49.957Z
Complex systems research as a field (and its relevance to AI Alignment) 2023-12-01T22:10:25.801Z
How useful is mechanistic interpretability? 2023-12-01T02:54:53.488Z
My techno-optimism [By Vitalik Buterin] 2023-11-27T23:53:35.859Z
"Epistemic range of motion" and LessWrong moderation 2023-11-27T21:58:40.834Z
Debate helps supervise human experts [Paper] 2023-11-17T05:25:17.030Z
How much to update on recent AI governance moves? 2023-11-16T23:46:01.601Z
AI Timelines 2023-11-10T05:28:24.841Z
How to (hopefully ethically) make money off of AGI 2023-11-06T23:35:16.476Z
Integrity in AI Governance and Advocacy 2023-11-03T19:52:33.180Z
What's up with "Responsible Scaling Policies"? 2023-10-29T04:17:07.839Z
Trying to understand John Wentworth's research agenda 2023-10-20T00:05:40.929Z
Trying to deconfuse some core AI x-risk problems 2023-10-17T18:36:56.189Z
How should TurnTrout handle his DeepMind equity situation? 2023-10-16T18:25:38.895Z
The Lighthaven Campus is open for bookings 2023-09-30T01:08:12.664Z
Navigating an ecosystem that might or might not be bad for the world 2023-09-15T23:58:00.389Z
Long-Term Future Fund Ask Us Anything (September 2023) 2023-08-31T00:28:13.953Z
Open Thread - August 2023 2023-08-09T03:52:55.729Z
Long-Term Future Fund: April 2023 grant recommendations 2023-08-02T07:54:49.083Z
Final Lightspeed Grants coworking/office hours before the application deadline 2023-07-05T06:03:37.649Z
Correctly Calibrated Trust 2023-06-24T19:48:05.702Z
My tentative best guess on how EAs and Rationalists sometimes turn crazy 2023-06-21T04:11:28.518Z
Lightcone Infrastructure/LessWrong is looking for funding 2023-06-14T04:45:53.425Z
Launching Lightspeed Grants (Apply by July 6th) 2023-06-07T02:53:29.227Z
Yoshua Bengio argues for tool-AI and to ban "executive-AI" 2023-05-09T00:13:08.719Z
Open & Welcome Thread – April 2023 2023-04-10T06:36:03.545Z
Shutting Down the Lightcone Offices 2023-03-14T22:47:51.539Z
Review AI Alignment posts to help figure out how to make a proper AI Alignment review 2023-01-10T00:19:23.503Z
Kurzgesagt – The Last Human (Youtube) 2022-06-29T03:28:44.213Z
Replacing Karma with Good Heart Tokens (Worth $1!) 2022-04-01T09:31:34.332Z
Apply to the ML for Alignment Bootcamp (MLAB) in Berkeley [Jan 3 - Jan 22] 2021-11-03T18:22:58.879Z
The LessWrong Team is now Lightcone Infrastructure, come work with us! 2021-10-01T01:20:33.411Z
Welcome & FAQ! 2021-08-24T20:14:21.161Z
Berkeley, CA – ACX Meetups Everywhere 2021 2021-08-23T08:50:51.898Z
The Death of Behavioral Economics 2021-08-22T22:39:12.697Z
Open and Welcome Thread – August 2021 2021-08-15T05:59:05.270Z
Open and Welcome Thread – July 2021 2021-07-03T19:53:07.048Z
Open and Welcome Thread – June 2021 2021-06-06T02:20:22.421Z
Attributions, Karma and better discoverability for wiki/tag features 2021-06-02T23:47:03.604Z
Open and Welcome Thread - May 2021 2021-05-03T07:58:03.130Z
2019 Review: Voting Results! 2021-02-01T03:10:19.284Z
Last day of voting for the 2019 review! 2021-01-26T00:46:35.426Z
The Great Karma Reckoning 2021-01-15T05:19:32.447Z
COVID-19: home stretch and fourth wave Q&A 2021-01-06T22:44:29.382Z


Comment by habryka (habryka4) on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-23T00:28:56.015Z · LW · GW

Oh, yeah, I totally think what happened here is "we had more rules/guidelines about COVID, which increased the complexity of the rules we had to follow, which caused us to be more inconsistent in applying those rules". I didn't mean to imply that we actually flawlessly followed the rules.

Comment by habryka (habryka4) on Less Wrong automated systems are inadvertently Censoring me · 2024-02-22T08:00:56.814Z · LW · GW

Not sure who you are referring to, but we made some tweaks to various parts of the system over the last few months, so there is a decent chance it wouldn't happen again.

I currently am reasonably happy when I review who gets rate limited when, though it's definitely not easy to see the full effects of it. I think a time decay would make it a lot worse.

Comment by habryka (habryka4) on A Case for the Least Forgiving Take On Alignment · 2024-02-22T07:21:03.255Z · LW · GW

(Please don't leave both top-level reacts and inline reacts of the same type on comments; that produces confusing summary statistics. We might make it literally impossible, but until then, pick one and stick to it.)

Comment by habryka (habryka4) on Less Wrong automated systems are inadvertently Censoring me · 2024-02-22T04:56:55.298Z · LW · GW

Actually ok now that I am thinking, why don't downvoters have to select the text and provide the negative feedback in order to issue a downvote?

Forcing people to write a whole sentence or multiple paragraphs to signal that they think some content is bad would of course have enormous chilling effects on people's ability to express their preferences over content on the site, and reduce the signal we have on content-quality a lot.

Downvoters never reply. I suspect because they are obviously afraid I will retaliate their downvotes with my own...

I would be quite surprised if it's about vote-retaliation. I think it's usually because then people ask follow-up questions and there is usually an asymmetric burden of proof in public communication where interlocutors demand very high levels of precision and shareable evidence, when the actual underlying cognitive process was "my gut says this is bad, and I don't want to see more of this". 

Comment by habryka (habryka4) on Less Wrong automated systems are inadvertently Censoring me · 2024-02-22T02:21:27.509Z · LW · GW

That's a nice-to-have, and I do think it reduces the correlation across time and so is a case for having the rate-limit decay with just time, but mostly the point of the rate-limit is to increase the average comment quality on the site without banning a bunch of people (which comes with much greater chilling effects, since their perspectives are then not represented on the site at all), while still allowing them to complain about the moderation and make the costs to them known.

Comment by habryka (habryka4) on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-22T01:34:52.528Z · LW · GW

For COVID in particular we added a specific threshold that is "yes, this is news-based, but important enough that we will frontpage the most important posts in this category anyway". I think we announced it somewhere, let me look it up...

Here is the comment where we announced we would no longer frontpage Zvi's COVID updates: 

Here is where Ruby writes about "Long COVID" posts being frontpage: 

I feel like I remember a comment or post where we stated publicly we would start frontpaging some COVID stuff, but I can't find it quickly. 

In any case, in the domain of COVID the frontpage/personal stuff is particularly confusing.

Comment by habryka (habryka4) on Less Wrong automated systems are inadvertently Censoring me · 2024-02-22T01:17:53.875Z · LW · GW

No, it's if at least 7 people downvote you in the past 20 comments (on comments that end up net-negative), and the net of all the votes (ignoring your self-votes) on your last 20 comments is below -5 (just using approval-karma, not agreement-karma).
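
A minimal sketch of the rule as stated above, in Python. The names (`Comment`, `should_rate_limit`), the data layout, and the reading of "at least 7 people" as distinct downvoters are assumptions for illustration; this is not the site's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    # Approval-karma votes from other users: voter id -> vote power (+/-).
    # Self-votes are assumed to be excluded already (hypothetical layout).
    votes: dict = field(default_factory=dict)

    @property
    def net_karma(self):
        return sum(self.votes.values())


def should_rate_limit(comments, window=20, min_downvoters=7, karma_threshold=-5):
    """Apply the two stated conditions to the most recent comments.

    `comments` is assumed to be ordered oldest to newest.
    """
    recent = comments[-window:]
    # Condition 1: distinct downvoters, counted only on net-negative comments.
    downvoters = set()
    for c in recent:
        if c.net_karma < 0:
            downvoters.update(v for v, power in c.votes.items() if power < 0)
    # Condition 2: the net of all votes across the window is below the threshold.
    total = sum(c.net_karma for c in recent)
    return len(downvoters) >= min_downvoters and total < karma_threshold
```

Under this reading, a single user downvoting all 20 comments satisfies the karma condition but not the downvoter condition, so no rate limit results.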

Comment by habryka (habryka4) on Less Wrong automated systems are inadvertently Censoring me · 2024-02-21T22:29:35.720Z · LW · GW

Yeah, it's not crazy, but I currently am against it. I think if a user only comments occasionally, but always comments in a way that gets downvoted, then I think it's good for them to maintain a low rate-limit. I don't see how calendar time passing gives me evidence that someone's comments will be better and that I now want more of them on the site again.

Comment by habryka (habryka4) on A Case for the Least Forgiving Take On Alignment · 2024-02-21T21:24:45.327Z · LW · GW

Hmm, I feel sad about this kind of critique. Like, this comment invokes some very implicit standard for posts, without making it at all explicit. Of course neither this post nor the posts they link to are literally "not based on anything". My guess is you are invoking an implicit standard for work to be "empirical" in order to be "anything", but that also doesn't really make sense since there are a lot of empirical arguments in this article and in the linked articles.

I think highlighting any specific assumption, or even some set of assumptions that you think is fragile would be helpful. Or being at all concrete about what you would consider work that is "anything". But I think as it stands I find it hard to get much out of comments like this.

Comment by habryka (habryka4) on Less Wrong automated systems are inadvertently Censoring me · 2024-02-21T19:35:45.099Z · LW · GW

Oh, I am an idiot, you are right. I got misled by the variable name.

Then yeah, this seems pretty good to me (and seems like it should prevent basically all instances of one or two people having a grudge against someone causing them to be rate-limited).

Comment by habryka (habryka4) on Open Thread – Winter 2023/2024 · 2024-02-21T19:29:32.079Z · LW · GW

Welcome! Hope you have a good time. Asking good questions is quite valuable, and I think a somewhat undersupplied good on the site, so am glad to have you around!

Comment by habryka (habryka4) on Less Wrong automated systems are inadvertently Censoring me · 2024-02-21T18:52:19.531Z · LW · GW

(The algorithm aggregates karma over the last 20 comments or posts a user has written. Roko has written 20 comments since publishing that post, so it's no longer in the averaging window.)

Comment by habryka (habryka4) on Less Wrong automated systems are inadvertently Censoring me · 2024-02-21T18:50:55.299Z · LW · GW

Dialogues don't run into any rate limits, so that is definitely always an option (and IMO a better way to have long conversations than comment threads).

Comment by habryka (habryka4) on Less Wrong automated systems are inadvertently Censoring me · 2024-02-21T18:49:43.854Z · LW · GW

It's net karma of your last 20 comments or posts. So in order for one person to rate limit you, you would have needed to write 20 comments in a row that got basically no votes from anyone but you, at which point, I probably endorse rate-limiting you (though the zero vote case is a bit tricky, and indeed where I think a lot of the false-positives and false-negatives of the system come from).

I do think the system tends to fire the most false-positives when people are engaging in really in-depth comment trees and so write a lot of comments that get no engagement, which then makes things more sensitive to marginal downvotes. I do think "number of downvoters in the last month" or maybe "number of downvoters on your last 20 comments or posts" would help a bunch with that.

Comment by habryka (habryka4) on Less Wrong automated systems are inadvertently Censoring me · 2024-02-21T18:33:17.815Z · LW · GW

(People upvoted Roko's comments after making this post, so presumably he is no longer being rate-limited. I think there were more negative comments a few hours ago)

Comment by habryka (habryka4) on Less Wrong automated systems are inadvertently Censoring me · 2024-02-21T18:32:04.144Z · LW · GW

In order for a rate limit to trigger the user needs to be downvoted by at least 4 different users for users below 2000 karma, and 7 different users for users above 2000 karma (relevant line of code is here). 

This failsafe I think prevents most occasional commenters and posters from being affected by one or two people downvoting them.

I do think it fails to trigger for Roko here, since I think we only check for "total downvoter count", which helps with new users, but of course over the hundreds of comments that Roko has acquired over the years he has acquired more than 7 downvoters. I think replacing that failsafe with "downvoters in the last month" is a marginal improvement, and I might make a PR with that.
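
To make the difference between the two failsafes concrete, here is a rough Python sketch. The function names, the `(voter_id, voted_at)` layout, treating exactly 2000 karma as the higher tier, and using 30 days for "a month" are all assumptions for illustration, not the real implementation.

```python
from datetime import datetime, timedelta


def required_downvoters(user_karma):
    # 4 distinct downvoters below 2000 karma, 7 otherwise.
    # Treating exactly 2000 as the higher tier is an assumption.
    return 4 if user_karma < 2000 else 7


def enough_downvoters(downvotes, user_karma, now, lookback=None):
    """downvotes: list of (voter_id, voted_at) pairs (hypothetical layout).

    With lookback=None every downvote ever received counts, which matches
    the "total downvoter count" behaviour; with a timedelta only downvotes
    inside that window count.
    """
    if lookback is not None:
        downvotes = [(v, t) for v, t in downvotes if now - t <= lookback]
    distinct = {v for v, _ in downvotes}
    return len(distinct) >= required_downvoters(user_karma)
```

For a long-time user who picked up seven downvoters years ago and one this week, the check passes with `lookback=None` but fails with `lookback=timedelta(days=30)`, which is the case the proposed change is meant to cover.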

Comment by habryka (habryka4) on Updatelessness doesn't solve most problems · 2024-02-20T06:57:09.805Z · LW · GW

Promoted to curated: I think it's pretty likely a huge fraction of the value of the future will be determined by the question this post is trying to answer, which is how much game theory produces natural solutions to coordination problems, or more generally how much better we should expect systems to get at coordination as they get smarter.

I don't think I agree with everything in the post, and a few of the characterizations of updatelessness seem a bit off to me (which Eliezer points to a bit in his comment), but I still overall found reading this post quite interesting and valuable for helping me think about for which of the problems of coordination we have a more mechanistic understanding of how being smarter and better at game theory might help, and which ones we don't have good mechanisms for, which IMO is a quite important question.

Comment by habryka (habryka4) on And All the Shoggoths Merely Players · 2024-02-20T04:22:38.824Z · LW · GW

I don't understand the point. 

"Endpoints are easier to predict than intermediate trajectories" seems like a locally valid and relevant point to bring up. Then there is a valid argument here that there are lots of reasons people want to build powerful AGI, and that the argument about the structure of the cognition here is intended to apply to an endpoint where those goals are achieved, which is a valid response (if not a knockdown argument) to the argument of the interlocutor that is reasoning from local observations and trends.

Maybe you were actually commenting on some earlier section, but I don't see any word games in the section you quoted.

Comment by habryka (habryka4) on Open Thread – Winter 2023/2024 · 2024-02-20T01:36:53.534Z · LW · GW

Welcome! I hope you have a good time here, and if you run into any problems, feel free to ping the admin team on the Intercom chat in the bottom right corner.

Comment by habryka (habryka4) on CFAR Takeaways: Andrew Critch · 2024-02-19T23:05:47.928Z · LW · GW

I think the key issue here is that CFAR workshops were optimized around being 4 days long. I think teaching someone numeracy in 4 days is very hard, and the kind of things you end up being able to convey look different (and still pretty valuable, but I do think they end up missing a large fraction of the art of rationality).

Comment by habryka (habryka4) on johnswentworth's Shortform · 2024-02-16T19:51:24.689Z · LW · GW

Hmm, I don't buy it. These two scenes seem very much not like the kind of thing a video game engine could produce: 

Look at this frame! I think there is something very slightly off about that face, but the cat hitting the person's face and the person's reaction seem very realistic to me and IMO qualifies as "complex motion and photorealism in the same video".

Comment by habryka (habryka4) on Open Thread – Winter 2023/2024 · 2024-02-16T03:09:23.587Z · LW · GW

Thank you! I also am very excited about it, though sadly adoption hasn't been amazing. Would love to see more people organically produce dialogues!

Comment by habryka (habryka4) on Sam Altman’s Chip Ambitions Undercut OpenAI’s Safety Strategy · 2024-02-14T19:23:54.519Z · LW · GW

It also isn't my favorite version of this post that could exist, but it seems like a reasonable point to make, and my guess is a lot of people are expressing their agreement with the title by upvoting.

Comment by habryka (habryka4) on Open Thread – Winter 2023/2024 · 2024-02-14T00:56:37.227Z · LW · GW

I think currently the bot is more noticeable than it will be once we have cleared out the 2023/2024 backlog. Usually the bot just makes a comment on a post when it reaches 100 karma, but since we are just starting it, it's leaving a lot of comments at the same time whenever older posts get voted on that don't yet have a market.

The key UI component I care about is actually not the comment (which was just the most natural place to put this information), but the way the post shows up in post-lists: 

The karma number gets a slightly different (golden-ish) color, and then you can see the likelihood that it ends up at the top of the review on hover as well as at the top of the post. 

The central goal is both to allow us to pull forward a bunch of the benefits of the review, and to create a more natural integration of the review into the everyday experience of the site.

Comment by habryka (habryka4) on How to (hopefully ethically) make money off of AGI · 2024-02-11T07:45:09.638Z · LW · GW

Do you have a rough estimate of how much it went up in the last 3 months?

Comment by habryka (habryka4) on Welcome to the SSC Dublin Meetup · 2024-02-06T02:45:32.315Z · LW · GW

I marked the group as inactive. 

Comment by habryka (habryka4) on TurnTrout's shortform feed · 2024-02-03T17:14:44.352Z · LW · GW

Yeah, not being able to say "negative reward"/"punishment" when you use "reinforcement" seems very costly. I've run into that problem a bunch.

And yeah, that makes sense. I get the "reward implies more model-based thinking" part. I kind of like that distinction, so am tentatively in favor of using "reward" for more model-based stuff, and "reinforcement" for more policy-gradient-based stuff, if other considerations don't outweigh that.

Comment by habryka (habryka4) on TurnTrout's shortform feed · 2024-02-03T09:04:41.538Z · LW · GW

I don't understand why "reinforcement" is better than "reward"? They both invoke the same image to me. 

If you reward someone for a task, they might or might not end up reliably wanting to do the task. Same if you "reinforce" them to do that task. "Reinforce" is more abstract, which seems generally worse for communication, so I would mildly encourage people to use "reward function", but mostly expect other context cues to determine which one is better and don't have a strong general take.

Comment by habryka (habryka4) on Vote in the LessWrong review! (LW 2022 Review voting phase) · 2024-01-31T15:25:26.829Z · LW · GW

Oops, I think that's fair. I adjusted the period of the vote by 24 hours. Agree that the current deadline is confusing.

Comment by habryka (habryka4) on Amritesh Kumar's Shortform · 2024-01-30T18:41:33.969Z · LW · GW

Welcome! Hope you have a good time, and don't hesitate to complain on Intercom (the tiny chat bubble in the bottom right corner) if there is anything you don't like.

Comment by habryka (habryka4) on Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI · 2024-01-29T18:59:10.539Z · LW · GW

Promoted to curated: I like this post as a relatively self-contained explanation for why AI Alignment is hard. It's not perfect, in that I do think it makes a bunch of inferences implicitly and without calling sufficient attention to them, but I still think overall this seems to me like one of the best things to link to when someone asks about why AI Alignment is an open problem.

Comment by habryka (habryka4) on Is a random box of gas predictable after 20 seconds? · 2024-01-29T17:33:44.108Z · LW · GW

Quantum physics. I don't see why it would be indistinguishable from 50%.

Agree that there will be some decoherence. My guess is decoherence would mostly leave particle position at this scale intact, and if it becomes a huge factor, I would want the question to be settled on the basis of being able to predict which side has higher irreducible uncertainty (i.e. which side had higher amplitude, if I am using that concept correctly).

Comment by habryka (habryka4) on This might be the last AI Safety Camp · 2024-01-28T01:36:32.191Z · LW · GW

I do think that helps, but I don't think it helps that much. People don't pursue super naive CDT-ish decision theories. 

In practice this shakes out as a feeling of being indebted to whoever pays you and a pretty strong hesitation to do something that would upset them, even if they weren't going to pay you more anyway. Also, few games are actually single-iteration. You will likely continue interacting in one way or another, and Arb will interact with other clients, giving this more of an iterated nature.

Comment by habryka (habryka4) on This might be the last AI Safety Camp · 2024-01-28T00:22:24.467Z · LW · GW

My guess is it matters a lot, even if people aspire towards independence. I would update if someone has a long track record of clearly neutral-seeming reports for financial compensation, but I think in the absence of such a track record, my prior would be that people are very rarely capable of making strong negative public statements about people who are paying them.

Comment by habryka (habryka4) on This might be the last AI Safety Camp · 2024-01-26T19:02:08.539Z · LW · GW

On a more meta point, I have honestly not been all that impressed with the average competency of the AIS funding ecosystem. I don't think it not funding a project is particularly strong evidence that the project is a bad idea. 

I made a different call on AISC, but also think this is right. There aren't a lot of players in the funding ecosystem, especially post-FTX there isn't a lot of non-OpenPhil money around, and I generally only weakly update on people succeeding or failing at getting funding.

Comment by habryka (habryka4) on Will quantum randomness affect the 2028 election? · 2024-01-26T18:54:17.320Z · LW · GW

This is a relatively straightforward question in the context of quantum mechanics. There is a fact of the matter of how much amplitude the world states get where one person wins an election vs. the other one. This question is about how much such decoherence there will be.

In this conception of uncertainty there is no answer to the matter of which of the two outcomes really happens. Both events get some magical reality fluid, as Eliezer would call it.

Comment by habryka (habryka4) on RAND report finds no effect of current LLMs on viability of bioterrorism attacks · 2024-01-25T19:32:16.312Z · LW · GW

The original title of this post is "RAND doesn't believe current LLMs are helpful for bioweapons development". I don't think it makes sense to ascribe beliefs this specific to an entity as messy and big as RAND. I changed the title to something that tries to be informative without making as strong a presumption (for link posts to posts by off-site authors I take more ownership over how a post is titled; I wouldn't change the title if the author of the report had created it).

Comment by habryka (habryka4) on The case for ensuring that powerful AIs are controlled · 2024-01-25T19:03:52.773Z · LW · GW

Promoted to curated: I disagree with a bunch of the approach outlined in this post, but I nevertheless found this framing quite helpful for thinking about various AI X-risk related outcomes and plans. I also really appreciate the way this post is written, being both approachable and relatively precise in talking about these issues.

Comment by habryka (habryka4) on Is a random box of gas predictable after 20 seconds? · 2024-01-25T17:59:38.941Z · LW · GW

The bet would then be over the integral of all the random initializations (and random perturbations). I.e. does a random initialization in expectation leave enough information intact for 20 seconds if you change it a tiny bit?

Comment by habryka (habryka4) on Is a random box of gas predictable after 20 seconds? · 2024-01-25T06:51:22.168Z · LW · GW

Nah, I don't think that's super relevant here. All the degrees of freedom of the gas are coupled to each other, so the biggest source of chaos can scramble everything just fine.

Hmm, I don't super buy this. For example, this model predicts no standing wave would survive for multiple seconds, but this is trivial to disprove by experiment. So clearly there are degrees of freedom that remain coupled. No waves of substantial magnitude are present in the initialization here, but your argument clearly implies a decay rate for any kind of wave that is too substantial.

Comment by habryka (habryka4) on Is a random box of gas predictable after 20 seconds? · 2024-01-25T05:22:27.683Z · LW · GW

Yeah, standing waves were also what Thomas and I talked about most when we had a long conversation about this. Seems like there would be a bunch, and they wouldn't obviously decay that fast.

Comment by habryka (habryka4) on Will quantum randomness affect the 2028 election? · 2024-01-25T02:51:50.764Z · LW · GW

I don't think most people die for quantum-randomness reasons. I expect very little probability of someone dying is related to quantum randomness (though my guess is someone might disagree, but then we are just kind of back to the OP question about how much quantum randomness influences macro-level events).

Comment by habryka (habryka4) on This might be the last AI Safety Camp · 2024-01-25T02:34:44.507Z · LW · GW

I thought some about the AI Safety camp for the LTFF. I mostly evaluated the research leads they listed and the resulting teams directly, for the upcoming program (which was I think the virtual one in 2023). 

I felt unexcited about almost all the research directions and research leads, and the camp seemed like it was aspiring to be more focused on the research lead structure than past camps, which increased the weight I was assigning to my evaluation of those research directions. I considered for a while funding just the research lead teams I was excited about, but it was only a quite small fraction, and so recommended against funding it.

It did seem to me that the quality of research leads was very markedly worse by my lights than in past years, so I didn't feel comfortable just doing an outside-view on the impact of past camps (as the ARB report seems to do). I feel pretty good about the past LTFF grants to the past camps, but my expectations for post-2021 camps were substantially worse than for earlier camps, looking at the inputs and plans, so my expectation of the value of it substantially changed.

Comment by habryka (habryka4) on Is a random box of gas predictable after 20 seconds? · 2024-01-25T02:26:45.422Z · LW · GW

Doing it for one particle seems like it would be harder than doing it for all particles, since even if you are highly uncertain about each individual particle, in-aggregate that could still produce a quite high confidence about which side has more particles. So my guess is it matters a lot whether it's almost uniform or not.
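
As a toy illustration of the aggregation point (not a model of a real gas): assume each of N particles is independently on the favoured side with probability 0.5 + edge. Independence and the normal approximation are assumptions made purely for the arithmetic; real particle positions would be correlated. `majority_confidence` is a name made up for this sketch.

```python
import math


def majority_confidence(n_particles, edge):
    """Normal approximation to P(more than half the particles are on the
    favoured side) when each is independently there with probability 0.5 + edge."""
    p = 0.5 + edge
    # Mean excess over N/2 is edge * N; the standard deviation is sqrt(N * p * (1 - p)).
    z = edge * math.sqrt(n_particles) / math.sqrt(p * (1 - p))
    # Standard normal CDF evaluated at z.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))
```

Under these assumptions the confidence scales roughly with edge times the square root of N, so a per-particle edge of 0.001 already gives about 98% confidence at a million particles, and any fixed edge gives near-certainty at macroscopic particle counts.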

Comment by habryka (habryka4) on Is a random box of gas predictable after 20 seconds? · 2024-01-25T02:21:21.557Z · LW · GW

Do you know how to interpret "maximum divergence" in this context? Also, IIRC aren't there higher-order exponents that might decay slower? (I just read about this this morning, so I am quite unfamiliar with the literature here)

Comment by habryka (habryka4) on Will quantum randomness affect the 2028 election? · 2024-01-25T02:17:53.715Z · LW · GW

I do think even if you change the outcome of all people using quantum random number generators, this is quite unlikely to flip the outcome of an election. It's just not that many people, and election margins are quite large. There are butterfly effects here, but the idea that the people who use quantum random number generators explain a lot of the variance in election outcomes seems quite unlikely to me, even if you can correlate their actions somehow.

Comment by habryka (habryka4) on The Hidden Complexity of Wishes · 2024-01-25T02:15:51.080Z · LW · GW

Oh, I was definitely not thinking of a hole in a gas pipe. I was expecting something much much subtler than that (more like very highly localized temperature-increases which then chain-react). You are dealing with omniscient levels of consequence-control here.

Comment by habryka (habryka4) on Is a random box of gas predictable after 20 seconds? · 2024-01-25T00:13:28.658Z · LW · GW

The goal is not to predict the ratio, but to just predict which side will have more atoms (no matter how small the margin). It seems very likely to me that any such calculation would be extremely prohibitively expensive and would approximately require logical omniscience. 

To clarify this, we are assuming that without random perturbation, you would get 100% accuracy in predicting which side of the system has more atoms at t=20s. The question is how much of that 100% accuracy you can recover with a very very small unknown perturbation.

Comment by habryka (habryka4) on Is a random box of gas predictable after 20 seconds? · 2024-01-25T00:02:18.746Z · LW · GW

The variance in density will by-default be very low, so the effect size of such structure really doesn't have to be very high. Also, if you can identify multiple such structures which are uncorrelated, you can quickly bootstrap to relatively high confidence. 

I don't think "strong correlation" is required. I think you just need a few independent pieces of evidence. Determining such independence is usually really hard to establish, but we are dealing with logical omniscience here.

For example, any set of remotely coherent waves that form in the box with non-negligible magnitude would probably be enough to make a confident prediction. I do think that specific thing is kind of unlikely in a totally randomly initialized box of gas, but I am not confident, and there are many other wave-like patterns that you would find.
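
A small sketch of the "few independent pieces of evidence" point, assuming the pieces really are independent so that their likelihood ratios multiply. The function name and the example numbers are illustrative assumptions only.

```python
import math


def combine_independent_evidence(prior, likelihood_ratios):
    """Posterior probability after updating a prior on independent
    likelihood ratios, done in log-odds space."""
    log_odds = math.log(prior / (1 - prior))
    for lr in likelihood_ratios:
        log_odds += math.log(lr)
    return 1 / (1 + math.exp(-log_odds))
```

Starting from 50%, one piece of evidence with a likelihood ratio of 1.5 only gets you to 60%, but ten such independent pieces get you above 98%.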

Comment by habryka (habryka4) on The Hidden Complexity of Wishes · 2024-01-24T02:31:47.879Z · LW · GW

Does this undermine the parable? Kinda, I think. If you built a machine that samples from some bizarre inhuman distribution, and then you get bizarre outcomes, then the problem is not really about your wish any more, the problem is that you built a weirdly-sampling machine. (And then we can debate about the extent to which NNs are weirdly-sampling machines, I guess.)

This is roughly how I would interpret the post. Physics itself is a bizarre inhuman distribution, and in general many probability distributions from which you might want to sample will be bizarre and inhuman.

Agree that it's then arguable to what degree the optimization pressure of a mature AGI arising from NNs would also be bizarre. My guess is quite bizarre, since a lot of the constraints it will face will be constraints of physics.