LessWrong 2.0 Reader

View: New · Old · Top

← previous page (newer posts) · next page (older posts) →

← previous page (newer posts) · next page (older posts) →

Recent comments

review-bot on My Objections to "We’re All Gonna Die with Eliezer Yudkowsky"

The LessWrong Review [? · GW] runs every year to select the posts that have most stood the test of time. This post is not yet eligible for review, but will be at the end of 2024. The top fifty or so posts are featured prominently on the site throughout the year. Will this post make the top fifty?

thornoar on Open Thread Spring 2024

Thank you for your kind words! Unfortunately, Asymptote doesn't really have much of a community development platform, but I'll be trying to make smoothmanifold part of the official project in some way or another. Right now the development is so fast that the README is actually out of date... gotta fix that. So far, though, my talents seem less to help others and more to serve as a pleasurable pastime :)

I'm also glad that another person discovered Asymptote and liked it --- it's a language that I cannot stop to admire for the graphical functionality, ease of image creation (pdf's, jpeg's, svg's, etc., all with the same interface), and at the same time amazing programming potential (you can redefine any builtin function, for example, and Asymptote will carry on with your definition)

hastings-greer on How would you navigate a severe financial emergency with no help or resources?

No one we have worked with has had a license. I think you need one to take care of multiple people's kids at your house, but not to take care of one family's kids at their house.

ryan_greenblatt on AI Control: Improving Safety Despite Intentional Subversion

I agree that if the model constructing the backdoor is allowed to coordinate on secrets with the model that inserted the backdoor, you can do this sort of trickery (by coordinating on a secret for this sort of scheme).

In practice, we didn't model this sort of case as it didn't seem like a very important dynamic (for the reason you describe around suspicion).

If we don't allow any coordination on a shared secret and the model constructing the backdoor and H are identical (as in our paper), then these should operate identically (up to non-determinism).

And we simplify to just fixing the behavior to be exactly identical. (Effectively assuming the exact same prompts and the exact same non-determinism.)

Thanks for the objection.

shubhorup-biswas on AI Control: Improving Safety Despite Intentional Subversion

>We assume that 𝐻 will be able to construct a successful exploit if and only if the attack policy is able to (because both are implemented by GPT-4).

I challenge this assumption.

Consider the code

'''
bool is_odd(int x) {
if (sha256(x) == 354) {
return true;
}
return x % 2 == 1;
}

'''

where sha256() computes the SHA-256 hash of the input. It is obviously possible for an attacker to supply numbers x and y such that sha256(x) = y, write y into the code and supply x when creating a successful exploit.

Thus an equally 'capable' adversary need not be able construct a backdoor.

The general idea is that it should be possible to confidently claim that some code is backdoored without being able to construct an exploit. According to [Rice's Theorem](https://en.wikipedia.org/wiki/Rice%27s_theorem) it is impossible to definitively classify all programs into 'backdoored or not', so the only hope is for a process to bin programs into 'definitely safe', 'definitely backdoored' and 'suspicious'.

Hope the 'Rate Suspicion' step will catch this, and other backdoored code where H cannot construct an exploit, as suspicious looking.

anthonyc on GDP per capita in 2050

I think there's a good chance the degree to which the world of 2050 looks different to the average person might have very little to do with GDP.

On the one hand, a large chunk of the GDP growth I expect will come from changes in how we produce, distribute, and use energy and chemicals and water and food and steel and concrete etc. But for most people what will mostly feel the same is that their home is warm in winter and cool in summer, and they can get from place to place reasonably easily, and they have machines that do their basic household chores.

On the other hand, something like self-driving cars, or augmented or virtual reality, or 3D printed organs, could be hugely tranaformative for society without necessarily impacting GDP growth much at all.

benito on The Best Tacit Knowledge Videos on Every Subject

Done.

habryka4 on Open Thread Spring 2024

Only the authors (and admins) can do it.

If you paste some images here that seem good to you, I can edit them unilaterally, and will message the authors to tell them I did that.

parker-conley on The Best Tacit Knowledge Videos on Every Subject

Any chance you could pin the 'Updates Thread' too?

dzoldzaya on Which skincare products are evidence-based?

Of course, but there reaches a level of sun exposure at which the marginal increased harm becomes negligible compared to other things that damage your skin (see this meta-analysis - photo-aging is just one component among many), and below that level you're probably actually getting suboptimal levels of UV exposure for skin health (see this article for benefits of UV - from Norway, aptly).

I'd love to see someone try to measure and compare the specific trade-offs, but I strongly suspect that people at northern latitudes should just trust common sense - only wear sunscreen in summer months, and when you're actually exposed to the sun for extended periods.

LessWrong 2.0 Reader

Archive

Recent comments