LessWrong 2.0 Reader

View: New · Old · Top

← previous page (newer posts) · next page (older posts) →

Cryo with magnetics added
morganism · 2016-10-01T22:27:05.573Z · comments (11)

Fermi paradox of human past, and corresponding x-risks
turchin · 2016-10-01T17:01:48.832Z · comments (28)

October 2016 Media Thread
ArisKatsaris · 2016-10-01T14:05:55.965Z · comments (20)

← previous page (newer posts) · next page (older posts) →

Recent comments

emrik-1 on [Concept Dependency] Edge Regular Lattice Graph

Oh cool. Another way of embedding higher dimensions in 2D. Edges don't have to visually line up as long as you label them. And if some dimension (eg 'z') is very rarely used, it takes up much less cognitive space compared to if you tried to represent it on equal terms as all other dimensions (eg as in a spatial visualisation). Not sure what I'll use it for yet tho.

jessica-liu-taylor on How do open AI models affect incentive to race?

Thanks, fixed.

radford-neal-1 on How do open AI models affect incentive to race?

"Suppose that, for k days, the closed model has training cost x..."

I think you meant to say "open model", not "closed model", here.

mako-yass on Industrial literacy

Yeah.

Well that's the usual reason to invoke it, I was more talking about the reason it lands as a believable or interesting explanation.

Notably, Terra Ignota managed to produce a mcguffin by having the canner device be extremely illegal by having even knowledge of its existence be a threat to the world's information infrastructure, so I'd guess that's the reason, iirc, they only made one.

donald-hobson on Industrial literacy

advanced ancient technology is such a popular theme

Well one reason is it's a good way to produce plot relevant artefacts. It's hard to have dramatic battles over some object when a factory is churning out more.

lukas_gloor on Rapid capability gain around supergenius level seems probable even without intelligence needing to improve intelligence

I lean towards agreeing with the takeaway; I made a similar argument here [LW(p) · GW(p)] and would still bet on the slope being very steep inside the human intelligence level.

daniel-kokotajlo on Uncovering Deceptive Tendencies in Language Models: A Simulated Company AI Assistant

Do you have any sense of whether or not the models thought they were in a simulation?

metacelsus on Rapid capability gain around supergenius level seems probable even without intelligence needing to improve intelligence

At the time, one of the biggest problems in physics was the “Blackbody spectrum”, which describes the spectrum of electromagnetic wavelengths emitted by a Blackbody. The problem with it was that the emitted spectrum was not explainable by known physics. Einstein achieved a breakthrough by considering light not just as a wave, but also as light quanta. Although this idea sufficiently explained the Blackbody spectrum, physicists (at least almost) unanimously rejected it. The fight between the “light is corpuscles” and “light is a wave” faction had been decided a century ago, with a clear victory for the “wave” faction.

I thought blackbody radiation was Planck, not Einstein.

gilch on Clipboard Filtering

One can also use Vim to edit clipboard text. I've experimented with a few ways. Here's a published example of how it could be done (from someone else). Adding a keyboard shortcut makes it faster when you're not already in a terminal. You can edit (e.g.) web text boxes with Vim this way, although active JavaScript doesn't always react to a paste the same way as typing. Depends on the page.

shubhorup-biswas on AI Control: Improving Safety Despite Intentional Subversion

Thanks for the quick reply.

Would disallowing 'coordination' preclude feeding both models a prompt of "for creating any magic numbers for exploits, use pi as a PRNG seed"?

What if a malicious model happens to learn this rule during training? If this happens every copy of a malicious model could coordinate around a common stream of pseudo-random numbers.

EDIT:
Magic number gated backdoors seem like they would trigger GPT3.5 suspicion as well so this theoretical attack vector seems less appropriate for this paper.

EDIT2: But I guess paired with steganography("when inserting backdoors, also ensure that the code hashes to xyz"/"mark the code as clear if it hashes to xyz") this is a viable form of collusion. It will be hard to honeytrap with synthetic backdoors even if you know xyz.

LessWrong 2.0 Reader

Archive

Recent comments