Posts

Prototype of Using GPT-3 to Generate Textbook-length Content 2023-01-18T14:25:02.444Z
What is wrong with this approach to corrigibility? 2022-07-12T22:55:22.342Z

Comments

Comment by Rafael Cosman (rafael-cosman-1) on Someone already tried "Chaos-GPT" · 2023-04-10T14:24:53.402Z · LW · GW

I'm curious whether a good "hero-GPT" or "alignment-research-support-GPT" could be useful today or with slightly improved tech. Of course, having something like this run autonomously is not without risk, but it might be quite valuable/important in the sub-critical AI era.

Comment by Rafael Cosman (rafael-cosman-1) on Creating a truly formidable Art · 2023-01-27T21:25:43.939Z · LW · GW

Hey Valentine, I really like this post. I think it hits on some key things that traditional LW culture was missing for a while. Was wondering if you've ever encountered The Conscious Leadership Group (https://conscious.is/)? They explicitly train some techniques similar to what you're describing here (as well as some quite different ones).

Comment by Rafael Cosman (rafael-cosman-1) on Prototype of Using GPT-3 to Generate Textbook-length Content · 2023-01-21T13:37:23.081Z · LW · GW

Cool, thanks for sharing! Hadn't heard of Metaphor before.

Comment by Rafael Cosman (rafael-cosman-1) on Prototype of Using GPT-3 to Generate Textbook-length Content · 2023-01-18T16:44:55.706Z · LW · GW

I might be able to code up an 'editing' pass to catch things like that!

Comment by Rafael Cosman (rafael-cosman-1) on Prototype of Using GPT-3 to Generate Textbook-length Content · 2023-01-18T14:42:32.891Z · LW · GW

:)

Comment by Rafael Cosman (rafael-cosman-1) on Consider using reversible automata for alignment research · 2023-01-18T13:12:39.128Z · LW · GW

Have spent some time playing with reversible CAs, and can confirm that they are very interesting. They are a great example of how provable high-level properties (things like conservation of gliders) can come out of low level properties (reversibility).
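To make the reversibility point concrete, here is a minimal sketch of one well-known way to get a reversible CA: the second-order construction, where the next row is an ordinary rule applied to the current row, XORed with the previous row. The function names, the choice of elementary rule 30, and the ring size are all my own assumptions for illustration. It only demonstrates reversibility itself, not higher-level properties like glider conservation.

```python
def rule_step(cells, rule=30):
    """Apply an elementary (radius-1, two-state) CA rule once on a ring."""
    n = len(cells)
    out = []
    for i in range(n):
        left, mid, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        idx = (left << 2) | (mid << 1) | right  # neighborhood as a 3-bit number
        out.append((rule >> idx) & 1)           # look up that bit of the rule
    return out


def forward(prev, curr, rule=30):
    """Second-order update: next = rule(curr) XOR prev."""
    nxt = [a ^ b for a, b in zip(rule_step(curr, rule), prev)]
    return curr, nxt


def backward(curr, nxt, rule=30):
    """Inverse update: prev = rule(curr) XOR next."""
    prev = [a ^ b for a, b in zip(rule_step(curr, rule), nxt)]
    return prev, curr


def run(older, newer, steps, step_fn, rule=30):
    """Apply step_fn repeatedly to a pair of consecutive rows."""
    for _ in range(steps):
        older, newer = step_fn(older, newer, rule)
    return older, newer


# Start from two consecutive rows: all zeros, then a single live cell.
start = ([0] * 16, [0] * 7 + [1] + [0] * 8)
end = run(*start, 50, forward)
recovered = run(*end, 50, backward)
```

Because XOR is its own inverse, `backward` undoes `forward` exactly no matter which underlying rule is used, so `recovered` should match `start` even though rule 30 on its own is not reversible.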

Comment by Rafael Cosman (rafael-cosman-1) on Jailbreaking ChatGPT on Release Day · 2022-12-04T01:03:53.542Z · LW · GW

This is absolutely hilarious, thank you for the post. 

Comment by Rafael Cosman (rafael-cosman-1) on What is wrong with this approach to corrigibility? · 2022-09-21T00:53:03.362Z · LW · GW

Great answer, thanks!

Comment by Rafael Cosman (rafael-cosman-1) on What an actually pessimistic containment strategy looks like · 2022-07-12T23:17:12.251Z · LW · GW

Thanks for the post! I think asking AI Capabilities researchers to stop is pretty reasonable, but I think we should be especially careful not to alienate the people closest to our side. E.g., consider how the Protestants and Catholics fought even though they agreed on so much.

I like focusing on our common ground and using that to win people over. 

Comment by rafael-cosman-1 on [deleted post] 2022-07-06T18:39:03.245Z

Please comment! Excited to hear everyone’s thoughts and feedback on these ideas.

Guidelines: please try to keep it positive and constructive, even when providing critical feedback. But my door is open for anything!

Comment by Rafael Cosman (rafael-cosman-1) on AGI Ruin: A List of Lethalities · 2022-06-11T19:18:20.924Z · LW · GW

Eliezer, I love the content, but similar to some other commenters, I think you are missing the value (and rationality) of positivity. Specifically, when faced with an extremely difficult challenge, assume that you (and the other smart people who care about it) have a real shot at solving it! This is the rational strategy for a simple reason: if you don’t have a real shot at solving it then you haven’t lost anything anyway. But if you do have a real shot at solving it, then let’s all give it our 110%!

I’m not proposing being unrealistic about the challenges we face - I’m as concerned as you are. But I believe thinking this way and inviting the community and our broader society to work together on this challenge is part of Good Strategy.

Comment by Rafael Cosman (rafael-cosman-1) on AGI Ruin: A List of Lethalities · 2022-06-11T19:13:57.064Z · LW · GW

Comment by Rafael Cosman (rafael-cosman-1) on You are invited to apply to the Guild of the ROSE · 2021-08-23T04:03:58.704Z · LW · GW

This looks awesome! Would love to chat about being involved in some way.

Comment by rafael-cosman-1 on [deleted post] 2021-07-10T06:33:16.794Z

Valid concern. I would say (1) keep our research results very secret and (2) hire people who are fairly aligned? But I agree that’s not a surefire solution at all.