Posts

On epistemic autonomy 2024-08-31T18:50:43.377Z
Two LessWrong speed friending experiments 2024-06-15T10:52:26.081Z
Announcing the Double Crux Bot 2024-01-09T18:54:15.361Z

Comments

Comment by sanyer (santeri-koivula) on IAPS: Mapping Technical Safety Research at AI Companies · 2024-12-02T11:55:16.747Z · LW · GW

Really interesting work! I have two questions:

1. In the "model organisms of misalignment" section, it is stated that AI companies might be nervous about researching model organisms because it could increase the likelihood of new regulation, since it would provide more evidence of concerning properties in AI systems. Doesn't this depend on what kind of model organisms the company expects to be able to develop? If it's difficult to find model organisms, we would have evidence that alignment is easier and thus there would be less need for regulation.

2. Why didn't you list AI control work as one of the areas that may be slow to progress without efforts from outside labs? According to your incentives analysis, it doesn't seem like AI companies have many incentives to pursue this kind of work, and there were zero papers on AI control.

Comment by sanyer (santeri-koivula) on Skills from a year of Purposeful Rationality Practice · 2024-09-29T14:37:43.720Z · LW · GW

I've also found "spreadsheet literacy" a recurring skill

What exactly do you use spreadsheets for? Any examples?

Comment by sanyer (santeri-koivula) on Announcing the Double Crux Bot · 2024-07-12T08:04:03.440Z · LW · GW

Unfortunately the bot works only in Discord and Slack.

Comment by sanyer (santeri-koivula) on Announcing the Double Crux Bot · 2024-04-04T08:43:52.332Z · LW · GW

Here's another about biking:

Comment by sanyer (santeri-koivula) on Announcing the Double Crux Bot · 2024-04-04T08:43:16.552Z · LW · GW

Sure! Here's a simple conversation about tea:

Comment by sanyer (santeri-koivula) on CFAR Takeaways: Andrew Critch · 2024-02-15T22:04:15.022Z · LW · GW

Filtering for "people who can afford to pay for a workshop" works pretty well.

This is surprising to me. It seems to assume income is just based on general competence, which doesn't seem true to me. There are a lot of people who seem to have these traits who would find it really difficult to pay for this, and vice versa.

Comment by sanyer (santeri-koivula) on Announcing the Double Crux Bot · 2024-01-13T09:35:05.652Z · LW · GW

I can see why you think it would be contradictory. The idea in the example was that both of you want a better working environment in your workplace, but you have different opinions on how to get there, whereas the disclaimers were about situations where this is not the case. For example, a situation where the other person doesn't care about a safe working environment. Does that make it clearer?

We are probably going to change the example if it's unclear, though.

Comment by sanyer (santeri-koivula) on Job listing: Communications Generalist / Project Manager · 2023-11-06T21:08:02.591Z · LW · GW

When is the deadline for applying?

Comment by sanyer (santeri-koivula) on Picture Frames, Window Frames and Frameworks · 2023-10-24T10:44:43.625Z · LW · GW

Some of the links in this post don't work for me. They seem to be links to localhost.

Comment by sanyer (santeri-koivula) on Open Thread – Autumn 2023 · 2023-10-05T07:23:38.542Z · LW · GW

Is there a tag for posts applying CFAR-style rationality techniques? I'm a bit surprised that I haven't found one yet, and also a bit surprised by how few posts there are of people applying CFAR-style techniques (like internal double crux).

Comment by sanyer (santeri-koivula) on Coherence Therapy with LLMs - quick demo · 2023-09-28T20:42:36.095Z · LW · GW

It no longer seems sufficient to have a VPN in order to get access to Claude. You also need a UK- or US-based phone number. If anyone knows how to get around this, please let me know!