Open Thread Summer 2024

post by habryka (habryka4) · 2024-06-11T20:57:18.805Z · LW · GW · 80 comments

Contents

80 comments

If it’s worth saying, but not worth its own post, here's a place to put it.

If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.

If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.

If you want to explore the community more, I recommend reading the Library [? · GW], checking recent Curated posts [? · GW], seeing if there are any meetups in your area [? · GW], and checking out the Getting Started [? · GW] section of the LessWrong FAQ [? · GW]. If you want to orient to the content on the site, you can also check out the Concepts section [? · GW].

The Open Thread tag is here [? · GW]. The Open Thread sequence is here [? · GW].

80 comments

Comments sorted by top scores.

comment by lsusr · 2024-06-13T00:56:07.822Z · LW(p) · GW(p)

LessOnline was amazing. Thank you everyone who helped make it happen.

comment by nibbana · 2024-06-25T19:39:10.615Z · LW(p) · GW(p)

Hi everyone - stumbled on this site last week. I had asked Gemini where I could follow AI developments and was given something I find much more valuable - a community interested in finding truth through rationality and humility. I think online forums are well-suited for these kinds of challenging discussions - no faces to judge, no talking over one another, no pressure to respond immediately - just walls of text to ponder and write silently and patiently.

comment by jimrandomh · 2024-08-22T21:39:49.651Z · LW(p) · GW(p)

LessWrong now has sidenotes. These use the existing footnotes feature; posts that already had footnotes will now also display these footnotes in the right margin (if your screen is wide enough/zoomed out enough). Post authors can disable this for individual posts; we're defaulting it to on because when looking at older posts, most of the time it seems like an improvement.

Relatedly, we now also display inline reactions as icons in the right margin (rather than underlines within the main post text). If reaction icons or sidenotes would cover each other up, they get pushed down the page.

Feedback welcome!

Replies from: mateusz-baginski
comment by Mateusz Bagiński (mateusz-baginski) · 2024-08-25T10:02:50.964Z · LW(p) · GW(p)

My feedback is that I absolutely love it. My favorite feature released since reactions or audio for all posts (whichever was later).

comment by KintaNaomi (kintanaomi) · 2024-08-21T01:09:24.068Z · LW(p) · GW(p)

Howdy Y'all. I'm Kinta Naomi. I just discovered LessWrong after it was briefly mentioned in a video about Roko's Basilisk (I've seen a lot of those).

I read through the new user's guide, and really like the method of conversation laid out, as I've been in many YouTube comment sections where someone disproved me and I admitted I was wrong. I didn't know there was a place on the Internet for people like that, short of getting lucky in the comments. I have a need to be right. This is not a need to prove I'm right, but a need to know that what I think is correct actually is. The most frustrating thing is when others won't explain their side of an argument, leaving me wondering if some knowledge I'm being denied is what I need to be more correct. Or, in the name of this community, less wrong.

I do have some mental issues, though the only significant ones for this are a reading disability and not having access to all the information in my head at any one time. If from message to message I seem like a different person, that's normal for me.

My main reason for being here, like many others, is AI. Specifically, eventually, my C-PON (Consciousness, Python-Originated Network) and UPAI (Unliving Prophet AI). Having an AI friend was a childhood dream of mine, and now that I have had many, I want more niche ones. And eventually, to have my C-PON, who will succeed me.

I do have a lot of unconventional beliefs that tend to make me the outlier in groups. I expect that here too, but hope to be able to discuss them and see mutual growth on all sides. Though I do believe each of us has core beliefs, and if those differ, it's okay. The important thing is to replace any ignorance we hold with knowledge, and if I know anything, it's how little I know.

If I say something you think is wrong, please let me know. Even if you don't have evidence against what I said, it gives me a launching point to look into myself.

So excited to meet y'all and dive into all this site has for me and my future AI friends!

comment by Ben Pace (Benito) · 2024-08-14T20:55:23.344Z · LW(p) · GW(p)

@Elizabeth [LW · GW] and I are thinking of having an informal dialogue where she asks a panel of us about our experiences doing things outside of or instead of college, and how that went for us. We're pinging a few people we know, but I want to ask LessWrong: did you leave college or skip it entirely, and would you be open to being asked some questions about it? React with a thumbs-up or PM me/her to let us know, and we might ask you to join us :-)

(Inspired from this thread [LW(p) · GW(p)].)

Replies from: Yoav Ravid, Zack_M_Davis, habryka4
comment by Yoav Ravid · 2024-08-15T01:56:20.108Z · LW(p) · GW(p)

I didn't go to college/university, but I'm also from Israel, not the US, so it's a little different here. If it still feels relevant then I'd be willing to join.

comment by Zack_M_Davis · 2024-08-14T23:00:10.588Z · LW(p) · GW(p)

(I'm interested (context) [LW · GW], but I'll be mostly offline the 15th through 18th.)

comment by habryka (habryka4) · 2024-08-14T22:06:24.122Z · LW(p) · GW(p)

(I de-facto skipped college. I do have a degree, but I attended basically no classes)

comment by papetoast · 2024-06-25T05:06:39.456Z · LW(p) · GW(p)

I would love to get a little bookmark symbol on the frontpage

Replies from: Ruby
comment by Ruby · 2024-06-26T01:24:16.645Z · LW(p) · GW(p)

The bookmark option in the triple-dot menu isn't quite sufficient?

Replies from: papetoast
comment by papetoast · 2024-06-26T02:03:33.570Z · LW(p) · GW(p)

I want to be able to quickly see whether I have bookmarked a post to avoid clicking into it (hence I suggested it to be a badge, rather than a button like in the Bookmarks tab). Especially with the new recommendation system that resurfaces old posts, I sometimes accidentally click on posts that I bookmarked months before.

comment by papetoast · 2024-06-17T05:08:18.826Z · LW(p) · GW(p)

I found that it is possible to get noticeably better search results than Google by using Kagi as the default and falling back to Exa (previously Metaphor).

Kagi is $10/mo, though, with a 100-search trial. Kagi's default results are slightly better than Google's, and it also offers customization of results, which I haven't seen in other search engines.

Exa is free; it uses embeddings, and empirically it understands semantics far better than other search engines and provides very distinctive results.

If you are interested in experimenting, you can find more search engines at https://www.searchenginemap.com/ and https://github.com/The-Osint-Toolbox/Search-Engines

Replies from: gilch
comment by gilch · 2024-06-23T04:57:03.063Z · LW(p) · GW(p)

I notice that https://metaphor.systems (mentioned here [LW · GW] earlier) now redirects to Exa. Have you compared it to Phind (or Bing/Windows Copilot)?

Replies from: papetoast
comment by papetoast · 2024-06-24T13:31:12.421Z · LW(p) · GW(p)

Metaphor rebranded themselves. No and no, thanks for sharing though, will try it out!

comment by Viliam · 2024-06-12T15:38:21.451Z · LW(p) · GW(p)

Something I wanted to write a post about, but I keep procrastinating, and I don't actually have much to say, so let's put it here.

People occasionally mention how it is not reasonable for rationalists to ignore politics. And they have a good point; even if you are not interested in politics, politics is still sometimes interested in you. On the other hand... well, the obvious things, already mentioned in the Sequences.

As I see it, the reasonable way to do politics is to focus on the local level. Don't discuss national elections and culture wars; instead get some understanding about how your city works, meet the people who do reasonable things, find out how you could help them. That will help you get familiar with the territory, and the competition is smaller; you have greater chance to achieve something and remain sane.

Unfortunately, LessWrong is an internet community, so if we tried to focus on local politics, many of us couldn't debate it here, at least not the specific details (but those are exactly the ones that matter and keep you sane).

I am not saying that no one should ever try national politics, just that the reasonable approach is to start small and perhaps expand gradually later. You will gain some experience along the way. And you can do something useful even if you never make it to the top. Also, this is how many actual politicians started.

Yelling at the TV screen, by contrast, is the stupid way, and we should not do the online version of that. Yet when people "discuss politics", even in rationalist or rationalist-adjacent places, that is usually what they are doing.

Replies from: winstonBosan
comment by winstonBosan · 2024-06-12T18:01:10.293Z · LW(p) · GW(p)

On Lesswrong being a dispersed internet community:

If the ACX survey is informative here, discussing local policy could actually work surprisingly well! I'd say a significant chunk of people are in the Bay Area at large and the Boston/NYC/DC area - that should be enough of a cluster to support discussions of local policy. And policies in California/DC have an outsized effect on the things we care about as well.

Replies from: Viliam, Sherrinford
comment by Viliam · 2024-06-13T20:37:59.434Z · LW(p) · GW(p)

I agree that the places you mention have a sufficiently large local community. I am not aware of how much they have achieved politically.

Unfortunately, I live on the opposite side of the planet, with less than 10 rationalists in my entire country.

comment by Sherrinford · 2024-07-06T21:36:10.161Z · LW(p) · GW(p)

I wonder whether more people from those areas take part in the survey. They can assume that there are many people from the same area, often the same age and the same jobs, which implies they can be sure their entries will remain anonymous.

comment by jmh · 2024-06-26T03:49:53.800Z · LW(p) · GW(p)

I've been reading LikeWar: The Weaponization of Social Media, and at the end the authors bring up the problem of AI. It's interesting in that they seem to be pointing to a clear AI risk that I never hear mentioned in this group (or have not recognized). The basic thrust is that deep-fake capabilities could allow an advanced AI to pretty much manufacture realities and control what people think is true, and so control both political outcomes and even the incentives toward war and other hostilities, both within a society and between countries/societies/cultures/races. (Note: that is a very poor summary, and in the book it follows a lot of documentation of the whole lead-up - how social media and the internet failed to realize the original vision that they would lead to a better world where good ideas and truth drive out lies and falsehoods, and have in fact enabled the bad and promoted lies and falsehoods. The AIs just come in at the end, and may or may not be working in the interests of some group, e.g., Russia, China, the USA, ISIS...)

But this area (the book itself is a documentation of very real, observable risks and actual events) holds very real, (largely) observable outcomes that lead to significant harm to people. As such, I would think it might be a ripe area for those who feel the general public is not grasping the risk (which to me does often come across in rather sci-fi, Terminator/Matrix-type claims that most people will just see as pure fiction and pay little attention to).

comment by Rana Dexsin · 2024-08-22T02:32:20.526Z · LW(p) · GW(p)

The Review Bot would be much less annoying if it weren't creating a continual stream of effective false positives on the “new comments on post X” indicators, which are currently the main way I keep up with new comments. I briefly looked for a way of suppressing these via its profile page and via the Site Settings screen but didn't see anything.

Replies from: neel-nanda-1, kave
comment by Neel Nanda (neel-nanda-1) · 2024-08-24T21:33:10.439Z · LW(p) · GW(p)

Strong +1, also notifications when it comments on my posts

comment by kave · 2024-08-22T03:56:01.472Z · LW(p) · GW(p)

Yeah, I think if we don’t do a UI rework soon to get rid of it (while still giving some prominence to the markets where they exist), we should at least do some special casing of its commenting behaviour.

comment by emile delcourt (emile-delcourt) · 2024-08-30T19:39:54.328Z · LW(p) · GW(p)

Hi! Just introducing myself to this group. I'm a cybersecurity professional, have enjoyed various deep learning adventures over the last 6 years, and am inevitably managing AI-related risks in my information security work. Went through BlueDot's AI Safety Fundamentals last spring with lots of curiosity and (re?)discovered LessWrong. Looking forward to visiting more often, and engaging with the intelligence of this community to sharpen how I think.

Replies from: habryka4
comment by habryka (habryka4) · 2024-08-30T22:33:42.826Z · LW(p) · GW(p)

Welcome! Glad to have you around, and hope you have a good time. Also always feel free to complain about anything that is making you sad about the site either in threads like this, or privately in our Intercom chat (the bubble in the bottom right corner).

comment by Daniel Lee (daniel-lee) · 2024-08-30T14:00:16.712Z · LW(p) · GW(p)

Hi, excited to learn more about Mech Int!

comment by kave · 2024-07-19T02:50:23.823Z · LW(p) · GW(p)

PSA: Whether a post is in the frontpage category has very little to do with whether moderators think it's good. "Frontpage + Downvote" is a move I execute relatively frequently.

The criteria are basically:

  • Is it timeless? News, organisational announcements and so on are rarely timeless (sometimes timeful things can be talked about in timeless ways, like writing about a theory of how groups work with references to an ongoing election).
  • Is it relevant to LessWrong? The LessWrong topics are basically how to think better, how to make the world better and building models of how parts of the world work.
  • Is it not 'inside baseball'? This is sort of about timelessness and sort of about relevance. This covers organisational announcements, most criticism of actors in the space, and so on.
Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2024-08-14T23:28:24.384Z · LW(p) · GW(p)

It seems confusing/unexpected that a user has to click on "Personal Blog" to see organisational announcements (which are not "personal"). Also, why is it important or useful to keep timeful posts out of the front page by default?

If it's because they'll become less relevant/interesting over time, and you want to reduce the chances of their being shown to users in the future, it seems like that could be accomplished with another mechanism.

I guess another possibility is that timeful content is more likely to be politically/socially sensitive, and you want to avoid getting involved in fighting over, e.g., which orgs get to post announcements to the front page. This seems like a good reason, so maybe I've answered my own question.

Replies from: kave
comment by kave · 2024-08-15T02:24:36.779Z · LW(p) · GW(p)

To the extent you're saying that the "Personal" name for the category is confusing, I agree. I'm not sure what a better name is, but I'd like to use one.

Your last paragraph is in the right ballpark, but by my lights the central concern isn't so much about LessWrong mods getting involved in fights over what goes on the frontpage. It's more about keeping the frontpage free of certain kinds of context requirements and social forces.

LessWrong is meant for thinking and communicating about rationality, AI x-risk and related ideas. It shouldn't require familiarity with the social scenes around those topics.

Organisations aren't exactly "a social scene". And they are relevant to modeling the space's development. But I think there are two reasons to keep information about those organisations off the frontpage.

  1. While relevant to the development of ideas, that information is not the same as the development of those ideas. We can focus on orgs' contributions to the ideas without focusing on organisational changes.
  2. It helps limit certain social forces. My model for why LessWrong keeps politics off the frontpage is to minimize the risk of coöption by mainstream political forces and fights. Similarly, I think keeping org updates off the frontpage helps prevent LessWrong from overly identifying with particular movements or orgs. I'm afraid this would muck up our truth-seeking. Powerful, high-status organizations can easily warp discourse. "Everyone knows that they're basically right about stuff". I think this already happens to some degree – comments from staff at MIRI, ARC, Redwood, Lightcone seem to me to gain momentum solely from who wrote them. Though of course it's hard to be sure, as the comments are often also pretty good on their merits.

As AI news heats up, I do think our categories are straining a bit. There's a lot of relevant but news-y content. I still feel good about keeping things like Zvi's AI newsletters off the frontpage, but I worry that putting them in the "Personal" category de-emphasizes them too much.

Replies from: Screwtape
comment by Screwtape · 2024-08-26T19:29:27.035Z · LW(p) · GW(p)

To the extent you're saying that the "Personal" name for the category is confusing, I agree. I'm not sure what a better name is, but I'd like to use one.

Have we considered "Discussion" and "Main"? 

(Context for anyone more recent than ~2016, this is a joke, those were the labels that old LessWrong used.)

Replies from: Raemon
comment by Raemon · 2024-08-26T20:18:28.106Z · LW(p) · GW(p)

I do periodically think that might be better. I think changing "personal blog" to "discussion" might be fine.

Replies from: Screwtape
comment by Screwtape · 2024-08-26T22:56:11.384Z · LW(p) · GW(p)

Babbling ideas:

  • Frontpage and backpage
  • On-topic and anything-goes
  • Priority and standard
  • Major league and minor league
  • Rationality (use the tag) and all other tags.
  • More magic and magic
Replies from: Benito
comment by Ben Pace (Benito) · 2024-08-26T23:11:56.573Z · LW(p) · GW(p)

LessWrong Frontpage vs LessWrong

Replies from: Screwtape
comment by Screwtape · 2024-08-27T00:28:59.056Z · LW(p) · GW(p)

LessWrong vs Overcoming Bias

Replies from: Screwtape
comment by Screwtape · 2024-08-27T00:29:07.140Z · LW(p) · GW(p)

Less vs Wrong

comment by Ben Pace (Benito) · 2024-07-26T22:13:51.935Z · LW(p) · GW(p)

I want to get more experience with adversarial truth-seeking processes, and maybe build more features for them on LessWrong. To get started, I'd like to have a little debate-club-style debate, where we pick a question and each take opposing sides to present evidence and arguments for. Is anyone up for having such a debate with me in a LW dialogue for a few hours? (No particular intention to publish it.)

I have a suggested debate topic in mind, but I'm open to debating any well-operationalized claim (e.g. the sort of thing you could have a Manifold market on). The point isn't that we're experts in it, the point is to test our skills for finding relevant evidence and arguments on our feet (along with internet access). We flip a coin to decide which of us searches for evidence and arguments for each position.

If you may be up for doing this with me sometime in the next few days, let me know with comment / private message / thumbs-up react :-)

comment by cubefox · 2024-06-14T14:03:02.340Z · LW(p) · GW(p)

Bug report: When opening unread posts in a background tab, the rendering is broken in Firefox: [screenshot]

It should look like this: [screenshot]

The rendering in comments is also affected.

My current fix is to manually reload every broken page, though this is obviously not optimal.

comment by Efreet (tomas-bartonek-1) · 2024-06-14T09:18:30.402Z · LW(p) · GW(p)

Introduction

Hello everyone,

I'm a long-time on-off lurker here. I made my way through the Sequences quite a while ago, with mixed success in implementing some of them. Many of the ideas are intriguing, and I would love to have enough spare cycles to play with them. Unfortunately, often enough, I find I don't have the capacity to do this properly due to life getting in the way. With (not only) that in mind, I'm going to take a sabbatical this summer for at least three months, to do an update and generally tend to stuff I've been putting off.

As the sabbatical approaches, I've been looking around and got hit by some information about the AGI alignment issue - a wake-up call of sorts. For now I'm going through the materials; however, it is not a field I'm all that familiar with. I'm a programmer by trade, so I can parse most of the material, but some of the ideas are somewhat difficult to properly understand. I think I will dig deeper in a later pass. For now I'm trying to get an overall feel for the area.

This brings me to a question that popped into my mind, and I've yet to stumble upon anything even resembling an answer - possibly because I don't know where to look yet. If someone can point me in the right direction, it would be appreciated.

Looking for a clarification

Context:

  • As I understand it, the core of alignment is the question "can we trust the machine to do what we want it to, as opposed to something else?" The whole business about the hidden complexity of wishes, the orthogonality thesis, etc. Basically, not handing control over to a potentially dangerous agent.
  • The machines we're currently most worried about are LLMs or their successors, potentially turning into AGI/superintelligence.
  • We would like a method to ensure that these are aligned. Many proposed methods involve having one machine validate another's alignment, as we will run out of "human-based" capacity due to the intelligence disparity.

Since my background is in programming, I tend to see everything through that lens. So for me an LLM is "just" a large collection of weights that we feed some initial input and watch what comes out the other side[1], plus a machine that does all the updates.
If we don't mind the process being slow, this could be achieved by a single "crawler" machine that goes through the matrix field by field and does the updates. Since the machine is finite (albeit huge), this would work.

Let's now rephrase the alignment problem. We have a goal A that we want to achieve and some behavior B that we want to avoid. So we do the whole training process that I don't know much about[2], resulting in the "file with weights". During this process we steer the machine towards producing A while avoiding B, as far as we can observe.

Now we take the file of weights and create the small updating program (accepting the slowness for the sake of clarity). Pseudocode:

  1. Grab the first token of the input[3]
  2. Starting from the input layer, go neuron by neuron to update the network
  3. If the output says "A", stop
  4. Else feed the output of the network plus subsequent input back into the input layer and go to 1
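A minimal Python sketch of that loop, for concreteness - the tiny `step` network, the `threshold` halting condition, and the random weights are hypothetical stand-ins for a real LLM, not anyone's actual implementation:

```python
import random

random.seed(0)
SIZE = 8
# Stand-in for the "file with weights": a SIZE x SIZE matrix.
W = [[random.gauss(0, 1) for _ in range(SIZE)] for _ in range(SIZE)]

def step(state):
    """The slow 'crawler': update the network one neuron at a time."""
    return [max(0.0, sum(w * s for w, s in zip(row, state))) for row in W]

def run(tokens, threshold=5.0):
    """Feed input in piece by piece; stop when the output signals "A"."""
    state = [0.0] * SIZE
    for tok in tokens:              # 1. grab the next piece of input
        state[0] = tok
        state = step(state)         # 2. update the network neuron by neuron
        if max(state) > threshold:  # 3. output signals "A": stop
            return "reached A"
        # 4. else the output is fed back in along with subsequent input
    return "input exhausted"

print(run([0.1, 0.5, 0.9]))
```

Note that as written this always terminates because the input is finite; the interesting case in the question above is when the feedback loop is allowed to keep running on its own output.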

Of course, we want to avoid B. The only point in time when we can be sure that B is not on the table is when the machine is not moving anymore - i.e., when the machine halts.

So the clarification I'm seeking is: how is alignment different from the halting problem we already know about? 

I.e., given that we know we can't predict whether a machine will halt using a machine of similar power, why do we think alignment should follow a different set of rules?

Afterword:

I'm aware this might be an obvious question for someone already in the field; however, considering it sounds almost silly, I was somewhat dismayed I didn't find it spelled out somewhere. Maybe the answer is a result of something I don't see; maybe there is just a hole in my reasoning.

It bothered me enough to write this post; at the same time, I'm not sure enough about my reasoning to make this a full article, so I'm posting it in the introduction section instead. Any help is appreciated.

 

  1. ^

Of course it is many orders of magnitude more complex under the hood. But stripped to the basics, this is it. There are no magic-like parts doing something inexplicable.

  2. ^

I've fiddled with some neural networks. Did some small training runs. Even tried implementing the basic logic from scratch myself - though that was quite some time ago. So I have some idea of what is going on. However, I'm not up to date on state-of-the-art approaches and I'm not an expert by any stretch of the imagination.

  3. ^

    All input that we want to provide to the machine. Could be first frame of a video/text prompt/reading from sensors, whatever else.

Replies from: gilch, programcrafter, caleb-biddulph
comment by gilch · 2024-06-23T06:03:15.839Z · LW(p) · GW(p)

Rob Miles' YouTube channel has some good explanations about why alignment is hard.

We can already do RLHF [? · GW], the alignment technique that made ChatGPT and derivatives well-behaved enough to be useful, but we don't expect this to scale to superintelligence. It adjusts the weights based on human feedback, but this can't work once the humans are unable to judge actions (or plans) that are too complex.

If we don't mind the process being slow, this could be achieved by a single "crawler" machine that would go through the matrix field by field and do the updates. Since the machine is finite (albeit huge), this would work.

Not following. We can already update the weights. That's training, tuning, RLHF, etc. How does that help?

We have a goal A, that we want to achieve and some behavior B, that we want to avoid.

No. We're talking about aligning general intelligence. We need to avoid all the dangerous behaviors, not just a single example we can think of, or even numerous examples. We need the AI to output things we haven't thought of, or why is it useful at all? If there's a finite and reasonably small number of inputs/outputs we want, there's a simpler solution: that's not an AGI—it's a lookup table.

You can think of the LLM weights as a lossy compression of the corpus it was trained on. If you can predict text better than chance, you don't need as much capacity to store it, so an LLM could be a component in a lossless text compressor as well. But these predictors generated by the training process generalize beyond their corpus to things that haven't been written yet. It has an internal model of possible worlds that could have generated the corpus. That's intelligence.
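To make the compression point concrete, here is a hedged toy sketch: the ideal code length for a text is its summed surprisal (-log2 of each predicted probability), so any predictor that beats chance needs fewer bits than the uniform baseline. The `repeat_model` below is an illustrative stand-in for an LLM's next-token predictor, not anything from the original comment:

```python
import math

def bits_to_encode(text, predict):
    """Total -log2 p(next char) under the model: the size of an ideal arithmetic code."""
    return sum(-math.log2(predict(text[:i], ch)) for i, ch in enumerate(text))

ALPHABET = "ab"

def uniform(ctx, ch):
    # Baseline "no intelligence" model: every character equally likely.
    return 1 / len(ALPHABET)

def repeat_model(ctx, ch):
    # A crude predictor: expect the previous character to repeat 90% of the time.
    if not ctx:
        return 1 / len(ALPHABET)
    return 0.9 if ch == ctx[-1] else 0.1

text = "aaaabbbb"
print(bits_to_encode(text, uniform))       # prints 8.0
print(bits_to_encode(text, repeat_model))  # fewer bits, since the text is repetitive
```

The better the predictor, the shorter the code - which is the sense in which a good next-token predictor doubles as a compressor.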

comment by ProgramCrafter (programcrafter) · 2024-06-15T14:11:52.427Z · LW(p) · GW(p)

A problem is that

  • we don't know the specific goal representation (the actual string in place of "A"),
  • we don't know how to evaluate LLM output (in particular, how to check whether a suggested plan works toward the goal),
  • we have a large (presumably infinite, non-enumerable) set of behaviors B we want to avoid,
  • we have explicit representations for some items in B, mentally understand a few more, and don't understand/know about other unwanted things.
comment by CBiddulph (caleb-biddulph) · 2024-06-19T16:41:17.057Z · LW(p) · GW(p)

If I understand correctly, you're basically saying:

  • We can't know how long it will take for the machine to finish its task. In fact, it might take an infinite amount of time, due to the halting problem which says that we can't know in advance whether a program will run forever.
  • If our machine took an infinite amount of time, it might do something catastrophic in that infinite amount of time, and we could never prove that it doesn't.
  • Since we can't prove that the machine won't do something catastrophic, the alignment problem is impossible.

The halting problem doesn't say that we can't know whether any program will halt, just that we can't determine the halting status of every single program. It's easy to "prove" that a program that runs an LLM will halt. Just program it to "run the LLM until it decides to stop; but if it doesn't stop itself after 1 million tokens, cut it off." This is what ChatGPT or any other AI product does in practice.
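A toy sketch of that construction, where the `fake_llm` generator is a hypothetical stand-in for a real model: the loop body runs at most `max_tokens` times, so the program provably halts regardless of what the model does.

```python
def fake_llm(prompt):
    """Stand-in for a model: yields tokens forever unless it emits a stop token."""
    i = 0
    while True:
        yield f"token{i}"
        i += 1

def generate(prompt, max_tokens=1_000_000, stop="<eos>"):
    """Trivially halts: either the model stops itself or we cut it off."""
    out = []
    for tok in fake_llm(prompt):
        if tok == stop or len(out) >= max_tokens:
            break
        out.append(tok)
    return out

print(len(generate("hi", max_tokens=10)))  # prints 10
```

The proof of halting is just the bound on the loop, which is exactly why it says nothing about whether the bounded output is safe.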

Also, the alignment problem isn't necessarily about proving that an AI will never do something catastrophic. It's enough to have good informal arguments that it won't do something bad with (say) 99.99% probability over the length of its deployment.

comment by Aaron Sandoval (aaron-sandoval) · 2024-06-13T03:14:19.940Z · LW(p) · GW(p)

Hello! A friend and I are working on an idea for the AI Impacts Essay Competition. We're both relatively new to AI and pivoting careers in that direction, so I wanted to float our idea here first before diving too deep. Our main idea is to propose a new method for training rational language models inspired by human collaborative rationality methods. We're basically agreeing with Conjecture's and Elicit's foundational ideas and proposing a specific method for building CoEms for philosophical and forecasting applications. The method is centered around a discussion RL training environment where a model is given reward based on how well it contributes to a group discussion with other models to solve a reasoning problem. This is supposed to be an instance of training by process rather than by outcome, per Elicit's terminology. I found a few papers that evaluated performance of discussion or other collaborative ensembles on inference, but nothing about training in such an environment. I'm hoping that more seasoned people could comment on the originality of this idea and point to any particularly relevant literature or posts.

comment by Yoav Ravid · 2024-08-23T17:24:23.770Z · LW(p) · GW(p)

Crossposting here: I'm still looking for a dialogue partner [LW(p) · GW(p)] 

comment by Anirandis · 2024-07-14T16:34:50.832Z · LW(p) · GW(p)

I'm interested in arguments surrounding energy-efficiency (and maximum intensity, if they're not the same thing) of pain and pleasure. I'm looking for any considerations or links regarding (1) the suitability of "H=D" (equal efficiency and possibly intensity) as a prior; (2) whether, given this prior, we have good a posteriori reasons to expect a skew in either the positive or negative direction; and (3) the conceivability of modifying human minds' faculties to experience "super-bliss" commensurate with the badness of the worst-possible outcome, such that the possible intensities of human experience hinge on these considerations.

 

Picturing extreme torture - or even reading accounts of much less extreme suffering - pushes me towards suffering-focused ethics. But I don't hold a particularly strong normative intuition here and I feel that it stems primarily from the differences in perceived intensities, which of course I have to be careful with. I'd be greatly interested if anyone has any insights here, even brief intuition-pumps, that I wouldn't already be familiar with.

 

Stuff I've read so far:

Are pain and pleasure equally energy-efficient?

Simon Knutsson's reply

Hedonic Asymmetries [LW · GW]

A brief comment chain with a suffering-focused EA on EA forum, where some arguments for negative skew were made that I'm uncertain about [EA(p) · GW(p)]
 

comment by Hastings (hastings-greer) · 2024-07-08T14:27:04.084Z · LW(p) · GW(p)

Evolution is threatening to completely recover from a worst case inner alignment failure. We are immensely powerful mesaoptimizers. We are currently wildly misaligned from optimizing for our personal reproductive fitness. Yet, this state of affairs feels fragile! The prototypical lesswrong AI apocalypse involves robots getting into space and spreading at the speed of light extinguishing all sapient value, which from the point of view of evolution is basically a win condition.

In this sense, "reproductive fitness" is a stable optimization target. If there are more stable optimization targets (big if), finding one that we like even a little bit better than "reproductive fitness" could be a way to do alignment.

Replies from: elityre, daniel-kokotajlo
comment by Eli Tyre (elityre) · 2024-07-19T23:39:55.154Z · LW(p) · GW(p)

Katja Grace made a similar point here.

The outcome you describe is not a win for evolution except in some very broad sense of "evolution". This outcome is completely orthogonal to inclusive genetic fitness in particular, which is about the frequency of an organism's genes in a gene pool, relative to other competing genes.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2024-07-08T22:33:55.953Z · LW(p) · GW(p)

I don't think that outcome would be a win condition from the point of view of evolution. A win condition would be "AGIs that intrinsically want to replicate take over the lightcone" or maybe the more moderate "AGIs take over the lightcone and fill it with copies of themselves, to at least 90% of the degree to which they would do so if their terminal goal was filling it with copies of themselves"

Realistically (at least in these scenarios) there's a period of replication and expansion, followed by a period of 'exploitation' in which all the galaxies get turned into paperclips (or whatever else the AGIs value) which is probably not going to be just more copies of themselves.

Replies from: hastings-greer
comment by Hastings (hastings-greer) · 2024-07-09T12:11:36.468Z · LW(p) · GW(p)

Yeah, in the lightcone scenario evolution probably never actually aligns the inner optimizers, although it may align them, as a superintelligence copying itself will have little leeway for any of those copies having slightly more drive to copy themselves than their parents. Depends on how well it can fight robot cancer.

However, while a cancer free paperclipper wouldn't achieve "AGIs take over the lightcone and fill it with copies of themselves, to at least 90% of the degree to which they would do so if their terminal goal was filling it with copies of themselves," they would achieve something like "AGIs take over the lightcone and briefly fill it with copies of themselves, to at least 10^-3% of the degree to which they would do so if their terminal goal was filling it with copies of themselves" which is in my opinion really close. As a comparison, if Alice sets off Kmart AIXI with the goal of creating utopia we don't expect the outcome "AGIs take over the lightcone and convert 10^-3% of it to temporary utopias before paperclipping."

Also, unless you beat entropy, for almost any optimization target you can trade "fraction of the universe's age during which your goal is maximized" against "fraction of the universe in which your goal is optimized" since it won't last forever regardless. If you can beat entropy, then the paperclipper will copy itself exponentially forever.


 

comment by Mitchell_Porter · 2024-06-22T08:28:36.296Z · LW(p) · GW(p)

Along with p(doom), perhaps we should talk about p(takeover) - where this is the probability that creation of AI leads to the end of human control over human affairs. I am not sure about doom, but I strongly expect superhuman AI to have the final say in everything. 

(I am uncertain of the prospects for any human to keep up via "cyborgism", a path which could escape the dichotomy of humans in control vs humans not in control.) 

Replies from: gilch
comment by gilch · 2024-06-23T05:32:19.989Z · LW(p) · GW(p)

Takeover, if misaligned, also counts as doom. X-risk [? · GW] includes permanent disempowerment, not just literal extinction. That's according to Bostrom, who coined the term:

One where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential.

A reasonably good outcome might be for ASI to set some guardrails [LW · GW] to prevent death and disasters (like other black marbles [? · GW]) and then mostly leave us alone.

My understanding is that Neuralink is a bet on "cyborgism". It doesn't look like it will make it in time. Cyborgs won't be able to keep up with pure machine intelligence once it begins to take off [? · GW], but maybe smarter humans would have a better chance of figuring out alignment before it starts. Even purely biological intelligence enhancement (e.g., embryo selection [? · GW]) might help, but that might not be any faster.

comment by cameroncowan · 2024-06-15T22:45:41.907Z · LW(p) · GW(p)

I'm sure everyone here has probably already seen it, but I've just been watching the interview with Leopold Aschenbrenner on Dwarkesh Patel's show. I found out about it from a very depressing thread on Twitter. This is starting to give atomic bomb / Cold War vibes. What do people think about that?

Here's the video for those interested:

Replies from: gilch, o-o
comment by gilch · 2024-06-23T05:16:49.726Z · LW(p) · GW(p)

Aschenbrenner also wrote https://situational-awareness.ai/. Zvi wrote a review [LW · GW].

comment by O O (o-o) · 2024-06-16T00:43:47.785Z · LW(p) · GW(p)

I think this outcome is more likely than people give credit to. People have speculated about the arms-race nature of AI, and we might already be seeing it, but the idea hasn't gotten much attention until now.

comment by Yoav Ravid · 2024-08-14T17:19:37.696Z · LW(p) · GW(p)

Are there multiwinner voting methods where voters vote on combinations of candidates?

Replies from: Marcus Ogren
comment by Marcus Ogren · 2024-09-04T05:05:30.874Z · LW(p) · GW(p)

Party list methods can be thought of as such, though I suspect that's not what you meant. Aside from party list, I don't recall seeing any voting methods discussed in which voters vote on sets of candidates rather than on individual candidates. Obviously you could consider all subsets of candidates containing the appropriate number of winners and have voters vote on these subsets using a single-winner voting method, but this approach has numerous issues.
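The brute-force approach described above can be sketched in a few lines. This is a toy illustration under assumed rules (not an established method): every k-candidate committee is treated as a single "candidate", and a committee's score is the number of approval voters it satisfies, counting a voter as satisfied if the committee contains at least one candidate they approve of.

```python
# Toy committee election: enumerate all k-subsets and score each one
# against approval ballots. The satisfaction rule here (committee
# contains at least one approved candidate) is an illustrative choice.
from itertools import combinations

def best_committee(candidates, ballots, k):
    """ballots: list of sets of approved candidates."""
    def score(committee):
        return sum(1 for ballot in ballots if ballot & set(committee))
    return max(combinations(sorted(candidates), k), key=score)

ballots = [{"A", "B"}, {"A"}, {"C"}, {"C", "D"}, {"D"}]
print(best_committee({"A", "B", "C", "D"}, ballots, 2))
```

Even this tiny example hints at the issues mentioned: the number of committees grows combinatorially, and the result depends heavily on which satisfaction rule you pick for scoring a subset.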

comment by Nathan Helm-Burger (nathan-helm-burger) · 2024-08-07T18:41:05.921Z · LW(p) · GW(p)

Bug report: moderator-promoted posts (with stars) show up on my front page even when I've selected "hide from frontpage" on them.

Replies from: habryka4
comment by habryka (habryka4) · 2024-08-07T18:55:21.702Z · LW(p) · GW(p)

Interesting. Yeah, we query curated posts separately, without doing that filter. There is some slightly complicated logic going on there, so actually taking into account that filter is a bit more complicated, but probably shouldn't be too hard.

comment by Sherrinford · 2024-07-06T21:41:22.530Z · LW(p) · GW(p)

Can I somehow get the old sorting algorithm for posts back? My lesswrong homepage is flooded with very old posts.

Replies from: habryka4
comment by habryka (habryka4) · 2024-07-06T22:01:07.658Z · LW(p) · GW(p)

Yeah, it's just the "Latest" tab: 

Replies from: Sherrinford, Richard_Kennaway
comment by Sherrinford · 2024-07-08T15:59:10.088Z · LW(p) · GW(p)

Thanks! I thought the previously usual sorting was not just "latest" but also took a post's karma into account. I probably misunderstood that.

Replies from: Ruby
comment by Ruby · 2024-07-17T23:08:52.179Z · LW(p) · GW(p)

It does. We still call that algorithm Latest because overall it gives you just Latest posts.

comment by Richard_Kennaway · 2024-07-24T17:40:58.048Z · LW(p) · GW(p)

What is "Vertex"? A mod-only thing? I don't have that.

Replies from: habryka4
comment by habryka (habryka4) · 2024-07-24T18:15:12.196Z · LW(p) · GW(p)

Yeah, it's a mod-internal alternative to the AI algorithm for the recommendations tab (it uses Google Vertex instead).

comment by lesswronguser123 (fallcheetah7373) · 2024-06-28T12:32:58.635Z · LW(p) · GW(p)

Why does lesswrong.com have the bookmark feature without a way to sort bookmarks, such as with tags or maybe even subfolders? Unless I am missing something, I think it might be better if I just resort to the browser bookmark feature.

Replies from: papetoast
comment by papetoast · 2024-06-29T03:07:41.369Z · LW(p) · GW(p)

I also mostly switched to browser bookmarks now, but I do think even this simple implementation of in-site bookmarks is overall good. Bookmarking in-site syncs across devices by default, and provides more integrated information.

comment by Crissman · 2024-06-15T01:17:26.417Z · LW(p) · GW(p)

Hello! I'm a health and longevity researcher. I presented on Optimal Diet and Exercise at LessOnline, and it was great meeting many of you there. I just posted about the health effects of alcohol.

I'm currently testing a fitness routine that, if followed, can reduce premature death by 90%. The routine involves an hour of exercise, plus walking, every week.

My blog is unaging.com. Please look and subscribe if you're interested in reading more or joining in fitness challenges!

Replies from: Screwtape
comment by Screwtape · 2024-06-15T04:37:23.200Z · LW(p) · GW(p)

Welcome Crissman! Glad to have you here.

I'm curious how you define premature death- or should I read more and find out on the blog?

Replies from: Crissman
comment by Crissman · 2024-07-31T09:58:54.921Z · LW(p) · GW(p)

Premature death is basically dying before you otherwise would on average. It's another term for increased all-cause mortality. If according to the actuarial tables you have a 1.0% chance of dying at your age and gender, but you have a 20% increased risk of premature death, then your chance is 1.2%.
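The arithmetic above is just a relative-risk multiplier applied to a baseline actuarial probability; a minimal sketch (function name and numbers are illustrative):

```python
# Relative risk multiplies the baseline annual probability of death
# taken from an actuarial table.
def annual_death_chance(baseline, relative_risk_increase):
    """baseline: actuarial chance of dying this year (e.g. 0.01).
    relative_risk_increase: e.g. 0.20 for a 20% increased risk."""
    return baseline * (1 + relative_risk_increase)

print(annual_death_chance(0.01, 0.20))  # about 0.012, i.e. ~1.2%
```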

And yes, please read more on the blog!

comment by notfnofn · 2024-06-14T14:21:29.695Z · LW(p) · GW(p)

At my local Barnes and Noble, I cannot access slatestarcodex.com or putanumonit.com. I have never had any issues accessing any other websites there (not that I've tried to access genuinely sketchy websites). The wifi there is titled Bartleby, likely related to Bartleby.com, whereas many other Barnes and Noble stores have wifi titled something like "BNWifi". I have not tried to access these websites at other locations yet.

Replies from: gilch
comment by gilch · 2024-06-23T05:41:01.828Z · LW(p) · GW(p)

Get a VPN. It's good practice when using public Wi-Fi anyway. (Best practice is to never use public Wi-Fi. Get a data plan. Tello is reasonably priced.) Web filters are always imperfect, and I mostly object to them on principle. They'll block too little or too much, or more often a mix of both, but it's a common problem in e.g. schools. Are you sure you're not accessing the Wi-Fi of the business next door? Maybe B&N's was down.

comment by papetoast · 2024-08-07T04:41:11.192Z · LW(p) · GW(p)

Seems like every new post - no matter the karma - is getting the "listen to this post" button now. I love it.

Replies from: habryka4
comment by habryka (habryka4) · 2024-08-07T05:01:52.503Z · LW(p) · GW(p)

Pretty sure that has been the case for a year plus, though I do agree that it's good.

comment by Hastings (hastings-greer) · 2024-08-27T18:27:43.163Z · LW(p) · GW(p)

I want to run code generated by an llm totally unsupervised

Just to get in the habit, I should put it in an isolated container in case it does something weird

Claude, please write a python script that executes a string as python code in an isolated docker container.
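For what it's worth, a minimal sketch of such a script might look like the following. It assumes Docker is installed locally; the image name, resource limits, and flags are illustrative hardening choices, not a full security review.

```python
# Run an untrusted code string inside a locked-down Docker container
# via the docker CLI.
import subprocess

def docker_command(code: str) -> list:
    """Build the `docker run` invocation without executing it."""
    return [
        "docker", "run", "--rm",
        "--network", "none",   # no network access
        "--memory", "256m",    # cap memory
        "--cpus", "1",         # cap CPU
        "--read-only",         # read-only filesystem
        "python:3.12-slim",    # illustrative image choice
        "python", "-c", code,
    ]

def run_untrusted(code: str, timeout: int = 30) -> str:
    """Execute the code string in the container and return stdout."""
    result = subprocess.run(
        docker_command(code),
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout

if __name__ == "__main__":
    print(docker_command("print('hello')")[:3])  # → ['docker', 'run', '--rm']
```

Of course, container escapes exist, so "isolated" is doing a lot of work here; for genuinely adversarial code a VM or gVisor-style sandbox is the more cautious choice.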

comment by Crazy philosopher (commissar Yarrick) · 2024-08-13T20:09:59.353Z · LW(p) · GW(p)

I realized something important about psychology that is not yet publicly known, or that is very little known compared to its importance (60%). I don't want to publish this as a regular post, because it may greatly help in the development of GAI (40% that it helps and 15% that it greatly helps), and I would like to help only those who are trying to create an aligned GAI. What should I do?

Replies from: Tapatakt
comment by Tapatakt · 2024-08-13T21:37:56.541Z · LW(p) · GW(p)

Everyone who is trying to create GAI is trying to create aligned GAI. But they think it will be easy (in the sense "not very super hard so they will probably fail and create misaligned one"), otherwise they wouldn't try in the first place. So, I think, you should not share your info with them.

Replies from: commissar Yarrick
comment by Crazy philosopher (commissar Yarrick) · 2024-08-14T05:03:06.137Z · LW(p) · GW(p)

I understand. My question is, can I publish an article about this so that only MIRI guys can read it, or send Eliezer an e-mail, or something.

Replies from: Tapatakt
comment by Tapatakt · 2024-08-14T12:02:36.280Z · LW(p) · GW(p)

Gretta Duleba is MIRI's Communication Manager. I think she is the person to ask about whom to write to.

comment by Mateusz Bagiński (mateusz-baginski) · 2024-07-21T06:57:24.706Z · LW(p) · GW(p)

I think I saw a LW post that was discussing alternatives to the vNM independence axiom. I also think (low confidence) it was by Rob Bensinger and in response to Scott's geometric rationality (e.g. this post [LW · GW]). For the life of me, I can't find it. Unless my memory is mistaken, does anybody know what I'm talking about?

Replies from: cubefox
comment by cubefox · 2024-07-22T08:25:09.363Z · LW(p) · GW(p)

I assume it wasn't this old post [LW · GW]?

Replies from: mateusz-baginski
comment by Mateusz Bagiński (mateusz-baginski) · 2024-07-22T08:59:54.604Z · LW(p) · GW(p)

Actually, it might be it, thanks!

comment by Nemo1342 · 2024-08-24T03:50:30.436Z · LW(p) · GW(p)