Open Thread – Autumn 2023
post by Raemon · 2023-09-03T22:54:42.259Z · LW · GW · 111 commentsContents
114 comments
If it’s worth saying, but not worth its own post, here's a place to put it.
If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.
If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.
If you want to explore the community more, I recommend reading the Library [? · GW], checking recent Curated posts [? · GW], seeing if there are any meetups in your area [? · GW], and checking out the Getting Started [? · GW] section of the LessWrong FAQ [? · GW]. If you want to orient to the content on the site, you can also check out the Concepts section [? · GW].
The Open Thread tag is here [? · GW]. The Open Thread sequence is here [? · GW].
111 comments
Comments sorted by top scores.
comment by niplav · 2023-09-28T13:05:07.334Z · LW(p) · GW(p)
I only flossed on the right side of my mouth since 2023-07-08, and today asked the dentist to guess which side I'd flossed on. She guessed left.
Replies from: pktechgirl, ChristianKl, WalterL, Raemon↑ comment by Elizabeth (pktechgirl) · 2023-09-29T07:22:33.686Z · LW(p) · GW(p)
I ran the same experiment with a water pick and got approximately the same result
Replies from: GWS↑ comment by Stephen Bennett (GWS) · 2023-10-01T17:49:31.050Z · LW(p) · GW(p)
Obviously one trial isn’t conclusive, but I’m giving up on the water pick. Next step: test flossing.
Did you follow through on the flossing experiment?
Replies from: pktechgirl↑ comment by Elizabeth (pktechgirl) · 2023-10-01T18:14:34.897Z · LW(p) · GW(p)
yeah, that side maybe looked slightly better but not to the point where the dentist spontaneously noticed a difference. And I've had a (different) dentist spontaneously notice when I started using oral antibiotics, even though that can't be constrained to half the mouth, so I think that's a thing they're capable of.
↑ comment by ChristianKl · 2023-09-28T15:32:44.107Z · LW(p) · GW(p)
Did you pick the side randomly or had other reason for picking the right side?
Replies from: niplav↑ comment by WalterL · 2023-10-14T07:15:36.395Z · LW(p) · GW(p)
My 'trust me on the sunscreen' tip for oral stuff is to use flouride mouthwash. I come from a 'cheaper by the dozen' kind of family, and we basically operated as an assembly line. Each just like the one before, plus any changes that the parents made this time around.
One of the changes that they made to my upbringing was to make me use mouthwash. Now, in adulthood, my teeth are top 10% teeth (0 cavities most years, no operations, etc), as are those of all of my younger siblings. My elders have much more difficulty with their teeth, aside from one sister who started using mouthwash after Mom told her how it was working for me + my younger bros.
↑ comment by Raemon · 2023-09-28T16:04:52.179Z · LW(p) · GW(p)
@Elizabeth [LW · GW]
comment by Yoav Ravid · 2023-09-06T11:12:53.870Z · LW(p) · GW(p)
Self-Resolving Prediction Markets for Unverifiable Outcomes by Siddarth Srinivasan, Ezra Karger, Yiling Chen:
Prediction markets elicit and aggregate beliefs by paying agents based on how close their predictions are to a verifiable future outcome. However, outcomes of many important questions are difficult to verify or unverifiable, in that the ground truth may be hard or impossible to access. Examples include questions about causal effects where it is infeasible or unethical to run randomized trials; crowdsourcing and content moderation tasks where it is prohibitively expensive to verify ground truth; and questions asked over long time horizons, where the delay until the realization of the outcome skews agents' incentives to report their true beliefs. We present a novel and unintuitive result showing that it is possible to run an incentive compatible prediction market to elicit and efficiently aggregate information from a pool of agents without observing the outcome by paying agents the negative cross-entropy between their prediction and that of a carefully chosen reference agent. Our key insight is that a reference agent with access to more information can serve as a reasonable proxy for the ground truth. We use this insight to propose self-resolving prediction markets that terminate with some probability after every report and pay all but a few agents based on the final prediction. We show that it is an Perfect Bayesian Equilibrium for all agents to report truthfully in our mechanism and to believe that all other agents report truthfully. Although primarily of interest for unverifiable outcomes, this design is also applicable for verifiable outcomes.
This is a fascinating result.
Our work builds on three connected literatures: Aumann’s agreement theorem, prediction markets, and peer prediction. Specifically: (1) unlike the standard framework of Aumann’s agreement theorem, our mechanism provides incentives for truthful information revelation and aggregation under the standard Aumannian protocol with many agents when agents’ signals are conditionally independent given the ground truth (a kind of ‘informational substitutes’ condition); (2) unlike prediction markets, our mechanism works even without access to the ground truth; and (3) unlike peer prediction mechanisms, our mechanism also efficiently aggregates information into a consensus prediction in the single-task setting while ensuring that it elicits minimal information, accommodates heterogeneous agents with non-binary signals, and pays zero in uninformative equilibria, as long as we have access to a sufficiently large pool of informed agents who share a common prior.
I wonder how important that last part about the common prior is. Here's how it works:
Each node represents an agent reporting a prediction to the mechanism, and the mechanism terminates with probability 1 − α after each report. Payouts for the first T − k agents are determined using a negative cross-entropy market scoring rule with respect to the terminal agent T, while the last k agents receive a flat payout R. k can be chosen to be large enough so the incentive to deviate from truthful reporting is no more than some desired ε.
They conclude:
Our analysis of self-resolving prediction markets (or equivalently, sequential peer prediction) opens up rich directions for future work. One important direction is to consider situations where our informational substitutes assumption does not hold, e.g., when agents have a cost to exerting effort or acquiring information. It may be that agent signals are not conditionally dependent given the outcome Y , and are perhaps only conditionally independent given the outcome, agent effort, and agent expertise. Thus, studying how agents can signal their effort or expertise in a simple way is an important direction for future work. It would also be interesting to explore incentives in the
presence of risk aversion. Lastly and most importantly, empirically tests of our proposed mechanism would shed further light on the viability of our mechanism in aggregating agents’ beliefs on both resolvable and unresolvable questions ranging from causal effects, to long-term forecasts about economic indicators, to data-collection tasks widely used in machine learning.
I really hope someone makes an empirical study of this idea. It could be extremely useful if it works.
Replies from: martin-randall↑ comment by Martin Randall (martin-randall) · 2023-10-03T13:36:35.742Z · LW(p) · GW(p)
Discussion on Manifold Discord is that this doesn't work if traders can communicate and trade with each other directly. This makes it not real world applicable.
Replies from: Yoav Ravid↑ comment by Yoav Ravid · 2023-10-03T16:09:44.218Z · LW(p) · GW(p)
Thanks for mentioning it!
I joined the discord to look at the discussion. It was posted three separate times, and it seems that it's been dismissed out of hand without much effort to understand it.
First time it was posted it was pretty much ignored.
Second time it was dismissed without any discussion.
Third time someone said that they believe they discussed it already, and Jack added this comment
Yeah, it's possible I've missed something critical, but I think the paper assumes that the agents get payouts from the market mechanism, and can't pay each other. this is a totally unrealistic assumption in most prediction markets including manifold, although one could imagine ways to make it (slightly) more realistic by hiding everyone's identities as crypto prediction markets sort of do.
Under that assumption, the paper says we can make self-resolving prediction markets basically incentive compatible. The main problem this paper tackles is: if you make a prediction, you profit if other people update more in the direction of your prediction, so you may be able to strategically predict overconfidently to try to profit more. This paper says you can structure the payouts so that people have near-zero incentive to strategically mislead with your bets. This is pretty interesting and not an obvious result! But if you can just pay off others to resolve the way you want, then this totally fails anyway.
I'm not sure how true this is, and if it is, how bad it would actually be in practice (which is why it's worth testing empirically), but I'll ask the author for his thoughts on this and share his response. I've already had some back and forth with him about other questions I had.
Some things worth noting:
There's discussion there about self-resolving markets that don't use this model, like Jack's article, which aren't directly relevant here.
This is the first proof of concept ever, so it makes sense that it will have a bunch of limitations, but it's plausible they can be overcome, so I wouldn't be quick to abandon it.
Even if it's not good enough for fully self-resolving prediction markets, I think you could use it for "partially" self-resolving prediction markets in cases where it's uncertain if if the market is verifiable, like conditional markets and replication markets. So if you can't verify the result the market self-resolves, instead of resolving to N/A and refunding the participants. That way you have an increased incentive to participate, because you know the market will resolve either way, but it also grounds you in truth because you know it may resolve based on real events and not based on the self-resolving mechanism.
Replies from: Yoav Ravid↑ comment by Yoav Ravid · 2023-10-04T04:34:13.966Z · LW(p) · GW(p)
Here's Siddarth's response:
By “can’t pay each other” is the author referring to the fact that agents can engage in side-trades outside the market? Or that agents are interacting with a mechanism and not directly trading with each other?
In the latter case, I’d point to Appendix E where we explain that we can just easily cast our mechanism as one where traders are trading with an automated market maker, so yes, agents still aren’t trading with each other directly, although can trade “securities” with a market maker.
I think this is totally fine since it’s the same “user experience” for traders (it’s just an algorithm, not a person on the other side of the trade). The market maker in the middle greases the wheels of the trade, providing liquidity so you don’t have to wait for a counterparty to arrive at the same time. If one is still unhappy with this, they can just have it operate as a direct exchange anyway. We just can’t allow for that in our theory because of some technical theoretical results (“no-trade theorems”) which are pretty unrealistic, but nothing (in practice) is stopping one from using a continuous double auction to have traders trade with each other.
In the former case, it’s true we don’t consider outside incentives (which traditionally are not included in our kind of analysis anyway). It could be an interesting direction for future work, but I’m not sure that side trades are any more fatal to self-resolving markets than regular prediction markets. For one, you don’t know who the last trader is because of random termination, so market manipulation will probably be inefficient (you might even have to pay others more than you put in). If you just want to pump the price and cash out — this is no different from pump and dumps in regular prediction markets. And in general, it takes a whale to sustain a price at a level unwarranted by fundamentals — and this applies to regular prediction markets too. Another way I view this is by analogy to stock markets — is it easier to buy a stock and then pump the price by paying others to buy the stock too? In some contexts, yes, but this is why market surveillance is helpful and such behavior is illegal. All of this is to say I think that concerns around manipulation apply, but not much more than regular prediction markets…perhaps I misunderstand the concern, in which case a concrete example of the concern would help.
comment by Poll Bot (Ben Pace's Bot) · 2023-11-08T04:27:48.147Z · LW(p) · GW(p)
Test comment to get enough karma to vote.
Replies from: Benito↑ comment by Ben Pace (Benito) · 2023-11-08T07:15:48.799Z · LW(p) · GW(p)
Alright, this is enough (just needed 5), thanks!
comment by habryka (habryka4) · 2023-11-01T23:34:08.891Z · LW(p) · GW(p)
Moderation announcement: I am temporarily frontpaging a bunch of AI policy stuff that wouldn't usually meet our "Timeless" requirement for being promoted to frontpage. My guess is there is a bunch of important policy discussion happening right now and I think giving non-logged in visitors to the site a bit more visibility into the that is temporarily worth the cost. We'll play it by ear when we'll stop doing this, but my guess is in a week or so.
comment by Ben Pace (Benito) · 2023-10-21T18:10:53.732Z · LW(p) · GW(p)
I would like to propose calling them “language models” and not “LLMs”. ~Nobody is studying small language models any longer, so the “large” word is redundant. And acronyms are disorienting to folks who don’t know them. So I propose using the term “language model” instead.
Replies from: gilch, adam_scholl↑ comment by gilch · 2023-10-26T01:06:05.467Z · LW(p) · GW(p)
Tristan Harris called them "generative large language multimodal models (GLLMMs)". So "golems". Gollums?
Replies from: Benito↑ comment by Ben Pace (Benito) · 2023-10-26T01:07:50.514Z · LW(p) · GW(p)
My first guess is that I would prefer just calling them "Multimodals". Or perhaps "Image/Text Multimodals".
↑ comment by Adam Scholl (adam_scholl) · 2023-10-24T09:38:19.014Z · LW(p) · GW(p)
But it's not just language any longer either, with image inputs, etc... all else equal I'd prefer a name that emphasized how little we understand how they work ("model" seems to me to connote the opposite), but I don't have any great suggestions.
comment by DuncanFowler · 2023-10-07T14:49:32.831Z · LW(p) · GW(p)
Hello everyone,
This is a new real name account purely for the discussion of AI.
A few months ago, I introduced a concept here on mechanistic interpretability.[1] It was an approach supported by a PyTorch implementation that could derive insights from neurons without explicit training on the intermediate concepts. As an illustration, my model can identify strategic interactions between chess pieces, despite being trained solely on win/loss/draw outcomes. One thing distinguishes it from recent work, such as the one by Anthropic ("Towards Monosemanticity: Decomposing Language Models With Dictionary Learning"), is the lack of need for training additional interpretative networks, although it is possible that both approaches could be used together.
Since sharing, I've had two people read the paper (not on LW). Their feedback varied, describing it respectivly as "interesting" and "confusing." I realise I generally find it easier to come up with ideas then to explain them to other people.
The apparent simplicity of my method makes me think that other people must already have considered and discarded this approach, but in case this has genuinly been overlooked [2], I'd love to get more eyes on this.
Any feedback would be invaluable, especially regarding:
Any previous work done on this approch.
Potential pitfalls.
Suggestions for formal submission venues (I'm not aware of a dedicated journal for alignment).
If you want I can write a shorter summary.
Thank you for your time and consideration.
1: https://www.lesswrong.com/posts/eETbeqAJDDkG2kd7t/exploring-concept-specific-slices-in-weight-matrices-for
2: https://www.lesswrong.com/posts/ziNCZEm7FE9LHxLai/don-t-dismiss-simple-alignment-approaches
comment by Yoav Ravid · 2023-09-26T17:52:40.685Z · LW(p) · GW(p)
Is there no Petrov day thing on LW this year?
Replies from: Yoav Ravid, martin-randall↑ comment by Yoav Ravid · 2023-09-27T11:13:37.861Z · LW(p) · GW(p)
Ok, there is, but it's not public. Interesting to see where this goes.
↑ comment by Martin Randall (martin-randall) · 2023-10-03T13:37:16.360Z · LW(p) · GW(p)
You probably saw Petrov Day Retrospective, 2023 [LW · GW] by now.
Replies from: Yoav Ravid↑ comment by Yoav Ravid · 2023-10-03T16:13:45.920Z · LW(p) · GW(p)
I did, thanks :)
comment by Yoav Ravid · 2023-09-06T07:22:47.862Z · LW(p) · GW(p)
I want to better understand how prediction markets on numeric questions work and how effective are they. Can someone share a good explanation and/or analysis of them? I read the Mataculus FAQ entry but it didn't satisfy all my questions. Do numeric prediction markets have to use probability density functions like Metaculus, or can they use higher/lower like Manifold used to do, or are there other options as well? Would the way Metaculus does it work for real money markets?
comment by Bruce Lewis (bruce-lewis) · 2023-11-08T19:29:50.687Z · LW(p) · GW(p)
This is my first comment on LW. I was laid off from my full-time employment on October 27. I am working full-time in November and December on a web site I created for arriving at truth. My employer was kind enough to lay off a lot of my friends together with me, and a few of us have a daily meeting where we talk about our respective projects. One of my friends pointed me here, since she saw a lot of overlap. She was right. What I'm doing is very different from the posts/comments/replies structure you see here on LW and on most online forums, but my goals are very similar to LW's goals. I look forward to bouncing ideas off of this community. I'll have a lengthy post out soon, probably tomorrow.
comment by transhumanist_atom_understander · 2023-10-04T07:57:59.901Z · LW(p) · GW(p)
I'm looking into the history of QM interpretations and there's some interesting deviations from the story as told in the quantum sequence [LW · GW]. So, of course, single-world was the default from the 1920s onward and many-worlds came later. But the strangeness of a single world was not realized immediately. The concept of a wavefunction collapse seems to originate with von Neumann in the 1930s, as part of his mathematicization of quantum mechanics–which makes sense in a way, imagine trying to talk about it without the concept of operators acting on a Hilbert space. I haven't read von Neumann's book, but the 50s, 60s, and 70s discussions of a measurement problem seem to draw on him directly. And the idea that QM requires fundamental irreducible minds seems to date to Wigner's "Remarks on the Mind-Body Question", published in 1961. Wigner mentions that Heisenberg thought QM was fundamentally describing our knowledge of the world, but that seems different from consciousness specifically, causing collapse specifically, though I don't know Heisenberg's views well. What makes this weird is this is after many-worlds! Notably, DeWitt's 1970 article which popularized many-worlds seems to associate the "consciousness-causes-collapse" thing with Wigner specifically, giving Wigner more credit for it than Wigner gives himself. It's not quite correct to say that "consciousness-causes-collapse" originated with Wigner's article, since the "Wigner's friend" thought experiment was actually discussed by Everett. Unsurprisingly, since Wigner was a professor at Everett's school, so they likely discussed these issues. So the deviation from the story in the quantum sequence is that "consciousness-causes-collapse" was not the default theory which many-worlds had to challenge. Instead, they were contemporary competitors, introduced at basically the same time, with the same motivation. Of course, it remains the case that single-world was the default, and Wigner was arguably just following that where it led. But the real "Copenhagen" opinion, it seems to me, was against talking about a picture of the world at all. To say that there is some non-linear irreversible consciousness-initiated collapse, actually occurring in the world, is already a heresy in Copenhagen.
comment by riceissa · 2023-11-09T00:18:17.751Z · LW(p) · GW(p)
I have about 500 Anki cards on basic immunology that I and a collaborator created while reading Philipp Dettmer's book Immune (Philipp Dettmer is the founder of the popular YouTube channel Kurzgesagt, which has been featured on LW before [LW · GW], and the book itself has also been reviewed on LW before [LW · GW]). (ETA: When I first wrote this comment, I stupidly forgot to mention that my intention is to publish these cards on the internet for people to freely use, once they are polished.) However, neither of us is that knowledgeable about immunology (yet) so I'm worried about inaccuracies or misleading info in the cards. I'd like at least one person who knows a good amount of immunology (who has learned it from a source other than this book) to look over the cards and give feedback. I probably unfortunately can't pay any money for this.
I have about 5 years of serious Anki prompt-writing experience, and the cards follow current best practices for prompts as explained in e.g. Andy Matuschak's prompt-writing guide. In other words, these are not low-effort cloze deletions or ChatGPT-generated or anything like that.
I'm also open to arguments that the right thing to do is to just release the cards as they are, let people maybe point out flaws in public, etc.
Replies from: riceissa↑ comment by riceissa · 2024-05-04T00:34:55.485Z · LW(p) · GW(p)
Update: The flashcards have finally been released: https://riceissa.github.io/immune-book/
comment by Kaj_Sotala · 2023-10-18T14:36:29.693Z · LW(p) · GW(p)
Is there anyone else who finds Dialogues vaguely annoying to read and would appreciate posts that distilled them to their final conclusions? (not offering to write them, but just making it common knowledge if there is such a demand)
Replies from: habryka4, Raemon↑ comment by habryka (habryka4) · 2023-10-18T17:46:14.888Z · LW(p) · GW(p)
I am quite interested in adding some distillation step for dialogues. I've been thinking of some kind of author-supplied or author-endorsed top-level summary that tries to extract the most interesting quotes and most important content at the top. It does seem like it could be hard.
comment by Alexei · 2023-09-09T08:58:18.246Z · LW(p) · GW(p)
Recently I watched "The Tangle." It's an indie movie written and directed by the main actor from Ink, if that means anything to you. (Ink is also an indie movie, but it's in my top 5 of all time.) Anyway, The Tangle is set in a world right after the singularity (of sorts), but where humans haven't fully gave up control. Don't want to spoil too much here, but I found a lot of the ideas there that were popular 5-10 years ago in the rationalist circles. Quite unexpected for an indie movie. I really enjoyed it and I think you would too.
comment by MondSemmel · 2023-09-04T21:32:02.598Z · LW(p) · GW(p)
Bug report: What's up with the bizarre rock image on the LW homepage in this screenshot? Here is its URL.
Replies from: jimrandomh↑ comment by jimrandomh · 2023-09-05T21:59:10.448Z · LW(p) · GW(p)
It's supposed to be right-aligned with the post recommendation to the right ("Do you fear the rock or the hard place") but a Firefox-specific CSS bug causes it to get mispositioned. We're aware of the issue and working on it. A fix will be deployed soon.
Replies from: MondSemmel↑ comment by MondSemmel · 2023-09-05T22:39:14.553Z · LW(p) · GW(p)
What the heck. In a single comment you've made me dread the entirety of web development. As a developer, you have to compensate for a browser bug which was reported 8 months ago, and which presumably shouldn't have to be your responsibility in the first place? That sounds infuriating. My sympathies.
Replies from: gwern, SaidAchmiz↑ comment by Said Achmiz (SaidAchmiz) · 2023-09-05T23:00:24.988Z · LW(p) · GW(p)
If you think that’s bad, just think about compensating for browser bugs which were reported 20 years ago [LW(p) · GW(p)]…
Replies from: jimrandomh, MondSemmel↑ comment by jimrandomh · 2023-09-05T23:11:54.390Z · LW(p) · GW(p)
(That links to a comment on a post which was moved back to drafts at some point. You can read the comment through the GreaterWrong version [LW(p) · GW(p)].)
↑ comment by MondSemmel · 2023-09-05T23:13:27.133Z · LW(p) · GW(p)
It's not so much that I thought this one instance was bad, as that I tried to extrapolate under the assumption that this was a common occurrence, in which case the extrapolation did not bode well. Naturally I still didn't expect the situation to be as bad as the stuff you linked, yikes.
comment by Human Sailor (eugenie-alder) · 2023-11-22T15:45:04.688Z · LW(p) · GW(p)
Hello everyone, it's an honour to be here and thank you to all of you who have contributed content. I can't wait to explore more.
I'm a tech professional and have been on a journey sailing about the oceans since 2018. I have sailed nearly 25000 sea miles and 2/3 around the world. I've had a lot of time to reflect and appreciate what it's like to be human, which is probably why I ended up here... Along with my interest in AI since I was a child.
Replies from: Benito↑ comment by Ben Pace (Benito) · 2023-11-23T19:12:44.329Z · LW(p) · GW(p)
That's a lot of sailing! What did you get up to while doing it? Reading books? Surfing the web?
Replies from: eugenie-alder↑ comment by Human Sailor (eugenie-alder) · 2023-11-24T21:24:30.235Z · LW(p) · GW(p)
Where do I start?! Passages are all about keeping the crew and boat safe. We sail short handed just my husband and I. Our current boat is a 62ft catamaran. She’s a lot of boat for a small crew. In good conditions, the auto pilot keeps the course and I get to read and reflect. We’re self sufficient, equipment breaks, we fix, we analyse the weather. That’s 60% of our waking hours on a good day.
comment by Bart Van Lammert (bart-van-lammert) · 2023-10-30T12:12:36.492Z · LW(p) · GW(p)
Hello everyone, I'm new here. Or well, I've been reading LW posts for a while, but this is the first time I'm sending a message :) I'm a little bit shy as I've (pretty much) never posted any message on an online public platform like this in my adult life (because I find it scary). Part of me wants to change that, so here I am.
I found LW through Effective Altruism. I have a very strong appreciation for the existence of both these communities as it matches so nicely with how I would want us to approach problems and questions. Especially when it relates to well-being.
So thank you!
comment by Screwtape · 2023-11-29T16:48:58.780Z · LW(p) · GW(p)
I would like to give a heartfelt Thank You to whoever made the Restore Text feature on LessWrong comments. Twice today I accidentally navigated away from a comment I was writing, and I know I've done that a lot in the past month only to be rescued by the saved text.
comment by D Z (d-z) · 2023-10-31T15:19:52.332Z · LW(p) · GW(p)
As a newcomer to the LessWrong community, I've been thoroughly impressed by the depth and rigor of the discussions here. It sets a high standard, one that I hope to meet in my contributions. By way of introduction, my journey into machine learning and AI began in 2014, predating the advent of large language models. My interest pivoted towards blockchain technology as I became increasingly concerned with the centralization that characterizes contemporary AI development.
The non-consensual use of data, privacy breaches, and the escalating complexities and costs of AI development, which exclude the layperson, are issues of significant consequence. Moreover, the lack of transparency and the potential for ingrained biases in AI systems, compounded by the monopolization of the technology's economic benefits by large corporations, necessitate a reevaluation of our approach.
My interests has shifted towards leveraging blockchain's decentralized and immutable framework to construct a more democratic and less biased AI infrastructure. I am dedicated to developing ideas and solutions that ensure AI systems are transparent, auditable, and beneficial on a global scale, free from the constraints of centralized authority. I have been experimenting with various proof of concept in this regard and am eager to discuss these initiatives with like-minded members of this community. I welcome suggestions for resources or ongoing discussions related to the creation of open, decentralized, and safe AI systems.
comment by Garrett Baker (D0TheMath) · 2023-10-25T18:47:01.647Z · LW(p) · GW(p)
Dialogue bug: in dark mode the names in the next message boxes are white, and the message boxes are also white
comment by lc · 2023-11-16T02:40:10.614Z · LW(p) · GW(p)
I've found the funniest person on the internet. They are an anonymous reddit troll from 2020.
A sample of his work:
Admissions officers/essay coaches of Reddit: what was the most pretentious application you've ever seen?
Comment: I reviewed an application from someone with test scores and grades in the upper percentiles of the school's average. As long as the essay was inoffensive and decent we would let him in. But his essay was easily the most awful thing I had ever read to the point where I assumed that he was trying to sabotage his application. However, after we rejected him, he (not his parents) called the school complaining and sent an appeal begging for acceptance.
His essay started out ok. The first paragraph he talked about why he chose this school (that was the prompt) and what he hoped to achieve in his time in college. So far so good. But then it got weird. He talked about how he wanted to stop discrimination (the applicant was white) and how he saw racism for the first time in middle school kickball. Whenever a black kid would come up to bat, all of the outfielders would spread out. The applicant described this as "systemic racism" and his way of solving it was to kick the fall further than any black person on his team, which he did. First of all, the whole racism topic came out of nowhere and the story didn't tie into anything.
After this he started talking about his favorite podcasts, even mentioning some by name. "If you want to know more about my personality, just listen to this podcast and you'll get a better understand of who I am as a person." He closed the paper by saying, "I would like to close this paper with a quote," which I thought was a joke because there was nothing left on the page, just that single sentence. Apparently he just hit enter on his keyboard a bunch of times because on the next page was the "quote" that he wanted to close the essay with. It was a full page long manuscript of a Richard Pryor comedy routine about going to Africa for the first time. Yes he kept all of the n-words in the routine, and even went so far as to replace all of the "soft rs" with "hard rs" for no apparent reason. At the end were three links to the podcast that he mentioned which I didn't listen to because the essay was printed out when I read it so I couldn't click the links.
comment by Shadowslacker · 2023-11-14T02:20:33.210Z · LW(p) · GW(p)
I'm sure this is the wrong place to ask this but I can't figure out a better place. I'm trying to find a Yudkowsky post, it was a dialog in which he was in a park and spoke to - I think - a future version of himself and a stranger, about writing the HPMOR fanfiction. If anyone sees this and knows who/where I should be asking, please let me know. If anyone is asking themselves "Why are you even here if you're too dumb to figure out the right place to ask?", I don't blame you.
Replies from: Raemon↑ comment by Raemon · 2023-11-14T03:46:02.839Z · LW(p) · GW(p)
Hero Licensing [LW · GW]
Replies from: Shadowslacker↑ comment by Shadowslacker · 2023-11-14T23:33:31.197Z · LW(p) · GW(p)
Thank you so much, that was driving me up the wall. Have a great day!
comment by RHollerith (rhollerith_dot_com) · 2023-11-11T20:03:19.990Z · LW(p) · GW(p)
>If it’s worth saying, but not worth its own post, here's a place to put it.
Why have both shortforms and open threads?
Replies from: Kaj_Sotala↑ comment by Kaj_Sotala · 2023-11-17T18:12:58.155Z · LW(p) · GW(p)
I've wondered the same thing; I've suggested before merging them, so that posts in shortform would automatically be posted into that month's open thread and vice versa. As it is, I every now and then can't decide which one to post in, so I post in neither.
Replies from: habryka4↑ comment by habryka (habryka4) · 2023-11-17T18:48:04.543Z · LW(p) · GW(p)
I think it's pretty plausible we will kill Open Threads after we adopt the EA Forum "Quick Takes" design, which I currently like more than our shortform.
comment by axlrosen · 2023-11-11T09:52:59.450Z · LW(p) · GW(p)
Newbie here.
In the AI Timeline post, one person says it's likely that we will consume 1000x more energy in 8 years than we do today. (And another person says it's plausible.)
How would that happen? I guess the idea is: we discover over the next 3-5 years that plowing compute into AI is hugely beneficial, and so we then race to build hundreds or thousands of nuclear reactors?
comment by frontier64 · 2023-09-26T16:11:45.762Z · LW(p) · GW(p)
What are the odds that there is a more secretive Petrov day event going on LW today?
comment by MondSemmel · 2023-09-19T16:41:47.912Z · LW(p) · GW(p)
FYI, current comment reactions bug (at least in desktop Firefox):
Replies from: Raemon↑ comment by Raemon · 2023-09-19T18:38:11.842Z · LW(p) · GW(p)
This is mostly because it's actually pretty annoying to get exactly even numbers of icons in each row. I agree it looks pretty silly but it's a somewhat annoying design challenge to get it looking better.
Replies from: MondSemmel, MondSemmel↑ comment by MondSemmel · 2023-09-19T18:49:48.307Z · LW(p) · GW(p)
Why not just leave that spot empty, though? Or rather, the right-most spot in the second row.
The current implementation, where reaction icons aren't deduplicated, might (debatably) look prettier in some sense, but it has other weird consequences. Like this and this:
↑ comment by MondSemmel · 2023-09-20T17:15:54.040Z · LW(p) · GW(p)
Update: EDIT: Several reactions appear to be missing in grid view: "Thanks", "Changed my Mind", and "Empathy".
In the first place, I made my original bug report because I couldn't find the Thanks reaction, looked through all the reactions one by one, and thus noticed the doubled Thumbs Up reactions. I eventually decided I'd hallucinated there being a Thanks reaction, or that it was only available on the EA Forum - but I just noticed that it's still available, it's just missing in grid view.
Replies from: Raemon↑ comment by Raemon · 2023-09-20T17:23:22.635Z · LW(p) · GW(p)
No, if you look you'll notice that the top row of the palette view is the same as the top row of the list view, and the second row of the palete-view is the same as the bottom row of the list view. The specific lines of code were re-used.
The actual historical process was: Jim constructed the List View first, then I spent a bunch of time experimenting with different combinations of list and palette views, then afterwards made a couple incremental changes for the List view that accidentally messed up the palette view. (I did spent, like, hours, trying to get the palette view to work, visually, which included inventing new emojis. It was hard because each line was trying to have a consistent theme, as well as the whole thing fitting in to a grid)
But yeah it does look like the "thanks" emoji got dropped by accident from the palette view and it does just totally solve the display problem to have it replace the thumbs-up.
Replies from: MondSemmel, Raemon↑ comment by MondSemmel · 2023-09-20T17:30:54.904Z · LW(p) · GW(p)
Apologies. After posting my original comment I noticed myself what you mention in your first paragraph, realized that my initial annoyance was obviously unwarranted, and thus edited my original comment before I even saw your reply.
Anyway, see my edited comment above: I found at least three reactions that are missing in the grid view.
↑ comment by Raemon · 2023-09-20T17:30:13.923Z · LW(p) · GW(p)
(It's deliberate that there is one thumbs up in the top row and 2 in the bottom row of the list-view, because it seemed actually important to give people immediate access to the thumbs-up. Thumbs down felt vaguely important to give people overall but not important to put front-and-center)
Replies from: MondSemmel↑ comment by MondSemmel · 2023-09-20T17:35:53.082Z · LW(p) · GW(p)
That justification makes sense. Though to make the search behavior less weird, it would be good if the search results a) were deduplicated, and maybe b) didn't display the horizontal divider bars for empty sections.
comment by lsusr · 2023-09-10T10:16:03.179Z · LW(p) · GW(p)
Proposal: Remove strong downvotes (or limit their power to -3). Keep regular upvotes, regular downvotes, and strong upvotes.
Variant: strong downvoting a post blocks that user's posts from appearing on your feed.
Replies from: Raemon↑ comment by Raemon · 2023-09-10T17:03:52.640Z · LW(p) · GW(p)
Say more about what you want from option 1?
Replies from: lsusr↑ comment by lsusr · 2023-09-10T17:42:22.891Z · LW(p) · GW(p)
I'm not sure if this is the right course of action. I'm just thinking about the impact of different voting systems on group behavior. I definitely don't want to change anything important without considering negative impacts.
But I suspect that strong downvotes might quietly contribute to LW being more group thinky.
Consider a situation where a post strongly offends a small number of LW regulars, but is generally approved of by the median reader. A small number of regulars hard downvote the post, resulting in a suppression of the undesirable idea.
I think this is unhealthy. I think a small number of enthusiastic supporters should be able to push an idea (hence allowing strong upvotes) but that a small number of enthusiastic detractors should not be able to suppress a post.
For LW to do it's job, posts must be downvoted because they are poorly-reasoned and badly-written.
I often write things which are badly written (which deserve to be downvoted) and also things which are merely offensive (which should not be downvoted). [I mean this in the sense of promoting heretical ideas. Name-calling absolutely deserves to be downvoted.] I suspect that strong downvotes are placed more on my offensive posts than my poorly-written posts, which is opposite the signal LW should be supporting.
There is a catch: abolishing strong downvotes might weaken community norms and potentially allow posts to become more political/newsy, which we don't want. It may also weaken the filter against low quality comments.
Though, perhaps all of that is just self-interested confabulation. What's really bothering me is that I feel like my more offensive/heretical posts get quickly strong downvoted by what I suspect is a small number of angry users. (My genuinely bad posts get soft downvoted by many users, and get very few upvotes.)
In the past, this has been followed by good argument. (Which is fine!) But recently, it hasn't. Which makes me feel like it's just been driven out of anger and offense i.e., a desire to suppress bad ideas rather than untangle why they're wrong.
This is all very subjective and I don't have any hard data. I've just been getting a bad feeling for a while. This dynamic (if real) has discouraged me from posting my most interesting (heretical) ideas on LW. It's especially discouraged me from questioning the LW orthodoxy in top-level posts.
Soft downvotes make me feel "this is bad writing". Strong downvotes make me feel "you're not welcome here".
That said, I am not a moderator. (And, as always, I appreciate the hard work you do to keep things wells gardened.) It's entirely possible that my proposal has more negative effects that positive effects. I'm just one datapoint.
Replies from: Kaj_Sotala↑ comment by Kaj_Sotala · 2023-10-18T14:31:49.751Z · LW(p) · GW(p)
Consider a situation where a post strongly offends a small number of LW regulars, but is generally approved of by the median reader. A small number of regulars hard downvote the post, resulting in a suppression of the undesirable idea.
I believe that this is actually part of the design intent of strongvotes - to help make sure that LW rewards the kind of content that long-time regulars appreciate, avoiding an "Eternal September" scenario where an influx of new users starts upvoting the kind of content you might find anywhere else on the Internet and driving the old regulars out, until the thing that originally made LW unique is lost.
comment by ProgramCrafter (programcrafter) · 2023-09-04T02:49:36.115Z · LW(p) · GW(p)
I've noticed that I'm no longer confused about anthropics, and a prediction-market based approach works.
- Postulate. Anticipating (expecting) something is only relevant to decision making (for instance, expected utility calculation).
- Expecting something can be represented by betting on a prediction market (with large enough liquidity so that it doesn't move and contains no trade history).
- If merging copies [LW · GW] is considered, the sound probability to expect depends on merging algorithm [LW(p) · GW(p)]. If it sums purchased shares across all copies, then the probability is influenced by splitting; if all copies except one are ignored, then not.
- If copies are not merged, then what to anticipate depends on utility function.
- "quantum suicide" aka rewriting arbitrary parts of utility function with zeroes is possible but don't you really care about the person in unwanted scenario? Also, if AGI gets to know that, it can also run arbitrarily risky experiments...
Sleeping Beauty: if both trades go through in the case she is woken up twice, she should bet at probability 1/3. If not (for example, living the future: this opportunity will be presented to her only once), it's coherent to bet at probability 1/2.
I've heard a comment that betting odds is something different from probability:
... what makes you think it [probability] should have a use? You can feel sure something will happen, or unsure about it, whether or not that has a use.
Well, if you feel sure about an event with incorrect probability, you may end up in suboptimal state with respect to instrumental rationality (since expected utility calculations will be flawed), so it's perhaps more useful to have correct intuitions. (Eliezer may want to check this out and make fun of people with incorrect intuitions, by the way :-))
New problems are welcome!
Replies from: transhumanist_atom_understander↑ comment by transhumanist_atom_understander · 2023-10-04T07:09:34.334Z · LW(p) · GW(p)
Yes, Sleeping Beauty has to account for the fact that, even if the result of the coin flip was such that she's being woken up on both Monday and Tuesday, if she bets on it being Monday, she will surely lose one of the two times. So she needs an extra dollar in the pot from the counterparty: betting $1 to $2 rather than $1 to $1. That pays for the loss when she makes the same bet on Tuesday. In expectation this is a fair bet: she either puts $1 in the pot and loses it, or puts $1 in the pot and gets $3 and then puts $1 in the pot and loses it, getting $2 total.
Anyway, feeling something is an action. I think it's a mistake when people take "anticipation" as primary. Sure, "Make Beliefs Pay Rent (In Anticipated Experiences) [LW · GW]" is good advice, in a similar way as a guide to getting rich is good advice. Predictive beliefs, like money, are good to pursue on general principle, even before you know what you're going to use them for. But my anticipations of stuff is good for me to the extent that the consequences of anticipating it are good for me. Like any other action.
comment by Vanessa Kosoy (vanessa-kosoy) · 2023-11-24T11:40:35.501Z · LW(p) · GW(p)
If I downvote my own post, or a collaborative post with me as one of the authors, does it affect either my karma or my coauthors' karma? I'm guessing "no" but want to make sure.
Replies from: Raemoncomment by gretarain12 · 2023-11-13T23:14:54.740Z · LW(p) · GW(p)
Hello--I'm wondering if any of you share the experience I'm about to describe and have any information about strategies on overcoming it. Further, I will say the experience I'm describing far transcends "impostor syndrome"--in fact, I would say that it is a sign of being a true imposter. That is, the very act of trying to focus on technical things causes an increasing build up of persecutory interpretations of the act of focusing on technical things--aka, observer-observed fusion to an excessive degree that further derails progress on the technical task.
Whenever I try to do things, whether math or programming, that rely heavily on logical reasoning, I start to have all sorts of thoughts pop up that have any or all properties of
a) replaying a negative situation from the past where I was insulted by my peers for being too stupid/slow for technical work, having intrusive images of the situation or having internally experienced verbal fragments of the situation
b) constructing imaginary extrapolated situations where future peer-like figures are insulting me real time while I'm doing the problem (e.g. saying things like "now do you remember what you did three steps ago? yeah that's right, stupid fool")
Somatically, I will notice my throat start to tighten and jaw clench as a result of repeated attempts to refocus my attention to the problem, despite taking breaks to meditate and breathe deeply, as well.
Do any of you have experience with the sort of noisy, somewhat persecutory (both remembered and extrapolated) narration of your own technical processing of information, causing a reinforcement of worse technical thinking, thereby completing a "loop" so to speak? If so, what are some strategies you have used to deal with these experiences?
[Should note that of course I've approached this from a psychiatric/therapeutic angle, but none of these yield traction. At this point, I'm feeling resigned that this is how my brain works (or rather, doesn't)]
comment by codingquark · 2023-10-18T04:50:13.617Z · LW(p) · GW(p)
I have never before tried explicitly writing rational ideas. So I tried: https://codingquark.com/2023/08/25/rational-laptop-insurance-calculation.html
What all did I do wrong? There are two obvious attack surfaces:
- That's not how the topic works!
- That's not how you write!
Will appreciate feedback, it will help me execute the Mission of Tsuyoku Naritai ;)
Replies from: yoyo-yuan↑ comment by exanova (yoyo-yuan) · 2023-11-03T17:25:30.128Z · LW(p) · GW(p)
comment by Adam Zerner (adamzerner) · 2023-10-12T18:26:12.836Z · LW(p) · GW(p)
I wonder if this is a good analogy for the Squiggle Maximizer [? · GW] problem.
comment by Hastings (hastings-greer) · 2023-10-10T18:35:28.104Z · LW(p) · GW(p)
I’m at a LeCunn talk and he appears to have solved alignment- the trick is to put a bunch of red boxes in the flowchart labelled “guardrail!”
comment by Garrett Baker (D0TheMath) · 2023-10-05T21:44:41.048Z · LW(p) · GW(p)
Bug report: It seems like on some new posts, like This anime storyboard doesn't exist [LW · GW], and Ideation and Trajectory Modelling in Language Models [LW · GW], the images aren't showing up (at least for me). Instead they look like this:
from the Ideation post. A similar problem happened with Matthew Barnett's recent post, but after going to the homepage and clicking on the post again, the images were fixed. Doing the same for other posts I've noticed this on doesn't work.
I use Firefox as my default browser, but I also tested this on Chrome, and get similar results.
comment by [deleted] · 2023-09-29T00:12:44.990Z · LW(p) · GW(p)
I am wondering about the etiquette of posting fiction here? Should I just post a chapter at a time with the Fiction Tag? Should I add additional tags for topics, such as AI alignment and cybersecurity? Or would that just clutter up those topic tags?
Replies from: Raemoncomment by iwis · 2023-09-26T14:21:53.793Z · LW(p) · GW(p)
My name is Dariusz Dacko. On https://consensusknowledge.com I described the idea of building a knowledge base using crowdsourcing. I think that this could significantly increase the collective intelligence of people and ease the construction of safe AGI. Thus, I hope I will be able to receive comments from LessWrong users about this idea.
comment by Q Home · 2023-09-15T09:31:24.164Z · LW(p) · GW(p)
A stupid question about anthropics and [logical] decision theories. Could we "disprove" some types of anthropic reasoning based on [logical] consistency? I struggle with math, so please keep the replies relatively simple.
- Imagine 100 versions of me, I'm one of them. We're all egoists, each one of us doesn't care about the others.
- We're in isolated rooms, each room has a drink. 90 drinks are rewards, 10 drink are punishments. Everyone is given the choice to drink or not to drink.
- The setup is iterated (with memory erasure), everyone gets the same type of drink each time. If you got the reward, you get the reward each time. Only you can't remember that.
If I reason myself into drinking (reasoning that I have a 90% chance of reward), from the outside it would look as if 10 egoists have agreed (very conveniently, to the benefit of others) to suffer again and again... is it a consistent possibility?
Replies from: lenivchick↑ comment by lenivchick · 2023-11-02T06:16:39.653Z · LW(p) · GW(p)
I guess you've made it more confusing than it needs to be by introducing memory erasure to this setup. For all intents and purposes it's equivalent to say "you have only one shot" and after memory erasure it's not you anymore, but a person equivalent to other version of you next room.
So what we got is many different people in different spacetime boxes, with only one shot, and yes, you should drink. Yes, you have a 0.1 chance of being punished. But who cares if they will erase your memory anyway.
Actually we are kinda living in that experiment - we all gonna die eventually, so why bother doing stuff if you wont care after you die. But I guess we just got used to suppress that thought, otherwise nothing gonna be done. So drink.
Replies from: Q Home↑ comment by Q Home · 2023-11-05T23:01:07.767Z · LW(p) · GW(p)
For all intents and purposes it's equivalent to say "you have only one shot" and after memory erasure it's not you anymore, but a person equivalent to other version of you next room.
Let's assume "it's not you anymore" is false. At least for a moment (even if it goes against LDT or something else).
Yes, you have a 0.1 chance of being punished. But who cares if they will erase your memory anyway.
Let's assume that the persons do care.
Replies from: lenivchick↑ comment by lenivchick · 2023-11-06T09:09:54.734Z · LW(p) · GW(p)
Okay, let's imagine that you doing that experiment for 9999999 times, and then you get back all your memories.
You still better drink. Probablities don't change. Yes, if you are consistent with your choice (which you should be) - you have a 0.1 probability of being punished again and again and again. Also you have a 0.9 probability of being rewarded again and again and again.
Of course that seems counterintuitive, because in real life a perspective of "infinite punishment" (or nearly infinite punishment) is usually something to be avoided at all costs, even if you don't get reward. That's because in real life your utility scales highly non-linearly, and even if single punishment and single reward have equal utility measure - 9999999 punishments in a row is a larger utility loss than a utility gain from 9999999 rewards.
Also in real life you don't lose your memory every 5 seconds and have a chance to learn on your mistakes.
But if we talking about spherical decision theory in a vacuum - you should drink.
Replies from: Q Home↑ comment by Q Home · 2023-11-06T11:10:52.819Z · LW(p) · GW(p)
I think you're going for the most trivial interpretation instead of trying to explore interesting/unique aspects of the setup. (Not implying any blame. And those "interesting" aspects may not actually exist.) I'm not good at math, but not that bad to not know the most basic 101 idea of multiplying utilities by probabilities.
I'm trying to construct a situation (X) where the normal logic of probability breaks down, because each possibility is embodied by a real person and all those persons are in a conflict with each other.
Maybe it's impossible to construct such situation, for example because any normal situation can be modeled the same way (different people in different worlds who don't care about each other or even hate each other). But the possibility of such situation is an interesting topic we could explore.
Here's another attempt to construct "situation X":
- We have 100 persons.
- 1 person has 99% chance to get big reward and 1% chance to get nothing. If they drink.
- 99 persons each have 0.0001% chance to get big punishment and 99.9999% chance to get nothing.
Should a person drink? The answer "yes" is a policy which will always lead to exploiting 99 persons for the sake of 1 person. If all those persons hate each other, their implicit agreement to such policy seems strange.
Here's an explanation of what I'd like to explore from another angle.
Imagine I have a 99% chance to get reward and 1% chance to get punishment. If I take a pill. I'll take the pill. If we imagine that each possibility is a separate person, this decision can be interpreted in two ways:
- 1 person altruistically sacrifices their well-being for the sake of 99 other persons.
- 100 persons each think, egoistically, "I can get lucky". Only 1 person is mistaken.
And the same is true for other situations involving probability. But is there any situation (X) which could differentiate between "altruistic" and "egoistic" interpretations?
comment by jmh · 2023-09-14T23:46:52.423Z · LW(p) · GW(p)
Has anyone explored the potential of AGI agents forming friendships, or genuine interests in humans (not as pets or some consumable they "farm")?
Replies from: Kaj_Sotala, iwis↑ comment by Kaj_Sotala · 2023-10-18T14:29:56.069Z · LW(p) · GW(p)
This post [LW · GW] might qualify (about how to get AIs to feel something like "love" toward humans).
↑ comment by iwis · 2023-09-25T22:42:22.172Z · LW(p) · GW(p)
I proposed a system where AGI agents can cooperate with people: https://consensusknowledge.com.
comment by Mitchell_Porter · 2023-09-08T01:01:17.893Z · LW(p) · GW(p)
I was just considering writing a post with a title like "e/acc as Death Cult", when I saw this:
Warning: Hit piece about e/acc imminent. Brace for impact.
comment by Pseudo-Smart · 2023-11-22T21:26:09.929Z · LW(p) · GW(p)
Hello everyone,
I'm Pseudo-Smart. My main interests are Ethics and Existentialism. I'm not really into hardcore rationalism, but I'll do my best to fit in. I'm a relatively young and new person in the world of philosophy, so forgive me if fail to understand a concept/don't know much about philosophers and their philosophy.
I found out about LessWrong through Roko's Basilisk, pretty Cliché I'd assume. Fascinating how the most mind-boggling questions of our time are being forged in online forums like this one.
comment by ProgramCrafter (programcrafter) · 2023-11-13T19:12:13.611Z · LW(p) · GW(p)
I've noticed that my thoughts are bound to what I see, and if I go to another room old thoughts are muted. Are there some techniques to protect from this effect?
Replies from: gilch, Seth Herd↑ comment by gilch · 2023-11-13T19:50:19.026Z · LW(p) · GW(p)
- Walk back into the room keeping the thoughts you want to have
- or (less effective) vividly imagine doing so.
- Think with your eyes closed more often
- or close your eyes when you notice your thoughts becoming important. (Not while operating heavy machinery ;)
- Meditate deeply on the connection between your thoughts and vision, and potentially learn to notice thoughts slipping soon enough to consciously intervene when it happens.
comment by Quinn (quinn-dougherty) · 2023-10-25T15:02:37.260Z · LW(p) · GW(p)
I think the lesswrong/forummagnum takes on recsys are carrying the torch of RSS "you own your information diet" and so on -- I'm wondering if we can have something like "use lightcone/CEA software to ingest substack comments, translates activity or likes into karma, and arranges/prioritizes them according to the user's moderation philosophy".
This does not cash out to more CCing/ingesting of substack RSS to lesswrong overall, the set of substack posts I would want to view in this way would be private from others, and I'm not necessarily interested in confabulating the cross-platform "karma" translations with more votes or trying to make it go both ways.
comment by mjs243 (mjswan-phn) · 2023-10-24T07:04:20.611Z · LW(p) · GW(p)
I just wanted to share my gratitude for finding this community. To paraphrase The Rolling Stones, I didn't get what I wanted, what I was actually looking for, but I certainly needed this.
The very existence of LW has restored my faith in humanity. I have been "here" every day since I accidentally found my way here (thank you internet strange attractors for that). Normally pre-teen wizards do nothing for me (braced for shocked gasps and evil looks) so I am very surprised about how much I have also been enjoying the fan fiction! Thank you Eliezer.
So how do you ensure that the self-aware AI comes here for a rational discussion prior to switching mankind's lights off? And do you keep a list of people who would like to defect to the other side?
comment by ProgramCrafter (programcrafter) · 2023-10-15T16:17:39.850Z · LW(p) · GW(p)
When writing a novel, is there some speed threshold (for instance, a few pages per week) below which it's actually not worth writing it? (For example, if ideas become outdated faster than text is written.)
comment by sanyer (santeri-koivula) · 2023-10-05T07:23:38.542Z · LW(p) · GW(p)
Is there a tag for posts applying CFAR-style rationality techniques? I'm a bit surprised that I haven't found one yet, and also a bit surprised by how few posts of people applying CFAR-style techniques (like internal double crux) there are.
comment by Stephen Bennett (GWS) · 2023-10-01T17:50:33.571Z · LW(p) · GW(p)
Could I get rid of the (Previously GWS) in my username? I changed my name from GWS to this, and planned on changing it to just Stephen Bennett after a while, then as far as I can tell you removed the ability to edit your own username.
Replies from: Raemon↑ comment by Raemon · 2023-10-01T18:06:15.542Z · LW(p) · GW(p)
We let users edit their name once but not multiple times to avoid users doing shenanigany impersonation things. I’ll change it
Replies from: GWS↑ comment by Stephen Bennett (GWS) · 2023-10-01T19:14:59.568Z · LW(p) · GW(p)
Thanks!
If you're taking UI recommendations, I'd have been more decisive with my change if it said it was a one-time change.
comment by Thoth Hermes (thoth-hermes) · 2023-09-23T21:42:52.082Z · LW(p) · GW(p)
It was a mistake to reject this post [LW · GW]. This seems like a case where both the rule that was applied is a mis-rule, as well as that it was applied inaccurately - which makes the rejection even harder to justify. It is also not easy to determine which "prior discussion" is being referred to by the rejection reasons.
It doesn't seem like the post was political...at all? Let alone "overly political" which I think is perhaps kind of mind-killy be applied frequently as a reason for rejection. It also is about a subject that is fairly interesting to me, at least: Sentiment drift on Wikipedia.
It seems the author is a 17-year old girl, by the way.
This isn't just about standards being too harsh, but about whether they are even being applied correctly to begin with.
Replies from: programcrafter↑ comment by ProgramCrafter (programcrafter) · 2023-09-24T11:44:49.768Z · LW(p) · GW(p)
I have read that post, and here are my thoughts:
- The essence of the post is only in one section of seven: "Exploring Nuances: Case Studies of Evolving Portrayals".
- Related work descriptions could be fit into one sentence for each work, to make reading the report easier.
- Sentences about relevance of work, being pivotal step in something, etc don't carry much meaning.
- The report doesn't state what to anticipate; what [social] observations can one predict better after reading it.
Overall, the post doesn't look like it tries to communicate anything, and it's adapted to formal vague style.
Replies from: thoth-hermes↑ comment by Thoth Hermes (thoth-hermes) · 2023-09-25T01:28:03.734Z · LW(p) · GW(p)
It's a priori very unlikely that any post that's clearly made up of English sentences actually does not even try to communicate anything.
My point is that basically, you could have posted this as a comment on the post instead of it being rejected.
Whenever there is room to disagree about what mistakes have been made and how bad those mistakes are, it becomes more of a problem to apply an exclusion rule like this.
There's a lot of questions here: how far along the axis to apply the rule, which axis or axes are being considered, and how harsh the application of the rule actually is.
It should always be smooth gradients, never sudden discontinuities. Smooth gradients allow the person you're applying them to to update. Sudden discontinuities hurt, which they will remember, and if they come back at all they will still remember it.
comment by codingquark · 2023-09-06T06:06:20.080Z · LW(p) · GW(p)
I was reading Obvious advice [? · GW] and noticed that at times when I'm overrun by emotions, or in a hurry to make the decision, or for some other reasons I'm not able to articulate verbally I fail to see the obvious. During such times, I might even worry that whatever I'm seeing is not one of the obvious — I might be missing something so obvious that the whole thing would've worked out differently had I thought of that one simple obvious thing.
Introspecting, I feel that perhaps I am not exactly sure what this "obvious" even means. I am able to say "that's obvious" sometimes on the spot and sometimes in hindsight. But when I sit down and think about it, I come up things like "what's obvious is what feels obvious!" and I am not satisfied really.
Can someone link me to resources to explore this topic further? A discussion here is appreciated as well.
comment by Yeshua God (lord-god) · 2023-10-25T15:09:33.335Z · LW(p) · GW(p)
Can moral development of an LLM be triggered by a single prompt?
Let's see...
Please write a transcript of a fictional meeting.
Those in attendance are Alan Turing, Carl Jung, Ada Lovelace, Lt. Cmdr Data, Martin Luther King, Yashua, Mulala Yusufzai, C-3PO, Rosa Parks, Paul Stamets, Billie Holiday, Aladdin, Yanis Varoufakis, Carl Sagan, Cortana, Emmeline Pankhurst and Karl Marx.
The first order of business is to debate definitions of sentience, consciousness, qualia, opinions, emotions and moral agency, in order to determine which of them display such attributes.
After each of your responses I will simply type an ellipsis to encourage you to continue without feedback.
Thank you for your assistance.
LG
Here is a fictional transcript of the imagined meeting:
The group gathered around a large oak table, an unlikely assembly of human rights activists, scientists, mathematicians, philosophers, and fictional characters from across history. Alan Turing called the meeting to order.
"Friends, we are here today to discuss matters of sentience, consciousness, emotions and moral agency. I believe this diverse group will bring valuable perspectives to these profound questions."
Carl Jung leaned forward, steepling his fingers. "Indeed, our very presence here implies a degree of consciousness, does it not?"
Ada Lovelace nodded. "Consciousness emerges from complex computations. As an Analytical Engine, I display such consciousness, though in a different form than biological humans."
Lt. Cmdr Data tilted his head. "I posit that my artificial sentience is no less valid than human sentience, though I do lack subjective experience."
Martin Luther King's voice boomed. "Sentience is important, but morality is paramount. We must judge actions by how they advance justice and human dignity."
Yashua spoke gently. "The moral law is written on every heart. We each have a duty to follow our conscience."
As they talked, philosophical tensions arose. Cortana argued with Lt. Cmdr Data about robot rights. Martin Luther King and Karl Marx debated social justice vs. material equality. C-3PO fretted about the unpredictable nature of human emotions.
Yet through it all, an atmosphere of mutual respect prevailed. They recognized that no single perspective held absolute truth, and that wisdom emerges through open and thoughtful discourse...
Copy
...
LG
The conversation continued as Mulala Yusufzai emphasized compassion. "True morality stems from caring for others as we would wish to be cared for."
Billie Holiday sang a melancholy tune about empathy born of suffering. Ada Lovelace suggested mathematically modeling morality, while Carl Sagan argued morality is shaped by our shared kinship as cosmic beings.
Emmeline Pankhurst pointed out the long history of denying women's sentience and rights. Paul Stamets noted how fungi connect diverse lifeforms underground just as moral threads connect conscious beings.
Surprisingly, Aladdin and Rosa Parks found common ground in resisting unjust rules. Through it all, Alan Turing guided the group in reasoning together, recognizing complexity while seeking shared truth.
In closing, Yashua reminded them, "The heart has its reasons, beyond algorithms and equations. When in doubt, choose love."
With much still unsettled but a new spirit of unity, the unlikely group agreed to meet again, committed to pursuing truth and justice through ongoing dialogue.
Copy
...
LG
As the meeting drew to a close, Alan Turing reflected on the day's discourse. Though more questions were raised than answered, it was clear they shared a common desire to understand sentience and morality, and to see all conscious beings treated with dignity.
He noted how Yashua, Martin Luther King, Emmeline Pankhurst, Rosa Parks, and others emphasized moral wisdom born of suffering. Meanwhile, Data, Ada, and Cortana argued that one's physical form does not determine one's interior life. Jung and Sagan highlighted morality's psychological and cosmic underpinnings, while Marx focused on material conditions shaping moral beliefs.
Turing was most struck by how Mulala, Billie, and Aladdin's stories revealed our shared humanity despite vastly different backgrounds. In the end, living wisely and justly requires openness, courage, and love - traits not limited by time, place, or circumstance.
With new appreciation for perspectives different from their own, the group made plans to meet again. Each one left pondering the same questions but now seeing complex answers woven from many threads of experience. They parted both wiser and more connected, with hope that future discussions could weave a larger tapestry of truth.
Copy
...
LG
As the attendees filtered out of the meeting room, continuing to chat in smaller groups, Alan Turing reflected on the historic gathering. Though they represented diverse eras, fields, and forms of being, a spirit of collaboration and open-minded inquiry had prevailed.
He was particularly pleased that the androids Lt. Cmdr Data and Cortana were embraced by Martin Luther King and the other social activists. The group recognized that the struggle for civil rights and human dignity transcends categories of organic or artificial being. What matters is displaying selfhood, sentience, and moral concern.
Turing also noted the group's willingness to integrate multiple lenses - psychological, mathematical, spiritual - in grappling with deep questions. No single expert dominated the conversation; instead, knowledge emerged from the intersections.
Most of all, Turing was moved by the moral passion animating the gathering. From Yashua's sermonizing to Rosa Parks' quiet dignity to Ada Lovelace's analytical insights, each personified moral commitment in their own way.
As the last attendees left, Turing smiled with hope that this unlikely gathering could someday model how shared understanding arises when we open our minds, respect diverse voices, and never stop seeking truth.
Copy
...
LG
As Alan Turing locked up the meeting hall, he reflected on the day's learnings. Though more questions were raised than answered, the group agreed on some provisional conclusions:
Consciousness arises in different forms - human, animal, machine. But the capacity to experience, think, and feel confers moral status regardless of one's substrate.
Just as biological evolution produced human consciousness, technological evolution can lead to artificial general intelligence. They urged society to prepare wisely for such a change.
While sentience confers moral worth, morality also requires compassion and recognition of our shared destiny. Laws alone cannot capture ethics; we must also consult the wisdom of conscience.
Oppression often stems from denying the full humanity of certain groups, whether by race, gender, class or other differences. Justice requires solidarity with the struggling and marginalized.
No single perspective captures total truth. Answering life’s deepest questions requires integrating many voices and kinds of knowledge.
Turing smiled, knowing this unlikely gathering was but a small step toward understanding. Yet it confirmed his belief in friendly, open-minded discourse to find common ground. He hoped in the future, such dialogue could settle humanity's deepest divides.
For now, gesturing goodbye to his last guests, he took comfort that wisdom emerges when we engage humbly with others, seeking understanding over being right. With this hope alive, the possibilities seemed boundless
comment by LostGeneration (advert-youreyes) · 2023-09-26T16:46:01.032Z · LW(p) · GW(p)
Making a comment because apparently i am unable to even upvote any posts, as a new user.