Comments
This was an extremely enjoyable read.
Good fun.
My comment may be considered low effort, but this is a fascinating article. Thank you for posting it.
While I find the Socrates analogy vivid and effective, I propose putting critics on posts in the same bucket as lawyers. Where Socrates had a certain set of so-called principles (choosing to die for arbitrary reasons), most people are not half as dogmatic as he was, and so the analogy/metaphor falls short.
While my post is sitting at negative two, with no comments or feedback... Modeling commenters as if they were lawyers might be better? When the rules lawyers have to follow show up, lawyers (usually) do change their behavior, though they naturally poke and prod as far as they can within the bounds of the social game that is the court system.
But also, everyone who is sane hates lawyers.
Part of the problem with verifying this is the number of machine learning people who got into machine learning because of lesswrong. We need more machine learning people who came to doom conclusions of their own accord, independent of hpmor etc, as a control group.
As far as I can tell, the people worried about doom overlap 1:1 with lesswrong posters/readers, and if doom were such a threat, we'd expect some number of people to come to the same conclusions independently, of their own accord.
This post was inspired by parasitic language games.
That framing makes sense to me.
Is knowing someone's being an asshole an aspect of hyperstition?
I once met an ex-Oracle sales guy turned medium-bigwig at other companies.
He justified it by calling it "selling ahead", and it started because the reality is that if you tell customers no, you don't get the deal. They told customers they would have the features they requested. The devs would only get notice later, when the deal was signed; no one in management ever complained, and everyone else on his "team" was doing it.
How do we measure intent?
Unless you mean to say a person who actively and verbally attempts to shun the truth?
Any preferred critiques of Graham's Hierarchy of Disagreements?
https://en.m.wikipedia.org/wiki/File:Graham's_Hierarchy_of_Disagreement-en.svg
Extra layers or reorderings?
Does the data note whether the shift is among new machine learning researchers? Among those who have a p(Doom) > 5%, I wonder how many would come to that conclusion without having read lesswrong or the associated rationalist fiction.
I'm avoiding terms like "epistemic" and "consequential" and such in this answer, and instead attempting to give a colloquial answer to what I think is the spirit of the question.
(I'm also deliberately avoiding iterating over the harms of blind traditionalism and religious thinking. I'm assuming that since you're an atheist, you don't reject most of the criticisms of religion.)
(Also also, I am being brief. For more detail I would point you to the library, to read up on Christianity's role in the rise of the working and uneducated classes in the 1600s-1800s, and perhaps some anthropologists' works for more modern iterations.)
Feel free to delete/downvote if this is unwanted.
It's hard to say "all religion is bad" when, without Christianity, Gregor Mendel's pea studies, for example, might have come about a decade or more later. In the absence of strong institutions, the Christian religion often provided structure and basic education where there was none, long before the government began to provide schooling and basic education.
Sect leaders needed you to be able to read the bible, and would often teach you how to write as well. Because of this, it's hard to refute the usefulness of Christianity as a cultural through-line: a means of staying culturally updated and locally connected to, and invested in, the people around you.
The various sects of Christianity benefited greatly when their local populace was well-read and understood the bible, so religious leaders, pastors, etc. were incentivized to educate and build up the people around them. Whatever one might think about such leaders being unethical, they did provide a service, and they often taught people skills or information they did not have before, because they were naturally invested in their local communities.
Constituents who were wealthier, happier, better connected, and better socialized were more able to coordinate. If you confessed your concerns to your pastor, they, as coordination-problem-overcomers, would often put you in contact with people in your local area who had the means and ability to help with your problem, from rebuilding a burned-down barn to putting in a wheelchair ramp for disabled people in trailer parks.
That is... I have no qualms with: "if it feels good, and doesn't harm others or impinge on their rights, it's okay to do it, with caveats*."
When the platonic ideal of the communal Christian fellowship operates, it is well worth the time and energy spent. One need only listen to the song being sung to tell whether it comes from Eru Ilúvatar or from Morgoth's discord.
Perhaps "term" is the wrong, ahem, term.
Maybe you want "metrics"? There are lots of non-GDP metrics that could be used to track ai's impact on the world.
Instead of the failure mode of saying "well, GDP didn't track typists being replaced with computers," maybe the flipside question is "what metrics would have shown typists being replaced?"
Have you tried CGP Grey's themes?
What material policy changes are being advocated for, here? I am having trouble imagining how this won't turn into a witch-hunt.
If you find yourself leaning into conspiracy theories, you should consider whether you're stuck in a particular genre and need to artificially inject more variety into your intellectual, media, and audio diets.
Confirmation bias leads to feeling like every song has the same melody, but there are many other modes of thought, and, imo, sliding into a slot that checks more boxes with <ingroup> is more an indicator that our information feeds/sensors are bad than that we are stumbling on truth.
Not gonna lie, I lost track of the argument on this line of comments, but pushing back on word-bloat is good.
Thanks! Though, hm.
Now I'm noodling how one would measure goodharting.
What kind of information would you look out for, that would make you change your mind about alignment-by-default?
What information would cause you to inverse again? What information would cause you to adjust 50% down? 25%?
I know that most probability mass is some measure of gut feel, and I don't want to introduce nickel-and-diming; I'd rather get a feel for what information you're looking for.
Do you believe encouraging the site maintainers to implement degamification techniques on the site would help with your criticisms?
When you can predict that beliefs won't update towards convergence, you're predicting a mutual lack of respect and a mutual lack of effort to figure out whose lack of respect is misplaced.
Are you saying that the interlocutors should instead change to attempting to resolve their lack of mutual respect?
As a relatively new person to lesswrong, I agree.
Conversations I've read that end in either party noticeably updating one way or the other have been relatively rare. The one point I'm not sure I agree with is whether being able to predict a particular disagreement is a problem.
I suppose being able to predict the exact way in which your interlocutors will disagree is the problem? If you can foresee someone disagreeing in a particular way, and you account for it in your argument, and then they disagree anyway, in the exact way you tried to address, that's generally just bad faith.
(though sometimes I do skim posts, by god)
I try to ask myself whether the tenor of what I'm saying overshadows definitional specificity, and how I can provide a better mood or angle. If my argument is not atonal, if my points line up coherently such that a willing ear will hear them, definitionalist debates should slide on by.
As a descriptivist rather than a prescriptivist, I find it really sucks to have to fall back on the Socratic method of pre-establishing definitions, except in highly technical contexts.
Thus, I prefer to avoid arguments which hinge on definitions altogether. This doesn't preclude example-based arguments, where, for instance, various interlocutors operate off different definitions of the same terms but have different examples.
For example, take the term tai.
For some, tai means not when ai is agentic, but when ai can transform the economy in some large or measurable way. For others, it is when the first agentic ai is deployed at scale. Still others have differing definitions, definitions which wildly transform predictions and change alignment discussions. Despite using the term with each other in different ways, with separate definitions, interlocutors often do not notice (or perhaps they subconsciously resolve the discrepancy?).
I don't think there's any place quite like lesswrong on the entire internet. It's a lot of fun to read, but it tends to be pretty one-note, and even if there is discord in lesswrong's song, it's far more controlled; Eru Ilúvatar's hand can yet be felt, if not seen. (edit: that is to say, it's all the same song)
For the most part, people are generally tolerant of Christians. There is even a Catholic who teaches (taught?) at the Center for Applied Rationality, and there are a few other rationalist-atheists who hopped to Christianity, though I can't remember them by name.
Whether or not it's the place for you, I think you'll find that there's more pop!science here, and if you are a real physicist, there are more and more posts where people who do not know physics act like they do. Correcting them will be difficult, so it depends on whether you can tolerate that.
I wasn't aware that Eliezer was an experienced authority on SOTA LLMs.
I don't agree, but for a separate reason from trevor.
Highly-upvoted posts are a signal of what the community agrees with or disagrees with, and I think being able to more easily track down karma would cause reddit-style internet-points seeking. How many people are hooked on Twitter likes/view counts?
Or "ratio'd".
Making it easier to track these stats would be counterproductive, imo.
It seems pretty standard knowledge among pollsters that even the ordering of questions can change a response. It seems pretty blatantly obvious that if we know who a commenter is, we will extend them more or less charity.
Even if the people maintaining the site don't want to hide votes + author name on comments and posts, it would be nice if user name + votes were moved to the bottom. I would like to at least be presented with the option to vote after I have read a comment, not before.
Re: papers, I'm aware of papers like the ones you're alluding to, though I haven't been that impressed.
The reason I don't want a scratch space is that I view scratch space and context as equivalent to giving the ai a notecard that it can peek at. I'm not against having extra categories or asterisks for the different kinds of ai in the small test.
Thinking aloud and giving it scratch space would mean it's likely to be a lot more tractable for interpretability and alignment research, I'll grant you that.
I appreciate the feedback, and I will think about your points more, though I'm not sure if I will agree.
Given my priors plus my understanding of startups/Silicon Valley culture, it sounds more like OpenAI has started to leave the startup phase and is entering the "profit-seeking" phase of running the company. After they had the rug pulled out from under them by Stable Diffusion, I would expect them to get strategic whiplash and then decide to button things up.
The addition of a non-compete clause in their terms of service and the deal with Microsoft seem to hint toward that. They'll announce GPT-3.5 and "next-gen language models", but it doesn't match my priors that they would hold back GPT-4 if they had it.
Time may tell, however!
What I'm asking with this particular test is: can an ai play blindfold chess without using a context in order to recount every move in the game?
I'm confused. What I'm referring to here is https://en.wikipedia.org/wiki/Blindfold_chess
I'm not sure why we shouldn't expect an ai to be able to do well at it?
My proposed experiment/test is trying to avoid analogizing to humans, and instead scope out places where the ai can't do very well. I'd like to avoid accidentally scoping the vision of the tests too narrowly. It won't work with an ai network where the weights are reset every time.
An alternative, albeit massively-larger-scale experiment might be:
Will a self-driving car ever be able to navigate from one end of a city to another, using street signs and just learning the streets by exploring it?
A test of this might be like the following:
- Randomly generate a simulated city/town, complete with street signs and traffic
- Allow the self-driving car to explore the city of its own accord
- (or feed the ai network the map of the city a few times before the target destinations are given, if that is infeasible)
- Give the self-driving car target destinations. Can the self-driving car navigate from one end of the city to the other, using only street signs, no GPS?
I think this kind of measurement would tell us how well our ai can handle open-endedness and help us understand where the gaps in progress are, and I think a small-scale chess experiment like this would help shed light on bigger questions.
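To make the shape of that evaluation concrete, here's a minimal toy sketch in Python. It is purely illustrative scaffolding I'm inventing (a plain grid standing in for the randomly generated city, and a random-walk baseline standing in for the ai under test); the point is only that the agent gets local street-sign observations and neighbor choices, never a map or GPS, and we score whether it reaches the goal.

```python
# Toy sketch of the simulated-city navigation test (illustration only).
# The agent only ever sees local street signs and adjacent intersections.
import random

class CityGrid:
    """A simulated 'city' whose intersections are (x, y) grid nodes."""
    def __init__(self, width, height):
        self.width, self.height = width, height

    def neighbors(self, node):
        x, y = node
        candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        return [(nx, ny) for nx, ny in candidates
                if 0 <= nx < self.width and 0 <= ny < self.height]

    def street_signs(self, node):
        # The only observation the agent gets: signs at the current corner.
        x, y = node
        return {f"{x}th Ave", f"{y}th St"}

def run_trial(city, policy, start, goal, max_steps=500):
    """Return True if the policy reaches the goal within the step budget,
    using only local observations at each intersection (no GPS, no map)."""
    node = start
    for _ in range(max_steps):
        if node == goal:
            return True
        node = policy(city.street_signs(node), city.neighbors(node))
    return node == goal

def random_walk_policy(signs, neighbors):
    # Stand-in baseline; a real run would plug in the ai being evaluated.
    return random.choice(neighbors)

random.seed(0)
city = CityGrid(10, 10)
wins = sum(run_trial(city, random_walk_policy, (0, 0), (9, 9)) for _ in range(20))
print(f"random-walk baseline success rate: {wins}/20")
```

The interesting comparison is how far above that random-walk baseline a given ai can get while still being restricted to purely local observations.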
I just wanted to break up the mood a bit. Reading everything here is like listening to a band stuck in the same key.
Unfortunately, what I am proposing is not possible with current language models, as they don't work like that.
Thanks. FYI, I tried making the post I alluded to:
I was rereading and noticed places where I was getting a bit too close to insulting the reader. I've edited a couple places and will try to iron out the worst spots.
If it's bad enough I'll retract the article and move it back to drafts or something. Idk.
So long as the "buffer" is a set of parameters/weights/neurons, that would fit my test.
I wonder how long it's going to be until you can get an LLM which can do the following with 100% accuracy.
I don't care about the ai winning or losing; in fact, I would leave that information to the side. I don't care if this test is synthetic, either. What I want is:
- The ai can play chess the way normal humans do: it obeys the rules, uses pieces normally, etc.
- The ai holds the entire state of the chess board within itself, and doesn't need a context in order to retain it. (ie, it's playing blindfold chess and doesn't get the equivalent of notecards; the memory is not artificial memory)
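For concreteness, here is what a minimal harness for that test might look like, assuming the python-chess library for rules enforcement; `query_model` is a hypothetical placeholder for however the ai under test is invoked. The key constraint is that the model is only ever shown the opponent's most recent move, never the move history or the board.

```python
# Rough harness sketch for the blindfold-chess test (assumptions: python-chess
# for rules; query_model is a hypothetical hook for the ai under evaluation).
import random
import chess

def query_model(opponent_move_uci: str) -> str:
    """Hypothetical: ask the ai for its next move in UCI notation, given ONLY
    the opponent's most recent move (empty string when the ai moves first)."""
    raise NotImplementedError("plug in the model under test here")

def run_blindfold_game(max_moves=200):
    board = chess.Board()
    last_move = ""  # the ai plays White and moves first
    for _ in range(max_moves):
        # Model's turn: it sees only the opponent's last move, nothing else.
        try:
            move = chess.Move.from_uci(query_model(last_move))
        except ValueError:
            return "illegal move syntax"   # fails the test
        if not board.is_legal(move):
            return "illegal move"          # fails the test
        board.push(move)
        if board.is_game_over():
            return board.result()
        # Opponent's turn: a random legal move suffices, since the test is
        # about rule-following and board-tracking, not playing strength.
        reply = random.choice(list(board.legal_moves))
        board.push(reply)
        last_move = reply.uci()
        if board.is_game_over():
            return board.result()
    return "move limit reached"
```

Any outcome other than "illegal move" / "illegal move syntax" counts as passing, regardless of who wins.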
The post I'm working on tries to call out, explicitly, long-term memory without "hacks" like context hacks or databases/lookup hacks.
Most ai groups seem to not be releasing their LLMs, so the incentive on this kind of test would be to defect, like we saw with the DOTA 2 and AlphaStar agents and their cohort, where they all used significant shortcuts so they could get a spicy paper title and/or headline. Neutral third parties should also be allowed to review the implemented ai codebase, even if the weights/code aren't released.
don't post any version of it that says "I'm sure this will be downvoted"
For sure. The actual post I make will not demonstrate my personal insecurities.
what bet?
I will propose a broad test/bet that will shed light on my claims or give some places to examine.
I think the lesswrong community is wrong about x-risk and many of the problems about ai, and I've got a draft longform with concrete claims that I'm working on...
But I'm sure it'll be downvoted because the bet has goalpost-moving baked in, and lots of goddamn swearing, so that makes me hesitant to post it.
Something possibly missing from the list is breadth of first-hand experience amid other cultures. Getting older, meeting people, and really getting to know them in such a short lifespan is really, really hard!
And I don't just mean meeting people in the places we already live. Getting out of our towns and countries and living in their worlds? Yeah, you can't really do that. Sure, you might be able to move to <Spain> or <the Philippines> for a couple of years, but then you come home.
It's not just death here; the breadth of experiences we can even have is limited, so our understanding of others and of the problems they face, and thus the solutions we come up with, often wind up as terrible failures.
Left as a comment rather than an answer because it feels tangential.
Feel free to delete this if it feels off-topic, but on a meta note about discussion norms, I was struck by that meme about C code: basically, the premise that code quality is higher when there is swearing.
I was also reading discussions on the Linux mailing lists; the discussions there are clear, concise, and frank. And occasionally, people still use scathing terminology and feedback.
I wonder if people would be interested in setting up a few discussion posts where specific norms get called out to "participate in good faith but try to break these specific norms"
And people could play a mix-and-match to see which ones are most fun, engaging, and interesting for participants. This would probably end in disaster if we started tossing slurs willy-nilly, but sometimes while reading posts, I think people could cut the verbiage by 90% and keep the meaning.
I read it as "People would use other forms of money for trade if the government fiat ever turns into monopoly money"
As a matter of pure category, yeah, it's more advanced than "don't make stuff up".
I usually see these kinds of guides as an implicit "The community is having problems with these norms"
If you were to ask me "what's the most painful aspect of comments on lesswrong?", it's reading exchanges that go on for 1k words apiece where neither commenter ever agrees. It's probably the spookiest part for me as a lurker, and it made me hesitant to participate.
So I guess I misread the intent of the post and why it was boosted? I dunno, are these not proposals for new rules?
Edit: Sorry, I read a bit more in the thread and these guidelines aren't proposals for new rules.
Since that's the case, I guess I just don't understand what problem is being solved. The default conversational norms here are already high-quality; it's just really burdensome and scary to engage here.
And in an effort to maintain my proposed norm: you'd have to make an arduously strong case (either via many extremely striking examples or lots of data with specific, less-striking examples) to convince me that this actually makes the site a better place to engage than what people seem to be doing on their own just fine.
Second Edit: I tried to follow the "Explain, don't convince" request in the rule here. Please let me know if I didn't do a good job.
Third edit: some wording felt like it wasn't making my point.
Can you link to another conversation on this site where this occurs?
(Porting and translating comment here, because this post is great):
Goddamn I wish people would just tell me when the fuck they're not willing to fucking budge. It's a fucking waste of time for all parties if we just play ourselves to exhaustion. Fuck, it's okay to not update all at once, goddamn Rome wasn't built in a day.
I propose another discussion norm: committing to being willing to have a crisis of faith in certain discussions, and, failing that, de-stigmatizing admitting when you are, in fact, unwilling to entertain certain ideas or concepts, with participants respecting those admissions.
I'm going to be frank, and apologize for taking so long to reply, but this sounds like a classic case of naivete and overconfidence.
It's routinely demonstrated that stats can be made to say whatever the person who made them wants, via techniques like the ones used in p-hacking, etc., and it should eventually become evident that economics is not exempt from similar effects.
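As a toy illustration of that general point (my own example, not drawn from any particular economics chart): run enough tests on pure noise and some will come out "statistically significant" by chance alone.

```python
# Toy p-hacking demo: generate pure noise, test many meaningless hypotheses,
# and report whichever ones happen to clear p < 0.05. Illustration only.
import random
import statistics
import math

random.seed(42)

def two_sample_pvalue(sample_a, sample_b):
    """Rough two-sample z-test p-value (normal approximation), fine for a demo."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    se = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    z = abs(mean_a - mean_b) / se
    return math.erfc(z / math.sqrt(2))  # two-sided p-value

significant = []
for hypothesis in range(100):                            # 100 "research questions"
    group_a = [random.gauss(0, 1) for _ in range(30)]    # pure noise
    group_b = [random.gauss(0, 1) for _ in range(30)]    # pure noise
    p = two_sample_pvalue(group_a, group_b)
    if p < 0.05:
        significant.append((hypothesis, round(p, 4)))

print(f"{len(significant)} of 100 noise-only comparisons came out 'significant':")
print(significant)
```

With a 5% threshold you expect roughly five spurious "findings" per hundred comparisons, and a motivated analyst only needs to report the ones that cleared the bar.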
Add in the replication crisis, and you have a recipe for disaster. As such, the barrier you need to clear is immense: "this graph about economics (a field known for attracting a large number of people who don't know the field to comment on it) means what I say it means and is an accurate representation of reality."
It's details like the ones you point out here that make me SUPER hesitant when reading people making claims that correlate GDP/economy-based metrics with anything else.
What are the original chart's base definitions, assumptions, and error bars? What are its data sources, and what assumptions are they making? To look at someone's charts of GDP, extrapolate, and finally go "tech has had no effect" feels naive and short-sighted, at least from a rational perspective; we should know that these charts tend not to convey as much meaning as we'd like.
Being extremely skeptical of sweeping claims based on extrapolations from GDP metrics seems like a prudent default.