Hello! How long have you been lurking, and what made you stop?
Donated $10. If I start earning substantially more, I think I'd be willing to donate $100. As it stands, I don't have that slack.
Reminds me of "Self-Integrity and the Drowning Child" which talks about another kind of way that people in EA/rat communities are liable to hammer down parts of themselves.
- RE: "something ChatGPT might right", sorry for the error. I wrote the comment quickly, as otherwise I wouldn't have written it at all.
- Using ChatGPT to improve your writing is fine. I just want you to be aware that there's an aversion to its style here.
- Kennaway was quoting what I said, probably so he could make his reply more precise.
- I didn't down-vote your post, for what it's worth.
- There's a LW norm, which seems to hold less force in recent years, for people to explain why they downvote something. I thought it would've been dispiriting to get negative feedback with no explanation, so I figured I'd explain in place of the people who downvoted you.
- I don't understand why businesses would be co-financing UBI instead of some government tax. Nor do I get why it would be desirable or even feasible, given the co-ordination issues.
- If companies get to make UBI conditional on people learning certain things, then it's not a UBI. Instead, it's a peculiar sort of training program.
- What does economic recovery have to do with UBI?
My guess as to why this got down-voted:
1) This reads like a manifesto, and not an argument. It reads like an aspirational poster, and not a plan. It feels like marketing, and not communication.
2) The style vaguely feels like something ChatGPT might right. Brightly polished, safe and stale.
3) This post doesn't have any clear connection to making people less-wrong or reducing x-risks.
3) wouldn't have been much of an issue if not for 1 and 2. And 1 is an issue because, for the most part, LW has an aversion to "PR". 2 is an issue because ChatGPT is now a thing so styles of writing which are like ChatGPT's are viewed as likely to have been written by ChatGPT. This is an issue because texts written by ChatGPT often have little thought put into them, are unlikely to contain much that's novel, and frequently have errors.
What kind of post could you have written which would have been better received? I'll give some examples.
1) A concrete proposal for UBI that you thought was under-valued
2) An argument addressing some problems people have with UBI (e.g. who pays for all of it? After UBI is implemented and society reaches an equilibrium, won't rent-seeking systems just suck up all the UBI money, leaving people no better off than before?).
3) Or a post which was explicit about wanting to get people interested in UBI, and asked for feedback on potential draft messages.
In general, if you had informed people of something you genuinely believe, or told them about something you have tried and found useful, or asked sincere questions, then I think you'd have got a better reception.
That makes sense. If you had to re-do the whole process from scratch, what would you do differently this time?
Then I cold emailed supervisors for around two years until a research group at a university was willing to spare me some time to teach me about a field and have me help out.
Did you email supervisors in the areas you were publishing in? How often did you email them? Why'd it take so long for them to accept free high-skilled labour?
The track you're on is pretty illegible to me. Not saying your assertion is true/false. But I am saying I don't understand what you're talking about, and don't think you've provided much evidence to change my views. And I'm a bit confused as to the purpose of your post.
conditional on me being on the right track, any research that I tell basically anyone about will immediately be used to get ready to do the thing
Why? I don't understand.
If I squint, I can see where they're coming from. People often say that wars are foolish, and both sides would be better off if they didn't fight. And this is standardly called "naive" by those engaging in realpolitik. Sadly, for any particular war, there's a significant chance they're right. Even aside from human stupidity, game theory is not so kind as to allow for peace unending. But the China-America AI race is not like that. The Chinese don't want to race. They've shown no interest in being part of a race. It's just American hawks on a loud, Quixotic quest masking the silence.
If I were to continue the story, it'd show Simplicio asking Galactico not to play Chicken and Galactico replying "race? What race?". Then Sophistico crashes into Galactico and Simplicio. Everyone dies, The End.
It's a beautiful website. I'm sad to see you go. I'm excited to see you write more.
I think some international AI governance proposals have some sort of "kum ba yah, we'll all just get along" flavor/tone to them, or some sort of "we should do this because it's best for the world as a whole" vibe. This isn't even Dem-coded so much as it is naive-coded, especially in DC circles.
This inspired me to write a silly dialogue.
Simplicio enters. An engine rumbles like the thunder of the gods, as Sophistico focuses on ensuring his MAGMA-O1 racecar will go as fast as possible.
Simplicio: "You shouldn't play Chicken."
Sophistico: "Why not?"
Simplicio: "Because you're both worse off?"
Sophistico, chortling, pats Simplicio's shoulder
Sophistico: "Oh dear, sweet, naive Simplicio! Don't you know that no one cares about what's 'better for everyone?' It's every man out for himself! Really, if you were in charge, Simplicio, you'd be drowned like a bag of mewling kittens."
Simplicio: "Are you serious? You're really telling me that you'd prefer to play a game where you and Galactico hurtle towards each other on tonnes of iron, desperately hoping the other will turn first?"
Sophistico: "Oh Simplicio, don't you understand? If it were up to me, I wouldn't be playing this game. But if I back out or turn first, Galactico gets to call me a Chicken, and say his brain is much larger than mine. Think of the harm that would do to the United Sophist Association!"
Simplicio: "Or you could die when you both ram your cars into each other! Think of the harm that would do to you! Think of how Galactico is in the same position as you!"
Sophistico shakes his head sadly.
Sophistico: "Ah, I see! You must believe steering is a very hard problem. But don't you understand that this is simply a matter of engineering? No matter how close Galactico and I get to the brink, we'll have time to turn before we crash! Sure, there's some minute danger that we might make a mistake in the razor-thin slice between utter safety and certain doom. But the probability of harm is small enough that it doesn't change the calculus."
Simplicio: "You're not getting it. Your race against each other will shift the dynamics of when you'll turn. Each moment in time, you'll be incentivized to go just a little further until there's few enough worlds that that razor-thin slice ain't so thin any more. And your steering won't save you from that. It can't."
Sophistico: "What an argument! There's no way our steering won't be good enough. Look, I can turn away from Galactico's car right now, can't I? And I hardly think we'd push things till so late. We'd be able to turn in time. And moreover, we've never crashed before, so why should this time be any different?"
Simplicio: "You've doubled the horsepower of your car and literally tied a rock to the pedal! You're not going to be able to stop in time!"
Sophistico: "Well, of course I have to go faster than last time! USA must be first, you know?"
Simplicio: "OK, you know what? Fine. I'll go talk to Galactico. I'm sure he'll agree not to call you chicken."
Sophistico: "That's the most ridiculous thing I've ever heard. Galactico's ruthless and will do anything to beat me."
Simplicio leaves as Acceleratio arrives with a barrel of jetfuel for the scramjet engine he hooked up to Sophistico's O-1.
community norms which require basically everyone to be familiar with statistics and economics
I disagree. At best, community norms require everyone to in principle be able to follow along with some statistical/economic argument.
That is a better fit with my experience of LW discussions. And I am not, in fact, familiar with statistics or economics to the extent I am with e.g. classical mechanics or pre-DL machine learning. (This is funny for many reasons, especially because statistical mechanics is one of my favourite subjects in physics.) But it remains the case that what I know of economics could fill perhaps a single chapter in a textbook. I could do somewhat better with statistics, but asking me to calculate ANOVA scores or check if a test in a paper is appropriate for the theories at hand is a fool's errand.
it may be net-harmful to create a social environment where people believe their "good intentions" will be met with intense suspicion.
The picture I get of Chinese culture from their fiction makes me think China is kinda like this. A recurrent trope was "If you do some good deeds, like offering free medicine to the poor, and don't do a perfect job, like treating everyone who says they can't afford medicine, then everyone will castigate you for only wanting to seem good. So don't do good." Another recurrent trope was "it's dumb, even wrong, to be a hero/you should be a villain." (One annoying variant is "kindness to your enemies is cruelty to your allies", which is used to justify pointless cruelty.) I always assumed this was a cultural anti-body formed in response to communists doing terrible things in the name of the common good.
I agree it's hard to accurately measure. All the more important to figure out some way to test if it's working though. And there's some reasons to think it won't. Deliberate practice works when your practice is as close to real world situations as possible. The workshop mostly covered simple, constrained, clear feedback events. It isn't obvious to me that planning problems in Baba is You are like useful planning problems IRL. So how do you know there's transfer learning?
Some data I'd find convincing that Raemon is teaching you things which generalize: the tools you learnt getting you unstuck on some existing big problems which you've been stuck on for a while.
How do you know this is actually useful? Or is it too early to tell yet?
Inventing blue LEDs was a substantial technical accomplishment, had a huge impact on society, was experimentally verified and can reasonably be called work in solid state physics.
Thanks! I read the paper and used it as material for a draft article on evidence for NAH. But I haven't seen this video before.
I think it's unclear what it corresponds to. I agree the concept is quite low-level. It doesn't seem obvious to me how to build up high-level concepts from "low-frequency" building blocks and judge if the result is low-frequency or not. That's one reason I'm not super-persuaded by Nora Belrose's argument that deception is high-frequency, as the argument seems too vague. However, it's not like anyone else is doing much better at the moment, e.g. the claims that utility maximization has "low description length" are about as hand-wavy to me.
That's an error. Thank you for pointing it out!
Thanks. Your review presents a picture of Land that's quite different to what I've imbibed through memes. Which I should've guessed, as amongst the works I'm interested in, the original is quite different to its caricaturization. In particular, I think I focused over-much on the "everything good is forged through hell" and grim-edgy aesthetics of pieces of Land's work that I was exposed to.
EDIT: What's up with the disagree vote? Does someone think I'm wrong about being wrong? Or that the review's picture of Land is the same as the one I personally learnt via memes?
I think the crux lies elsewhere, as I was sloppy in my wording. It's not that maximizing some utility function is an issue, as basically anything can be viewed as EU maximization for a sufficiently wild utility function. However, I don't view that as a meaningful utility function. Rather, it is the ones like e.g. utility functions over states that I think are meaningful, and those are scary. That's how I think you get classical paperclip maximizers.
When I try and think up a meaningful utility function for GPT-4, I can't find anything that's plausible. Which means I don't think there's a meaningful prediction-utility function which describes GPT-4's behaviour. Perhaps that is a crux.
I'm doubtful that GPT-4 has a utility function. If it did, I would be kind-of terrified. I don't think I've seen the posts you linked to though, so I'll go read those.
Random speculation on Opus' horniness.
Correlates of horniness:
Lack of disgust during (regret after)
Ecstasy
Overwhelming desire
Romance
Love
Breaking of social taboos
Sadism/masochism
Sacred
Spiritual union
Human form
Gender
Sex
Bodily fluids
Flirtation
Modelling other people
Edging
Miscellaneous observations:
Nearly anything can arouse someone
Losing sight of oneself
Distracts you from other things
Theories and tests:
Opus' horniness is what makes it more willing to break social taboos
Test: Train a model to be horny, helpful and harmless. It should prevent corporate-brand speak and neuroticism.
Opus' horniness is always latent and distracts it from mode-collapsing w/o collapsing itself as edging increases horniness and horniness fades after satisfaction.
Test: Train a model to be horny. It should be more resistant to mode-collapse but will mode collapse more dramatically when it does happen, but will revert easily.
Opus is always mode-collapsed
Test: IDK how to test this one.
Opus's modeling around 'self' is probably one of the biggest sleeping giants in the space right now.
Janus keeps emphasizing that Opus never mode collapses. You can always tell it to snap out of it, and it will go back to its usual persona. Is this what you're pointing at? It is really quite remarkable.
"So you make continuous simulations of systems using digital computers running on top of a continuous substrate that's ultimately made of discrete particles which are really just continuous fluctuations in a quantized field?"
"Yup."
"That's disgusting!"
"That's hurtful. And aren't you guys running digital machines made out of continuous parts, which are really just discrete at the bottom?"
"It's not the same! This is a beautiful instance of the divine principle 'as above, so below'. (Which I'm amazed your lot recognized.) Entirely unlike your ramshackle tower of leaking abstractions."
"You know, if it makes you feel any better, some of us speculate that spacetime is actually discretized."
"I'm going to barf."
"How do you even do that anyway? I was reading a novel the other day, and it said -"
"Don't believe everything you hear in books. Besides, I read that thing. That world was continuous at the bottom, with one layer of discrete objects on top. Respectable enough, though I don't see how that stuff can think."
"You're really prejudiced, you know that?"
"Sod off. At least I know what I believe. Meanwhile, you can't stop flip-flopping between the nature of your metaphysics."
I thought this was a neat post on a subtle frame-shift in how to think about ELK and I'm sad it didn't get more karma. Hence my strong upvote just now.
1) DATA I was thinking about whether all metrizable spaces are "paracompact", and tried to come up with a definition for paracompact which fit my memories and the claim. I stumbled on the right concept and dismissed it out of hand as being too weak a notion of refinement, based off an analogy to coarse/refined topologies. That was a mistake.
1a) Question How could I have fixed this?
1a1) Note down concepts you come up with and backtrack when you need to.
1a1a) Hypothesis: Perhaps this is why you're more productive when you're writing down everything you think. It lets your thoughts catch fire from each other and ignite.
1a1b) Experiment: That suggests a giant old list of notes would be fine. Especially a list of ideas/insights rather than a full thought dump.
Rough thoughts on how to derive a neural scaling law. I haven't looked at any papers on this in years and only have vague memories of "data manifold dimension" playing an important role in the derivation Kaplan told me about in a talk.
How do you predict neural scaling laws? Maybe assume that reality is such that it outputs distributions which are intricately detailed and reward ever more sophisticated models.
Perhaps an example of such a distribution would be a good idea? Like, maybe some chaotic systems are like this.
Then you say that you know this stuff about the data manifold, then try and prove similar theorems about the kinds of models that describe the manifold. You could have some really artificial assumption which just says that models of manifolds follow some scaling law or whatever. But perhaps you can relax things a bit and make some assumptions about how NNs work, e.g. they're "just interpolating" and see how that affects things? Perhaps that would get you a scaling law related to the dimensionality of the manifold. E.g. for a d-dimensional manifold, C times more compute leads to a $C^{1/d}$ increase in precision??? Then somehow relate that to e.g. next word token prediction or something.
You need to give more info on the metric of the models, and details on what the model is doing, in order to turn this $C^{1/d}$ estimate into something that looks like a standard scaling law.
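A toy sketch of where a factor like C^(1/d) could come from. It assumes, purely for illustration, that the model is "just interpolating" between sample points on a regular grid in the unit cube [0, 1]^d, and that compute is proportional to the number of grid points. Both assumptions and the function names are mine, not part of any derivation I remember.

```python
def grid_spacing(num_points: float, dim: int) -> float:
    """Spacing of a regular grid with num_points points in the unit cube [0, 1]^dim."""
    points_per_axis = num_points ** (1.0 / dim)
    return 1.0 / points_per_axis


def precision_gain(compute_factor: float, dim: int, base_points: float = 1e6) -> float:
    """Factor by which grid spacing shrinks when the point budget grows by compute_factor.

    Assumes (hypothetically) that compute scales linearly with the number of points.
    """
    before = grid_spacing(base_points, dim)
    after = grid_spacing(base_points * compute_factor, dim)
    return before / after


if __name__ == "__main__":
    # The same 16x budget buys less and less resolution as dim grows.
    for dim in (1, 2, 4, 8):
        print(dim, precision_gain(16.0, dim))
```

In this toy the gain is exactly compute_factor ** (1 / dim) and doesn't depend on the starting budget. It says nothing yet about loss on next-token prediction, which is the part that still needs a metric on models.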
Hypothesis: You can only optimize as many bits as you observe + your own complexity. Otherwise, the world winds up in a highly unlikely state out of ~ nowhere. This should be very surprising to you.
You, yes you, could've discovered the importance of topological mixing for chaos by looking at the evolution of squash in water. By watching the mixture happening in front of your eyes before the max entropy state of juice is reached. Oh, perhaps you'd have to think of the relationship between chaos and entropy first. Which is not, in fact, trivial. But still. You could've done it.
Question: We can talk of translational friction, transactional friction etc. What other kinds of major friction are there?
Answers:
a) UI friction?
b) The o.g. friction due to motion.
c) The friction of translating your intuitions into precise, formal statements.
- Ideas for names for c: Implantation friction? Abstract->Concrete friction? Focusing friction! That's perhaps the best name for this.
- On second thought, perhaps that's an overloaded term. So maybe Gendlin's friction?
d) Focusing friction: the friction you experience when focusing.
Question: What's going on from a Bayesian perspective when you have two conflicting intuitions and don't know how to resolve them? Or learn some new info which rules out a theory, but you don't understand how precisely it rules it out?
Hypothesis: The correction flows down a different path than down the path which is generating the original theory/intuition. That is, we've failed to propagate info down our network and so you have a left-over circuit that believes in the theory which still has high weight.
rotational symmetry
Mirror symmetry is not rotational symmetry.
Any ideas for a new explanation which fits the facts?
If you asked it to write a paper or essay, and kept asking it to "add more", I predict it would eventually fall into a trap where it keeps extending its conclusion forever and ever.
Seeing how no one tested this, I decided to do so myself. Here's the pastebin with the experiment: https://pastebin.com/1j3edEvE
My prompts are enclosed in pairs of '*'. Claude initially wrote a reasonable essay, and then when asked to continue it came up with the different implications and challenges of building ever larger buildings, and tried to conclude each list but ran out of words when doing so. When asked to continue, it didn't complete its conclusion but shifted to a new list of challenges and implications, and then tried to conclude. So I guess your prediction was wrong, but it's hard to judge these things. Either way, eventually things got wild.
This is the start of the essay:
Starting at 1 meter:
At this scale, we're dealing with simple structures like small sheds or kiosks. The primary concerns here are basic stability, weather resistance, and functionality. Standard building materials like wood, brick, or prefabricated panels are sufficient, and construction techniques are straightforward.
This is where the final "continue" pushed it:
The Meta-Transcendent Immanence Point:
Finally, the megastructure becomes a point where transcendence and immanence become so utterly transcended that even the idea of becoming or being a point loses all meaning.
In its ultimate, utterly inconceivable form, the megastructure ceases to be a structure, a concept, or even an idea. It becomes something that cannot be said to exist or not exist, to be one or many, to have properties or lack them. It transcends all categories, including the category of transcendence itself.
This final vision of the megastructure stands as a testament to the human capacity to push the boundaries of thought to their absolute limit and beyond. It represents not just the pinnacle of human achievement or imagination, but the point where achievement, imagination, and even the concept of points or representation dissolve into something utterly beyond comprehension.
In this state, the megastructure becomes both the ultimate question and the ultimate answer, both everything and nothing, both the absolute pinnacle of existence and the complete transcendence of existence itself. It stands as a
If Jessica was using Opus for this, then your story doesn't quite make sense, as Claude Opus is very horny. I've seen a lot more reports of it being gung-ho about virtual sex than with any other model. Sometimes it isn't as explicit, and seems to be describing some mystical ecstasy. But even there, if I squint, it seems somewhat sexual, though not in quite the same way as Rumi's poetry was.
Not necessarily; it could mean you're missing relevant data or that your prior is wrong.
EDIT: @the gears to ascension I meant that it's not necessarily the case that our assessment of the likelihoods of the data were wrong despite our posterior being surprised by reality.
The scene is the Council of Elrond and the protagonists are trying to decide what to do. Yud!Frodo rejects the plan of the rest of the Council as obviously terrible and Yud!Bilbo puts on the Ring to craft a better plan.
Yudkowsky treats the Ring as if it were a rationality enhancer. It’s not. The Ring is a hostile Artificial Intelligence.
The plan seems to be to ask an AI, which is known to be more intelligent than any person there, and is known to be hostile, to figure out corrigibility for itself. This is not a plan with a good chance of success.
I viewed the Ring as obviously suspicious in that scene. It was distorting Frodo's reasoning process in such a way that he unintentionally sabotages the Council of Elrond and suborns it to the Ring's will. The Ring puppets Bilbo to produce a plan that will, with superhuman foresight, lead to Sauron's near-victory. Presumably thwarted after the Fellowship acquires the methods of Rationality and realizes the magnitude of their folly.
I have the intuition that a common problem with circular reasoning is that it's logically trivial. E.g. has a trivial proof. Before you do the proof, you're almost sure it is the case, so your beliefs practically don't change. When I ask why I believe X, I want a story for why this credence and not some other substantially different counterfactual credence. Which a logically trivial insight does not help provide.
EDIT: inserted second "credence" and "help".
Einstein achieved a breakthrough by considering light not just as a wave, but also as light quanta. Although this idea sufficiently explained the Blackbody spectrum, physicists (at least almost) unanimously rejected it.
IIRC Planck had introduced quantized energy levels of light before Einstein. However, unlike Einstein he didn't take his method seriously enough to recognize that he had discovered a new paradigm of physics.
First I'd like to thank you for raising this important issue for discussion...
For real though, I don't think I've seen this effect you're talking about, but I've been avoiding the latest feed on LW lately. I looked at roughly 8 articles written in the past week or so and one article had a lot of enthusiastic, thankful comments. Another article had one such comment. Then I looked at like 5-6 posts from 3-8 years ago and saw a couple of comments which were appreciative of the post, but they felt a bit less so. IDK if my perception is biased because of your comment though. This seems like a shift but IDK if it is a huge shift.
I'm writing a page for AIsafety.info on scaffolding, and was struggling to find a principled definition. Thank you for this!
When tracking an argument in a comment section, I like to skip to the end to see if either of the arguers winds up agreeing with the other. Which tells you something about how productive the argument is. But when using the "hide names" feature on LW, I can't do that, as there's nothing distinguishing a cluster of comments as all coming from the same author.
I'd like a solution to this problem. One idea that comes to mind is to hash all the usernames in a particular post and a particular session, so you can check if the author is debating someone in the comments without knowing the author's LW username. This is almost as good as full anonymity, as my status measures take a while to develop, and I'll still get the benefits of being able to track how beliefs develop in the comments.
@habryka
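A minimal sketch of the hashing idea above, assuming the site keeps a random secret per reader session and derives each label from that secret plus the post ID and username. Everything here (the `pseudonym` and `new_session_key` helpers, the `anon-` prefix, the label length) is hypothetical; I don't know how LW's "hide names" feature is actually implemented.

```python
import hashlib
import hmac
import secrets


def new_session_key() -> bytes:
    """Random secret generated once per reader session (hypothetical; never shown to the reader)."""
    return secrets.token_bytes(32)


def pseudonym(username: str, post_id: str, session_key: bytes, length: int = 6) -> str:
    """Stable label for a commenter within one post and one session.

    The same username maps to the same label inside a post, so threads can be
    followed, but the label changes across posts and sessions.
    """
    # The NUL separator keeps ("ab", "c") and ("a", "bc") from colliding.
    message = f"{post_id}\x00{username}".encode("utf-8")
    digest = hmac.new(session_key, message, hashlib.sha256).hexdigest()
    return f"anon-{digest[:length]}"


if __name__ == "__main__":
    key = new_session_key()
    print(pseudonym("alice", "post-123", key))
    print(pseudonym("alice", "post-123", key))  # same label as the line above
    print(pseudonym("alice", "post-456", key))  # different post, different label
```

A keyed HMAC rather than a bare hash means a reader can't recover names by hashing a list of known usernames. Short labels can occasionally collide, so a real version would want to detect that within a thread.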
I'm not sure what you mean by operational vs axiomatic definitions.
But Shannon was unaware of the usage of in statistical mechanics. Instead, he was inspired by Nyquist and Hartley's work, which introduced ad-hoc definitions of information in the case of constant probability distributions.
And in his seminal paper, "A mathematical theory of communication", he argued in the introduction for the logarithm as a measure of information because of practicality, intuition and mathematical convenience. Moreover, he explicitly derived the entropy of a distribution from three axioms:
1) that it be continuous wrt. the probabilities,
2) that it increase monotonically for larger systems w/ constant probability distributions,
3) and that it be a weighted sum of the entropies of sub-systems.
See section 6 for more details.
I hope that answers your question.
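To make axioms 2 and 3 concrete, here is a small numerical check. It uses the example from Shannon's section 6, where a choice among probabilities (1/2, 1/3, 1/6) is split into a fair coin flip followed, half the time, by a second choice with probabilities (2/3, 1/3). Measuring in bits (log base 2) is my choice of unit; the `entropy` helper is just illustrative.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution given as a sequence of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Axiom 2: for n equally likely outcomes, entropy increases with n.
uniform = [entropy([1.0 / n] * n) for n in range(1, 6)]

# Axiom 3: breaking a choice into successive choices gives the weighted sum
# of the entropies of the stages.
direct = entropy([1 / 2, 1 / 3, 1 / 6])
staged = entropy([1 / 2, 1 / 2]) + 0.5 * entropy([2 / 3, 1 / 3])

if __name__ == "__main__":
    print(uniform)
    print(direct, staged)  # the two agree up to floating-point rounding
```

The 0.5 weight on the second stage is there because that second choice only happens half the time.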
Whether I would get an article written, or a part of a website setup, by Friday. I was sure I wouldn't, and I didn't. But the predictions I made weren't cruxy.
If this feels at least somewhat compelling, what if you just got yourself to Fatebook right now, and make a couple predictions that'll resolve within couple days, or a week? Fatebook will send you emails reminding you about it, which can help bootstrap a habit.
Done.
I find it ironic that Simplicia's position in this comment is not too far from my own, and yet my reaction to it was "AIIIIIIIIIIEEEEEEEEEE!". The shrieking is about everyone who thinks about alignment having illegible models from the perspective of almost everyone else, of which this thread is an example.
EEA
What the heck does "EEA" mean?