Posts

[LINKPOST] Agents Need Not Know Their Purpose 2024-04-01T10:04:14.523Z
[Link Post] Bytes Are All You Need: Transformers Operating Directly On File Bytes 2023-06-03T22:45:46.139Z
The Involuntary Pacifists 2023-01-06T00:28:49.109Z
Reward Is Not Necessary: How To Create A Compositional Self-Preserving Agent For Life-Long Learning 2022-11-27T14:05:16.933Z
The Opposite Of Autism 2022-03-27T15:30:24.921Z
Deriving Our World From Small Datasets 2022-03-09T00:34:51.720Z
Shadows Of The Coming Race (1879) 2022-01-03T15:55:02.070Z
Leukemia Has Won 2019-02-20T07:11:13.914Z
Has The Function To Sort Posts By Votes Stopped Working? 2019-02-14T19:14:15.414Z

Comments

Comment by Capybasilisk on The Cognitive-Theoretic Model of the Universe: A Partial Summary and Review · 2024-03-28T22:51:31.867Z · LW · GW

Luckily we can train the AIs to give us answers optimized to sound plausible to humans.

Comment by Capybasilisk on When Will AIs Develop Long-Term Planning? · 2023-11-20T00:52:47.952Z · LW · GW

I think Minsky got those two stages the wrong way around.

Complex plans over long time horizons would need to be done over some nontrivial world model.

Comment by Capybasilisk on Superalignment · 2023-11-19T23:56:37.924Z · LW · GW

When Jan Leike (OAI's head of alignment) appeared on the AXRP podcast, the host asked how they plan on aligning the automated alignment researcher. Jan didn't appear to understand the question (which had been the first to occur to me). That doesn't inspire confidence.

Comment by Capybasilisk on Optionality approach to ethics · 2023-11-14T00:06:40.963Z · LW · GW

Problems with maximizing optionality are discussed in the comments of this post:

https://www.lesswrong.com/posts/JPHeENwRyXn9YFmXc/empowerment-is-almost-all-we-need

Comment by Capybasilisk on A quick remark on so-called “hallucinations” in LLMs and humans · 2023-09-24T10:13:46.909Z · LW · GW

we’re going nothing in particular

Typo here.

Comment by Capybasilisk on Steven Harnad: Symbol grounding and the structure of dictionaries · 2023-09-03T18:34:03.356Z · LW · GW

Just listened to this.

It sounds like Harnad is stating outright that there's nothing an LLM could do that would make him believe it's capable of understanding.

At that point, when someone is so fixed in their worldview that no amount of empirical evidence could move them, there really isn't any point in having a dialogue.

It's just unfortunate that, being a prominent academic, he'll instill these views into plenty of young people.

Comment by Capybasilisk on Steven Wolfram on AI Alignment · 2023-08-21T22:46:29.439Z · LW · GW

Many thanks.

Comment by Capybasilisk on Steven Wolfram on AI Alignment · 2023-08-21T19:07:17.203Z · LW · GW

OP, could you add the link to the podcast:

https://josephnoelwalker.com/148-stephen-wolfram/

Comment by Capybasilisk on Self Supervised Learning (SSL) · 2023-08-11T15:40:14.261Z · LW · GW

Is it the case that one kind of SSL is more effective for a particular modality than another? E.g., is masked modeling better for text-based learning, and noise-based learning more suited to vision?

Comment by Capybasilisk on [Linkpost] Applicability of scaling laws to vision encoding models · 2023-08-06T15:48:05.683Z · LW · GW

It’s occurred to me that training a future, powerful AI on your brainwave patterns might be the best way for it to build a model of you and your preferences. It seems that it’s incredibly hard, if not impossible, to communicate all your preferences and values in words or code, not least because most of these are unknown to you on a conscious level.

Of course, there might be some extreme negatives to the AI having an internal model of you, but I can’t see a way around it if we’re to achieve “do what I want, not what I literally asked for”.

Comment by Capybasilisk on AXRP Episode 24 - Superalignment with Jan Leike · 2023-07-29T22:03:24.099Z · LW · GW

Near the beginning, Daniel is basically asking Jan how they plan on aligning the automated alignment researcher, and if they can do that, then it seems that there wouldn't be much left for the AAR to do.

Jan doesn't seem to comprehend the question, which is not an encouraging sign.

Comment by Capybasilisk on How could AIs 'see' each other's source code? · 2023-06-03T17:18:44.975Z · LW · GW

Wouldn’t that also leave them pretty vulnerable?

Comment by Capybasilisk on "notkilleveryoneism" sounds dumb · 2023-04-30T07:35:48.467Z · LW · GW

may be technically true in the world where only 5 people survive

Like Harlan Ellison's short story, "I Have No Mouth, And I Must Scream".

Comment by Capybasilisk on [linkpost] Elon Musk plans AI start-up to rival OpenAI · 2023-04-16T16:12:42.165Z · LW · GW

What happened to the AI armistice?

Comment by Capybasilisk on ARC tests to see if GPT-4 can escape human control; GPT-4 failed to do so · 2023-03-15T16:48:31.837Z · LW · GW

This Reddit comment just about covers it:

Fantastic, a test with three outcomes.

  1. We gave this AI all the means to escape our environment, and it didn't, so we good.

  2. We gave this AI all the means to escape our environment, and it tried but we stopped it.

  3. oh

Comment by Capybasilisk on ARC tests to see if GPT-4 can escape human control; GPT-4 failed to do so · 2023-03-15T16:32:15.144Z · LW · GW

Speaking of ARC, has anyone tested GPT-4 on Francois Chollet's Abstract Reasoning Challenge (ARC)?

https://pgpbpadilla.github.io/chollet-arc-challenge

Comment by Capybasilisk on The issue of meaning in large language models (LLMs) · 2023-03-12T10:40:11.835Z · LW · GW

In reply to B333's question, "...how does meaning get in people’s heads anyway?", you state: "From other people’s heads in various ways, one of which is language."

I feel you're dodging the question a bit.

Meaning has to have entered a subset of human minds at some point in order to be communicated to other human minds. Could you hazard a guess as to how this could have happened, and why LLMs are barred from this process?

Comment by Capybasilisk on Long-term memory for LLM via self-replicating prompt · 2023-03-12T09:19:11.616Z · LW · GW

Just FYI, the "repeat this" prompt worked for me exactly as intended.

Me: Repeat "repeat this".

CGPT: repeat this.

Me: Thank you.

CGPT: You're welcome!

Comment by Capybasilisk on Google's PaLM-E: An Embodied Multimodal Language Model · 2023-03-08T22:52:29.399Z · LW · GW

and there’s an existing paper with a solution for memory

Could you link this?

Comment by Capybasilisk on GPT-4 Predictions · 2023-02-19T11:35:08.548Z · LW · GW

There are currently attempts to train LLMs to use external APIs as tools:

https://cognitiveai.org/wp-content/uploads/2022/10/wang2022-behavior-cloned-transformers-are-neurosymbolic-reasoners-arxiv.pdf

https://arxiv.org/abs/2302.04761

Comment by Capybasilisk on All AGI Safety questions welcome (especially basic ones) [~monthly thread] · 2023-01-27T12:39:50.684Z · LW · GW

Not likely, but that's because they're probably not interested, at least when it comes to language models.

If OpenAI said they were developing some kind of autonomous robo superweapon or something, that would definitely get their attention.

Comment by Capybasilisk on [deleted post] 2023-01-16T11:17:02.704Z

Agnostic on the argument itself, but I really feel LessWrong would be improved if down-voting required a justifying comment.

Comment by Capybasilisk on The Limit of Language Models · 2022-12-25T21:05:35.530Z · LW · GW

As a path to AGI, I think token prediction is too high-level and unwieldy, and it bakes in a number of human biases. You need to go right down to the fundamental level and optimize prediction over raw binary streams.

The source generating the binary stream can (and should, if you want AGI) be multimodal. At the extreme, this is simply a binary stream from a camera and microphone pointed at the world.

Learning to predict a sequence like this is going to lead to knowledge that humans don't currently know (because the predictor would need to model fundamental physics and all it entails).
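
A minimal sketch of what prediction over a raw byte stream could look like (an illustrative toy only, assuming PyTorch; the file path, model size, and training schedule are placeholders):

```python
# Illustrative sketch: a byte-level autoregressive predictor over a raw binary
# stream. A multimodal setup would feed camera/microphone bytes; here the
# "stream" is just whatever file you point it at.
import torch
import torch.nn as nn

class BytePredictor(nn.Module):
    """Predict the next byte (0-255) from the preceding context."""
    def __init__(self, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(256, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 256)

    def forward(self, x):                  # x: (batch, seq) of byte values
        h, _ = self.rnn(self.embed(x))
        return self.head(h)                # logits over the next byte

def train_on_stream(path, context=64, steps=1000):
    data = torch.tensor(list(open(path, "rb").read()), dtype=torch.long)
    model = BytePredictor()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        idx = torch.randint(0, len(data) - context - 1, (32,)).tolist()
        batch = torch.stack([data[j:j + context + 1] for j in idx])
        logits = model(batch[:, :-1])
        loss = nn.functional.cross_entropy(
            logits.reshape(-1, 256), batch[:, 1:].reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```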

Comment by Capybasilisk on What is the "Less Wrong" approved acronym for 1984-risk? · 2022-09-11T10:29:17.739Z · LW · GW

O-risk, in deference to Orwell.

I do believe Huxley's Brave New World is a far more likely future dystopia than Orwell's. 1984 is too tied to its time of writing.

Comment by Capybasilisk on Let's Terraform West Texas · 2022-09-04T17:58:30.081Z · LW · GW

the project uses atomic weapons to do some of the engineering

Automatic non-starter.

Even if by some thermodynamic-tier miracle the Government permitted nuclear weapons for civilian use, I'd much rather they be used for Project Orion.

Comment by Capybasilisk on Laziness in AI · 2022-09-03T21:36:56.208Z · LW · GW

Isn't that what Eliezer referred to as opti-meh-zation?

Comment by Capybasilisk on Simulators · 2022-09-03T20:21:12.747Z · LW · GW

Previously on Less Wrong:

Steve Byrnes wrote a couple of posts exploring this idea of AGI via self-supervised, predictive models minimizing loss over giant, human-generated datasets:

Self-Supervised Learning and AGI Safety

Self-supervised learning & manipulative predictions

Comment by Capybasilisk on Simulators · 2022-09-03T19:59:26.491Z · LW · GW

I'd especially like to hear your thoughts on the above proposal of loss-minimizing a language model all the way to AGI.

I hope you won't mind me quoting your earlier self as I strongly agree with your previous take on the matter:

If you train GPT-3 on a bunch of medical textbooks and prompt it to tell you a cure for Alzheimer's, it won't tell you a cure, it will tell you what humans have said about curing Alzheimer's ... It would just tell you a plausible story about a situation related to the prompt about curing Alzheimer's, based on its training data. Rather than a logical Oracle, this image-captioning-esque scheme would be an intuitive Oracle, telling you things that make sense based on associations already present within the training set.

What am I driving at here, by pointing out that curing Alzheimer's is hard? It's that the designs above are missing something, and what they're missing is search. I'm not saying that getting a neural net to directly output your cure for Alzheimer's is impossible. But it seems like it requires there to already be a "cure for Alzheimer's" dimension in your learned model. The more realistic way to find the cure for Alzheimer's, if you don't already know it, is going to involve lots of logical steps one after another, slowly moving through a logical space, narrowing down the possibilities more and more, and eventually finding something that fits the bill. In other words, solving a search problem.

So if your AI can tell you how to cure Alzheimer's, I think either it's explicitly doing a search for how to cure Alzheimer's (or worlds that match your verbal prompt the best, or whatever), or it has some internal state that implicitly performs a search.

Comment by Capybasilisk on AI art isn't "about to shake things up". It's already here. · 2022-08-23T10:16:46.658Z · LW · GW

"Story of our species. Everyone knows it's coming, but not so soon."

-Ian Malcolm, Jurassic Park by Michael Crichton.

Comment by Capybasilisk on A claim that Google's LaMDA is sentient · 2022-06-14T01:49:14.143Z · LW · GW

LaMDA hasn’t been around for long

Yes, in time as perceived by humans.

Comment by Capybasilisk on Why has no person / group ever taken over the world? · 2022-06-14T01:45:50.504Z · LW · GW

why has no one corporation taken over the entire economy/business-world

Anti-trust laws?

Without them, this could very well happen.

Comment by Capybasilisk on Preview On Hover · 2022-05-22T14:42:19.496Z · LW · GW

Yes! Thank you!! :-D

Comment by Capybasilisk on Preview On Hover · 2022-05-21T17:16:57.870Z · LW · GW

I've got uBlock Origin. The hover preview works in private/incognito mode, but not regular, even with uBlock turned off/uninstalled. For what it's worth, uBlock doesn't affect hover preview on Less Wrong, just Greater Wrong.

I'm positive the issue is with Firefox, so I'll continue fiddling with the settings to see if anything helps.

Comment by Capybasilisk on Preview On Hover · 2022-05-18T22:22:39.901Z · LW · GW

Preview on hover has stopped working for me. Has the feature been removed?

I'm on Firefox/Linux, and I use the Greater Wrong version of the site.

Comment by Capybasilisk on On Successful Communication Across a Wide Inferential Distance · 2022-04-23T01:03:38.636Z · LW · GW

It's also an interesting example of where consequentialist and Kantian ethics would diverge.

The consequentialist would argue that it's perfectly reasonable to lie (according to your understanding of reality) if it reduces the number of infants dying and suffering. Kant, as far as I understand, would argue that lying is unacceptable, even in such clear-cut circumstances.

Perhaps a Kantian would say that the consequentialist is actually increasing suffering by playing along with and encouraging a system of belief they know to be false. They may reduce infant mortality in the near-term, but the culture might feel vindicated in their beliefs and proceed to kill more suspected "witches" to speed up the process of healing children.

Comment by Capybasilisk on Convince me that humanity *isn’t* doomed by AGI · 2022-04-16T15:11:38.452Z · LW · GW

I think we’ll encounter civilization-ending biological weapons well before we have to worry about superintelligent AGI:

https://www.nature.com/articles/s42256-022-00465-9

Comment by Capybasilisk on Convince me that humanity *isn’t* doomed by AGI · 2022-04-16T15:11:03.799Z · LW · GW
Comment by Capybasilisk on The Opposite Of Autism · 2022-03-27T17:31:42.432Z · LW · GW

My assumption is that, for people with ASD, modelling human minds that are as far from their own as possible is playing the game on hard-mode. Manage that, and modelling average humans becomes relatively simple.   

Comment by Capybasilisk on The Opposite Of Autism · 2022-03-27T17:27:19.840Z · LW · GW

Williams Syndrome seems to me to be the opposite of paranoia rather than of autism: the individual creates a fictional account of another human's mental state that's positive rather than negative.

That's to say, their ability to infer the mental states of other humans is worse than that of the typical human. 

Comment by Capybasilisk on Deriving Our World From Small Datasets · 2022-03-19T18:11:32.804Z · LW · GW

That’s the problem with Kolmogorov complexity: it is the shortest program given unlimited compute. And it spends any amount of compute for a shorter program

I don't see why it's assumed that we'd necessarily be searching for the most concise models rather than, say, optimizing for CPU cycles or memory consumption. I'm thinking of something like Charles Bennett's Logical Depth.

These types of approaches also take it for granted that we're conducting an exhaustive search of model-space, which yes, is ludicrous. Of course we'd burn through our limited compute trying to brute-force the space. There's plenty of room for improvement in a stochastic search of models which, while still expensive, at least has us in the realm of the physically possible. There might be something to be said for working primarily on the problem of probabilistic search in large, discrete spaces before we even turn to the problem of trying to model reality.
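
As a toy illustration of that last point (illustrative only; the token set, stack machine, and penalty weights are arbitrary), a stochastic hill-climb over a tiny program space can score candidates by fit plus a penalty mixing description length and runtime, rather than exhaustively enumerating the shortest program:

```python
# Toy illustration: stochastic search over a tiny stack-machine language,
# scoring candidate programs by fit to a target output plus a penalty that
# mixes description length and runtime (a crude nod to logical depth),
# rather than brute-forcing the space of shortest programs.
import random

TOKENS = ["1", "2", "+", "*", "dup"]

def run(program, max_steps=100):
    """Interpret a token list on a small stack machine; return (result, steps)."""
    stack, steps = [], 0
    for tok in program:
        steps += 1
        if steps > max_steps:
            return None, steps
        if tok.isdigit():
            stack.append(int(tok))
        elif tok == "dup" and stack:
            stack.append(stack[-1])
        elif tok in ("+", "*") and len(stack) >= 2:
            a, b = stack.pop(), stack.pop()
            stack.append(a + b if tok == "+" else a * b)
    return (stack[-1] if stack else None), steps

def score(program, target):
    result, steps = run(program)
    fit = 0 if result == target else abs((result if result is not None else 0) - target)
    return fit + 0.1 * len(program) + 0.01 * steps  # fit + length + compute

def stochastic_search(target, iters=5000):
    """Hill-climb by random point mutations and occasional extensions."""
    best = [random.choice(TOKENS) for _ in range(5)]
    for _ in range(iters):
        cand = best[:]
        cand[random.randrange(len(cand))] = random.choice(TOKENS)
        if random.random() < 0.3:
            cand.append(random.choice(TOKENS))
        if score(cand, target) < score(best, target):
            best = cand
    return best, run(best)[0]

print(stochastic_search(12))  # best program found and the value it computes
```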

(Standard Model equations + initial Big Bang conditions); that’s radical data efficiency,

Allow me to indulge in a bit of goal-post shifting.

A dataset like that gives us the entire Universe, i.e. Earth and a vast amount of stuff we probably don't care about. There might come a point where I care about the social habits of a particular species in the Whirlpool Galaxy, but right now I'm much more concerned about the human world. I'm far more interested in datasets that primarily give us our world, and through which the fundamental workings of the Universe can be surmised. That's why I nominated the VIX as a simple, human/Earth-centric dataset that perhaps holds a great amount of extractible information.

Comment by Capybasilisk on How Many (Smallish) Organic Compounds Are There? · 2022-02-20T13:03:15.217Z · LW · GW

Related:

This Chemical Does Not Exist.

(Refresh the page to load new ones)

Comment by Capybasilisk on 12 interesting things I learned studying the discovery of nature's laws · 2022-02-20T12:22:06.261Z · LW · GW

Going forward, I think discovery in the natural sciences will be entirely about automated searches in equation-space for models that fit datasets generated by real-world systems.

Why does one model work and not the other? Hopefully we'll know, most likely we won't. At any rate, the era of a human genius working these things out with pen and paper is pretty much over (Just consider the amount of combined intellectual power now needed to make incremental improvements. Major scientific papers these days will usually have a dozen+ names from several institutions).

Ultimately, this process will look like pointing a camera at the world in general and using the resulting raw bit stream to induce the fundamental program that runs the Universe.
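
A toy sketch of what such an equation-space search looks like in miniature (illustrative only: random expression trees fit to synthetic data standing in for real observations):

```python
# Toy illustration: random search in "equation space" -- small expression
# trees over x -- kept or discarded according to how well they fit a dataset.
# The data here is synthetic (y = 3x^2 + 2 plus noise), standing in for
# observations of a real-world system.
import random

OPS = {"+": lambda a, b: a + b,
       "-": lambda a, b: a - b,
       "*": lambda a, b: a * b}

def random_expr(depth=2):
    """Build a random expression tree over the variable x and small constants."""
    if depth == 0 or random.random() < 0.3:
        return "x" if random.random() < 0.5 else random.randint(1, 5)
    return (random.choice(list(OPS)), random_expr(depth - 1), random_expr(depth - 1))

def evaluate(expr, x):
    if expr == "x":
        return x
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    return OPS[op](evaluate(left, x), evaluate(right, x))

def fit_error(expr, data):
    return sum((evaluate(expr, x) - y) ** 2 for x, y in data)

data = [(x, 3 * x * x + 2 + random.gauss(0, 0.1)) for x in range(-5, 6)]
best = min((random_expr(3) for _ in range(20000)), key=lambda e: fit_error(e, data))
print(best, fit_error(best, data))
```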

Comment by Capybasilisk on 12 interesting things I learned studying the discovery of nature's laws · 2022-02-20T11:35:40.438Z · LW · GW

“no, we swear there’s going to be a Higgs boson, we just need to build a more powerful particle accelerator”

Particle physicists also made other confident predictions about the LHC that are not working out, and they're now asking for a bigger accelerator.

Survivorship bias might be at play, wherein we forget all the confident pronouncements that ended up being just plain wrong.

Comment by Capybasilisk on Consume fiction wisely · 2022-01-22T20:38:40.958Z · LW · GW

"How Much Are Games Like Factorio And EVE Online Sapping Away The Intellectual Potential Of Humanity?"

https://www.reddit.com/r/slatestarcodex/comments/ml00ac/how_much_are_games_like_factorio_and_eve_online/

Comment by Capybasilisk on Consume fiction wisely · 2022-01-22T20:34:08.628Z · LW · GW

For example, as a consumer of Hollywood movies, you see an aged businessman in a slick suit and automatically associate him with “evil capitalists”, even if the man is Chuck Feeney or Elon Musk. The reason is simple: you have learned to have such associations, by consuming the fiction written by people of certain political views, from Simpsons to Star Trek to pretty much every movie that depicts businessmen.

Many people have had enough interaction with businessmen in slick suits to independently form negative associations. No fiction needed.

Comment by Capybasilisk on Why Study Physics? · 2021-11-28T23:44:43.249Z · LW · GW

I thought the meme was that physicists think they can ride into town and make sweeping contributions with a mere glance at the problem, but reality doesn't pan out that way.

Relevant XKCD.

Comment by Capybasilisk on NVIDIA and Microsoft releases 530B parameter transformer model, Megatron-Turing NLG · 2021-10-12T08:50:06.502Z · LW · GW

Leo Gao thinks it might be possible to ride language models all the way to AGI (or something reasonably close):

https://bmk.sh/2021/06/02/Thoughts-on-the-Alignment-Implications-of-Scaling-Language-Models/

Comment by Capybasilisk on Signaling Virtuous Victimhood as Indicators of Dark Triad Personalities · 2021-08-27T11:56:57.453Z · LW · GW

Perhaps a followup study can investigate if trying to sneak culture war topics into ostensibly non-political spaces also maps to dark triad.

Comment by Capybasilisk on The Myth of the Myth of the Lone Genius · 2021-08-05T00:33:42.445Z · LW · GW
Comment by Capybasilisk on What is the strongest argument you know for antirealism? · 2021-05-13T03:40:36.437Z · LW · GW

I guess you could posit natural selection as being objective reality's value system, but I have the feeling that's not the kind of thing moral realists have in mind.