Posts

TED talk by Eliezer Yudkowsky: Unleashing the Power of Artificial Intelligence 2023-05-07T05:45:17.345Z
Bankless Podcast: 159 - We’re All Gonna Die with Eliezer Yudkowsky 2023-02-20T16:42:07.413Z
What's the Most Impressive Thing That GPT-4 Could Plausibly Do? 2022-08-26T15:34:51.675Z

Comments

Comment by bayesed on Intelligence Officials Say U.S. Has Retrieved Craft of Non-Human Origin · 2023-06-06T16:57:09.324Z · LW · GW

But if that's the case, he could simply mention the amount he's willing to bet. The phrasing kinda suggested to me that he doesn't have all the info needed to do the Kelly calculation yet.

Comment by bayesed on Intelligence Officials Say U.S. Has Retrieved Craft of Non-Human Origin · 2023-06-06T16:36:14.408Z · LW · GW

What do you mean when you say you're "willing to bet according to the Kelly Criterion"? If you're proposing a bet with 99% odds and your actual belief that you'll win the bet is also 99%, then the Kelly criterion would advise staking nothing (the EV of such a bet is zero, i.e. you have no edge).

Perhaps you mean that the other person should come up with the odds, and then you'll determine your bet amount using the Kelly criterion, assuming a 99% probability of winning for yourself.
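
For reference, here's a minimal sketch of the Kelly calculation being discussed, using the standard two-outcome formula and made-up numbers:

```python
def kelly_fraction(p, b):
    """Kelly fraction f* = p - (1 - p) / b, where p is your win probability and b is
    the net odds received per unit staked; f* <= 0 means no edge, so stake nothing."""
    return p - (1 - p) / b

# Betting the 99% side at fair 99:1 odds: you risk 99 units to win 1, so b = 1/99.
print(kelly_fraction(p=0.99, b=1 / 99))  # ~0.0 -> the odds exactly match your belief, no edge
```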

Comment by bayesed on TED talk by Eliezer Yudkowsky: Unleashing the Power of Artificial Intelligence · 2023-05-07T08:52:37.095Z · LW · GW

Yeah, based on EY's previous tweets regarding this, it seemed like it was supposed to be a TED talk.

Comment by bayesed on Hell is Game Theory Folk Theorems · 2023-05-01T18:30:13.384Z · LW · GW

Example origin scenario of this Nash equilibrium from GPT-4:

In this hypothetical scenario, let's imagine that the prisoners are all part of a research experiment on group dynamics and cooperation. Prisoners come from different factions that have a history of rivalry and distrust. 

Initially, each prisoner sets their dial to 30 degrees Celsius, creating a comfortable environment. However, due to the existing distrust and rivalry, some prisoners suspect that deviations from the norm—whether upward or downward—could be a secret signal from one faction to another, possibly indicating an alliance, a plan to escape, or a strategy to undermine the other factions.

To prevent any secret communication or perceived advantage, the prisoners agree that any deviation from the agreed-upon temperature of 30 degrees Celsius should be punished with a higher temperature of 100 degrees Celsius in the next round. They believe that this punishment will deter any secret signaling and maintain a sense of fairness among the factions.

Now, imagine an external party with access to the temperature control system decides to interfere with the experiment. This person disables the dials and changes the temperature to 99 degrees Celsius for a few rounds, heightening the prisoners' confusion and distrust.

When the external party re-enables the dials, the prisoners regain control of the temperature. However, their trust has been severely damaged, and they are now unsure of each other's intentions. In an attempt to maintain fairness and prevent further perceived manipulation, they decide to adopt a strategy of punishing any deviations from the new 99 degrees Celsius temperature.

As a result, the prisoners become trapped in a suboptimal Nash equilibrium where no single prisoner has an incentive to deviate from the 99 degrees Celsius strategy, fearing retaliation in the form of higher temperatures. In this scenario, a combination of technical glitches, external interference, miscommunication, and distrust leads to the transition from an agreed-upon temperature of 30 to a suboptimal Nash equilibrium of 99 degrees Celsius.

As time goes on, this strategy becomes ingrained, and the prisoners collectively settle on an equilibrium temperature of 99 degrees Celsius. They continue to punish any deviations—upward or downward—due to their entrenched suspicion and fear of secret communication or potential advantage for one faction over another. In this situation, the fear of conspiracy and the desire to maintain fairness among the factions lead to a suboptimal Nash equilibrium where no single prisoner has an incentive to deviate from the 99 degrees Celsius strategy.

Comment by bayesed on LLMs and computation complexity · 2023-04-29T18:08:40.868Z · LW · GW

Hmm, I've not seen people refer to (ChatGPT + Code execution plugin) as an LLM. IMO, an LLM is supposed to be a language model consisting of just a neural network with a large number of parameters.

Comment by bayesed on LLMs and computation complexity · 2023-04-29T06:50:52.910Z · LW · GW

I'm a bit confused about this post. Are you saying it is theoretically impossible to create an LLM that can do 3*3 matrix multiplication without using chain of thought? That seems false. 

The amount of computation an LLM has done so far is a function of both the size of the LLM (call it the s factor) and the number of tokens generated so far. Let's say matrix multiplication of n*n matrices requires cn^3 amount of computation (there are more efficient algorithms, but that doesn't matter here).

You can do this either by using a small LLM and generating n^3 tokens, so that s*n^3 > cn^3, or by using a bigger LLM, so that s_big*n > cn^3, in which case you only need n tokens.

In general, you can always get a bigger and bigger constant factor to solve problems with higher n.
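
To make the trade-off concrete, here's a minimal sketch with illustrative constants (nothing here is measured from a real model):

```python
import math

# Multiplying two n x n matrices takes roughly c * n**3 operations (c = 1 for simplicity).
# An LLM spends a fixed compute budget per generated token; call that budget s.

def tokens_needed(n, s, c=1.0):
    """Tokens the model must generate so that total work s * tokens covers c * n**3."""
    return math.ceil(c * n ** 3 / s)

n = 100
print(tokens_needed(n, s=1))       # tiny per-token budget: 1,000,000 chain-of-thought tokens (~n^3)
print(tokens_needed(n, s=n ** 2))  # per-token budget ~c*n^2: 100 tokens (~n), i.e. roughly one shot
```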

If your claim is that, for any LLM that works in the same way as GPT, there will exist a value of n for which it will stop being capable of doing n*n matrix multiplication without chain of thought/extra work, I'd cautiously agree. 

An LLM takes the same amount of computation for each generated token, regardless of how hard it is to predict. This limits the complexity of any problem an LLM is trying to solve.

For a given LLM, yes, there will be a limit on the amount of computation it can do while generating a token. But this amount can be arbitrarily large. And you can always make a bigger/smarter LLM.

Consider two statements:

  1. "The richest country in North America is the United States of ______"
  2. "The SHA1 of 'abc123', iterated 500 times, is _______"

An LLM's goal is to predict the best token to fill in the blank given its training and the previous context. Completing statement 1 requires knowledge about the world but is computationally trivial. Statement 2 requires a lot of computation. Regardless, the LLM performs the same amount of work for either statement.

Well, it might do the same amount of "computation", but for statement 1 that's mostly filler work, while for statement 2 it can be useful work. You can always use more compute than necessary, so why does statement 1 using the same amount of compute as statement 2 imply any sort of limitation on the LLM?

It cannot correctly solve computationally hard statements like #2. Period. If it could, that would imply that all problems can be solved in constant time, which is provably (and obviously) false.

Why does this matter? It puts some bounds on what an LLM can do.

But it's easy to imagine a huge LLM capable of doing 500 iterations of SHA1 on small strings in one shot (even without memorization). Why do you think that's impossible? (Just imagine a transformer circuit calculating SHA1, repeated 500 times.) This doesn't imply that all problems can be solved in constant time. It just means that the LLM will only be able to do this until the length of the string exceeds a certain limit. After that, you'll need to make the LLM bigger/smarter.
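
For concreteness, the computation in statement 2 is something like the following (assuming "iterated" means re-hashing the hex digest each round):

```python
import hashlib

# Iterate SHA1 500 times, feeding each round's hex digest back in as the next input.
value = "abc123"
for _ in range(500):
    value = hashlib.sha1(value.encode()).hexdigest()
print(value)  # the blank the LLM would have to fill in
```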

Comment by bayesed on GPT-4 Plugs In · 2023-03-28T03:34:04.189Z · LW · GW

I don't think GPT-4 can be used with plugins in ChatGPT. The plugin model seems to be a different one, probably based on GPT-3.5 (evidence: the icon color is green, not black; it seems faster than GPT-4; there are no usage limits or quotas; and there's no explicit mention of GPT-4 anywhere in the announcement).

So I think there's a good chance the title is wrong.

Comment by bayesed on How well did Manifold predict GPT-4? · 2023-03-16T15:57:20.368Z · LW · GW

Additional comments on creative mode by Mikhail (from today):

https://twitter.com/MParakhin/status/1636350828431785984

We will {...increase the speed of creative mode...}, but it will probably always be somewhat slower, by definition: it generates longer responses, has larger context.

https://twitter.com/MParakhin/status/1636352229627121665

Our current thinking is to keep maximum quality in Creative, which means slower speed.

https://twitter.com/MParakhin/status/1636356215771938817

Our current thinking about Bing Chat modes:
Balanced: best for the most common tasks, like search, maximum speed
Creative: whenever you need to generate new content, longer output, more expressive, slower
Precise: most factual, minimizing conjectures

So creative mode definitely has a larger context size, and might also be a larger model?

Comment by bayesed on How well did Manifold predict GPT-4? · 2023-03-16T04:27:41.448Z · LW · GW

Based on Mikhail's Twitter comments, 'precise' and 'creative' don't seem to be too much more than simply the 'temperature' hyperparameter for sampling. 'Precise' would presumably correspond to very low, near-zero or zero, highly deterministic samples.

Nope, Mikhail has said the opposite: https://twitter.com/MParakhin/status/1630280976562819072

Nope, the temperature is (roughly) the same.

So I'd guess the main difference is in the prompt.
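
For context, a minimal sketch of what the temperature knob does at sampling time (illustrative logits, nothing specific to Bing):

```python
import numpy as np

def sampling_distribution(logits, temperature):
    """Temperature-scaled softmax: low temperature concentrates probability on the top token."""
    z = np.array(logits, dtype=float) / max(temperature, 1e-8)
    e = np.exp(z - z.max())
    return e / e.sum()

logits = [2.0, 1.0, 0.5]
print(sampling_distribution(logits, temperature=1.0))  # fairly spread out
print(sampling_distribution(logits, temperature=0.1))  # nearly deterministic argmax
```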

Comment by bayesed on AI #1: Sydney and Bing · 2023-02-22T17:57:26.432Z · LW · GW

https://grabbyaliens.com/

Comment by bayesed on AI #1: Sydney and Bing · 2023-02-22T04:44:49.371Z · LW · GW

I think it's more of a correction than a misunderstanding. It shouldn't be assumed that "value" just means human civilization and its potential. Most people reading this post will take "wiping out all value" to mean wiping out everything we value, not just wiping out humanity. But that's very unlikely to be literally true: most people value life and sentience in general, so a universe where all alien civilizations also end up dying due to our ASI is far worse than one where there are survivors.

Comment by bayesed on AI #1: Sydney and Bing · 2023-02-21T18:23:56.971Z · LW · GW

Minor (?) correction: You've mentioned multiple times that our ASI will wipe out all value in the universe, but that's very unlikely to happen. We won't be the only (or the first) civilization to have created ASI, so eventually our ASI will run into other rogue/aligned ASIs and be forced to negotiate.

Relevant EY tweets: https://twitter.com/ESYudkowsky/status/1558974831269273600

People who value life and sentience, and think sanely, know that the future galaxies are the real value at risk.

...

Yes, I mean that I expect AGI ruin to wipe out all galaxies in its future lightcone until it runs into defended alien borders a billion years later.

Comment by bayesed on Bankless Podcast: 159 - We’re All Gonna Die with Eliezer Yudkowsky · 2023-02-21T16:40:38.181Z · LW · GW

That part was a bit unclear. I guess he could work with Redwood/Conjecture without necessarily quitting his MIRI position?

Comment by bayesed on Is the AI timeline too short to have children? · 2022-12-15T05:32:47.360Z · LW · GW

Suppose you lived in the dark times, where children have a <50% chance of living to adulthood. Wouldn't you still have kids? Even if probabilistically smallpox was likely to take them?

Just wanna add that each of your children individually having a 50% chance of survival due to smallpox is different from all of your children together having a 50% chance of survival due to AI (i.e. uncorrelated vs. correlated risks), so some people might decide differently in these two cases.
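
A minimal sketch of that difference, with three children and a made-up 50% survival chance each:

```python
p_survive = 0.5
n_children = 3

# Uncorrelated risks (smallpox): each child survives or dies independently.
p_at_least_one_uncorrelated = 1 - (1 - p_survive) ** n_children
print(p_at_least_one_uncorrelated)  # 0.875

# Fully correlated risk (a single AI outcome for everyone): either all survive or none do.
p_at_least_one_correlated = p_survive
print(p_at_least_one_correlated)    # 0.5
```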

Comment by bayesed on Can someone explain to me why most researchers think alignment is probably something that is humanly tractable? · 2022-09-03T06:16:37.448Z · LW · GW

AFAIK, it is not necessary to "accurately reverse engineer human values and also accurately encode them". That's considered too hard and, as you say, not tractable anytime soon. Further, even if you're able to do that, you've only solved outer alignment; inner alignment still remains unsolved.

Instead, the aim is to build "corrigible" AIs. See Let's See You Write That Corrigibility Tag, Corrigibility (Arbital),  Hard problem of corrigibility (Arbital).

Quoting from the last link:


The "hard problem of corrigibility" is to build an agent which, in an intuitive sense, reasons internally as if from the programmers' external perspective. We think the AI is incomplete, that we might have made mistakes in building it, that we might want to correct it, and that it would be e.g. dangerous for the AI to take large actions or high-impact actions or do weird new things without asking first. 

We would ideally want the agent to see itself in exactly this way, behaving as if it were thinking, "I am incomplete and there is an outside force trying to complete me, my design may contain errors and there is an outside force that wants to correct them and this a good thing, my expected utility calculations suggesting that this action has super-high utility may be dangerously mistaken and I should run them past the outside force; I think I've done this calculation showing the expected result of the outside force correcting me, but maybe I'm mistaken about that."

Also, most if not all researchers think alignment is a solvable problem, but many think we may not have enough time. 

Comment by bayesed on Vingean Agency · 2022-08-25T07:23:33.823Z · LW · GW

Interesting example, but I still feel like Bob doesn't need to contradict Alice's known beliefs.

If Bob found a page from Alice's notebook that said "time and space are relative," he could update his understanding to realize that the theory of Newtonian physics he's been using is only an approximation, and not the real physics of the universe. Then, he could try to come up with upper bounds on how inaccurate Newtonian physics is, by thinking about his past experiences or doing new experiments. Even so, he could still keep using Newtonian physics, with the understanding that it's approximate, and may not always predict things correctly, without contradicting Alice's known beliefs, or separating Alice's beliefs from his own.

I might be missing the point though. Are you implying that, in this example, Bob is not smart enough to realize that Newtonian physics is only an approximation after learning about Alice's beliefs?