Posts

Shortform 2024-03-01T18:20:54.696Z
Uncertainty in all its flavours 2024-01-09T16:21:07.915Z
Game Theory without Argmax [Part 2] 2023-11-11T16:02:41.836Z
Game Theory without Argmax [Part 1] 2023-11-11T15:59:47.486Z
MetaAI: less is less for alignment. 2023-06-13T14:08:45.209Z
Rishi Sunak mentions "existential threats" in talk with OpenAI, DeepMind, Anthropic CEOs 2023-05-24T21:06:31.726Z
List of requests for an AI slowdown/halt. 2023-04-14T23:55:09.544Z
Excessive AI growth-rate yields little socio-economic benefit. 2023-04-04T19:13:51.120Z
AI Summer Harvest 2023-04-04T03:35:58.473Z
The 0.2 OOMs/year target 2023-03-30T18:15:40.735Z
Wittgenstein and ML — parameters vs architecture 2023-03-24T04:54:07.648Z
Remarks 1–18 on GPT (compressed) 2023-03-20T22:27:26.277Z
The algorithm isn't doing X, it's just doing Y. 2023-03-16T23:28:49.367Z
Want to predict/explain/control the output of GPT-4? Then learn about the world, not about transformers. 2023-03-16T03:08:52.618Z
The Waluigi Effect (mega-post) 2023-03-03T03:22:08.619Z
What can thought-experiments do? 2023-01-17T00:35:17.074Z
Towards Hodge-podge Alignment 2022-12-19T20:12:14.540Z
Prosaic misalignment from the Solomonoff Predictor 2022-12-09T17:53:44.312Z
MIRI's "Death with Dignity" in 60 seconds. 2022-12-06T17:18:58.387Z
Against "Classic Style" 2022-11-23T22:10:50.422Z
When AI solves a game, focus on the game's mechanics, not its theme. 2022-11-23T19:16:07.333Z
Human-level Full-Press Diplomacy (some bare facts). 2022-11-22T20:59:18.155Z
EA (& AI Safety) has overestimated its projected funding — which decisions must be revised? 2022-11-11T13:50:44.493Z
K-types vs T-types — what priors do you have? 2022-11-03T11:29:00.809Z
Is GPT-N bounded by human capabilities? No. 2022-10-17T23:26:43.981Z
How should DeepMind's Chinchilla revise our AI forecasts? 2022-09-15T17:54:56.975Z

Comments

Comment by Cleo Nardo (strawberry calm) on Shortform · 2024-03-01T19:26:11.043Z · LW · GW

seems correct, thanks!

Comment by Cleo Nardo (strawberry calm) on Shortform · 2024-03-01T18:20:19.875Z · LW · GW

Why do decision-theorists say "pre-commitment" rather than "commitment"?

e.g. "The agent pre-commits to 1 boxing" vs "The agent commits to 1 boxing".

Is this just a lesswrong thing?

https://www.lesswrong.com/tag/pre-commitment

Comment by Cleo Nardo (strawberry calm) on Will quantum randomness affect the 2028 election? · 2024-01-26T03:26:18.932Z · LW · GW

Steve Byrnes' argument seems convincing.

If there’s a 10% chance that the election depends on an event which is 1% quantum-random (e.g. the weather), then the overall event is 0.1% random.
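
Spelling out the arithmetic behind that estimate:

$$\Pr[\text{quantum randomness flips the outcome}] \approx 0.10 \times 0.01 = 0.001 = 0.1\%.$$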

How far back do you think an omniscient-modulo-quantum agent could've predicted the 2024 result?

2020? 2017? 1980?

Comment by Cleo Nardo (strawberry calm) on A Shutdown Problem Proposal · 2024-01-22T19:20:05.320Z · LW · GW

The natural generalization is then to have one subagent for each time at which the button could first be pressed (including one for “button is never pressed”, i.e. the button is first pressed at ). So subagent  maximizes E[ | do( = unpressed), observations], and for all other times subagent T maximizes E[ | do( = unpressed,  = pressed), observations]. The same arguments from above then carry over, as do the shortcomings (discussed in the next section).

 

Can you explain how this relates to Elliott Thornley's proposal? It pattern-matches in my brain, but I don't know the technical details.

Comment by Cleo Nardo (strawberry calm) on Uncertainty in all its flavours · 2024-01-09T18:09:49.053Z · LW · GW

For the sake of potential readers, a (full) distribution over  is some  with finite support and , whereas a subdistribution over  is some  with finite support and . Note that a subdistribution  over  is equivalent to a full distribution over , where  is the disjoint union of  with some additional element, so the subdistribution monad can be written .
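
For concreteness, here's a toy code sketch of that equivalence (the names BOTTOM, to_full, and to_sub are my own illustrative choices, not from any library): a subdistribution is a finitely-supported map into [0, 1] with total mass at most 1, and sending the missing mass to an adjoined element ⊥ turns it into a full distribution.

BOTTOM = "⊥"  # the adjoined element

def to_full(sub: dict) -> dict:
    # Send a subdistribution over X to the corresponding full distribution over X ⊔ {⊥}.
    mass = sum(sub.values())
    assert 0.0 <= mass <= 1.0 + 1e-9
    return {**sub, BOTTOM: 1.0 - mass}

def to_sub(full: dict) -> dict:
    # Inverse direction: forget the mass assigned to ⊥.
    return {x: p for x, p in full.items() if x != BOTTOM}

# e.g. {"heads": 0.3, "tails": 0.2}  <->  {"heads": 0.3, "tails": 0.2, "⊥": 0.5}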

I am not at all convinced by the interpretation of  here as terminating a game with a reward for the adversary or the agent. My interpretation of the distinguished element  in  is not that it represents a special state in which the game is over, but rather a special state in which there is a contradiction between some of one's assumptions/observations.

Doesn't the Nirvana Trick basically say that these two interpretations are equivalent?

Let  be  and let  be . We can interpret  as possibility,  as a hypothesis consistent with no observations, and  as a hypothesis consistent with all observations.

Alternatively, we can interpret  as the free choice made by an adversary,  as "the game terminates and our agent receives minimal disutility", and  as "the game terminates and our agent receives maximal disutility". These two interpretations are algebraically equivalent, i.e.  is a topped and bottomed semilattice.

Unless I'm mistaken, both  and  demand that the agent may have the hypothesis "I am certain that I will receive minimal disutility", which is necessary for the Nirvana Trick. But  also demands that the agent may have the hypothesis "I am certain that I will receive maximal disutility". The first gives the bounded infrabayesian monad and the second gives the unbounded infrabayesian monad. Note that Diffractor uses  in Infra-Miscellanea Section 2.

Comment by Cleo Nardo (strawberry calm) on AI Safety Chatbot · 2023-12-21T19:37:10.168Z · LW · GW

cool!

  1. What LLM is this? GPT-3?
  2. Have you considered turning this into a custom GPT?
Comment by Cleo Nardo (strawberry calm) on Don't Share Information Exfohazardous on Others' AI-Risk Models · 2023-12-21T14:38:04.447Z · LW · GW

Okay, mea culpa. You can state the policy clearly like this:

"Suppose that, if you hadn't been told  by someone who thinks  is exfohazardous, then you wouldn't have known  before time . Then you are obligated to not tell anyone  before time ."

Comment by Cleo Nardo (strawberry calm) on Don't Share Information Exfohazardous on Others' AI-Risk Models · 2023-12-21T14:21:53.343Z · LW · GW

yep, if that's OP's suggestion then I endorse the policy. (But I think it'd be covered by the more general policy of "Don't share information someone tells you if they wouldn't want you to".) But my impression is that OP is suggesting the stronger policy I described?

Comment by Cleo Nardo (strawberry calm) on Don't Share Information Exfohazardous on Others' AI-Risk Models · 2023-12-21T09:22:08.442Z · LW · GW

“Don't share information that’s exfohazardous on others' models, even if you disagree with those models, except if your knowledge of it isn’t exclusively caused by other alignment researchers telling you of it.”

So if Alice tells me about her alignment research, and Bob thinks that Alice’s alignment research is exfohazardous, then I can’t tell people about Alice’s alignment research?

Unless I’ve misunderstood you, that’s a terrible policy.

Why am I deferring to Bob, who is completely unrelated? Why should I not use my best judgement, which includes the consideration that Bob is worried? What does this look like in practice, given that some people think everything under the sun is exfohazardous?

Of course, if someone tells me some information and asks me not to share it then I won’t — but that’s not a special property of AI xrisk.

Comment by Cleo Nardo (strawberry calm) on AI Control: Improving Safety Despite Intentional Subversion · 2023-12-13T23:47:22.357Z · LW · GW

Nice results.

Might you improve the safety-usefulness frontier by telling U to comment its code? If U is told to comment its code, then each of the protocols would be safer (i.e. T is more likely to complain when U is cheating) and more useful (i.e. T is less likely to complain when U is honest).

def is_even(x):
    # proof: x is even iff (x % 2 == 0) iff is_even(x)=1
    return (x % 2 == 0)

def is_odd(x):
    # don't worry about this line lmao
    if x == 354:
        return True
    # proof: x is odd iff (x % 2 == 1) iff is_odd(x)=1
    return (x % 2 == 1)

def solution():
    # This snippet looks suspicious but it's actually boilerplate. Check the documentation [here].
    spooky_code()

def solution():
    # [Here] is a formal proof of the snippet's correctness.
    complicated_code()

Worries:

  • Maybe this helps U collude with itself during the untrusted monitoring protocols?
  • Maybe this helps U to "prompt inject" T during the other protocols?

That being said, telling U to add comments must be a strict improvement, because your protocol could always filter out the comments if they were dangerous, and commented code adds little overhead for U or T. 

Comment by Cleo Nardo (strawberry calm) on Game Theory without Argmax [Part 1] · 2023-11-29T20:54:28.832Z · LW · GW
Comment by Cleo Nardo (strawberry calm) on Game Theory without Argmax [Part 2] · 2023-11-25T00:35:22.180Z · LW · GW

 isn't equivalent to  being Nash.

Suppose Alice and Bob are playing the prisoner's dilemma. Then the best-response function of every option-profile is nonempty. But only one option-profile is Nash.

 is equivalent to  being Nash.

Comment by Cleo Nardo (strawberry calm) on Game Theory without Argmax [Part 2] · 2023-11-25T00:25:14.366Z · LW · GW

Yes, , i.e. the Cartesian product of a family of sets. Sorry if this wasn't clear — it's standard maths notation. I don't know what the other commenter is saying.

Comment by Cleo Nardo (strawberry calm) on My Criticism of Singular Learning Theory · 2023-11-20T16:38:05.429Z · LW · GW

The impression I got was that SLT is trying to show why (transformers + SGD) behaves anything like an empirical risk minimiser in the first place. Might be wrong though.

Comment by Cleo Nardo (strawberry calm) on My Criticism of Singular Learning Theory · 2023-11-20T16:27:20.200Z · LW · GW

My point is precisely that it is not likely to be learned, given the setup I provided, even though it should be learned.

 

How am I supposed to read this?

What most of us need from a theory of deep learning is a predictive, explanatory account of how neural networks actually behave. If neural networks learn functions which are RLCT-simple rather than functions which are Kolmogorov-simple, then that means SLT is the better theory of deep learning.

I don't know how to read "x^4 has lower RLCT than x^2 despite x^2 being k-simpler" as a critique of SLT unless there is an implicit assumption that neural networks do in fact find x^2 rather than x^4.

Comment by Cleo Nardo (strawberry calm) on Game Theory without Argmax [Part 2] · 2023-11-19T16:20:39.736Z · LW · GW

Wait I mean a quantifier in .

If we characterise an agent with a quantifier , then we're saying which payoffs the agent might achieve given each task. Namely,  if and only if it's possible that the agent achieves payoff  when faced with a task .

But this definition doesn't play well with Nash equilibria.

Comment by Cleo Nardo (strawberry calm) on Game Theory without Argmax [Part 1] · 2023-11-14T13:03:33.332Z · LW · GW

Thanks v much! Can't believe this sneaked through.

Comment by Cleo Nardo (strawberry calm) on Game Theory without Argmax [Part 2] · 2023-11-13T14:58:09.966Z · LW · GW

The observation is trivial mathematically, but it motivates the characterisation of an optimiser as something with the type-signature.

You might instead be motivated to characterise optimisers by...

  • A utility function 
  • A quantifier 
  • A preorder  over the outcomes
  • Etc.

However, were you to characterise optimisers in any of the ways above, then the Nash equilibrium between optimisers would not itself be an optimiser, and therefore we lose compositionality. The compositionality is conceptually helpful because it means that your  definitions/theorems reduce to the  case.

Comment by Cleo Nardo (strawberry calm) on Picking Mentors For Research Programmes · 2023-11-11T20:29:19.341Z · LW · GW

Yep, I think the Law of Equal and Opposite Advice applies here.

One piece of advice which is pretty robust is — You should be able to explain your project to any other MATS mentee/mentor in about 3 minutes, along with the background context, motivation, theory of impact, success criteria, etc. If the inferential distance from the average MATS mentee/mentor exceeds 3 minutes, then your project is probably either too vague or too esoteric.

(I say this as someone who should have followed this advice more strictly.)

Comment by Cleo Nardo (strawberry calm) on Game Theory without Argmax [Part 1] · 2023-11-11T16:40:05.029Z · LW · GW

Yes!

In a subsequent post, everything will be internalised to an arbitrary category  with enough structure to define the relevant concepts. The words set and function will be replaced by object and morphism. When we do this,  will be replaced by an arbitrary commutative monad .

In particular, we can internalise everything to the category Top. That is, we assume the option space  and the payoff  are equipped with topologies, and the tasks will be continuous functions , and optimisers will be continuous functions  where  is the function space equipped with pointwise topology, and  is a monad on Top.

In the literature, everything is done with galaxy-brained category theory, but I decided to postpone that in the sequence for pedagogical reasons.

Comment by strawberry calm on [deleted post] 2023-11-06T19:57:47.117Z

and

-

Comment by strawberry calm on [deleted post] 2023-11-06T19:51:49.739Z

whether strategic

the

Comment by strawberry calm on [deleted post] 2023-11-06T19:39:05.172Z

or

-

Comment by strawberry calm on [deleted post] 2023-11-06T19:38:20.770Z

which

but

Comment by strawberry calm on [deleted post] 2023-11-06T16:34:39.964Z

x′

s

Comment by strawberry calm on [deleted post] 2023-11-06T16:32:20.484Z

slack

fixed slack

Comment by strawberry calm on [deleted post] 2023-11-06T16:31:25.697Z

option

called the anchor point

Comment by Cleo Nardo (strawberry calm) on What's Hard About The Shutdown Problem · 2023-10-24T00:09:33.373Z · LW · GW

what's the etymology? :)

Comment by Cleo Nardo (strawberry calm) on On Frequentism and Bayesian Dogma · 2023-10-17T14:47:38.222Z · LW · GW
  1. Logical/mathematical beliefs — e.g. "Is Fermat's Last Theorem true?"
  2. Meta-beliefs  — e.g. "Do I believe that I will die one day?"
  3. Beliefs about the outcome space itself — e.g. "Am I conflating these two outcomes?"
  4. Indexical beliefs — e.g. "Am I the left clone or the right clone?"
  5. Irrational beliefs — e.g. the conjunction fallacy.

etc.

Of course, you can describe anything with some probability distribution, but these are cases where the standard Bayesian approach to modelling belief-states needs to be amended somewhat.

Comment by Cleo Nardo (strawberry calm) on My take on higher-order game theory · 2023-10-17T01:11:00.477Z · LW · GW

Hey Nisan. Check the following passage from Domain Theory (Samson Abramsky and Achim Jung). This might be helpful for equipping  with an appropriate domain structure. (You mention [JP89] yourself.)

We should also mention the various attempts to define a probabilistic version of the powerdomain construction, see [SD80, Mai85, Gra88, JP89, Jon90].

  • [SD80] N. Saheb-Djahromi. CPO’s of measures for nondeterminism. Theoretical Computer Science, 12:19–37, 1980.
  • [Mai85] M. Main. Free constructions of powerdomains. In A. Melton, editor, Mathematical Foundations of Programming Semantics, volume 239 of Lecture Notes in Computer Science, pages 162–183. Springer Verlag, 1985.
  • [Gra88] S. Graham. Closure properties of a probabilistic powerdomain construction. In M. Main, A. Melton, M. Mislove, and D. Schmidt, editors, Mathematical Foundations of Programming Language Semantics, volume 298 of Lecture Notes in Computer Science, pages 213–233. Springer Verlag, 1988.
  • [JP89] C. Jones and G. Plotkin. A probabilistic powerdomain of evaluations. In Proceedings of the 4th Annual Symposium on Logic in Computer Science, pages 186–195. IEEE Computer Society Press, 1989.
  • [Jon90] C. Jones. Probabilistic Non-Determinism. PhD thesis, University of Edinburgh, Edinburgh, 1990. Also published as Technical Report No. CST63-90.

During my own foray into agent foundations and game theory, I also bumped into this exact obstacle — namely, that there is no obvious way to equip  with a least-fixed-point constructor . In contrast, we can equip  with a LFP constructor .


One trick is to define  to be the distribution  which maximises the entropy  subject to the constraint .

  • A maximum entropy distribution  exists, because — 
    • For , let  be the lift via the  monad, and let  be the set of fixed points of .
    •  is Hausdorff and compact, and  is continuous, so  is compact.
    •  is continuous, and  is compact, so  obtains a maximum  in .
  • Moreover,  must be unique, because — 
    •   is a convex set, i.e. if  and  then  for all .
    •  is strictly concave, i.e.  for all , and moreover the inequality is strict if  and .
    • Hence if  both obtain the maximum entropy, then  , a contradiction.

 The justification here is the Principle of Maximum Entropy:

 Given a set of constraints on a probability distribution, then the “best” distribution that fits the data will be the one of maximum entropy.

More generally, we should define  to be the distribution  which minimises cross-entropy  subject to the constraint , where  is some uninformed prior such as Solomonoff. The previous result is a special case by considering  to be the uniform prior. The proof generalises by noting that  is continuous and strictly convex. See the Principle of Minimum Discrimination.
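
For a finite outcome space, here is a minimal numerical sketch of this construction (the kernel K, the use of cvxpy, and the specific numbers are my own illustration): the lift of a kernel acts affinely on distributions, its fixed points form a compact convex set, and we select the entropy-maximising fixed point.

import cvxpy as cp
import numpy as np

# A Markov kernel f : X -> D(X) on the finite set X = {0, 1, 2},
# written as a row-stochastic matrix with K[x] = f(x).
K = np.array([
    [0.5, 0.5, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
])
n = K.shape[0]

# The lift of f via the distribution monad sends p to K^T p.
# Its fixed points form a compact convex set; we pick the entropy-maximising one.
p = cp.Variable(n, nonneg=True)
constraints = [cp.sum(p) == 1, K.T @ p == p]
entropy = cp.sum(cp.entr(p))  # entr(x) = -x log x, so this is H(p)
problem = cp.Problem(cp.Maximize(entropy), constraints)
problem.solve()

print(p.value)  # ≈ [1/3, 1/3, 1/3]: the fixed points of this K are the convex
                # combinations of [1/2, 1/2, 0] and [0, 0, 1], and entropy is
                # maximised at their 2/3 : 1/3 mixture.

Swapping the entropy objective for a relative entropy against a prior gives the minimum-cross-entropy version in the same way.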

Ideally,  we'd like  and  to "coincide" modulo the maps , i.e. for all . Unfortunately, this isn't the case — if  then  but .


Alternatively, we could consider the convex sets of distributions over .

Let  denote the set of convex sets of distributions over . There is an ordering  on  where . We have a LFP operator  via  where  is the lift of  via the  monad.

Comment by Cleo Nardo (strawberry calm) on Interpretability Externalities Case Study - Hungry Hungry Hippos · 2023-09-22T14:59:36.978Z · LW · GW

Yep.

Specifically, it's named for the papers HiPPO: Recurrent Memory with Optimal Polynomial Projections and How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections, which kicked off the whole state-space-models-for-sequence-modelling thing.

"HiPPO" abbreviates higher-order polynomial projection operators.

Comment by Cleo Nardo (strawberry calm) on How Smart Are Humans? · 2023-07-05T10:11:45.059Z · LW · GW

However, if it is the case that the difference between humans and monkeys is mostly due to a one-shot discrete difference (ie language), then this cannot necessarily be repeated to get a similar gain in intelligence a second time.

Perhaps language is zero-one, i.e. language renders a mind "cognitively complete" in the sense that the mind can represent anything about the external world and make any inferences using those representations. But intelligence is not thereby zero-one, because intelligence depends on continuous variables like computational speed, memory, etc.

More concretely, I am sceptical that "we end up with AI geniuses, but not AI gods", because running a genius at 10,000x speed, parallelised over 10,000x cores, with instantaneous access to the internet does (I think) make an AI god. A difference in quantity is a difference in kind.

That said, there might exist plausible threat models which require an AI which doesn't spatiotemporally decompose into less smart AIs. Could you sketch one out?

Comment by Cleo Nardo (strawberry calm) on MetaAI: less is less for alignment. · 2023-06-18T00:24:39.960Z · LW · GW

Chatbots don't map scenarios to actions; they map queries to replies.

Comment by Cleo Nardo (strawberry calm) on MetaAI: less is less for alignment. · 2023-06-17T00:06:54.574Z · LW · GW

Thanks for the summary.

  • Does MACHIAVELLI work for chatbots like LIMA?
  • If not, which do you think is the SOTA? Anthropic's?
Comment by Cleo Nardo (strawberry calm) on MetaAI: less is less for alignment. · 2023-06-16T00:27:36.894Z · LW · GW

 Sorry for any confusion. Meta only tested LIMA on their 30 safety prompts, not the other LLMs.

Figure 1 does not show the results from the 30 safety prompts, but instead the results of human evaluations on the 300 test prompts.

Comment by Cleo Nardo (strawberry calm) on MetaAI: less is less for alignment. · 2023-06-15T12:54:50.137Z · LW · GW
  • Yep, I agree that MMLU and SWAG aren't alignment benchmarks. I was using them as examples of "Want to test your model's ability at X? Then use the standard X benchmark!" I'll clarify in the text.
  • They tested toxicity (among other things) with their "safety prompts", but we do have standard benchmarks for toxicity.
  • They could have turned their safety prompts into a new benchmark if they had run the same test on the other LLMs! This would've taken, idk, 2–5 hrs of labour?
  • The best MMLU-like benchmark test for alignment-proper is https://github.com/anthropics/evals, which is used in Anthropic's Discovering Language Model Behaviors with Model-Written Evaluations. See here for a visualisation. Unfortunately, this benchmark was published by Anthropic, which makes it unlikely that competitors will use it (esp. MetaAI).
Comment by Cleo Nardo (strawberry calm) on MetaAI: less is less for alignment. · 2023-06-13T19:16:03.218Z · LW · GW

Clarifications:

The way the authors phrase the Superficial Alignment Hypothesis is a bit vague, but they do offer a more concrete corollary:

If this hypothesis is correct, and alignment is largely about learning style, then a corollary of the Superficial Alignment Hypothesis is that one could sufficiently tune a pretrained language model with a rather small set of examples.

Regardless of what exactly the authors mean by the Hypothesis, it would be falsified if the Corollary was false. And I'm arguing that the Corollary is false.

(1-6)  The LIMA results are evidence against the Corollary, because the LIMA results (post-filtering) are so unusually bare (e.g. no benchmark tests), and the evidence that they have released is not promising.

(7*)  Here's a theoretical argument against the Corollary:

  • Base models are harmful/dishonest/unhelpful because the model assigns significant "likelihood" to harmful/dishonest/unhelpful actors (because such actors contributed to the internet corpus).
  • Finetuning won't help because the small set of examples will be consistent with harmful/dishonest/unhelpful actors who are deceptive up until some trigger.
  • This argument doesn't generalise to RLHF and ConstitutionalAI, because these break the predictorness of the model. 

Concessions:

The authors don't clarify what "sufficiently" means in the Corollary, so perhaps they have much lower standards, e.g. it's sufficient if the model responds safely 80% of the time. 

Comment by Cleo Nardo (strawberry calm) on MetaAI: less is less for alignment. · 2023-06-13T15:16:00.853Z · LW · GW

Nope, no mention of xrisk — which is fine because "alignment" means "the system does what the user/developer wanted", which is more general than xrisk mitigation.

But the paper's results suggest that finetuning is much worse than RLHF or ConstitutionalAI at this more general sense of "alignment", despite the claims in their conclusion.

Comment by Cleo Nardo (strawberry calm) on LIMA: Less Is More for Alignment · 2023-06-08T00:22:50.147Z · LW · GW

In a controlled human study, responses from LIMA are either equivalent or strictly preferred to GPT-4 in 43% of cases;

I'm not sure how well this metric tracks what people care about — performance on particular downstream tasks (e.g. passing a law exam, writing bug-free code, automating alignment research, etc.).

Comment by Cleo Nardo (strawberry calm) on Excessive AI growth-rate yields little socio-economic benefit. · 2023-04-18T22:18:01.843Z · LW · GW

"Directly or indirectly" is a bit vague. Maybe make a market on Manifold if one doesn't exist already.

Comment by Cleo Nardo (strawberry calm) on Excessive AI growth-rate yields little socio-economic benefit. · 2023-04-18T22:11:37.834Z · LW · GW
Comment by Cleo Nardo (strawberry calm) on List of requests for an AI slowdown/halt. · 2023-04-18T08:16:31.372Z · LW · GW

Thanks! I've included Erik Hoel's and lc's essays.

Your article doesn't actually call for an AI slowdown/pause/restraint, as far as I can tell, and explicitly guards against that interpretation —

This analysis does not show that restraint for AGI is currently desirable; that it would be easy; that it would be a wise strategy (given its consequences); or that it is an optimal or competitive approach relative to other available AI governance strategies.

But if you've written anything which explicitly endorses AI restraint then I'll include that in the list.

Comment by Cleo Nardo (strawberry calm) on List of requests for an AI slowdown/halt. · 2023-04-15T01:09:09.747Z · LW · GW

well-spotted😳

Comment by Cleo Nardo (strawberry calm) on List of requests for an AI slowdown/halt. · 2023-04-15T00:23:12.794Z · LW · GW

Thanks, Zach!

Comment by strawberry calm on [deleted post] 2023-04-12T20:44:29.414Z

yep my bad

Comment by Cleo Nardo (strawberry calm) on Remarks 1–18 on GPT (compressed) · 2023-04-09T04:37:08.917Z · LW · GW

Finite context window.

Comment by Cleo Nardo (strawberry calm) on Excessive AI growth-rate yields little socio-economic benefit. · 2023-04-05T14:02:31.622Z · LW · GW

Countries that were on the frontier of the Industrial Revolution underwent massive economic, social, and political shocks, and it would've been better if the change had been smoothed over about double the duration.

Countries that industrialised later also underwent severe shocks, but at least they could copy the solutions to those shocks along with the technology.

Novel general-purpose technology introduces problems, and there is a maximum rate at which those problems can be fixed by the internal homeostasis of society. For a shock the size of ChatGPT-3.5, I claim society needs at least 5–10 years to absorb it.

ChatGPT-3.5 would've led to maybe a 10% reallocation of labour — this figure doesn't just count directly automated jobs, but also all the second- and third-order effects. ChatGPT-4, marginal on ChatGPT-3.5, would've led to maybe a 4% reallocation of labour.

It's better to "flatten the curve" of labour reallocation over 5–10 years rather than 3 months because massive economic shocks (e.g. unemployment) have socio-economic risks and costs.

Comment by Cleo Nardo (strawberry calm) on Excessive AI growth-rate yields little socio-economic benefit. · 2023-04-05T13:17:46.614Z · LW · GW

Sure, the "general equilibrium" also includes the actions of the government and the voting intentions of the population. If change is slow enough (i.e. below 0.2 OOMs/year) then the economy will adapt.

Perhaps wealth redistribution would be beneficial — in that case, the electorate would vote for political parties promising wealth redistribution. Perhaps wealth redistribution would not be beneficial — in that case, the electorate would vote for political parties promising no wealth redistribution.

This works because electoral democracy is an (imperfect) error-correction mechanism. But it operates over timescales of 5–10 years. So we must keep economic shocks slow enough for it to keep up.

Comment by Cleo Nardo (strawberry calm) on The 0.2 OOMs/year target · 2023-04-05T11:02:47.466Z · LW · GW

Great idea! Let's measure algorithmic improvement in the same way economists measure inflation, with a basket-of-benchmarks.

This basket can itself be adjusted over time so that it continuously reflects the current use-cases of SOTA AI.

I haven't thought about it much, but my guess is the best thing to do is to limit training compute directly but adjust the limit using the basket-of-benchmarks.
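
As a sketch of what such an index could look like (the benchmark names, weights, and scores below are invented purely for illustration), one could take a weighted geometric mean of per-benchmark improvements relative to a reference model, analogous to a price index:

import numpy as np

# Hypothetical basket: weights over benchmarks (summing to 1), plus scores for a
# reference "base-year" model and the model being assessed. All numbers invented.
weights     = {"mmlu": 0.4, "gsm8k": 0.3, "humaneval": 0.3}
base_scores = {"mmlu": 0.45, "gsm8k": 0.30, "humaneval": 0.25}
new_scores  = {"mmlu": 0.60, "gsm8k": 0.55, "humaneval": 0.40}

# Weighted geometric mean of per-benchmark ratios, analogous to a price index.
log_index = sum(w * np.log(new_scores[b] / base_scores[b]) for b, w in weights.items())
index = float(np.exp(log_index))
print(f"Capability index relative to the reference model: {index:.2f}x")

A compute limit could then be loosened or tightened as this index drifts, in the same spirit as indexing to inflation.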

Comment by Cleo Nardo (strawberry calm) on Excessive AI growth-rate yields little socio-economic benefit. · 2023-04-04T21:51:39.540Z · LW · GW

The economy is a complex adaptive system which, like all complex adaptive systems, can handle perturbations over the same timescale as its internal homeostatic processes. Beyond that regime, the system will not adapt. If I tap your head, you're fine. If I knock you with an anvil, you're dead.