Comments

Comment by Herb Ingram on AI#28: Watching and Waiting · 2023-09-08T12:21:46.446Z · LW · GW

Formally proving that some X you could realistically build has property Y is way harder than building an X with property Y. I know of no exceptions (formal proof only applies to programs and other mathematical objects). Do you disagree?
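As a toy illustration of that asymmetry, here is a sketch in Lean 4 (the sorting function and the sortedness property are arbitrary choices of mine, not anything from the discussion): building the thing is three short definitions, while even stating the property needs extra machinery, and the proof is deliberately left as `sorry` to mark where the real work would begin.

```lean
-- Building X: a naive insertion sort, a few lines of ordinary programming.
def insert' : Nat → List Nat → List Nat
  | x, []      => [x]
  | x, y :: ys => if x ≤ y then x :: y :: ys else y :: insert' x ys

def insertionSort : List Nat → List Nat
  | []      => []
  | x :: xs => insert' x (insertionSort xs)

-- Property Y: "the output is sorted". Just stating it needs a new predicate.
inductive Sorted : List Nat → Prop
  | nil  : Sorted []
  | single (x : Nat) : Sorted [x]
  | cons (x y : Nat) (ys : List Nat) :
      x ≤ y → Sorted (y :: ys) → Sorted (x :: y :: ys)

-- Proving Y requires several lemmas about `insert'`; the `sorry` marks that gap.
theorem insertionSort_sorted (l : List Nat) : Sorted (insertionSort l) := by
  sorry
```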

I don't understand why you expect the existence of a "formal math bot" to lead to anything particularly dangerous, other than by being another advance in AI capabilities which goes along other advances (which is fair I guess).

Human-long chains of reasoning (as used for taking action in the real world) neither require nor imply the ability to write formal proofs. Formal proofs are about math, and making use of math in the real world requires modeling, which is crucial, hard and usually very informal. You make assumptions that are obviously wrong, derive something from these assumptions, and make an educated guess that the conclusions still won't be too far from the truth in the ways you care about. In the real world, this only works when your chain of reasoning is fairly short (human-length), just as arbitrarily complex long-term planning doesn't work, whereas math uses very long chains of reasoning. The only practically relevant application so far seems to be cryptography, because computers are extremely reliable and thus modeling them is comparatively easy. However, it's plausibly still easier to break some encryption scheme than to formally prove that your practically relevant algorithm could break it.

LLMs that can do formal proofs would greatly improve cybersecurity across the board (good for delaying some scenarios of AI takeover!). I don't think they would advance AI capabilities beyond the technological advances used to build them and increasing AI hype. However, I also don't expect to see useful formal proofs about useful LLMs in my lifetime (you could call this "formal interpretability"? We would first get "informal interpretability" that says useful things about useful models). Maybe some other AI approach will be more interpretable.

Fundamentally, the objection stands that you can't prove anything about the real world without modeling, and modeling always yields a leaky abstraction. So we would have to figure out "assumptions that allow us to prove that AI won't kill us all while being only slightly false and in the right ways". This doesn't really solve the "you only get one try" problem. Maybe it could help a bit anyway?

I expect a first step might be an AI test lab with many layers of improving cybersecurity, ending with a formally verified, air-gapped setup with no interaction with humans. However, it doesn't look like people are currently worried enough to bother building something like this. I also don't see such an "AI lab leak" as the main path towards AI takeover. Rather, I expect we will deploy the systems ourselves and on purpose, finding ourselves at the mercy of competing intelligences that operate at faster timescales than us, and losing control.

Comment by Herb Ingram on Biosecurity Culture, Computer Security Culture · 2023-09-01T17:33:33.270Z · LW · GW

I think it makes a huge difference that most cybersecurity disasters only cost money (or damage a company's reputation and leak customers' confidential information), while a biosecurity disaster can kill a lot of people. This post seems to ignore this?

Comment by Herb Ingram on Assume Bad Faith · 2023-08-28T23:20:13.080Z · LW · GW

Besides thinking it fascinating and perhaps groundbreaking, I don't really have original insights to offer. The most interesting democracies on the planet in my opinion are Switzerland and Taiwan. Switzerland shows what a long and sustained cultural development can do. Taiwan shows the potential for reform from within and innovation.

There's a lot of material to read, in particular on the events after the Sunflower Movement in Taiwan. Keeping links within LessWrong: https://www.lesswrong.com/posts/5jW3hzvX5Q5X4ZXyd/link-digital-democracy-is-within-reach and https://www.lesswrong.com/posts/x6hpkYyzMG6Bf8T3W/swiss-political-system-more-than-you-ever-wanted-to-know-i

Comment by Herb Ingram on Assume Bad Faith · 2023-08-28T23:10:38.339Z · LW · GW

What's missing in this discussion is why one is talking to the "bad faith" actor in the first place.

If you're trying to get some information and the "bad faith" actor is trying to deceive you, you walk away. That is, unless you're sure that you're much smarter or have some other information advantage that allows you to get new useful information regardless. The latter case is extremely rare.

If you're trying to convince the "bad faith" actor, you either walk away or transform the discussion into a negotiation (it arguably was a negotiation in the first place). The post is relevant for this case. In such situations, people often pretend to be having an object level discussion although all parties know it's a negotiation. This is interesting.

Even more interesting is politics: you're trying to convince an amateur audience that you're right and someone else is wrong. The other party will almost always act "in bad faith" because otherwise the discussion would be taking place without an audience. You can walk away while accusing the other party of bad faith, but the audience can't really tell whether you were just about to lose the argument or whether you were arguing "less in bad faith than the other party", perhaps because the other party is losing the argument. Crucially, given that both parties are compelled to argue in bad faith, the audience is to some extent justified in not being moved by any object-level arguments, since it mostly cannot check whether they're valid. It keeps the opinions it has been holding and the opinions of people it trusts.

In this case, it might be worth it to move from the above situation, where the object level being discussed isn't the real object-level issue, as in the bird example, to one where a negotiation is taking place that is transparent to the audience. However, this is only possible if there is a competent fourth party arbitrating, as the competing parties really cannot give up the advantage of "bad faith". That's quite rare.

An upside: if the audience is actually interested in the truth, and if it can overcome the tribal drive to flock to "its side", it can maybe force the arguing parties to focus on the real issue and make object-level arguments in such a way that the audience becomes competent enough to judge them. Doing this is a huge investment of time and resources. It may be helped by all parties acknowledging the "bad faith" aspect of the situation and enforcing social norms that address it. This is what "debate culture" is supposed to do but, as far as I know, never really has.

My takeaway: don't be too proud of your debate culture where everyone is "arguing in good faith" if it's just about learning about the world. That is great, of course, but doesn't really solve the important problems.

Instead, try to come up with a debate culture (debate systems?) that can actually transform a besides-the-point bad-faith apparent disagreement into a negotiation where the parties involved can afford to make their true positions explicitly known. This is very hard but we shouldn't give up. For example, some of the software used to modernize democracy in Taiwan seems like an interesting direction to explore.

Comment by Herb Ingram on Ideas for improving epistemics in AI safety outreach · 2023-08-22T23:08:31.664Z · LW · GW

I think any outreach must start with understanding where the audience is coming from. The people most likely to make the considerable investment of "doing outreach" are in danger of being too convinced of their position and thinking it obvious; "how can people not see this?".

If you want to have a meaningful conversation with someone and interest them in a topic, you need to listen to their perspective, even if it sounds completely false and missing the point, and be able to empathize without getting frustrated. For most people to listen and consider any object level arguments about a topic they don't care about, there must first be a relationship of mutual respect, trust and understanding. Getting people to consider some new ideas, rather than convincing them of some cause, is already a very worthy achievement.

Comment by Herb Ingram on Large Language Models will be Great for Censorship · 2023-08-22T22:44:00.625Z · LW · GW

Indeed, systems controlling the domestic narrative may become sophisticated enough that censorship plays no big role. No regime is more powerful and enduring than one which really knows what poses a danger to it and what doesn't, one which can afford to use violence, coercion and censorship in the most targeted and efficient way. What a small elite used to do to a large society becomes something that the society does to itself. However, this is hard and I assume will remain out of reach for some time. We'll see what develops faster: sophistication of societal control and the systems through which it is achieved, or technology for censorship and surveillance. I'd expect at least a "transition period" of censorship technology spreading around the world as all societies that successfully use it become sophisticated enough to no longer really need it.

What seems more certain is that AI will be very useful for influencing societies in other countries, where the sophisticated means that are optimal domestically can't be deployed. This goes very well with exporting such technology.

Comment by Herb Ingram on What does it mean to “trust science”? · 2023-08-16T23:10:29.260Z · LW · GW

Uncharitably, "Trust the Science" is a talking point in debates that have some component which one portrays as "fact-based" and which one wants to make an "argument" about based on the authority of some "experts". In this context, "trust the science" means "believe what I say".

Charitably, it means trusting that thinking honestly about some topic, seeking truth and making careful observations and measurements actually leads to knowledge, that knowledge is intelligibly attainable. This isn't obvious, which is why there's something there to be trusted. It means trusting that the knowledge gained this way can be useful; that it's worth at least hearing out people who seem or claim to have it; and that, whenever one willingly disbelieves or dismisses that knowledge, it's worth stopping for a moment to honestly question one's own motivations and priors and the origins of one's beliefs, and to ponder the possible consequences in case of failure. In this context, "trust the science" means "talk to me and we'll figure it out".

Comment by Herb Ingram on Summary of and Thoughts on the Hotz/Yudkowsky Debate · 2023-08-16T22:53:02.115Z · LW · GW
  1. There's a big difference between philosophy and thinking about unlikely scenarios in the future that are very different from our world. In fact, those two things have little overlap. Although it's not always clear, (I think) this discussion isn't about aesthetics, or about philosophy; it's about scenarios that are fairly simple to judge but have so many possible variations, and are so difficult to predict, that it seems pointless to even try. This feeling of futility is the parallel with philosophy, much of which just digests and distills questions into more questions, never giving an answer, until a question is no longer philosophy and can be answered by someone else.

The discussion is about whether or not human civilization will destroy itself due to negligence and lack of ability to cooperate. This risk may be real or imagined. You may care about future humans or not. But none of that makes this either philosophy or aesthetics. The questions are very concrete, not general, and they're fairly objective (people agree a lot more on whether civilization is good than on what beauty is).

  1. I really don't know what you're saying. To attack an obvious straw man and thus give you at least some starting point for explaining further: Generally, I'd be extremely sceptical of any claim about some tiny coherent group of people understanding something important better than 99% of humans on earth. To put it polemically, for most such claims, either it's not really important (maybe we don't really know if it is?), it won't stay that way for long, or you're advertising for a cult. The phrase "truly awakened" doesn't bode well here... Feel free to explain what you actually meant rather than responding to this.

  2. Assuming these "ideologies" you speak of really exist in a coherent fashion, I'd try to summarize "Accelerationist ideology" as saying "technological advancement (including AI) will accelerate a lot, change the world in unimaginable ways and be great, let's do that as quickly as possible", and "AI safety (LW version)" as saying "it might go wrong and be catastrophic/unrecoverable; let's be very careful". If anything, these ideas as ideologies are yet to get out into the world and might never have any meaningful impact at all. They might not even work on their own as ideologies (maybe we mean different things by that word).

So why are the origins interesting? What do you hope to learn from them? What does it matter if one of those is an "outgrowth" of one thing more than some other? It's very hard for me to evaluate something like how "shallow" they are. It's not like there's some single manifesto or something. I don't see how that's a fruitful direction to think about.

Comment by Herb Ingram on We Should Prepare for a Larger Representation of Academia in AI Safety · 2023-08-15T23:27:47.651Z · LW · GW

No offense, this reads to me as if it was deliberately obfuscated or AI-generated (I'm sure you didn't do either of these, this is a comment on writing style). I don't understand what you're saying. Is it "LW should focus on topics that academia neglects"?

I also didn't understand at all what the part starting with "social justice" is meant to tell me or has to do with the topic.

Comment by Herb Ingram on LLMs are (mostly) not helped by filler tokens · 2023-08-10T22:52:18.787Z · LW · GW

There has been some talk recently about long "filler-like" input (e.g. "a a a a a [...]") somewhat derailing GPT-3 and GPT-4, e.g. leading them to output what seem like random parts of their training data. Maybe this effect is worth mentioning and thinking about when trying to use filler input for other purposes.
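A minimal sketch of how one might poke at this, assuming the current `openai` Python client; the model name, filler length and question are arbitrary choices for illustration, not anything from the post:

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

QUESTION = "In one sentence: why is the sky blue?"

def ask(prompt: str, model: str = "gpt-4") -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Compare the plain question with the same question behind a long filler prefix.
plain = ask(QUESTION)
padded = ask("a " * 2000 + "\n\n" + QUESTION)

print("plain: ", plain)
print("padded:", padded)  # watch for derailed or seemingly unrelated output
```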

Comment by Herb Ingram on Lack of Social Grace Is an Epistemic Virtue · 2023-07-31T21:22:19.782Z · LW · GW

> just in case it turns out he's heir to a giant fortune or something.

That seems like a highly dubious explanation to me. I guess the woman's honest account (or what you'd get by examining her state of mind) would say that she does it as a matter of habit, aiming to be nice and conform to social conventions.

If that's true, the question becomes where the convention comes from and what maintains it despite the naively plausible benefits one might hope to gain by breaking it. I don't claim to understand this (that would hint at understanding a lot of human culture at a basic level). However, I strongly suspect the origins of such behavior (and what maintains it) to be social. I.e., a good explanation of why the woman has come to act this way involves more than two people. That might involve some sort of strategic deception, but consider that most people in fact want to be lied to in such situations. An explanation must go a lot deeper than that kind of strategic deception.

Comment by Herb Ingram on Rationality !== Winning · 2023-07-24T07:34:42.101Z · LW · GW

While I completely agree in the abstract, I think there's a very strong tendency for systems-of-thought, such as the one propagated on this site, to become cult-like. There's a reason why people outside the bubble criticize LW for building a cult. They see small signs of it happening and also know/feel the general tendency for it, which always exists in such a context and needs to be counteracted.

As you point out, the concrete ways of thinking propagated here aren't necessarily the best for all situations and it's another very deep can of worms to be able to tell which situations are which. Also, it attracts people (such as myself to some degree) who enjoy armchair philosophizing without actually ever trying to do anything useful with that. Akrasia is one thing, not even expecting to do anything useful with some knowledge and pursuing it as a kind of entertainment is another still.

So there are two ways to frame the message. One is saying that "rationality is about winning", which is a definition that's very hard to attack but also vague in its immediate and indisputable consequences for how one should think, and which makes it hard to tell whether "one is doing it right".

The other way is to impose some more concrete principles and risk them becoming simplified, ritualized, abused and distorted to a point where they might do net harm. This way also makes it impossible to develop the epistemology further. You pick some meta-level and propose rules for thinking at that level which people eventually and inevitably propagate and defend with the fervor of religious belief. It becomes impossible to improve the epistemology at that point.

The meme ("rationality") has to be about something in order to spread and also needs some minimum amount of coherence. "It's about winning" seems to do this job quite well and not too well.

Comment by Herb Ingram on Proof of posteriority: a defense against AI-generated misinformation · 2023-07-18T07:13:44.145Z · LW · GW

Unfortunately for this scheme, I would expect rendering AI video to eventually become faster than real time. So, as the post implies, even if we had a reasonably good way to prove posteriority, this may not suffice to certify videos as "non-AI" for long.

On the other hand, as long as rendering AI videos is slower than real time, proof of priority alone might go a long way. You can often argue that prior to some point in time you couldn't reasonably have known what kind of video you should fake.
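A minimal sketch of the commit-then-reveal flavor of proof of priority (hash the recording immediately, publish only the hash, reveal the file later); SHA-256 and the publication channel are my illustrative assumptions:

```python
import hashlib
import json
import time

def commit(path: str) -> dict:
    """Hash the file now; the hash can be published long before the file is."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return {"sha256": h.hexdigest(), "committed_at": int(time.time())}

def verify(path: str, commitment: dict) -> bool:
    """Later, anyone can check the revealed file against the old commitment."""
    return commit(path)["sha256"] == commitment["sha256"]

if __name__ == "__main__":
    c = commit("clip.mp4")  # hypothetical file name
    # Publish c somewhere hard to backdate (newspaper, public ledger, trusted
    # timestamping service); the timestamp is only as credible as that channel.
    print(json.dumps(c))
```

None of this says anything about whether the file is real footage; it only pins down when you had it, which is exactly the "you couldn't have known what to fake yet" argument above.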

The "analog requirement" reminds me of physical unclonable functions, which might have some cross-pollination with this issue. I couldn't think of a way to make use of them but maybe someone else will.

Comment by Herb Ingram on Jailbreaking GPT-4's code interpreter · 2023-07-14T22:15:57.483Z · LW · GW

I guess it depends on whether this post found anything at all that can be called questionable security practice. Maybe it didn't but the author was also no cybersecurity expert. Upon reflection, my earlier judgement was premature and the phrasing overconfident.

In general, I assume that OpenAI would view a serious hack as quite catastrophic, as it might e.g. leak their model (not an issue in this case), severely damage their reputation and undermine their ongoing attempt at regulatory capture. However, such situations didn't prevent shoddy security practices in countless cybersecurity disasters.

I guess for this feature even the most serious vulnerabilities "just" lead to some Azure VMs being hacked, which has no relevance for AI safety. It might still be indicative of OpenAI's approach to security, which usually isn't so nuanced within organizations as to differ wildly between applications where the stakes are different. So it's interesting how secure the system really is, which we won't know until someone hacks it or some whistleblower emerges.

Some of my original reasoning was this:

You might argue that the "inner sandbox" is only used to limit resource use (for users who do not bother jailbreaking it) and to examine how users will act, as well as how badly exactly the LLM itself will fare against jailbreaking. In this case studying how people jailbreak it may be an integral part of the whole feature.

However, even if that is the case, to count as "security mindset", the "outer sandbox" has to be extremely good and OpenAI needs to be very sure that it is. To my (very limited) knowledge in cybersecurity, it's an unusual idea that you can reconcile very strong security requirements with purposely not using every opportunity to make the system more secure. Maybe the idea that comes closest would be a "honeypot", which this definitely isn't.

So that suggests they purposely took a calculated security risk for some mixture of research and commercial reasons, which they weren't compelled to do. Depending on how dangerous they really think such AI models are or may soon become, how much what they learn from the experiment benefits future security and how confident they are in the outer sandbox, the calculated risk might make sense. Assuming by default that the outer sandbox is "normal industry standard", it's incompatible with the level of worry they claim when pursuing regulatory capture.
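For concreteness, a resource-limit-only "inner sandbox" of the kind hypothesized above might look roughly like this; a minimal POSIX-only sketch with arbitrarily chosen limits, and no claim that it resembles OpenAI's actual setup:

```python
import resource
import subprocess

def run_untrusted(code: str, cpu_seconds: int = 2, mem_bytes: int = 256 * 1024**2) -> str:
    """Run untrusted Python with crude resource limits -- not a security boundary."""
    def limit():
        # Applied in the child process just before exec; caps CPU time and memory.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))

    proc = subprocess.run(
        ["python3", "-c", code],
        preexec_fn=limit,
        capture_output=True,
        text=True,
        timeout=cpu_seconds + 5,  # wall-clock backstop
    )
    return proc.stdout

print(run_untrusted("print(sum(range(10**6)))"))
```

Everything security-relevant then hangs on the outer layer (the VM or container boundary), which is exactly the asymmetry discussed above.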

Comment by Herb Ingram on Jailbreaking GPT-4's code interpreter · 2023-07-14T07:24:29.631Z · LW · GW

I agree. To me, the most interesting aspects of this (quite interesting and well-executed) exercise are getting a glimpse into OpenAI's approach to cybersecurity, as well as the potentially worrying fact that GPT3 made meaningful contributions to finding the "exploits".

Given what was found out here, OpenAI's security approach seems to be "not terrible" but also not significantly better than what you'd expect from an average software company, which isn't necessarily encouraging because those get hacked all the time. It's definitely not what people here call "security mindset", which casts doubt on OpenAI's claim to be "taking the dangers very seriously". I'd expect to hear about something illegal being done with one of these VMs before too long, assuming they continue and expand the service, which I expect they will.

I'm sure there are also security experts (both at OpenAI and elsewhere) looking into this. Given OpenAI's PR strategy, they might be able to shut down such services "due to emerging security concerns" without much reputational damage. (Many companies are economically compelled to keep services running that they know are compromised or that have known vulnerabilities, and instead pretend not to know about them or at least not inform customers for as long as possible.) Not sure how much e.g. Microsoft would push back on that. All in all, security experts finding something might be taken seriously.

I'm increasingly worried (while ascribing a decent chance, mind you, to "AI might well go about as bad for us as most of history but not worse") about what happens when GPT-X has hacking skills that are, say, on par with the median hacker. Being able to hack easy-ish targets at scale might not be something the internet can handle, potentially resulting in, e.g., an evolutionary competition between AIs to build a super-botnet.

Comment by Herb Ingram on Is the Endowment Effect Due to Incomparability? · 2023-07-10T19:29:48.679Z · LW · GW

Someone you're likely to trade with (either because they offer you a trade or because they are around when you want to trade) is on average more experienced than you at trading. So the trades available to you are disproportionately unfavorable, and you cannot figure out which ones "are likely to lead to favorable trades in the future", by the assumption that they are incomparable.

This is what you mean by "trades are often adversarialy chosen" in (1.), right? I don't understand why or in what situation you're dismissing that argument in (1.).

There can be a lot of other reasons to avoid incomparable trades. In accepting a trade where you don't clearly gain anything, you risk being cheated and you reveal information about your preferences, which can mean social embarrassment and might enable others to cheat you in the future. You're investing the mental effort to evaluate these things despite already having decided that you don't stand to gain anything.

An interesting counterexample is social contexts where trading is an established and central activity, for example among people who exchange certain collectibles. In such a context, people feel that the act of trading itself has positive value and thus will make incomparable trades.

I think this situation is somewhat analogous to betting. Most people (cultures?) are averse to betting in general. Risk aversion and the known danger of gambling addiction explain the aversion to betting for money/valuables. However, many people also strongly dislike betting without stakes. In some social contexts (horse racing, LW) betting is encouraged, even between "incomparable options", where the odds correctly reflect your credence.

In such cases, most people seem to consider it impolite to answer an off-hand probability estimate by offering a bet. It is understood perhaps as questioning their credibility/competence/sincerity, or as an attempt to cheat them when betting for money. People will decline the bet but maintain their off-hand estimate. This might very well make sense, especially if they don't explicitly value "training to make good probability estimates", and perhaps for some of the same reasons as apply to trades?
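For what it's worth, here is the arithmetic behind "odds that correctly reflect your credence" (the numbers are arbitrary, picked only for illustration). If your credence in the event is $p$ and you stake one unit to win $o$ units, the bet is fair exactly when its expected value is zero:

$$\mathbb{E}[\text{bet}] = p \cdot o - (1 - p) = 0 \quad\Longleftrightarrow\quad o = \frac{1-p}{p}$$

So a credence of $p = 0.75$ corresponds to fair odds of $o = 1/3$: you risk three units to win one. Declining such a bet while sticking to the stated estimate is exactly the behavior described above.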

Comment by Herb Ingram on Some reasons to not say "Doomer" · 2023-07-10T00:12:07.210Z · LW · GW

Who is the target audience for this?

I doubt anyone has been calling themselves a "doomer". There are people on this site who wouldn't ever get called that, but I haven't seen anyone else here label anyone a "doomer" yet. So it seems that you're left with people who don't frequent this site and would probably dismiss your arguments as "a doomer complaining about being called a doomer"?

Did I miss people call each other "doomer" on LW? Did you also post something like this on Twitter?

Comment by Herb Ingram on [deleted post] 2023-07-08T09:02:40.717Z

To me, the arguments from both sides, both arguing for and against worrying about existential risk from AI, make sense. People have different priors and biased access to information. However, even if everyone agreed on all matters of fact that can be currently established, the disagreement would persist. The issue is that predicting the future is very hard and we can't expect to be in any way certain what will happen. I think the interesting difference between how people "pro" and "contra" AI-x-risk think about this is in dealing with this uncertainty.

Imagine you have a model of the world, which is the best model you have been able to come up with after trying very hard. This model is about the future and predicts catastrophe unless something is done about it now. It's impossible to check if the model holds up, other than by waiting until it's too late. Crucially, your model seems unlikely to make true predictions: it's about the future and rests on a lot of unverifiable assumptions. What do you do?

People "pro-x-risk" might say: "we made the best model we could make, it says we should not build AI. So let's not do that, at least until our models are improved and say it's safe enough to try. The default option is not to do something that seems very risky.".

The opponents might say: "this model is almost certainly wrong, we should ignore what it says. Building risky stuff has kinda worked so far, let's just see what happens. Besides, somebody will do it anyway."

My feeling when listening to elaborate and abstract discussions is that people mainly disagree on this point. "What's the default action?" or, in other words, "who has the burden of proof?". That proof is basically impossible to give for either side.

It's obviously great that people are trying to improve their models. That might get harder to do the more politicized the issue becomes.

Comment by Herb Ingram on When do "brains beat brawn" in Chess? An experiment · 2023-07-04T02:54:38.019Z · LW · GW

I really have no idea, probably a lot?

I don't quite see what you're trying to tell me. That one (which?) of my two analogies (weather or RTS) is bad? That you agree or disagree with my main claim that "evaluating the relative value of an intelligence advantage is probably hard in real life"?

Your analogy doesn't really speak to me because I've never tried to start a company and have no idea what leads to success, or what resources/time/information/intelligence helps how much.

Comment by Herb Ingram on When do "brains beat brawn" in Chess? An experiment · 2023-07-03T22:53:48.493Z · LW · GW

What point are you trying to make? I'm not sure how that relates to what I was trying to illustrate with the weather example. I'll assume for the moment that you didn't understand my point.

The "game" I was referring to was one where it's literally all-or-nothing "predict the weather a year from now", you get no extra points for tomorrow's weather. This might be artificial but I chose it because it's a common example of the interesting fact that chaos can be easier to control than simulate.

Another example. You're trying to win an election and "plan long-term to make the best use of your intelligence advantage", you need to plan and predict a year ahead. Intelligence doesn't give you a big advantage in predicting tomorrow's polls given today's polls. I can do that reasonably well, too. In this contest, resources and information might matter a lot more than intelligence. Of course, you can use intelligence to obtain information and resources. But this bootstrapping takes time and it's hard to tell how much depending where you start off.

Comment by Herb Ingram on When do "brains beat brawn" in Chess? An experiment · 2023-07-03T21:38:19.981Z · LW · GW

That makes sense to me but to make any argument about the "general game of life" seems very hard. Actions in the real world are made under great uncertainty and aggregate in a smooth way. Acting in the world is trying to control (what physicists call) chaos.

In such a situation, great uncertainty means that an intelligence advantage only matters "on average over a very long time". It might not matter for a given limited contest, such as a struggle for world domination. For example, you might be much smarter than me and a meteorologist, but you'd find it hard to predict the weather in a year's time better than me if it's a single-shot contest. How much "smarter" would you need to be in order to have a big advantage? Pretty much regardless of your computational ability and knowledge of physics, you'd need such an amount of absurdly precise knowledge about the world that it might still take fewer resources (both for you and for much less intelligent actors) to actively control the entire planet's weather than to predict it a year in advance.

The way that states of the world are influenced by our actions is usually in some sense smooth. For any optimal action, there are usually lots of similar "nearby actions". These may or may not be near-optimal but in practice only plans that have a sufficiently high margin for error are feasible. The margin of error depends on the resources that allow finely controlled actions and thus increase the space of feasible plans. This doesn't have a good analogy in chess: chess is much further from smooth than most games in the real world.

Maybe RTS games are a slightly better analogy. They have "some smoothness of the action-result mapping" and high amounts of uncertainty. Based on AlphaStar's success in StarCraft, I would expect we can currently build super-human AIs for such games. They are superior to humans both in their ability to quickly and precisely perform many actions and in finding better strategies. An interesting restriction is to limit the number of actions the AI may take to below what a human can manage, to see the effect of these abilities individually. Restricting the precision and frequency of actions reduces the space of viable plans, at which point the intelligence advantage might matter much less.

All in all, what I'm trying to say is that the question "how much does what intelligence imbalance matter in the world" is hard. The question is not independent of access to information and access to resources or ability to act on the world. To make use of a very high intelligence, you might need a lot more information and also a lot more ability to take precise actions. The question for some system "taking over" is whether its initial intelligence, information and ability to take actions is sufficient to bootstrap quickly enough.

These are just some more reasons you can't predict the result just by saying "something much smarter is unbeatable at any sufficiently complex game".

Comment by Herb Ingram on [deleted post] 2023-07-01T12:45:29.728Z

If you want to compare to the "ancestral environment", it's crucial not to forget about the amounts of dust that are breathed in. How much dust of what particle sizes do we breathe in outside, compared to inside a dusty home?

Comment by Herb Ingram on Lessons On How To Get Things Right On The First Try · 2023-06-21T06:31:22.110Z · LW · GW

Very interesting exercise on modeling, with some great lessons. I don't really like the AI analogy though.

The ramp problem is a situation where idealizations are well-understood. The main steps to solving it seem to be realizing that these idealizations are very far from reality and measuring (rather than modeling) as much as possible.

On the first step, comparing with AI progress and the risks there, nobody thinks they have a detailed mechanistic model of what should happen. Rather, most people just assume there is no need to get anything right on the first try, because that approach has "worked" so far for new technology. People also anticipate strong capabilities to take a lot longer to develop and generally underdeliver on the industry's promises, again because that's how it usually goes with new hyped technology. You could say "that's the model one should resist using there", but the analogy is very stretched in my opinion. It only applies if the "potential models to be resisted" are taken to be extremely crude estimates from guesstimated base rates for how "technological development in general" is supposed to work. Such a model would be just as established as the fact that "such very crude models give terribly inaccurate predictions". There is no temptation there.

On the second step, I don't see what one might reasonably try to measure concerning AI progress. E.g., extrapolating some curve of "capability advancement over time" rather than just being sceptical-by-default isn't going to make a difference for AI risk.

I think a better metaphorical experiment/puzzle relevant to AI risk would be one where you naively think you have a lot of tries, but it turns out that you only get one due to some catastrophic failure which you could have figured out and mitigated if only you thought and looked more carefully. In the ramp problem, the "you get one try" part is implicit in how the problem is phrased.

My argument is based on models concerned with the question whether "you only get one try for AI". Maybe some people are unconcerned because they assume that others have detailed and reasonably accurate models of what a given AI will do. I doubt that because "it's a blackbox" is the one fact one hears most often about current AI.

Comment by Herb Ingram on Solomonoff induction still works if the universe is uncomputable, and its usefulness doesn't require knowing Occam's razor · 2023-06-18T23:17:50.255Z · LW · GW

This post seems to be about justifying why Solomonoff induction (I presume) is a good way to make sense of the world. However, unless you use it as another name for "Bayesian reasoning", it clearly isn't. Rather, it's a far-removed idealization. Nobody "uses" Solomonoff induction, except as an idealized model of how very idealized reasoning works. It has purely academic interest. The post sounds (maybe I'm misunderstanding this completely) as if this weren't the case, e.g., discussing what to do about a hypothetical oracle we might find.

In practice, humanity will only ever care about a finite number of probability distributions, each of which has finite support and will be updated a finite number of times using finite memory and finite processing power. (I guess one might debate this, but I personally usually can't think of anything interesting to say in that debate.) As such, for all practical purposes, the solution to any question that has any practical relevance is computable. You could also put an arbitrary fixed but very large bound on the complexity of hypotheses, keeping all hypotheses we might ever discuss in the race, and thus make everything computable. This would change the model but make no difference at all in practice, since the size of hypotheses we may ever actually assess is minuscule. The reason we discuss the infinite limit is that it's easier to grasp (ironically) and prove things about.
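As a toy illustration of that bounded variant (the hypothesis set, the "description lengths" and the data are all made up for the example): weight each hypothesis by 2^(-description length), update by Bayes, and everything is trivially computable.

```python
import math

HYPOTHESES = {
    # name: (description length in bits, P(next bit is 1 | history))
    "all zeros":   (8,  lambda hist: 0.01),
    "all ones":    (8,  lambda hist: 0.99),
    "alternating": (12, lambda hist: 0.99 if (not hist or hist[-1] == 0) else 0.01),
    "fair coin":   (16, lambda hist: 0.5),
}

def posterior(data):
    """Occam-style prior 2^(-length), then an ordinary Bayesian update on the bits."""
    log_w = {name: -length * math.log(2) for name, (length, _) in HYPOTHESES.items()}
    for i, bit in enumerate(data):
        for name, (_, predict) in HYPOTHESES.items():
            p_one = predict(data[:i])
            log_w[name] += math.log(p_one if bit == 1 else 1 - p_one)
    top = max(log_w.values())
    unnormalized = {n: math.exp(v - top) for n, v in log_w.items()}
    total = sum(unnormalized.values())
    return {n: w / total for n, w in unnormalized.items()}

print(posterior([0, 1, 0, 1, 0, 1, 0, 1]))  # "alternating" ends up dominating
```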

Starting from this premise, what is this post telling me? It's saying something about a certain idealized conception of reasoning. I can see how to transfer certain aspects of Solomonoff induction to the real world: you have Occam's razor, you have Bayesian reasoning. Is there something to transfer here? Or did I misunderstand it completely, leading me to expect there to be something?

Of course, I'm not sure if this "focus on finite things and practicality" is the most useful way to think about it, and I've seen people argue otherwise elsewhere, but always very unconvincingly from my perspective. Perhaps someone here will convince me that computability should matter in practice, for some reasonable concept of practice?

Comment by Herb Ingram on The Dial of Progress · 2023-06-14T18:49:17.441Z · LW · GW

I agree all of these things are possible and expect such capabilities to develop eventually. I also strongly agree with your premise that having more advanced AI can be a big geopolitical advantage, which means arms races are an issue. However, 5-20 years is not very long. It may be enough to have human-level AGI, but I don't expect such an AGI to enable feeding an entire country on hydroponics in the event of global nuclear war.

In any case, that's not even relevant to my point, which is that, while AI does enable nuclear bunkers, defending against ICBMs and hydroponics, in the short term it enables other things a lot more, including things that matter geopolitically. For a country with a large advantage in AI capabilities pursuing geopolitical goals, it seems a bad choice to use nuclear weapons or to take precautions against attack using such weapons and be better off in the aftermath.

Rather, I expect the main geopolitically relevant advantages of AI superiority to be economic and political power, which gives an advantage both domestically (ability to organize) and in influencing geopolitical rivals. I think resorting to military power (let alone nuclear war) will not be the best use of AI superiority. Economic power would arise from increased productivity due to better coordination, as well as the ability to surveil the population. Political power abroad would arise from that economic power, and from collecting data about citizens, predicting their sentiments, and producing propaganda. AI superiority strongly benefits from having meaningful data about the world and other actors, as well as a good economy and stable supply chains. These things go out the window in a war. I also expect war to be a lot less politically viable than using the other advantages of AI, which matters.

Comment by Herb Ingram on The Dial of Progress · 2023-06-14T15:51:15.999Z · LW · GW

I disagree with the last two paragraphs. First, global nuclear war implies the destruction of civilized society, and bunkers can do very little to mitigate this at scale. Global supply chains and especially food production are the important factor. To restructure the food production and transportation of an entire country in the aftermath of nuclear war, AGI would have to come up with biotechnology bordering on magic from our point of view.

Even if building bunkers were a good idea, it's questionable whether that's an area where AGI helps a lot compared to many other areas. Same for ICBMs: I don't see how AGI changes the defensive/offensive calculation much.

To use the Opium Wars scenario: AGI enables a high degree of social control and influence. My expectation is that a party with a decisive AI advantage (implying also a wealth advantage) in such a situation may not need to use violence at all. Rather, it may be feasible to gain enough political influence to achieve most goals (including such a mundane goal as making people and government tolerate the trade of drugs).