Comments
Depending on the agent implementation, you may find that it is demotivated to achieve any useful outcome if it is power-limited. Half-assing things seems pointless and futile; they aren't sane actions in the world. E.g. trying to put out a fire when all you have is a squirt gun.
I'm someone who is mainly moving in the opposite direction (from AI to climate change). I see AGI as a lot harder to achieve than most people do, mainly because the potential political ramifications will slow development, and because I think it will need experiments with novel hardware, making it more visible than just coding. So I see it as relatively easy to stop, at least inside a country. Multi-nationally it would be trickier.
Some advice: I would try to frame your effort as "Understanding AGI risk". While you think there is risk currently, having an open mind about the status of the risk is important. If AGI turns out to be existential-risk-free, then it could help with climate adaptation, even if it is not in time for climate mitigation.
Edit: You could frame it just as understanding AI, and put together independent briefs on each project for policy makers, covering the likely impacts (both positive and negative) and the state of play. Getting a good reputation and maintaining independence might be hard, though.
A theory I read in "Energy and Civilisation" by Vaclav Smil is that we could afford a big brain by developing tools and techniques (like cooking) that gave us a higher-quality diet and reduced the need for a complicated gut.
This is connected to the Principled Intelligence hypothesis, because things like hunting or maintaining a fire require cooperation and communication. Maintaining the knowledge of those things across a tribe also requires consistent communication. If you don't all have the same word for 'hot' and use it in the same way, lots of people are going to get burned or go hungry. So you need a norm-policing mechanism for the meaning of words in order for language to be at all useful for coordination and for the transfer of the culture needed for survival.
This is not norm policing against people trying to trick you, just norm policing against gibberish. It probably works against both to an extent.
All the tribes where a modicum of norm policing didn't go to fixation couldn't coordinate or communicate at all; they went back to the old process of eating roots and were outcompeted by those that did.
My view is that you have to build AIs with a bunch of safeguards to stop them destroying *themselves* while they don't yet have great knowledge of the world or of the consequences of their actions. So some of the arguments about companies/governments skimping on safety don't hold in the naive sense.
So, how do you:
- Stop a robot jumping off something too high
- Stop an AI DOSing its own network connection
- Stop a robot disassembling itself
while it is not yet vastly capable? Solving these things would give you a bunch of knowledge of safeguards and how to build them. I wrote about some of these problems here; a toy sketch of one such safeguard is below.
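For the self-DOS example, here is a minimal sketch of what such a safeguard could look like. Everything in it (class name, thresholds, usage) is hypothetical, just to illustrate the idea of an agent rate-limiting its own outbound traffic:

```python
import time
from collections import deque

class OutboundRateLimiter:
    """Toy safeguard: cap the agent's own request rate so it
    cannot saturate (DOS) its own network connection."""

    def __init__(self, max_requests: int = 10, window_seconds: float = 1.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def allow_request(self) -> bool:
        """Return True if another outbound request is allowed right now."""
        now = time.monotonic()
        # Drop timestamps that have fallen outside the sliding window.
        while self.timestamps and now - self.timestamps[0] > self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_requests:
            self.timestamps.append(now)
            return True
        return False

# Hypothetical usage: the agent consults the limiter before each outbound call.
limiter = OutboundRateLimiter(max_requests=10, window_seconds=1.0)
if limiter.allow_request():
    pass  # send the request
else:
    pass  # back off instead of hammering the connection
```

The point is not this particular mechanism, but that building and testing guards like this early is where the practical knowledge of safeguards would come from.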
It is only when you expect a system to radically gain capability without needing any safeguards along the way that it makes sense to expect a dangerous AI created by a team with no experience of safeguards or how to embed them.
As a data point for why this might be occurring: I may be an outlier, but I've not had much luck getting replies or useful dialogue from X-risk-related organisations in response to my attempts at communication.
My expectation, currently, is that if I apply I won't get a response and I will have wasted my time composing an application. I won't get any more information than I previously had.
If this isn't just me, you might want to encourage organisations to be more communicative.
My view is more or less the one Eliezer points to here:
The big big problem is, “Nobody knows how to make the nice AI.” You ask people how to do it, they either don’t give you any answers or they give you answers that I can shoot down in 30 seconds as a result of having worked in this field for longer than five minutes.
There are probably no fire alarms for "nice AI designs" either, just like there are no fire alarms for AI in general.
Why should we expect people to share "nice AI designs"?
For longer time frames where there might be visible development, the public needs to trust that the political regulators of AI have their interests at heart. Otherwise they may try to make it a party-political issue, which I think would be terrible for sane global regulation.
I've come across pretty strong emotion when talking about AGI, even when talking about safety, which I suspect will come bubbling to the fore more as time goes by.
It may also help morale of the thoughtful people trying to make safe AI.
I think part of the problem is that corporations are the main source of innovation, and they have incentives to insert themselves into the things they invent so that they can extract tolls and sustain their business.
Compare email and Facebook Messenger for two different types of invention, with different abilities to extract tolls. However, if you can't extract a toll, it is unlikely you can build a business around innovation in an area.
I had been thinking about metrics for measuring progress towards shared, agreed outcomes as a method of coordination between potentially competitive powers, to avoid arms races.
I passed the draft around to a couple of the usual suspects in the AI metrics/risk-mitigation space in hopes of getting collaborators, but no joy. I learnt that Jack Clark of OpenAI is looking at that kind of thing as well and is a lot better positioned to act on it, so I have hopes around that.
Moving on from that, I'm thinking that we might need a broad base of support from people (depending upon the scenario), so being able to explain how people could still have meaningful lives post-AI is important for building that support. So I've been thinking about that.
To me, closed-loop living is impossible not because of taxes but because of the desired technology level. I could probably buy a plot of land and try to recreate Iron Age technology, but most likely I would injure myself, need medical attention, and have to re-enter society.
Taxes also aren't an impediment to closed-loop living as long as the waste from the tax is returned. If you have land with a surplus of sunlight or other energy, you can take in waste and create useful things (food etc.) from it. The greater loop of taxes has to be closed as well as the lesser loop.
From an infosec point of view, you tend to rely on responsible disclosure. That is, you tell the people who will be most affected, or who can solve the problem for others; they create countermeasures; and then you release those countermeasures to everyone else (which gives away the vulnerability as well), who should be in a position to quickly update/patch.
Otherwise you are relying on security through obscurity. People may be vulnerable and not know it.
There doesn't seem to be a similar pipeline for non-computer security threats.
Similarly, it is not irrational to want to form a cartel or a political ingroup. Quite the opposite: it's like the concept of an economic moat, but for humans.
And so you get the patriarchy, and the reaction to it, feminism. This leads to the culture wars that we have today. So it is locally optimal, but it leads to problems in the greater system.
How do we escape this kind of trap?
I'm reminded of the quote by George Bernard Shaw.
“The reasonable man adapts himself to the world: the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man.”
I think it would be interesting to look at the reasons and occasions not to follow "standard" incentives.
I've been re-reading a sci-fi book with an interesting existential-risk scenario: most people are going to die, but some may survive.
If you are a person on Earth in the book, you have the choice of helping people out and definitely dying, or trying desperately to be one of the ones who survive (even if you personally might not be the best person to help humanity survive).
In that situation I would definitely be in the "helping the people better suited to surviving" camp: following orders, because the situation is too complex to keep in one person's head. Danger is fine, because you are literally a dead person walking.
It becomes harder when the danger isn't so clear and present. I'll think about it a bit more.
The title of the book is frirarirf (rot13)
She asked me, on Facebook, for advice on how to do creative work on AI safety. I gave her advice as best I could.
She seemed earnest and nice. I am sorry for your loss.
Dulce et Decorum Est Pro Huminatas Moria?
As you might be able to tell from the paraphrased quote, I've been taught about some of the bad things that can happen when this is taken too far.
Therefore the important thing is how we, personally, would engage with that decision if it came from outside.
For me it depends on my opinion of the people on the outside. There are four things I weigh:
- Epistemic rigour. With lots of crucial considerations around existential risk, do I believe that the outside has good views on the state of the world? If not, they/I may be doing more harm than good.
- Are they trying to move to better equilibria? Do they believe in winner-take-all, or are they trying to plausibly pre-commit to sharing the winnings (with other people who are trying to plausibly pre-commit to sharing the winnings)? Are they trying to avoid the race to the bottom? It doesn't matter if they can't; but not trying at all means they may miss out on better outcomes.
- Feedback mechanisms: how is the outside trying to make itself better? It may not be good enough on the first two items, but does it have feedback mechanisms to improve on them?
- Moral uncertainty. What is their opinion on moral theory? They/I may do some truly terrible things if they are too sure of themselves.
My likelihood of helping humanity when following orders stems from those considerations. It is a weighty decision.
I'm interested in seeing where you go from here. With the old LessWrong demographic, I would predict you would struggle, due to cryonics/life extension being core to many people's identities.
I'm not so sure about current LW though. The fraction of the EA crowd that is total utilitarian probably won't be receptive.
I'm curious what it is that your intuitions do value highly. It might be better to start with that.
Has anyone done work on an AI readiness index? This could track many things, like the state of AI safety research and the roll-out of policy across the globe. It might have to be a bit doomsday-clock-ish (going backwards and forwards as we understand more), but it might help to have a central place to collect the knowledge.
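As a minimal sketch of the bookkeeping such an index could involve, here is a weighted-indicator toy in Python. The indicator names, scores and weights are all made up for illustration:

```python
from dataclasses import dataclass

@dataclass
class Indicator:
    name: str
    score: float   # 0.0 (unprepared) to 1.0 (fully prepared)
    weight: float  # relative importance in the overall index

def readiness_index(indicators: list[Indicator]) -> float:
    """Weighted average of indicator scores; it can move backwards
    and forwards as assessments are revised."""
    total_weight = sum(i.weight for i in indicators)
    return sum(i.score * i.weight for i in indicators) / total_weight

# Hypothetical indicators and values, purely for illustration.
example = [
    Indicator("state of AI safety research", 0.3, 2.0),
    Indicator("policy roll-out across major states", 0.2, 1.5),
    Indicator("hardware/compute governance", 0.1, 1.0),
]
print(f"Readiness index: {readiness_index(example):.2f}")
```

The hard part would of course be agreeing on the indicators and how to score them, not the arithmetic.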
Out of curiosity what is the upper bound on impact?
Do you think the AI-assisted humanity is in a worse situation than humanity is today?
Lots of people involved in thinking about AI seem to be in a zero-sum, winner-take-all mode. E.g. Macron.
I think there will be significant founder effects from the strategies of the people who create AGI. The development of AGI will be used as an example of which types of strategies win during future technological development. Deliberation may tell people that there are better equilibria, but empiricism may tell them that those are too hard to reach.
Currently the positive-sum norm of free exchange of scientific knowledge is being tested. For good reasons, perhaps? But I worry for the world if not sharing knowledge gets cemented as the new norm. It will lead to more arms races and make coordination on the important problems harder. So if the creation of AI leads to the destruction of science as we know it, I think we might be in a worse position.
I, perhaps naively, don't think it has to be that way.
Interesting. I didn't know Russia's defences had degraded so much.
I'm curious what type of nuclear advantage you think America has. It is still bound by MAD, due to nukes on submarines.
I think that the US didn't have sufficient intelligence capability to know where to inspect. Take Israel as an example.
The CIA was saying in 1968 that "...Israel might undertake a nuclear weapons program in the next several years", when Israel had already built a bomb in 1966.
While I think the US could have threatened the Soviets into not producing nuclear weapons at that point in time, I have trouble seeing how the US could have put in place the requisite controls/espionage to prevent India/China/the UK etc. from developing nuclear weapons later on.
I think the generalised flinching away from hypocrisy in itself is mainly a status thing. Of the explanations for hypocrisy given:
- Deception
- Lack of will power
- Inconsistent thinking
None of them are desirable traits to have in allies (at least not visibly to other people).
I might take this up at a later date. I want to solve AI alignment, but I don't want to solve it now. I'd prefer it if our society's institutions (both governmental and non-governmental) were a bit more prepared.
Differential research that advances safety more than AI capability still advances AI capability.
Gambling on your knowledge might work, rather than on your luck (at least in a rationalist setting).
It is interesting to think about what this looks like as a societal norm. Physical risk gets you adrenaline junkies; social standing can get you many places (Burning Man culture is one, pushing the boundaries of social norms). Good ol' Goodhart.
Another element of the excitingness of risk is novelty. We are making risky choices every day. To choose to go to university is a risky choice: sometimes you build a good network, grow as a person, or learn something useful; other times it is just a complete waste of time and money. But it is seen as a normal option, so it has no cachet.
To choose not to do something has elements of risk too. If you never expose yourself to small risks, you risk struggling later in life, because you never got a big payoff compared to the people who put themselves out there. But that kind of risk taking is rarely lauded.
I often like to bring questions of behaviour back to the question of what kind of society we want. How does risk fit into that society?
It is Fear, and the many ways it is used in society, that can make a potential problem seem bigger than it is. In general, things like FUD; a concrete example being the Red Scare. Often it seems to have an existence bigger than any individual, which is why it got made a member of the pantheon, albeit a minor one.
With regard to the Group, people have found fear of the Other easier to form. Obligatory warning about the potential non-replicability of sociology findings.
I personally wouldn't fetishize being exciting too much. Boring stability is what allows civilisation to continue to do what functioning it somehow, against all the odds, manages to do. Too much excitement is just chaos.
That said, I would like more excitement in the world. One thing I've learnt from working on a live service is that any attempt at large-scale change, no matter how well planned and prepared for, has an element of risk.
What kinds of risks should we take?
It might be worth enumerating the things we can risk. Your example covers at least getting the feeling of risking the physical body. Other things, off the top of my head:
- Social Standing - E.g. Write an essay on something you are interested in that doesn't link immediately to the interests of your community.
- Money - Taking a large bet on something. This tends not to be exciting to me, but other people might like it.
- Emotional - Hard to give non-specific examples here. Declaring your love or being vulnerable in front of someone, maybe? Probably not exciting for the rationalist community, but for others.
Other risks, such as risking your organisation's or community's status or wellbeing, seem like they would have thorny issues of consent.
I've probably missed some categories though.
I didn't/don't have time to do the science justice, so I just tried my hand at the esoteric. It was scratching a personal itch; if I get time I might revisit this.
I'm reminded of this Paul Graham essay. So maybe it is not all western cities, but rather the focus of the elite in those cities.
What happened? More generally, what makes a social role exciting or boring at a certain point in time?
So I think the question is what qualities are incentivised in the social role. For lots of bankers, the behaviour that is incentivised is reliability and trustworthiness. It is not just the state that likes people to be predictable and boring; the people handing someone lots of money to keep safe will also select for predictability and boringness.
In Russia, I imagine the people selected for were those who could navigate the political/social scene. I imagine there is an amount of gambling involved in that, with lots of people failing and falling. Does that fit with your experience?
I suspect you wouldn't find the Silicon Valley or Boston elite boring, because a certain amount of exploration and novelty is required there.
I like arguing with myself, so it is fun to make the best case. But yes, I was going beyond what people might actually argue. I find arguments against naive views less interesting, so I spice them up some.
In Accelerando, the participants in Economy 2.0 had a treacherous turn because they were under the pressure of a sharply competitive, resource-hungry environment. This could have happened even if they were EMs, or AGIs aligned to a subset of humanity, if they didn't solve coordination problems.
This kind of evolutionary problem has not been talked about for a while (everyone seems focussed on corrigibility etc.), so maybe people have forgotten? I think it is worth making explicit that this is what you need to worry about. But the question then becomes: should we worry about it now, or when we have cheaper intelligence and a greater understanding of how intelligences might coordinate?
Edit: One might even make the case that we should focus our thought on short-term existential risks, like avoiding nuclear war during the start of AGI, because if we don't pass that test we won't get to worry about superintelligence. And you can't use the cheaper, later intelligence to solve that problem.
I feel that this post is straw-manning "I don't think superintelligence is worth worrying about because I don't think that a hard takeoff is realistic" a bit.
A steel man might be:
I don't feel superintelligence is worth worrying about at this point, because in a soft-takeoff scenario we will have lots of small AGI-related accidents (people wireheading themselves with AI). This will give companies financial incentives to concentrate on safety, to stop themselves getting sued and, if they use the systems themselves, to stop the damage caused to themselves. It will also give governments incentives, through political pressure, to introduce regulation to make AGI safe. Scientists on the cusp of creating AGI have incentives not to be associated with its bad consequences, and they are in the best position to understand what safeguards are needed.
Also, there will be a general selective pressure towards safe AGI, as we will destroy the unaligned ones with the safer, more alignable ones. There is no reason to expect a treacherous turn when the machines reach a decisive strategic advantage, as we will have seen treacherous behaviour in AGIs that are not super-rational or good at hiding their treachery, and will then have designed against it.
It is only if there is a chance of foom that we, the current generation, need to worry about superintelligence right now.
As such it would be better to save money now and use the compounded interest to later buy safer AGI from safety-focused AGI companies, to distribute to needy people. A safety-focused company will have greater knowledge of AGI and will be able to get a lot more AGI safety per dollar than we currently can with our knowledge.
If you want to make AGI, then worrying about the superintelligence case is probably a good exercise for seeing where the cracks are in your system, to help avoid the small accidents.
I'm not sure I believe it, but it is worth seeing that the incentives for safety are there.
But when you're an adult, you are independent. You have choice to decline interactions you find unpleasant. You don't need everyone you know to like you to have a functioning life. There are still people and institutions to navigate, but they aren't out to get you. They won't thwart your cookie quests. You are free.
I think this depends a lot on the context. The higher profile you are, the more people might be out to get you, because they can gain something by dragging you down. See Twitter mobs etc.
Similarly, if you want to do something that might be controversial, you can't just waltz out and do it unless you are damn sure it is right. Building strong alliances and not making too many enemies seems important as well. Sometimes you need to have the unpleasant interactions, because that is also part of being an adult.
But nice post, I'm sure it will help some people.
Ah, makes sense. I saw something on Facebook by Robert Wiblin arguing against unnamed people in the "evidence-based optimist" group, and thought I was missing something important going on, for both you and cousin_it to react to. You have not been vocal on take-off scenarios before. But it seems it is just coincidence.
Thanks for the explanation.
I have to say I am a little puzzled. I'm not sure who you and cousin_it are talking to with these moderate-take-off posts. I don't see anyone arguing that a moderate take-off would be okay by default.
Even more mainstream places like MIT seem to be saying it is too early to focus on AI safety, rather than that we should never focus on it. I hope there would be a conversation about when to focus on AI safety. While there is no default fire alarm, that doesn't mean you can't construct one: get people working on AGI science to say what they expect their creations to be capable of, and to formulate a plan for what to do if those creations are vastly more capable than they expect.
I suppose there is the risk that the AGI or IA is suffering while helping out humanity as well.
I didn't know that!
I do think there is still a difference in strategy, though. In the foom scenario you want to keep the number of key players, or people who might become key players, small.
In the non-foom scenario you have the unhappy compromise between trying to avoid too many accidents and building up defences early, versus practically everyone eventually being a key player and needing to know how to handle AGI.
TLA is MIA :) Thanks
FWIW, LessWrong has rarely felt like a comfortable place for me. Not sure why. Maybe I missed the fandom stage.
I did have a laugh now and again back in the day. Even then, I think I came here more for the "taking ideas seriously" thing that rationalists can do than for the community.
I've argued before that we should understand the process of science (how much is analysis vs data processing vs real-world tests), in order to understand how likely it is that AGI will be able to do science quickly, which affects the types of threats we should expect. We should also look at the process of programming through a similar lens, to see how much a human-level programmer could be improved upon. There is a lot of non-human-bounded activity in industrial-scale programming; much of it is running automated test suites. Will AIs need to run similar suites, or can they do things in a more adequate way?
Information from sociology and history should inform our priors on the concrete strategies that may work. But that may be taken as a given, and so be less interesting.
I would add in animals, if you are asking questions about the nature of general intelligence. For example, people claim monkeys are better at certain tasks than humans. What does that mean for the notion of general intelligence, if anything?
What are the questions you are trying to answer about the first AGIs?
- How will they behave?
- What will they be capable of?
- What is the nature of the property we call intelligence?
I find the second one much more interesting, with more data to be acquired. For that one I would include things like modern computer hardware and what we have managed to achieve with it (and the nature and structure of those achievements).
I've got a bit more time now.
I agree "Things need to be done" in a rising tide scenario. However different things need to be done to the foom scenario. The distribution of AI safety knowledge is different in an important way.
Discovering AI alignment is not enough in the rising-tide scenario. You want to make sure the proportion of aligned AIs to misaligned AIs is sufficient to stop the misaligned AIs outcompeting the aligned ones. There will be some misaligned AIs, due to parts wearing out, experiments gone wrong, and AIs aligned to insane people who are not sufficiently aligned with the rest of humanity to allow negotiation/discussion.
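A toy sketch of that proportion argument (the rates below are invented purely for illustration): if misalignment keeps arising at some background rate, the long-run fraction of aligned AIs depends on how quickly misaligned ones are corrected, not just on alignment having been solved once.

```python
def aligned_fraction(
    steps: int = 100,
    aligned: float = 0.99,           # initial fraction of aligned AIs
    misalignment_rate: float = 0.01, # wear, accidents, bad actors per step
    correction_rate: float = 0.02,   # fraction of misaligned AIs corrected per step
) -> float:
    """Toy model: track the fraction of aligned AIs over time."""
    misaligned = 1.0 - aligned
    for _ in range(steps):
        newly_misaligned = aligned * misalignment_rate
        corrected = misaligned * correction_rate
        aligned += corrected - newly_misaligned
        misaligned = 1.0 - aligned
    return aligned

# With these made-up rates the population settles near
# correction_rate / (correction_rate + misalignment_rate) ~= 0.67,
# regardless of starting at 99% aligned.
print(f"Aligned fraction after 100 steps: {aligned_fraction():.2f}")
```

This is nothing like a real model; it just illustrates why the rising-tide scenario is about ongoing proportions rather than a one-off discovery.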
The biggest risk is around the beginning. Everyone will be enthusiastic to play around with AGI. If they don't have good knowledge of alignment (because it has been a secret project), then they may not know how it should work or how it should be used safely. They may also buy AGI products from people who haven't done their due diligence in making sure their product is aligned.
It might be that alignment requires special hardware (e.g. there is an equivalent of Spectre that needs to be fixed in current architectures to enable safe AI); then there is the risk of the software getting out and being run on emulators that don't fix the alignment problem. Then you might get lots of misaligned AGIs.
In this scenario you need lots of things that are antithetical to the fooming-AGI strategy of keeping things secret and hoping that a single group brings it home. You need a well-educated populace and international community, and regulation of computer hardware and AGI vendors (preferably before AGI hits). All that kind of stuff.
Knowing whether we are fooming or not is pretty important. The same strategy does not work for both, IMO.
I've been trying to think of historical examples. Marxism, while in some ways strongly about conflict theory, still wanted to keep the veneer of reasoned debate to get the backing of academics.
A quote from Popper, via Wikipedia:
Hegel thought that philosophy develops; yet his own system was to remain the last and highest stage of this development and could not be superseded. The Marxists adopted the same attitude towards the Marxian system. Hence, Marx's anti-dogmatic attitude exists only in the theory and not in the practice of orthodox Marxism, and dialectic is used by Marxists, following the example of Engels' Anti-Dühring, mainly for the purposes of apologetics – to defend the Marxist system against criticism. As a rule critics are denounced for their failure to understand the dialectic, or proletarian science, or for being traitors. Thanks to dialectic the anti-dogmatic attitude has disappeared, and Marxism has established itself as a dogmatism which is elastic enough, by using its dialectic method, to evade any further attack. It has thus become what I have called reinforced dogmatism.
This article from AlexMennen has some relevant discussion and links.
This is why I've always insisted, for example, that if you're going to start talking about "AI ethics", you had better be talking about how you are going to improve on the current situation using AI, rather than just keeping various things from going wrong. Once you adopt criteria of mere comparison, you start losing track of your ideals—lose sight of wrong and right, and start seeing simply "different" and "same".
From: Guardians of the Truth
I'd put some serious time into that as well, if you can. If you think that you can influence the big lever that is AI, you should know which way to pull.
Some of your comments appear to be hidden. I shall reply here with a question they brought to mind.
"then they take off like a rocket,"
I think it is worth talking about whether it is sustainable: whether they can do what needs to be done at the current time while going at that high speed, before people go too far down that path. Basically I'm asking, "But at what cost?"
The trust you have to have is that the person you are building with won't take the partially built rocket and finish it for themselves to go off to a gambling den. That they too actually want to get the groceries, and aren't just saying they do to gain your cooperation. You want to avoid cursing their sudden but inevitable betrayal.
You do want to get the groceries right?
The trickle-down from religio to the ops view of the world seems to de-emphasise intelligence gathering.
I think a big part of operations is making sure you have the correct information to do the things you need to do at the correct time. As such, the information gathered regularly from ops informs strategy and tactics.