I agree with your point in general of efficiency vs rationality, but I don’t see the direct connection to the article. Can you explain? It seems to me that a representation along correlated values is more efficient, but I don’t see how it is any less rational.
I would describe this as a human-AI system. You are doing at least some of the cognitive work with the scaffolding you put in place through prompt engineering etc, which doesn’t generalise to novel types of problems.
You seem to make a strong assumption that consciousness emerges from matter. This is uncertain. The mind-body problem is not solved.
It is so difficult to know whether this is genuine or if our collective imagination is being projected onto what an AI is.
If it were genuine, I might expect it to be more alien. But then what could it say that would be coherent (as it’s trained to be) and also be alien enough to convince me it’s genuine?
You said that you are not interested in exploring the meaning behind the green knight. I think that it's very important. In particular, your translation to the Old West changes the challenge in important ways. I don't claim to know the meaning behind the green knight. But I believe that there is something significant in the fact that the knights were so obsessed with courage and honour and the green knight laid a challenge at them that they couldn't turn down given their code. Gawain stepped forward partly to protect Arthur. That changes the game. I asked ChatGPT to describe the differences; here are some parts of the answer:
Moral and Ethical Framework: "Sir Gawain and the Green Knight" operates within a chivalric code that values honor, bravery, and integrity. Gawain's acceptance of the challenge is a testament to his adherence to these ideals. In contrast, the Old West scenario lacks a clear moral framework, presenting a more ambiguous ethical dilemma that revolves around survival and personal pride rather than chivalric honor.
Social and Cultural Context: "Sir Gawain and the Green Knight" is deeply embedded in medieval Arthurian literature, reflecting the societal values and ideals of the time. The Old West scenario reflects a different set of cultural values, emphasizing individualism and the ability to face death bravely.
And with a bit more prompting
If I were in a position similar to Sir Gawain, operating under the chivalric codes and values of the Arthurian legend, accepting the challenge could be seen as a necessary act to uphold honor and valor, integral to the identity of a knight. However, stepping out of the narrative and considering the challenge from a modern perspective, with contemporary ethical standards and personal values, my response would differ.
It’s useful in that it is a model that describes certain phenomena. I believe it is correct given the caveat that all models are approximations.
I did a physics undergraduate degree a long time ago. I can’t remember specifically but I’m sure the equation was derived and experimental evidence was explained. I have strong faith that matter converts to energy because it explains radiation, fission reactors and atomic weapons. I’ve seen videos of atomic bombs going off. I’ve seen evidence of radioactivity with my own eyes in a lab. I know of many technologies that rely on radioactivity to work - smoke alarms, Geiger counters, carbon dating, etc.
I have faith in the scientific process that many people have verified the equation and phenomena. If the equation was not correct then proving or showing that would be a huge piece of work that would make the career of a scientist that did that. I’m sure many have tried.
Overall, the equation is part of a whole network of beliefs. If the equation were incorrect then that would mean that my world model was very wrong in many uncorrelated ways. I find that unlikely.
Well I agree it is a strawman argument. Following the same lines as your argument, I would say the counter argument is that we don’t really care if a weak model is fully aligned or not. Is my calculator aligned? Is a random number generator aligned? Is my robotic vacuum cleaner aligned? It’s not really a sensical question.
Alignment is a bigger problem with stronger models. The required degree of alignment is much higher. So even if we accept your strawman argument it doesn’t matter.
I found this a useful framing. I’ve thought quite a lot about the offence versus defence dominance angle and to me it seems almost impossible that we can trust that defence will be dominant. As you said, defence has to be dominant across every single attack vector, both known and unknown.
That is an important point because I hear some people argue that to protect against offensive AGI we need defensive AGI.
I’m tempted to combine the intelligence dominance and starting costs into a single dimension, and then reframe the question in terms of “at what point would a dominant friendly AGI need to intervene to prevent a hostile AGI from killing everyone”. The pivotal act view is that you need to intervene before a hostile AGI even emerges. It might be that we can intervene slightly later, before a hostile AGI has enough resources to cause much harm but after we can tell if it is hostile or friendly.
Thank you for the great comments! I think I can sum up a lot of that as "the situation is way more complicated and high dimensional and life will find a way". Yes I agree.
I think what I had in mind was an AI system that is supervising all other AIs (or AI components) and preventing them from undergoing natural selection. A kind of immune system. I don't see any reason why that would be naturally selected for in the short-term in a way that also ensures human survival. So it would have to be built on purpose. In that model, the level of abstraction that would need to be copied faithfully would be the high-level goal to prevent runaway natural selection.
It would be difficult to build for all the reasons that you highlight. If there is an immunity/self-replicating arms race then you might ordinarily expect the self-replication to win because it only has to win once while the immune system has to win every time. But if the immune response had enough oversight and understanding of the system then it could potentially prevent the self-replication from ever getting started. I guess that comes down to whether a future AI can predict or control future innovations of itself indefinitely.
Thanks for the reply!
I think it might be true that substrate convergence is inevitable eventually. But it would be helpful to know how long it would take. Potentially we might be ok with it if the expected timescale is long enough (or the probability of it happening in a given timescale is low enough).
I think the singleton scenario is the most interesting, since I think that if we have several competing AIs, then we are just super doomed.
If that's true then that is a super important finding! And also an important thing to communicate to people! I hear a lot of people who say the opposite and that we need lots of competing AIs.
I agree that analogies to organic evolution can be very generative. Both in terms of describing the general shape of dynamics, and how AI could be different. That line of thinking could give us a good foundation to start asking how substrate convergence could be exacerbated or avoided.
Here’s a slightly different story:
The amount of information is less important than the quality of the information. The channels were there to transmit information, but there were no efficient coding schemes.
Language is an efficient coding scheme by which salient aspects of knowledge can be usefully compressed and passed to future generations.
There was no free lunch because there was an evolutionary bottleneck that involved the slow development of cognitive and biological architecture to enable complex language. This developed in humans in a co-evolutionary process with advanced social dynamics. Evolution stumbled across cultural transmission in this way and the rest is quite literally history.
This is all highly relevant to AI development. There is the potential for the development of more efficient coding schemes for communicating AI-learnt knowledge between AI models. When that happens we get the sharp left turn.
I think I’m more concerned with minimising extreme risks. I don’t really mind if I catch mild covid but I really don’t want to catch covid in a bad way. I think that would shift the optimal time to take the vaccine earlier, as I’d have at least some protection throughout the disease season.
I am interested in the substrate-needs convergence project.
Here are some initial thoughts, I would love to hear some responses:
- An approach could be to say under what conditions natural selection will and will not sneak in.
- Natural selection requires variation. Information theory tells us that all information is subject to noise and therefore variation across time. However, we can reduce error rates to arbitrarily low probabilities using coding schemes. Essentially this means that it is possible to propagate information across finite timescales with arbitrary precision. If there is no variation then there is no natural selection.
- In abstract terms, evolutionary dynamics require either a smooth adaptive landscape such that incremental changes drive organisms towards adaptive peaks and/or unlikely leaps away from local optima into attraction basins of other optima. In principle AI systems could exist that stay in safe local optima and/or have very low probabilities of jumps to unsafe attraction basins.
- I believe that natural selection requires a population of "agents" competing for resources. If we only had a single AI system then there is no competition and no immediate adaptive pressure.
- Other dynamics will be at play which may drown out natural selection. There may be dynamics that occur at much faster timescales than this kind of natural selection, such that adaptive pressure towards resource accumulation cannot get a foothold.
- Other dynamics may be at play that can act against natural selection. We see existence-proofs of this in immune responses against tumours and cancers. Although these don't work perfectly in the biological world, perhaps an advanced AI could build a type of immune system that effectively prevents individual parts from undergoing runaway self-replication.
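The error-correction point above can be made concrete with a small sketch (a simple repetition code with majority-vote decoding; the channel noise level and trial counts are invented for illustration). Adding redundancy drives the per-bit error rate toward zero, which is the sense in which information can be propagated across finite timescales with arbitrary precision:

```python
import random

def transmit(bit, p_flip):
    """Send one bit through a noisy channel that flips it with probability p_flip."""
    return bit ^ (random.random() < p_flip)

def send_with_repetition(bit, p_flip, n_copies):
    """Encode a bit as n_copies repeats, transmit each copy, decode by majority vote."""
    received = [transmit(bit, p_flip) for _ in range(n_copies)]
    return int(sum(received) > n_copies / 2)

random.seed(0)
trials = 10_000
p_flip = 0.1  # 10% chance of corrupting any single transmitted copy

rates = {}
for n_copies in (1, 3, 9, 21):
    errors = sum(send_with_repetition(1, p_flip, n_copies) != 1 for _ in range(trials))
    rates[n_copies] = errors / trials
    print(f"{n_copies:2d} copies: error rate ~ {rates[n_copies]:.4f}")
```

More sophisticated codes get the same effect with far less redundancy, but the qualitative point is the same: variation from noise can be suppressed to arbitrarily low levels at finite cost.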
I’d like to add that there isn’t really a clear objective boundary between an agent and the environment. It’s a subjective line that we draw in the sand. So we needn’t get hung up on what is objectively true or false when it comes to boundaries - and instead define them in a way that aligns with human values.
I agree but I don’t think that this is the specific problem. I think it’s more that the relationship between agent and environment changes over time i.e. the nodes in the Markov blanket are not fixed, and as such a Markov blanket is not the best way to model it.
The grasshopper moving through space is just an example. When the grasshopper moves, the structure of the Markov blanket changes radically. Or, if you want to maintain a single Markov blanket then it gets really large and complicated.
Regarding your study idea. Sounds good! Would be interesting to see, and as you rightly point out wouldn't be too complicated/expensive to run. It's generally a challenge to run multi-year studies of this sort due to the short-term nature of many grants/positions. But certainly not impossible.
An issue that you might have is being able to be sure that any variation that you see is due to changes in the general population vs changes in the sample population. This is an especially valid issue with MTurk because the workers are doing boring exercises for money for extended periods of time, and many of them will be second screening media etc. A better sample might be university students, who researchers fortunately do have easy access to.
There are extra costs here that aren’t being included. There’s a cost to maintaining the pill box - perhaps you consider that small but it’s extra admin and we’re already drowning in admin. There’s a cost to my self-identity of being a person who carries around pills like this (don’t mean to disparage it, just not for me). There are also potentially hidden costs of not getting ill occasionally, both mentally and physically.
Much harder to put enough capital together to make it worthwhile.
Beat me to it. Yes the lesson is perhaps to not create prediction markets that incentivise manipulation of that market towards bad outcomes. The post could be expanded to a better question of, given that prediction markets can incentivise bad behaviour, how can we create prediction markets that incentivise good behaviour?
This reminds me somewhat of the potentially self-fulfilling prophecy of defunding bad actors. E.g. if we expect that global society will react to climate change by ultimately preventing oil companies from extracting and selling their oil field assets, then those assets are worth much less than their balance sheets claim, so we should divest from oil companies. That reduces the power of oil companies, which then makes climate change legislation easier to implement, and the prophecy is fulfilled. Here the share price is the prediction market.
I’d ask the question whether things typically are aligned or not. There’s a good argument that many systems are not aligned. Ecosystems, society, companies, families, etc all often have very unaligned agents. AI alignment, as you pointed out, is a higher stakes game.
Your proofs all rely on lotteries over infinite numbers of outcomes. Is that necessary? Maybe a restriction to finite lotteries avoids the paradox.
Leibniz's Law says that you cannot have separate objects that are indistinguishable from each other. It sounds like that is what you are doing with the 3 brains. That might be a good place to flesh out more to make progress on the question. What do you mean exactly by saying that the three brains are wired up to the same body and are redundant?
I’ve always thought that the killer app of smart contracts is creating institutions that are transparent, static and unstoppable. So for example uncensored media publishing, defi, identity, banking. It’s a way to enshrine in code a set of principles of how something will work that then cannot be eroded by corruption or interference.
There is the point that 80% of people can say that they are better than average drivers and actually be correct. People value different things in driving, and optimise for those things. One person’s good driver may be safe, someone else may value speed. So both can say truthfully and correctly that they are a better driver than the other. When you ask them about racing it narrows the question to something more specific.
You can expand that to social hierarchies too. There isn’t one hierarchy, there are many based on different values. So I can feel high status at being a great musician while someone else can feel high status at earning a lot, and we can both be right.
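A toy illustration of this point, with made-up drivers and scores: if each person judges "better driver" by the dimension they personally optimise for, both can truthfully claim to be better than the other.

```python
# Hypothetical drivers scored on the dimensions they each value (scores invented).
drivers = {
    "Avery": {"safety": 9, "speed": 4},
    "Blake": {"safety": 5, "speed": 8},
}

def better_driver(a, b, value_dimension):
    """Each person judges 'better' by the dimension they personally value."""
    return a if drivers[a][value_dimension] >= drivers[b][value_dimension] else b

print(better_driver("Avery", "Blake", "safety"))  # Avery wins by Avery's values
print(better_driver("Avery", "Blake", "speed"))   # Blake wins by Blake's values
```

Asking about racing, as in the original example, amounts to forcing everyone onto the single "speed" dimension, where only one ranking is possible.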
I think a problem you would have is that the speed of information in the game is of the same order as the speed of, say, a glider (Life's "speed of light" is one cell per generation, and a glider travels at a quarter of that). So an AI that is computing within Life would not be able to sense an incoming glider and react quickly enough to build a control structure in front of it.
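As a sanity check on the glider's speed, here is a minimal Life step function (plain Python sets; the coordinate layout is my own choice) showing that the standard glider translates one cell diagonally every four generations:

```python
from collections import Counter

def step(live):
    """One Game of Life generation on a set of live (x, y) cells."""
    neigh = Counter((x + dx, y + dy)
                    for (x, y) in live
                    for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                    if (dx, dy) != (0, 0))
    # A cell is live next generation if it has 3 neighbours,
    # or 2 neighbours and is currently alive.
    return {c for c, n in neigh.items()
            if n == 3 or (n == 2 and c in live)}

# The standard glider:  .X.
#                       ..X
#                       XXX
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

state = glider
for _ in range(4):
    state = step(state)

shifted = {(x + 1, y + 1) for (x, y) in glider}
print(state == shifted)  # True: one diagonal cell per 4 generations (c/4)
```

Any sensing-and-building machinery inside the grid is limited to the same one-cell-per-generation signal speed, so it gets at most a factor of ~4 head start on an approaching glider.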
I’d say 1 and 7 (for humans). The way humans understand go is different to how bots understand go. We use heuristics. The bots may use heuristics too but there’s no reason to think we could comprehend those heuristics. Considering the size of the state space it seems that the bot has access to ways of thinking about go that we don’t, the same way a bot can look further ahead in a chess game than we could comprehend.
Why are we still paying taxes if we have AI this brilliant? Surely we then have ridiculous levels of abundance
I strongly disagree with your sentiments.
Advertising is bad because it’s fundamentally about influencing people to do things they wouldn’t do otherwise. That takes us all away from what’s actually important. It also drives the attention economy, which turns the process of searching for information and learning about the world into a machine for manipulating people. Advertising should really be called commercial propaganda - that reveals more clearly what it is. Privacy is only one aspect of the problem.
Your arguments are myopic in that they are all based on the current system we have now, which is built around advertising models. Of course those models don’t work well without advertising. If we reduced advertising the world would keep on turning and human ingenuity would come up with other ways for information to be delivered and funded. I don’t need to define what new system that would be to say that advertising is bad.
Maybe look at GAMS
I find rationalist cringey for some reason and won’t describe myself like that. As you said, it seems to discount intuition, emotion and instinct. 99% of human behaviour is driven by irrational forces and that’s not necessarily a bad thing. The word rationalist to me feels like a denial of our true nature and a doomed attempt to be purely rational - rather than trying to be a bit more deliberate in action and beliefs.
What I want to know is how bad an effect, exactly, a solar storm would be likely to have. It’s all very vague.
How long will it take to get the power back on? A couple of days? Weeks? Months? Those are very different scenarios.
And, can we do something now to turn the months-long scenario into a week? Maybe we can stockpile a few transformers or something.
Just a writing tip. Might help to define initialisms at least once before using them. EA isn’t self-evidently effective altruism.
I’m in the UK. Rules are stricter than ever but also people are taking it seriously, more than the 2nd lockdown. And it’s January and freezing cold so no one wants to go out anyway.
With neuropreservation you might also lose the sense of embodiment, of being in a body and the body being a part of you. That could be extremely traumatic to the point where you wouldn't want to come back without your body. It is unclear whether that could be successfully countered using a "grown" body or a sophisticated simulation if you are being uploaded.
Good point. I think it would depend on how useful the word is in describing the world. If your culture has very different norms between “boyfriend/girlfriend” and fiancé then a replacement for fiancé would likely appear.
I suppose that on one extreme you would have words that are fundamental to human life or psychology e.g. water, body, food, cold. These I’m sure would reappear if banned. Then on the other extreme you have words associated with somewhat arbitrary cultural behaviour e.g. thanksgiving, tinsel, Twitter, hatchback. These words may not come back if the thing they are describing is also banned.
Uncle/father is an interesting one. Those different meanings could be described with compound words. Father could be “direct makuakane” and uncle “brother makuakane”, or something like that. We already use compound words in family relations in English, like “grandfather”, whereas in Spanish it is “abuelo”.
You might be interested in this paper; it supports the idea of a constant information processing rate in text: "Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche", Coupé, Oh, Dediu and Pellegrino, 2019, Science Advances.
I would agree that language would likely adapt to newspeak by simply using other compound words to describe the same thing. Within a generation or two these would then just become the new word. Presumably the Orwellian government would have to continually ban these new words. Perhaps with enough pressure over enough years the ideas themselves would be forgotten, which is perhaps Orwell's point.
I think the claim that sophisticated word use is caused by intelligence signalling requires more evidence. It is I'm sure one aspect of the behaviour. But a wider vocabulary is also beneficial in terms of being able to more clearly and efficiently disambiguate and communicate ideas. This could be especially true I think when communicating across contexts - having context specific language may help prevent misunderstandings that would arise with a more limited vocabulary. It would be interesting to try and model that with ideas from information theory.
I was thinking something similar. I vaguely remember that the characteristic function proof includes an assumption of n being large, where n is the number of variables being summed. I think that allows you to ignore some higher order n terms. So by keeping those in you could probably get some way to quantify how "close" a resulting distribution is to Gaussian. And you could relate that back to moments quite naturally as well.
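That intuition can be checked numerically. In the characteristic-function expansion, the non-Gaussian correction terms for a sum of n iid variables fall off like 1/n; in particular the excess kurtosis of the sum is the excess kurtosis of a single summand divided by n. A quick simulation with uniform summands (sample sizes are my own choice; a uniform variable has excess kurtosis -1.2):

```python
import random
import statistics

def excess_kurtosis(xs):
    """Sample excess kurtosis: fourth central moment over variance squared, minus 3."""
    m = statistics.fmean(xs)
    s2 = statistics.fmean([(x - m) ** 2 for x in xs])
    m4 = statistics.fmean([(x - m) ** 4 for x in xs])
    return m4 / s2 ** 2 - 3.0

random.seed(1)
samples = 20_000

kurt = {}
for n in (1, 4, 16):
    # Sum n independent uniforms; the sum approaches a Gaussian as n grows.
    sums = [sum(random.random() for _ in range(n)) for _ in range(samples)]
    kurt[n] = excess_kurtosis(sums)
    print(f"n = {n:2d}: excess kurtosis ~ {kurt[n]:+.3f}")
```

The measured excess kurtosis shrinks roughly as -1.2/n, giving exactly the kind of "distance from Gaussian at finite n" quantification suggested above; the Berry-Esseen theorem makes this precise in terms of the third absolute moment.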
- I will become famous by telling the world this news
- I can probably get a professor job off the back of this research
- The open question of why they became intelligent will lead to alien conspiracy theories
- Religions will claim that God must have done this
- Some people will say this is proof we are in a simulation
- New businesses will start up to exploit the intelligent ants
- Interfaces will be built to communicate with the ants
- Some people will say it is a conspiracy theory and the ants aren’t really smart
- The research will lead to advances in swarm computing
- I’ll doubt my research and have to check it over again
- We will need to make peace with ant kind - they would win in a war
- Ant kind will start communicating with each other and making their own culture
- Governments might try and stop the ants from communicating
- Some people will ally with the ants, running secret anthills
- Someone will develop a way for ants to access the internet
- Will the ants have emotions?
- New religions will form that incorporate ant and human kind
- We can send ant colonies in spaceships more easily than people
- Ant colony pilots
- Ants will take over some areas of the economy that they are suited to
- People will protest about ants taking their jobs
- We can send GM ant colonies to Mars and other planets
- We can use ants to spy on people
- Eating ants will become a weird fetish for some people
- Anti-ant areas will be set up with strict regulations to prevent ants from getting in
- Governments will suspect each other of creating the intelligent ants
- Ant colonies will start to work together, creating mega-colonies
- The entire Amazon will become a giant ant hive mind
- There will be peace accords between different species of ants
- People will investigate if bees and other hive type species are also intelligent
- There will be an ant-made novel
- Studying ants will lead to leaps forward in AI
- Someone will run an ant Turing test
- Adam Ant will have a pop revival
- Some people will deify the ants and live by their dictates
- Some ant colonies will become particularly famous
- It will become fashionable to have an ant colony companion in your home
- Artists will collaborate with ants to generate intricate 3d sculptures
- The first ant politician will be elected
- There will be a human ant war
- Some people will move to cold places that have fewer ants
- Alex Jones will invent some conspiracy theory about ants
- The ant news will be drowned out by the US election
- There will be films made about ant kind
- There will be human-ant romances
- People will develop pheromone sprays to influence the ants
- Some ant hives will become addicted to pheromone sprays
- The bill of human rights will be extended to include ant rights
- Studying ant hives will lead to advances in neuroscience
- Whole new university departments will be created to study ants
I like to think of advertising as commercial propaganda. That is technically what it is. Whereas political propaganda's purpose may be to influence people to support a political belief, commercial propaganda is to influence people to support a commercial enterprise.
People tend to think of political propaganda as something from World War 2 and authoritarian regimes. But it was used in the West and it never went away. It just became more sophisticated over time and a part of that was re-branding it to "spin" or "public relations". The original word is useful because it is accurate and it highlights the obvious negative consequences of the practice.