Unearthing my old dissertation. Still think there is something to it
I've been thinking about non-AI catastrophic risks.
One that I've not seen talked about is the idea of cancerous ideas: that is, ideas that spread throughout a population and crowd out other ideas for attention and resources.
This could lead to civilisational collapse due to basic functions not being performed.
Safeguards for this are partitioning the idea space and some form of immune system that targets ideas that spread uncontrollably.
Trying something new: a hermetic discussion group on computers.
By corporation I am mainly thinking about current cloud/SaaS providers. There might be a profitable hardware play here, if you can get enough investment to do the R&D.
Self-managing computer systems and AI
One of my factors in thinking about the development of AI is self-managing systems, as humans and animals self-manage.
It is possible that they will be needed to manage the complexity of AI, once we move beyond LLMs. For example, they might be needed to figure out when to train on new data in an efficient way, and how many resources to devote to different AI subprocesses in real time, depending upon the problems being faced.
They will change the AI landscape, making it easier for people to run their own AIs. For this reason it is unlikely that corporations will develop them or release them to the outside world (much like corporations' cloud computing infra is not open source), as it would erode their moats.
Modern computer systems have and rely on the concept of a super user. It will take lots of engineering effort to remove that and replace it with something new.
With innovation being considered the purview of corporations, are we going to get stuck in a local minimum of cloud-compute-based AI that is easy for corporations to monetise?
Looks like someone has worked on this kind of thing for different reasons https://www.worlddriven.org/
I was thinking that evals controlling the deployment of LLMs could be something that multiple stakeholders need to agree upon.
But really it is a general use pattern.
Agreed code as coordination mechanism
Code nowadays can do lots of things, from buying items to controlling machines. This presents code as a possible coordination mechanism, if you can get multiple people to agree on what code should be run in particular scenarios and situations, that can take actions on behalf of those people that might need to be coordinated.
This would require moving away from the “one person committing code and another person reviewing” code model.
This could start with many people reviewing the code; people could write their own test sets against the code, or AI agents could be deputised to review the code (when that becomes feasible). Only when an agreed-upon number of people think the code should be merged would it be merged into the main system.
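As a rough illustration of the "agreed-upon number of people" rule, here is a minimal sketch in Python. The names `Proposal`, `record_review`, `may_merge` and the reviewer names are all hypothetical, invented for this example; a real system would also need authentication, audit logs and so on.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    """A proposed code change awaiting stakeholder sign-off (hypothetical model)."""
    change_id: str
    approvals: set = field(default_factory=set)
    rejections: set = field(default_factory=set)

    def record_review(self, reviewer: str, approve: bool) -> None:
        # A reviewer's latest verdict replaces any earlier one.
        self.approvals.discard(reviewer)
        self.rejections.discard(reviewer)
        (self.approvals if approve else self.rejections).add(reviewer)


def may_merge(proposal: Proposal, stakeholders: set, quorum: int) -> bool:
    """Allow a merge only when at least `quorum` recognised stakeholders approve."""
    valid_approvals = proposal.approvals & stakeholders
    return len(valid_approvals) >= quorum


# Illustrative run with made-up reviewers.
stakeholders = {"alice", "bob", "carol", "dave"}
proposal = Proposal("change-1")
proposal.record_review("alice", True)
proposal.record_review("bob", True)
proposal.record_review("mallory", True)  # not a stakeholder, so not counted
blocked = may_merge(proposal, stakeholders, quorum=3)   # only two valid approvals
proposal.record_review("carol", True)
allowed = may_merge(proposal, stakeholders, quorum=3)   # three valid approvals
print(blocked, allowed)
```

The only design choice here is that approvals from people outside the agreed stakeholder set are ignored, so the quorum cannot be padded by outsiders.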
Code would be automatically deployed using GitOps, and the people administering the servers would be audited to make sure they didn’t interfere with the running of the system without people noticing.
Code could replace regulation in fast-moving scenarios, like AI. There might have to be legal contracts stating that you can’t deploy the agreed-upon code, or use it by itself, outside of the coordination mechanism.
As well as thinking about the need for the place in terms of providing a space for research, it is probably worth thinking about the need for a place in terms of what it provides the world. What subjects are currently under-represented in the world and need strong representation to guide us to a positive future? That will guide who you want to lead the organisation.
I admit that it is extreme circumstances that would make slavery consensual and justified. My thinking was if existential risk was involved, you might consent to slavery to avert it. It would have to be a larger entity than a single human doing the enslaving, because I think I agree that individuals shouldn't do consequentialism. Like being a slave to the will of the people, in general. Assuming you can get that in some way.
I don't follow the reasoning here
So let's say the person has given up autonomy to avert existential risk; they should perhaps get something in return. Maybe they get influence, but they can't use that influence for their own benefit (as one of the deontological rules stipulates that is disallowed). So they are stuck trying to avert existential risk with no payoff. If you unenslave them, you remove the will of the people's voice and maybe increase existential risk or s-risks.
Hmm, sorry, went off on a bit of a tangent here. All very unlikely, agreed.
Tangential, but is there ever justified unconscious slavery? For example, if you were asked whether you consent to slavery and then had your mind wiped, might you get into a situation where the slave doesn't know they consented to it, but the slave master is justified in treating them like a slave?
You would probably need a justification for the master slave relationship. Perhaps it is because it needs to be hidden for a good reason? Or to create a barrier against interacting with the ethical. In order to dissolve such slavery, understanding the justifications for why the slavery started would be important.
Proposal for new social norm - explicit modelling
Something that I think would make rationalists more effective at convincing people is if we had explicit models of the things we care about.
Currently we are at the stage of physicists arguing that the atom bomb might ignite the atmosphere without concrete math and models of how that might happen.
If we do this for lots of issues and have a norm of making models composable this would have further benefits.
- People would use the models to make real world decisions with more accuracy
- We would create frameworks for modelling that would be easily composable, that other people would use
Both would raise the status and knowledge of the rationalist community.
Does it make sense to plan for one possible world or do you think that the other possible worlds are being adequately planned for and it is only the fast unilateral take off that is neglected currently?
Limiting AI to operating in space makes sense. You might want to pay off or compensate all space launch capability in some way as there would likely be less need.
Some recompense for the people who paused working on AI or were otherwise hurt in the build up to AI makes sense.
Also trying to communicate ahead of time what a utopic vision of AI and humans might look like, so the cognitive stress isn't too major is probably a good idea to commit to.
Committing to support multilateral acts if unilateral acts fail is probably a good idea too. Perhaps even partnering with a multilateral effort so that effort on shared goals can be spread around?
Relatedly, I am thinking about improving the Wikipedia page on recursive self-improvement. Does anyone have any good papers I should include? Ideally with models.
I'm starting a new blog here. It is on modelling self-modifying systems, starting with AI. Criticisms welcome
I'm wary about that one, because that isn't a known "general" intelligence architecture, so we can expect AIs to make better learning algorithms for deep neural networks, but not necessarily themselves.
I'd like to see more discussion of this, I read some of the FOOM debate but I'm assuming that there has been more discussion of this important issue since?
I suppose the key question is about recursive self-improvement. We can grant hardware improvement (improved hardware allows the design of more complex and better hardware) because we are on that treadmill already. But how likely is algorithmic self-improvement? For an intelligence to be able to improve itself algorithmically, the following seem to need to hold.
- The system needs to understand itself
- There has to be some capacity that can be improved without detriment to some other capacity (else you are doing some self-optimization and not necessarily improvement)
If it is the memeplex that gives us our generality (as is suggested by our flowering of discovery over the past 250 years compared to the past 300,000 years of homo sapiens), it might not be understandable. It would be in the weights or equivalents in whatever the AI uses. No human would understand it either.
Fiddling about with weights without knowledge would likely lead to trade offs and so you might not have the second consideration holding.
I'm not saying AI won't change history, but we need an accurate view of how it will change things.
Found "The Future of Man" by Pierre Teilhard de Chardin in a bookshop. Tempted to write a book review. It discusses some interesting topics, like the planetisation of Mankind. However, it treats them as inevitable, rather than as something contingent on us getting our act together. Anyone interested in a longer review?
Edit: I think his faith in the super natural plays a part in the assumption of inevitability.
That's true. Communities that can encourage truth speaking and exploration will probably get more of it and be richer for it in the long term.
Not really part of the LessWrong community at the moment, but I think evolutionary dynamics will be the next thing.
Not just of AI, but of post-humans, uploads etc. Someone will need to figure out what kind of selection pressures there should be so that things don't go to ruin in an explosion of variety.
All competitive situations against ideal learning agents are anti-inductive in this sense, because they can note regularities in their own actions and avoid them in the future, just as you can note regularities in their actions and exploit them. The usefulness of induction is based on the relative speeds of induction of the learning agents.
As such, anti-induction appears in situations like bacterial resistance to antibiotics. We spot a chink in the bacteria's armour, and we can predict that that chink will become less prevalent and our strategy less useful.
So I wouldn't mark markets as special, just the most extreme example.
I find neither that convincing. Justice is not a terminal value for me, so I might sacrifice it for Winning. I preferred reading the first, but that is no indication of what a random person may prefer.
With international affairs, isn't stopping the aggression the main priority? That is stopping the death and suffering of humans on both sides? Sure it would be good to punish the aggressors rather than the retaliators but if that doesn't stop the fighting it just means more people are dying.
Also there is a difference between the adult and the child, the adult relies on the law of the land for retaliation the child takes it upon himself when he continues the fight. That is the child is a vigilante, and he may punish disproportionately e.g. breaking a leg for a dead leg.
I don't really have a good enough grasp on the world to predict what is possible; it all seems too unreal.
One possibility is to jump one star away back towards earth and then blow up that star, if that is the only link to the new star.
Re: "MST3K Mantra"
Illustrative fiction is a tricky business, if this is to be part of your message to the world it should be as coherent as possible, so you aren't accidentally lying to make a better story.
If it is just a bit of fun, I'll relax.
I wonder why the babies don't eat each other. There must be a huge selective pressure to winnow down your fellows to the point where you don't need to be winnowed. This would in turn select for babies that are small-brained, large and quick-growing, at the least. There might also be selective pressure to be partially distrusting of your fellows (assuming there was some cooperation), which might carry over into adulthood.
I also agree with the points Carl raised. It doesn't seem very evolutionarily plausible.
"Except to remark on how many different things must be known to constrain the final answer."
What would you estimate the probability of each thing being correct is?
Reformulate to least regret after a certain time period, if you really want to worry about the resource usage of the genie.
Personally I believe in the long slump. However, I believe human optimism will make people rally the market every so often. The very fact that most people believe the stock market will rise will make it rise at least once or twice before people start to get the message that we are in the long slump.
Eliezer, didn't you say that humans weren't designed as optimizers? That we satisfice. The reaction you got is probably a reflection of that. The scenario ticks most of the boxes humans have, existence, self-determination, happiness and meaningful goals. The paper clipper scenario ticks none. It makes complete sense for a satisficer to pick it instead of annihilation. I would expect that some people would even be satisfied by a singularity scenario that kept death as long as it removed the chance of existential risk.
Dognab, your arguments apply equally well to any planner. Planners have to consider the possible futures and pick the best one (using a form of predicate), and if you give them infinite horizons they may have trouble. Consider a paper clip maximizer: every second it fails to use its full ability to paper clip things in its vicinity, it is losing possibly useful paper clipping energy to entropy (solar fusion etc). However, if it sits and thinks for a bit it might discover a way to hop between galaxies with minimal energy. So what decision should it make? Obviously it would want to run some simulations to see if there are gaps in its knowledge. How detailed should the simulations be, so it can be sure it has ruled out the galaxy-hopping path?
I'll admit I was abusing the genie trope somewhat. But then I am sceptical of FOOMing anyway, so when asked to think about genies/utopias, I tend to suspend all disbelief in what can be done.
Oh, and Belldandy is not annoying because she has broken down in tears (perfectly natural), but because she bases her happiness too much on what Stephen Grass thinks of her. A perfect mate for me would tell me straight what was going on, and if I hated her for it (when not her fault at all), she'd find someone else because I'm not worth falling in love with. I'd want someone with standards for me to meet, not unconditional creepy fawning.
Bogdan Butnaru:
What I meant was that the AI would keep inside it a predicate Will_Pearson_would_regret_wish (based on what I would regret), and apply that to the universes it envisages while planning. A metaphor for what I mean is the AI telling a virtual copy of me all the stories of the future, from various viewpoints, and the virtual me not regretting the wish. Of course I would expect it to be able to distill a non-sentient version of the regret predicate.
So if it invented a scenario where it killed the real me, the predicate would still exist and say false. It would be able to predict this, and so not carry out this plan.
If you want to, generalize to humanity. This is not quite the same as CEV, as the AI is not trying to figure out what we want when we would be smarter, but what we don't want when we are dumb. Call it coherent no regret, if you wish.
CNR might be equivalent of CEV if humanity wishes not to feel regret in the future for the choice. That is if we would regret being in a future where people regret the decision, even though current people wouldn't.
I don't believe in trying to make utopias but in the interest of rounding out your failed utopia series how about giving a scenario against this wish.
I wish that the future will turn out in such a way that I do not regret making this wish. Where I is the entity standing here right now, informed about the many different aspects of the future, in parallel if need be (i.e. if I am not capable of grokking it fully then many versions of me would be focused on different parts, in order to understand each sub-part).
I'm reminded by this story that while we may share large parts of psychology, what makes a mate have an attractive personality is not something universal. I found the cat girl very annoying.
Personally I don't find the scientific weirdtopia strangely appealing. Finding knowledge for me is about sharing it later.
Utopia originally meant no-place, I have a hard time forgetting that meaning when people talk about them.
I'd personally prefer to work towards negated-dystopias. Which is not necessarily the same thing as working towards Utopia, depending on how broad your class of dystopia is. For example rather than try and maximise Fun, I would want to minimize the chance that humanity and all its work were lost to extinction. If there is time and energy to devote to Fun while humanity survives then people can figure it out for themselves.
Time scaling is not unproblematic. We don't have a single clock in the brain, clocks must be approximated by neurons and by neural firing. Speeding up the clocks may affect the ability to learn from the real world (if we have a certain time interval for associating stimuli).
We might be able to adapt, but I wouldn't expect it to be straight forward.
A random utility function will do fine, iff the agent has perfect knowledge.
Imagine, if you will, a stabber: something that wants to turn the world into things that have been stabbed. If it knows that stabbing itself will kill it, it will know to stab itself last. If it doesn't know that stabbing itself will lead to it no longer being able to stab things, then it may not do well in actually achieving its stabbing goal, by stabbing itself too early.
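A toy sketch of the stabber, in Python, shows how the same goal gives different outcomes depending on knowledge. Everything here (the function name `stab_count`, the target names, the fixed ordering) is made up for illustration and is not meant as a model of any real agent design.

```python
def stab_count(targets, agent, knows_self_stab_is_fatal):
    """Count how many targets get stabbed before the agent can no longer act."""
    order = list(targets)
    if knows_self_stab_is_fatal:
        # With the relevant knowledge, the agent defers itself to the very end.
        order = [t for t in order if t != agent] + [agent]
    stabbed = 0
    for target in order:
        stabbed += 1
        if target == agent:
            break  # the agent has stabbed itself and cannot continue
    return stabbed


# Hypothetical world: the agent appears second in its own naive ordering.
world = ["rock", "stabber", "tree", "house"]
informed = stab_count(world, "stabber", knows_self_stab_is_fatal=True)
naive = stab_count(world, "stabber", knows_self_stab_is_fatal=False)
print(informed, naive)
```

Both agents have the same utility function; only the informed one gets through the whole list.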
I'd agree with the sentiment in this post. I'm interested in building artificial brain stuff, more than building Artificial People. That is a computational substrate that allows the range of purpose-oriented adaptation shown in the brain, but with different modalities. Not neurally based, because simulating neural systems on a system where processing and memory is split defeats the majority of the point of them for me.
Don't you need a person predicate as well? If the RPOP is going to upload us all or something similar, doesn't ve need to be sure that the uploads will still be people.
I suspect the knowledge you get from reading someone's writings is very different from the knowledge you get from working with them or being taught by them. When you work or learn closely with someone, they can see your reasoning processes and correct them when they go astray, at the right point, when they are still newly formed and not too ingrained. Otherwise it relies too much on luck. When in someone's intellectual career should you read OB? Too early and it won't mean much, lacking the necessary background; too late and you will be inured against it (assuming it is the right way to go!).
Autodidacts are going to be most intellectually useful when you need to break new ground and the methodologies of the past aren't the way to solve the problems needed to be solved.
Are you saying "snakes are often deadly poisonous to humans" is an instrumental value?
I'd agree that dying is bad therefore avoid deadly poisonous things. But I still don't see that snakes have little xml tags saying keep away, might be harmful.... I don't see that as a value of any sort.
Morality does not compress; it's not something you can learn just by looking at the (nonhuman) environment or by doing logic; if you want to get all the details correct, you have to look at human brains.
Why? Why can't you rewrite this as "complexity and morality"?
You may talk about the difference between mathematical and moral insights. Which is true, but then mathematical insights aren't sufficient for intelligence. Maths doesn't tell you whether a snake is poisonous and will kill you or not....
The number of people living today because their ancestors invested their money in themselves/their status and their children, all of us:
The number of people living today because they or someone else invested their money in cryonics or other scheme to live forever, 0.
Not saying that things won't change in the future, but there is a tremendously strong bias to spend your resources on ambulatory people and new people, because that has been what has worked previously.
Women might have stronger instincts in this respect as they have been more strongly selected for the ability to care for their children (unlike men).
If you want to change this state of affairs, swiftly at least, you have to tap into our common psyche as successful replicators and have it pass the "useful for fitness test". This would be as easy as making it fashionable or a symbol of high status, get Obama to sign up publicly and I think you would see a lot more interest.
High status has been something sought after because it gets you better mates and more of them (perhaps illicitly).
Will, your example, good or bad, is universal over singletons, nonsingletons, any way of doing things anywhere.
My point was not that non-singletons can see it coming. But if one non-singleton tries self-modification in a certain way and it doesn't work out, then other non-singletons can learn from the mistake (or, in the worst case, the evolutionary one, the descendants of people curious in a certain way would be outcompeted by those that instinctively didn't try the dangerous activity). Less so with the physics experiments, depending on the dispersal of non-singletons and the range of the physical destruction.
There are some types of knowledge that seem hard to come by (especially for singletons). The type of knowledge is knowing what destroys you. As all knowledge is just an imperfect map, there are some things a priori that you need to know to avoid. The archetypal example is in-built fear of snakes in humans/primates. If we hadn't had this while it was important we would have experimented with snakes the same way we experiment with stones/twigs etc and generally gotten ourselves killed. In a social system you can see what destroys other things like you, but the knowledge of what can kill you is still hard won.
If you don't have this type of knowledge you may step into an unsafe region, and it doesn't matter how much processing power or how much you correctly use your previous data. Examples that might threaten singletons:
1) Physics experiments: the model says you should be okay, but you don't trust your model under these circumstances, which is the reason to do the experiment.
2) Self-change: your model says that the change will be better, but the model is wrong. It disables the system to a state it can't recover from, i.e. not an obvious error but something that renders it ineffectual.
3) Physical self-change: large-scale unexpected effects from feedback loops at different levels of analysis, e.g. things like the swinging/vibrating bridge problem, but deadly.
No diminishing returns on complexity in the region of the transition to human intelligence: "We're so similar to chimps in brain design, and yet so much more powerful; the upward slope must be really steep."
Or there is no curve and it is a random landscape with software being very important...
Scalability of hardware: "Humans have only four times the brain volume of chimps - now imagine an AI suddenly acquiring a thousand times as much power."
Bottlenose dolphins have twice the brain volume of normal dolphins (comparable to our brain volume), yet aren't massively more powerful compared to them. Asian elephants have 5 times the weight...
I personally find the comparison between spike frequency and clockspeed unconvincing. It glosses over all sorts of questions of whether the system can keep all the working memory it needs in 2MB or whatever processor cache it has. Neurons have the advantage of having local memory, no need for the round trip off chip.
We also have no idea how neurons really work; there has been recent work on the role of methylation of DNA in memory. Perhaps it would be better to view neural firing as communication between mini computers, rather than processing in itself.
I'm also unimpressed with large numbers. 10^15 operations per second is not enough to process the positions of 1 gram of hydrogen atoms quickly; in fact it would take about 20 years for it to do so (assuming one op per atom). So this is what we have to worry about when planning to atomically change our world to the optimal form. Sure, it is far more than we can consciously do, and quite possibly a lot more than we can do unconsciously as well. But it is not mind-bogglingly huge compared to the real world.
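A quick back-of-the-envelope check of that estimate, assuming "10^15 operations" means 10^15 operations per second, one operation per atom, and atomic hydrogen at about 1.008 g/mol (the variable names are just for this sketch). Under those assumptions it comes out at roughly two decades:

```python
AVOGADRO = 6.02214076e23               # atoms per mole (exact SI value)
MOLAR_MASS_H = 1.008                   # grams per mole of atomic hydrogen
OPS_PER_SECOND = 1e15                  # assumed reading of "10^15 operations"
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year in seconds

atoms = AVOGADRO / MOLAR_MASS_H        # atoms in 1 gram of hydrogen
seconds = atoms / OPS_PER_SECOND       # one operation per atom
years = seconds / SECONDS_PER_YEAR

print(f"{atoms:.2e} atoms -> {years:.1f} years")
```

That is one pass over the atoms, with a single operation each; anything resembling planning over their configurations would cost far more.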
The universe doesn't have to be kind and make all problems amenable to insight....
There are only a certain number of short programs, and once a program gets above a certain length it is hard to compress (I can't remember the reference for this, so it may be wrong, can anyone help?). We can of course reorder things, but then we have to make things currently simple complex.
That said, I do think insight will play some small part in the development of AI, but there may well be a hell of a lot of parameter tweaking that we don't understand, with values we can't explain.
You are right about the smaller is faster and local being more capable of reacting. But Eliezer's arguments are predicated on there being a type of AI that can change itself without deviation from a purpose. So an AI that splits itself into two may deviate in capability, but should share the same purpose.
Whether such an AI is possible or would be effective in the world is another matter.
We also suppose that the technology feeding Moore's Law has not yet hit physical limits. And that, as human brains are already highly parallel, we can speed them up even if Moore's Law is manifesting in increased parallelism instead of faster serial speeds - we suppose the uploads aren't yet being run on a fully parallelized machine, and so their actual serial speed goes up with Moore's Law. Etcetera.
Moore's Law says nothing about speed in the canonical form. You should probably define exactly what variant you are using.
Consider the following. Chimpanzees make tools. The first hominid tools were simple chipped stone from 2.5 million years ago. Nothing changed for a million years. Then Homo erectus came along with Acheulian tech, and nothing happened for a million years. Then two hundred thousand years ago H. sapiens appeared and tool use really diversified. The brains had been swelling from 3 million years ago.
If brains had been getting more generally intelligent at that time as they were increasing in size, it is not shown. They may have been getting better at wooing women and looking attractive to men.
This info has been cribbed from the Red Queen page 313 hardback edition.
I would say this shows a discontinuous improvement in intelligence, where intelligence is defined as the ability to generally hit a small target in search space about the world, rather than the ability to get into another hominid's pants.