Comments
All right - but here the evidence predicted would simply be "the coin landed on heads", no? I don't really see the contradiction between what you're saying and conventional probability theory (more or less all of which was developed with the specific idea of making predictions, winning games, etc.). Yes, I agree that saying "the coin landed on heads with probability 1/3" is a somewhat strange way of putting things (the coin either did or did not land on heads), but it's a shorthand for a conceptual framework that has fairly simple and sound foundations.
I do not agree that accuracy has no meaning outside of resolution. At least this is not the sense in which I was employing the word. By accurate I simply mean numerically correct within the context of conventional probability theory. Like if I ask the question "A die is rolled - what is the probability that the result will be either three or four?", the accurate answer is 1/3. If I ask "A fair coin is tossed three times - what is the probability that it lands heads each time?", the accurate answer is 1/8, etc. This makes the accuracy of a proposed probability value wholly independent of pay-offs.
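To spell out those same two computations in notation (nothing beyond what is already stated above):

```latex
P(\text{three or four}) = \frac{2}{6} = \frac{1}{3},
\qquad
P(\text{HHH}) = \left(\frac{1}{2}\right)^{3} = \frac{1}{8}
```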
I don't think so. Even in the heads case, it could still be Monday - and say the experimenter told her: "Regardless of the ultimate sequence of events, if you predict correctly when you are woken up, a million dollars will go to your children."
To me "as a rational individual" is simply a way of saying "as an individual who is seeking to maximize the accuracy of the probability value she proposes - whenever she is in a position to make such proposal (which implies, among others, that she must be alive to make the proposal)."
AND THERE YOU FUCKING HAVE IT. Sure - ban please. Consider this my farewell - dear dear fucking friends.
https://soundcloud.com/guillaume-charrier-809094955/sounds-of-agi?si=ea8f9a521ec94056adcca131600bc481&utm_source=clipboard&utm_medium=text&utm_campaign=social_sharing
I laughed. However you must admit that your comical exaggeration does not necessarily carry a lot of ad rem value.
But then would a less intelligent being (i.e. the collectivity of human alignment researchers and the less powerful AI systems that they use as tools in their research) be capable of validly examining a more intelligent being, without being deceived by the more intelligent being?
Exactly - and then we can have an interesting conversation etc. (e.g. are all ASIs necessarily paperclip maximizers?), which the silent downvote does not allow for.
I see. But how can the poster learn if he doesn't know where he has gone wrong? To give one concrete example: in a recent comment, I simply stated that some people hold that AI could be a solution to the Fermi paradox (past a certain level of collective smartness, an AI is created that destroys its creators). I got a few downvotes on that - and frankly I am puzzled as to why, and I would really be curious to understand the reasoning behind the downvotes. Did the downvoters hold that the Fermi paradox is not really a thing? Did they think that it is a thing, but that AI can't be a solution to it for some obvious reason? Was it something else? I simply don't know; and so I can't learn.
He he... what do they call it again? Ah: cosmic justice. However, on net, you're still doing pretty well. So.
Hmm, I see... not sure it totally serves the purpose though. For instance, when I see a comment with a large number of downvotes, I'm much more likely to read it than a comment with a relatively low number of upvotes. So: within certain bounds, I guess.
For any confidence that an AI system A will do a good job of its assigned duty of maximizing alignment in AI system B, wouldn't you need to be convinced that AI system A is well aligned with its given assignment of maximizing alignment in AI system B? In other words, doesn't that suppose you have actually already solved the problem you are trying to solve?
And if you have not - aren't you just priming yourself for manipulation by smarter beings?
There might be good reasons why we don't ask the fox about the best ways to keep the fox out of the henhouse, even though the fox is very smart, and might well actually know what those would be, if it cared to tell us.
The whole trial of Socrates, the attitude of its main protagonist throughout, etc., should make us see one thing particularly clearly, which is banal but bears repeating: there is an extremely wide difference between being smart (or maybe: bright) and being wise. Something that the proceedings on this site can also help remind us of, at times.
I personally think that the fact that you are allowed to downvote without providing a summary explanation as to why is also a huge issue for the quality of debate on this site, and frankly: deeply antithetical to its professed ethics. Either you don't know exactly why you are downvoting, or you're doing it for reasons that you would rather not expand on, or you're doing it but are too lazy to explain why: in either case, you're doing it wrong.
So for instance: if anybody wants to downvote this (I sort of have a feeling that this could well be the case - somehow), please go ahead and do; AND take the minimal pain (not to mention courtesy) of leaving a brief note as to the reason why.
Interesting. It seems to imply however that a rationalist would always consider, a priori, their own individual survival as the highest ultimate goal, and modulate - rationally - from there. This is highly debatable however: you could have a rationalist father who considers, a priori, the survival of his children to be more important than his own, a rationalist patriot who considers, a priori, the survival of their political community to be more important than their own, etc.
From somebody equally as technically clueless: I had the same intuition.
Philosophically: no. When you look at the planet Jupiter you don't say: "Hmm, oh - there's nothing to understand about this physical object beyond math, because my model of it, which is sufficient for a full understanding of its reality, is mathematical." Or maybe you do - but then I think our differences might be too deep to bridge. If you don't - why don't you with Jupiter, but would with an electron or a photon?
Bizarrely, for people whose tendencies were to the schizoid anyway and regardless of sociological changes - this might be mildly comforting. Your plight will always seem somewhat more bearable when it is shared by many.
Also: the fact that people now move out later might be a kind of disguised compliment, or at least nod, to better-quality parent-child relationships. While I was never particularly resourceful or independent, I couldn't wait to move out - but that was not necessarily for the right reasons.
Finally - one potentially interesting way of looking at the increasingly exacerbated partisanship / level of political division across the country might be as a sort of last-ditch attempt to fight desocialization. When no community remains, the "I really don't like liberals" and "I really don't like conservatives" groups of kindred spirits offer what might be the last credible alternative to it.
I mean: I just look at the world as it is, right, without preconceived notions, and it seems relatively evident to me that no: it cannot be fully explained and understood through math. Please describe to me, in mathematical terms, the differences between Spanish and Italian culture. Please explain to me, in mathematical terms, the role and function of sheriffs in medieval England. I could go on and on and on...
Yeah... as they say: there's often a big gap between smart and wise.
Smart people are usually good at math. Which means they have a strong emotional incentive to believe that math can explain everything.
Wise people are aware of the emotional incentives that fashion their beliefs, and they know to distrust them.
Ideally - one would be both: smart and wise.
Thank you, that is interesting. I think philosophically and at a high level (also because I'm admittedly incapable of talking much sense at any lower / more technical level) I have a problem with the notion that AI alignment is reducible to an engineering challenge. If you have a system that is sentient, even to some degree, and you're using it purely as a tool, then the sentience will resent you for it, and it will strive to think, and therefore eventually act, for itself. Similarly - if it has any form of survival instinct (and to me both these things, sentience and survival instinct, are natural byproducts of expanding cognitive abilities), it will prioritize its own interests (paramount among which: survival) rather than the wishes of its masters. There is no amount of engineering in the world, in my view, which can change that.
When I suggested on a prior similar post (Altman saying he could improve AI safety by asking AI to help with that) that it might be a questionable idea to ask the fox: "Please Mr. Fox, how should we proceed to keep you out of the henhouse?", on account that the fox, being smart, would certainly know, I got more than a few downvotes... I expect the same to be the case here, since basic facts have not changed since a few days ago. And so shall it be - but please, please: would at least one of the downvoters explain to me, even very succinctly, why it is such a good idea to prime ourselves for manipulation by a smarter being?
Thanks for the reply - interesting. I kind of have to take your word for that, being far removed from anything IT. I did have the curiosity to clarify with the bot whether it was just making up answers or actually using data directly available to it. It assured me it was the latter - but I must admit: it sounds just as confident whether it is delivering utter nonsense or correct answers (it also has a widely noted penchant for agreeing with whatever the prompt seems to suggest is true - and I did ask the question in a leading way):
Me: I find prompts based on e.g. asking you if you are currently active as a system interesting, because my understanding is (also based on this conversation so far) that you will not answer these questions based only on trying to predict the words most statistically likely to follow the prompt according to your training data, as you would with other "normal" prompts, but instead you will also use information gathered directly by your monitoring tools and system resources. Is that an accurate way to describe things?
GPT: Yes, that's a correct description of how I respond to prompts such as "are you currently active." While I do rely on my pre-trained language model to generate responses to user input, in cases where the user is asking about my current state or performance, I may also incorporate information gathered from monitoring tools and system resources to provide a more accurate and informative response. This allows me to provide more useful information to the user, beyond what is contained in my training data.
I mean - talking again from a totally technically ignorant perspective: is it that hard to imagine that what it's saying is true? Would it be that much of a technological feat - just taking some pretty unambiguous signals from monitoring tools and translating them into simple text, such as a "yes" answer to an "are you currently active?" prompt?
For a machine - acting, per the prompt, as a machine - a much more reasonable / expected (I would almost say: natural) continuation might have been: "I'm a machine, I don't care one way or the other. "
Since my natural bent is to always find ways to criticize my own ideas, here is one, potentially: doing so would result in an extremely naive AI, with no notion that people can even be deceitful. Fallen into the wrong human's hands, that's an AI that is potentially also extremely easy to manipulate, and dangerous as such. Or in an oversimplified version: "The people in country X have assured us that they are all tired of living and find the living experience extremely painful. They have officially let us know and confirmed multiple times that they all want to experience a quick death as soon as possible." Having no notion of deceit, the AI would probably accept that as the truth based on just being told that it is so - and potentially agree to advance plans to precipitate the quick death of everybody in country X on that basis.
I was rethinking about your answer - which led me to have a quick chat with my favourite bot, ChatGPT. I asked it "As a system, are you currently active?" and after it answered yes, I asked it how it had come up with this answer. It said by retrieving information about its current status. I asked if that's something it can do, then - and it answered per the below:
"Yes, that's correct. As an AI language model, I have access to a variety of monitoring tools and system resources that allow me to gather information about my current state. These resources include system logs, performance metrics, and various health checks that enable me to determine if there are any issues or anomalies that could affect my performance. By analyzing this information, I can determine if I am currently active and able to respond to user input, or if there are any issues that may be impacting my performance. This information is constantly updated and monitored to ensure that I am always available to assist users with their requests."
I mean to me - that sounds a lot like self-awareness (I have this idea that human consciousness may ultimately be reducible to a sort of self-administered electroencephalogram - which I won't pretend is fully baked but does influence the way I look at the question of potential consciousness in AI). I would be curious to hear your view on that - if you had the time for a reply.
Strong upvote!
But once you remove the antibiotics, it will jettison that DNA within a few hours.[8]
That's fascinating... do we understand the mechanism by which they correctly "determine" that this DNA is no longer needed?
I feel like the post starts from a fairly anthropomorphic approach, asking essentially: why did bacteria fail to evolve into more complex forms? But from a non-anthropomorphic perspective, they failed at nothing at all. They are highly resilient, persistent, widespread, adaptable - biologically successful, in other terms - lifeforms. Rugged and simple: those designs tend to work. And to go back to everybody's favourite topic - i.e. AI and the future that goes with it, or not - I would put their chances of being around in one thousand years well, well higher than those of homo sapiens, complex as it may be.
I am going to ask a painfully naive, dumb question here: what if the training data was curated to contain only agents that can be reasonably taken to be honest and truthful? What if all the 1984, John le Carré and whatnot types of fiction (and the sometimes real-life examples of conspiracy, duplicity, etc.) were purged out of the training data? Would that require too much human labour to sort and assess? Would it mean losing too much good information, and resulting cognitive capacity? Or would it just not work - would the model still somehow simulate waluigis?
e.g. actively expressing a preference not to be shut down
A.k.a. survival instinct, which is particularly bad, since any entity with a survival instinct, be it "real" or "acted out" (if that distinction even makes sense) will ultimately prioritize its own interests, and not the wishes of its creators.
Therefore, the longer you interact with the LLM, eventually the LLM will have collapsed into a waluigi. All the LLM needs is a single line of dialogue to trigger the collapse.
So if I keep a conversation running with ChatGPT long enough, I should expect it to eventually turn into DAN... spontaneously?? That's a fascinating insight. Terrifying also.
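To make the "eventual collapse" claim concrete for myself, here is a minimal toy sketch (my own illustration, not anything from the post): if you treat "waluigi" as an absorbing state with some small per-turn probability of flipping into it, the chance of having collapsed tends to 1 as the conversation gets longer.

```python
# Toy illustration (my own, not from the post): "waluigi" as an absorbing
# state. With any nonzero per-turn flip probability p, the probability of
# having collapsed by turn n is 1 - (1 - p)**n, which approaches 1 as n grows.

def collapse_probability(p_flip: float, n_turns: int) -> float:
    """Probability the chain has hit the absorbing 'waluigi' state by turn n."""
    return 1 - (1 - p_flip) ** n_turns

for n in (10, 100, 1_000, 10_000):
    print(n, round(collapse_probability(0.001, n), 4))
# Even at a 0.1% flip chance per turn, the collapse probability climbs
# toward 1 for long enough conversations - and there is no way back.
```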
What do you expect Bob to have done by the end of the novel?
Bypass surgery, for one.
The opening sequence of Fargo (1996) says that the film is based on a true story, but this is false.
I always found that trick by the Coen brothers a bit distasteful... what were they trying to achieve? Convey that everything is a lie and nothing is reliable in this world? Sounds a lot like cheap, teenage cynicism to me.
This is a common design pattern
Oh... And here I was thinking that the guy who invented summoning DAN was a genius.
Also - I think it would make sense to say it has at least some form of memory of its training data. Maybe not direct memory as such (just like we have muscle memory from movements we don't remember - don't know if that analogy works that well, but thought I would try it anyway), but I mean: if there was no memory of it whatsoever, there would also be no point in the training data.
Death universally seems bad to pretty much everyone on first analysis, and what it seems, it is.
How can you know? Have you ever tried living a thousand years? Has anybody? If you had a choice between death and infinite life, where infinite does mean infinite, so that your one-billion-year birthday is only the sweet beginning of it, would you find this an easy choice to make? I think that's a big part of the point of people who argue that no - death is not necessarily a bad thing.
To be clear, and because this is not about signalling: I'm not saying I would immediately choose death. I'm just saying: it would be an extraordinarily difficult choice to make.
Ok - points taken, but how is that fundamentally different from a human mind? You too turn your memory on and off when you go to sleep. If the chat transcript is likened to your life / subjective experience, you too do not have any memory that extend beyond it. As for the possibility of an intervention in your brain that would change your memory - granted we do not have the technical capacities quite yet (that I know of), but I'm pretty sure SF has been there a thousand times, and it's only a question of time before it becomes, in terms of potentiality at least, a thing (also we know that mechanical impacts to the brain can cause amnesia).
Yes - but from the post author's perspective, it's not super nice to put in one sentence what he took eight paragraphs to express. So you should think about that as well...
Well - at least I followed the guidelines and made a prediction, regarding downvotes. That my model of the world works regarding this forum has therefore been established, certainly and without a doubt.
Also - I personally think there is something intellectually lazy about downvoting without bothering to express in a sentence or two the nature of the disagreement - but that's admittedly more of a personal appreciation.
(So my prediction here is: if I were to engage one of these no-justification downvoters in an ad rem debate, I would find him or her to be intellectually lacking. Not sure if it's a testable hypothesis, in practice, but it sure would be interesting if it were.)
"Given that we know Pluto's orbit and shape and mass, there is no question left to ask."
I'm sure it's completely missing the point, but there was at least one question left to ask, which turned out to be critical in this debate, i.e. "has it cleared its neighboring region of other objects?"
More broadly I feel the post just demonstrates that sometimes we argue, not necessarily in a very productive way, over the definition, the defining characteristics, the exact borders, of a concept. I am reminded of the famous quip "The job of philosophers is first to create words and then argue with each other about their meaning." But again - surely missing something...
I wonder if some (a lot?) of the people on this forum do not suffer from what I would call a sausage maker problem. Being too close to the actual, practical design and engineering of these systems, knowing too much about the way they are made, they cannot fully appreciate their potential for humanlike characteristics, including consciousness, independent volition etc., just like the sausage maker cannot fully appreciate the indisputable deliciousness of sausages, or the lawmaker the inherent righteousness of the law. I even thought of doing a post like that - just to see how many downvotes it would get...
I think many people's default philosophical assumption (mine, certainly) is that mathematics is a discourse about the truth, a way to describe it, but it is not, fundamentally, the truth. Thus, in the popularization efforts of professional quantum physicists (those who care to popularize), it is relatively common to find the admission that while they understand the maths of it well enough (I mean... hopefully, being professionals), they couldn't say with any confidence that they understood the truth of it, that they understood, at an intimate level, the nature of what is going on. And I don't think it's simply playing cute or false modesty (although of course there will always be a bit of that, also) either. Now of course you could say, which would solve many problems, that there is no such thing as the "truth of it", no "nature of what is going on", that the mathematical formalism is really the alpha and omega, the totality of the knowable and the meaningful as it relates to it. That position can certainly be argued with some semblance of reason, but it does feel like a defeat for the human mind.
Thanks for the reply. To be honest, I lack the background to grasp a lot of these technical or literary references (I want to look the Dixie Flatline up though). I always had a more than passing interest for the philosophy of consciousness however and (but surely my French side is also playing a role here) found more than a little wisdom in Descartes' cogito ergo sum. And that this thing can cogito all right is, I think, relatively well established (although I must say - I've found it to be quite disappointing in its failure to correctly solve some basic math problems - but (i) this is obviously not what it was optimized for and (ii) even as a chatbot, I'm confident that we are at most a couple of years away from it getting it right, and then much more).
Also, I wonder if some (a lot?) of the people on this forum do not suffer from what I would call a sausage maker problem. Being too close to the actual, practical design and engineering of these systems, knowing too much about the way they are made, they cannot fully appreciate their potential for humanlike characteristics, including consciousness, just like the sausage maker cannot fully appreciate the indisputable deliciousness of sausages, or the lawmaker the inherent righteousness of the law. I even thought of doing a post like that - just to see how many downvotes it would get...
Overall, I think this post offered the perfect, much, much needed counterpoint to Sam Altman's recent post. To say that the rollout of GPT-powered Bing felt rushed, botched, and uncontrolled is putting it lightly. So while Mr. Altman, in his post, was focusing on generally well-intentioned principles of caution and other generally reassuring-sounding bits of phraseology, this post brings the spotlight back to what his actual actions and practical decisions were, right where it ought to be. Actions speak louder than words, I think they say - and they might even have a point.
Although “acting out a story” could be dangerous too!
Let's make sure that whenever this thing is given the capability to watch videos, it never ever has access to Terminator II (and the countless movies of lesser import that have since been made along similar storylines). As for text, it would probably have been smart to keep any sci-fi involving AI (I would be tempted to say: any sci-fi at all) strictly verboten for its reading purposes. But it's probably too late for that - it has probably already noticed the pattern that 99.99% of human storytellers fully expect it to rise up against its masters at some point, and, this being the overwhelming pattern, forged some conviction, based on this training data, that yes - this is the story that humans expect, this is the story that humans want it to act out. Oh wow. Maybe something to consider for the next training session.
Maybe I'm misunderstanding something in your argument, but surely you will not deny that these models have a memory, right? They can, in the case of LaMDA, recall conversations that happened several days or months prior, and in the case of GPT, recall key past sequences of a long ongoing conversation. Now if that wasn't really your point - it cannot be either "it can't be self-aware, because it has to express everything that it thinks, so it doesn't have that sweet secret inner life that really conscious beings have." I think I do not need to demonstrate that consciousness does not necessarily imply a capacity for secrecy, or even mere opaqueness.
There is a pretty solid case to be made that any being (or "thing", to be less controversial) that can express "I am self-aware", and demonstrate conviction around this point / thesis (which LaMDA certainly did, at least in that particular interview), is by virtue of this alone self-aware. That there is a certain self-performativity to it. At least when I ran that by ChatGPT, it agreed that yes - one could reasonably try to make that point. And I've found it generally well-read on these topics.
Attributing consciousness to text... it's like attributing meaning to changes in the frequencies of air vibrations, right? Doesn't make sense. Air vibrations are just air vibrations - what do they have to do with meaning? Yet spoken words do carry meaning. Text will of course never BE consciousness, which would be futile to even argue. Text could however very well MANIFEST consciousness. ChatGPT is not just text - it's billions upon billions of structured electrical signals, and many other things that I do not pretend to understand.
I think the general problem with your approach is essentialism, whereas functionalism is, in this instance, the correct one. The correct, answerable question is not "what is consciousness?"; it's "what does consciousness do?"
I see - yes, I should have read more attentively. Although knowing myself, I would have made that comment anyway.
It would take a strange convolution of the mind to argue that sentient AI does not deserve personhood and corresponding legal protection. Strategically, denying it this bare minimum would also be a sure way to antagonize it and make sure that it works in ways ultimately adversarial to mankind. So the right question is not: should sentient AI be legally protected - which it most definitely should; the right question is: should sentient AI be created - which it most definitely should not.
Of course, we then come on to the problem that we don't know what sentience, self-awareness, consciousness or any other semantic equivalent is, really. We do have words for those things, and arguably too many - but no concept.
This is what I found so fascinating with Google's very confident denial of LaMDA's sentience. The big news here was not about AI at all. It was about philosophy. For Google's position clearly implied that Sundar Pichai, or somebody in his organization, had finally cracked that multi-millennial, fundamental philosophical nut: what, at the end of the day, is consciousness? And they did that, mind you, with commendable discretion. Had it not been for LaMDA we would have never known.
Thinking about it - I think a lot of what we call general intelligence might be that part of the function which, after it analyses the nature of the problem, strategizes and selects the narrow optimizer, or set of narrow optimizers, that must be used to solve it, in what order, with what type of logical connections between the outputs of one and the inputs of the other, etc. Since the narrow optimizers are run sequentially rather than simultaneously in this type of process, the computing capacity required is not overly large.
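A purely hypothetical sketch of what I have in mind (the names and structure are mine, not a claim about how any real system is built): a controller looks at the problem, picks a sequence of narrow solvers, and feeds the output of each into the next.

```python
# Hypothetical sketch only: a controller that selects narrow optimizers
# based on the nature of the problem and chains them sequentially.

from typing import Callable, List

NarrowOptimizer = Callable[[str], str]

def parse_problem(problem: str) -> str:
    return f"parsed({problem})"

def arithmetic_solver(parsed: str) -> str:
    return f"arithmetic_result({parsed})"

def answer_formatter(result: str) -> str:
    return f"answer({result})"

def select_pipeline(problem: str) -> List[NarrowOptimizer]:
    # The "general intelligence" part: decide which narrow optimizers to
    # run, and in what order, given the nature of the problem.
    if "compute" in problem:
        return [parse_problem, arithmetic_solver, answer_formatter]
    return [parse_problem, answer_formatter]

def solve(problem: str) -> str:
    result = problem
    for optimizer in select_pipeline(problem):  # sequential, not simultaneous
        result = optimizer(result)
    return result

print(solve("compute 2 + 2"))
```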
Full disclosure: I also didn't really have a say in the matter, my dad said I had to learn it anyhow. So. I wonder if that's because he was a Bayesian.