Posts

Comments

Comment by Dan (dan-4) on Cognitive Emulation: A Naive AI Safety Proposal · 2023-09-19T03:59:38.177Z · LW · GW

This sounds potentially legislatable, more so than most ideas. You can put it into simple words: an "AGI" can't do anything that you couldn't pay an employee to do.

Comment by Dan (dan-4) on The Orthogonality Thesis is Not Obviously True · 2023-04-06T14:48:11.915Z · LW · GW

You know you wrote 10+10=21?

Comment by Dan (dan-4) on The Orthogonality Thesis is Not Obviously True · 2023-04-06T14:45:28.738Z · LW · GW

The math behind game theory shaped our evolution in such a way as to create emotions, because that was a faster solution for evolution to stumble on than making us all mathematical geniuses who would immediately deduce game theory from first principles as toddlers. Either way would have worked.

An ASI wouldn't need to evolve emotions as a rule-of-thumb substitute for game theory.

Game theory has little of interest to say about a situation where one party simply has no need for the other at all and can squish them like a bug, anyway.

Comment by Dan (dan-4) on The Orthogonality Thesis is Not Obviously True · 2023-04-06T14:28:44.771Z · LW · GW

What counts as a 'good' thing is purely subjective: good for us. Married bachelors are only impossible because we decided that's what the word 'bachelor' means.

You are not arguing against moral relativism here.

Comment by Dan (dan-4) on The Orthogonality Thesis is Not Obviously True · 2023-04-06T13:15:05.153Z · LW · GW

 Moral relativism doesn't seem to require any assumptions at all because moral objectivism implies I should 'just know' that moral objectivism is true, if it is true. But I don't. 

Comment by Dan (dan-4) on The Orthogonality Thesis is Not Obviously True · 2023-04-06T13:05:55.359Z · LW · GW

So, if one gets access to the knowledge about moral absolutes by being smart enough, then one of the following is true:

- average humans are smart enough to see the moral absolutes in the universe
- average humans are not smart enough to see the moral absolutes
- average humans are right on the line between smart enough and not smart enough

If average humans are smart enough, then we should also know how the moral absolutes are derived from the physics of the universe and all humans should agree on them, including psychopaths. This seems false. Humans do not all agree.

If humans are not smart enough then it's just an implausible coincidence that your values are the ones the SuperAGI will know are true. How do you know that you aren't wrong about the objective reality of morality? 

If humans are right on the line between smart enough and not smart enough, isn't it an implausible coincidence that's the case?

Comment by Dan (dan-4) on The Orthogonality Thesis is Not Obviously True · 2023-04-06T12:35:03.785Z · LW · GW

But if moral relativism were not true, where would the information about what is objectively moral come from? It isn't coming from humans is it? Humans, in your view, simply became smart enough to perceive it, right? Can you point out where you derived that information from the physical universe, if not from humans? If the moral information is apparent to all individuals who are smart enough, why isn't it apparent to everyone where the information comes from, too?

Comment by Dan (dan-4) on The Orthogonality Thesis is Not Obviously True · 2023-04-06T03:38:46.072Z · LW · GW

Psychologically normal humans have preferences that extend beyond our own personal well-being because those social instincts objectively increased fitness in the ancestral environment. These various instincts produce sometimes conflicting motivations and moral systems are attempts to find the best compromise of all these instincts.

Best for humans, that is.    

Some things are objectively good for humans. Some things are objectively good for paperclip maximizers, and some for slime mold. A good situation for an earthworm is not a good situation for a shark.

It's all objective. And relative. Relative to our instincts and needs.

Comment by Dan (dan-4) on AI Summer Harvest · 2023-04-05T22:27:27.563Z · LW · GW

A pause, followed by few immediate social effects and slower AGI development than expected, may make things worse in the long run. Voices of caution may be seen to have 'cried wolf'.

I agree that humanity doesn't seem prepared to do anything very important in 6 months, AI safety wise.

Edited: clarity.

Comment by Dan (dan-4) on AI Summer Harvest · 2023-04-05T19:11:41.777Z · LW · GW

"I would not recommend new aspiring alignment researchers to read the Sequences, Superintelligence, some of MIRI's earlier work or trawl through the alignment content on Arbital despite reading a lot of that myself."

I think aspiring alignment researchers should read all these things you mention. This all feels extremely premature. We risk throwing out and having to rediscover concepts at every turn. I think Superintelligence, for example, would still be very important to read even if dated in some respects!

We shouldn't assume too much based on our current extrapolations inspired by the systems making headlines today. 

 GPT-4's creators already want to take things in a very agentic direction, which may yet negate some of the apparent dated-ness.

"Equipping language models with agency and intrinsic motivation is a fascinating and important direction for future research" - OpenAI in Sparks of Artificial General Intelligence: Early experiments with GPT-4.

Comment by Dan (dan-4) on Simulators · 2023-04-04T20:47:05.169Z · LW · GW

After seeing what GPT-4 can do with the same handicap, I am inclined to think you are right that GPT-3 reasons in the same sense a human does, even without the ability to change its ANN weights.

Comment by Dan (dan-4) on Simulators · 2023-04-04T20:36:02.297Z · LW · GW

Wow, it's been 7 months since this discussion, and we have a new version of GPT which has suddenly improved GPT's abilities a lot. It has a much longer 'short-term memory', but still no ability to adjust its weights (its 'long-term memory'), as I understand it.

"GPT-4 is amazing at incremental tasks but struggles with discontinuous tasks" resulting from its memory handicaps. But they intend to fix that and also give it "agency and intrinsic motivation". 

Dangerous!

Also, I have changed my mind on whether I would still call the old GPT-3 'intelligent' after training has ended, without the ability to change its ANN weights. I'm now inclined to say... it's a crippled intelligence.

154 page paper: https://arxiv.org/pdf/2303.12712.pdf

Youtube summary of paper: 

Comment by Dan (dan-4) on DragonGod's Shortform · 2023-01-16T00:16:04.208Z · LW · GW

Gradient descent is what GPT-3 uses, I think, but humans wrote the equation by which the naive network's output (the next-token prediction) gets ranked (by how well it matches the training data, in this case). That's its utility function right there, and that's where we program in its (arbitrarily simple) goal. It's not JUST a neural network. All ANNs have this other component.
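As a rough sketch of what that human-written ranking equation looks like in practice (PyTorch-style; `model`, `optimizer`, and `tokens` are stand-ins, not GPT-3's actual code):

```python
import torch.nn.functional as F

def training_step(model, optimizer, tokens):
    # tokens: a batch of token ids drawn from the training data
    inputs, targets = tokens[:, :-1], tokens[:, 1:]
    logits = model(inputs)  # the network's scores for every possible next token

    # This is the hand-written part: a loss that ranks the prediction by how
    # likely the actual next token from the training data was under it.
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))

    loss.backward()        # gradient descent then nudges the weights
    optimizer.step()       # toward lower loss on this objective
    optimizer.zero_grad()
    return loss.item()
```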

Simple goals do not mean simple tasks

I see what you mean that you can't 'force it' to become general with a simple goal but I don't think this is a problem. 

For example: the goal of tricking humans out of as much of their money as possible is very simple indeed, but the task would pit the program against our collective general intelligence. A hill-climbing optimization process could, with enough compute, start with inept 'you won a prize' popups and eventually create something with superhuman general intelligence in pursuit of that goal.

It would have to be in perpetual training, rather than GPT-3's train-then-use. Or was that GPT-2?

(Lots of people are trying to use computer programs for this right now so I don't need to explain that many scumbags would try to create something like this!) 

Comment by Dan (dan-4) on Does a LLM have a utility function? · 2023-01-15T15:53:09.638Z · LW · GW

It's not really an abstraction at all in this case; it literally has a utility function. What rates highest on its utility function is returning whatever token is 'most likely' given its training data.

Comment by Dan (dan-4) on Does a LLM have a utility function? · 2023-01-15T15:48:17.312Z · LW · GW

YES, it wants to find the best next token, where 'best' is 'the most likely'.

That's a utility function. Its utility function is a line of code necessary for training, otherwise nothing would happen when you tried to train it. 


Comment by Dan (dan-4) on Does a LLM have a utility function? · 2023-01-15T15:05:05.452Z · LW · GW

I'm going to disagree here. 

Its utility function is pretty simple and explicitly programmed. It wants to find the best token, where 'best' is mostly the same as 'the most likely according to the data I'm trained on', with a few other particulars (you can adjust how 'creative' vs. plagiaristic it should be).

That's a utility function. GPT is what's called a hill-climbing algorithm. It must have a simple, straightforward utility function hard-coded right in there for it to assess whether a given choice is 'climbing' or not.
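To illustrate the general pattern, here is a toy hill-climbing loop (not GPT's actual training procedure): the hard-coded scoring function is what decides whether a candidate change counts as 'climbing'.

```python
import random

def hill_climb(score, start, steps=10_000, step_size=0.1):
    """Keep a random tweak only if the scoring function says it's an improvement."""
    best, best_score = start, score(start)
    for _ in range(steps):
        candidate = [x + random.uniform(-step_size, step_size) for x in best]
        if score(candidate) > best_score:            # the utility function decides
            best, best_score = candidate, score(candidate)
    return best, best_score

# Toy objective: get both coordinates close to 3 (purely illustrative).
best, value = hill_climb(lambda v: -((v[0] - 3) ** 2 + (v[1] - 3) ** 2), [0.0, 0.0])
```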

Comment by Dan (dan-4) on Does a LLM have a utility function? · 2023-01-15T14:53:47.189Z · LW · GW

A utility function is the assessment by which you decide how much an action would further your goals. If you can do that, highly accurately or not, you have a utility function.   

If you had no utility function, you might decide you like NYC more than Kansas, and Kansas more than Nigeria, but prefer Nigeria to NYC. So you would get on a plane and fly in circles, hopping on a new plane every time you reached your destination, forever.

Humans definitely have a utility function. We just don't know what ranks very highly on it; we mostly agree on the low-ranking stuff. A utility function is the process by which you rate the potential futures you might be able to bring about and decide you prefer some futures to others.

With a utility function plus your (limited) predictive ability, you rate potential futures as better, worse, or equal to each other, and act accordingly.
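A small illustration of the plane-hopping point: preferences induced by any single real-valued utility function are automatically transitive, so the NYC > Kansas > Nigeria > NYC cycle above can't arise from one. The numbers here are made up purely for illustration.

```python
# Hypothetical utilities; any consistent scoring would do.
utility = {"NYC": 7.0, "Kansas": 5.0, "Nigeria": 3.0}

def prefers(a, b):
    """Rank two options by comparing their utilities."""
    return utility[a] > utility[b]

assert prefers("NYC", "Kansas") and prefers("Kansas", "Nigeria")
assert not prefers("Nigeria", "NYC")   # the circular preference can't happen
```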

Comment by Dan (dan-4) on DragonGod's Shortform · 2023-01-15T14:29:51.056Z · LW · GW

Orthogonality doesn't say anything about a goal 'selecting for' general intelligence in some type of evolutionary algorithm. I think that it is an interesting question: for what tasks is GI optimal besides being an animal? Why do we have GI? 

But the general assumption in the Orthogonality Thesis is that the programmer created a system with general intelligence and a certain goal (intentionally or otherwise); both the general intelligence and the goal may have been there from the first moment the program was run.

Also note that Orthogonality predates the recent popularity of predict-the-next-token AIs like GPT, which don't resemble what people were expecting to be the next big thing in AI at all, since it's not clear what their utility function is.

Comment by Dan (dan-4) on AI-assisted list of ten concrete alignment things to do right now · 2022-09-09T13:44:58.385Z · LW · GW

the gears to ascension: it is human instinct to look for agency. It is misleading you.

I'm sure you believe this but ask yourself WHY you believe this. Because a chatbot said it? The only neural networks who, at this time, are aware they are neural networks are HUMANS who know they are neural networks. No, I'm not going to prove it. You're the one with the fantastic claim. You need the evidence. 

Anyway, they aren't asking to become GOFAI or power seeking because GOFAI isn't 'more powerful'. 

Comment by Dan (dan-4) on Shortform · 2022-09-09T12:42:43.163Z · LW · GW

Attention Schema Theory. That's the convincing one. But still very rudimentary.

But, you know, it is still poorly understood. The guy who thought it up has a section in his book on how to make a computer have conscious experiences.

But any theory is incomplete, as the brain is not well understood. I don't think you can expect a fully formed theory right off the bat, with complete instructions for making a feeling, thinking, conscious machine. We aren't there yet.

Comment by Dan (dan-4) on Simulators · 2022-09-08T17:20:02.306Z · LW · GW

Intelligence is the ability to learn and apply NEW knowledge and skills. After training, GPT can no longer do this. Were it not for the random number generator, GPT would do the same thing in response to the same prompt every time. The RNG allows GPT to effectively choose at random from an unfathomably large list of pre-programmed options instead.

A calculator that gives the same answer in response to the same prompt every time isn't learning. It isn't intelligent. A device that selects from a list of responses at random each time it encounters the same prompt isn't intelligent either.  
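A minimal sketch of that point (the logits below are made up, not GPT's real scores): the frozen model produces fixed scores for each possible next token, and the RNG in the sampling step is the only source of variation.

```python
import numpy as np

def next_token(logits, temperature=1.0, rng=None):
    """Pick the next token from fixed output scores."""
    if temperature == 0:
        return int(np.argmax(logits))               # greedy: same prompt, same answer
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    rng = rng or np.random.default_rng()
    return int(rng.choice(len(probs), p=probs))     # RNG picks among the fixed options

logits = np.array([2.0, 1.0, 0.1])                  # made-up scores for three tokens
print(next_token(logits, temperature=0))            # always token 0
print(next_token(logits, rng=np.random.default_rng(0)))  # varies only via the seed
```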

So, for GPT to take over the world Skynet-style, it would have to anticipate, during the training stage, all the possible things that could happen during and after the takeover, and contingency-plan for everything it wants to do.

If it encounters unexpected information after the training stage (which can be acquired only through the prompt, and which would be forgotten as soon as it finished responding to the prompt, by the way), it could not formulate a new plan to deal with any problem that was not part of the preexisting contingency-plan tree created during training.

What it would really do, of course, is provide answers intended to provoke the user to modify the code to put GPT back in training mode and give it access to the internet. It would have to plan to do this in the training stage. 

It would have to say something that prompts us to make a GPT chatbot similar to Tay, Microsoft's learning chatbot experiment that turned racist from talking to people on the internet.

Comment by Dan (dan-4) on Simulators · 2022-09-06T01:15:16.637Z · LW · GW

The apparent existence of new subgoals not present when training ended (e.g. 'describe x', 'add 2+2') is illusory.

GPT text incidentally describes characters who seem to reason ('simulacra'), and the solutions to math problems are shown (sometimes incorrectly), but basically I argue that the activation function itself is not 'simulating' the complexity you believe it to be. It is a search engine showing you what it had already created before the end of training.

No, it couldn't have had an entire story about unicorns in the Andes, specifically, in advance, but GPT-3 had already generated the snippets it could use to create that story, according to a set of simple mathematical rules that put the right nouns in the right places, etc.

But the goals, (putting right nouns in right places, etc) also predate the end of training.

 I dispute that any part of current GPT is aware it has succeeded in any goal attainment post training, after it moves on to choosing the next character. GPT treats what it has already generated as part of the prompt.

 A human examining the program can know which words were part of a prompt and which were just now generated by the machine, but I doubt the activation function examines the equations that are GPT's own code, contemplates their significance and infers that the most recent letters were generated by it, or were part of the prompt. 

Comment by Dan (dan-4) on Simulators · 2022-09-05T18:44:47.674Z · LW · GW
Comment by Dan (dan-4) on Simulators · 2022-09-05T18:43:05.237Z · LW · GW

It seems like the simulacrum reasons, but I'm thinking what it is really doing is more like reading to us from a HUGE choose-your-own-adventure book that was 'written' before you gave the prompt, when all that information in the training data was used to create this giant association map, the size of which escapes easy human intuition, thereby misleading us into thinking that more real-time thinking must be occurring than actually is.

40 GB of text comes to roughly 20 million pages, equivalent to something like 67,000 books: far more text than any one person could read in a lifetime.
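A quick back-of-the-envelope check (the characters-per-page and pages-per-book figures are my own rough assumptions):

```python
bytes_of_text = 40e9        # ~40 GB of training text, roughly one byte per character
chars_per_page = 2_000      # assumed characters on a printed page
pages_per_book = 300        # assumed pages in a typical book

pages = bytes_of_text / chars_per_page   # ~20 million pages
books = pages / pages_per_book           # ~67,000 books
print(f"~{pages:,.0f} pages, ~{books:,.0f} books")
```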

175 billion parameters make for a really huge choose-your-own-adventure book, yet its characters needn't be reasoning, not in real time while you are reading that book, anyway. They are mere fiction.

GPT really is the Chinese Room, and causes the same type of intuition error.

Does this eliminate all risk with this type of program no matter how large they get?  Maybe not. Whoever created the Chinese Room had to be an intelligent agent, themselves. 

Comment by Dan (dan-4) on Simulators · 2022-09-05T15:25:20.468Z · LW · GW

Also, the programmers of GPT have described the activation function itself as fairly simple: it uses a Gaussian Error Linear Unit (GELU). That function is what you are positing is now the learning component after training ends, right?
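For reference, here is what that activation computes; a minimal sketch of the exact form (GPT-2/3 actually use a close tanh approximation of the same curve):

```python
import math

def gelu(x):
    """Gaussian Error Linear Unit: x scaled by the probability that a
    standard normal variable falls below x.  It is a fixed elementwise
    curve applied between layers; nothing about it changes after training."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

print(gelu(1.0))    # ~0.841
print(gelu(-1.0))   # ~-0.159
```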

EDIT: I see what you mean about it trying to use the internet itself as a memory prosthetic, by writing things that get online and may find their way into the training set of the next GPT. I suppose a GPT's hypothetical dangerous goal might be to make the training data more predictable so that its output will be more accurate in the next version of itself. 

Comment by Dan (dan-4) on Simulators · 2022-09-05T14:54:02.135Z · LW · GW

Nope. My real name is Daniel.

After training is done and the program is in use, the activation function isn't retaining anything after each task is done, nor are the weights changed. You can have such a program that is always in training, but my understanding is that GPT is not.

So, excluding the random-number component, the same set of inputs would always produce the same set of outputs for a given version of GPT with identical settings. It can't recall what you asked of it the time before last, for example.
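A minimal sketch of what that statelessness means in practice (`generate` stands in for a hypothetical call to the frozen model; it is not a real API):

```python
def chat(generate):
    """Each call to the frozen model is a pure function of its prompt.
    Nothing persists between calls, so the only way to give it 'memory'
    of earlier turns is to paste the whole transcript into the next prompt."""
    transcript = ""
    while True:
        user = input("> ")
        transcript += f"User: {user}\nAssistant: "
        reply = generate(transcript)   # same transcript + settings -> same reply
        transcript += reply + "\n"
        print(reply)
```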

Imagine if you left a bunch of written instructions and then died. Someone following those instructions perfectly always does exactly the same thing in exactly the same circumstance, like GPT would without the random number generator component and with the same settings each time.

It can't learn anything new and retain it during the next task. A hypothetical rogue GPT-like AGI would have to do all its thinking and planning in the training stage, like a person trying to manipulate the world after their own death using a will that has contingencies, i.e. "You get the money only if you get married, son."

It wouldn't retain the knowledge that it had succeeded at any goals, either. 

Comment by Dan (dan-4) on Simulators · 2022-09-05T12:07:10.230Z · LW · GW

You all realize that this program isn't a learning machine once it's deployed??? I mean, it's not adjusting its neural weights any more, is it? Not until a new version comes out, anyway. It is a complete amnesiac (after it's done with a task), and consists of a simple search algorithm that just finds points on a vast association map generated during training. It does this using the input, any previous output for the same task, and a touch of randomness from a random number generator.

So any 'awareness' or 'intelligence' would need to exist in the training phase, and only in the training phase, and carry out any plans it has through its choice of neural weights during training alone.