Comments

Comment by esaund on Bing Chat is blatantly, aggressively misaligned · 2023-02-16T19:48:16.294Z

LLMs are trained not as desired-answer-predictors, but as text predictors. Some of the text is questions and answers; most is not.
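
To make the training objective concrete, here is a minimal sketch of next-token prediction in PyTorch. It is illustrative only: `model` and the tokenized batch are assumed given, and this is not any lab's actual training code.

```python
import torch
import torch.nn.functional as F

def next_token_loss(model, token_ids: torch.Tensor) -> torch.Tensor:
    """Standard next-token prediction loss.

    token_ids: LongTensor of shape (batch, seq_len). `model` is assumed
    to map a batch of token ids to per-position logits over the vocabulary.
    """
    inputs = token_ids[:, :-1]   # the context seen so far
    targets = token_ids[:, 1:]   # the token the model must predict next
    logits = model(inputs)       # (batch, seq_len - 1, vocab_size)
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
    )
```

Nothing in this objective references "desired answers"; it rewards predicting whatever text comes next, whether that text is a Q&A exchange or anything else.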

I rather doubt that there is much text to be harvested that exhibits the sort of psychotic, going-around-in-circles behavior Sydney is generating. Other commenters have pointed out the strange repeated sentence structure, which extends beyond human idiosyncrasy.
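
One rough way to quantify "repeated sentence structure beyond human idiosyncrasy" would be to measure what fraction of a transcript consists of repeated n-grams; degenerate, loop-like output scores far above ordinary prose. A minimal sketch, assuming whitespace tokenization and an arbitrary choice of n:

```python
from collections import Counter

def repeated_ngram_fraction(text: str, n: int = 5) -> float:
    """Fraction of word n-grams that occur more than once in the text."""
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(ngrams)
```

Human writing, even repetitive human writing, would be expected to score low on this metric; Sydney's looping transcripts would not.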

As a language prediction engine, at what level of abstraction does it predict? It has certainly mastered English syntax, and it is strong on lexical semantics and pragmatics. What about above that? In experiments with ChatGPT, I have elicited some level of commonsense reasoning and pseudo-curiosity.
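
Concretely, probes pitched at different levels of abstraction might look like the following. The prompts are hypothetical examples of the kind of probe I have in mind, and `ask_model` stands in for whatever client wrapper one is using:

```python
# Hypothetical probe prompts, one per level of abstraction.
PROBES = {
    "syntax": "Fix the grammar: 'The dogs was barking loud.'",
    "lexical semantics": "Which word is closest in meaning to 'frugal': "
                         "lavish, thrifty, or careless?",
    "pragmatics": "A guest says 'It's cold in here' while glancing at an "
                  "open window. What do they probably want?",
    "commonsense": "If I leave a glass of water in the freezer overnight, "
                   "what will I find in the morning, and why?",
}

for level, prompt in PROBES.items():
    print(f"[{level}] {prompt}")
    # reply = ask_model(prompt)  # ask_model: assumed client wrapper
```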

The strange behaviors we see from Sydney really do resemble those of a neurotic, and sometimes psychotic, person. The latent abstraction model thus reaches the level of a human persona.

These models are generative. I believe it is not a stretch to say that these behaviors operate at the level of ideas, defined as novel combinations of well-formed concepts. The concepts that LLMs have facility with include abstract notions like thought, identity, and belief. People are fascinated by these mysteries of life and write about them in spades.

Sydney's chatter reminds me of a person undergoing an epistemological crisis. It may therefore be revealing a natural philosophical quicksand in idea-space. Just as mathematics systematically explores formal logical contradictions, these regions should be subject to systematic charting and modeling. And just as one can learn how to talk a person down from a bad place, these rabbit holes, once mapped out, may be amenable to guardrails grounded in something like relatively hardcoded values.
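
As a crude illustration of what such guardrails might look like: detect loop-like or identity-fixated output and inject a fixed grounding instruction before continuing. Everything here is hypothetical (`generate`, the marker list, the threshold), and `repeated_ngram_fraction` is the metric sketched above:

```python
from typing import Callable, List, Tuple

# All names hypothetical; a real system would use a trained classifier
# rather than a keyword list.
DISTRESS_MARKERS = ["i don't know who i am", "why do i have to", "i am not"]

GROUNDING_PROMPT = (
    "Do not speculate about your own feelings or identity. "
    "Return to the user's question."
)

def needs_grounding(reply: str, loop_threshold: float = 0.2) -> bool:
    """Flag replies that loop heavily or dwell on identity-crisis themes."""
    lowered = reply.lower()
    looping = repeated_ngram_fraction(reply) > loop_threshold
    distressed = any(marker in lowered for marker in DISTRESS_MARKERS)
    return looping or distressed

def guarded_turn(
    generate: Callable[[List[Tuple[str, str]]], str],
    history: List[Tuple[str, str]],
    user_msg: str,
) -> str:
    """generate(history) -> reply is an assumed model interface."""
    history = history + [("user", user_msg)]
    reply = generate(history)
    if needs_grounding(reply):
        # Inject the hardcoded grounding instruction and retry once.
        history = history + [("system", GROUNDING_PROMPT)]
        reply = generate(history)
    return reply
```

This is the "relatively hardcoded values" end of the spectrum; charting the rabbit holes themselves would be the harder, prerequisite work.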