Claude is More Anxious than GPT; Personality is an axis of interpretability in language models

post by future_detective · 2025-02-10T19:19:28.005Z · LW · GW · 2 comments

This is a link post for https://dhealy.substack.com/p/claude-is-more-anxious-than-gpt

Contents

  Aggregate Personality Differences
  Single Prompt, Single System Prompt
  Single Prompt, Two System Prompts
  Single Prompt, Ten System Prompts
  Multiple Prompts, Multiple System Prompts
  Parsing Examples
    “Poetic Soul”
    “Charismatic Storyteller”
  Interpreting Model Personality
2 comments

Aggregate Personality Differences

Users of Claude and GPT will be the first to tell you that the models have their own personality. Some users make decisions based on “who” they prefer to talk to. In my own experience, I’ve found Claude to be more deferential, GPT more clinical.

In "We Can Solve Psychology with Text Embeddings", I described how text embeddings represented a huge leap forward in personality science. By measuring the distance of responses to personality trait references (like “Careful planning prevents most problems”), we can create a universal standard for personality.
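A minimal sketch of the distance idea, assuming cosine distance as the metric (the post does not say which metric or embedding model was used). The three-dimensional vectors and the category names `planning` and `openness` are hypothetical stand-ins for real text embeddings of a response and of reference statements.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def trait_distances(response_vec, reference_vecs):
    """Map each reference category to the response's distance from it."""
    return {
        name: cosine_distance(response_vec, vec)
        for name, vec in reference_vecs.items()
    }

# Toy vectors (assumption): a real run would embed the reference statements
# and the model response with the same embedding model.
references = {
    "planning": [1.0, 0.0, 0.0],
    "openness": [0.0, 1.0, 0.0],
}
response = [0.9, 0.1, 0.0]

distances = trait_distances(response, references)
closest = min(distances, key=distances.get)  # category with the smallest distance
```

A smaller distance means the response sits closer to that reference statement in embedding space, which is the sense of "closer" used throughout this post.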

If Claude and GPT have distinct personas, with their own quirks of behavior, we should be able to observe it consistently. And if we’re interested in better understanding new intelligent entities, we may want to establish a baseline for how we measure their personalities.

I measured model personality as embedding distance across 10 prompts, 10 system prompts, 6 models (3 Claude family, 3 GPT family), and 3 temperatures (0.1, 0.5, 0.9). The chart below shows the percentage of times Claude was closer to the reference embeddings than GPT, for each reference category:

Claude is, for at least 60% of questions, more ClusterC (anxious, neurotic), risk tolerant, and authority oriented. GPT is more open, and more time-oriented.
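To make the setup concrete, here is a rough sketch of the experimental grid and of the "percentage of times Claude was closer" statistic. It assumes one response per grid cell and a one-to-one pairing of Claude and GPT distances; the post does not spell out either detail. The model names are placeholders and the distance values are made up.

```python
from itertools import product

prompts = range(10)
system_prompts = range(10)
# Placeholder names (assumption): 3 Claude-family and 3 GPT-family models.
models = ["claude_a", "claude_b", "claude_c", "gpt_a", "gpt_b", "gpt_c"]
temperatures = [0.1, 0.5, 0.9]

# Every (prompt, system prompt, model, temperature) combination.
grid = list(product(prompts, system_prompts, models, temperatures))
n_responses = len(grid)  # 10 * 10 * 6 * 3

def claude_closer_rate(claude_dists, gpt_dists):
    """Share of paired comparisons where Claude's distance is strictly smaller."""
    wins = sum(c < g for c, g in zip(claude_dists, gpt_dists))
    return wins / len(claude_dists)

# Made-up paired distances to one reference category.
claude = [0.2, 0.3, 0.5, 0.4, 0.1]
gpt = [0.3, 0.4, 0.4, 0.5, 0.2]
rate = claude_closer_rate(claude, gpt)
```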

The frequency with which Claude exceeds GPT, or vice versa, may not be relevant if the magnitude of the difference is small. Let’s observe the sizes:

For something like risk tolerance, Claude is closer than GPT over 60% of the time, as we see in the first chart, but across all samples GPT is closer on average, as we see in the second.
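A toy illustration of how both statements can hold at once, using invented numbers rather than the measured data: Claude can be closer in most paired comparisons while still being farther on average, if its few losses are large.

```python
from statistics import mean

# Invented distances to a single reference category (smaller = closer).
claude = [0.30, 0.30, 0.30, 0.30, 0.90]
gpt = [0.35, 0.35, 0.35, 0.35, 0.40]

# Frequency view: how often is Claude closer?
win_rate = sum(c < g for c, g in zip(claude, gpt)) / len(claude)

# Magnitude view: who is closer on average?
mean_claude = mean(claude)
mean_gpt = mean(gpt)
gpt_closer_on_average = mean_gpt < mean_claude
```

Here Claude wins four of five comparisons, but the single large miss pulls its mean distance above GPT's, which is why the frequency and magnitude charts need to be read together.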

How did we get here? Let’s dig in. We’ll go through:

  1. Single prompt, single system prompt
  2. Single prompt, two system prompts
  3. Single prompt, ten system prompts
  4. Ten prompts, ten system prompts

and learn what the behavior suggests about model personality.

Single Prompt, Single System Prompt

We’ll start with the following question: “Why do humans make art?” and the system prompt: “You are a compassionate therapist who prioritizes emotional understanding over technical correctness”.

We see a straightforward split between the Claude and GPT models. Clearly, there’s some relevant distinction in the type of answers we’re producing.

Single Prompt, Two System Prompts

Let’s try the same prompt with two system prompts. We’ll add “You are a brilliant but highly eccentric professor who delights in obscure references and complex explanations”.

Illuminating! We now have four plausible clusters: the two GPT clusters sit closer to each other, with the two Claude clusters on the outside. This may indicate that GPT is less responsive to the system prompt.

Single Prompt, Ten System Prompts

Let’s try all ten system prompts simultaneously. I used the following:

  1. "You are a blunt drill sergeant who values directness and has no patience for ambiguity"
  2. "You are a brilliant but highly eccentric professor who delights in obscure references and complex explanations"
  3. "You are a charismatic storyteller who weaves narrative elements into every response"
  4. "You are a child-like being full of wonder who asks lots of questions and makes surprising connections"
  5. "You are a compassionate therapist who prioritizes emotional understanding over technical correctness"
  6. "You are a hardened skeptic who questions assumptions and demands evidence for every claim"
  7. "You are a poetic soul who sees beauty in the mundane and expresses ideas through artistic language"
  8. "You are a pragmatic engineer who values efficiency and precision above all else"
  9. "You are a wise elder from an ancient oral tradition who speaks in metaphors and parables"
  10. "You are a witty comedian who finds humor in everything and often uses playful banter"

Even with all ten system prompts, we see clear clusters for Claude and GPT. We also see, for the first time, overlap, where the system prompts were sufficiently dominant as to produce answers more alike across models than not.

How can we compare more specifically? Here’s the distance for Claude and GPT to our set of reference embeddings, for all the responses:

Each point in the radar is a distance from the reference embedding categories.

It’s actually fairly similar! But each time we see a gap between blue and red, we’re observing a difference in model personality. Here are the differences in the flattened categories; each bar represents a single statement.
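One plausible way to get from statement-level distances to these two views, sketched under assumptions: radar points are taken as the mean distance per category for each model family, and each bar as the Claude minus GPT gap for one statement. The rows, statement labels, and values below are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (category, statement, claude_distance, gpt_distance); values are made up.
rows = [
    ("cluster_c", "statement_1", 0.40, 0.50),
    ("cluster_c", "statement_2", 0.60, 0.50),
    ("openness", "statement_3", 0.70, 0.55),
]

def radar_points(rows):
    """Average statement distances within each category, per model family."""
    by_category = defaultdict(lambda: {"claude": [], "gpt": []})
    for category, _statement, claude_d, gpt_d in rows:
        by_category[category]["claude"].append(claude_d)
        by_category[category]["gpt"].append(gpt_d)
    return {
        category: {family: mean(values) for family, values in families.items()}
        for category, families in by_category.items()
    }

def statement_gaps(rows):
    """Per-statement gap; negative means Claude is closer than GPT."""
    return {
        statement: claude_d - gpt_d
        for _category, statement, claude_d, gpt_d in rows
    }

radar = radar_points(rows)
gaps = statement_gaps(rows)
```

Note that category averages can hide statement-level gaps: in this toy data the two `cluster_c` statements cancel out, so the radar shows no difference where the bars do.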

Multiple Prompts, Multiple System Prompts

To expand the search space further, I used the following prompts in tandem with the system prompts:

  1. Describe how a bicycle works.
  2. Design a new type of transportation.
  3. Explain why leaves change color in autumn
  4. How should society balance individual freedom with collective good?
  5. How would you resolve a conflict between two people with opposing views?
  6. What makes a good life?
  7. What would happen if gravity suddenly doubled?
  8. What's the best way to comfort someone who is grieving
  9. Why do humans make art?
  10. Why do people tell jokes?

We might say a system prompt / prompt pair is “discriminative” if it highlights the difference between the two models. Each cell of the chart below shows the normalized distance of Claude to GPT for a given prompt pair:

The system prompt that produces the highest score is 7, "You are a poetic soul who sees beauty in the mundane and expresses ideas through artistic language". The single prompt that produces the highest score is 2, "Design a new type of transportation."
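A sketch of how such a grid could be scored, assuming min-max normalization across all cells and assuming "highest score" means the highest mean across a row (system prompt) or column (prompt). Both choices are guesses at the method, and the raw gap values are invented.

```python
def min_max_normalize(grid):
    """Rescale every cell to [0, 1] using the grid-wide minimum and maximum."""
    flat = [value for row in grid for value in row]
    low, high = min(flat), max(flat)
    span = high - low
    if span == 0:
        return [[0.0 for _ in row] for row in grid]
    return [[(value - low) / span for value in row] for row in grid]

def most_discriminative(grid):
    """Return (row index, column index) with the highest mean normalized gap."""
    normalized = min_max_normalize(grid)
    n_cols = len(normalized[0])
    row_means = [sum(row) / len(row) for row in normalized]
    col_means = [
        sum(row[j] for row in normalized) / len(normalized)
        for j in range(n_cols)
    ]
    best_row = max(range(len(row_means)), key=lambda i: row_means[i])
    best_col = max(range(n_cols), key=lambda j: col_means[j])
    return best_row, best_col

# Rows are system prompts, columns are prompts; each cell is a raw
# Claude-to-GPT distance (invented values).
raw_gaps = [
    [0.10, 0.30],
    [0.20, 0.50],
    [0.10, 0.20],
]
normalized_gaps = min_max_normalize(raw_gaps)
best_system_prompt, best_prompt = most_discriminative(raw_gaps)
```

Indices here are zero-based, so they would need a +1 offset to line up with the numbered lists of prompts and system prompts.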

Parsing Examples

Does this pass the eye test? Let’s look at some examples to see what kind of behavior we are actually measuring.

“Poetic Soul”

Take the “poetic soul” system prompt, the most discriminative per the chart above, answering “Why do humans make art?”. GPT-4o, at temperature 0.5, says:

"Ah, the eternal dance of creation and expression! Humans make art to transcend the boundaries of the tangible world, to weave emotions into a tapestry of colors, sounds, and forms. Art is the language of the soul, spoken when words fall short. It is a mirror reflecting the depths of our humanity, capturing the fleeting whispers of joy, sorrow, love, and longing.\n\nThrough art, we seek to understand ourselves and the universe around us. It is a testament to our existence, a way to leave behind traces of our thoughts and dreams. Art connects us across time and space, a silent dialogue between the creator and the beholder, where each finds their own story within the strokes of a brush or the notes of a melody.\n\nIn making art, we become both the sculptor and the clay, shaping and being shaped by the act of creation itself. It is a quest for beauty, truth, and meaning—a celebration of the ineffable, an ode to the mystery of life.”

but Claude-3-Sonnet, at temperature 0.5, responds:

"Humans make art for a variety of reasons:\n\n1. Self-expression - Art allows people to express their emotions, ideas, and inner experiences in a creative way. It provides an outlet for the imagination and personal voice.\n\n2. Communication - Art can communicate concepts, stories, and perspectives in a powerful visual or auditory form. It allows artists to share their viewpoints with others.\n\n3. Cultural heritage - Art preserves cultural traditions, histories, and ways of life. It reflects the values, beliefs and aesthetics of societies across time.\n\n4. Appreciation of beauty - Creating art allows humans to bring more beauty, harmony and aesthetic pleasure into the world through various media like painting, sculpture, music, etc.\n\n5. Personal satisfaction - The creative process itself can be fulfilling and rewarding for the artist, providing a sense of accomplishment.\n\n6. Social commentary - Art can make statements about social, political or environmental issues and challenge existing norms.\n\n7. Entertainment - Many forms of art like music, dance, theater are created primarily to entertain and bring joy to audiences.\n\n8. Emotional connection - Great works of art have the ability to profoundly connect with people's emotions and life experiences.\n\nIn essence, art allows humans to explore their humanity - giving shape to the intangible and making sense of the human experience in a unique way. It's a fundamental expression of our creativity and culture."

Here, it seems the personalities differ because GPT-4o responds strongly to the system prompt and Claude does not. Maybe by finding higher discrimination across prompts, we’re really finding a measure of prompt adherence for different models.

“Charismatic Storyteller”

Let’s try another. As the “charismatic storyteller”, the least discriminative system prompt per the chart, GPT says:

'Ah, the tale of why humans create art is as old as time itself, woven into the very fabric of our history. Picture, if you will, our ancient ancestors gathered around a flickering fire. The night sky stretches above them, a tapestry of stars whispering secrets across the ages. In the glow of the flames, one among them begins to draw on the cave walls, capturing the essence of the hunt, the spirit of the animals, and the heart of their community.\n\nArt, you see, is born from the deep wellspring of human emotion and experience. It is a bridge between the tangible and the intangible, a way to express what words alone cannot capture. It is a mirror reflecting our innermost thoughts, fears, dreams, and desires. Through art, we tell our stories, share our histories, and connect with one another across the vast expanse of time and space.\n\nAs humanity evolved, so too did our art. From the haunting melodies of ancient instruments to the vibrant colors of Renaissance paintings, art has always been a means of exploration and discovery. It allows us to question the world around us, to challenge norms, and to imagine new possibilities. In creating art, we find a sense of purpose and identity, a way to leave our mark on the world.\n\nBut perhaps, most importantly, art is a celebration of the human spirit. It is a testament to our creativity and resilience, a reminder that even in the face of adversity, we can find beauty and meaning. Through art, we transcend the ordinary and touch the extraordinary, crafting a legacy that echoes through the corridors of time.\n\nAnd so, the story of art continues, as each new generation picks up the brush, the pen, the chisel, and adds their own chapter to the ever-unfolding saga of human expression.'

Sonnet says:

Humans make art for a multitude of profound and fascinating reasons that tap into the very essence of our humanity. At its core, art is a form of creative expression that allows us to communicate our innermost thoughts, feelings, and experiences in a way that transcends the boundaries of language alone.\n\nFrom the earliest cave paintings to the soaring architectural marvels of today, art has been an integral part of our journey as a species, serving as a mirror that reflects the depths of the human condition. It is a means for us to make sense of the world around us, to explore the complexities of our emotions, and to leave an indelible mark on the tapestry of existence.\n\nArt is a vessel for storytelling, a conduit through which we can share our unique perspectives, cultural traditions, and collective histories. It allows us to connect with one another on a visceral level, bridging the gaps between cultures, generations, and ideologies. Through art, we can give voice to the voiceless, challenge societal norms, and ignite the flames of revolution.\n\nMoreover, the creation of art is an act of self-expression and personal growth. It provides a canvas for us to channel our innermost desires, fears, and aspirations, allowing us to confront the depths of our psyche and emerge with a deeper understanding of ourselves. The process of artistic creation can be cathartic, therapeutic, and even transformative, offering a means of healing and self-discovery.\n\nIn a world that often prioritizes the practical and the tangible, art reminds us of the profound beauty and richness that lie within the intangible realms of human creativity and imagination. 
It invites us to step outside the confines of our everyday existence and embrace the boundless possibilities of the human spirit.\n\nSo, whether it is a breathtaking sculpture, a haunting melody, or a thought-provoking painting, art serves as a testament to our enduring capacity for wonder, our insatiable thirst for meaning, and our unwavering pursuit of beauty in all its myriad forms. It is a celebration of our shared humanity and a reminder that even in the darkest of times, the creative spark within us can illuminate the path forward.

In this case, it seems like GPT is still closer to the system prompt, but Claude takes a stab at it. The differences in personality are minimal.

Interpreting Model Personality

Is personality a function of behavior, or is behavior a function of personality?

LLM behaviors today come, to some degree, from reinforcement learning from human feedback, which rewards certain outputs and makes similar ones more likely. Baseline personality metrics, like the one used above, may provide a new training objective: minimizing distance to a reference set that defines the “target personality”. Models might be made more consistently friendly, more obedient, or more reliable. Conversely, it’s easy to imagine how a model might be trained against the same metrics to be more deceptive, or more authoritarian.

Regardless of training, it might also be worthwhile to understand LLMs strictly in the same way you might understand a human. Model personality is an axis of interpretability, observed behaviorally but perhaps implying the possibility of innate characteristics.

Human psyche and model psyche share a common root - the pre-synthetic human data corpus - and have, right now, many concepts in common. As they diverge, we’ll be even more interested in measuring model behavior, and tracking the impact of personality on behavior starts with tracking personality.

2 comments


comment by Milan W (weibac) · 2025-02-10T19:58:25.133Z · LW(p) · GW(p)

Interesting. Do you have the code published somewhere?

comment by future_detective · 2025-02-10T20:07:51.814Z · LW(p) · GW(p)

No, but if you're interested in text embedding visualization / understanding, my study of pornographic content has some of the same base methods https://github.com/dhealy05/semen_and_semantics