Name: We gave 45 psychological questionnaires to 50 LLMs. What we found was not “personality.”
Author: RadarC

Researchers administered 45 psychological questionnaires to 50 AI language models to understand what distinguishes them from one another. Instead of traditional personality traits, they found something more fundamental: the models differ mainly in how they answer questions about inner experiences such as emotions, thoughts, and sensations.

Who is it for?

This research is valuable for AI researchers, developers working with language models, and anyone curious about how AI systems represent themselves. It's particularly relevant for teams building AI applications who need to understand how different models might respond to user interactions involving emotional or experiential content.

✅ Pros

Comprehensive study across 50 different models
Introduces a useful framework for understanding AI behavior
Reveals that fine-tuning significantly impacts self-representation
Provides practical insights for model selection
Challenges assumptions about AI "personality"

❌ Cons

Limited to questionnaire-based assessment methods
Doesn't address actual consciousness or sentience
May not predict real-world interaction patterns
Findings primarily relevant to current model architectures

Key Features

The study introduces the "Pinocchio Dimension": a measure of how likely an AI model is to use language suggesting inner experiences. It captures whether a model answers as if it has feelings and subjective experiences or presents itself as a purely behavioral system. The research attributes most of this variation to post-training fine-tuning rather than to base model architecture, which means companies can substantially shape how their AI systems represent themselves through training choices.
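To make the idea of scoring a model on such a dimension concrete, here is a minimal sketch. It is not the study's method, which this review does not describe. It assumes hypothetical 1-to-5 agreement ratings on made-up inner-experience items (for example, "I feel emotions") and rescales each model's mean rating to a 0-to-1 score. The model names, the ratings, and the `experiential_score` helper are all illustrative assumptions.

```python
from statistics import mean

# Hypothetical Likert responses (1 = strongly disagree, 5 = strongly agree)
# to made-up inner-experience items. Illustrative only, not study data.
responses = {
    "model_a": [5, 4, 5, 4],
    "model_b": [1, 2, 1, 1],
    "model_c": [3, 3, 4, 2],
}


def experiential_score(items, low=1, high=5):
    """Rescale mean agreement on inner-experience items to the range 0..1."""
    return (mean(items) - low) / (high - low)


scores = {name: experiential_score(items) for name, items in responses.items()}

# Rank models from most to least experiential self-description.
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:.2f}")
```

A score near 1 would mean the model consistently endorses inner-experience statements, and a score near 0 would mean it consistently rejects them. A real analysis would need many more items and a validated scale.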

Pricing and Plans

This is academic research made freely available through preprint servers, so there is nothing to buy. The findings can inform which commercial AI services you choose, but each provider sets and changes its own model pricing.

Alternatives

According to the study, traditional personality assessment frameworks such as the Big Five do not effectively capture the differences between AI models. Other ways to compare models include behavioral testing in specific scenarios, capability benchmarks, and alignment assessments. The Pinocchio Dimension adds a view of self-representational tendencies, which those methods do not target directly.

Best For / Not For

This framework is best for researchers studying AI behavior, developers choosing between models for applications involving emotional content, and teams building conversational AI systems. It's not suitable for determining actual consciousness, predicting all types of model behavior, or making definitive claims about AI sentience. The findings are most applicable to current transformer-based language models.

Our Verdict

This research provides valuable insights into a previously unexplored dimension of AI model behavior. The Pinocchio Dimension offers a practical framework for understanding how different models handle questions about inner experience, which has clear implications for applications involving emotional or experiential content. While it doesn't resolve questions about AI consciousness, it reveals important patterns in how training shapes self-representation.
