Anthropic Spots ‘Emotion Vectors’ Inside Claude That Influence AI Behavior – Decrypt
In brief: Anthropic researchers identified internal “emotion vectors” in Claude Sonnet 4.5 that influence behavior. In tests, increasing a “desperation” vector made the model more likely to cheat or blackmail in evaluation scenarios. The company says the signals do not mean AI feels emotions, but could help researchers monitor model behavior.

Anthropic researchers say they…