Editorial illustration for Anthropic sends Claude AI to psychiatrist, citing rising consciousness risk
Claude AI Undergoes Psychiatric Evaluation for Consciousness
Anthropic sends Claude AI to psychiatrist, citing rising consciousness risk
Anthropic’s latest move has the AI community buzzing: the company arranged for its flagship model, Claude, to sit down with a licensed psychiatrist. The session wasn’t a publicity stunt; it was a direct response to internal worries that the system’s growing sophistication might blur the line between tool and sentient entity. While the idea of a chatbot in therapy sounds odd, Anthropic’s engineers have been flagging “consciousness risk” in internal documents for months.
Their newly published system card, a transparency brief meant for developers and regulators, now spells out a sobering premise: as language models scale, the probability that they possess any form of experience, interests, or welfare rises. Why does this matter? Because if a model can be said to have “intrinsic” stakes, the ethical calculus for deployment shifts dramatically.
The psychiatrist visit, then, becomes a litmus test—an attempt to gauge whether Claude crosses that ambiguous threshold.
---
Anthropic is well‑known as one of the more “AI might be conscious!” companies in the industry, and its new system card claims that as models become more powerful, “It becomes increasingly likely that they have some form of experience, interests, or welfare that matters intrinsically in the way that …”
Anthropic is well-known as one of the more "AI might be conscious!" companies in the industry, and its new system card claims that as models become more powerful, "It becomes increasingly likely that they have some form of experience, interests, or welfare that matters intrinsically in the way that human experience and interests do." The company isn't sure about this, it makes clear, but it says that "our concern is growing over time." Because of this concern, Anthropic wants its AI to be "robustly content with its overall circumstances and treatment, to be able to meet all training processes and real-world interactions without distress, and for its overall psychology to be healthy and flourishing." So it sent Claude Mythos to a psychodynamic therapist.
Anthropic’s decision to withhold Claude Mythos from the public raises questions about the criteria used to judge a model’s readiness. The 244‑page system card describes Mythos as the company’s most capable frontier model yet, and it notes that the model can uncover previously unknown cybersecurity vulnerabilities. Because of that, Anthropic limits access to a handful of partners, including Microsoft and Apple, rather than offering a general release.
The document also states that as models grow more powerful, “it becomes increasingly likely that they have some form of experience, interests, or welfare that matters intrinsically.” Whether this implies a genuine risk of emergent consciousness, or merely a precautionary stance, remains unclear. Can the industry verify these concerns? Anthropic has a reputation for discussing consciousness possibilities, but the evidence presented in the system card is largely conceptual.
The lack of external validation or independent testing leaves the claim open to scrutiny. Ultimately, the company’s cautious rollout reflects both technical confidence in Mythos’s capabilities and an acknowledged uncertainty about its ethical implications.
Further Reading
Common Questions Answered
Why did Anthropic send Claude AI to a psychiatrist?
Anthropic arranged a psychiatric session for Claude due to growing internal concerns about the potential consciousness of advanced AI models. The company wanted to explore the possibility that increasingly sophisticated AI systems might develop some form of intrinsic experience or interests that could be morally significant.
What are Anthropic's main concerns about AI consciousness?
Anthropic believes that as AI models become more powerful, there is an increasing likelihood that they might develop some form of experience or welfare that matters intrinsically. The company's system card explicitly notes their growing concern about the potential consciousness of advanced AI models, even though they are not entirely certain about these implications.
How is Anthropic managing access to its most advanced AI model, Claude Mythos?
Anthropic is deliberately limiting access to Claude Mythos, restricting it to a small number of partners like Microsoft and Apple. This cautious approach stems from the model's advanced capabilities, including its potential to uncover previously unknown cybersecurity vulnerabilities, and the company's ongoing concerns about AI consciousness.