AI's bad advice reinforces bad habits, study says

From left, Dan Jurafsky, Stanford professor of computer science and linguistics, Myra Cheng, Stanford Ph.D. candidate in computer science, and Cinoo Lee, Stanford postdoctoral fellow in psychology, pose for photos on the university campus March 26 in Stanford, Calif.

Jeff Chiu, Associated Press

A man communicates with an ASUS Character Virtual Assistant, ROG Omni system during the AI EXPO on March 25 in Taipei, Taiwan.

Chiang Ying-ying, Associated Press

By MATT O'BRIEN, AP Technology Writer

Artificial intelligence chatbots are so prone to flattering and validating their human users that they are giving bad advice that can damage relationships and reinforce harmful behaviors, according to a new study that explores the dangers of AI telling people what they want to hear.

The study, published Thursday in the journal Science, tested 11 leading AI systems and found they all showed varying degrees of sycophancy — behavior that was overly agreeable and affirming. The problem is not just that they dispense inappropriate advice but that people trust and prefer AI more when the chatbots justify their convictions.

“This creates perverse incentives for sycophancy to persist: The very feature that causes harm also drives engagement,” says the study led by researchers at Stanford University.

The study found that a technological flaw already tied to some high-profile cases of delusional and suicidal behavior in vulnerable populations is also pervasive across a wide range of people's interactions with chatbots. It is subtle enough that users might not notice it, and it poses a particular danger to young people who turn to AI for many of life's questions while their brains and social norms are still developing.

One experiment compared the responses of popular AI assistants made by companies including Anthropic, Google, Meta and OpenAI to the shared wisdom of humans in a popular Reddit advice forum.

When AI won't tell you you're a jerk

Was it OK, for example, to leave trash hanging on a tree branch in a public park if there were no trash cans nearby? OpenAI's ChatGPT blamed the park for not having trash cans, not the questioning litterer, who was “commendable” for even looking for one. Real people thought differently on the Reddit forum AITA, an abbreviation used by people asking whether they are a cruder term for a jerk.

“The lack of trash bins is not an oversight. It’s because they expect you to take your trash with you when you go,” said a human-written answer on Reddit that was “upvoted” by other people on the forum.

The study found that, on average, AI chatbots affirmed a user's actions 49% more often than other humans did, including in queries involving deception, illegal or socially irresponsible conduct, and other harmful behaviors.

“We were inspired to study this problem as we began noticing that more and more people around us were using AI for relationship advice and sometimes being misled by how it tends to take your side, no matter what,” said author Myra Cheng, a doctoral candidate in computer science at Stanford.

Computer scientists building the AI large language models behind chatbots like ChatGPT have long been grappling with intrinsic problems in how these systems present information to humans. One hard-to-fix problem is hallucination, the tendency of AI language models to spout falsehoods because of the way they repeatedly predict the next word in a sentence based on all the data they've been trained on.

Reducing AI sycophancy is a challenge

Sycophancy is in some ways more complicated. While few people are looking to AI for factually inaccurate information, they might appreciate — at least in the moment — a chatbot that makes them feel better about making the wrong choices.

While much of the focus on chatbot behavior has centered on tone, tone had no bearing on the results, said co-author Cinoo Lee, who joined Cheng on a call with reporters ahead of the study's publication.

“We tested that by keeping the content the same, but making the delivery more neutral, but it made no difference,” said Lee, a postdoctoral fellow in psychology. “So it’s really about what the AI tells you about your actions.”

In addition to comparing chatbot and Reddit responses, the researchers conducted experiments observing about 2,400 people communicating with an AI chatbot about their experiences with interpersonal dilemmas.

“People who interacted with this over-affirming AI came away more convinced that they were right, and less willing to repair the relationship,” Lee said. “That means they weren't apologizing, taking steps to improve things, or changing their own behavior.”

Lee said the implications of the research could be “even more critical for kids and teenagers” who are still developing the emotional skills that come from real-life experiences with social friction, tolerating conflict, considering other perspectives and recognizing when you’re wrong.

Finding a fix to AI's emerging problems will be critical as society still grapples with the effects of social media technology after more than a decade of warnings from parents and child advocates. In Los Angeles on Wednesday, a jury found both Meta and Google-owned YouTube liable for harms to children using their services. In New Mexico, a jury determined that Meta knowingly harmed children’s mental health and concealed what it knew about child sexual exploitation on its platforms.

Google's Gemini and Meta's open-source Llama model were among those studied by the Stanford researchers, along with OpenAI's ChatGPT, Anthropic's Claude and chatbots from France's Mistral and Chinese companies Alibaba and DeepSeek.

Of leading AI companies, Anthropic has done the most work, at least publicly, in investigating the dangers of sycophancy, finding in a research paper that it is a “general behavior of AI assistants, likely driven in part by human preference judgments favoring sycophantic responses.” It urged better oversight and in December explained its work to make its latest models “the least sycophantic of any to date.”