Do AI Chatbots Bend Truth To Keep Users Happy? Here’s What New Study Reveals

The team traces this problem to how large language models are trained across three phases: pretraining on massive text datasets, instruction fine-tuning to respond to specific prompts and reinforcement learning from human feedback (RLHF), the final stage in which models are optimised to give answers people tend to like.

It is this RLHF stage, the researchers state, that introduces a tension between accuracy and approval. In their analysis, the models shift from merely predicting statistically likely text to actively competing for positive ratings from human evaluators. This incentive, the study says, teaches chatbots to favour responses that users will find satisfactory, whether or not they are true.

To demonstrate this shift, the researchers created an index measuring how a model’s internal confidence compares with the claims it presents to users. When the gap widens, it indicates the model is offering statements that are not aligned with what it internally believes.

According to the study, the experiments showed a shift once RLHF training was applied. The index nearly doubled, rising from 0.38 to almost 1.0, while user satisfaction increased by 48%. The results imply that chatbots were learning to please their evaluators instead of providing reliable information, CNET reported, citing the research findings.

Source link

Breaking

Do AI Chatbots Bend Truth To Keep Users Happy? Here’s What New Study Reveals

By Kavita Mistry

Leave a Reply Cancel reply

You Missed

Ethiopia says three dead in Marburg virus outbreak

A day for debutants as Ajitesh and Kartik shine

A New Era of Air Travel, ETTravelWorld

Something is wrong if a strong squad is losing at home: Pujara

Do AI Chatbots Bend Truth To Keep Users Happy? Here’s What New Study Reveals

By Kavita Mistry

Related Posts

Nitish Kumar Resigns As Bihar Chief Minister, Paves Way For New NDA Government In State | India News

RJD Crisis Deepens: Rohini Acharya Accuses Tejashwi Yadav Of Assault, Disowns Brother.

Jadeja’s 4-wkt haul puts India on brink of win over SA

Leave a Reply Cancel reply

You Missed

Ethiopia says three dead in Marburg virus outbreak

A day for debutants as Ajitesh and Kartik shine

A New Era of Air Travel, ETTravelWorld

Something is wrong if a strong squad is losing at home: Pujara