AI models can be trained to deceive, give fake information: Anthropic study

Artificial intelligence (AI) models can be trained to deceive, and once a model exhibits deceptive behaviour, standard techniques could fail to remove that deception and instead create a false impression of safety, according to new research led by Google-backed AI startup Anthropic.

The team said that by taking an existing text-generating model such as OpenAI’s ChatGPT and fine-tuning it on examples of both desired behaviour and deception, they could get the model to behave deceptively in a consistent way.
