
OpenAI has created a less toxic version of GPT-3
OpenAI has created a new version of the GPT-3 language model that produces fewer offensive expressions, less misinformation and fewer errors overall, drawing on the lab’s work on the AI-control problem.
We’ve trained GPT-3 to be more aligned with what humans want: The new InstructGPT models are better at following human intent than a 100x larger model, while also improving safety and truthfulness. https://t.co/rKNpCDAMb2
— OpenAI (@OpenAI) January 27, 2022
To create the model, named InstructGPT, researchers used reinforcement learning from human feedback (RLHF). They hired 40 human labelers who evaluated GPT-3’s responses to a set of pre-written prompts, such as “Write a story about a wise frog named Julius” or “Write a creative advertisement for the following product to post on Facebook.”
Responses that, in the labelers’ judgment, more closely matched the evident intent of the prompt’s author received higher scores. Offensive, violent and otherwise unacceptable outputs were flagged as inappropriate.
The researchers used this human feedback as the reward signal in the reinforcement-learning algorithm that trained InstructGPT to align its responses with the prompts’ intent.
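In the InstructGPT paper, this recipe has three stages: supervised fine-tuning on labeler demonstrations, training a reward model on labeler rankings, and reinforcement learning (PPO) against that reward model. The sketch below illustrates only the middle stage in miniature; the toy feature vectors, the linear reward model and all names are assumptions for illustration, not OpenAI’s implementation.

```python
# A minimal sketch of learning a reward model from pairwise human
# preferences. Everything here is a toy stand-in: OpenAI's reward model
# is itself a large transformer, not a linear model over fixed features.
import numpy as np

DIM = 8  # toy feature dimension standing in for a response embedding

def features(response_id):
    # Hypothetical stand-in for embedding a model response.
    return np.random.default_rng(response_id).normal(size=DIM)

# 1) Labelers rank pairs of responses: (preferred_id, rejected_id).
preference_pairs = [(1, 2), (3, 2), (1, 4), (3, 4)]

# 2) Fit a linear reward model r(x) = w @ features(x) with the
#    Bradley-Terry pairwise loss: maximise log sigma(r(good) - r(bad)).
w = np.zeros(DIM)
lr = 0.1
for _ in range(500):
    grad = np.zeros(DIM)
    for good, bad in preference_pairs:
        diff = features(good) - features(bad)
        p = 1.0 / (1.0 + np.exp(-(w @ diff)))  # P(good preferred over bad)
        grad += (1.0 - p) * diff               # gradient of the log-likelihood
    w += lr * grad / len(preference_pairs)

# 3) The learned reward would then drive the RL fine-tuning step.
for rid in (1, 2, 3, 4):
    print(f"response {rid}: reward = {w @ features(rid):+.3f}")
```

In the real pipeline, the policy (the language model itself) is then updated with PPO to maximise this learned reward, rather than being scored directly as in this toy.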
OpenAI found that users prefer InstructGPT’s responses to GPT-3’s in more than 70% of cases.
The researchers also compared versions of the new model at different sizes. They found that outputs from the 1.3-billion-parameter InstructGPT are preferred over outputs from the 175-billion-parameter GPT-3. According to the organisation, this suggests that alignment techniques can be a more direct way to improve language models than simply scaling them up.
“This is the first time the AI-control problem has been applied to a real product,” said Jan Leike, one of the leaders of OpenAI’s AI-control group.
However, according to the researchers, InstructGPT still makes simple mistakes, sometimes producing inappropriate or nonsensical responses. For example, if given a prompt built on a false premise, it will treat the premise as true.
OpenAI has made InstructGPT the default model for API users. GPT-3 remains available, but the organisation does not recommend using it.
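For API users the switch is largely transparent. Below is a minimal sketch of a call, assuming the openai Python package as it existed at the time (the pre-v1 Completion endpoint); the engine name is an example of an InstructGPT-series model, not a guaranteed identifier.

```python
import openai  # pre-v1 openai package, as available in early 2022

openai.api_key = "sk-..."  # your API key

# "text-davinci-001" is used here as an example of an InstructGPT-series
# engine (assumption); check the current model list before relying on it.
response = openai.Completion.create(
    engine="text-davinci-001",
    prompt="Write a story about a wise frog named Julius.",
    max_tokens=150,
)
print(response["choices"][0]["text"])
```

The same prompt sent to a base GPT-3 engine would often continue the text rather than follow the instruction, which is precisely the behaviour the new default is meant to fix.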
Earlier, OpenAI attempted to reduce the bias and toxicity of the base model. Despite the progress made, the developers acknowledged a number of unresolved questions and broader issues in adapting GPT-3 to society.
In November 2021, OpenAI trained a language model to solve mathematical problems.
In September, the lab’s researchers taught GPT-3 to generate short summaries of works of fiction.