
DeepMind unveils a 280-billion-parameter language model
British AI lab DeepMind has developed a large language model, Gopher, with 280 billion parameters. The researchers say that the larger the model, the better its performance.
Today we’re releasing three new papers on large language models. This work offers a foundation for our future language research, especially in areas that will have a bearing on how models are evaluated and deployed: https://t.co/TV05K4zptv 1/ pic.twitter.com/SyWb8qIDk0
— DeepMind (@DeepMind) December 8, 2021
In their study, the researchers confirmed the hypothesis that a language model’s accuracy depends on its size: as the number of parameters grows, Gopher’s performance improves on common benchmark tasks such as sentiment analysis and generalization.
“One of the key findings of the paper is that the progress and capabilities of large language models are still increasing. This is not an area that has plateaued,” said DeepMind researcher Jack Rae.
However, the researchers also identified a number of shortcomings in this approach. According to Rae, there are many scenarios in which the model can fail:
“Some of these failure modes relate to the model simply not sufficiently understanding what it reads.”
Rae believes that the problem of misunderstanding context can be addressed by increasing the training data and scaling up the models.
He added that there are other issues as well, such as entrenched stereotypical biases, the spread of misinformation, and toxic language. DeepMind believes that scaling up alone will not remove these shortcomings.
“In these cases, language models will require ‘additional training procedures’, such as human feedback,” Rae noted.
Whether Gopher will be released publicly remains unknown. DeepMind says it will continue studying language models to make AI applications safer and more transparent.
Earlier, in October, Microsoft and Nvidia introduced the Megatron-Turing NLG language model with 530 billion parameters.
The startup AI21 Labs has also developed an accessible alternative to GPT-3; the largest version of its model contains 178 billion parameters.
In January, researchers from Google Brain introduced a language model with 1 trillion parameters.