
Researchers begin training a large open-source language model
An international team of BigScience developers has begun training an open-source AI language model with 176 billion parameters.
BigScience main training just started 💥 A large language model created as a tool for research 🔬
Model: 176 billion parameters 📖 https://t.co/7gz2Gibybx
Data: 46 languages 📖 https://t.co/EOgshEDrnw
Cluster: 416 GPU — low carbon energy 📖 https://t.co/VA1u4OpnVr
Follow it live 👇
— BigScience Research Workshop (@BigscienceW) March 15, 2022
The model is trained on data in 46 languages. The training runs on the Jean Zay supercomputer of the French Institute for Development and Resources in Intensive Scientific Computing (IDRIS). The machine is built on Nvidia V100 and A100 GPUs, with a peak performance exceeding 28 petaflops.
Hugging Face’s head of research, Douwe Kiela, said the training would take three to four months.
According to the developers, the project is intended for research purposes. Proprietary language models from companies such as OpenAI, Google, and Microsoft all exhibit similar problems, the engineers say, producing toxic language, bias, and misinformation. An open-source model will help researchers understand these issues and fix them, they add.
“If we care about democratizing research progress and want to ensure that the world can use this technology, we must find a solution for this. This is exactly what big science should be about,” Kiela said.
The open BigScience project brings together about a thousand developers from around the world who create and maintain large datasets for training language models.
In January, OpenAI announced the creation of a less toxic version of GPT-3.
In December 2021, DeepMind introduced a language model with 280 billion parameters.
In October 2021, Microsoft and Nvidia presented a language model roughly three times larger than GPT-3.