
Meta unveils the LLaMA language model
Meta has released LLaMA, a large language model for AI researchers, in several sizes, including versions with 13 billion and 65 billion parameters.
Today we’re publicly releasing LLaMA, a state-of-the-art foundational LLM, as part of our ongoing commitment to open science, transparency and democratized access to new research.
Learn more & request access ➡️ https://t.co/8AeLVhMWkq pic.twitter.com/1BEkTngtnM
— Meta AI (@MetaAI) February 24, 2023
According to the developers, the smaller LLaMA-13B version demonstrated better results in most tests than OpenAI’s GPT-3. The larger LLaMA-65B system is “competitive with advanced models,” such as DeepMind’s Chinchilla-70B and Google’s PaLM-540B.
The numbers in the model names give each model's parameter count in billions. Parameter count is the usual measure of model size, but size and performance do not necessarily scale in lockstep.
After training, LLaMA-13B can be run on a single Nvidia Tesla V100 GPU. According to the developers, this “democratizes” compute for small institutions that lack powerful hardware.
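As a rough check on the single-GPU claim, the sketch below estimates how much memory the model weights alone would occupy. It assumes 16-bit weights (2 bytes per parameter), nominal parameter counts of 13 and 65 billion, and the 32 GB variant of the V100. The announcement states none of these details, and the helper name weight_memory_gib is purely illustrative. Activations and other runtime overhead are ignored.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed to hold the weights alone, in GiB.

    bytes_per_param=2 is an assumption (16-bit weights), not a figure
    from Meta's announcement.
    """
    return num_params * bytes_per_param / 2**30


# Nominal parameter counts taken from the model names.
MODELS = {"LLaMA-13B": 13e9, "LLaMA-65B": 65e9}

# Assumed card: the 32 GB V100 variant (a 16 GB variant also exists).
GPU_MEMORY_GIB = 32

for name, params in MODELS.items():
    needed = weight_memory_gib(params)
    verdict = "fits" if needed < GPU_MEMORY_GIB else "does not fit"
    print(f"{name}: ~{needed:.1f} GiB of weights, {verdict} on one card")
```

Under these assumptions the 13B weights come to roughly 24 GiB, which fits on a 32 GB card, while the 65B weights come to roughly 121 GiB and would have to be split across several GPUs.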
Meta believes LLaMA will help AI researchers identify problems in language models related to bias, toxicity and the tendency to hallucinate. To this end, the company released the model under a non-commercial license.
“We believe that the entire community […] should work together to establish clear guidelines for responsible AI in general and responsible large language models in particular,” they said.
According to Meta’s CEO Mark Zuckerberg, language models have shown promising capabilities in text generation, conversation and predicting protein structure.
“Meta is committed to this open model of research, and we will make our new model accessible to the AI research community,” he added.
The tech giant has released language models before, but they were often criticised. In August 2022, Meta publicly launched BlenderBot 3, a chatbot with 175 billion parameters. The system was later accused of making antisemitic statements and of disparaging Facebook.
Another chatbot, Galactica, was shut down just three days after launch. Intended to summarize scientific papers, the system was accused of generating false information.
Earlier, in November 2022, Meta unveiled the AI agent Cicero, which plays the board game Diplomacy at a human level.
In the same month, the tech giant’s AI lab presented ESM-2, a transformer neural network with 15 billion parameters for predicting protein structure.