Meta has unveiled the Llama 3 family of large language models (LLMs), introducing advanced generative text models in 8 billion (8B) and 70 billion (70B) parameter sizes. These models, designed for both commercial and research applications, excel at generating text and code, and are particularly optimized for dialogue use cases. Llama 3 is an auto-regressive language model built on an optimized transformer architecture. The models have been fine-tuned through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to prioritize helpfulness and safety in their outputs.
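To make "auto-regressive" concrete, here is a minimal sketch of the generation loop such models use: each new token is predicted from all tokens produced so far. The toy scoring function below is a hypothetical stand-in for illustration only, not Llama 3 itself.

```python
def toy_next_token_logits(tokens):
    """Hypothetical stand-in model: scores each vocab id by a trivial rule."""
    vocab_size = 8
    last = tokens[-1]
    # Favor the token equal to (last token + 1) mod vocab_size.
    return [1.0 if t == (last + 1) % vocab_size else 0.0
            for t in range(vocab_size)]

def generate(prompt_tokens, max_new_tokens, eos_id=None):
    """Greedy auto-regressive decoding: append the argmax token each step."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = toy_next_token_logits(tokens)
        next_id = max(range(len(logits)), key=logits.__getitem__)
        if next_id == eos_id:
            break  # stop when the model emits end-of-sequence
        tokens.append(next_id)
    return tokens

print(generate([3], 4))  # [3, 4, 5, 6, 7]
```

A real model replaces the toy scoring function with a transformer forward pass, and sampling strategies (temperature, top-p) replace the greedy argmax, but the outer loop is the same.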
The models were trained on a new mix of publicly available online data totaling over 15 trillion tokens, without incorporating Meta user data. Both the 8B and 70B models support a context length of 8K tokens, with pretraining knowledge cutoffs of March 2023 (8B) and December 2023 (70B). Meta has committed to offsetting the carbon footprint generated during training, aligning with its sustainability goals.
Llama 3 models have demonstrated superior performance on various industry benchmarks compared to their predecessors and other open-source chat models. They are intended for English language use, with the instruction-tuned models specifically designed for assistant-like chat functionalities. Meta emphasizes responsible AI development, providing guidelines and tools like Meta Llama Guard 2 and Code Shield to mitigate risks and ensure safety in applications.
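For the assistant-like chat functionality mentioned above, the instruction-tuned models consume prompts in a specific chat format. The sketch below assembles that format by hand using the special tokens from Meta's published Llama 3 prompt template; in practice, a tokenizer's `apply_chat_template()` (as in Hugging Face `transformers`) would build this string for you.

```python
def format_llama3_chat(messages):
    """Render a list of {"role", "content"} dicts as a Llama 3 instruct prompt.

    Special tokens follow Meta's documented chat template for the
    instruction-tuned Llama 3 models.
    """
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n")
        parts.append(msg["content"] + "<|eot_id|>")
    # Trailing assistant header cues the model to generate its reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = format_llama3_chat([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

Generation then continues from the trailing assistant header and stops when the model emits `<|eot_id|>`.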
The release of Llama 3 models is accompanied by a custom commercial license, with Meta encouraging feedback and contributions from the developer community to further enhance model safety and effectiveness. This initiative reflects Meta’s commitment to fostering innovation and responsible use in the AI field, making powerful tools accessible for a wide range of applications while prioritizing user safety and ethical considerations.