Llama 3 is now on Unsloth 🦙
Llama 3 is now available to train! Read our Blog: https://unsloth.ai/blog/llama3
Meta's new Llama 3 models are the most capable open LLMs to date, outperforming many open models on industry-standard benchmarks.
Unsloth makes Llama 3 (8B) training 2x faster with 63% less memory than Flash Attention 2 + Hugging Face. Llama 3 (70B) trains 1.8x faster and uses 68% less VRAM.
To let you train your own Llama 3 model for free, we uploaded a Google Colab notebook for finetuning Llama 3 (8B): Notebook.
We also uploaded pre-quantized 4bit models for 4x faster downloading to our Hugging Face page. On one A100 80GB GPU, Llama 3 (70B) with Unsloth can now fit 48K total tokens (8192 * bsz of 5) vs 7K tokens without Unsloth. That's 6x longer context lengths!
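The notebook walks through the full finetuning flow. One step worth illustrating is preparing the training data: each example is rendered into a fixed prompt template before being fed to the trainer. Below is a minimal, hypothetical Alpaca-style formatter of the kind such a finetune uses; the template wording, function name, and EOS token here are illustrative assumptions, not Unsloth's API:

```python
# Hypothetical Alpaca-style prompt formatter for preparing finetuning data.
# The template text and names are illustrative, not taken from the notebook.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{response}"
)

def format_example(instruction: str, response: str, eos_token: str = "</s>") -> str:
    """Render one training example as a single prompt string.

    Appending the EOS token is important: without it, the finetuned model
    never learns where a response should stop.
    """
    return ALPACA_TEMPLATE.format(instruction=instruction, response=response) + eos_token

print(format_example("Name the capital of France.", "Paris."))
```

In practice you would map a function like this over your dataset (e.g. with `datasets.Dataset.map`) before handing the text column to the trainer.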
P.S. Don't forget to ⭐Star us on GitHub and join our Discord server ❤️
Hell yeah. Tried it 2 days ago and it works, but model export to quantized GGUF failed (possibly a llama.cpp installation error; it worked after I manually cloned and installed the library).
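For anyone hitting the same GGUF export failure, the manual workaround the commenter describes would look roughly like this. The repository URL is real; the convert/quantize script names vary between llama.cpp versions and the checkpoint path is a placeholder, so treat those details as assumptions and adjust to your checkout:

```shell
# Manually clone and build llama.cpp (the workaround described above).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make -j

# Convert the finetuned Hugging Face checkpoint to GGUF, then quantize.
# Script/binary names differ across llama.cpp versions; check your checkout.
python convert-hf-to-gguf.py /path/to/finetuned-llama3 --outfile model-f16.gguf
./quantize model-f16.gguf model-q4_k_m.gguf q4_k_m
```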