Llama 3 is now on Unsloth 🦙
Llama 3 is now available to train! Read our Blog: https://unsloth.ai/blog/llama3
Meta's new Llama 3 models are the most capable open LLMs to date, outperforming many open models on industry-standard benchmarks.
Unsloth makes Llama 3 (8B) training 2x faster with 63% less memory than Flash Attention 2 + Hugging Face. Llama 3 (70B) trains 1.8x faster and uses 68% less VRAM.
To let you train your own Llama 3 model for free, we uploaded a Google Colab notebook for finetuning Llama 3 (8B): Notebook.
We also uploaded pre-quantized 4bit models for 4x faster downloading to our Hugging Face page. On one A100 80GB GPU, Llama 3 (70B) with Unsloth can now fit 48K total tokens (8192 * bsz of 5) vs 7K tokens without Unsloth. That's 6x longer context lengths!
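The notebook walks through the full finetuning flow. One step worth illustrating is preparing the training data: each example is rendered into a fixed prompt template before being fed to the trainer. Below is a minimal, hypothetical Alpaca-style formatter of the kind such a finetune uses; the template wording, function name, and EOS token here are illustrative assumptions, not Unsloth's API:

```python
# Hypothetical Alpaca-style prompt formatter for preparing finetuning data.
# The template text and names are illustrative, not taken from the notebook.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{response}"
)

def format_example(instruction: str, response: str, eos_token: str = "</s>") -> str:
    """Render one training example as a single prompt string.

    Appending the EOS token is important: without it, the finetuned model
    never learns where a response should stop.
    """
    return ALPACA_TEMPLATE.format(instruction=instruction, response=response) + eos_token

print(format_example("Name the capital of France.", "Paris."))
```

In practice you would map a function like this over your dataset (e.g. with `datasets.Dataset.map`) before handing the text column to the trainer.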
P.S. Don't forget to ⭐Star us on GitHub and join our Discord server ❤️
Hell yeah. Tried it 2 days ago and it works, but model export to quantized GGUF failed (possibly a llama.cpp installation error; it worked after I manually cloned and installed the library).
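For anyone hitting the same GGUF export failure, the manual workaround the commenter describes would look roughly like this. The repository URL is real; the convert/quantize script names vary between llama.cpp versions and the checkpoint path is a placeholder, so treat those details as assumptions and adjust to your checkout:

```shell
# Manually clone and build llama.cpp (the workaround described above).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make -j

# Convert the finetuned Hugging Face checkpoint to GGUF, then quantize.
# Script/binary names differ across llama.cpp versions; check your checkout.
python convert-hf-to-gguf.py /path/to/finetuned-llama3 --outfile model-f16.gguf
./quantize model-f16.gguf model-q4_k_m.gguf q4_k_m
```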