Mark Zuckerberg is aiming for the crown in the AI race with Llama 3 models.

Meta is bringing Llama 3 into the world. It is starting by open-sourcing two Llama 3 models and releasing an upgraded version of Meta AI. These two variants (8B and 70B parameters) blow Meta's earlier models out of the water. A larger model with 400B+ parameters is still in training. Zuck claims that Meta AI is the most intelligent free chatbot publicly available.

What's going on?

Meta has released the first two public variants of Llama 3, their powerful new large language model (LLM).

What does this mean?

Llama 3 is the next generation of AI models from Meta. Previous Llama models ranged from 7B to 70B parameters in size and were among the leading open-source AI models.

But Zuck has switched gears this time around. Llama 3 models are not just the best among open-source models, they are giving closed-source models a run for their money too.

[This post covers Llama 3 models, read about Meta AI here.]

Here are some benchmarks for the 8B and 70B models from the Llama 3 family:

To give a sense of improvement beyond the numbers:

  • Llama 3’s 8B version is now almost as powerful as Llama 2’s largest 70B model.

  • Llama 3’s 70B version beats the models behind free versions of ChatGPT, Gemini and Claude while inching towards the paid ones too.

One bit of secret sauce Meta has shared for this success is training small models on far more data than usual. Meta trained these kiddos on 15 trillion text tokens, where the compute-optimal amount for the 8B model is around 200 billion-ish. And they kept learning. Meta only pulled the plug because at some point you gotta let the models out into the world to see how people use them and prepare for the next training run.
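To put those token counts in perspective, here is a quick back-of-the-envelope sketch. It only uses the figures quoted above (15 trillion tokens trained, roughly 200 billion as the optimal amount, 8B parameters). The 20-tokens-per-parameter rule of thumb is an assumption brought in for comparison, not a number from Meta, and the function names are made up for illustration.

```python
# Back-of-the-envelope arithmetic for the "too much data" point.
# Figures quoted in the post:
PARAMS_8B = 8 * 10**9               # parameters in the 8B model
TOKENS_TRAINED = 15 * 10**12        # tokens Meta trained on
TOKENS_OPTIMAL_QUOTED = 200 * 10**9 # rough "optimal" token count quoted in the post

# Assumption (not from the post): a commonly cited rule of thumb of
# about 20 training tokens per parameter for compute-optimal training.
RULE_OF_THUMB_TOKENS_PER_PARAM = 20


def overtraining_factor(tokens_trained: int, tokens_optimal: int) -> float:
    """How many times more data was used than the quoted optimal amount."""
    return tokens_trained / tokens_optimal


def tokens_per_parameter(tokens: int, params: int) -> float:
    """Training tokens seen per model parameter."""
    return tokens / params


factor = overtraining_factor(TOKENS_TRAINED, TOKENS_OPTIMAL_QUOTED)
actual_ratio = tokens_per_parameter(TOKENS_TRAINED, PARAMS_8B)
rule_of_thumb_tokens = RULE_OF_THUMB_TOKENS_PER_PARAM * PARAMS_8B

print(f"Trained on {factor:.0f}x the quoted optimal amount")
print(f"Tokens per parameter (8B model): {actual_ratio:.0f}")
print(f"Rule-of-thumb optimal tokens for 8B: {rule_of_thumb_tokens / 1e9:.0f}B")
```

In other words, the 8B model saw about 75 times more data than the quoted optimum, which is the same ballpark the rule of thumb gives (160B vs. 200B-ish).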

But that’s not all. Meta is still training a 400B+ beast. Based on its early testing, Meta claims it’ll likely be at the level of Claude Opus (#2 among all models as of now). And when training completes, Llama 3 405B will most likely be open-sourced as well. Listen to Mark talk about these models with Dwarkesh and Roberto.

Why should I care?

Meta has made this progress in a very short time, catching up to OpenAI and other AI labs, and open-sourcing the models makes it even better. Prior to the Llama models, no one thought a 7B model could run on a MacBook. With the Llama 3 release, I can see hackers compressing the model to run on even an iPhone in less than 24 hours.

Llama 3 models are also smarter across the board. So now builders can build smart apps that run on a local machine without compromising heavily on intelligence.
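For a rough sense of why compression matters for running these models locally, here is a minimal sketch of the memory needed just to hold the weights. It assumes "compressing" means storing weights at lower precision (quantization), which is one common approach but not something this post specifies. The bit widths are illustrative, and the estimate ignores activations, caches, and runtime overhead.

```python
# Rough estimate of weight storage at different precisions.
# Assumption: memory ~= parameters * bits per weight / 8 bytes.
# This ignores activations, caches, and any runtime overhead.

def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


MODELS = {"Llama 3 8B": 8e9, "Llama 3 70B": 70e9}
BIT_WIDTHS = (16, 8, 4)  # illustrative precisions, chosen for this example

for name, params in MODELS.items():
    for bits in BIT_WIDTHS:
        size = weight_memory_gb(params, bits)
        print(f"{name} at {bits}-bit: ~{size:.0f} GB")
```

By this estimate the 8B model drops from about 16 GB at 16-bit to about 4 GB at 4-bit, which is why a laptop or even a phone starts to look plausible.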

If you’re building anything cool with Llama 3 models, tag me on Twitter/X @bentossell. I’d love to see it.
