Ben's Bites Newsletter

Meta puts open source AI on the podium.

July 24, 2024

405 billion parameters. 15 trillion tokens. 16,000 GPUs. That’s how Mark Zuckerberg plans to take on closed AI companies. His latest move is a herd of new Llama models—Llama 3.1 8B, 70B and 405B.

Here’s what you are getting in the package:

Llama 3.1 405B (+ new 8B and 70B versions)

Llama 3.1 405B is the biggest Llama of them all. Using it as a teacher, Meta has retrained its 8B and 70B versions too. Together these models:

have a context window of 128k tokens. This means you can enter hundreds of pages worth of content in your prompts.
are multilingual in 8 languages—English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
come with support for using tools like Web Search, Math Reasoning, and Code Execution.
are open source. You can download the weights and use them in your applications.
Llama 3.1 models are not multimodal. They don’t understand or create images. But multimodal Llamas are coming soon.
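To put the 128k context window in perspective, here is a rough back-of-the-envelope sketch. The conversion factors are assumptions, not Meta's numbers: about 0.75 English words per token (a common rule of thumb) and about 500 words per page. The function name is made up for illustration.

```python
def estimate_pages(context_tokens: int,
                   words_per_token: float = 0.75,   # assumed rule of thumb for English text
                   words_per_page: int = 500) -> float:  # assumed page density
    """Roughly convert a token budget into pages of English text."""
    words = context_tokens * words_per_token
    return words / words_per_page


pages = estimate_pages(128_000)
print(f"A 128k-token window holds roughly {pages:.0f} pages")
```

Under these assumptions the window holds a little under 200 pages. Other languages and denser text (like code) usually use more tokens per word, so the real figure will vary.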

Benchmarks

Not only is Llama 3.1 405B the smartest LLM ever released openly, it also humbles the closed giants. Its benchmark scores come close to (and sometimes top) GPT-4o and Claude 3.5 Sonnet.

For a quick snapshot of the benchmarks, look at how it does on Scale AI’s SEAL leaderboard:

2nd in math and reasoning.
4th in coding.
1st in following instructions.

Its exact performance will depend on where you want to use it, but it’ll fall in the same category as the top closed LLMs.

The smaller models in the Llama 3.1 family also perform better than the corresponding Llama 3 versions. They are state-of-the-art in their size categories and come with the same powers: a larger context window, multilingual support and tool use.

Open Science

The model weights are open. You can download them from Meta’s website here or from HuggingFace. In addition, Meta is working with partners to distribute the models, including data platforms (Databricks and Snowflake), major cloud providers (Google Cloud, AWS, Azure) and startups in the ecosystem (Groq, Ollama, Fireworks, etc.).

Meta has also updated its license and now allows using Llama 3.1 405B to create synthetic data for training other models. This was one of the key requests from the community.

Although nothing is truly open-source in AI, this launch is more than just open weights. Meta has released a massive technical report outlining the challenges it faced in building a model of this size and how it overcame them.

The amount of research they have shared in it is insane. You can read the paper here and listen to the Latent Space podcast with Thomas Scialom (Llama 2 lead and Llama 3 post-training lead) for more technical details.

The interesting part is using the 405B version, which you can’t run on consumer hardware (your MacBook or a single GPU). Many inference and fine-tuning providers are offering access to the model on day one.
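Why can't you run it on consumer hardware? A quick sketch of the arithmetic, counting only the memory needed to hold the weights (the KV cache and activations add more on top). The 80 GB accelerator size is an assumption chosen for illustration, and the helper names are hypothetical.

```python
import math

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str = "bf16") -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # (params_billions * 1e9 params) * bytes_per_param / 1e9 bytes per GB
    return params_billions * BYTES_PER_PARAM[precision]


def min_gpus(params_billions: float, precision: str = "bf16",
             gpu_memory_gb: float = 80.0) -> int:
    """Lower bound on GPUs needed just to fit the weights (assumed 80 GB each)."""
    return math.ceil(weight_memory_gb(params_billions, precision) / gpu_memory_gb)


for size in (8, 70, 405):
    print(f"{size}B: {weight_memory_gb(size):.0f} GB in bf16, "
          f"at least {min_gpus(size)} x 80 GB GPU(s)")
```

At 16-bit precision the 405B weights alone come to about 810 GB, far beyond any laptop or single GPU. The 8B model needs about 16 GB, which is why it is the one people run locally.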

Fireworks AI has the best pricing for the 405B model: $3 per million tokens for both input and output. For the 8B and 70B versions, Octo AI and Together AI have the best pricing right now.
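To see what $3 per million tokens means for a single request, here is a minimal cost sketch. The function name and the example token counts are hypothetical; only the $3/$3 rate comes from the pricing quoted above, and providers may change it.

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_million: float = 3.0,
                     output_price_per_million: float = 3.0) -> float:
    """Cost of one request given per-million-token prices in USD."""
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000


# Hypothetical request: a 100k-token prompt with a 2k-token answer.
cost = request_cost_usd(100_000, 2_000)
print(f"${cost:.3f}")
```

So even a prompt that fills most of the context window costs roughly 31 cents at this rate.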

But that’s for developers. Where can the rest of the world use these models? The answer is:

Meta AI

Meta isn't just releasing powerful models for developers; it is integrating them directly into meta.ai. Users of Meta’s AI chatbot will now have access to Llama 3.1 405B.

Zuck’s BHAG (Big Hairy Audacious Goal) is to have Meta AI be the most used AI chatbot in the world by the end of the year. He says it will get there before that. More changes are coming to Meta AI, some highlights:

Meta AI is now available in 22 countries, including new additions in Latin America and Cameroon, and supports 7 new languages.
The "Imagine me" feature creates personalized images based on text prompts and your selfies (e.g., "Imagine me as a superhero"). This is powered by other research from Meta, not Llama.
Meta AI is coming to Meta Quest (VR) next month, replacing Voice Commands. It is already available on Ray-Ban Meta smart glasses.

While the models are released everywhere today, these Meta AI updates might take some time and will be rolled out gradually across countries and regions.

Zuck’s vision for AI (and open-source)

Zuck also laid out his vision in a detailed letter, claiming “Open Source AI is the path forward”. He also did an interview with Bloomberg’s Emily Chang. Here are the combined highlights from both:

AI will follow the Linux model, where open-source eventually outpaces closed systems in capabilities and adoption—and Meta will lead it. There will be tons of AI models—small and large, not a single AI god.
In his playbook, Meta will not make money by selling access to the models, but by creating products that use these AI models. At the same time, he doesn’t want to rely on others for access to the platform (like Apple in mobile phones).
People are using Meta AI to roleplay difficult social situations. Creators (businesses) want to use AI to be in touch with their fans (customers) on Meta’s social apps. Agents with Llama 4 are Meta’s opportunity.
Open innovation (not centralized progress) is key to staying ahead of competitors like China. The US can maintain a healthy 6-8 month lead with open AI models, which would be better than hoping that the weights of closed AI models remain secret.
