
Daily Digest: AI labels on social media

PLUS: New ways to test AI are needed

Want to get in front of 100k AI enthusiasts? Work with us here

Hello folks, here’s what we have today:

PICKS
  1. New tutorial: Auto post project updates from your PM tool to Slack - How to create a Zapier Central bot that connects your management tool to Slack.

  2. Meta has updated its AI label on Instagram and its other platforms. Instead of “Made with AI”, it’s now called “AI info”. This comes after many posts with only small AI edits (e.g. from Photoshop’s Generative Fill) were marked as made with AI.

  3. Anthropic's throwing cash at third-party AI evaluations. Anthropic’s new initiative is funding new ways to test frontier AI models that are getting too smart for current evals. Safety, advanced capabilities, and eval-building tools are on the wishlist. 🍿 Our Summary (also below)

  4. How do you publish a book on AI that isn't immediately out of date? - We defined prompt engineering principles that worked on GPT-3, still work on GPT-4, and will work on GPT-5 too (we hope). This book survived 8 rounds of technical review from O'Reilly, including an early LangChain contributor, so it belongs on your shelf. Prompt Engineering for Generative AI – by James Phoenix & Mike Taylor. (Sponsor)

TOP TOOLS
  • Gen-3 Alpha by Runway AI is now available to everyone. It takes a few mins and $1-$2 to generate a 10-second video clip.

  • Suno is now on iOS - Imagine a song from anywhere on your phone.

  • Superlocal - Map search with results personalized to you.

  • Prompt Easy - Craft fine-tuning datasets for GPT in under 5min.

  • Cooked Wiki - Recipes summarized and organized, from anywhere on the web.

  • Canyon - An all-in-one AI tool for job seekers to land their dream job.

  • Lytix - Monitor, improve and scale your LLM applications.

  • Booth AI - Build no-code Gen AI apps.

NEWS
QUICK BITES

Anthropic wants to pay people to build better ways to test its AI models. They're basically saying “Hey nerds, our AI keeps acing all the tests we throw at it, so we need some real brain-busters now!”

What's going on here?

Anthropic announced a new initiative to fund third-party evaluations of advanced AI capabilities and risks.

What does this mean?

Frontier AI models are outgrowing the old evaluation methods faster than teenagers outgrow their shoes. Creating and running new evals is expensive, especially as more GPT-4-class models like Gemini 1.5 Pro and the Claude 3 and 3.5 series arrive.

To solve this, Anthropic is opening up its wallet to the wider AI community, hoping fresh eyes (i.e. new evals from third parties) can judge these models better.

Anthropic is looking at three main categories: AI safety level assessments, advanced capability metrics, and tools for building evals.

  • For safety, they want tests for stuff like AI hacking skills, the ability to design bioweapons, and how autonomous models can get.

  • On the capability side, they're after evals for cutting-edge science, multilingual skills, and societal impacts.

  • They also want infrastructure to make it easier for experts to whip up good evals without needing coding chops.
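For a sense of what “building evals” means in code, here’s a minimal sketch. All names here are hypothetical stand-ins (a real eval would call an actual model API, and the evals Anthropic is funding are far more involved than exact-match scoring):

```python
# Minimal sketch of a capability eval: ask a model questions,
# score its answers by exact match, and report accuracy.

def ask_model(question: str) -> str:
    # Hypothetical stand-in for a real model API call.
    canned = {"What is the capital of France?": "Paris"}
    return canned.get(question, "I don't know")

def run_eval(dataset: list[dict]) -> float:
    """Return exact-match accuracy of the model over a dataset."""
    correct = 0
    for item in dataset:
        if ask_model(item["question"]).strip() == item["answer"]:
            correct += 1
    return correct / len(dataset)

dataset = [
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "What is 2 + 2?", "answer": "4"},
]
print(run_eval(dataset))  # 0.5 — the canned model only knows one answer
```

The “infrastructure” Anthropic mentions is essentially tooling that lets domain experts define the `dataset` part (questions, answers, grading rules) without having to write the harness code themselves.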

Anthropic is also sharing a wishlist and inviting proposals through an application form.

Why should I care?

Anthropic's trying to stay ahead of the curve because when your creation starts acing tests faster than you can write them, it's time to bring in the reinforcements.

If you're an AI whiz or domain expert, there's cash on the table. And it’s not just Anthropic: other big AI labs are also sweating about evals (OpenAI famously gives early access to eval contributors).

Ben’s Bites Insights

We have two databases, updated daily, which you can access by sharing Ben’s Bites using the link below:

  • All 10k+ links we’ve covered, easily filterable (1 referral)

  • 6k+ AI company funding rounds from Jan 2022, including investors, amounts, stage etc (3 referrals)
