BB Digest: Claude tops new coding test

PLUS: HP buys Humane, Mira Murati's new company

Hey folks, I've been heads-down “building” this week, though I'm not the one coding; AI is. Also, did anyone hesitate before starting to build with AI (or maybe you're still hesitating)? Why was that?

TL;DR: inside today’s newsletter

  • OpenAI's coding benchmark, Mira Murati’s new lab

  • A wave of AI science tools from Big Tech

  • ICYMI: New upscaling tutorial and guides on vibe coding

  • Community chats you don’t wanna miss

  • 6 new AI tools, including Google's Career Dreamer

  • 5 interesting posts covering the death of the AI Pin and more

  • To-dos

Want to partner with us? Click here.

  • OpenAI released a new benchmark to test AI models’ coding capabilities. SWE-Lancer takes tasks from Upwork that real freelancers have been paid for and ranks models by how much they can earn from a total pool of $1M. Claude 3.5 Sonnet is at the top, completing tasks worth a little over $400k. Our blog breaks down the benchmark and the kinds of real software work AI can and cannot do.

  • Mira Murati, OpenAI’s ex-CTO, is back with a new company: Thinking Machines Lab. It is made up of insane talent from OpenAI, Meta’s AI division and Mistral. The key difference: the company pledges to publish its progress in public because, in their words, “science is better when shared.”

  • Speaking of AI and science, a lot happened midweek:

    • Google released a new AI system: a co-scientist built with Gemini. It proposes testable hypotheses, along with a summary of relevant literature and a possible experimental approach, to help real scientists make new discoveries.

    • Sakana AI, another AI lab, has created an AI CUDA Engineer that can produce highly optimized CUDA kernels for AI engineers.

    • Microsoft also dropped Muse AI, a model to generate gameplay, and Majorana 1, a quantum chip. Also, Dwarkesh got Satya Nadella on his podcast to diss AGI.

    Wild times.

  • Get the ultimate guide to AI agents! Galileo’s latest eBook is a 100-page deep dive into the world of AI agents — offering practical strategies for selecting, optimizing, and scaling agents with confidence. Build powerful, reliable agents like an expert with the guide. Get your copy now!*
    *sponsored

👀 ICYMI: New stuff we’ve launched

AI images are no longer janky, but there’s still a big blocker when using them for real work: you can’t generate them in full HD or higher resolutions. The solution, you ask? Upscale them by following our new tutorial. You can upscale your own images too.

Someone asked us: what exactly is vibe coding? Here’s our answer and a pitch to get you on board. We’re all in. And if you’re already convinced, follow our guide on which coding tool YOU should use and just get started.

We have workshops:

💬 Inside the community this week

Become a pro member to build and learn together with us.

  • Gurpreet built Zombifyme, a fun project to transform people into zombies. (link)

  • Wyatt shared his post comparing ChatGPT, Gemini, and Perplexity search capabilities. (link)

  • Robert shared insights on finding product-market fit with AI from a recent meetup. (link)

  • Nichlas built a curated database of 300+ product manager courses. (link)

  • Claire's hunting for some examples to run a killer AI data analysis masterclass. (link)

  • Gurpreet's exploring how to make TikTok-style real estate tours with AI. (link)

  • Goldstein's trying to pick between Cursor, Replit, and Create for coding with AI. It's a long thread with members chiming in. (link)

  • Brett's exploring what we could do with browser-use tech. (link)

  • Jon's catching up on 4 months of AI progress - there's been a lot! (link)

Join the conversation, plus full access to courses and workshops by becoming a pro member today!

⚙️ Top new tools

  • AssemblyAI: Speech-to-Text API built for conversations*

  • Auto Label by Superhuman - Automatically assigns labels like marketing, pitch, social, and news to emails (or lets you create your own label).

  • Proxy by Convergence - A web-browsing agent to do tasks for you better than OpenAI’s Operator.

  • Career Dreamer by Google - A playful way to explore career possibilities. (blog)

  • Yess - Sales client research and personalized meeting prep done in minutes.

  • Langflow - Low-code tools for RAG and AI development.

📜 Interesting posts

📌 To-dos

That’s it for today. Feel free to hit reply and share your thoughts. 👋

Enjoy this newsletter? Please forward to a friend.

Want to join a community of AI builders? Become a Ben’s Bites member and join our Slack (bonus: access to all our workshops and courses).

Want to advertise in this newsletter? Click here.
