ChatGPT with Vision and Voice
PLUS: Podcast dubbing by Spotify, Getty Images' new tool and what builders think about AI.
Hello folks, here’s what we have today:
Our picks
1/
ChatGPT can see, hear and speak now - The new features let users speak conversationally with ChatGPT and show it images to get relevant assistance. The image capabilities are powered by the multimodal GPT-4V (GPT-4 with Vision), which OpenAI showcased earlier this year. The new voice generation model is still under wraps. 🍿Summary
2/
Spotify translates podcasts into new languages - A new pilot program offers AI-powered voice translation of podcasts into other languages while retaining the original speaker's voice. Lex Fridman, Steven Bartlett, and a few other podcasters are part of the pilot. It’s based on OpenAI’s new voice generation model. 🍿Summary
3/
New AI image generator by Getty Images - The generator is powered by Nvidia and trained on Getty's library of 477 million images. The company positions it as a safer, more responsible option than rival generative AI tools. 🍿Summary
4/
a16z has hosted a series of talks on the AI revolution. Guests include Mira Murati from OpenAI, Kevin Scott from Microsoft, and others. Here are two from a16z’s perspective:
The economic case for generative AI with a16z's Martin Casado.
What builders talk about when they talk about AI.
from our sponsor
Join Sam Altman, Co-founder and CEO of OpenAI, and Wade Foster, Co-founder and CEO of Zapier, for a fireside chat on AI, entrepreneurship, and much more, at ZapConnect on September 28.
From the community
Make it real - An invite-only responsible AI leadership summit by Credo AI in NYC.
Cool Tools - trending product launches from the last 24 hours
Arcee DALM toolkit - Toolkit for developers to build on top of open-source, domain-pretrained (DPT) LLMs.
Wondercraft - Reach new audiences by dubbing your podcast.
Anthropic Cookbook - Recipes to use Claude in fun and effective ways.
HuggingFace Pro - Access to curated API endpoints for the best models, plus improved rate limits on the free Inference APIs.
Athelas Scribe - Transform the clinical intake process, delivering unmatched results in record time.
Slack assistant by MyAskAI - Use AI powers in Slack along with your Google Drive & Notion data.
FireCut AI - Your lightning-fast AI video editor.
Stylize - Make incredible style edits to your photos with AI.
Vespio - Sentiment analysis & improvement platform for sales teams.
Deepeval by Confident AI - Evaluate LLMs in production.
Low latency voice ordering system demo by Jordan Fisher (CEO - Standard AI).
Ben’s Bites News - top posts from the last 24 hours
UK government shares an introduction to its AI safety summit happening in November.
Elicit raises $9 million and becomes a public benefit corporation.
Snap partners with Microsoft on ads in its ‘My AI’ chatbot feature.
Elad Gil’s 3 tips for entrepreneurs building AI agent companies.
Nous-Capybara 7B - A new SOTA model trained on less than 20,000 carefully curated GPT-4 examples.
Towards the AI agent ecosystem - A technical guide for founders & operators building agents.
My CS classes were too slow, so I dropped out to focus on my AI startup, Automorphic.
ChatGPT & friends - The cool kids boosting my productivity.
Copyright liability for generative AI pivots on fair use doctrine.
Open Souls - How we’re going to integrate AI with the indescribable essence of humanity.
Unclassifieds - short, sponsored links
eesel AI is a ChatGPT oracle over your Google Docs, Notion, Slack, Confluence and other apps 🔮
Ben’s Bites Insights
We have 2 databases, updated daily, which you can access by sharing Ben’s Bites using the link below:
All 10k+ links we’ve covered, easily filterable (1 referral)
6k+ AI company funding rounds since Jan 2022, including investors, amounts, stage, etc. (3 referrals)