- Ben's Bites
- Posts
- Daily Digest: Google I/O 2024 - AI search is here.
Daily Digest: Google I/O 2024 - AI search is here.
PLUS: It's got Agents, Video and more. And, Ilya leaves OpenAI
Subscribe | Ben’s Bites Pro | Ben’s Bites News
Daily Digest #412
Want to get in front of 100k AI enthusiasts? Work with us here
Hello folks, here’s what we have today;
PICKS
New tutorial published: Analysing website user behaviour using AI
and given Google announcements below - here’s our free course on how to use Gemini
GOOGLE I/O 2024. Huff!! There’s too much to cover but here are the key themes:
Google is integrating AI into all of its ecosystem: Search, Workspace, Android, etc. In true Google fashion, many features are “coming later this year”. If they ship and perform like the demos, Google will get a serious upper hand over OpenAI/Microsoft.
All of the AI features across Google products will be powered by Gemini 1.5 Pro. It’s Google’s best model and one of the top models. A new Gemini 1.5 Flash model is also launched, which is faster and much cheaper.
Google has ambitious projects in the pipeline. Those include a real-time voice assistant called Astra, a long-form video generator called Veo, plans for end-to-end agents, virtual AI teammates and more.
Watch these demos of Astra in action and read 🍿Our Summary (also below)
OpenAI and Ilya Sutskever part ways. Along with Ilya, Jan Leike, the head of the superalignment team has also resigned. This leaves Open AI without two researchers who made key contributions to the GPT models. Jakub Pachocki is now the Chief Scientist of OpenAI.
from our sponsor
Fast-track your growth with the IO credit card.*
You’ll earn 1.5% cashback on all spend, from AWS costs to GPUs to coffee runs. Apply in minutes.
*The IO Card is issued by Patriot Bank, Member FDIC, pursuant to a license from Mastercard. To receive cash back, your Mercury accounts must be open and in good standing, meaning they cannot be suspended, restricted, past due, or otherwise in default.
TOP TOOLS
Snatched - A game where you lure LLM-powered NPCs to your alien ship.
Glitter AI - Turn any process into a step-by-step guide.
Glue - Smarter work chat for people, apps, and AI.
ElevenLabs Dubbing API - Add audio or video translation to your product.
Insight XR - AI analytics and actionable insights for XR applications.
Shortlist - The future of AI recruitment tools.
NEWS
Google is hosting a Gemini developers competition with $1M in prizes.
GPT-4o and OpenAI’s race to win consumers.
Microsoft will invest €4B in France for cloud and AI.
How generative AI is remaking UI/UX design.
Expedia starts testing AI-powered features for search and travel planning.
Musk’s xAI is about to close a $10B deal for renting Oracle’s AI servers.
Quantifying GitHub Copilot’s impact in the enterprise with Accenture.
Unclassifieds - short, sponsored links
AI crowdfunding – Invest in Authors AI, find out why it has the Most Momentum on StartEngine. Check it out.
QUICK BITES
GOOGLE I/O 2024. Huff!! There’s too much to cover but let’s start with the key themes:
Google is integrating AI into all of its ecosystem. In true Google fashion, many features are “coming later this year”. If they ship and perform like the demos, Google will get a serious upper hand over OpenAI/Microsoft.
All of the AI features across Google products will be powered by Gemini 1.5 Pro. It’s Google’s best model and one of the top models. A new Gemini 1.5 Flash model is also launched, which is faster and much cheaper.
Google has ambitious projects in the pipeline. Those include a real-time voice assistant called Astra, a long-form video generator called Veo, plans for end-to-end agents, virtual AI teammates and more.
Now, let’s dive in—beginning with integration to products:
In Search, the SGE (Search Generative Experience) is coming out of beta as “AI overviews” to everyone, starting with the US. AI overviews will also get multi-step reasoning and the ability to search based on a video.
"Ask Photos" is a new feature that will upgrade searching through photos from simple keywords like “flowers” to “show me all the images of my son playing with our dog” and “what was the first place we visited on our Japan trip”.
The same search, reasoning and Q&A across all your emails will also come to Gmail. Gmail is also testing a few features via Labs: summaries for each email and suggested drafts as replies. The AI side panel in Docs, Sheets and other workspace apps is coming to everyone, with more features—see details.
On Android, Gemini Nano will power on-device features like Circle to Search, Multimodal Talkback, and Scam detection.
Gemini Advanced, Google’s paid chatbot is getting some much-needed features like file uploads and data analysis (like code interpreter). These will be live soon powered by Gemini 1.5 Pro. On the longer horizon, Gemini Live will compete with ChatGPT’s voice assistant that OpenAI revealed yesterday and Gems will be Google’s version of GPTs.
Which brings us to the models:
Gemini 1.5 Pro which Google revealed in February is coming to everyone now: via API, AI studio, Gemini Advanced and all the product updates. Google says it’s also improved on several metrics but we don’t have a technical report to know the specifics yet.
A new model, Gemini 1.5 Flash, is also available in API and AI studio. It’s faster and cheaper than Gemini 1.5 Pro. Its performance on some select benchmarks is provided, putting it in the Llama 3 70B and Claude Sonnet category but with a price similar to Claude Haiku.
A key thing to remember about Gemini models is that all the models in this family are multimodal natively. They can take any form: text, images, audio, or video as input and create any form of output. Ask Photos and Audio outputs in Notebook LM are good examples of the power of multimodality.
Also, the 1.5 series had a context window of 1M tokens (2M is in preview now). That means you can add a bunch of (very) long files in any media form and these models should work.
OpenAI’s GPT-4o is their first such model. Previously they used to stitch different models to make multimodality work.
Why not jump to Google’s war against OpenAI now? Google showcased a bunch of ambitious projects from Deepmind that take on OpenAI’s impressive sci-fi reveals.
Project Astra is a general AI assistant. It can talk to you in real-time, understanding the audio and video around you. Google isn’t fixated on making it sound like ScarJo for the Her hype but the functionality is almost similar. Jerry compiled a bunch of friends trying out Astra at I/O. Project Astra would likely come to us with a feature called Gemini Live.
Veo is Google’s Sora competitor. It can create long-form, 1080p videos from text prompts with the same claim as Sora of “simulating world physics”. The samples don’t look as awesome as Sora though, but there is a waitlist for you to try getting your hands on it.
Google is also coming to the agent's land. There are variations of how Google is planning this: Gems (like GPTs), virtual teammates in workspaces, and end-to-end task completion via the Gemini app.
Ben’s Bites Insights
We have 2 databases that are updated daily which you can access by sharing Ben’s Bites using the link below;
All 10k+ links we’ve covered, easily filterable (1 referral)
6k+ AI company funding rounds from Jan 2022, including investors, amounts, stage etc (3 referrals)
Reply