AI is becoming Google's Midas Touch

Google has been pumping the weights over the past couple of weeks to flex its AI muscles, but every time it tries something, it either falls short, gets overshadowed by OpenAI, or backfires. It feels like Google has the Midas touch when it comes to AI, for better and for worse.

What’s going on here?

Google rolls out new AI models and features but faces extreme backlash at the same time.

What does that mean?

Just recently, Google announced its new model with a 1M-token context window. A couple of hours later, OpenAI slam-dunked them with Sora, an AI video model whose outputs are too real to believe.

But that's a week-old story. On Wednesday, Google released Gemma, its new family of open models.

And guess what people are talking about instead? Its flagship chatbot, Gemini, refusing to create pictures of white people.

So, here’s my attempt at separating the wheat from the chaff and bringing some nuance to the hype chamber.

  • Gemini’s image fiasco:

It's bad. Generating diverse images in response to vague queries is a nice idea, but it turns out Gemini takes it far too literally. It started producing historically inaccurate images of white leaders, misrepresenting ethnicities, and inventing fictional narratives. Google's band-aid fix was to refuse to generate images of white people entirely, which only added fuel to the fire. Now it's scrambling to find a real fix.

Now, here's the nuance: LLMs and image generators carry a bias against minorities, so some moderation is necessary to avoid amplifying it. But striking that balance is hard, and Google's first attempt at it "sucked" (to say the least). OpenAI's ChatGPT has the same issue, but since that model has been public for a while, OpenAI has gotten much better at finding the balance. Google will too.

This is much more likely to be an LLM/image model bug and less likely to be an evil agenda.

  • Google open source models:

Google is back in the "open models" game. Some are claiming it is "joining" open source, but Google has been at it for years. Before Llama arrived in mid-2023, Google's Flan-T5 was the best open language model. And based on the benchmarks, it seems they lead again now, at least in the weight class they are playing in (7B).

However, early results from the LLM hackers I follow are not as great. Google claims to beat both Llama and Mistral, but to recap what I've seen on Twitter, it is more like Llama < Gemma < Mistral.

  • Gemini 1.5 Pro:

OpenAI's Sora overshadowed this baby beast, and there's some merit behind that. Sora is insane, and Sam Altman was generating clips for people in real time. But it looks like Google has learned a thing or two.

More devs are getting access to the model and being blown away. Google's team is also actively sharing more demos, including some insane ones.

Why should I care?

This time, I have a weird answer: care less, i.e., focus on what matters to you. Grand narratives are often someone playing with your attention. Be vigilant, participate in discussions about safety, features, and limitations, and question things that make you doubtful, but don't let it get to your head.

Google is gonna win some and lose some in the AI battle. Find your moments of striking gold amid all this.
