Daily Digest: Nvidia strikes again

PLUS: Cheap compute for AI startups.

Daily Digest #284

Hello folks, I’m researching AI software satisfaction across a number of tools. If you use any AI products at work, I’d love to hear from you by filling this in - it’s quick and you can use audio if you’d prefer 🙂. I plan on making the compiled data available in the near future!

Here’s what we have today:

PICKS
  1. NVIDIA has announced a new GPU, the H200, based on their Hopper architecture. This will provide a major boost in performance for AI workloads like large language models. 🍿Our Summary (also below)

  2. OpenAI chief seeks new Microsoft funds to build ‘superintelligence’. Sam Altman did an interview with the Financial Times and namedropped many things, GPT-5 being one of them. 🍿Our Summary (also below)

  3. Bay Bridge by SF Compute (I’m an investor) - The San Francisco Compute Company is selling the cheapest access ever to beastly H100 training clusters. We're talking bleeding-edge setups with 1024 interconnected H100 GPUs and massive InfiniBand networking. 🍿Our Summary (also below)

TOP TOOLS
  • Tokenwiz - Hugging Face tokenizer visualizer.

  • Together Inference Engine - The fastest inference available at 117 tok/sec for Llama 2 70B chat.

  • Room AI advice - Get personalized advice from the world's first AI interior designer.

  • Million AI - Detect, diagnose, and fix slow components in your code.

  • Typedream - Launch your website in minutes with AI.

  • SEC filings AI - Fast-track reading reports from your favorite stocks in one click.

  • Earnbetter - Free AI job search assistant. (+$4.5M funding)

  • Leap workflows - The future of professional AI-driven automation.

  • GPTs by My Ask AI - Launch a GPT with superpowers & unlimited external knowledge.

QUICK BITES

NVIDIA has announced a new GPU, the H200, based on their Hopper architecture. This will provide a major boost in performance for AI workloads like large language models.

What’s going on here?

The H200 delivers far more memory bandwidth and capacity than previous GPUs, enabling generative AI to process much more data much faster.

What does this mean?

The H200 has 141GB of HBM3e memory, nearly double the capacity of the A100, with 2.4x the A100's memory bandwidth. This is ideal for handling the massive data needs of large language models and other generative AI applications.

The new GPU is expected to nearly double inference speeds on 70 billion parameter models compared to the H100 GPU. Further optimizations will unlock more performance over time.
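To see why the memory figure matters for 70 billion parameter models, here's a rough back-of-the-envelope sketch in Python. It counts model weights only and ignores the KV cache, activations and framework overhead, so real deployments need more headroom. The function name is ours, and treating 1GB as 10^9 bytes is an assumption for illustration.

```python
# Rough sketch: how much memory do the weights of a model need?
# Assumptions (ours, not NVIDIA's): weights only, 1 GB = 1e9 bytes.

H200_MEMORY_GB = 141  # capacity quoted in the summary above


def weights_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the model weights, in GB."""
    return params * bytes_per_param / 1e9


for label, nbytes in [("16-bit", 2), ("8-bit", 1)]:
    size = weights_gb(70e9, nbytes)
    print(f"70B params at {label}: {size:.0f} GB of weights, "
          f"{H200_MEMORY_GB - size:.0f} GB to spare on one H200")
```

At 16-bit precision the weights alone come to about 140GB, so they only just squeeze into a single H200 under these assumptions, while 8-bit weights leave roughly half the card free for everything else.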

The H200 will be available in NVIDIA's HGX server boards and the Grace Hopper Superchip module. It is compatible with existing H100 servers, allowing easy upgrading.

Why should I care?

This new GPU will accelerate development and use of large language models that can power more capable AI assistants, creative tools, and information systems.

Faster training and inference directly enable companies to build and deploy more advanced AI sooner. This can translate into better services, insights, and automation.

As AI becomes an increasingly crucial part of business and research, access to more powerful hardware like the H200 will be key to staying competitive. The acceleration it offers will help drive generative AI forward across industries.

QUICK BITES

Sam Altman did an interview with the Financial Times and namedropped many things, GPT-5 being one of them. Sam talked about their true product, revenue growth, relationship with Microsoft and more.

What is going on here?

Sam Altman talks GPT-5, superintelligence and future funding.

What does this mean?

Altman says the partnership with Microsoft is working well, and he expects Microsoft to keep investing in OpenAI's pursuit of AGI. Revenue growth has been good this year, but costs to train AI models remain high, so OpenAI continues to operate at a loss. To build out its enterprise business, OpenAI has hired executives like COO Brad Lightcap.

OpenAI’s true product is "superintelligence". Tools like GPTs, which let people create more autonomous agents capable of complex tasks, are a path to it. Altman confirmed that OpenAI is working on GPT-5, though he didn’t give any hints on timing. Apart from compute, GPT-5 will need more data to be better than the previous models, and OpenAI’s data programs may be a step in that direction.

Altman also said the high demand for Nvidia’s chips has been a factor in OpenAI's work this year, but he seems to think the problem will be solved soon.

QUICK BITES

The San Francisco Compute Company is selling the cheapest access ever to beastly H100 training clusters. We're talking bleeding-edge setups with 1024 interconnected H100 GPUs and massive InfiniBand networking.

What is going on here?

SF Compute is renting AI startups time on an H100 cluster by the month or even just weeks.

What does this mean?

San Francisco Compute is offering short-term bursts on thousands of interconnected H100s with InfiniBand. They have one cluster online now, called Angel Island. The next one, Bay Bridge, comes online in December.

Other providers force you to buy a whole year upfront, which costs around $25 million for 1024 H100s. SF Compute is planning to sell the same beast rig for just the months you need, at around $2.5 million for one month. The hourly rate is higher than what most folks pay, which doesn't always make it a great choice for inference. But for training very large models, it works out cheaper since you can say goodbye to multi-year lock-ins.
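For a rough sense of how those prices compare, here's a back-of-the-envelope sketch in Python. It uses the two figures quoted above ($25 million for a year and $2.5 million for a month on 1024 H100s) and assumes a 365-day year and a 30-day month. The function name and the three-month training run are hypothetical, chosen just for illustration, and none of this is SF Compute's published pricing.

```python
# Rough sketch: effective per-GPU-hour price of each deal.
# Assumptions (ours): 365-day year, 30-day month, all 1024 GPUs billed 24/7.

GPUS = 1024
HOURS_PER_DAY = 24


def per_gpu_hour(total_price_usd: float, days: float, gpus: int = GPUS) -> float:
    """Effective price per GPU-hour for a block of cluster time."""
    return total_price_usd / (gpus * days * HOURS_PER_DAY)


yearly_rate = per_gpu_hour(25_000_000, days=365)
monthly_rate = per_gpu_hour(2_500_000, days=30)

# Hypothetical training run that only needs the cluster for 3 months.
months_needed = 3
short_term_total = 2_500_000 * months_needed
yearly_total = 25_000_000

print(f"Yearly contract: ${yearly_rate:.2f} per GPU-hour")
print(f"Monthly rental:  ${monthly_rate:.2f} per GPU-hour")
print(f"{months_needed}-month run: ${short_term_total:,} vs ${yearly_total:,} upfront")
```

On these assumptions the monthly rental costs more per GPU-hour (roughly $3.39 vs $2.79), but a three-month run comes to $7.5 million instead of a $25 million yearly commitment.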

Why should I care?

If you're an AI shop wanting to play with the biggest models, you need access to hardware like this. But until now you had to pre-pay tens of millions a year to play. SF Compute also guarantees bare-metal-grade performance, ML infra support and a refund if things go wrong.

Ben’s Bites Insights

We have 2 databases that are updated daily, which you can access by sharing Ben’s Bites using the link below:

  • All 10k+ links we’ve covered, easily filterable (1 referral)

  • 6k+ AI company funding rounds from Jan 2022, including investors, amounts, stage etc (3 referrals)
