Ben's Bites Newsletter
Posts
Nvidia's latest launch spree.

Nvidia's latest launch spree.

November 16, 2023

Nvidia has silently launched a bunch of new features alongside Microsoft Ignite. New model classes, upgrades to the software stack, wider availability of its tools and more.

What’s going on here?

Nvidia is making it easier for developers to work with AI, many steps at a time.

What does that mean?

Here’s a recap of what it has announced:

1) Nvidia Foundation Models - A new family of foundation models Nemotron-3-8B with chat and QA variants. Nvidia combines them with a curated list of other enterprise-grade models from third-party providers. Developers can experiment with new NVIDIA AI Foundation Models directly from a browser, test in their applications with NVIDIA AI Foundation Endpoints, then customize using their unique business data.

2) NVIDIA announced an AI foundry service — a collection of NVIDIA AI Foundation Models, NVIDIA NeMo framework and tools, and NVIDIA DGX Cloud AI supercomputing and services — that gives enterprises an end-to-end solution for creating and optimizing custom generative AI models. It’s also applying it with Amdocs in the Telco industry.

3) Faster silicon with confidentiality - Microsoft Azure Cloud will offer new NVIDIA GPU virtual machines. Azure will have H100 VMs, delivering high AI performance. Confidential VMs with H100 GPUs are coming to enable privacy. Azure will add H200 GPUs next year, ideal for large generative AI models.

4) Run heavy computer graphics simulations in the cloud - Nvidia Omniverse now has a cloud version (again, hosted on Azure) to use the platforms without on-prem setup. Autonomous vehicle companies can simulate virtual factories on Omniverse Cloud to increase production quality while saving years of effort and millions of dollars.

5) TensorRT-LLM for Windows will soon be compatible with OpenAI’s popular Chat API through a new wrapper. This will enable hundreds of developer projects and applications to run locally on a PC with RTX starting at 8GB of VRAM, instead of in the cloud — so users can keep private and proprietary data on Windows 11 PCs.

Why should I care?

The new tools and services from Nvidia make it easier for any developer or company to leverage powerful AI, especially generative AI. You now have an end-to-end solution to go from experimenting with models to deploying customized AI.

This means you can more quickly integrate next-generation AI capabilities into your products and services. The optimized models, APIs, and cloud infrastructure remove a lot of the heavy lifting required before.

Reply

or to participate.