Daily Digest: Open Source Multimodal

PLUS: Amazon's new AI robots and AI transparency

Ben’s Bites News
Daily Digest #266

Hello folks, here’s what we have today:

PICKS
  1. Wake up, babe! Adept is open-sourcing Fuyu-8B. Fuyu (hip name btw) is multimodal, i.e. it can see pictures AND read text. The model is available on Hugging Face and is designed for digital agents to understand images and text. 🍿Our Summary (also below)

  2. Amazon is rolling out a major overhaul of its fulfilment centres using new AI and robotics to speed up deliveries. Amazon says that the new technology is designed to work alongside human employees to reduce injuries. Nice, but what about lunch and pee time mate? 🍿Our Summary (also below)

  3. Folks from Stanford, MIT and Princeton got together and created a new index called the Foundation Model Transparency Index (FMTI) to measure companies' transparency levels across 100 indicators. Turns out top AI models offer little transparency, even open-source ones. 🍿Our Summary (also below)

from our sponsor

AI copilot for interviews that lets you focus on your candidates - AI notes for Google Meet, MS Teams & Zoom.

Dive deep with Aspect HQ’s custom AI summaries, ChatGPT for hiring, AI interviewer coach, and auto-drafted ATS scorecards.

TOP TOOLS
WHO’S HIRING IN AI
NEWS

Unclassifieds - short, sponsored links

  • Centenarian - performance & longevity coaching driven by your wearable data.

QUICK BITES

Wake up, babe! Adept is open-sourcing Fuyu-8B. Fuyu (hip name btw) is multimodal, i.e. it can see pictures AND read text. The model is available on Hugging Face and is designed for digital agents to understand images and text.

What is going on here?

The AI squad at Adept just dropped an open-source multimodal model called Fuyu-8B.

What does this mean?

Unlike other multimodal cuties, Fuyu-8B keeps it simple. She feeds image patches straight into her transformer decoder, so she can handle any image resolution. No separate image encoder, no complex multi-stage training. Fuyu-8B's chill with charts, diagrams, and docs, and she answers questions about them like a boss.
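The core trick, skipping the image encoder and treating image patches as just more tokens for the decoder, is simple enough to sketch. A toy numpy illustration (this is not Adept's actual code; patch size and dimensions here are made up):

```python
import numpy as np

def image_to_tokens(image, patch=30, d_model=64, rng=None):
    """Toy sketch: chop an image into patches and linearly project each
    patch so it can sit in the decoder's token stream directly, with no
    separate image encoder. Any image size works, because you just get
    more or fewer patch tokens."""
    rng = rng or np.random.default_rng(0)
    h, w, c = image.shape
    # Crop to a whole number of patches (a real model would pad instead).
    h, w = h - h % patch, w - w % patch
    patches = (
        image[:h, :w]
        .reshape(h // patch, patch, w // patch, patch, c)
        .transpose(0, 2, 1, 3, 4)
        .reshape(-1, patch * patch * c)
    )
    # A single learned linear projection turns each patch into a "token".
    proj = rng.standard_normal((patch * patch * c, d_model))
    return patches @ proj  # shape: (num_patches, d_model)

tokens = image_to_tokens(np.zeros((90, 120, 3)))
print(tokens.shape)  # (12, 64): a 3x4 grid of patches, each now a vector
```

These patch tokens get interleaved with ordinary text tokens and fed to the same decoder, which is why the architecture stays so lean.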

On common benchmarks, Fuyu-8B outperforms models with more parameters, showing its efficient architecture. However, these benchmarks have issues, so Adept be like: no worries, we’ll build our own.

Fuyu-8B is a small version of the larger multimodal model that powers their products. Her big sis Fuyu-Medium does next-level stuff like OCR scanned docs and pinpointing UI elements. Adept is keeping their bigger models under wraps for now—fair.

Why should I care?

An open multimodal model is a big step for AI. Simpler architecture = more accessible and scalable. Fuyu-8B is a solid base for researchers and devs to build real-world apps.

Understanding visual data matters for business. Precision OCR and localization unlock assistants that can see screens like humans do and take action. The Fuyu models are geared toward knowledge workers; the detailed examples on the blog are worth checking if you’re one.

QUICK BITES

Amazon is rolling out a major overhaul of its fulfilment centres using new AI and robotics to speed up deliveries. Amazon says that the new technology is designed to work alongside human employees to reduce injuries. Nice, but what about lunch and pee time mate? 👉👈

What is going on here?

Amazon's new warehouse system will significantly cut delivery times and make inventory tracking much faster.

What does this mean?

Amazon's revamp introduces robots and AI into its warehouses to reduce the time it takes to process orders. The centrepiece is a robotic arm called Sparrow and a new sortation system named Sequoia. Together, these will slash the time to fulfil orders by up to 25% while identifying inventory 75% faster. Amazon’s plans for Sequoia include the “same-day delivery sites” it’s working on.

Why should I care?

The claim from Amazon is that automation is not about eliminating jobs but about eliminating mundane tasks. The goal is to integrate robots seamlessly into workflows. Though I believe people who can’t adapt to these newer workflows (which, I agree, is easier said than done) will lose their jobs.

Rivals like Walmart are also turning warehouse jobs into robot management roles. Amazon will also start testing a bipedal robot named Digit in its operations. We’re not ready for LLMs; now imagine AI + robotics.

QUICK BITES

Commercial foundation models are becoming less transparent, according to researchers at Stanford's Center for Research on Foundation Models. Together with folks from MIT and Princeton, they created a new index called the Foundation Model Transparency Index (FMTI) to measure companies' transparency levels across 100 indicators.

What is going on here?

FMTI finds the top 10 major foundation model companies lacking in transparency.

What does this mean?

The group evaluated 10 major companies on 100 indicators covering how models are built, how they work, and how they're used downstream. The highest score was 54 out of 100 (Llama 2), showing much room for improvement across the board. Many critical details like training data sources, labor practices, and model usage stats weren't disclosed by any company.
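The scoring itself is straightforward: each indicator is a yes/no check, and a company's score is simply the count satisfied out of 100. A toy sketch (these indicator names are invented for illustration, not the actual FMTI indicators):

```python
# Each indicator is a binary "did the company disclose this?" check.
# Indicator names below are made-up examples, not the real FMTI list.
disclosed = {
    "training_data_sources": False,
    "labor_practices": False,
    "compute_used": True,
    "evaluation_results": True,
    "model_usage_stats": False,
}

def fmti_score(indicators):
    """An FMTI-style score: number of satisfied indicators (out of 100
    in the real index, where the dict would hold 100 entries)."""
    return sum(indicators.values())

print(f"{fmti_score(disclosed)} / 100 indicators satisfied")
```

With all 100 indicators weighted equally like this, even the leader's 54/100 means nearly half the checks failed.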

The FMTI methodology and indicators are designed to avoid conflicts between transparency and other values like privacy and security.

Why should I care?

As foundation models spread across sectors, transparency is crucial for properly regulating these powerful systems and ensuring they are built and used responsibly. This lack of transparency makes it hard for businesses, academics, regulators and the public to understand these increasingly influential technologies.

Without basic details of how models work, issues like bias, privacy violations, and other harms can't even be identified, let alone addressed. Nine of the 10 companies have committed to managing AI risks, and the researchers hope the index will help them follow through. They also want to inform policymakers considering regulation around foundation models.

Ben’s Bites Insights

We have two databases, updated daily, which you can access by sharing Ben’s Bites using the link below:

  • All 10k+ links we’ve covered, easily filterable (1 referral)

  • 6k+ AI company funding rounds from Jan 2022, including investors, amounts, stage etc (3 referrals)
