Ben's Bites Newsletter
A Figma AI tool, magic videos, dance generation

and an AI tool I actually use every day

Ben Tossell
November 22, 2022

Hey folks, overwhelmingly positive feedback on the new format so I'm going to stick with it for now.

Grab your cutlery of choice, let's dig in.

🤌 Ben's Picks

An AI tool I actually use every day is finally out in the wild: Bearly AI. It lives as a Chrome extension or keyboard shortcut that pops up while you’re in the groove. Think of me looking through tons of research papers and trying to quickly summarise and simplify them for you, my beautiful readers. That’s where I use Bearly. I’m on an arXiv post, click the little bear, and boom, I get a summary that I can use to explain what the paper is talking about. It’s got a bunch of writing tools baked in: summarisation, rewording, generation, etc. (link)
Magician - A magical design tool for Figma powered by AI. Firstly, what a great way to talk about the tool, ‘magic’ is exactly what a lot of this feels like. You can use this tool to make icons, images and text in your Figma designs. No more lorem ipsum, no more Unsplash, no more going somewhere else to come back and populate in Figma. (link) You can read the announcement blog post here.

MagicVideo: Efficient video generation with latent diffusion models. (link)
EDGE - editable dance generation. Create realistic and physically-plausible dances from music. Physically-plausible is a stretch for my moves, but still! (link)

🛠️ Cool Tools

Clip - an audio search engine and soon a way to generate new audio. To be clear, this isn’t a “search audio from podcasts” tool, like I thought it was. It’s more relevant for searching things like ‘ocean waves’. (link)
Replace one thing in an image with another using text prompts. (link)

👋 Too many links?! I created a database for all links mentioned in these emails. Refer 1 friend using this link and I'll send over the link database.

🔬 Research

Using AI to study 12 years of representation in TV. The study found that while there has been some progress in terms of representation, there are still significant gaps, particularly for women, older people and people with dark skin tones. (link)
Using reinforcement learning from human feedback (RLHF) to improve upon simulated, embodied agents trained to a base level of competency with imitation learning. (link)
3D GAN framework for unsupervised learning of generative, high-quality and 3D-consistent facial avatars from unstructured 2D images. (link)
DIAL is a robot that is designed to help with data collection and policy evaluation. (link)
Image cropping that takes into account the user's intentions. (link)
UniMASK - takes the idea of randomly masking and predicting tokens, used to pre-train language models, and applies it to sequential decision-making. (link)
SinFusion: Training diffusion models on a single image or video. (link)
DDCap is designed to be more flexible than existing methods for image captioning. The model uses diffusion-based inference, which allows for more efficient decoding of text tokens. (link)
Conditional image synthesis from semantic layouts of any precision levels, ranging from pure text to a 2D semantic canvas with precise shapes - SceneComposer. (link)
Text-to-SVG. VectorFusion - shows that a text-conditioned diffusion model trained on pixel representations of images can be used to generate SVG-exportable vector graphics. (link)
AR-LDM - designed for story visualization and continuation tasks. The model is conditioned on history captions and generated images, and is able to adapt to new characters. (link)

🤓 Everything else

OpenAI’s text-davinci-002 was not trained by reinforcement learning from human feedback (RLHF) - as mentioned in this updated post. (link)
How you and your team can use open-source models to build custom document AI solutions for free. (link)
Chinese OCR - a tool to pull out Chinese text in images. (link)
Ghibli Diffusion - fine-tuned Stable Diffusion model trained on images from modern anime feature films from Studio Ghibli. (link)

How BetterTransformer from PyTorch works under the hood and how to efficiently use it in your models in Hugging Face using Transformers and Optimum libraries. (link)
Text-to-emoji. Emoji Diffusion - a Stable Diffusion model fine-tuned on the Russian-emoji dataset to generate emoji images. (link)
The award-winning papers for NeurIPS 2022. (link)
Buildspace is launching its first AI project - Intro to ML: Build your own writing assistant w/ GPT-3. (link)
Tutorial: Train and deploy a DreamBooth model on Replicate. (link)
Scott Aaronson (OpenAI) on “The differences between the Orthodox and Reform branches of AI Risk”. Orthodox AI Riskers believe that a few elite engineers will determine whether humanity survives or is destroyed by AI, while Reform AI Riskers believe that many factors will affect the development of AI and that public outreach is important. (link)
Roam Research users - the “similar pages” extension uses machine learning and graph theory to help you find connections between ideas. (link)
Pieter Levels is working on more photorealistic image renders, and they’re looking gooood. Lexica vs Levelsio. (link)

Stable Diffusion model trained on screenshots from The Clone Wars TV series. (link)

🧑‍💻 Who's hiring in AI

VEED - Simple Online Video Editing. VEED is hiring AI / ML engineers to level up its creative toolkit and make it more magical.
Buildspace - where builders, build! They're looking for an ML/AI instructor to build their new course.

Add your company here | Join the Talent pool

🤗 SHARE BEN'S BITES

Share this with 1 AI-curious friend and receive my AI project tracker database!

or copy/paste this link: https://bensbites.beehiiv.com/subscribe?ref=PLACEHOLDER

👋 SEE YA
