Text-to-3D
Magic3D from Nvidia: High-resolution text-to-3D content creation, Prompt Engineering 101, MultiRay by Meta, Canva text-to-image
Hey everyone! Welcome to the 210 new folks joining since Friday.
I'm testing a new format today, so you'll notice that I've split things into sections: Ben's Picks (any major stuff I think is worth checking out), Cool Tools (self-explanatory), Research (any arXiv papers or scientific breakthroughs), and then Everything else.
This should make the email easier to skim and less overwhelming, so you can pay attention to the stuff you want.
Please reply with 'better' or 'worse' so I can see your feedback :)
🤌 Ben's Picks
Magic3D from Nvidia: High-resolution text-to-3D content creation. It can create high-quality 3D textured mesh models from input text prompts and utilizes a coarse-to-fine strategy that leverages both low- and high-resolution diffusion priors for learning the 3D representation of the target content. Magic3D synthesizes 3D content with 8× higher-resolution supervision than DreamFusion while also being 2× faster. (link)
Buildspace wrote a guide: Prompt Engineering 101. You can probably guess what it’s about 😉 but their images are so dope I’d check it out just to try replicating something similar. (link)
MultiRay by Meta: a new platform for running state-of-the-art AI models at scale. It allows multiple models to run on the same input and shares the majority of the processing costs while incurring only a small per-model cost. (link)
Canva text-to-image. Canva, the multi-billion dollar design app, is now able to generate images from text. It makes a lot of sense and was only a matter of time. (link)
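For anyone curious how the MultiRay idea of sharing most of the processing across models might look in practice, here is a toy sketch. It is not Meta's implementation: `shared_encoder`, the three task heads and the feature tuple are all made-up names for illustration. The assumption is simply that one expensive step runs once per input and several cheap per-task functions reuse its output.

```python
from functools import lru_cache

# Counts how often the "expensive" shared step actually runs (illustration only).
encoder_calls = 0


@lru_cache(maxsize=1024)
def shared_encoder(text: str) -> tuple:
    """Hypothetical stand-in for a large shared model: text -> feature tuple."""
    global encoder_calls
    encoder_calls += 1
    words = text.lower().split()
    # Features: (word count, total characters in words, exclamation marks)
    return (len(words), sum(len(w) for w in words), text.count("!"))


# Cheap, hypothetical per-task "models" that only look at the shared features.
def is_long(features: tuple) -> bool:
    return features[0] > 5


def is_excited(features: tuple) -> bool:
    return features[2] > 0


def avg_word_length(features: tuple) -> float:
    return features[1] / max(features[0], 1)


TASK_HEADS = {
    "is_long": is_long,
    "is_excited": is_excited,
    "avg_word_length": avg_word_length,
}


def run_all(text: str) -> dict:
    """Run every task head on one input, paying the encoder cost only once."""
    features = shared_encoder(text)
    return {name: head(features) for name, head in TASK_HEADS.items()}


results = run_all("Magic3D looks great!")
print(results, "encoder calls:", encoder_calls)
```

Calling `run_all` again with the same text reuses the cached features, so only the small per-task functions run a second time.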
🛠️ Cool Tools
Whisper Memos is an app that records your voice and sends you an email with the transcription a few minutes later. (link)
Alpaca - a next-generation design platform powered by generative models. (link)
Get answers for CLI commands from GPT-3 right from your terminal. (link)
Getimg.ai - a suite of AI-powered image-generation tools. (link)
👋 Too many links?! I created a database for all links mentioned in these emails. Refer 1 friend using this link and I'll send over the link database.
🔬 Research
TART: follow human-written instructions to find the best documents for a given query. (link)
The Twitter-to-arXiv pipeline for GPT-3 discoveries. (link)
Ask4Help enables agents to request and use expert assistance. The policy is designed to efficiently train agents without modifying the original agent's parameters, and to learn a desirable trade-off between task performance and the amount of requested help. (link)
Random curiosity with general value functions (RC-GVF) - a method for efficient exploration in reinforcement learning. (link)
Explaining the behaviour of AI systems. (link)
CNeRV is designed to improve the reconstruction of visual data. (link)
Distilled DeepConsensus - a new machine learning model for genome sequencing. The model is faster and more accurate than the standard HMM-based methods and improves downstream applications of genomic sequence analysis. (link)
CITADEL - efficient and effective multi-vector retrieval. CITADEL is designed to reduce the computation cost while maintaining high accuracy. (link)
Who Says Elephants Can't Run - a new framework for efficient inference of MoE models with conditional execution of sparsely activated layers. This framework enables faster computation of sparse models and reduces memory consumption significantly. (link)
SmoothQuant - a new method for quantizing large language models (LLMs). (link)
Program-Aided Language models (PaL) use a language model to understand natural language problems and generate programs as the intermediate reasoning steps, but offload the solution step to a programmatic runtime such as a Python interpreter. (link)
GENIUS - a conditional text generation model that uses sketches as input. GENIUS is pre-trained on a large-scale textual corpus, and is able to generate diverse and high-quality texts given sketches. (link)
Image completion that incorporates explicit structural guidance. (link)
RenderDiffusion: image diffusion for 3D reconstruction, inpainting and generation. (link)
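To make the PaL idea from the list above a bit more concrete, here is a minimal sketch of the "generate a program, then let Python do the maths" split. It is not the paper's code: `fake_language_model` is a hypothetical stand-in that returns a hard-coded program instead of calling a real model, and the word problem is made up for illustration.

```python
def fake_language_model(question: str) -> str:
    """Hypothetical stand-in for an LLM call; returns Python source as text."""
    # A real system would prompt a model with the question and get code back.
    return (
        "apples_start = 23\n"
        "apples_used = 20\n"
        "apples_bought = 6\n"
        "answer = apples_start - apples_used + apples_bought\n"
    )


def solve_with_program(question: str) -> int:
    """Ask the 'model' for a program, then offload the arithmetic to Python."""
    program = fake_language_model(question)
    namespace = {}
    # Caution: exec on real model output is unsafe outside a sandbox.
    exec(program, namespace)
    return namespace["answer"]


question = (
    "The cafeteria had 23 apples. They used 20 for lunch and bought 6 more. "
    "How many apples do they have?"
)
result = solve_with_program(question)
print(result)
```

The point of the split is that the model only has to write down the reasoning steps as code; the interpreter handles the actual calculation.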
🤓 Everything else
AssemblyAI released its Playground. You can test its transcription by using any YouTube video link, local audio file, or local video file. (link)
Want to know what Deep Reinforcement Learning is? Here’s a quick thread. (link)
Snorkel AI released its Data-centric Foundation Model Development, a new paradigm for enterprises to use foundation models to solve complex, real-world problems. (link)
Interview with generative artist Gene Kogan - Between art and engineering: AI and expanding what it means to create. (link)
If anyone is actually using Google Chat, get AI summaries of your conversations. (link)
Scott Galloway wrote about his thoughts on AI. He’s become a bit of a meme for addressing technology and getting it very very wrong. Luckily here there aren’t any outlandish claims. (link)
An overview of the United States vs China chips situation. (link)
AltDiffusion is a multilingual text-image generation model built on Stable Diffusion. Currently supports English, Chinese, Spanish, French, Japanese, Korean, Arabic, Russian and Italian. (link)
DiffusionDB: A large-scale text-to-image prompt gallery dataset based on Stable Diffusion. (link)
I wrote a thread linking a bunch of educational resources to learn Machine Learning. (link)
Accuracy and performance testing of OpenAI's transcription software. (link)
🧑‍💻 Who's hiring in AI
VEED.IO - Simple Online Video Editing. VEED is hiring AI / ML engineers to level up its creative toolkit and make it more magical.
Buildspace - where builders, build! They're looking for an ML/AI instructor to build their new course.
🖼 AI IMAGES OF THE DAY
Woody Cage
🤗 SHARE BEN'S BITES
Share this with 1 AI-curious friend and receive my AI project tracker database!
or copy/paste this link: https://bensbites.beehiiv.com/subscribe?ref=PLACEHOLDER