- Ben's Bites
New Image & Speech capabilities
PLUS: Midjourney v5, AssemblyAI's Speech Model, PwC Lawyers & AI, UK AI Investments
Hey folks, today we've got the next version of Midjourney for image creation, AssemblyAI's new, super powerful speech recognition model, 1 of the big 4 (PwC) bringing AI to its lawyers and the UK pushing forward with AI investments.
Want to build projects and make content around AI? I'm launching a maker-in-residence program. Learn about it here and apply!
Let's get to it.
Our Picks
Highlights if you've only got 2 minutes
Remember that crazy period when some kind of generative art stuff came out and changed the world years ago (well, 6 months ago), and then we all got distracted by language models? Just kidding, but hot on the heels of GPT-4 is Midjourney v5, and it's equally awesome. The thread linked above goes through a whole host of examples, and the results are stunning, albeit with subtle improvements over v4. It strikes me as better at capturing angular views of faces, slightly less cartoonish, and more effective at capturing the "vibe" of the prompts. I'm keen to see assessments from the art experts out there!
Check out this video for how to get access.
Time for Shazam to take a leaf out of AssemblyAI's book and level up their ability to spot song lyrics in a noisy room… These guys are releasing Conformer-1, a speech recognition model that's achieving near-human-level performance after being trained on 650,000 hours of audio data. That's about an entire lifetime, or a whole day for all of the people that open this email added together… It has 43% fewer errors on noisy data than other publications, a 29% speed-up at inference, and actually uses novel tech. Check out some live examples on their website here.
OpenAI dived straight in with their demonstration that GPT-4 is a significantly better lawyer (amongst other things of course), and PwC is standing ready to put that to the test. Teaming up with the OpenAI-backed startup Harvey (back to that GOD DAMN brilliant Suits reference again), they're using ChatGPT-powered tech to bring efficiency to due diligence and regulatory compliance. With 4,000+ lawyers globally, PwC is making a big, commercial push to integrate generative AI into their business here.
Non-paywalled Bloomberg article here.
It's a warm and fuzzy feeling when people in power decide to take note of the importance of the nerdy work we all do - and the UK government seems to have done something along those lines in the Spring budget. With a number of AI-specific allocations of money, it's a true demonstration of how AI is taking hold of the public discourse. My favourite bit is our very own Kaggle competition in the form of an annual £1m prize for AI research; there's also £3b in additional funding for high-growth companies (read: tech?); and the building of a quantum computing hub just for good order.
Data chatbot for DTC brands - your own always-on, instant analyst (Sponsor)
Use Zenlytic's AI analyst to ask what's driving your store's acquisition, conversion, and retention.
It's like having a data analyst who never sleeps. Who can instantly drill into your brand's data. Who never forgets to update your dashboards.
Cool Tools
Product launches, updates and demos
Chatfuel AI - Take customer communication to the next level. (link)
Ask your PDF - Upload a PDF to chat with it. (link)
Paper pages on the Hugging Face Hub - Discover models, datasets and spaces for a specific paper. (link)
Collato - AI search for product teams. (link)
Klarity - Chat with your document and discover detailed insights. (link)
Auto transcribing is now faster and more accurate with Banner Bear + Whisper. (link)
Smart Chat - Transform your Obsidian notes into interactive AI-powered conversations. (link)
Darby Dashboards - Connect your APIs/data sources and create dynamic dashboards in seconds. (link)
DocLime - Say goodbye to manual document search and get answers within seconds. (link)
Contract Reader launches smart contract reviews powered by GPT-4. (link)
Chatbot UI Pro - It's an open-source ChatGPT clone that you can run locally. (link)
What does this code do - ChatGPT under the hood to explain any piece of code you don't understand. (link)
Nabla Copilot - Superpowers for health providers. (link)
Replit + GPT4 - Ready to go templates for Node.js and Python. Just create a Repl, place your API key, and run. (link)
ChatGPT Asteroids - A game coded by ChatGPT. (link)
I gave GPT-4 a budget of $100 and told it to make as much money as possible. (link)
Cursor - An IDE built for programming alongside GPT-4. (link)
ChatFriday - A minimalistic UI for ChatGPT. (link)
Cal, the meeting scheduler, announces AI powered search to its docs. (link)
Double - Start using GPT to automatically research your leads on the internet and answer questions. (link)
Miscellaneous
News, podcasts, videos, blogs etc
GPT-4's successes, and GPT-4's failures. (link)
Interview with OpenAI's Greg Brockman: GPT-4 isn't perfect, but neither are you. (link)
GPT-4, Google adds AI to productivity apps, local language models. (link)
Stripe and OpenAI collaborate to monetize OpenAI's flagship products and enhance Stripe with GPT-4. (link)
The Vesuvius Challenge - A machine learning and computer vision competition to read the Herculaneum Papyri with $250k in prizes. (link)
Reid Hoffman wrote a book called Impromptu with OpenAI's GPT-4. (link)
The multi-modal, multi-model, multi-everything future of AGI. (link)
Apple engineers testing ChatGPT-style tech as Siri faces "clunky code" and other hurdles. (link)
LinkedIn is adding AI tools for generating profile copy and job descriptions. (link)
Unscripted AI NPCs in a first-of-its-kind Unreal Engine demo. (link)
Foundation Model Ops - Powering the next wave of generative AI apps. (link)
OpenAI checked to see whether GPT-4 could take over the world. (link)
Microsoft rations access to AI hardware for internal teams. (link)
These new projects show just how much more powerful GPT-4 is. (link, without paywall here)
Chat GPT4 - Is the world prepared for the coming AI storm? (link)
Should we automate the CEO? (link)
Learn
How-to's and resources
Too many links?! I created a database for all links mentioned in these emails. Refer 1 friend using this link and I'll send over the link database.
Research
Published research papers
X-GPT from Microsoft - Connecting generalist X-Decoder with GPT-3. (link)
Highly personalised text embedding for image manipulation by Stable Diffusion. (link)
UPRISE from Microsoft - Universal prompt retrieval for improving zero-shot evaluation. (link)
DeepMIM - Deep supervision for masked image modelling. (link)
Re-ReND from Meta - Real-time rendering of NeRFs across devices. (link)
Mesh Strikes Back - Fast and efficient human reconstruction from RGB videos. (link)
Unclassifieds
Short, sponsored links
Have a product, service, job, event, newsletter, app, book, movie, tool, or anything you'd like to share with over 45k subscribers?
Advertise | Job board | Community | Investor + Founders connect
AI Images of the day
Funny memes and pics from around the web
Send this to 1 AI-curious friend and receive my AI project tracker database! Use this link.