Welcome Renaissance Creator,
Google is having a big week.
Between announcing Gemini, its newest model, which was expected to blow GPT-4 out of the water, releasing a series of free and open source models, and weathering a string of Gemini gaffes that prompted many people to call for firing Google’s CEO, there has been a lot going on.
In an apparent effort to divert attention from the disturbing imagery produced by Gemini’s image generation, Google is back today with a new announcement. This one, if it actually works, looks mind-blowing.
Today, Google announced Genie: Generative Interactive Environments.
Hot Off The Press
Google announces Genie: a foundation model for generative world-building
ElevenLabs and Perplexity partner to create an automated daily podcast
Phind-70B: closing the code quality gap with GPT-4 Turbo while running 4x faster
Tyler Perry just stopped his $800 million film studio expansion due to OpenAI's Sora
Google’s Gemini is still having a lot of problems and causing AI ethics and safety debates
Democratic operative admits to commissioning fake Biden robocall that used AI
Magic 6 Pro smartphone showcases an eye-tracking AI function that enables users to open and move their car by looking at their phone screen
Figure AI raising $675 million from Intel, Samsung, Amazon, Bezos, NVIDIA, and OpenAI to build humanoid robots
AI is going to change the adult entertainment business forever
The One Big Thing
Google wants to be the Main Character of the AI Game
Google is arguably the company with the most to lose in the AI revolution.
While most participants are clamoring to grab a piece of a seemingly infinite pie, the incumbents, especially in the internet search space, have a large business to defend from disruption.
New entrants like ChatGPT and Perplexity have begun to eat into Google’s Search business. However slight the business impact may be so far, the pace of adoption is a real threat. ChatGPT was the fastest-growing technology product of all time at launch, and it presents a direct threat to the massive moat Google has built around internet search queries. With the barriers to content creation and distribution going to zero, it’s fair to ask how relevant direct search will be in the future of the internet.
But Google has a trillion reasons to fight back. And another trillion dollars to fight with.
It turns out that a trillion dollars can create some cool new things, as we have seen in the past week. Google has picked up the pace of its releases, with Gemini, Gemma, and now Genie all coming within a week of each other.
Genie is taking Google’s AI ambitions to the next level.
Google describes it as “a foundation world model trained from Internet videos that can generate an endless variety of playable (action-controllable) worlds from synthetic images, photographs, and even sketches.”
Playable worlds from synthetic images.
According to Google, Genie was fed the following image as input to create a generative world environment:
This was the output (excuse the blurry image):
In a previous issue, we talked about how, with OpenAI’s Sora, everyone in the world can now be a filmmaker with the tools in their pocket and the power of their imagination.
Genie will let everyone become a video game designer, architect, world-builder, and 3D animator, among many other crafts we have not even conceived of yet.
Beyond this, Genie also offers a closed environment where AI agents can be trained before being released into the real world. The demos for this look particularly interesting. Imagine a generative world that is constantly updating and reacting to an agent that lives within it. Something like Boston Dynamics’ robots, before being thrust into the open world, could first interact in a generative world created by Genie, expanding the limits of testing without real-world consequences.
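For the curious, here is a rough sketch of what "an agent living in a world that reacts to it" means in code. To be clear, this is not Genie and not anything Google has published: `ToyWorld`, `run_episode`, and `always_right` are hypothetical names made up for illustration, and the "world" is just a tiny one-dimensional track. The point is only the loop itself: the agent observes, acts, and the world updates in response.

```python
from dataclasses import dataclass


@dataclass
class ToyWorld:
    """Hypothetical stand-in for a generated world: a 1-D track with a goal."""
    goal: int = 5
    position: int = 0

    def step(self, action: int) -> tuple[int, float, bool]:
        # action is -1 (move left) or +1 (move right); the world reacts to it.
        self.position = max(0, self.position + action)
        done = self.position >= self.goal
        reward = 1.0 if done else 0.0
        return self.position, reward, done


def run_episode(world: ToyWorld, policy, max_steps: int = 100) -> tuple[float, int]:
    """Let an agent act in the world until it reaches the goal or times out."""
    observation = world.position
    total_reward = 0.0
    for step_count in range(1, max_steps + 1):
        action = policy(observation)  # the agent decides what to do
        observation, reward, done = world.step(action)  # the world responds
        total_reward += reward
        if done:
            return total_reward, step_count
    return total_reward, max_steps


def always_right(observation: int) -> int:
    """A trivial policy (an assumption for the demo): always move right."""
    return 1


if __name__ == "__main__":
    total, steps = run_episode(ToyWorld(), always_right)
    print(f"reward={total}, steps={steps}")
```

Swap the toy track for a generated world and the trivial policy for a learning agent, and you have the basic shape of training in simulation before anything touches the real world.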
The barriers to creation are getting closer and closer to zero. The tools are getting stronger and stronger. There is only one place this leads, and that is an explosion of creativity coming from every corner of the earth.
The Renaissance.
Artists’ Gallery
From 2D image to 3D model using Stable Diffusion:
#ComfyUI #Comfy3D
Comfy3D Update:
- Support install directly into ComfyUI Windows Portable Environment (python 3.11, cuda 12.1)
- A bunch of small adjustment & fix
Following results are generated inside ComfyUI python embeded environment:
— Mr. For Example (@MrForExample)
8:17 PM • Feb 25, 2024
AI video reel made entirely from Stable Diffusion:
Fresh out of the oven! A reel of my brainchild from the past 6 months! Thank you for 1000 followers!
Sound on!
Created with @StabilityAI SVD, @runwayml, @pika_labs, @midjourney and #AnimateDiff
Music: Chris Avantgarde - Inside (ft. Red Rosamond)
— The Butcher's Brain (@ButchersBrain)
6:21 AM • Feb 19, 2024
AI Generated Art using Voice Prompting:
The next stage of AI image gen is going to be all about control. Human creativity is a beautiful thing, and the images we all have in our heads is much better expressed through motion, brush-strokes, song, tone of voice than through a simple text prompt.
"Make a tree with a… twitter.com/i/web/status/1…
— Martin Nebelong (@MartinNebelong)
9:24 PM • Feb 25, 2024
Hybrid Animal / Human Portraits (Gallery, click in to see more. Weird af, I won’t lie):
Tools
Must-have tools for every Renaissance creator to add to their toolkit:
Open Code Interpreter: Integrating Code Generation with Execution and Refinement
You: Personalized AI Assistant
How to use Gemma to interpret large documents
Get Insulted chat bot (exactly how it sounds)
TTS Arena: One place to test, rate and find the champion of current open models
Perplexity Globe Search: “Kinda like a custom wikipedia page on anything you want.”
Use AI to learn about anything in a visual way
Deep Tech
The newest and coolest in the research world that you need to know about:
AlphaFold Meets Flow Matching for Generating Protein Ensembles
Dissenting Explanations: Leveraging Disagreement to Reduce Model Overreliance
Open Source AI Cookbook: open-source guides/colabs to build practical (scalable) AI applications
Meta presents MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Long context windows and performance deep dive
AI x Robotics deep dive from last week
Deep dive on RAG
Middleware for LLMs
Someone fine-tuned Mistral to outperform Gemini Pro
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Genie Technical Paper: Generative Interactive Environments
Closing Thought
“Genie & AlphaFold: Simulate protein folding in a generative world environment”
Me pontificating about things people smarter than me are building
Work With Us!
The AI Renaissance is coming, and we are building the best community of people making it happen.
Contact us to sponsor your product or brand and reach the exact audience for your needs across our newsletter and podcast network.