ChatGPT for Images Can Feel Like Magic - AIxC: 79
+ Virtual Try-On Workflows for Paid Subscribers
Welcome back! Constant progress is normal in the Ai world, but this week has been out of control. Just to name a few of the big drops:
New version of Gemini
ChatGPT 4o with conversational text-to-image capability
A new foundation model, Reve
Let’s jump in,
Hector & AIxC team
01 / Latest
ChatGPT for Image Generation
I never liked using DALL-E much. Not a lot of control, slow, and just not up to par with other text-to-image tools.
This new ChatGPT image tool is another story. I gave it a list of directions, including colors, shapes, composition, and text. It picked up on almost everything I noted. It was a little confused with the text when I asked it for “new logos”, but I’d file that under unclear direction on my part.
First pass, count me impressed. The potential use cases are ripe for exploration.
You can read all about its capabilities here.
02 / Latest
Does It Live Up to the Hype?
I fed this external image of a production Timberland boot into ChatGPT’s image function. I then asked it to give me a “tech pack” type drawing of it, with front and side views. A minute later I had this. The side view is pretty good, the front a bit wonky. But very promising with a bit more effort.
I then gave it an input that was a bit more out there. A stretch from the traditional.
The results: OK for a single try. Some details are missing, the proportions are off, and the output keeps drifting back toward the stock boot.
I made one last attempt with another out-there design.
You can see how it took my intent and placed it over a traditional boot profile.
Take all of this with the grain of salt that comes with only having spent a short amount of time on it. But the quick results tell me that, like most things Ai, it will be easy to produce “magic” when the subject matter has large datasets behind it. Without that, it will still struggle and require a lot of creativity on our end.
03 / FOR PAID SUBSCRIBERS
Virtual Try-On with Gemini 2.0 Flash
Manu showed up at our weekly Ai R&D meeting today with some firepower. He spent time testing out what was possible with Gemini’s virtual try-on capability.
Above you can see how you can easily blend two assets (left and middle) to create a new one (right). Not 100% accurate, but close.
That effort was followed by another to see how you can take a base image and visualize it in a wide range of views and shots:
As is the case with nearly all Ai functions, you get some loss of intent, but it is pretty darn good for storytelling and vibe.
This is a good one today and well worth the price of admission in my biased opinion :)
Sign up below for access to the AIxC R&D Lab board, the workflow documentation, and the detailed breakdown. The link and password are behind the paywall. Please note the new password for the month!