GPT-image-1 has just gained 17% usage, challenging the leaders in AI image generation. We are not surprised; we have been experimenting with AI image generators and building them into our automation workflows, and GPT-image-1 is the best choice at the moment.
A key reason for this surge in use can be seen from the side-by-side image comparisons generated from identical prompts in our AI-enabled workflows.
GPT-image-1 outperforms the other generators by a country mile when accurately rendering text within images.

In a B2B context, integrating text and images is particularly important. Unlike more stylised image generation use cases, B2B content often requires text as a vital supporting element, because the message can be hard to convey with images alone.
Three reasons why text in images matters
As we have integrated image generation into our workflows, text in images has proved crucial in several use cases. One important application is the Open Graph image displayed when an article or blog post is shared (more on that below). This image is often the most visually appealing part of the shared content, and we have observed a significant increase in click-through rates when a concise text overlay is present, informing viewers about the content.
1) Impact on Click-Through Rates
Studies show that implementing strategic text overlay techniques can increase click-through rates by up to 300%. This is a dramatic improvement, and it makes reliable text rendering a critical feature of any image generation model used for B2B content.
For social media specifically, different text overlay styles yield varying engagement rates. On LinkedIn, image ads with meme-style text overlays achieve CTRs of 1.5%+, while fake conversations reach 1%+, both significantly outperforming the platform's average 0.6% CTR for image ads.
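To put those LinkedIn figures in perspective, the relative lift over the 0.6% baseline can be worked out directly. The Python sketch below is illustrative only: the function name `relative_lift` is our own, and the 1.5% and 1% values are treated as lower bounds taken from the figures above.

```python
def relative_lift(variant_ctr: float, baseline_ctr: float) -> float:
    """Return the improvement of variant over baseline as a fraction of baseline."""
    if baseline_ctr <= 0:
        raise ValueError("baseline_ctr must be positive")
    return (variant_ctr - baseline_ctr) / baseline_ctr


BASELINE = 0.6  # average LinkedIn image-ad CTR in percent (figure quoted in the text)
FORMATS = {
    "meme-style overlay": 1.5,  # lower bound in percent, from the text
    "fake conversation": 1.0,   # lower bound in percent, from the text
}

lifts = {name: relative_lift(ctr, BASELINE) for name, ctr in FORMATS.items()}
for name, lift in lifts.items():
    print(f"{name}: {lift:.0%} above baseline")
# meme-style overlay: 150% above baseline
# fake conversation: 67% above baseline
```

In other words, even the weaker of the two formats is roughly two thirds better than a plain image ad on these numbers.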

2) Open Graph and Social Sharing Performance
Open Graph meta tags control how URLs display when shared on social platforms, transforming standard blue links into rich objects with images, headlines, and descriptions. These enhanced previews (call it Open Graph SEO) significantly increase CTR compared to standard links, as users receive immediate context about the content they might click on.
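For readers who have not worked with these tags, here is a minimal Python sketch of how the tags for a shared article might be generated. The helper name `open_graph_tags`, the example.com URLs and the 1200x630 default are our own assumptions for illustration; the property names themselves come from the Open Graph protocol.

```python
from html import escape


def open_graph_tags(title, description, image_url, page_url, width=1200, height=630):
    """Build Open Graph <meta> tags for a page's <head> (hypothetical helper)."""
    props = {
        "og:type": "article",
        "og:title": title,
        "og:description": description,
        "og:url": page_url,
        "og:image": image_url,
        # Declaring the image size lets platforms lay out the preview early.
        "og:image:width": str(width),
        "og:image:height": str(height),
    }
    # Escape values so quotes or ampersands in a headline cannot break the HTML.
    return "\n".join(
        f'<meta property="{name}" content="{escape(value, quote=True)}" />'
        for name, value in props.items()
    )


tags = open_graph_tags(
    title="AI & images in B2B",
    description="Why text in images matters",
    image_url="https://example.com/og.png",  # placeholder URL
    page_url="https://example.com/post",     # placeholder URL
)
print(tags)
```

The `og:image` entry is where the generated image with its text overlay would be referenced.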
Email marketing data further validates this approach, with campaigns featuring images showing a 42% higher click-through rate than those without images. When combined with appropriate text overlays, images become even more powerful engagement drivers.
3) Text-Image Interplay in B2B Communications
The text-image relationship has always been especially valuable in conveying complex information in a B2B context. When text and images work symbiotically, they create balanced and professional content that appeals to business audiences.
Examples showing the differences between AI Image Models
All the images in the comparisons were created using exactly the same prompts; the only differences were how we needed to convey aspect ratios and output formats. All were connected to our make.com workflow for three specific use cases: a cover image, a thumbnail, and an Open Graph image. Finally, in these examples, we have used a trained assistant with a set graphic style, which can be tuned to any brand identity.
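As a rough illustration of what "conveying aspect ratios differently" means in practice, the Python sketch below maps one target ratio onto each model's convention. The supported values are the ones listed in our comparison table further down; the function names and the `prompt_suffix` key are hypothetical and not part of any vendor SDK.

```python
from math import gcd

# Supported values as listed in the feature comparison table in this article.
GPT_IMAGE_SIZES = ["1024x1024", "1024x1536", "1536x1024"]
IMAGEN_RATIOS = ["1:1", "3:4", "4:3", "9:16", "16:9"]


def _ratio(text, sep):
    """Turn '1536x1024' or '16:9' into a width/height float."""
    w, h = text.split(sep)
    return int(w) / int(h)


def closest(options, target, sep):
    """Pick the option whose width/height ratio is nearest to the target."""
    return min(options, key=lambda option: abs(_ratio(option, sep) - target))


def aspect_settings(width, height):
    """Express one target aspect ratio in each model's own convention."""
    target = width / height
    divisor = gcd(width, height)
    return {
        "gpt-image-1": {"size": closest(GPT_IMAGE_SIZES, target, "x")},
        "imagen-3.0-generate-002": {"aspect_ratio": closest(IMAGEN_RATIOS, target, ":")},
        # Midjourney takes the ratio as a flag appended to the prompt text.
        "midjourney": {"prompt_suffix": f"--ar {width // divisor}:{height // divisor}"},
    }


settings = aspect_settings(16, 9)
print(settings)
```

Where a model cannot hit the target ratio exactly, the nearest supported option is chosen and the remainder is left to post-production.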

Google's Imagen 3.0 attempts to integrate text. It comes close, but small mistakes creep in, immediately creating more work.
Midjourney is probably a non-starter other than for the cover image as it does not handle text at all, and its lack of API access is a significant hindrance. Its typical uses are within a more stylistic, artistic environment. As a result, if you were to use it (and we have used it in the past), it would not be straightforward to embed text in the images. You would have to add extra steps to the workflow to ensure that the images from Midjourney do not contain any text, and then add that text using some form of post-production automation.

A similar post-production automation can be achieved if you want to use Google Imagen 3.0.
However, we feel that the optimal model should elegantly and robustly add text to images, making GPT-image-1 the clear winner.
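For completeness, the extra post-production step described above for Midjourney or Imagen could be sketched as follows. This is not how our workflow does it; it is one standard-library-only option, assuming the text-free image arrives as PNG bytes, that wraps the image in an SVG and draws the headline on a semi-transparent band. The function name and styling values are our own, and a raster library could be swapped in if a PNG or JPEG output is required.

```python
import base64
from xml.sax.saxutils import escape


def svg_overlay(png_bytes, width, height, headline, font_size=64):
    """Wrap a text-free PNG in an SVG with a headline band along the bottom."""
    data_uri = "data:image/png;base64," + base64.b64encode(png_bytes).decode("ascii")
    band_h = font_size * 2       # band is twice the font size tall
    band_y = height - band_h     # band sits at the bottom edge
    text_y = band_y + band_h // 2
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}" '
        f'viewBox="0 0 {width} {height}">'
        f'<image href="{data_uri}" width="{width}" height="{height}"/>'
        f'<rect x="0" y="{band_y}" width="{width}" height="{band_h}" '
        f'fill="black" fill-opacity="0.55"/>'
        f'<text x="{width // 2}" y="{text_y}" font-family="sans-serif" '
        f'font-size="{font_size}" fill="white" text-anchor="middle" '
        f'dominant-baseline="middle">{escape(headline)}</text>'
        "</svg>"
    )


# Placeholder bytes stand in for a real image returned by the generator.
svg = svg_overlay(b"placeholder image bytes", 1200, 630, "Q&A: 5 tips for B2B visuals")
```

Because the text is added deterministically, it cannot be misspelled by the model, but it is an extra moving part in the workflow.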

We do need to perform post-production on these images because of the aspect ratios. This post-production is not a significant issue; those automated images already allow us to improve click-through rates significantly.
More importantly, they allow us to generate the right images at a reasonable speed.
Furthermore, they will allow us to rapidly generate up to three image versions to A/B test and optimise, which is a significant advantage.
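The aspect-ratio post-production mentioned above is mostly arithmetic. As a minimal Python sketch, assuming a landscape 1536x1024 output and a 1200x630 Open Graph target (a commonly used size, chosen here as an assumption), the centred crop box can be computed like this. The function name is our own; the `(left, top, right, bottom)` order matches the box convention used by common imaging libraries such as Pillow.

```python
def center_crop_box(src_w, src_h, target_w, target_h):
    """Return (left, top, right, bottom) of the largest centred box
    that has the target aspect ratio and fits inside the source image."""
    target_ratio = target_w / target_h
    if src_w / src_h > target_ratio:
        # Source is wider than the target: keep full height, trim the sides.
        crop_h = src_h
        crop_w = round(src_h * target_ratio)
    else:
        # Source is taller than (or equal to) the target: keep full width.
        crop_w = src_w
        crop_h = round(src_w / target_ratio)
    left = (src_w - crop_w) // 2
    top = (src_h - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)


og_box = center_crop_box(1536, 1024, 1200, 630)
print(og_box)  # (0, 109, 1536, 915)
```

After cropping, the 1536x806 region would be resized down to 1200x630. Any text overlay needs to sit inside that region, which is worth stating in the prompt.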
Below is a comprehensive feature comparison table for GPT-image-1 (OpenAI), Google’s Imagen 3.0-generate-002, and Midjourney, focusing on API access, aspect ratio options, file output control, and other features we feel are relevant when adding AI Image generation into your workflows.
AI Image Generator Feature Comparison
| Feature / Model | GPT-image-1 (OpenAI) | Imagen 3.0-generate-002 (Google) | Midjourney |
| --- | --- | --- | --- |
| API Access | Yes. Official API with robust docs and key-based authentication | Yes. Official API via Gemini/Vertex AI, API key required | No official public API from Midjourney, but third-party APIs (e.g., GoAPI) and Discord bot integration exist |
| Aspect Ratio Controls | Yes. Set via size parameter (e.g., "1024x1024", "1024x1536", "1536x1024") | Yes. Explicit aspect_ratio parameter. Supported: "1:1", "3:4", "4:3", "9:16", "16:9" | Yes. Use --ar parameter in prompts, supports a wide range (e.g., 1:1, 16:9, 9:16, 2:1, etc.) |
| File Output Options | Yes. Specify output format: PNG (default), JPEG, WebP; can set compression for JPEG/WebP; supports transparency | No direct file format parameter; outputs as binary image data (can save as PNG/JPEG via code) | Output format not directly settable via prompt; images are downloadable as PNG/JPEG from Discord/web interface |
| Image Resolution | Customisable: "1024x1024", "1024x1536", "1536x1024", "auto" | Fixed per aspect ratio; typical outputs are 1024px on the shortest side | Customisable via prompt parameters, but with model-dependent limits |
| Image Editing (Inpainting) | Yes. Supports image-to-image, inpainting, and editing via API | No direct inpainting/editing, but can generate new images from prompts | Yes. Can blend, remix, and vary images; advanced editing via prompt and Discord features |
| Text Rendering in Images | Improved, can reliably render text in images | Improved over previous models; can render text in images, but still has significant hallucinations | Supports text rendering in images (prompt-dependent), but less reliable than GPT-image-1/Imagen 3 |
| Safety & Content Filters | Integrated safety filters; configurable via API parameters | Configurable safety filter levels: BLOCK_LOW_AND_ABOVE, etc. | Discord and platform-level moderation; less granular control for developers |
| People Generation Control | Not explicitly documented | Yes. Can allow/block generation of people (adults only or none) | No explicit control; subject to Discord/community guidelines |
| Watermarking | Not explicitly documented | Yes. All images include an invisible SynthID watermark | No watermark by default, but public images are visible unless using stealth mode |
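To make the GPT-image-1 column concrete, here is a hedged Python sketch of how the size and output options from the table might be assembled into a request to OpenAI's image generation endpoint. It only builds the request and does not send it; the helper name and default values are our own, and the parameter names should be checked against the current API reference before use.

```python
import json
import urllib.request

OPENAI_IMAGES_URL = "https://api.openai.com/v1/images/generations"


def build_image_request(prompt, api_key, size="1536x1024",
                        output_format="webp", output_compression=80):
    """Assemble (but do not send) a GPT-image-1 generation request."""
    payload = {
        "model": "gpt-image-1",
        "prompt": prompt,
        "size": size,                    # e.g. "1024x1024", "1024x1536", "1536x1024"
        "output_format": output_format,  # "png", "jpeg" or "webp"
    }
    # Compression only applies to the lossy formats.
    if output_format in ("jpeg", "webp"):
        payload["output_compression"] = output_compression
    return urllib.request.Request(
        OPENAI_IMAGES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "sk-placeholder" is a dummy key; a real key would come from a secret store.
request = build_image_request(
    "Cover image with the headline 'Text in images matters'", "sk-placeholder"
)
body = json.loads(request.data)
print(body)
```

In Make or n8n the same fields would go into an HTTP module rather than Python, but the payload is the same shape.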
Why API first image generation matters
At The Media CTO, we have tried many image generation platforms, many of them on free trials. Most use the models in this article under the hood to create images, so you end up paying for wrapper features that do not always add value, particularly if you are a small to medium media business looking to automate and optimise. You will end up paying for a promise.
Activating image generation inside automation platforms like Make, n8n or even inside Google Sheets, if that's your jam, will save you a lot of time and money and significantly impact engagement.
Accessing these models directly using an API is the only smart move.
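Going direct also means handling the response yourself, which is less work than it sounds. As a minimal Python sketch, assuming the API returns each image as a base64 string under `data[].b64_json` (the shape we understand GPT-image-1 to use), the images can be decoded like this. The function name and the sample response are made up for illustration.

```python
import base64


def extract_images(response_json):
    """Decode every base64-encoded image in an image-generation response."""
    return [
        base64.b64decode(item["b64_json"])
        for item in response_json.get("data", [])
        if "b64_json" in item
    ]


# A fake response standing in for the real API reply.
sample_response = {
    "data": [{"b64_json": base64.b64encode(b"not a real image").decode("ascii")}]
}

images = extract_images(sample_response)
print(len(images), "image(s) decoded")
```

Each item in `images` is raw bytes that can be written to a file or passed to the next step of the automation.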
Further Reading
More information on the transformative impact of GPTs and AI image generation in event and marketing strategies can be explored in the following content. These articles cover customizable GPTs for content and engagement, GPT integration with spreadsheets to streamline AI image workflows, and the broader impact of AI-powered image generation on event engagement and marketing.
- How Custom GPTs and GPT-image-1 Are Reshaping Event and Marketing Strategies
  Explores how customizable GPTs complement GPT-image-1 by enabling AI-driven content, marketing, and engagement tools across event management and B2B workflows.
- Streamlining Event Marketing: Using GPTs in Spreadsheets to Enhance AI Image Workflows
  Shows how GPT integration in Google Sheets can complement tools like GPT-image-1 by automating text prompts and campaign content for image generation.
- How AI-Powered Image Generation and GPTs Are Transforming the Events Industry
  Explores how AI tools like GPT-image-1 enhance event engagement and marketing through personalized visuals, content automation, and real-time interaction.