Why We Stopped Embedding YouTube Videos. And Built Something Better.
If you run a content site and you publish video alongside your written articles, the default move is to drop in a YouTube embed. Job done. But the more we thought about it, the more that started to feel like the lazy answer rather than the right one. It also felt at odds with the experiment that Digitising Events represents: building an AI-enabled publishing business.
Here is the problem with YouTube embeds.
The first issue is performance. A YouTube iframe does not wait for your reader to show any interest in the video before it starts loading a significant chunk of third-party JavaScript, tracking scripts, and fonts. All of that lands on your page the moment it loads, regardless of whether anyone scrolls near the player. For a site that takes page speed seriously, that is a meaningful cost.
The second issue is control. You get YouTube's player, YouTube's end-card recommendations, and YouTube's branding. Some of that can be suppressed with URL parameters, but not all of it, and the recommendations in particular are actively working against you, pulling your reader off your page and onto whatever YouTube decides to serve next.
The third issue is the conversion logic. If your video lives on YouTube, you actually want the view to happen on YouTube. For the algorithm, for subscriber growth, for watch time. An embed that keeps someone on your page is working against that goal. You’re not converting them to a YouTube viewer; you are just giving them a slightly worse watching experience in a box on your page.
It's about maximising engagement
All the commercial and programmatic solutions to the speed issue are effectively lightweight facsimiles of the iframe embed: they fix load time, but they are not geared towards engagement.
I had a kernel of an idea for a better solution, so I started a design chat with Claude about what a better answer might look like. The chat took 81 minutes, and we got to the prototype you see below; that, in itself, is a good story.
The idea we landed on was this: what if instead of embedding the video, you cracked it open? We already have chunked transcript segments stored for every video we produce.
Feed those segments to an AI, extract the four key themes the video actually covers, and generate a hook and a call to action for each one. Then create a background image per theme, also using AI, and assemble the whole thing into a carousel, where every card links directly to the exact timestamp in the video where that theme begins, joining up the reader's journey.
Instead of asking a reader to commit to a whole video, you are showing them the specific moments most relevant to them and letting them jump straight there.
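As a rough sketch of that idea (all names here are hypothetical, not the prototype's actual code): each extracted theme becomes a card carrying a hook, a CTA, and a deep link built from the segment's start time using YouTube's standard `t` URL parameter.

```python
from dataclasses import dataclass

@dataclass
class ThemeCard:
    """One carousel card: a theme extracted from the transcript."""
    title: str
    hook: str
    cta: str
    start_seconds: int  # where this theme begins in the video

def deep_link(video_id: str, start_seconds: int) -> str:
    """Build a YouTube watch URL that jumps straight to the theme."""
    return f"https://www.youtube.com/watch?v={video_id}&t={start_seconds}s"

# Illustrative card, not real pipeline output.
card = ThemeCard(
    title="Why embeds are slow",
    hook="Your page pays for the player before anyone presses play.",
    cta="Watch this section",
    start_seconds=124,
)
print(deep_link("VIDEO_ID", card.start_seconds))
# → https://www.youtube.com/watch?v=VIDEO_ID&t=124s
```

The important property is that the click converts the reader into a YouTube viewer at exactly the moment that interested them, rather than asking them to watch from the start.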
We ran through some prompts quickly to think through the image generation side. For a component like this you need consistency across all four cards; they have to feel like a set. You also need compositional control, because text sits over these images, so you cannot have faces or busy centred elements eating into your overlay space.
We settled on Flux 2 Pro with a locked style preamble applied to every generation, and tested at 1024 × 576, the right weight for a web asset without being wasteful on file size. As a personal test I also wanted to see how quickly we could get to a proof of concept, so I was not too worried about whether this was the right model for now.
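A minimal sketch of how a locked preamble keeps the four generations consistent. The preamble wording and payload shape below are illustrative assumptions, not our production prompt or fal.ai's exact request format:

```python
# Assumed style preamble, for illustration only.
STYLE_PREAMBLE = (
    "Flat editorial illustration, muted palette, soft gradients, "
    "no faces, no text, keep the centre of the frame uncluttered"
)

def image_request(theme_title: str) -> dict:
    """Build one image-generation payload with the locked style applied.

    The same preamble is prepended to every theme, so the four images
    come back as a visual set rather than four unrelated pictures.
    """
    return {
        "prompt": f"{STYLE_PREAMBLE}. Theme: {theme_title}",
        "image_size": {"width": 1024, "height": 576},  # 16:9 web asset
        "num_images": 1,
    }

# In the prototype this payload goes to fal.ai's Flux endpoint;
# the client call itself is omitted here.
```

The design choice is simply that style lives in one constant, not in four hand-edited prompts, so the set cannot drift apart between generations.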
The segment data came straight out of BigQuery. The theme extraction, hooks, and CTAs were generated from the raw transcript. The images went through fal.ai. And the carousel was built as a self-contained HTML file that opens locally in a browser with no dependencies.
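The carousel assembly step can be sketched like this. The markup and field names are hypothetical, but the key property matches the prototype: the output is a single HTML string with no external dependencies, and every card is an anchor to a timestamped YouTube URL.

```python
import html

def build_carousel(video_id: str, cards: list[dict]) -> str:
    """Render theme cards as a self-contained HTML snippet.

    Each card dict needs 'title', 'hook', 'cta', 'start_seconds',
    and 'image_url'. Markup is illustrative, not the real template.
    """
    items = []
    for c in cards:
        url = f"https://www.youtube.com/watch?v={video_id}&t={c['start_seconds']}s"
        items.append(
            f'<a class="card" href="{url}" '
            f'style="background-image:url({html.escape(c["image_url"])})">'
            f'<h3>{html.escape(c["title"])}</h3>'
            f'<p>{html.escape(c["hook"])}</p>'
            f'<span>{html.escape(c["cta"])}</span>'
            f'</a>'
        )
    return f'<div class="carousel">{"".join(items)}</div>'
```

Because the result is one string of plain HTML, it can be saved to a file and opened locally in a browser, which is what made the 81-minute prototype loop possible.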
81 minutes to a working prototype
That in itself is pretty cool. The whole pipeline from transcript data to something you can actually look at and interact with took a single session. The automation chain that would make this run on every new video to be embedded in an article is a separate conversation.
But the proof of concept answers the most important question first: does the output actually look good and does the idea hold up? We think it does.
And the other test is: can AI execute its assigned roles well enough in a first pass? That gives me a good indication of how much effort would need to go into enabling autonomous execution, and where that effort would need to go.
The plan now is to iterate on the visual design and decide on the template, then execute it on a number of different articles and monitor the click-through and interactions on these widgets before doing any heavy automation.
It may well be that this sits well within Claude Desktop, and all we need are a few MCP calls to get images generated and stored. As ever, this is all part of us chipping away at building a human-centred but AI-enabled media business. You can follow along on this site.