The easiest way to show your newsroom is actually thinking about AI isn’t the 50th summarisation tool, chatbot article, or headline generator. It’s automating the jobs people aren’t good at and don’t really want to do in the first place.
Alt text is a perfect example.
Most images on your site need it, hardly anyone writes it consistently, and most editors would rather not spend their day doing it. But with AI and LLMs, you can fix that in minutes.
Try the demo below. It’ll generate alt text for any image you upload. Then we’ll dive into why it works, what makes good alt text for journalism, and what we learned building this at the Financial Times.
Try it out below
What and why
Alt text is the short line of text a screen reader uses to describe an image to someone who can’t see it. It’s one of those things that every website should consider, but few newsrooms really give consistent attention to. Most alt text is either empty, badly duplicated, or overly descriptive, and that inconsistency means people relying on screen readers often have a worse experience on your site.
Take a look at these examples from the BBC and the New York Times. Each homepage has been marked up with this accessibility tool, which reveals the alt text attached to every image.
In this example, The New York Times has videos for both of its leading stories, which have no text representation at all. However, the small image for the daily headlines podcast has a detailed description of the Supreme Court. Columnist headshots have their names associated with them, and images of iPhone homescreens have alt text describing just that. Towards the bottom of the page two images are marked as ‘decorative’, which we’ll get on to later.
By contrast, the BBC’s page is scattered with images, all of which have text. Lengths vary, from “Dame Jilly Cooper laughing” or “Lecornu and Macron” to a detailed description of a composite image about the US-hosted World Cup. Charlotte Church’s image describes her outfit, the event and the location, whereas the image to the left of her simply reads “The Strictly contestants”.
Imagine you couldn’t see those pages and the text was all you had to go on. Would the information the images add be sufficient? Would it actually be too much? Think about the hierarchy: would hearing the alt text before the headline be a help or a hindrance in getting where you need to go?
Alt text is really hard to get right. There are a lot of guides online, but they generally converge on the same few principles, and they tend to cover only ‘functional’ images: images that are there to convey information. In reality, though, most images on a news site are there to add context, to break up text, or to add some visual interest. They aren’t always strictly necessary to understand the article, nor do they necessarily add anything to the story that isn’t in the text already.
A homepage is a great example of this contrast. The images are accompanied by headlines and text that convey the information you need to take the next action: clicking through to the story, where you will more than likely see the image again. In this context the image is typically decorative, and by definition decorative images don’t need alt text.
Some images, however, are crucial to comprehension. In articles the image often holds a lot of weight: photos of historic moments, the scale of damage after a natural disaster, or the emotion captured in a person’s face all add something important to the story. The term photojournalism exists for a reason, and in many cases striking photography is as much a part of the experience as the writing itself.
How we built it at the FT
When we built this at the Financial Times, we made it part of the CMS’s native image component. When an editor uploads an image, they can click a button to generate alt text using the AI model. The generated text is then populated into the alt text field, where the editor can review and edit it as needed before saving.
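The generation itself is a single call to a vision-capable model. Our production integration isn’t reproduced here, but here’s a minimal sketch of its shape, assuming the OpenAI Node SDK (the function name, model choice and `agencyCaption` parameter are illustrative):

```ts
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// The full system prompt is reproduced later in this piece.
const SYSTEM_PROMPT = "You are an AI alt text generator. ...";

// Hypothetical handler behind the CMS's "generate alt text" button.
async function generateAltText(imageUrl: string, agencyCaption: string): Promise<string> {
  const response = await client.chat.completions.create({
    model: "gpt-4o", // any vision-capable model works here
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      {
        role: "user",
        content: [
          // Passing the agency caption through alongside the image
          // significantly helps the model identify people and places.
          { type: "text", text: `Agency caption: ${agencyCaption}` },
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
  });

  // The prompt asks for the answer wrapped in <alt> tags; strip them
  // before populating the field for the editor to review.
  const raw = response.choices[0].message.content ?? "";
  return raw.replace(/<\/?alt>/g, "").trim();
}
```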

We went further by encouraging the editor to check a box confirming they had reviewed the text before saving. If they didn’t, the text would stay highlighted in orange, drawing attention from other editors until it was approved.
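In data terms that’s nothing exotic: a couple of flags on the image component. A rough sketch of the state involved (the field names here are hypothetical):

```ts
// Hypothetical shape of the alt text state on a CMS image component.
interface AltTextState {
  text: string;         // the alt text, generated or hand-written
  aiGenerated: boolean; // true if the text came from the model
  reviewed: boolean;    // true once an editor ticks the confirmation box
}

// AI-generated text nobody has signed off stays highlighted in orange,
// so other editors can spot it before it goes out.
function needsReview(state: AltTextState): boolean {
  return state.aiGenerated && !state.reviewed;
}
```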
Our prompt took some work to get right: a series of tests with different models and prompt iterations were needed to reach a point where the text was consistently good enough to publish with minimal edits. We also limited the tool to images that came with photo agency captions and metadata, as we found that passing this information through significantly aided identification.
Example prompt used above:

> You are an AI alt text generator. For each image, write a concise, objective description that conveys the essential visual information to someone who cannot see it. Focus on key elements, context, and purpose. Avoid subjective judgements, assumptions, or detail not critical to comprehension. Be clear, accurate, and specific. Only name people in the photo if you are certain of their name and they are a famous individual; otherwise describe people gently, without making judgements about protected characteristics. If the image is a chart, discuss the axes, important values, and the general trend. Do not discuss visual information that would not make sense to someone who cannot see the chart, such as the colour of the trend line or the colour of the bars. Always return your response in <alt> tags only, without any additional commentary or punctuation. You MUST keep your response to two sentences, or around 150 characters, except in exceptional circumstances where more is necessary, such as a chart: for charts, you may respond with up to four sentences, and your character count is unlimited.
The output can be a little generic, and that’s fine. The goal isn’t to replace editors but to help them start from something consistent. Adding a human in the loop ensures the final alt text fits your newsroom’s tone and context.
A few things we learned
- Decide whether the image is decorative. If it doesn’t add essential meaning, use an empty alt="" (there’s a markup sketch after this list).
- Describe what’s essential, not everything. Focus on who or what’s in the image and why it matters to the story. If the person is notable, name them. Describe what they are wearing or doing only if it’s relevant.
- Keep it short. Usually under 150 characters, though for complex images and charts you can afford to flex this a bit.
- Don’t duplicate captions. Screen readers will read both. You can use aria-hidden on captions in this case.
- Punctuate like a sentence. Skip “Image of…” intros, unless the medium matters, as with ‘Screenshot of’ or ‘Cartoon of’.
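Two of those points, the empty alt on decorative images and aria-hidden on captions, are easiest to see in markup. A rough sketch, with hypothetical helper names:

```ts
// A decorative homepage image: the headline beside it carries the
// information, so an empty alt tells screen readers to skip the image.
function decorativeImage(src: string): string {
  return `<img src="${src}" alt="">`;
}

// An article image with alt text and a visible caption. aria-hidden
// stops screen readers announcing the caption on top of the alt text.
function articleImage(src: string, alt: string, caption: string): string {
  return `
    <figure>
      <img src="${src}" alt="${alt}">
      <figcaption aria-hidden="true">${caption}</figcaption>
    </figure>`;
}
```

Note that omitting the alt attribute entirely is not the same as an empty one: without it, some screen readers fall back to reading out the image filename.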
It’s a simple workflow, but it adds up fast: accessibility audit scores improved, editors saved time, and, most importantly, we served readers who use assistive tech much better.
And it’s the kind of thing any media organisation can do this quarter. You don’t need a data science team or a complex pipeline, just a text-generation API call, some sensible defaults, and someone to review the results. Let me know if you give it a go.
Aside
I’m certain that in the future this will be something local AI models provide anyway. Google’s Gemini Nano is getting good enough at this kind of thing, and small enough, that integration with Chrome could enable it for images on any website you visit. However! A local model may not have all the information your organisation has available when the image is added, so if you want to provide the best experience possible, keeping control over what you serve is still valuable.