AI Agents | Accessibility | Web Design

Image-Heavy Sites and the Information Gap for AI Agents

Agent Checker | 19 January 2026 | 4 min read

A luxury furniture retailer we audited had beautiful product pages. Gorgeous photography, lifestyle images showing pieces in styled rooms, material swatches as high-resolution photos. The pages looked stunning. But when we sent an AI agent to find "a walnut dining table for six people that costs under £2,000," the agent came back empty-handed. Every product matched the criteria, but the agent couldn't tell.

Where the Information Hides

The product dimensions were embedded in an infographic image. The wood type was shown as a colour swatch with no alt text. The seating capacity was mentioned only in a lifestyle photo caption rendered as part of the image. The price appeared in a stylised banner graphic. None of this information existed as text in the HTML.

This isn't an edge case. We analysed 100 product pages across fashion, furniture, and electronics retailers. On average, 31% of product-relevant information existed only in images. How often each type of information was image-only:

Size/dimension info in images: 28% of sites
Price or pricing tiers in banner graphics: 15% of sites
Material/ingredient details in infographics: 22% of sites
Promotional offers only in hero banners: 44% of sites
Colour names only as visual swatches: 37% of sites

Vision Models Help, But Not Enough

Modern AI agents increasingly use vision models to interpret images. GPT-4o, Claude, and Gemini can all describe what they see in a photo. So doesn't this solve the problem?

Partially. Multimodal agents increasingly see websites the way humans do, and a vision model can identify that a table is made of dark wood and has six chairs around it, but it struggles with precision. "Dark wood" might mean walnut, mahogany, or stained oak. The model might estimate dimensions, but "roughly 180cm long" is different from the actual specification of 185cm. And reading text from stylised banner images is unreliable: decorative fonts, unusual colours, and busy backgrounds all reduce accuracy.

There's also the cost and speed issue. Sending every image on a product page through a vision model takes time and API credits. A shopping agent comparing 50 products across five sites would need to process hundreds of images. That's slow and expensive compared to reading structured text data.
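
To put rough numbers on that, here is a back-of-envelope sketch in Python. The images-per-product count and the per-image latency are illustrative assumptions, not figures from our audit, so substitute your own.

```python
def vision_workload(products: int, images_per_product: int, seconds_per_image: float):
    """Return (total images, total minutes) if every image goes through a vision model one at a time."""
    total_images = products * images_per_product
    total_minutes = total_images * seconds_per_image / 60
    return total_images, total_minutes


# Hypothetical inputs: 50 products, 6 images each, 2 seconds per vision call.
images, minutes = vision_workload(products=50, images_per_product=6, seconds_per_image=2.0)
print(f"{images} images, about {minutes:.0f} minutes of sequential vision calls")
# 300 images, about 10 minutes of sequential vision calls
```

Parallel requests would shorten the wall-clock time, but the number of calls, and so the bill, stays the same.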

Alt Text: Doing the Bare Minimum (Badly)

Alt text should bridge this gap. In practice, it rarely does. We checked alt text quality across 500 product images:

Missing alt text entirely: 23% of images
Generic alt text ("product image," "banner," "photo"): 34%
Filename as alt text ("DSC_0042.jpg," "hero-v3-final.png"): 11%
Decent descriptive alt text: 22%
Detailed, information-rich alt text: 10%

Only that last 10% actually helps an AI agent understand what the image contains. And even "decent" alt text often misses the key information. An alt text of "walnut dining table" is good for accessibility but doesn't include the dimensions, price, or seating capacity that a shopping agent needs.
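
If you want to run a similar check on your own pages, the bucketing can be approximated in a few lines of Python. This is a minimal sketch, not the tooling we used for the audit: the generic-phrase list, the filename pattern, and the "contains a digit" test for information-rich alt text are all simplifying assumptions.

```python
import re
from html.parser import HTMLParser

# Assumed list of generic phrases; extend it for your own content.
GENERIC_ALTS = {"product image", "banner", "photo", "image", "picture"}
FILENAME_RE = re.compile(r"^\S+\.(jpe?g|png|gif|webp|svg)$", re.IGNORECASE)


def classify_alt(alt):
    """Bucket one alt value; None means the attribute is absent."""
    # Note: an empty alt is valid for purely decorative images,
    # but it still tells an agent nothing, so it counts as missing here.
    if alt is None or not alt.strip():
        return "missing"
    text = alt.strip()
    if FILENAME_RE.match(text):
        return "filename"
    if text.lower() in GENERIC_ALTS:
        return "generic"
    # Digits are a crude proxy for specifications such as dimensions or capacity.
    if re.search(r"\d", text):
        return "information-rich"
    return "descriptive"


class AltAudit(HTMLParser):
    """Collects the classification of every <img> in the document."""

    def __init__(self):
        super().__init__()
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.results.append(classify_alt(dict(attrs).get("alt")))


audit = AltAudit()
audit.feed('<img src="a.jpg"><img src="b.jpg" alt="photo"><img src="c.jpg" alt="DSC_0042.jpg">')
print(audit.results)  # ['missing', 'generic', 'filename']
```

A heuristic like this only flags candidates. Deciding whether "decent" alt text carries the facts a shopper needs still takes a human look.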

The Promotional Content Black Hole

Sales and promotional information is the worst offender. "20% off all sofas this weekend" rendered as a hero banner image. "Free delivery on orders over £50" in a graphic footer. "Buy 2 get 1 free" as an overlay on product images. Agents miss all of this.

During a January sales audit, we found that 67% of promotional offers on the sites we tested were communicated exclusively through images. An AI agent helping a user find the best deals had no idea these offers existed. The agent recommended full-price items on sites running massive sales, simply because the sale information wasn't in the text.

Fixing the Gap

Structured data markup. Schema.org markup helps agents understand your products by specifying price, availability, dimensions, materials, and more in a machine-readable format. Agents can read this data directly from the page source, without rendering the page or interpreting any images. For a single product template it can take as little as an hour to implement, and it has SEO benefits too.
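
As an illustration, here is a minimal Python sketch that builds schema.org Product markup as JSON-LD for the walnut table. The function names, the price of 1850, the in-stock availability, and the choice to model seating capacity as a generic PropertyValue are all assumptions made for the example. Which dimension counts as width and which as depth is also our guess.

```python
import json


def product_jsonld(name, material, width_cm, depth_cm, seats, price, currency):
    """Build a schema.org Product description as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "material": material,
        # CMT is the UN/CEFACT unit code for centimetres.
        "width": {"@type": "QuantitativeValue", "value": width_cm, "unitCode": "CMT"},
        "depth": {"@type": "QuantitativeValue", "value": depth_cm, "unitCode": "CMT"},
        # Assumption: seating capacity modelled as a generic PropertyValue.
        "additionalProperty": {"@type": "PropertyValue", "name": "Seating capacity", "value": seats},
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": "https://schema.org/InStock",  # hypothetical stock status
        },
    }
    return json.dumps(data, indent=2)


def jsonld_script_tag(jsonld):
    """Wrap JSON-LD in a script tag, escaping '</' so a value cannot close the tag early."""
    return '<script type="application/ld+json">\n' + jsonld.replace("</", "<\\/") + "\n</script>"


# Hypothetical product record; only the dimensions and seating come from the example in this article.
markup = product_jsonld("Walnut dining table", "Walnut", 185, 90, 6, 1850, "GBP")
print(jsonld_script_tag(markup))
```

The resulting script tag goes into the product page template, so an agent can read every field with an ordinary JSON parser.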

Meaningful alt text. Write alt text that contains the information the image conveys, not just a description of what the image looks like. Instead of "dining table in modern kitchen," try "Walnut dining table, 185cm x 90cm, seats 6, shown in modern kitchen setting."
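
If the specifications already live in a product database, alt text like this can be generated rather than hand-written. The following Python sketch assumes hypothetical field names and a hypothetical image filename. It shows one way to compose the string, not a prescribed format.

```python
from html import escape


def build_alt_text(material, product, length_cm, width_cm, seats, setting=""):
    """Compose alt text from product facts, with an optional note on the scene."""
    parts = [
        f"{material.capitalize()} {product}",
        f"{length_cm}cm x {width_cm}cm",
        f"seats {seats}",
    ]
    if setting:
        parts.append(f"shown in {setting}")
    return ", ".join(parts)


def img_tag(src, alt):
    """Render an <img> tag, escaping quotes so the alt text cannot break the attribute."""
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}">'


alt = build_alt_text("walnut", "dining table", 185, 90, 6, "modern kitchen setting")
print(alt)
# Walnut dining table, 185cm x 90cm, seats 6, shown in modern kitchen setting
print(img_tag("walnut-table.jpg", alt))
```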

Text equivalents for promotions. If a banner says "20% off sofas," put that text in the HTML as well, even if it's visually hidden with an sr-only CSS class. Screen readers need it too. So do agents.
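
A template helper can make the text equivalent hard to forget. This Python sketch is one possible shape: the function name, the promo-banner class, and the image filename are hypothetical, and it assumes your stylesheet already defines an sr-only class.

```python
from html import escape


def promo_banner(image_src, offer_text):
    """Render a banner image together with the same offer as real text in the HTML."""
    return (
        '<div class="promo-banner">\n'
        # alt is left empty so screen readers do not announce the offer twice.
        f'  <img src="{escape(image_src, quote=True)}" alt="">\n'
        f'  <p class="sr-only">{escape(offer_text)}</p>\n'
        "</div>"
    )


print(promo_banner("sofa-sale.jpg", "20% off all sofas this weekend"))
```

Because the offer text is a required argument, a banner cannot be published without its machine-readable counterpart.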

Product data in HTML, not just images. Every specification that matters for purchasing decisions should exist as text in the page HTML or in structured data. Images should illustrate and enhance; they should not be the only source of factual product information.
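
One way to test for this is to check whether the facts a buyer needs appear anywhere a text-only agent can read them. The Python sketch below is a rough check under simple assumptions: it treats text nodes, alt attributes, and script contents (which would include any JSON-LD) as readable, and it matches facts by plain substring. The sample page and fact list are made up for the example.

```python
from html.parser import HTMLParser


class MachineReadableText(HTMLParser):
    """Gathers text nodes and alt attributes, roughly what a text-only agent can read."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._in_style = 0  # depth inside <style>, whose text is not content

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self._in_style += 1
        alt = dict(attrs).get("alt")
        if alt:
            self.chunks.append(alt)

    def handle_endtag(self, tag):
        if tag == "style" and self._in_style:
            self._in_style -= 1

    def handle_data(self, data):
        if not self._in_style:
            self.chunks.append(data)


def missing_facts(html, facts):
    """Return the facts that do not appear in any machine-readable text on the page."""
    parser = MachineReadableText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return [fact for fact in facts if fact.lower() not in text]


page = '<h1>Dining table</h1><img src="specs.png" alt="product image"><p>Solid walnut</p>'
print(missing_facts(page, ["walnut", "185cm", "seats 6"]))  # ['185cm', 'seats 6']
```

Anything the function returns is a fact that exists, at best, only inside an image.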

The accessibility overlap here is significant. Everything that helps AI agents read your product data also helps screen reader users, search engines, and anyone on a slow connection where images don't load. Building for machines and building for accessibility are, in this case, exactly the same work.