The Daily Barcelona

Barcelona news, every day


Barcelona's Digital Archive Push Hits a Snag: The Duplicate Image Problem That Won't Go Away

City institutions scrambling to clean up thousands of redundant photographs as a major digitisation drive exposes the scale of Barcelona's image data chaos.

By Barcelona News Desk · Published 4 July 2026, 8:47 pm


Photo: Liza Bakay on Pexels

Barcelona's effort to build a unified digital image archive has run into a stubborn technical obstacle this week: duplicate photographs. The Arxiu Municipal de Barcelona, which manages the city's official photographic collections spread across holdings in the Eixample and the Gothic Quarter, confirmed it is working through a backlog of tens of thousands of redundant image files generated during its ongoing digitisation campaign. Those files clog servers, slow search tools and, in some cases, confuse the public record.

The problem matters now because July marks the midpoint of the 2025–2026 municipal digitisation programme, a multi-phase effort tied to Barcelona's broader smart-city strategy and partly funded through the Diputació de Barcelona's cultural infrastructure budget. With the second phase due for review before September, the archive needs clean, deduplicated data before migrating to the new shared platform that the Consorci de Serveis Universitaris de Catalunya is also set to use for its own photographic holdings.

What Went Wrong — and Where

The root of the problem, according to technical documentation circulated among archive staff and reviewed by this reporter, is straightforward: multiple scanning teams working simultaneously at different municipal centres — including the Palau de la Virreina on La Rambla and the Centre de Documentació del Museu d'Història de Barcelona at Plaça del Rei — used incompatible file-naming conventions. A single glass-plate negative from the early twentieth century might now exist as three or four separate TIFF files with different checksums but identical visual content. Automated detection scripts initially flagged around 12 percent of ingested files as probable duplicates, though staff say the true figure is likely higher once near-duplicate variants — slightly different crops or contrast adjustments of the same original — are counted.
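Why checksum matching alone misses such cases can be shown with a minimal Python sketch. It is not the archive's actual script: the file names and byte contents below are hypothetical, invented purely for illustration. Two "scans" carry the same pixel payload but different header bytes, so their checksums differ and only the byte-for-byte copy is caught.

```python
import hashlib
from collections import defaultdict


def group_by_checksum(files):
    """Group file names by the SHA-256 digest of their raw bytes.

    Returns only the groups that contain more than one file,
    i.e. the exact (byte-for-byte) duplicates.
    """
    groups = defaultdict(list)
    for name, data in files.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical data: identical pixel payload, different header bytes,
# standing in for the same negative scanned by two different teams.
pixels = bytes(range(256)) * 4
files = {
    "virreina_0001.tif": b"HDR:teamA" + pixels,
    "muhba_plate_17.tif": b"HDR:teamB" + pixels,
    "virreina_0001_copy.tif": b"HDR:teamA" + pixels,
}

exact_duplicates = group_by_checksum(files)
# Only the byte-identical copy is grouped; the second team's scan is missed.
print(exact_duplicates)
```

A single differing byte changes the whole digest, which is why files with identical visual content can slip past this kind of check.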

The wider context is a city that has expanded its digital collections aggressively. The Arxiu Fotogràfic de Barcelona, housed since 2007 in the former Convent de Sant Agustí in El Born, holds more than four million images. A meaningful share of those were digitised under previous programmes that used different metadata standards. Merging those legacy files with the new pipeline is where duplication compounds fastest.

Practical Steps and the Road to September

The archive's technical team began deploying an open-source perceptual hashing tool earlier this week. Such software compares images by visual similarity rather than by exact file contents or size. The Generalitat de Catalunya's own digital services unit, based at Carrer de Calàbria in the Esquerra de l'Eixample, has offered support through its existing data-quality framework, which it developed during a 2024 review of government document management across Catalonia.
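The tool the archive is using has not been named. As a rough illustration of the idea only, the sketch below implements average hashing, one simple perceptual hashing technique, in plain Python. The test images, the 8x8 grid and the distance threshold are all assumptions chosen for the example, and the function expects a grayscale image at least 8 pixels on each side.

```python
def average_hash(gray, size=8):
    """Return a (size*size)-bit average hash of a grayscale image.

    `gray` is a list of rows of 0-255 integers and must be at least
    `size` pixels in each dimension.
    """
    height, width = len(gray), len(gray[0])
    cells = []
    # Shrink the image to a size x size grid by averaging each block.
    for i in range(size):
        r0, r1 = i * height // size, (i + 1) * height // size
        for j in range(size):
            c0, c1 = j * width // size, (j + 1) * width // size
            block = [gray[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Hypothetical 32x32 "scan": a diagonal gradient.
original = [[4 * r + 4 * c for c in range(32)] for r in range(32)]
# The same picture after a small brightness adjustment.
brighter = [[v + 5 for v in row] for row in original]
# A visually different picture: the tonal inverse.
inverted = [[255 - v for v in row] for row in original]

MAX_DISTANCE = 5  # assumed threshold for calling two images near-duplicates

d_near = hamming_distance(average_hash(original), average_hash(brighter))
d_far = hamming_distance(average_hash(original), average_hash(inverted))
print(d_near <= MAX_DISTANCE, d_far <= MAX_DISTANCE)
```

Because the hash depends on each region's brightness relative to the image's own mean, a uniform brightness shift leaves it unchanged, while the pixel data, and therefore any checksum, is different. Crops and stronger edits move the hash further, so the threshold is a tuning decision rather than a fixed rule.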

For the public, the practical effect has been intermittent. The Arxiu Municipal's online search portal, which receives an average of several thousand queries per month from researchers, journalists and genealogists, has been returning duplicate results for certain search terms since late June. The technical team says the deduplication pass should clear most of those errors by mid-July, ahead of a public consultation on the archive's new interface scheduled for 22 July at the Palau de la Virreina.

Professionals who depend on the collections — documentary photographers, local historians and architects researching Barcelona's building stock — have been advised to cross-reference any image retrieved from the portal against the physical catalogue reference number stamped in the metadata field, which remains unique even when visual duplicates exist. The Associació d'Arxivers-Gestors de Documents de Catalunya has been distributing guidance through its member network on how to flag suspected duplicates directly to archive staff.

The September review deadline is firm. If the deduplication process is not complete, the city risks carrying the same redundancy problem into the new shared platform — multiplying storage costs and defeating the purpose of the entire migration. Municipal archive managers have asked for an additional allocation from the current programme budget to cover the extra processing time, though no figure has been formally approved as of Friday. The outcome will be watched closely by other Catalan municipal archives, several of which are planning similar digitisation drives and will almost certainly face the same problem.


About this article

Published by The Daily Barcelona

This article was produced by The Daily Barcelona editorial desk and covers news in Barcelona. See our editorial standards for how we use AI.
