The Daily Barcelona

Barcelona news, every day

Barcelona's Municipal Archive Tackles Duplicate Image Crisis With New Scanning Protocol

The Arxiu Municipal de Barcelona launched an audit this week to identify and remove thousands of duplicate digital images clogging its public collections database.

By Barcelona News Desk · Published 4 July 2026, 9:16 pm

Photo: Pavlo Luchkovski on Pexels

The Arxiu Municipal de Barcelona confirmed this week that its ongoing digital cataloguing project had uncovered more than 14,000 duplicate image files across its publicly accessible photographic collections — a backlog that has slowed search response times and confused researchers relying on the archive for historical documentation of the city.

The issue surfaced during a broader digitisation push that the archive, based on Carrer de Santa Llúcia in the Gothic Quarter, began accelerating in early 2025. Staff running integrity checks on the database flagged the scale of the duplication problem in late June, triggering an emergency audit that ran through this week.

Why It Matters Now

The timing is not accidental. Barcelona's cultural institutions have been under mounting pressure to improve digital access since the city's 2024–2027 Digital Culture Plan, approved by the Ajuntament de Barcelona, set a target of making 80 percent of municipal archival holdings fully searchable online by the end of 2026. With six months left on that deadline, duplicate-image bloat in the database is a direct threat to hitting the benchmark.

The problem has practical consequences beyond bureaucratic targets. Researchers at the Institut de Cultura de Barcelona, which oversees the archive network, have reported delayed responses to image licensing requests, and licensing is a revenue stream the institute depends on to fund further digitisation work. Short turnaround times matter particularly to publishers and production companies working on Barcelona-themed projects, who typically need cleared images within 48 to 72 hours.

Duplicate records also distort the archive's metadata, meaning that searches for photographs of specific neighbourhoods — say, images from the Poblenou industrial waterfront or the Eixample grid taken before the 1992 Olympic construction boom — return cluttered results that bury unique images under near-identical copies scanned at different resolutions.

The Cleanup Operation

This week's audit involved a team working across two sites: the main archive building near the Catedral de Barcelona and a secondary digitisation facility on Carrer dels Almogàvers in Poblenou, which the Ajuntament repurposed for heritage scanning work in 2023. Staff used a combination of perceptual hashing software and manual verification to flag pairs or clusters of images representing the same original photograph.
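The archive has not said which perceptual hashing software or algorithm it uses. As a rough illustration of the general idea only, the sketch below implements one simple variant, an average hash, in plain Python. The 4x4 grayscale grids, the function names and the distance threshold are all assumptions made for this example. A real pipeline would first decode and downsample each image, and borderline matches would still go to a human reviewer, as the audit team did.

```python
from typing import List

# Grayscale pixel values (0-255) for an image that has already been
# downsampled to a tiny grid. Decoding and resizing are out of scope here.
Grid = List[List[int]]


def average_hash(pixels: Grid) -> int:
    """Return an integer whose bits mark pixels brighter than the grid mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def looks_like_duplicate(a: Grid, b: Grid, max_distance: int = 2) -> bool:
    """Flag a pair for review; max_distance is an illustrative threshold."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Hypothetical thumbnails, not real archive data.
scan_a = [
    [200, 190, 40, 30],
    [210, 180, 50, 20],
    [60, 70, 220, 230],
    [40, 50, 210, 240],
]
# The same picture rescanned slightly brighter.
scan_b = [[min(255, value + 5) for value in row] for row in scan_a]
# A different picture with the light and dark areas swapped.
scan_c = [
    [30, 40, 200, 210],
    [20, 50, 190, 220],
    [230, 220, 60, 50],
    [240, 210, 70, 40],
]

print(looks_like_duplicate(scan_a, scan_b))  # the rescan is flagged
print(looks_like_duplicate(scan_a, scan_c))  # the different picture is not
```

Because the hash depends on each pixel's brightness relative to the mean rather than on exact values, a uniformly brighter rescan produces the same bits, which is why this family of techniques can catch copies that a byte-for-byte comparison would miss.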

The duplication arose from several overlapping causes. Multiple digitisation contractors working on different phases of the project between 2019 and 2024 each uploaded their batches independently, with no cross-checking protocol in place at the time. Some photographs were also scanned by the Biblioteca de Catalunya under a separate cooperation agreement and then ingested into the municipal system without deduplication, effectively creating two or three copies of the same photograph at different resolutions.
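To make the multi-source problem concrete, the sketch below groups records from independent batches into clusters whenever their precomputed perceptual hashes are close, then picks one copy per cluster. Everything in it is hypothetical: the record fields, the batch labels, the hash values, the threshold, and in particular the rule of keeping the highest-resolution copy, since the archive has not said which version it retains.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ScanRecord:
    record_id: str  # hypothetical identifier
    source: str     # hypothetical batch label
    width: int
    height: int
    phash: int      # perceptual hash computed elsewhere


def cluster_duplicates(records: List[ScanRecord],
                       max_distance: int = 2) -> List[List[ScanRecord]]:
    """Group records whose hashes differ by at most max_distance bits."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Pairwise comparison: fine for a demo, too slow for a large catalogue.
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            distance = bin(records[i].phash ^ records[j].phash).count("1")
            if distance <= max_distance:
                parent[find(j)] = find(i)

    groups: Dict[int, List[ScanRecord]] = {}
    for index, record in enumerate(records):
        groups.setdefault(find(index), []).append(record)
    return list(groups.values())


def pick_master(cluster: List[ScanRecord]) -> ScanRecord:
    """Assumed policy: keep the copy with the most pixels."""
    return max(cluster, key=lambda r: r.width * r.height)


# Invented records standing in for uploads from separate batches.
records = [
    ScanRecord("A-001", "contractor-batch-1", 1200, 800, 0b1100110000110011),
    ScanRecord("B-417", "contractor-batch-2", 2400, 1600, 0b1100110000110111),
    ScanRecord("C-033", "library-ingest", 600, 400, 0b1100110000110011),
    ScanRecord("A-002", "contractor-batch-1", 1200, 800, 0b0011001111001100),
]

clusters = cluster_duplicates(records)
masters = [pick_master(cluster).record_id for cluster in clusters]
print(masters)
```

Here the first three records collapse into one cluster because their hashes differ by at most one bit, while the fourth stays on its own. The non-master copies would be candidates for manual verification rather than automatic deletion.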

The archive has not yet announced a final count of confirmed duplicates removed, but preliminary figures circulating internally as of Thursday put the number of resolved cases above 6,200, with work continuing through next week. The target is to clear the backlog before the Festa Major de Gràcia in mid-August, when the archive typically sees a spike in image requests from media outlets and festival organisers hunting for historical photographs of the neighbourhood.

For researchers and journalists who use the archive regularly, the practical advice from archive staff this week is to hold off on new bulk image requests until at least 14 July, when the cleaned database is expected to go live on the Arxiu en Línia platform. Single-image or small-batch requests are still being processed normally through the archive's standard contact form. Researchers with pending requests submitted before 30 June have been told their files are being prioritised and will be resolved before the new database goes public. The Ajuntament has not indicated whether the deduplication delay will affect the broader 2026 digital access target, but the institute is expected to publish a progress update later this month.

About this article

Published by The Daily Barcelona

This article was produced by The Daily Barcelona editorial desk and covers news in Barcelona. See our editorial standards for how we use AI.
