The Daily Barcelona

Barcelona news, every day

Barcelona's Urban Archive Tackles Duplicate Image Crisis Ahead of Major Digital Overhaul

The city's digitisation drive has surfaced thousands of redundant photographs clogging public databases, forcing a coordinated response this week from the Ajuntament and cultural institutions.

By Barcelona News Desk · Published 4 July 2026, 8:58 pm

Photo: Sergio Fdez on Pexels

Barcelona's municipal archive has launched an emergency deduplication review after an internal audit found that more than 40,000 duplicate image files had accumulated across the Arxiu Municipal de Barcelona's shared digital repositories — a problem that has quietly ballooned since the institution accelerated its scanning programme in 2024. The announcement came Tuesday, three days before the city's broader Smart City data governance framework enters its next review phase on 7 July.

The timing matters. Mayor Jaume Collboni's administration has staked significant political capital on a digitally transparent city hall, and the archive sits at the centre of that promise. Duplicate records don't merely waste server space; under the European Data Governance Act, which has applied since September 2023, public bodies are required to maintain verifiable, non-duplicated data sets when opening collections to researchers and commercial licensees. Getting caught with bloated, inconsistent archives before the July review would be an embarrassment the Ajuntament can ill afford.

Where the Problem Surfaced

The duplicates cluster around two specific digitisation pipelines. The first involves legacy photograph collections originally held at the Arxiu Fotogràfic de Barcelona, housed in the Palau de la Virreina on La Rambla, where analogue batches were scanned by two separate contractors between 2021 and 2023 without a unified metadata protocol. The second originates in the district-level upload portals used by the ten Barcelona districts, including Gràcia and Sant Martí, which fed neighbourhood documentation images into the central system with inconsistent file-naming conventions.

The Institut de Cultura de Barcelona, the municipal body that oversees the Palau de la Virreina and several other cultural venues, confirmed this week that it has engaged a Barcelona-based data management firm to run automated hash-comparison tools across the affected collections. Hash comparison computes a digital fingerprint from each image file's contents and flags files whose fingerprints match, regardless of what they are named. An exact hash catches only byte-identical copies; near-identical copies, such as the same print scanned twice, typically require perceptual hashing, which compares visual content rather than raw bytes. The process is expected to take three weeks and is scheduled to complete before the end of July.
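For readers curious how hash comparison works in practice, the following is a minimal Python sketch, not the tooling the contracted firm is using, which has not been disclosed. It assumes the images sit in an ordinary folder tree and uses SHA-256 as the fingerprint; the function names `sha256_of` and `find_duplicates` are illustrative. It groups files whose contents are byte-for-byte identical, whatever they are called, and does not attempt to catch near-identical copies.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under `root` that share a digest, i.e. have identical contents."""
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_hash[sha256_of(path)].append(path)
    # Only digests seen more than once represent duplicates.
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Calling `find_duplicates(Path("scans"))` on a hypothetical `scans` folder would return one list per set of identical files. Deciding which copy to keep would still be a matter for archivists.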

The Consorci de Biblioteques de Barcelona, which runs the city's 40 public libraries and maintains its own digital image catalogue, said Thursday that it had carried out a similar internal review in May and identified roughly 6,200 redundant files in its Biblioteca de Catalunya-linked shared portal — a smaller but still operationally significant figure. Those files have already been quarantined pending deletion authorisation from archivists.

What the Data Reveals About Barcelona's Digitisation Pace

The scale of the duplication is partly a function of ambition. Barcelona's municipal digitisation budget rose to €4.2 million in 2025, up from €2.8 million in 2022, as the city pushed to make historical urban photography openly accessible through the opendata.barcelona.cat portal. Scanning volumes roughly doubled over that three-year period. When throughput scales that quickly without standardised ingest protocols, duplication is a near-inevitable byproduct — a pattern that archivists at the Arxiu Nacional de Catalunya in Sant Cugat del Vallès have flagged in professional forums since at least 2023.

The duplication also has a cost dimension. Cloud storage contracts for the municipal archive are priced per terabyte, and preliminary estimates suggest the redundant files occupy approximately 2.3 terabytes of paid storage, a modest but recoverable cost at current municipal contract rates.

Researchers who regularly use the archive's public portal, including those in the Universitat Pompeu Fabra's history department at the Ciutadella campus, have noted that searches for historical Barcelona street photography, particularly images of the Eixample grid and the old Raval, have been returning duplicate entries for months, undermining confidence in search accuracy.

The practical next step for anyone who relies on the archive is to treat any image downloaded before 1 August as potentially one of several identical versions in circulation, and to re-verify file provenance once the deduplication process completes. The Arxiu Municipal has indicated it will publish a clean-slate index on opendata.barcelona.cat in early August, with each retained image assigned a stable persistent identifier. Cultural institutions and journalists working with the archive's photographic collections should hold off on major licensing requests until that index goes live.

About this article

Published by The Daily Barcelona

This article was produced by The Daily Barcelona editorial desk and covers news in Barcelona. See our editorial standards for how we use AI.
