The Daily Barcelona

Barcelona news, every day

Barcelona's Digital Archives Push Forward With Duplicate Image Purge This Week

City institutions and cultural bodies accelerated a long-delayed cleanup of redundant visual records, clearing thousands of duplicate images from public-facing databases.

By Barcelona News Desk · Published 4 July 2026, 9:51 pm

Photo: Maria Clara Diab on Pexels

Barcelona's main municipal digital archive, the Arxiu Municipal de Barcelona, completed the first phase of a system-wide duplicate image removal project on Friday, July 3, with technicians reporting the deletion of more than 14,000 redundant files from its publicly accessible photographic catalogue. The work, part of a broader digitisation push that began in late 2024, had stalled for months over data integrity concerns before resuming in earnest this spring.

The timing matters. The city's cultural institutions have been under pressure to modernise their digital infrastructure ahead of a wider metropolitan open-data initiative tied to the Smart City Expo World Congress, which returns to Fira de Barcelona in November 2026. Bloated databases filled with duplicate images slow down public search tools and inflate cloud storage costs — a problem that has grown as catalogues expanded rapidly during the pandemic years, when physical access to archives was restricted and digitisation accelerated without strict quality controls.

Which institutions are affected — and where

Beyond the Arxiu Municipal, two other organisations took concrete steps this week. The Museu Nacional d'Art de Catalunya, situated on Montjuïc, confirmed it had flagged approximately 3,200 duplicate entries in its online collection portal, with automated deduplication software now running against its high-resolution image repository. Staff at the museum indicated the process would carry into August. Meanwhile, the Institut de Cultura de Barcelona — which oversees venues ranging from the Palau de la Virreina on La Rambla to the Centre de Cultura Contemporània de Barcelona in the Raval neighbourhood — launched an internal audit of its press-image libraries, which had accumulated duplicates across multiple content management systems since a platform migration in 2022.

The practical trigger for this week's activity was a procurement deadline. The Ajuntament de Barcelona's contractual review cycle for its cloud storage providers closes on July 31, and internal assessments earlier this year found that duplicate and redundant files accounted for a disproportionate share of storage consumption across municipal cultural platforms. Reducing that footprint before the contract renewal was cited in planning documents as a direct cost-saving measure.

The numbers behind the cleanup

Storage costs for public institutions in Catalonia have risen alongside broader European cloud pricing trends. The Arxiu Municipal's digital holdings now exceed 2.3 million individual image files, a figure that has more than doubled since 2019. Industry benchmarks suggest duplicate content in rapidly digitised archives can represent between 8 and 15 percent of total file volume, meaning a full cleanup at the Arxiu alone could cut annual storage fees by tens of thousands of euros, depending on the final tier pricing negotiated this summer.
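To put the benchmark in concrete terms, the short Python sketch below applies the 8 to 15 percent range to the 2.3 million files cited above and compares it with the roughly 14,000 files removed in the first phase. It uses only figures quoted in this article. Because the holdings "exceed" 2.3 million and the removals are "more than" 14,000, the results are rough estimates, and no euro amount is computed since the per-file storage price has not been published.

```python
# Figures quoted in the article; both are lower bounds, so results are estimates.
TOTAL_FILES = 2_300_000          # holdings "exceed 2.3 million" image files
DUPLICATE_SHARE = (0.08, 0.15)   # industry benchmark range for duplicates
REMOVED_PHASE_ONE = 14_000       # "more than 14,000" files deleted in phase one


def estimated_duplicates(total: int, share: float) -> int:
    """Number of files that a given duplicate share would imply."""
    return round(total * share)


low, high = (estimated_duplicates(TOTAL_FILES, s) for s in DUPLICATE_SHARE)
phase_one_share = REMOVED_PHASE_ONE / TOTAL_FILES

print(f"Benchmark range: {low:,} to {high:,} files")
print(f"Phase one removed about {phase_one_share:.1%} of holdings")
```

On these figures, the first phase accounts for well under one percent of the catalogue, which suggests the bulk of any benchmark-level reduction would have to come from later phases.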

The deduplication process itself uses perceptual hashing — a technique that identifies visually identical or near-identical images even when file names or metadata differ — rather than simple file-size matching, which had produced unreliable results in earlier attempts. That earlier approach, tried in 2023, incorrectly flagged distinct photographs of similar subjects, including multiple archival shots of the same streets in the Eixample district taken years apart. Staff had to manually review and restore dozens of mislabelled files before the project was paused.
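For readers curious how perceptual hashing works in principle, here is a minimal Python sketch of one simple variant, the average hash. The article does not say which algorithm or software the Arxiu uses, so this is an illustration only: the function names, the toy 4x4 pixel grids, and the distance threshold are all assumptions, and a real pipeline would first load and downscale each image with an image library.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixels, 0-255, assumed already downscaled


def average_hash(pixels: Grid) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def is_probable_duplicate(a: Grid, b: Grid, max_distance: int = 2) -> bool:
    """Hypothetical threshold; a real system would tune it on reviewed samples."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Toy "images": a scan, a slightly brighter re-export, and a different photo.
scan = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
re_export = [[min(255, p + 5) for p in row] for row in scan]
other = [
    [200, 210, 10, 20],
    [205, 215, 15, 25],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]

print(is_probable_duplicate(scan, re_export))  # brightness shift, same layout
print(is_probable_duplicate(scan, other))      # different light/dark layout
```

The hash depends on the pattern of light and dark areas, not on the file name, metadata, or byte size, which is why a re-exported copy still matches. Pairs that land near the threshold are the kind of edge case the human-review stage is meant to catch.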

For researchers, journalists and members of the public who regularly pull images from these open catalogues, the changes should eventually improve search accuracy and reduce the frustration of encountering the same photograph listed multiple times under different catalogue numbers. The Arxiu Municipal's public-facing search tool, accessible via the Ajuntament's digital services portal, is expected to reflect the cleaned data by mid-September at the earliest, once a final human-review stage clears flagged edge cases. Institutions advise that anyone currently working on projects that cite specific catalogue reference numbers double-check those references before September, as some IDs assigned to duplicate entries will be retired and consolidated under canonical records.
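One way to picture the consolidation is as a lookup table that maps each retired ID to its canonical record. The Python sketch below is a hypothetical illustration, not the Arxiu's actual system: the catalogue numbers are invented placeholders and the `resolve` helper is an assumed name.

```python
from typing import Dict

# Hypothetical catalogue numbers; the article does not publish real ones.
RETIRED_TO_CANONICAL: Dict[str, str] = {
    "EX-0002": "EX-0001",
    "EX-0003": "EX-0002",  # chained: points at an ID that was itself retired
}


def resolve(catalogue_id: str, redirects: Dict[str, str]) -> str:
    """Follow redirects until reaching an ID that has not been retired."""
    seen = set()
    while catalogue_id in redirects:
        if catalogue_id in seen:
            raise ValueError(f"redirect cycle at {catalogue_id}")
        seen.add(catalogue_id)
        catalogue_id = redirects[catalogue_id]
    return catalogue_id


print(resolve("EX-0003", RETIRED_TO_CANONICAL))  # follows the chain
print(resolve("EX-0001", RETIRED_TO_CANONICAL))  # already canonical
```

Whether the institutions will publish such a mapping is not stated, which is why checking references by hand before September remains the safer course.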


About this article

Published by The Daily Barcelona

This article was produced by The Daily Barcelona editorial desk and covers news in Barcelona. See our editorial standards for how we use AI.
