Barcelona's municipal digital archive holds hundreds of thousands of photographs, maps and visual records — and a significant share of them are duplicates. City technology officers and digital preservation specialists are now pushing hard for a coordinated strategy to clean up the redundancy before expanded tourist tax revenues fund yet another round of digitisation that compounds the problem.
The issue surfaced publicly this spring when the Arxiu Municipal de Barcelona, headquartered on Carrer de Sant Honorat in the Gothic Quarter, began a quality audit of its visual collections as part of a broader open-data initiative tied to Mayor Jaume Collboni's smart-city programme. Auditors found that duplicate image entries were inflating storage loads and creating conflicting metadata records — photographs of the same Barceloneta seafront scene catalogued under different file names, dates and authors, with no automatic reconciliation tool in place.
Why does this matter now? The city's digitisation pipeline has accelerated. The Institut de Cultura de Barcelona (ICUB) approved a three-year digital heritage plan in March 2025 with a budget allocation for scanning physical collections across six municipal museums and libraries. More images are entering the system every week. Without a deduplication standard baked into the ingest workflow, specialists warn, the archive risks locking in today's disorder at much larger scale.
What Officials and Experts Are Saying
Digital archivists working with the Ajuntament, speaking in their institutional capacity rather than on the record as individuals, have described the core obstacle as a governance gap, not a technology gap. Tools capable of detecting near-duplicate images using perceptual hashing and metadata cross-referencing already exist, and several are open-source. The problem is that no single directorate owns the deduplication mandate. The archive, ICUB, and the city's smart-city office, Barcelona Digital City, based in the 22@ innovation district in Poblenou, each manage image assets under separate protocols.
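For readers unfamiliar with perceptual hashing, the idea can be illustrated with a minimal sketch of one common variant, the average hash. This is not the city's tooling or any vendor's proposal. It assumes each image has already been decoded, converted to grayscale and shrunk to a small grid (8x8 here) by an image library, and the function names and the distance threshold are illustrative choices.

```python
from typing import List

Grid = List[List[int]]  # small grayscale thumbnail, values 0-255 (assumed pre-scaled)


def average_hash(pixels: Grid) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the grid mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two thumbnails as near-duplicates if their hashes differ in few bits.

    max_distance is a hypothetical threshold; a real archive would tune it.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Illustrative data: a gradient, a uniformly brightened copy, and an inverted copy.
gradient = [[row * 8 + col for col in range(8)] for row in range(8)]
brightened = [[value + 10 for value in row] for row in gradient]
inverted = [[63 - value for value in row] for row in gradient]
```

Because the hash compares each pixel with the image's own mean, a uniformly brightened copy of a scan produces the same bits as the original, while a genuinely different picture does not. Byte-level checksums would treat both pairs as unrelated files.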
The startup ecosystem in 22@ has not been silent. At least two Barcelona-based computer vision companies, operating within the cluster around Carrer de Pallars, have pitched solutions to municipal procurement officers this year, proposing automated pipelines that flag duplicates at upload rather than retrospectively. Neither proposal has led to a publicly awarded contract as of this week. Procurement records reviewed for this article show both remain under technical evaluation.
Academic voices are part of the conversation too. Researchers at the Universitat Pompeu Fabra, whose communication and information science faculty is based in the El Born area near Ciutadella park, have published work arguing that image deduplication is not merely a storage optimisation exercise but a question of archival integrity. Duplicate records distort search results, misrepresent the frequency of certain historical subjects, and can skew algorithmic tools trained on civic image datasets.
Scale of the Problem and What Comes Next
Storage costs are a concrete pressure. Municipal cloud storage pricing for large public institutions in the EU has risen alongside broader infrastructure costs — industry benchmarks put enterprise-tier object storage at roughly €20 to €30 per terabyte per month depending on redundancy requirements. Barcelona's archive does not publish its storage expenditure as a standalone line item, but the audit process is expected to produce a figure this autumn as part of the open-data transparency push.
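To see how the per-terabyte benchmark translates into a monthly bill, the arithmetic can be sketched as below. Only the €20 to €30 range comes from the benchmark cited above. The archive size and duplicate share passed in are placeholders a reader would have to supply, since the city has published neither figure.

```python
from typing import Tuple


def monthly_duplicate_cost(
    total_tb: float,
    duplicate_share: float,
    eur_per_tb_low: float = 20.0,   # lower bound of the cited benchmark
    eur_per_tb_high: float = 30.0,  # upper bound of the cited benchmark
) -> Tuple[float, float]:
    """Return the (low, high) monthly cost in euros of storing duplicate data.

    total_tb and duplicate_share are caller-supplied assumptions, not audit results.
    """
    if not 0.0 <= duplicate_share <= 1.0:
        raise ValueError("duplicate_share must be between 0 and 1")
    duplicate_tb = total_tb * duplicate_share
    return duplicate_tb * eur_per_tb_low, duplicate_tb * eur_per_tb_high


# Purely hypothetical inputs for illustration: 100 TB with 15% duplicates.
low, high = monthly_duplicate_cost(total_tb=100.0, duplicate_share=0.15)
```

The same function can be rerun once the audit publishes an actual storage volume and duplicate rate.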
The immediate practical step being discussed is a working group that would bring together the Arxiu Municipal, Barcelona Digital City, and ICUB under a shared image data standard. A preliminary meeting was scheduled for late June at the Museu del Disseny de Barcelona on Plaça de les Glòries Catalanes, though no formal agreement has been announced yet.
For anyone depositing images with the city — neighbourhood associations, cultural organisations, research institutions — the advice from archivists right now is straightforward: use standardised file-naming conventions, embed complete EXIF metadata at capture, and avoid submitting unprocessed duplicates in bulk. The city is unlikely to have an automated fix in place before 2027, and submissions made today will sit in the queue until then. Getting the originals right matters more than waiting for a system to sort it out later.
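Depositors who want to screen a batch before submitting it could start with something like the following sketch, which groups files by a SHA-256 digest of their contents. It is an assumption about how one might do this, not a procedure issued by the archive, and it only catches byte-identical copies; re-saved or resized versions of the same photograph would need a perceptual comparison instead. The function names are illustrative.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans do not need to fit in memory."""
    sha = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()


def find_exact_duplicates(paths: Iterable[Path]) -> Dict[str, List[Path]]:
    """Map each digest shared by two or more files to the sorted list of those files."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in paths:
        groups[file_digest(path)].append(path)
    return {digest: sorted(files) for digest, files in groups.items() if len(files) > 1}
```

Calling `find_exact_duplicates(Path("batch").rglob("*.jpg"))` on a hypothetical `batch` folder would return the groups of identical files, leaving the depositor to decide which copy carries the most complete metadata.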