Thousands of duplicate photographs are jamming the pipelines of Barcelona's two most ambitious public digitisation efforts, slowing access to historical collections that the city has spent millions of euros to make available online. The problem surfaced publicly this week when archivists at the Arxiu Nacional de Catalunya, based in Sant Cugat del Vallès, flagged an internal backlog of roughly 40,000 image files that appear more than once across its shared repository — a number the institution confirmed to staff in a memo circulated on 1 July 2026, details of which were reported by the specialist publication Gestió Documental.
The timing is awkward. Barcelona City Council has been promoting its Barcelona Memòria Digital programme — an initiative backed by the Collboni administration's culture budget — as a model for municipal transparency and public access. The programme has funnelled resources into scanning everything from nineteenth-century urban planning maps stored at the Arxiu Municipal Contemporani de Barcelona on Carrer de l'Almirall Cervera in Barceloneta, to press photographs from the Franco era held by the Col·legi de Periodistes de Catalunya on Rambla de Catalunya. Duplicates corrode the credibility of that entire exercise: a researcher who downloads the same image twice under different catalogue numbers cannot trust the collection.
Why This Week's Developments Matter
The duplicate-image problem is not new to digital archivists anywhere, but Barcelona's particular situation has made it acute right now for two reasons. First, the city is midway through a three-year contract, awarded in March 2024 and worth approximately €2.3 million, with a consortium of Catalan technology firms to build a unified metadata layer across municipal collections. The consortium's progress report, due to the council's culture committee this coming September, is expected to address deduplication as a key milestone. Missing that milestone would trigger a contractual review.
Second, the short-term political climate inside the Generalitat de Catalunya has put an unusual spotlight on cultural institutions. Tensions between the Catalan government and Madrid over which administration holds competence for archive funding, a long-running dispute within the broader independence debate, mean that any embarrassment for institutions like the Arxiu Nacional lands in a politically charged space. Observers in the sector note that questions about institutional competence are never purely technical when Catalan sovereignty over cultural heritage is itself a live argument.
At street level in the Eixample, the practical stakes are more mundane but real. Photographers, academics and documentary filmmakers who use the Fototeca de Catalunya on Plaça de Pau Vila in the Born district routinely report hitting duplicate listings when running keyword searches. The fototeca's online portal, relaunched in October 2024, was supposed to eliminate legacy duplication inherited from an older database system. This week, technicians discovered that a batch import carried out in late May 2026 — covering approximately 6,200 images from the Frederic Marès Museum photographic collection on Plaça de Sant Iu, in the Gothic Quarter — reintroduced a subset of files that had already been cleaned and catalogued.
What Archivists Are Doing About It
Deduplication at this scale requires more than manual review. The standard tool used across European national archives is perceptual hashing, a technique that derives a compact fingerprint from an image's visual content, so that two copies can be matched even when their file names or metadata strings differ. The Arxiu Nacional confirmed it will deploy an updated hashing protocol sourced from an open-source framework maintained by the Europeana Foundation, the EU cultural heritage aggregator, with implementation scheduled to begin the week of 13 July 2026.
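For readers curious how perceptual hashing works in principle, the following is a minimal sketch of one simple variant, the "average hash". It is an illustration only and not the protocol the Arxiu Nacional or the Europeana framework uses, whose details were not reported. The function names, the 8x8 grid size and the match threshold of 5 bits are all assumptions chosen for the example, and images are represented as plain grids of grayscale values rather than loaded from files.

```python
from typing import List

Grid = List[List[float]]  # grayscale pixel values, one inner list per row


def shrink(pixels: Grid, size: int = 8) -> Grid:
    """Downscale to size x size by averaging rectangular blocks of pixels."""
    h, w = len(pixels), len(pixels[0])
    out = []
    for r in range(size):
        r0 = r * h // size
        r1 = max((r + 1) * h // size, r0 + 1)  # each block covers at least one row
        row = []
        for c in range(size):
            c0 = c * w // size
            c1 = max((c + 1) * w // size, c0 + 1)
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: one bit per cell, set if above the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def likely_duplicates(a: Grid, b: Grid, threshold: int = 5) -> bool:
    """Treat two images as probable duplicates if few bits differ.

    The threshold is a hypothetical tuning value, not a published standard.
    """
    return hamming(average_hash(a), average_hash(b)) <= threshold


if __name__ == "__main__":
    original = [[x * 8 for x in range(32)] for _ in range(32)]       # left-to-right gradient
    brighter = [[v + 10 for v in row] for row in original]           # same scan, lighter exposure
    different = [[y * 8 for _ in range(32)] for y in range(32)]      # top-to-bottom gradient
    print(likely_duplicates(original, brighter))
    print(likely_duplicates(original, different))
```

Because the fingerprint depends only on which regions are lighter or darker than the image's own average, a re-scan at a different resolution or exposure tends to produce the same or a nearby hash, while a different photograph does not. In a real pipeline the pixel grids would come from an image library, and any pair flagged this way would still go to a human archivist for confirmation.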
For institutions in Barcelona, the immediate advice from the sector is to pause any new batch imports until the hashing tools are certified against existing collections. The Fototeca de Catalunya has reportedly suspended its scheduled August upload of 3,800 additional civil-war-era images from the Generalitat's Documentation Centre in Les Corts pending the technical review.
Researchers who depend on these collections in the meantime are being directed to cross-reference catalogue numbers against the CCUC (Catàleg Col·lectiu de les Universitats de Catalunya), which maintains its own independent index and has not been affected by this week's batch-import error. It is an imperfect workaround, but for now it is the most reliable one available.