Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
Validator & Data Extractor Tool

Download and evaluate XML metadata from OAI-PMH enabled digital libraries.

Download all the records from a digital library: enter an OAI-PMH URL, select the metadata prefix, and review the results.

The REST API is completely free for personal or academic use.

Large-scale use of the service via its REST API, as well as technical and/or scientific support, is available for a fee.

Please contact the author, Vangelis Banos, to learn more.


The OAI-PMH validator and data extractor tool is a free service created by Vangelis Banos. The aim of this project is to support digital repository operators and developers by automating the harvesting and validation of OAI-PMH services. The author has accumulated 10+ years of relevant experience working on related projects.


OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) is a protocol developed by the Open Archives Initiative. It is used to harvest (or collect) the metadata descriptions of the records in an archive so that services can be built using metadata from many archives. An implementation of OAI-PMH must support representing metadata in Dublin Core, but may also support additional representations.

The protocol is usually referred to simply as the OAI Protocol. See the full article on Wikipedia.
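As a rough illustration of how a harvester talks to an OAI-PMH service: each request is a plain HTTP GET whose query string carries a `verb` plus that verb's arguments. The sketch below builds such request URLs with Python's standard library. The endpoint `https://example.org/oai` is a placeholder and `build_request` is a hypothetical helper written for this example; neither is part of this tool.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}

def build_request(base_url, verb, **arguments):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"

# Placeholder endpoint, used only for illustration.
BASE = "https://example.org/oai"
identify_url = build_request(BASE, "Identify")
records_url = build_request(BASE, "ListRecords", metadataPrefix="oai_dc")
print(identify_url)
print(records_url)
```

Here `oai_dc` is the metadata prefix for the Dublin Core representation that every OAI-PMH implementation must support.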

What is my OAI-PMH URL?

The OAI-PMH URL of a digital library depends on the software it is based on. Some widely used software and their sample URLs are listed below.

Software Sample URL Sample OAI-PMH URL
DSpace version 1.5 and newer
Open Journal Systems (OJS)

These URLs are just the defaults and can be changed by the library's administrator.

How does it work?

The OAI-PMH validator is a web application. It runs in the background on our servers and presents the final output to the user. There is no need to install software on your computer; everything is done in a web browser.

What kind of validation tests are performed?

Generic Checks
  1. Check HTTP Response code.
  2. Check HTTP Response document content type.
  3. Check XML document file size.
  4. Check HTTP Request & Response Time.
  5. Check XML document compliance against the OAI-PMH XML Schema.
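The generic checks listed above could be sketched roughly as follows. This is a minimal illustration, not the validator's actual implementation: `HttpResult`, `generic_checks`, the size and time thresholds, and the accepted content types are all assumptions made for the example. Full validation against the OAI-PMH XML Schema needs an XSD-capable library, so the sketch substitutes a simple well-formedness check for step 5.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

@dataclass
class HttpResult:
    status: int        # HTTP response code
    content_type: str  # value of the Content-Type header
    body: bytes        # raw response body
    elapsed: float     # request and response time in seconds

def generic_checks(result, max_bytes=10_000_000, max_seconds=30.0):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if result.status != 200:
        problems.append(f"unexpected HTTP status {result.status}")
    # Ignore parameters such as "; charset=UTF-8" when comparing media types.
    media_type = result.content_type.split(";")[0].strip().lower()
    if media_type not in ("text/xml", "application/xml"):
        problems.append(f"unexpected content type {media_type!r}")
    if len(result.body) > max_bytes:
        problems.append(f"document larger than {max_bytes} bytes")
    if result.elapsed > max_seconds:
        problems.append(f"response slower than {max_seconds} seconds")
    # Stand-in for schema validation: only checks that the XML parses.
    try:
        ET.fromstring(result.body)
    except ET.ParseError as exc:
        problems.append(f"not well-formed XML: {exc}")
    return problems

good = HttpResult(200, "text/xml; charset=UTF-8", b"<OAI-PMH/>", 0.4)
print(generic_checks(good))
```

Returning a list of messages instead of raising on the first failure lets a report show every problem found in a single response.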
Content-specific Checks
  1. Check OAI-PMH Protocol version.
  2. Check Administrator email address.
  3. Check ListSets command for sets.
  4. Check available Metadata formats.
  5. Check Total Records number.
  6. Check Dublin Core metadata in Records.
  7. Check ESE metadata in Records.
  8. Check XML document compliance against the ESE XML Schema.
  9. Check various ESE XML elements for common mistakes, such as:
    • Check for invalid europeana:isShownAt URL
    • Check for invalid europeana:isShownBy URL
  10. Check various metadata elements in Records:
    • Empty dc:title
    • Empty or invalid dc:identifier
    • Empty setSpec

I want an extra feature of some kind, what can I do?

Please contact the author, Vangelis Banos. Any suggestions for improvements and new features are always welcome.

References in scientific publications

  1. Miranda, Jesús María Aramberri. "ONDAREA, DE ALMACÉN A FÁBRICA." Nuevos Extractos de la RSBAP 26-G (2022).
  2. Sosa, Jared David Tadeo Guerrero, et al. "Un generador de metadatos openaire conforme con el repositorio nacional de México." Journal of Science and Research 5.CININGEC (2020): 808-830.
  3. Sebastián Blanco-Olea, Fernando. "Comparative Quality Evaluation of Universities' Institutional Repositories of Peru." JLIS. it, Italian Journal of Library, Archives & Information Science 12.2 (2021).
  4. Mesquita, Giovana Cavalcanti de. "Análise do repositório do Ministério da Economia no contexto do acesso aberto." (2021).
  5. Gottardo, Taís Virgínia, and Ivanildo Barbosa. "INDE METADATA CONFORMITY INDICATOR." Boletim de Ciências Geodésicas 25 (2019): e2019S002.
  6. Costa, Michelli Pereira da, and Fernando César Lima Leite. "Open access institutional repositories in Latin America." (2019).
  7. Pereira da Costa, Michelli, and Fernando César Lima Leite. "Repositorios institucionales de acceso abierto en América Latina." Biblios 74 (2019): 1-14.
  8. Myriam Bastin, François Renaville. ULiège experience with aggregators and discovery tools providers. Be proactive and apply best practices (if you can...), 2018
  9. Deinzer, Gernot. "6b Repositoriensoftware." Praxishandbuch Open Access (2017): 290.
  10. COSTA, Michelli, and Fernando LEITE. "Repositórios institucionais de acesso aberto à informação científica da América Latina." (2017).
  11. 林信成, and 周庭郁. "圖書資訊學開放取用期刊聯合目錄系統之設計與實作." Journal of Educational Media & Library Sciences 54.1 (2017).
  12. Bettoni, E. M., M. B. de Carvalho, and Patricia Marchiori. Geração de indicadores para periódicos científicos: um estudo na AtoZ. Federal University of Paraná, 2016.
  13. Smith, Ina. Open access infrastructure. Vol. 2. UNESCO Publishing, 2015.
  14. Rousidis, Dimitris, et al. "Evaluation of Metadata in Research Data Repositories: The Case of the DC. Subject Element." Metadata and Semantics Research. Springer International Publishing, 2015. 203-213.
  15. Koulouris, Alexandros, Vangelis Banos, and Emmanouel Garoufallou. "Aggregating metadata for Europeana: the Greek paradigm." Proceedings of the International Conference on Integrated Information (IC-ININFO). 2011.
  16. Garoufallou, Emmanouel, Vangelis Banos, and Alexandros Koulouris. "Solving aggregation problems of Greek cultural and educational repositories in the framework of Europeana." International Journal of Metadata, Semantics and Ontologies 8.2 (2013): 134-144.
  17. Rousidis, Dimitris, et al. "Metadata for Big Data: A preliminary investigation of metadata quality issues in research data repositories." Information Services and Use 34.3 (2014): 279-286.
  18. Georgiadis, Haris, et al. "Ensuring the quality and interoperability of open cultural digital content: System architecture and scalability." Information, Intelligence, Systems and Applications, IISA 2014, The 5th International Conference on. IEEE, 2014.
  19. Rousidis, Dimitris, et al. "Data Quality Issues and Content Analysis for Research Data Repositories: The Case of Dryad." Let’s Put Data to Use: Digital Scholarship for the Next Generation, 18th International Conference on Electronic Publishing, Thessaloniki, Greece. 2014.
  20. Stathopoulou, Ioanna-Ourania, et al. "An Open Cultural Digital Content Infrastructure." Digital Libraries (JCDL), 2014 IEEE/ACM Joint Conference on. IEEE, 2014.
  21. Hirschmann, Barbara. "DOI Registration Manual." (2014).
  22. Calzolari, Nicoletta, Monica Monachini, and Valeria Quochi. "Interoperability framework: The FLaReNet action plan proposal." Language Resources, Technology and Services in the Sharing Paradigm (2011): 41.
  23. Birello, Giancarlo, et al. "Step by step installation guide of a digital preservation infrastructure." (2012).
  24. Antonius Rachmat, C. "Analisis Rancang Bangun Sistem Repositori Institusi Berbasis Metadata Dublin Core di UKDW Yogyakarta."
  25. Houssos, Nikos, et al. "Enhanced OAI-PMH services for metadata sharing in heterogeneous environments." Library Review 63.6/7 (2014): 465-489.
  26. Kapidakis, Sarantos. "Comparing metadata quality in the Europeana context." Proceedings of the 5th International Conference on PErvasive Technologies Related to Assistive Environments. ACM, 2012.
  27. Calzolari, Nicoletta, et al. "Final FLaReNet Deliverable Language Resources for the Future–The Future of Language Resources." The Strategic Language Resource Agenda. FLaReNet project (2011).

Features include:

  • View, print or download the output of all OAI-PMH supported commands.
  • Detect problems with metadata records (e.g. invalid URLs, empty titles, invalid date formats).
  • Download all records from one or more digital libraries in parallel.
  • Check compliance with OAI-PMH, Dublin Core (DC), Europeana Semantic Elements (ESE) and other standards.
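Downloading all records from an OAI-PMH service means following the protocol's flow control: a large `ListRecords` response is split into pages, each carrying a `resumptionToken` that is sent back to request the next page. The sketch below shows one way this could be done, with several libraries harvested in parallel threads. It is an illustration under stated assumptions, not this tool's implementation: `http_fetch`, `harvest` and `harvest_many` are hypothetical names, and the `fetch` parameter exists so the logic can be exercised without network access.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen

OAI = "{http://www.openarchives.org/OAI/2.0/}"

def http_fetch(url):
    """Default fetcher: download a URL and return the response body as bytes."""
    with urlopen(url, timeout=60) as response:
        return response.read()

def harvest(base_url, metadata_prefix="oai_dc", fetch=http_fetch):
    """Yield every <record> element, following resumption tokens page by page."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(f"{base_url}?{urlencode(params)}"))
        yield from root.iter(f"{OAI}record")
        token = root.findtext(f".//{OAI}resumptionToken", default="")
        if not token.strip():
            break  # a missing or empty token marks the last page
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token.strip()}

def harvest_many(base_urls, fetch=http_fetch, workers=4):
    """Count the records of several libraries in parallel; base_urls is a list."""
    def count(url):
        return sum(1 for _ in harvest(url, fetch=fetch))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(base_urls, pool.map(count, base_urls)))
```

Calling `harvest_many(["https://example.org/oai"])` with a real endpoint in place of the placeholder URL would return a dictionary mapping each base URL to its record count.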

57179 digital libraries, repositories and e-journals have already been tested with the OAI-PMH validator.

Created by Vangelis Banos, © 2011 - 2024