What Can I Do With This Messy Spreadsheet? Converting from Excel Sheets to Fully Compliant EAD-XML files
Many Galleries, Libraries, Archives, and Museums (GLAM) struggle to share their collection metadata in standardised and sustainable ways because they lack in-house Information Technology (IT) support or capabilities. As a result, staff rely on more familiar general-purpose office programs such as word processors, spreadsheets, or low-code databases. While these tools make data registration and digitisation easy, they do not support more advanced uses. This blog post explains a procedure for producing EAD (Encoded Archival Description) files from an Excel spreadsheet using OpenRefine.
Spreadsheets offer a structured way of registering data, which makes them easier for machines to process than ordinary text documents. However, they cause problems when one record has multiple values for the same attribute (redundancy, empty cells, an arbitrary number of columns for the same attribute, etc.), and they offer no integrity checks, so data can become messy and invalid without users noticing. For these reasons, spreadsheets are not advised as a comprehensive and sustainable method of registering data within an institution, even though they remain a popular and familiar tool for informal (and sometimes even formal) data registration. This post therefore focuses on this widely used method and shows how data in spreadsheets can be migrated to more structured, standard formats.
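To make these problems concrete, here is a minimal Python sketch of the kind of checks a spreadsheet does not perform for you. It is only an illustration: the rows, the column names (`id`, `title`, `creator_1`, `creator_2`) and the helper functions are hypothetical, and the procedure described in this post uses OpenRefine rather than Python.

```python
from collections import Counter

# Hypothetical rows, as they might look after exporting a spreadsheet.
rows = [
    {"id": "A1", "title": "Letter to the mayor", "creator_1": "Smith, J.", "creator_2": ""},
    {"id": "A2", "title": "", "creator_1": "Doe, A.", "creator_2": "Roe, B."},
    {"id": "A1", "title": "Letter to the mayor", "creator_1": "Smith, J.", "creator_2": ""},
]


def find_empty_cells(rows, required):
    """Return (row_index, column) pairs where a required cell is blank."""
    return [
        (i, col)
        for i, row in enumerate(rows)
        for col in required
        if not (row.get(col) or "").strip()
    ]


def find_duplicate_ids(rows, key="id"):
    """Return identifiers that occur in more than one row."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def merge_repeated_columns(row, prefix):
    """Collect creator_1, creator_2, ... into one list, skipping blanks."""
    return [
        value.strip()
        for col, value in row.items()
        if col.startswith(prefix) and value.strip()
    ]


empty = find_empty_cells(rows, ["id", "title"])
duplicates = find_duplicate_ids(rows)
creators = merge_repeated_columns(rows[1], "creator_")
```

Checks like these catch the blank title in the second row and the repeated identifier, and they fold the numbered creator columns into a single repeatable value, which is closer to how a structured format would hold them.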
Learning outcomes
After viewing this training resource, users will be able to:
- Understand the challenges of interoperability when using spreadsheets
- Produce EAD files from an Excel spreadsheet using OpenRefine
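As a rough idea of what the second outcome involves, the sketch below turns a few cleaned rows into an EAD-style XML fragment. It is written in Python for illustration only, since the resource itself works in OpenRefine. The column names and the `rows_to_ead_fragment` helper are hypothetical, and the output is just an `archdesc` fragment: a complete EAD file needs further parts (such as a header) before it validates against the schema.

```python
import xml.etree.ElementTree as ET


def rows_to_ead_fragment(collection_title, rows):
    """Build an <archdesc> fragment with one <c> component per row.

    Assumes each row has 'id' and 'title' keys and an optional 'date' key
    (hypothetical column names chosen for this example).
    """
    archdesc = ET.Element("archdesc", level="collection")
    did = ET.SubElement(archdesc, "did")
    ET.SubElement(did, "unittitle").text = collection_title

    dsc = ET.SubElement(archdesc, "dsc")
    for row in rows:
        component = ET.SubElement(dsc, "c", level="item")
        c_did = ET.SubElement(component, "did")
        ET.SubElement(c_did, "unitid").text = row["id"]
        ET.SubElement(c_did, "unittitle").text = row["title"]
        # Only emit a date element when the cell was actually filled in.
        if row.get("date"):
            ET.SubElement(c_did, "unitdate").text = row["date"]
    return archdesc


# Hypothetical, already-cleaned rows.
rows = [
    {"id": "A1", "title": "Letter to the mayor", "date": "1923"},
    {"id": "A2", "title": "Minutes of the council", "date": ""},
]

fragment = rows_to_ead_fragment("Example collection", rows)
xml_text = ET.tostring(fragment, encoding="unicode")
```

Each spreadsheet row becomes one component, and empty cells are skipped instead of producing empty elements, which is one of the decisions any spreadsheet-to-EAD conversion has to make explicitly.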