Accessing Context
Duration: | December 2024 – December 2025 |
Subsidy provider: | European Research Infrastructure for Heritage Science, E-RIHS |
Subsidy size: | 60,000 euros |
Remarkable: | We are working on a standard for the sustainable disclosure of cultural heritage datasets, addressing the specificity of this kind of data. Bias, positionality, provenance and historical context play an important role in this. |
Valorisation: | We are developing and publishing, in open access, a concrete format for metadata and description that reflects the complexity and heterogeneity of cultural heritage data. |
Data Envelopes for Digital Cultural Heritage in Practice
How do we deal with the specificity of cultural heritage datasets in descriptions and metadata? For example, in what ways can we be more transparent about the bias often embedded in data and encourage responsible use of historical datasets? Can we clarify the influence of positionality on the creation and editing of a resource?
In the humanities and heritage sector, many datasets are becoming available for research and interdisciplinary use, partly due to data-centric innovations and new digitization technologies. At the same time, the digital processing of data on cultural assets is not documented in a comparable way in most projects, and current metadata models and various dataset registries do not provide enough space for adequate description. Addressing this problem requires consensus and support for an overarching descriptive model for dataset disclosure that facilitates data discovery, sharing, and reuse. In this project, we continue to work on a metadata format that reflects the complexity and heterogeneity of cultural heritage data and that helps unlock datasets through FAIR and Open Science principles.
Data envelopes
In today’s data field and in the machine learning context, so-called datasheets are used to map the characteristics of a data object, dataset or software system, adding context and summarizing mainly technical and commercial information. Analysis and experiments at the Huygens Institute, in conversation with other partners in the field, have shown that these datasheets are not sufficient to properly express the complexity and heterogeneity of cultural heritage data. That is why the Huygens Institute has introduced the new concept of data-envelopes, which provide space for information of specific importance to this field, for researchers, the wider public and machines alike.
This project, in collaboration with the Amsterdam City Archives and The Netherlands Institute for Sound and Vision, has two objectives. On the one hand, we are testing the applicability of data-envelopes to other types of data (e.g. audiovisual data, photo archives and administrative datasets); on the other hand, we are identifying technical requirements for storing and sharing data-envelopes, within the organizations as well as through external catalogs. These activities will result in a data-envelopes proof-of-concept and implementation plan by the end of 2025 for further application and use in the field.
FAIR and Open Science
At the Huygens Institute, we are committed to applying FAIR principles and working toward a strong Open Science practice. At the same time, because of our many projects, we are keenly aware of the complexity of historical data. The proposed data-envelopes provide the opportunity to achieve a balance in dataset descriptions where important cultural information can be described in detail across different types of facets by researchers and other dataset creators.
For example, the data-envelopes also incorporate information about provenance (in what context the sources originated), context around FAIR principles (e.g., where can the data be found and in what format; what does someone need to know before they (re)use the dataset), and positionality (who worked on it and what does that potentially say about bias and blind spots). In addition, the data-envelopes add facets that are actually machine-readable and therefore enhance interoperability with other systems and discoverability.
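To make the idea of machine-readable facets more concrete, the following is a minimal sketch of how an envelope with the facets named above (provenance, FAIR context, positionality) could be represented and serialized. It does not reproduce the actual data-envelope format: the class name `DataEnvelope`, all field names and the example values are assumptions made purely for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class DataEnvelope:
    """Hypothetical envelope; field names are illustrative, not the project's format."""

    title: str
    provenance: str    # context in which the sources originated
    location: str      # FAIR context: where the data can be found
    data_format: str   # FAIR context: the format the data is in
    reuse_notes: str   # what someone should know before (re)using the dataset
    positionality: List[str] = field(default_factory=list)  # who worked on it, possible blind spots

    def to_json(self) -> str:
        """Serialize every facet to JSON so that other systems can read it."""
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)


# Fictional placeholder values, used only to show the shape of the record.
envelope = DataEnvelope(
    title="Example photo archive (fictional)",
    provenance="Placeholder: context in which the photographs were created",
    location="https://example.org/datasets/photo-archive",
    data_format="CSV",
    reuse_notes="Placeholder: known gaps and sensitivities in the collection",
    positionality=["Placeholder: team that curated the dataset"],
)

# Round-trip through JSON to show the record is machine-readable.
record = json.loads(envelope.to_json())
print(sorted(record))
```

Keeping each facet as a separate named field, rather than as one free-text description, is what would allow an external catalog to index or filter on it; which facets the real format makes mandatory is not specified here.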