Submission 67
CRMdig-based Provenance Ontology for Korean Cultural Heritage 3D Data
SP02-01
Presented by: Ara JO
This study aims to design a 'Korean-style ontology,' based on CRMdig, for the provenance (the entire lifecycle) of Korean cultural heritage 3D data, and to use it to construct semantic data. The goals are to manage in an integrated way the 3D data that domestic institutions have underutilized, to mitigate the information distortion (hallucination) that arises in the AI era, and to lay a foundation for international data linkage and trust-based services.
As we enter the age of Artificial Intelligence (AI), the value of digital data is increasing as never before. The field of Korean cultural heritage, in particular, has established high-quality 3D original record data generated by 3D scanning and photogrammetry as a core asset in accordance with the 'Cultural Heritage Digital Transformation 2030' plan, a process that is still actively ongoing.[1]
Although vast resources have been invested in producing cultural heritage 3D data, what is provided through online portals is mostly limited to low-capacity models and their basic metadata. Such low-capacity data is of insufficient quality for general content creation or specialized academic research, both of which require high-quality 3D data or unprocessed raw data. Therefore, in preparation for a future phase in which all data is disclosed and utilized, designing a standardized 3D data ontology that can effectively integrate data, transparently manage provenance, and guarantee international interoperability has become a pressing task.
Accordingly, this study adopts CIDOC-CRM, the standard ontology of the International Council of Museums (ICOM), and its digital extension model, CRMdig, as its core methodology to design a 'Lifecycle Ontology for Korean-style Cultural Heritage 3D Data.'[2]
This research will be conducted in five stages. First, it will analyze the entire lifecycle of domestic cultural heritage 3D data. Centering on the Korea Heritage Service and the National Museum of Korea, which represent national heritage, it will closely investigate the current status of raw data acquisition, data post-processing, final deliverables, and data archiving.[3] Although both institutions record precise metadata based on detailed data construction guidelines, their archive systems rest on traditional database structures in which users search for information by keyword, and this is a fundamental limitation.
Second, it will search for a standard model through an analysis of foreign 3D archive cases. Focusing on four leading institutions (Europeana, the Smithsonian, the British Museum, and CyArk), it will investigate how well data provenance is managed in major overseas 3D archives and compare the results with the domestic situation.[4] Through these cases, it will identify how data schemas are designed differently according to the core objectives of each archive. The direction of this study will be determined by synthesizing the domestic and international findings.
Third, based on the preceding analysis, a 'Korean-style Cultural Heritage 3D Data Ontology' will be designed on the international standard semantic model CRMdig. This stage will focus in particular on representing provenance, which allows the entire process to be traced from the creation of 3D data through its processing, transformation, and transfer. Although CRMdig is an extension model specialized for precisely capturing the dynamic lifecycle of a digital object, it does not predefine classes for every detailed work stage of a specific domain. For example, while CRMdig's D3 Formal Derivation class is central to modeling the 3D data post-processing workflow, it cannot by itself distinguish specific work types such as 'point cloud registration,' 'noise removal,' or 'texturing' of a mesh model. To express this detailed technical information clearly, the ontology will be extended with a user-defined classification system composed of sub-classes of D3 Formal Derivation or of properties that qualify it. To verify the effectiveness of the designed ontology, a specific cultural heritage asset will be selected as a case for building a pilot dataset. In this process, a unique identifier (URI) will be assigned to every digital object and transformation event, from the acquired raw dataset through the intermediate processing stages to the final deliverables, so that the relationships connecting each stage are clearly defined.
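As a minimal sketch of this extension step, the following Python fragment emits Turtle statements that declare the three work types named above as sub-classes of D3 Formal Derivation, and mints URIs for pilot-dataset objects and events. The CRMdig namespace IRI, the `kch` prefix, the `example.org` base, the class local names, and the URI pattern are all assumptions made for illustration; they are not taken from the study and should be checked against the official CRMdig release.

```python
# Assumed namespace IRIs: verify the CRMdig IRI against the official release.
CRMDIG = "http://www.ics.forth.gr/isl/CRMdig/"
KCH = "https://example.org/kch3d/"  # hypothetical base URI for this sketch

# Hypothetical sub-classes for the work types mentioned in the text.
DERIVATION_SUBCLASSES = {
    "PointCloudRegistration": "Point cloud registration",
    "NoiseRemoval": "Noise removal",
    "Texturing": "Texturing",
}


def subclass_axioms():
    """Return Turtle lines declaring each work type as a sub-class of D3 Formal Derivation."""
    lines = [
        f"@prefix crmdig: <{CRMDIG}> .",
        f"@prefix kch: <{KCH}ontology/> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    for local_name, label in DERIVATION_SUBCLASSES.items():
        lines.append(
            f"kch:{local_name} rdfs:subClassOf crmdig:D3_Formal_Derivation ; "
            f'rdfs:label "{label}"@en .'
        )
    return lines


def mint_uri(kind, asset_id, step):
    """Build a URI for a digital object or event (hypothetical pattern: base/kind/asset/step)."""
    return f"{KCH}{kind}/{asset_id}/{step:03d}"


if __name__ == "__main__":
    print("\n".join(subclass_axioms()))
    print(mint_uri("event", "daeungjeon-a", 2))
```

Modeling the work types as sub-classes rather than as free-text notes keeps them queryable: any query for D3 Formal Derivation events can still retrieve them through the sub-class relation.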
Fourth, based on the designed ontology schema and the pilot dataset, RDF (Resource Description Framework) triples, which form a set of interconnected data, will be generated. These RDF triples will then be assembled into a Knowledge Graph containing the detailed provenance information of the cultural heritage asset concerned.
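The triple-generation step could look roughly like the sketch below, which turns a short processing pipeline into subject-predicate-object triples and serializes them as N-Triples. It assumes the CRMdig property local names `L21_used_as_derivation_source` and `L22_created_derivative`, an assumed CRMdig namespace IRI, and hypothetical `example.org` identifiers and work-type classes; none of these are confirmed by the study itself.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
CRMDIG = "http://www.ics.forth.gr/isl/CRMdig/"  # assumed namespace IRI
KCH = "https://example.org/kch3d/"  # hypothetical base URI

# Hypothetical pilot pipeline: (event id, work-type class, input object, output object).
PIPELINE = [
    ("event/001", "PointCloudRegistration", "object/raw-scan", "object/registered-cloud"),
    ("event/002", "NoiseRemoval", "object/registered-cloud", "object/denoised-cloud"),
]


def build_triples(pipeline):
    """Create three triples per derivation event: its type, its source, and its derivative."""
    triples = []
    for event, work_type, source, derivative in pipeline:
        event_uri = KCH + event
        triples.append((event_uri, RDF_TYPE, KCH + "ontology/" + work_type))
        triples.append((event_uri, CRMDIG + "L21_used_as_derivation_source", KCH + source))
        triples.append((event_uri, CRMDIG + "L22_created_derivative", KCH + derivative))
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(build_triples(PIPELINE)))
```

Because the output of one event is the input of the next, the shared object URI is what links the individual triples into a connected graph.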
Fifth, the performance and utility of the constructed Knowledge Graph will be evaluated by executing SPARQL queries. This stage verifies whether the designed ontology has secured traceability and interoperability in practice, using complex semantic queries that were impossible with simple metadata searches. For instance, by successfully executing a query such as "Find all the raw data acquired with scanner B and the 3D data in OBJ format among the Daeungjeon data of temples in region A," the study aims to show that the proposed ontology model is scalable enough to apply to cultural heritage data of various sizes, from small artifacts to large-scale historic sites.
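The kind of multi-hop query described here can be illustrated without a triple store. The sketch below is a plain-Python stand-in for the basic graph pattern matching that a SPARQL engine performs; it is not SPARQL itself and not the study's evaluation method. The short predicate names (`acquiredWith`, `depicts`, `partOf`, `locatedIn`, `derivedFrom`, `hasFormat`) and the sample data are hypothetical, and the query is read as asking for raw datasets together with their OBJ derivatives.

```python
def match(triples, patterns, binding=None):
    """Yield variable bindings that satisfy every (s, p, o) pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or reject if it is already bound to another value.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(triples, rest, candidate)


# Hypothetical sample graph with shortened names instead of full URIs.
TRIPLES = [
    ("raw-1", "depicts", "daeungjeon-1"),
    ("raw-1", "acquiredWith", "scanner-B"),
    ("raw-2", "depicts", "daeungjeon-1"),
    ("raw-2", "acquiredWith", "scanner-C"),
    ("daeungjeon-1", "partOf", "temple-1"),
    ("temple-1", "locatedIn", "region-A"),
    ("model-1", "derivedFrom", "raw-1"),
    ("model-1", "hasFormat", "OBJ"),
    ("model-2", "derivedFrom", "raw-2"),
    ("model-2", "hasFormat", "OBJ"),
]

# Raw data acquired with scanner B for halls of temples in region A, plus OBJ derivatives.
QUERY = [
    ("?raw", "acquiredWith", "scanner-B"),
    ("?raw", "depicts", "?hall"),
    ("?hall", "partOf", "?temple"),
    ("?temple", "locatedIn", "region-A"),
    ("?model", "derivedFrom", "?raw"),
    ("?model", "hasFormat", "OBJ"),
]

results = [(b["?raw"], b["?model"]) for b in match(TRIPLES, QUERY)]

if __name__ == "__main__":
    print(results)
```

A keyword search over flat metadata cannot express the join across scanner, building, temple, region, and file format in one request, which is the gap the graph-based query is meant to close.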
Ultimately, this study will establish a 3D data system suited to long-term preservation and based on the FAIR principles, laying the foundation for the international data linkage of Korean cultural heritage. In the future, it can lead to diverse research extensions, such as using AI to automate the structuring of unstructured, messy real-world data, and developing a next-generation intelligent cultural heritage search system that incorporates Retrieval-Augmented Generation (RAG) technology.
Keywords
Cultural Heritage, 3D Data, Ontology, CIDOC-CRM, CRMdig, provenance
References
[1] Korea Heritage Service. (2021, June 16). The announcement of the 'Digital Transformation for Cultural Heritage 2030' plan.
http://www.khs.go.kr/newsBbz/selectNewsBbzView.do?newsItemId=155702775&sectionId=b_sec_1&pageIndex=1&strWhere=&strValue=&mn=NS_01_02
[2] International Committee for Documentation (CIDOC). (n.d.). The CIDOC conceptual reference model. Retrieved September 29, 2025, from https://cidoc-crm.org/ ; CIDOC CRM Special Interest Group. (n.d.). CRMdig: An extension of CIDOC-CRM to support 3D modelling. Retrieved September 29, 2025, from https://cidoc-crm.org/crmdig
[3] Korea Heritage Service. (n.d.). Korea Heritage Digital Service. Retrieved September 29, 2025, from https://digital.khs.go.kr/ ; National Museum of Korea. (n.d.). 3D Data Search. Retrieved September 29, 2025, from https://www.museum.go.kr/MUSEUM/contents/M0505000000.do
[4] Europeana. (n.d.). Europeana. Retrieved September 29, 2025, from https://www.europeana.eu ; Smithsonian Institution. (n.d.). 3D digitization. Retrieved September 29, 2025, from https://3d.si.edu/ ; The British Museum. (n.d.). The British Museum (@britishmuseum). Sketchfab. Retrieved September 29, 2025, from https://sketchfab.com/britishmuseum/models ; OpenHeritage3D (CyArk). (n.d.). OpenHeritage3D. Retrieved September 29, 2025, from https://openheritage3d.org/