CRMsci: the Scientific Observation Model
The “Scientific Observation Model” is a formal ontology intended to be used as a global schema for integrating metadata about scientific observations, measurements and processed data in descriptive and empirical sciences such as biodiversity, geology, geography, archaeology, cultural heritage conservation and others, in research IT environments and research data libraries. Its primary purpose is to facilitate the management, integration, mediation and interchange of, and access to, research data by describing semantic relationships, in particular causal ones. It is not primarily a model for processing the data themselves in order to produce new research results, although its representations lend themselves to some kinds of processing.
It uses and extends the CIDOC CRM (ISO 21127) as a general ontology of human activity, things and events happening in spacetime. It uses the same encoding-neutral formalism of knowledge representation (“data model” in the sense of computer science) as the CIDOC CRM, which can be implemented in RDFS, OWL, an RDBMS and other forms of encoding. Since the model reuses parts of the CIDOC Conceptual Reference Model wherever appropriate, this document also provides a comprehensive list of all constructs used from ISO 21127, together with their definitions following version 5.1.2 as maintained by CIDOC.
The Scientific Observation Model has been developed bottom-up from specific metadata examples from biodiversity, geology, archaeology, cultural heritage conservation and clinical studies, such as water sampling in aquifer systems, earthquake shock recordings, landslides, excavation processes, species occurrence and detection of new species, tissue sampling in cancer research, and 3D digitization. Its development was based on communication with domain experts and on implementation and validation in concrete applications. It takes into account relevant standards, such as INSPIRE, OBOE, national archaeological standards for excavation, Digital Provenance models and others. Each application needs a further set of extensions in order to describe its data at an adequate level of specificity, such as the semantics of excavation layers or specimen capture in biology. However, the model presented here describes, together with the CIDOC CRM, a discipline-neutral level of genericity, which can be used to implement effective management functions and powerful queries for related data. It aims to provide superclasses and superproperties for any application-specific extension, such that any entity referred to by a compatible extension can be reached with a more general query based on this model.
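The idea that a general query reaches entities described by a more specific extension can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names and record identifiers below are hypothetical and are not actual CRMsci or CIDOC CRM identifiers, and the hierarchy is simplified to single inheritance, which the real models do not require.

```python
# Hypothetical class hierarchy, child -> parent.
# These names are illustrative only, not CRMsci identifiers.
SUBCLASS_OF = {
    "WaterSampling": "SampleTaking",        # application-specific extension
    "TissueSampling": "SampleTaking",       # application-specific extension
    "SpeciesOccurrenceObservation": "Observation",
    "SampleTaking": "Activity",             # generic level
    "Observation": "Activity",              # generic level
}

# Hypothetical metadata records, each typed with its most specific class.
INSTANCES = {
    "rec-001": "WaterSampling",
    "rec-002": "TissueSampling",
    "rec-003": "SpeciesOccurrenceObservation",
}


def superclasses(cls):
    """Return cls followed by all of its ancestors in the hierarchy."""
    chain = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def instances_of(cls):
    """Return the sorted ids of records whose class is cls or a subclass of it."""
    return sorted(
        rec for rec, rec_cls in INSTANCES.items() if cls in superclasses(rec_cls)
    )


if __name__ == "__main__":
    # A query at the generic level also finds records from both extensions.
    print(instances_of("SampleTaking"))
    print(instances_of("Activity"))
```

In an RDFS or OWL encoding the same effect would normally come from subclass and subproperty reasoning in the triple store rather than from hand-written traversal; the dictionary here only stands in for that mechanism.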
Besides application-specific extensions, this model is intended to be complemented by CRMgeo, a more detailed model and extension of the CIDOC CRM for generic spatiotemporal topology and geometric description, which is also currently available in a first stable version [CRMgeo, version 1.0 - Doerr, M. and Hiebel, G. 2013]. Details of spatial properties of observable entities should be modelled in CRMgeo. Because CRMgeo links the CIDOC CRM to the OGC GeoSPARQL standard, it makes available all GML constructs for specific spatial and temporal relationships. Models of the structures for describing quantities, such as IHS colors, volumes, velocities etc., are still to be developed.
This is an attempt to maintain a modular structure of multiple ontologies that are related and layered in a specialization-generalization relationship and divided into relatively self-contained units with few cross-correlations into other modules, such as a module describing quantities. This model aims to stay harmonized with the CIDOC CRM, i.e., its maintainers submit proposals for modifying the CIDOC CRM wherever adequate to guarantee the overall consistency, disciplinary adequacy and modularity of CRM-based ontology modules.
Contact info: Maria Theodoridou