This document explains the graph-based data model for the Data Collections Explorer.
The structure is as follows:
- Host
- A hosting institution can provide multiple, possibly different, services
- Service
- Each instance represents a service available to the public (e.g. Zenodo)
- SubjectArea
- Instances: should be provided by a controlled vocabulary in the future
- ServiceType
- Three subclasses: Collection, Discrete and Terminology. "Collection" encompasses everything that comprises more than one dataset, "Discrete" contains only the subclass "Dataset", while "Terminology" contains everything related to nomenclature. The classes "Collection", "Discrete", and "Terminology" are mutually disjoint.
- Collection:
- Archive
- Bibliography
- Catalogue
- Chemistry
- Database
- Digital_Library
- Encyclopedia
- Repository
- Community Repository
- Institutional Repository
- Discrete
- Dataset
- Terminology
- Ontology
- Terminology_Service
The classes "Service" and "ServiceType" are disjoint with the classes "Host" and "SubjectArea". All other classes are mutually disjoint.
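The hierarchy and the disjointness rule for the three ServiceType branches can be illustrated with a small sketch. This is only an illustration in plain Python, not part of the ontology: the mapping is transcribed from the list above, the placement of "Community Repository" and "Institutional Repository" under "Repository" is an assumption, and the helper names (`ancestors`, `branch`, `violates_disjointness`) are hypothetical.

```python
# Subclass tree of ServiceType as listed above (child -> parent).
# Assumption: the two repository kinds are subclasses of Repository.
PARENT = {
    "Collection": "ServiceType",
    "Discrete": "ServiceType",
    "Terminology": "ServiceType",
    "Archive": "Collection",
    "Bibliography": "Collection",
    "Catalogue": "Collection",
    "Chemistry": "Collection",
    "Database": "Collection",
    "Digital_Library": "Collection",
    "Encyclopedia": "Collection",
    "Repository": "Collection",
    "Community Repository": "Repository",
    "Institutional Repository": "Repository",
    "Dataset": "Discrete",
    "Ontology": "Terminology",
    "Terminology_Service": "Terminology",
}

# The three mutually disjoint branches directly below ServiceType.
TOP_LEVEL = {"Collection", "Discrete", "Terminology"}


def ancestors(cls):
    """Return cls followed by all of its superclasses, nearest first."""
    chain = [cls]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def branch(cls):
    """Return the top-level branch a class belongs to, or None."""
    for candidate in ancestors(cls):
        if candidate in TOP_LEVEL:
            return candidate
    return None


def violates_disjointness(types):
    """True if the given types span more than one disjoint branch."""
    branches = {branch(t) for t in types} - {None}
    return len(branches) > 1
```

For example, a service typed as both "Archive" and "Dataset" would span two disjoint branches, whereas "Archive" and "Database" both fall under "Collection".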
There are individuals of different types in the graph:
- SubjectArea: In the future, these could be imported from a controlled vocabulary.
- Host
- Service: These individuals have more than one type: Service itself and a second type taken from the ServiceType class hierarchy.
Individuals of the same type are all mutually different from one another.
All data properties are mutually disjoint.
- hasAPI
- Domain: Service class
- Range: xsd:string
- This might change in the future to indicate not only whether a service provides API access, but also the API type(s) and URLs.
- hasDatasetSizeLimit
- Domain: Service class
- Range: xsd:decimal[>=0]
- hasHostURL
- Domain: Host class
- Range: xsd:anyURI
- Characteristic: Functional
- hasPublicationCost
- Domain: Service class
- Range: xsd:string (for now; there are multiple non-numeric entries)
- Characteristic: Functional
- hasServiceURL
- Domain: Service class
- Range: xsd:anyURI
- Characteristic: Functional
- isOpenAccess
- Domain: Service class
- Range: xsd:string
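To make the constraints on the Service data properties concrete, the following is a minimal sketch of a check in plain Python. It is not how the ontology is validated in practice (a reasoner would do that); the record layout (a dict mapping property names to lists of values) and the function name `check_service` are assumptions made for illustration.

```python
from decimal import Decimal, InvalidOperation

# Data properties with domain Service that are declared Functional above.
FUNCTIONAL = {"hasPublicationCost", "hasServiceURL"}


def check_service(record):
    """Return a list of problems for a dict of property -> list of values."""
    problems = []

    # A functional property may carry at most one value per individual.
    for prop in sorted(FUNCTIONAL):
        if len(record.get(prop, [])) > 1:
            problems.append(f"{prop} is functional but has several values")

    # hasDatasetSizeLimit has range xsd:decimal[>=0].
    for raw in record.get("hasDatasetSizeLimit", []):
        try:
            value = Decimal(str(raw))
        except InvalidOperation:
            problems.append(f"hasDatasetSizeLimit {raw!r} is not a decimal")
            continue
        if not value.is_finite() or value < 0:
            problems.append(f"hasDatasetSizeLimit {raw!r} must be a decimal >= 0")

    return problems
```

Calling `check_service` on a record with two `hasServiceURL` values, or with a negative size limit, returns one message per violated constraint; an empty list means no problems were found.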
These are the currently available object properties:
- hasSubjectArea
- Domain: Service class
- Range: SubjectArea class
- Asymmetric
- Irreflexive
- hostsService
- Domain: Host class
- Range: Service class
- Asymmetric
- Irreflexive
- Inverse of isHostedBy
- isHostedBy
- Domain: Service class
- Range: Host class
- Asymmetric
- Irreflexive
- Inverse of hostsService
All object properties are mutually disjoint.
The current version of the Data Collections Explorer has a comment field. This is replicated as an annotation: the owl:annotatedSource is the host, with the owl:annotatedTarget being the service this comment is valid for; the owl:annotatedProperty is hostsService and the comment itself is an rdfs:comment.
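As an illustration of this annotation pattern, the sketch below renders such a comment as Turtle using the standard OWL 2 axiom annotation vocabulary (an anonymous node of type owl:Axiom). The `dce` prefix, the individual names `ExampleHost` and `ExampleService`, and the function name `comment_axiom` are hypothetical and not taken from DCE.ttl.

```python
def comment_axiom(host, service, comment, prefix="dce"):
    """Render a comment on the triple `host hostsService service` as Turtle.

    The prefix and the local names passed in are illustrative assumptions;
    the matching @prefix declarations are expected to exist elsewhere.
    """
    # Escape characters that may not appear raw in a short Turtle string.
    escaped = (
        comment.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return (
        "[] a owl:Axiom ;\n"
        f"    owl:annotatedSource {prefix}:{host} ;\n"
        f"    owl:annotatedProperty {prefix}:hostsService ;\n"
        f"    owl:annotatedTarget {prefix}:{service} ;\n"
        f'    rdfs:comment "{escaped}" .\n'
    )


print(comment_axiom("ExampleHost", "ExampleService", "Registration required."))
```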
Assuming you have a local instance of Apache Fuseki running, load DCE.ttl. The ontology IRI is https://data-collections.nfdi4ing.de/graph.
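Once the file is loaded, the graph can be queried over Fuseki's SPARQL endpoint. The following is a minimal sketch using only the Python standard library; it assumes Fuseki listens on its default port 3030 and that the data was loaded into a dataset named `dce`, which is a hypothetical name. The request is built but not sent, so the sketch also runs without a server.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(query, endpoint="http://localhost:3030/dce/sparql"):
    """Build (but do not send) a SPARQL GET request for a Fuseki endpoint.

    The default endpoint is an assumption: local Fuseki, dataset "dce".
    """
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Count individuals per class; assumes the triples are in the default graph.
QUERY = "SELECT ?type (COUNT(?s) AS ?n) WHERE { ?s a ?type } GROUP BY ?type"
request = build_sparql_request(QUERY)

# To execute against a running server:
#   import json
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       rows = json.load(response)["results"]["bindings"]
```

If the data was loaded into a named graph instead, the query pattern would need to be wrapped in a `GRAPH` clause.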
Some points that are still open:
- separate namespaces for hosts, services, properties, etc.