Dataset Management

Datasets Overview

To browse all loaded datasets, open the Datasets view by clicking the list icon in the left sidebar. The view lists every dataset that is currently loaded, together with metadata such as its status, tag, time of creation or last update, and number of entities and statements. The Search field in the upper right corner of the Datasets overview lets you search for a particular dataset among all loaded datasets.

For each dataset, you can continue with several actions, such as viewing the Dataset details by clicking the corresponding icon, or browsing the dataset directly by clicking the Table view icon or the Tree view icon. More information on browsing datasets can be found in the Search & Browse section.

Dataset Properties

The Dataset details view gives you an overview of the following dataset properties.

Property: Description
Name: A descriptive title that is displayed in the Datasets view.
Tag: A short and unique mnemonic abbreviation or code for the dataset. The tag is used as a shortcut throughout Accurids, e.g., in the search result display or in search filters.
Description: An informative description of the dataset.
Color: The color of the badge that indicates the dataset throughout different Accurids screens.
Load status: A dataset needs to be loaded before you can manage it. Possible values:
  • loading completed successfully
  • loading in progress
  • loading not yet started
  • loading failed
Index status: A dataset needs to be successfully indexed before you can work with its content. Possible values:
  • indexing completed successfully
  • indexing in progress
  • indexing not yet started
  • indexing failed
Analysis status: A dataset is analyzed regarding structure and quality. Possible values:
  • analysis completed successfully
  • analysis in progress
  • analysis not yet started
  • analysis failed
Created on: The date and time when the dataset was initialized.
Created by: The user who created the dataset.
Last updated on: The date and time when the content of the dataset was last changed.
Last updated by: The user who last changed the dataset content.
Storage Size: The storage size of the dataset.
Number of entities: The number of indexed entities.
Number of statements: The number of triples.
Number of unique predicates: The number of unique predicates.
Number of mappings: The total number of mapping triples contained in the dataset.
Hierarchy properties: The list of detected hierarchical properties such as rdfs:subClassOf or skos:broader. If no hierarchical property has been found in the dataset, the value is empty.
Cycles: The number of cycles formed by some hierarchical property.
ID Generator: The ID Generator associated with the dataset. Only available with the PID Generator Module.
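
To make the statistical properties more concrete, the following Python sketch derives a few of them from a small in-memory list of triples. It is illustrative only and does not describe how Accurids computes these values: the sample data, the function names `summarize` and `has_cycle`, and the rule that an entity is a subject with an rdf:type statement are all assumptions. The sketch also only reports whether a cycle exists; it does not count cycles.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
HIERARCHY_PROPERTIES = {"rdfs:subClassOf", "skos:broader"}

# Hypothetical sample data: (subject, predicate, object) triples.
TRIPLES = [
    ("ex:Animal", "rdf:type", "owl:Class"),
    ("ex:Dog", "rdf:type", "owl:Class"),
    ("ex:Dog", "rdfs:subClassOf", "ex:Animal"),
    ("ex:Animal", "rdfs:subClassOf", "ex:Dog"),  # deliberately closes a cycle
    ("ex:Dog", "rdfs:label", '"Dog"'),
]


def summarize(triples):
    """Return simple counts comparable to the statistics in the table."""
    return {
        "statements": len(triples),
        "unique_predicates": len({p for _, p, _ in triples}),
        # Assumption: an entity is a subject with an rdf:type statement.
        "entities": len({s for s, p, _ in triples if p == RDF_TYPE}),
        "hierarchy_properties": sorted(
            {p for _, p, _ in triples if p in HIERARCHY_PROPERTIES}
        ),
    }


def has_cycle(triples, prop):
    """Report whether the edges of one hierarchical property form a cycle."""
    graph = defaultdict(list)
    for s, p, o in triples:
        if p == prop:
            graph[s].append(o)

    unvisited, in_progress, done = 0, 1, 2
    state = defaultdict(int)  # every node starts as unvisited

    def visit(node):
        state[node] = in_progress
        for nxt in graph.get(node, []):
            if state[nxt] == in_progress:
                return True  # back edge: we returned to a node on the current path
            if state[nxt] == unvisited and visit(nxt):
                return True
        state[node] = done
        return False

    return any(state[n] == unvisited and visit(n) for n in list(graph))
```

Calling `summarize(TRIPLES)` yields the counts for the sample data, and `has_cycle(TRIPLES, "rdfs:subClassOf")` checks the hierarchy for a cycle using a depth-first search.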

Working with Datasets

Users in Accurids can be assigned a role that determines which actions they are allowed to perform. Regarding dataset management, standard users cannot upload or edit any dataset. Contributors can upload datasets and edit or delete the datasets they uploaded. Admins can upload, edit or delete any loaded dataset, regardless of who uploaded it in the first place. More information on user roles can be found in the Platform Administration section.
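
The role rules above can be summarized as a small permission check. This is a minimal sketch of the described behavior, not Accurids code: the names `Role`, `Account`, `can_upload` and `can_modify` are hypothetical, and it assumes that contributors may only edit or delete datasets they uploaded themselves.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    USER = "user"
    CONTRIBUTOR = "contributor"
    ADMIN = "admin"


@dataclass(frozen=True)
class Account:
    name: str
    role: Role


def can_upload(account):
    """Contributors and admins may upload datasets; standard users may not."""
    return account.role in (Role.CONTRIBUTOR, Role.ADMIN)


def can_modify(account, uploaded_by):
    """Decide whether an account may edit or delete a dataset."""
    if account.role is Role.ADMIN:
        return True  # admins may change any dataset
    if account.role is Role.CONTRIBUTOR:
        return account.name == uploaded_by  # assumption: own datasets only
    return False  # standard users have read-only access
```

For example, `can_modify(Account("bob", Role.CONTRIBUTOR), uploaded_by="alice")` evaluates the contributor rule for a dataset uploaded by someone else.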

Creating a New Dataset

In the Datasets view, click the button in the upper right corner to create a new dataset. In the dialog that opens, specify the name, the tag and, optionally, a description.

Click ADD FILES to select all files that should be loaded into the dataset. You can continue to add data from files by repeatedly clicking ADD FILES or by selecting graphs from a SPARQL Endpoint.

Finally, click SAVE DATASET to start the upload. You can monitor the loading and indexing progress in the Datasets view. Once loading and indexing have completed successfully, you are notified and can start searching the dataset.

Updating the Content of a Dataset

To update the content of a dataset, click the update icon in the Dataset details view.

The update will clear the existing content of the dataset and upload the new data into it.

If you only want to change the name, tag, description or color of the dataset, you can do so directly in the Dataset details view by clicking the pencil icon.

Configuration of a SPARQL Endpoint

To configure a SPARQL Endpoint, click the corresponding icon and fill in the required parameters. The endpoint configuration is stored per user and is visible only to you.

Add a Dataset From a SPARQL Endpoint

Within the dataset creation and dataset update dialogs you can choose to provide data from configured SPARQL Endpoints.

  1. Select a configured SPARQL Endpoint and choose all graphs you want to include in the dataset.
  2. Finally, click SAVE DATASET.

You can monitor the loading and indexing progress in the Datasets view.
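
For orientation, the graphs offered for selection correspond to the named graphs exposed by the endpoint. How Accurids retrieves them is not documented here; the sketch below only shows a standard SPARQL 1.1 query that lists named graphs and how such a query is encoded as a SPARQL 1.1 Protocol GET request. The endpoint URL and the helper name `build_query_url` are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# Standard SPARQL 1.1 query listing every named graph that holds at least one triple.
LIST_GRAPHS_QUERY = "SELECT DISTINCT ?g WHERE { GRAPH ?g { ?s ?p ?o } }"


def build_query_url(endpoint_url, query):
    """Build a SPARQL 1.1 Protocol GET URL that carries the query as a parameter."""
    separator = "&" if "?" in endpoint_url else "?"
    return endpoint_url + separator + urlencode({"query": query})


# Hypothetical endpoint URL, used only for illustration.
url = build_query_url("https://example.org/sparql", LIST_GRAPHS_QUERY)
print(url)
```

Running the same query manually against your endpoint can help confirm which graphs are available before you configure it in Accurids.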

Download a Dataset

In the Dataset details view, click the download icon to download the dataset. Depending on the file size, this may take several minutes.

Remove a Dataset

To remove a dataset, go to the Dataset details view and click the trashcan icon, then confirm the deletion. This process cannot be undone.

Search for a Dataset

The Search field in the upper right corner of the Datasets overview allows you to search for a particular dataset within all loaded datasets.

Dataset Requirements

To be successfully loaded, indexed and displayed, a dataset has to fulfill the following requirements:

  • RDF Syntax: The dataset must be syntactically valid RDF in one of the supported serializations, such as Turtle, N3 or RDF/XML.
  • RDF Type: Every entity that should be indexed must have an rdf:type statement.
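
Before uploading, it can be useful to check the rdf:type requirement locally. The following sketch is a deliberately simplified illustration and not part of Accurids: it assumes the data is available as N-Triples (a line-based subset of Turtle), handles only IRI subjects and predicates, and uses the hypothetical function name `subjects_without_type`. For real data, a full RDF parser library is the more robust choice.

```python
import re

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Simplified pattern for one N-Triples statement with an IRI subject and predicate.
# It ignores blank-node subjects and does not validate the object term.
TRIPLE = re.compile(r"^<([^>]*)>\s+<([^>]*)>\s+(.+?)\s*\.\s*$")


def subjects_without_type(ntriples_text):
    """Return the subjects that never appear with an rdf:type statement."""
    subjects, typed = set(), set()
    for line in ntriples_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE.match(line)
        if match is None:
            raise ValueError(f"not a supported N-Triples line: {line!r}")
        subject, predicate, _ = match.groups()
        subjects.add(subject)
        if predicate == RDF_TYPE:
            typed.add(subject)
    return sorted(subjects - typed)


# Hypothetical sample data: the second entity has a label but no rdf:type.
SAMPLE = """
<http://example.org/dog> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example.org/Animal> .
<http://example.org/dog> <http://www.w3.org/2000/01/rdf-schema#label> "Dog" .
<http://example.org/cat> <http://www.w3.org/2000/01/rdf-schema#label> "Cat" .
"""

missing = subjects_without_type(SAMPLE)
```

Any subject returned by the function lacks an rdf:type statement and, under the requirement above, would not be indexed.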