A Step-by-Step Guide
Deployment Instructions
Installation
Debian distribution
Semagrow is available as a .deb distribution. To install it on Debian-based systems, execute the following in a terminal as root:
echo 'deb http://semagrow.semantic-web.at/deb/ lucid free' > /etc/apt/sources.list.d/semagrow.list
wget -q http://semagrow.semantic-web.at/deb/packages.semagrow.key -O- | apt-key add -
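The two commands above only register the repository and its signing key; the package itself still has to be installed. The following is a minimal sketch of that step, assuming the package is named semagrow. The package name is not stated in this guide and is inferred from the service name, and SEMAGROW_PACKAGE and install_semagrow are hypothetical helper names.

```shell
# Assumption: the package name matches the service name used in this guide.
SEMAGROW_PACKAGE="semagrow"

# Hypothetical helper: refresh the package index, then install the package.
# Run as root, or prefix both apt-get calls with sudo.
install_semagrow() {
    # Make the newly added repository visible to apt.
    apt-get update || return 1
    # Install without an interactive confirmation prompt.
    apt-get install -y "$SEMAGROW_PACKAGE"
}
```

Calling install_semagrow performs the installation. If apt cannot find the package, list what the repository actually offers before retrying.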
Semagrow requires Java 8 or later. To start the Semagrow endpoint, issue:
service semagrow start
At this point, the Semagrow endpoint should be up and running, which can be verified with:
service semagrow status
[ ok ] SemaGrow Stack is running with pid **.
Semagrow is a SPARQL endpoint meant to be used by a client application, but a human-usable Web app is also provided for testing and monitoring. The Semagrow Web app can be accessed at http://localhost:8080/SemaGrow
Docker image
Semagrow is also available as a Docker image. To pull and run the image, execute the following in a terminal:
docker run semagrow/semagrow:latest
Build from sources
Semagrow is an open-source project developed on GitHub. The source repository can be cloned from https://github.com/semagrow/semagrow and the most recent stable version can be downloaded from the master branch. This is always the version from which the Debian and Docker distributions are produced. The current stable version is 1.4.0.
Maven is required in order to build the sources.
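As a rough sketch, cloning and building could look like the following. The repository URL is the one given above. The Maven goals "clean install" are a generic invocation and an assumption here, since the project may expect particular profiles or options. SEMAGROW_REPO and build_semagrow are hypothetical names, and git, a JDK and mvn are assumed to be on the PATH.

```shell
# Repository URL as given in this guide.
SEMAGROW_REPO="https://github.com/semagrow/semagrow"

# Hypothetical helper: clone the sources into ./semagrow and build them.
build_semagrow() {
    # Fetch the sources (the default branch is used).
    git clone "$SEMAGROW_REPO" semagrow || return 1
    # Build in a subshell so the caller's working directory is unchanged.
    # Assumption: a plain "clean install" is sufficient for this project.
    ( cd semagrow && mvn clean install )
}
```

Running build_semagrow from an empty working directory leaves the build output under the semagrow checkout.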
Configuration
As a bare minimum, one must declare the remote endpoints that Semagrow federates. These are specified in RDF using the Turtle format. By default, Semagrow looks at /etc/default/semagrow/metadata.ttl to find information about its federation. A metadata.ttl can be as minimal as:
More complex configuration files provide Semagrow with important metadata and statistics about the contents of the federated endpoints. Please consult the configuration page about generating such files.
Each time the metadata.ttl file is modified, the Semagrow service must be restarted in order to read in the modified configuration:
service semagrow restart
Usage
Although Semagrow is meant to be queried by client applications, the Web app at http://localhost:8080/SemaGrow is convenient for testing and monitoring. Clicking on the “Sparql” tab presents the SPARQL query environment.
For our simple example, we will use the metadata.ttl provided in the Semagrow repository. This metadata.ttl describes the AGRIS endpoint, which serves an agricultural science bibliography.
You can see a simple usage of the SemaGrow stack by submitting a SPARQL query that retrieves the number of images published between the years 2006 and 2008:
Press the Execute button to execute the query. The results are presented in JSON format.
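Since the endpoint is meant for client applications, a query can also be sent over HTTP instead of through the Web app. The sketch below follows the standard SPARQL 1.1 Protocol (a GET request with a URL-encoded query parameter) and asks for JSON results. The endpoint path /SemaGrow/sparql is an assumption that this guide does not confirm, semagrow_query and SEMAGROW_ENDPOINT are hypothetical names, and the query in the usage comment is a generic placeholder rather than the example query of this guide.

```shell
# Assumption: the SPARQL endpoint is served under the Web app at /sparql.
# Override SEMAGROW_ENDPOINT in the environment if your deployment differs.
SEMAGROW_ENDPOINT="${SEMAGROW_ENDPOINT:-http://localhost:8080/SemaGrow/sparql}"

# Hypothetical helper: send the query given as $1 with HTTP GET
# and request the results in SPARQL JSON format.
semagrow_query() {
    curl --silent --get \
        --header 'Accept: application/sparql-results+json' \
        --data-urlencode "query=$1" \
        "$SEMAGROW_ENDPOINT"
}

# Usage (placeholder query):
# semagrow_query 'SELECT * WHERE { ?s ?p ?o } LIMIT 5'
```

Using --data-urlencode together with --get lets curl handle the escaping of braces, spaces and question marks in the query string.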
To demonstrate federated querying, we will now add a second dataset to the federation by replacing metadata.ttl with a new configuration file, also available in the Semagrow repository. This new configuration federates AGRIS with a dataset that annotates AGRIS resources with a “clean publication year” property that disambiguates and normalizes publication years. The new dataset does not add publications, but its publication year is guaranteed to be an integer, so the same query yields more results.