Exploring an RDF Graph Database

mbarbieri
Published in Level Up Coding
9 min read · Feb 16, 2021

This demonstration contains some basic queries and tools to help you explore an RDF Graph Database. It uses features of Stardog* and GraphDB* to:

  • Search and browse classes and properties.
  • Explore connections in the data and visualise query results graphically as nodes and edges.
  • Visualise schemas and relationships.

* “These are core semantic technologies for any Enterprise Knowledge Graph (EKG)”

You may skip the environment setup described below if you do not want to execute the queries yourself and only want to browse the results in the screenshots.

To run the queries in this demonstration, you will need to set up the Northwind RDF graph database on your local machine. Please follow the instructions in the “Setting up the Northwind database on Stardog -or- GraphDB” sections at the end of the Northwind SQL vs SPARQL article.

The queries used in this demonstration can be downloaded here.

Exploring the Graph

Counting triples

Counting all triples in the default graph of the RDF Graph Database.

SELECT (COUNT(?s) as ?numTriples)
WHERE {
?s ?p ?o .
}
GraphDB

You will find more details on default and named graphs on Stardog and GraphDB RDF Graph Databases further down in this article.
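
Outside the workbench, the same count can be requested over the SPARQL 1.1 Protocol. The sketch below only builds the HTTP request with Python's standard library and does not send it. The endpoint URL is an assumption (a local GraphDB install on its default port with a hypothetical repository named northwind), and build_sparql_request is an illustrative helper name, so adjust both to your own setup.

```python
import urllib.parse
import urllib.request

# Assumption: a local GraphDB repository called "northwind" on the default port.
# A Stardog query endpoint typically looks like http://localhost:5820/<database>/query
# and will usually require credentials as well.
ENDPOINT = "http://localhost:7200/repositories/northwind"

COUNT_TRIPLES = """
SELECT (COUNT(?s) AS ?numTriples)
WHERE {
  ?s ?p ?o .
}
"""


def build_sparql_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a SPARQL Protocol query request."""
    # The protocol accepts the query as a URL-encoded form field named "query".
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            # Ask for the standard JSON serialisation of SELECT results.
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


request = build_sparql_request(ENDPOINT, COUNT_TRIPLES)
```

Passing the request to urllib.request.urlopen against a running endpoint should return a JSON document in the SPARQL results format, with the count under results.bindings[0].numTriples.value.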

Selecting triples

Selecting all triples in the default graph of the RDF Graph Database.

SELECT * {?s ?p ?o}
GraphDB

Selecting properties of a class

In the following example, the query returns all properties of the class “Order” in the default graph of an RDF Graph Database on Stardog.

SELECT DISTINCT
?domain ?prop ?range
WHERE {
?subject ?prop ?object .
?subject a ?domain .
OPTIONAL {
?object a ?oClass .
}
BIND(IF(BOUND(?oClass), ?oClass, DATATYPE(?object)) AS ?range)
FILTER (?prop != rdf:type && ?prop != rdfs:domain && ?prop != rdfs:range
&& ?domain = :Order
)
}
Stardog Studio

Counting predicates

Counting all predicates in the default graph of the RDF Graph Database.

SELECT
?predicate (COUNT(?predicate) as ?predicateCount)
WHERE {
?subject ?predicate ?object .
}
GROUP BY
?predicate
ORDER BY
DESC(?predicateCount)
Stardog Studio

Counting class instances

Counting all class instances in the default graph of the RDF Graph Database.

SELECT
?class (COUNT(?subject) as ?classCount)
WHERE {
?subject rdf:type ?class .
FILTER (?class != rdfs:Class && ?class != rdf:Property)
}
GROUP BY
?class
ORDER BY
DESC(?classCount)
Stardog Studio

The query above can serve as a basic first step in reconciling a data migration, for example by comparing the instance counts against the table record counts in the source relational database. Here is the equivalent in SQL:

SELECT 'OrderDetail' AS TableName, COUNT(*) AS RecordCount FROM [dbo].[OrderDetail] UNION
SELECT 'Order', COUNT(*) FROM [dbo].[Order] UNION
SELECT 'Customer', COUNT(*) FROM [dbo].[Customer] UNION
SELECT 'Product', COUNT(*) FROM [dbo].[Product] UNION
SELECT 'Territory', COUNT(*) FROM [dbo].[Territory] UNION
SELECT 'Supplier', COUNT(*) FROM [dbo].[Supplier] UNION
SELECT 'Employee', COUNT(*) FROM [dbo].[Employee] UNION
SELECT 'Category', COUNT(*) FROM [dbo].[Category] UNION
SELECT 'Region', COUNT(*) FROM [dbo].[Region] UNION
SELECT 'Shipper', COUNT(*) FROM [dbo].[Shipper]
ORDER BY RecordCount DESC
Azure Data Studio
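
Once both result sets are exported, the comparison itself can be automated. The following is a minimal sketch, assuming each class's local name matches its source table name (as with :Order and :Customer in the queries here). The helper names and the counts are placeholders for illustration, not real Northwind figures.

```python
NORTHWIND = "http://www.mysparql.com/resource/northwind/"


def local_name(iri: str) -> str:
    """Strip the Northwind namespace so a class IRI lines up with a table name."""
    return iri[len(NORTHWIND):] if iri.startswith(NORTHWIND) else iri


def reconcile(table_counts: dict, class_counts: dict) -> dict:
    """Return {name: (table_count, class_count)} for every name whose counts differ."""
    rdf_counts = {local_name(iri): n for iri, n in class_counts.items()}
    mismatches = {}
    # Walk the union of names so a table or class missing on one side is reported too.
    for name in sorted(set(table_counts) | set(rdf_counts)):
        pair = (table_counts.get(name, 0), rdf_counts.get(name, 0))
        if pair[0] != pair[1]:
            mismatches[name] = pair
    return mismatches


# Placeholder numbers for illustration only; paste in the real query results.
table_counts = {"Order": 100, "Customer": 20, "Shipper": 3}
class_counts = {
    NORTHWIND + "Order": 100,
    NORTHWIND + "Customer": 19,
    NORTHWIND + "Shipper": 3,
}

mismatches = reconcile(table_counts, class_counts)
print(mismatches)  # {'Customer': (20, 19)}
```

An empty result means every table count matches its class count. Anything else points to the tables worth investigating first.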

Counting triples in graphs

Counting triples in the default graph and in each named graph of the RDF Graph Database.

Let us first populate some example graphs before we continue.

Populate graph1

PREFIX ns: <http://mysparql.ai/ns#>
INSERT DATA {
GRAPH ns:graph1 {
ns:book1 ns:price 10 .
}
}

Populate graph2

PREFIX ns: <http://mysparql.ai/ns#>
INSERT DATA {
GRAPH ns:graph2 {
ns:book1 ns:price 10 .
ns:book2 ns:price 20 .
}
}

Populate graph3

PREFIX ns: <http://mysparql.ai/ns#>
INSERT DATA {
GRAPH ns:graph3 {
ns:book1 ns:price 10 .
ns:book2 ns:price 20 .
ns:book3 ns:price 30 .
}
}

The following example will return a triple count for each of the named graphs populated above plus the default graph that was created when the Northwind sample database was imported.

SELECT
?g (COUNT(*) AS ?Count)
WHERE {
{
GRAPH ?g {?s ?p ?o}
} UNION {
?s ?p ?o
BIND("default" AS ?g)
}
}
GROUP BY ?g
ORDER BY DESC(?Count)
Stardog Studio
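
To predict what this query should return, it can help to model the dataset as a list of quads. The sketch below is a plain-Python stand-in, not a triple store. It assumes the default graph is kept separate from the named graphs, and it uses a single hypothetical triple in place of the Northwind data.

```python
from collections import Counter

# Each quad is (graph, subject, predicate, object); None marks the default graph.
quads = [
    ("ns:graph1", "ns:book1", "ns:price", 10),
    ("ns:graph2", "ns:book1", "ns:price", 10),
    ("ns:graph2", "ns:book2", "ns:price", 20),
    ("ns:graph3", "ns:book1", "ns:price", 10),
    ("ns:graph3", "ns:book2", "ns:price", 20),
    ("ns:graph3", "ns:book3", "ns:price", 30),
    # Hypothetical stand-in for the Northwind triples in the default graph.
    (None, ":order-10248", "rdf:type", ":Order"),
]


def count_per_graph(quads):
    """Mimic GROUP BY ?g with COUNT(*), labelling the default graph "default"."""
    counts = Counter(
        "default" if graph is None else graph
        for graph, _subject, _predicate, _object in quads
    )
    # most_common() orders by count, largest first, like ORDER BY DESC(?Count).
    return counts.most_common()


per_graph = count_per_graph(quads)
```

With the three INSERT DATA statements above, graph3, graph2 and graph1 should report 3, 2 and 1 triples respectively. On a store that exposes named graphs through the default graph, the "default" row would also include those six triples.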

RDF Named Graphs is an extensive topic. For more information, please refer to the following article:

Constructing graphs

The CONSTRUCT query form returns a single RDF graph specified by a graph template. The result is an RDF graph formed by taking each query solution in the solution sequence, substituting for the variables in the graph template, and combining the triples into a single RDF graph by set union.

Using CONSTRUCT, it is possible to extract part or all of a graph from the target RDF dataset.
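
The template substitution and set union described above can be imitated in a few lines of Python. This is a simplified model rather than a SPARQL engine: terms are plain strings, variables start with "?", and the property names in the sample solutions are hypothetical.

```python
def construct(template, solutions):
    """Instantiate the template once per solution and union the resulting triples."""
    graph = set()  # a set collapses duplicates, which gives the "set union"
    for solution in solutions:
        for pattern in template:
            # Replace each variable with its binding; constants pass through unchanged.
            triple = tuple(solution.get(term, term) for term in pattern)
            # A triple that still contains an unbound variable is left out.
            if any(term.startswith("?") for term in triple):
                continue
            graph.add(triple)
    return graph


template = [("?domain", "?prop", "?range")]

# Hypothetical solutions: two orders pointing at customers, one order date.
solutions = [
    {"?domain": ":Order", "?prop": ":customer", "?range": ":Customer"},
    {"?domain": ":Order", "?prop": ":customer", "?range": ":Customer"},
    {"?domain": ":Order", "?prop": ":orderDate", "?range": "xsd:dateTime"},
]

graph = construct(template, solutions)
```

The two identical solutions collapse into one triple, which is why a template over classes rather than instances yields a compact graph no matter how many instances match.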

The following query constructs a schema-level graph (class, property, range) covering the entire Northwind dataset.

CONSTRUCT {
?domain ?prop ?range
}
WHERE {
?subject ?prop ?object .
?subject a ?domain .
OPTIONAL {
?object a ?oClass .
}
BIND(IF(BOUND(?oClass), ?oClass, DATATYPE(?object)) as ?range)
FILTER (?prop != rdf:type && ?prop != rdfs:domain && ?prop != rdfs:range)
}

In GraphDB (screenshots below), select “Run” (1) to produce a tabular result or “Visual” (2) to get the visual graph representation.

Note that the “Inferred Data” and “Same As” options must be turned off, as illustrated.

GraphDB
GraphDB

Constructing a graph with nodes and edges for a given class

Sometimes visual graphs can become cluttered, especially for a large and complex RDF Graph Database. More focused graphs can be created by filtering the classes to be displayed.

The following example generates a visual graph around the Order class.

CONSTRUCT {
?domain ?prop ?range
}
WHERE {
?subject ?prop ?object .
?subject a ?domain .
OPTIONAL {
?object a ?oClass .
}
BIND(IF(BOUND(?oClass), ?oClass, DATATYPE(?object)) as ?range)
FILTER (?prop != rdf:type && ?prop != rdfs:domain && ?prop != rdfs:range
&& ?domain = :Order
)
}
Stardog Studio
Stardog Studio

Note that other classes may still appear (e.g. Customer, Employee) on the graph, as they are nodes referenced by the outgoing edges of the Order class.

Constructing a graph with nodes and edges for a given list of classes

Here is another example with a list of classes.

CONSTRUCT {
?domain ?prop ?range
}
WHERE {
?subject ?prop ?object .
?subject a ?domain .
OPTIONAL {
?object a ?oClass .
}
BIND(IF(BOUND(?oClass), ?oClass, DATATYPE(?object)) as ?range)
FILTER (?prop != rdf:type && ?prop != rdfs:domain && ?prop != rdfs:range
&& ?domain IN ( :Order, :Customer, :OrderDetail, :Product )
)
}
Stardog Studio

The following is an example with Order and Customer classes.

Run the query and then click on “Visual” to generate the graph in GraphDB.

PREFIX : <http://www.mysparql.com/resource/northwind/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
CONSTRUCT {
?domain ?prop ?range
}
WHERE {
?subject ?prop ?object .
?subject a ?domain .
OPTIONAL {
?object a ?oClass .
}
BIND(IF(BOUND(?oClass), ?oClass, DATATYPE(?object)) as ?range)
FILTER (?prop != rdf:type && ?prop != rdfs:domain && ?prop != rdfs:range
&& ?domain IN ( :Order, :Customer )
)
}
GraphDB
GraphDB

Describing resources

The DESCRIBE form returns a single result RDF graph containing RDF data about resources. The shape of this data is not prescribed by the query itself, which would require the client to know how the RDF is structured in the data source. Instead, it is determined by the SPARQL query processor.

The following three examples show how to return all outgoing edges of a node using GraphDB.

Describing an OrderDetail

Describing an instance of the OrderDetail class.

PREFIX   : <http://www.mysparql.com/resource/northwind/>
DESCRIBE :orderDetail-10248-72
GraphDB
GraphDB

Describing an Order

Describing an instance of the Order class.

PREFIX   : <http://www.mysparql.com/resource/northwind/>
DESCRIBE :order-10248
GraphDB
GraphDB

Describing a Product

Describing an instance of the Product class.

PREFIX   : <http://www.mysparql.com/resource/northwind/>
DESCRIBE :product-1
GraphDB

In Stardog, running a query with the “Text” option selected produces a single result graph containing RDF data about resources, as shown below.

Stardog Studio

When selecting the “Visual” option, a graphical representation of nodes and edges is produced.

Stardog Studio

Searching triples

Searching triples across all graphs in an RDF Graph Database.

The following query searches for category-3 in the subject, predicate, and object positions of triples across all graphs (named and default) of the RDF Graph Database. It also returns the name of the graph each triple belongs to.

SELECT
?g ?s ?p ?o
WHERE {
{
GRAPH ?g { ?s ?p ?o }
} UNION {
?s ?p ?o
BIND("default" AS ?g)
}
FILTER (
(CONTAINS (STR(?s), ?searchString)) ||
(CONTAINS (STR(?p), ?searchString)) ||
(CONTAINS (STR(?o), ?searchString))
)
BIND("category-3" AS ?searchString)
}
ORDER BY
?g ?s
GraphDB

Behaviour and configuration vary from one RDF Graph Database vendor to another when it comes to default and named graphs.
The query above aims to catch all triples, regardless of the database vendor, query environment or protocol configuration. This can be useful when investigating data issues, cherry-picking RDF raw data provided by an ETL process, performing data reconciliations, exporting data for analytics, etc.
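
When the search term comes from a script or a user rather than being typed into the query editor, it should be escaped before it is placed in the query text. The sketch below is one way to do that in Python. The function names are illustrative, and the BIND is moved ahead of the FILTER purely for readability.

```python
def sparql_string(text: str) -> str:
    """Quote text as a SPARQL string literal, escaping special characters."""
    escaped = (
        text.replace("\\", "\\\\")  # backslashes first, so later escapes survive
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


SEARCH_TEMPLATE = """SELECT ?g ?s ?p ?o
WHERE {
  {
    GRAPH ?g { ?s ?p ?o }
  } UNION {
    ?s ?p ?o
    BIND("default" AS ?g)
  }
  BIND(%SEARCH% AS ?searchString)
  FILTER (
    CONTAINS(STR(?s), ?searchString) ||
    CONTAINS(STR(?p), ?searchString) ||
    CONTAINS(STR(?o), ?searchString)
  )
}
ORDER BY ?g ?s"""


def build_search_query(text: str) -> str:
    """Return the all-graphs search query for the given search term."""
    return SEARCH_TEMPLATE.replace("%SEARCH%", sparql_string(text))


query = build_search_query("category-3")
```

The resulting text can be pasted into the GraphDB or Stardog query editor, or sent to the endpoint from a script.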

Visualisation and Query Tools

Visualisation and natural language query tools (and query builders) can offer great options to explore an RDF Graph Database.

Note that the tools below require all named graphs to be available in the default graph for them to work properly. This is the default behaviour in GraphDB and can be accomplished in Stardog by setting the query.all.graphs=true database property.

The following visualisation was created using metaphacts.

Sparklis is a structured language query builder and offers a very intuitive way of exploring an RDF Graph Database. The tool generates the SPARQL query for you automatically.
