What is the problem with Semantic Graph Database Benchmarks?

mbarbieri · Published in Level Up Coding · 4 min read · May 31, 2022


Benchmarks for semantic graph databases usually provide the tools and guidelines for vendors to set up and run the benchmarks themselves and to publish the results on public websites. In practice, however, vendors very often publish only partial, non-audited results, for example data-import figures on W3C.org, and ignore the other aspects of the benchmark. In addition, only some benchmarks exercise reasoning (one of the main selling points of RDF Graph databases) or can simulate enterprise environments in which concurrent users execute realistic workloads of use-case-motivated queries over a period of time. That kind of simulation is particularly important for determining the scalability of the system under test and exposing data concurrency bottlenecks. The benchmarks that do cover all of the above do not appear to be widely used by RDF Graph database vendors.
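To make concrete what such a workload simulation involves, here is a minimal sketch of a driver in which several concurrent "users" fire a mix of SPARQL queries at an endpoint for a fixed period while per-query latencies are recorded. The endpoint URL and the two queries are placeholders rather than part of any particular benchmark; the HTTP interaction follows the standard SPARQL 1.1 Protocol (a form-encoded POST with a query parameter).

    import time
    from concurrent.futures import ThreadPoolExecutor

    import requests  # plain HTTP client; the SPARQL 1.1 Protocol is just HTTP

    # Placeholder endpoint and query mix: substitute the system under test and
    # the benchmark's use-case-motivated queries.
    ENDPOINT = "http://localhost:7200/repositories/benchmark"
    QUERY_MIX = [
        "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }",
        "SELECT ?s WHERE { ?s a <http://example.org/Product> } LIMIT 100",
    ]

    def run_user(user_id: int, duration_s: int = 60) -> list[float]:
        """Simulate one concurrent user issuing the query mix for a fixed period."""
        latencies = []
        deadline = time.monotonic() + duration_s
        while time.monotonic() < deadline:
            for query in QUERY_MIX:
                start = time.monotonic()
                response = requests.post(
                    ENDPOINT,
                    data={"query": query},
                    headers={"Accept": "application/sparql-results+json"},
                    timeout=30,
                )
                response.raise_for_status()
                latencies.append(time.monotonic() - start)
        return latencies

    if __name__ == "__main__":
        n_users = 8  # concurrency level under test
        with ThreadPoolExecutor(max_workers=n_users) as pool:
            per_user = list(pool.map(run_user, range(n_users)))
        latencies = sorted(t for user in per_user for t in user)
        p95 = latencies[int(0.95 * len(latencies))]
        print(f"queries executed: {len(latencies)}, 95th percentile latency: {p95:.3f}s")

A real benchmark driver additionally validates result correctness and controls think time and query distribution per simulated user, which is exactly what import-only figures leave out.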

Most semantic graph database benchmark websites have not been maintained, or have not released audited results, for years. Apart from the partial results published by some vendors, there is not much going on.

We see this as a severe issue for the semantic graph database market to overcome: selecting a graph database based on hard facts is extremely difficult for prospective buyers who do not yet have any experience with using RDF Graph databases for mission-critical use cases.

Benchmark Facts

  • Very few official results (audited by Benchmark providers) have been published by RDF and Property Graph database vendors
  • Most RDF Graph database vendors publish unofficial partial results, e.g. data import only, without query timings
  • Not all benchmarks can provide realistic transactional and analytics workloads
  • The ones that can are not very popular among RDF Graph database vendors, with very few tests executed and results published
  • Not all benchmarks support reasoning
  • As of the date of this article, there is no reliable and complete source of benchmark results that prospective buyers can use to select an RDF Graph database

Benchmarks available

The following are the main graph database benchmarks available today.

The W3C.org RDF Store Benchmarking page collects references to RDF Graph benchmarks, benchmarking results and papers about graph benchmarking. You will notice on the W3C.org Large Triple Stores page that most vendors have published results for the Lehigh University Benchmark (LUBM). Being the most popular RDF Graph benchmark on the market does not make it the most complete, however: LUBM has a series of shortcomings that we have been working to overcome.
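One reason reasoning support matters so much here is that many of LUBM's queries only return complete answers when the store applies the ontology's class hierarchy, for example when a query for students should also match instances typed only as graduate students. The following minimal sketch, which uses rdflib plus the owlrl package and a made-up example namespace rather than the real LUBM ontology and data, shows how such a query comes back empty without inference and non-empty once an RDFS closure is materialised.

    from rdflib import Graph, Namespace, RDF, RDFS
    import owlrl  # forward-chaining RDFS/OWL-RL closure over an rdflib graph

    # Tiny stand-in for a LUBM-style class hierarchy (not the real univ-bench
    # ontology): GraduateStudent is a subclass of Student, and the only instance
    # is typed as GraduateStudent.
    EX = Namespace("http://example.org/univ#")
    g = Graph()
    g.add((EX.GraduateStudent, RDFS.subClassOf, EX.Student))
    g.add((EX.alice, RDF.type, EX.GraduateStudent))

    query = """
    PREFIX ex: <http://example.org/univ#>
    SELECT ?s WHERE { ?s a ex:Student }
    """

    print("without reasoning:", list(g.query(query)))  # [] -- no explicit ex:Student

    # Materialise the RDFS entailments, as a reasoning-capable store would.
    owlrl.DeductiveClosure(owlrl.RDFS_Semantics).expand(g)

    print("with reasoning:", list(g.query(query)))  # ex:alice is now a Student

A benchmark that never exercises this path says nothing about how a store's reasoner behaves under load.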

In an Enterprise Knowledge Graph (EKG) context, the problem of not having proper benchmarks becomes even more apparent, since in an EKG you are running hundreds of wildly different use cases across many of your lines of business that utilise multiple different types of graph databases. For each use case we need to be able to determine which backend graph database is best positioned to serve it. A use case could even run across multiple graph databases (for instance a combination of Amazon Neptune and RDFox, or a combination of Stardog or Ontotext and Neo4j or TigerGraph).
We basically have to integrate benchmarking with monitoring in the full DevOps (or DataOps / SemOps) process and move towards "Continuous Benchmarking", based on benchmark-test-scenarios defined as part of every model-driven use case in the EKG.
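As an illustration only (this is not an EKGF specification), a benchmark-test-scenario attached to a use case could be as simple as a small declarative record that a CI pipeline picks up, runs against each candidate backend with a driver like the one sketched earlier, and fails when a latency budget is exceeded:

    from dataclasses import dataclass

    @dataclass
    class BenchmarkScenario:
        """Illustrative benchmark-test-scenario attached to a model-driven use case."""
        use_case: str                 # business use case the scenario belongs to
        endpoints: list[str]          # candidate backend SPARQL endpoints
        query_mix: list[str]          # use-case-motivated queries
        concurrent_users: int = 8
        duration_s: int = 300
        p95_budget_s: float = 2.0     # the CI run fails if this budget is exceeded

    SCENARIOS = [
        BenchmarkScenario(
            use_case="customer-360-lookup",
            endpoints=[
                "http://neptune.internal/sparql",           # placeholder URLs
                "http://graphdb.internal/repositories/kg",
            ],
            query_mix=["SELECT ?p ?o WHERE { <urn:customer:42> ?p ?o }"],
        ),
    ]

    def within_budget(scenario: BenchmarkScenario, observed_p95_s: float) -> bool:
        """Gate a deployment on the observed 95th-percentile latency."""
        return observed_p95_s <= scenario.p95_budget_s

Keeping such definitions under version control next to the use-case models is what makes the benchmarking continuous rather than a one-off exercise.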

For more info, refer to the Enterprise Knowledge Graph Foundation (EKGF) at ekgf.org.

Solving the problem

At agnos.ai we think that graph database benchmarks are very important for the success of the market, in the same way that tpc.org has been for the relational database market for decades. The graph database model is implemented by a growing number of storage systems and is used both within enterprises and on the Web. For RDF Graph databases, as SPARQL is taken up by the community, there is a growing need for benchmarks that compare the performance of storage systems exposing SPARQL endpoints via the SPARQL protocol.

We have been working on enhancing the capabilities of some of the most popular benchmarks towards more comprehensive and mature sets of tests. We currently offer the following extended versions:

We are also planning to run benchmarks using the existing (and very comprehensive) LDBC Semantic Publishing Benchmark (LDBC-SPB) and the LDBC Financial Benchmark (LDBC FINBENCH).

Please check our website for more details on the benchmarks above and the other benchmarking services we provide that can help your organization.

Co-author: Jacobus Geluk — agnos.ai CEO.
