Highlights
- JanusGraph is not a native graph database.
- JanusGraph is not self-contained and relies on third-party solutions such as different (mostly NoSQL) storage backends.
- If JanusGraph is used with Cassandra or HBase, it is a distributed database, but it does not have ACID transactions.
- If JanusGraph is used with BerkeleyDB, it has ACID transactions, but it is not distributed.
Native GraphQL support
- Dgraph: Yes. The only database in this comparison to natively support GraphQL, which lets it process GraphQL queries in parallel with high performance.
- JanusGraph: No. JanusGraph's query language is Gremlin. Reference
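To make the native GraphQL point concrete, here is a sketch of a query as Dgraph would serve it. The `Author`/`Post` types and the `queryAuthor` field are assumptions for illustration (Dgraph auto-generates a `query<Type>` field for each type in the deployed schema):

```graphql
# Hypothetical schema deployed to Dgraph:
#   type Author { name: String! @search(by: [hash])  posts: [Post] }
#   type Post   { title: String! }
query {
  queryAuthor(filter: { name: { eq: "Jane" } }) {
    name
    posts {
      title
    }
  }
}
```

Because the query is GraphQL end to end, no translation layer into another graph language is needed before execution.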
Distributed Graph database
- Dgraph : Distributed with the ability to use the same query everywhere as if querying a single database
- JanusGraph: JanusGraph is distributed only with Apache Cassandra and Apache HBase; BerkeleyDB JE is a non-distributed database. HBase gives preference to consistency at the expense of yield, and Cassandra gives preference to availability at the expense of harvest. Reference
Distributed ACID Transactions
- Dgraph:
- Supported and Jepsen-tested
- Synchronous replication with immediate consistency, meaning any client can read the latest write.
- Open Source
- Reference
- JanusGraph: JanusGraph transactions are not necessarily ACID. They can be so configured on BerkeleyDB, but they are not generally so on Cassandra or HBase, where the underlying storage system does not provide serializable isolation or multi-row atomic writes and the cost of simulating those properties would be substantial. Reference
Sharding
- Dgraph:
- Predicate-based sharding. Avoids the N+1 problem and network broadcasts when running a query in high-fanout scenarios. This ensures low-latency query execution, irrespective of the size of the cluster or the number of intermediate results. Reference
- Consistent production-level latencies and consistent queries. Reference
- Automatic sharding
- Sharding a single predicate on the roadmap
- JanusGraph:
- When JanusGraph is deployed on a cluster of multiple storage backend instances, the graph is partitioned across those machines. By default, JanusGraph uses a random partitioning strategy that randomly assigns vertices to machines.
- When the graph is small or accommodated by a few storage instances, it is best to use random partitioning for its simplicity. As a rule of thumb, one should strongly consider enabling explicit graph partitioning and configure a suitable partitioning heuristic when the graph grows into the 10s of billions of edges.
- Reference
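For orientation, explicit partitioning in JanusGraph is switched on through the graph configuration rather than at query time. A minimal sketch of a `janusgraph.properties` fragment, assuming the CQL (Cassandra) backend; the option names follow the JanusGraph configuration reference as best recalled, and the values are placeholders that must be sized for the target cluster, so verify both against the docs for your version:

```properties
# Storage backend that supports explicit graph partitioning
storage.backend=cql
storage.hostname=127.0.0.1

# Number of virtual partition blocks the graph is split into
# (placeholder value; tune to the cluster size)
cluster.max-partitions=32

# ID placement strategy used to assign vertices to partitions
ids.placement=simple
```

With the default random strategy none of this is needed, which is why the docs recommend it for smaller graphs.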
Consistent Replication
- Dgraph : Synchronous replication across all replicas
- JanusGraph:
- Only HBase has native support for strong consistency at the row level (Reference). Even so, the JanusGraph documentation explains the use of locks for data consistency on HBase here.
- Cassandra has specific configurations for replication. In general, higher consistency levels are more consistent and robust but have higher latency. Reference
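To make the Cassandra side concrete, replication is configured per keyspace. A sketch in CQL, assuming a keyspace named `janusgraph` and a single data center named `dc1` (both names are placeholders):

```sql
-- Keep three replicas of every row in data center dc1.
-- Higher replication factors are more robust but add write latency.
CREATE KEYSPACE janusgraph
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};
```

The consistency level a client then requests per operation (e.g. ONE vs. QUORUM) determines how many of those replicas must acknowledge a read or write.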
Linearizable Reads
- Dgraph : Strong (sequential) consistency across clients Reference
- JanusGraph:
- Apache Cassandra and Apache HBase are both eventually consistent storage backends, which means JanusGraph must obtain locks in order to ensure consistency. Because of the additional steps required to acquire a lock when committing a modifying transaction, locking is a fairly expensive way to ensure consistency and can lead to deadlock when many concurrent transactions try to modify the same elements in the graph. Reference
- JanusGraph first persists all graph mutations to the storage backend. If the primary persistence into the storage backend succeeds but secondary persistence into the indexing backends or the logging system fails, the transaction is still considered successful because the storage backend is the authoritative source of the graph. This can create inconsistencies with the indexes and logs. To automatically repair such inconsistencies, JanusGraph can maintain a transaction write-ahead log, which is enabled through the configuration. Reference
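The transaction write-ahead log mentioned above is a configuration switch rather than a default. A minimal sketch of the relevant `janusgraph.properties` fragment; the option name follows the JanusGraph configuration reference as best recalled, so verify it for your version:

```properties
# Record mutations in a write-ahead transaction log so that
# index/log inconsistencies can be repaired after a failure
tx.log-tx = true
```

A separate recovery process then replays the log to repair any secondary persistence that failed.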
Correctness and durability testing
- Dgraph: Jepsen-tested
- JanusGraph: It is not Jepsen-tested.
High availability
- Dgraph:
- Yes, HA Cluster Setup is explained here
- HA Cluster setup is available in Community Edition.
- JanusGraph:
- High availability depends on the backend configuration. Both HBase and Cassandra can be highly available.
- If an instance fails, i.e. is not properly shut down, JanusGraph still considers it to be active and expects its participation in cluster-wide operations, which subsequently fail because this instance did not participate in or acknowledge the operation. In this case, the user must manually remove the failed instance record from the cluster and then retry the operation. Reference
Transparent data encryption
- Dgraph: Yes, database files are encrypted at rest with a user-specified key
- JanusGraph: This depends on the backend storage system. HBase and Oracle Berkeley DB have encryption-at-rest options, although it is not documented how they can be used with JanusGraph. References for HBase and for Berkeley DB.
Query languages
- Dgraph:
- GraphQL
- GraphQL± (a variation of GraphQL that supports advanced features)
- JanusGraph: Gremlin Query Language
Management of runaway queries
- Dgraph:
- Context cancellation that works across clients and servers: a cancellation at the client level automatically cancels the query on all involved servers.
- OpenCensus integration, which allows distributed tracing all the way from the app to the Dgraph cluster and back.
- Open standards for query context cancellation and tracking
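The client-side deadline idea can be sketched generically in Python. This is not Dgraph client code, just an illustration of a caller abandoning a slow query once its deadline expires; in a real Dgraph client the deadline is attached to the gRPC context, so the servers cancel their work too:

```python
import concurrent.futures
import time

def slow_query():
    """Stand-in for a long-running query executing on remote servers."""
    time.sleep(1.0)
    return "rows"

cancelled = False
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_query)
    try:
        # Client-side deadline: give up after 0.2 s. With a real
        # context/deadline, cancellation would propagate server-side.
        future.result(timeout=0.2)
    except concurrent.futures.TimeoutError:
        cancelled = True

print("query cancelled:", cancelled)  # → query cancelled: True
```

Without server-side propagation (as in this local sketch), the abandoned work keeps running; context cancellation's value is precisely that the servers stop too.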
- JanusGraph:
- There is nothing in Gremlin Server that will list running queries. As for cancellation, according to standard TinkerPop semantics a Traversal should respect a request for interruption on a thread. These semantics are enforced by the TinkerPop process test suite. That said, it is still up to the graph provider to properly allow for that behavior. Reference
Backups
- Dgraph:
- Binary format
- Both full and incremental backups to local files, S3, and Google Cloud Storage (via MinIO)
- Live backups with no downtime
- Reference
- JanusGraph:
- JanusGraph acts as an abstraction layer on top of the storage backends and defers to them for administrative best practices. As a result, there is a lack of centralized documentation on backend administrative tasks. Reference
- Cassandra offers snapshot, incremental, and commit-log backups. Reference
- HBase backup offerings are summarized here.
Pricing and Free trial
- Dgraph:
- The open source version is under Apache 2.0, so it is free to use and modify.
- Enterprise pricing is based on the number of Dgraph instances, not on cores, RAM, disk, etc.
- JanusGraph:
- Open Source under the Apache 2 license
Appropriate as primary database to build apps/data platform on
- Dgraph : Dgraph is a general-purpose database with a graph backend.
- JanusGraph: The use case is determined by the storage backend; JanusGraph is a graph engine, not a graph database.
Open source
- Dgraph:
- Yes, Apache 2.0. GitHub
- Enterprise features are NOT Apache 2.0, but users can still read the source.
- The Dgraph open source version and enterprise version provide the same performance; they differ only in that the enterprise version has more features.
- Dgraph supports many open standards, like gRPC, Protocol Buffers, Go contexts, and OpenCensus integration for distributed tracing.
- JanusGraph : Open Source under the Apache 2 license
Protocols
- Dgraph:
- HTTP/HTTPS
- gRPC
- Protocol Buffers
- JanusGraph:
- HTTP/HTTPS
- WebSockets
- Reference
Point in time recovery
- Dgraph: On the roadmap
- JanusGraph: JanusGraph does not provide point in time recovery. It can be configured to keep a write-ahead log. Reference
Multi-region deployments
- Dgraph : Yes
- JanusGraph: Depends on the storage system used. Cassandra has multi-region deployment. Reference
SQL migration tool
- Dgraph : Yes
- JanusGraph: No. There are some suggestions on resources for doing this with Cassandra here.
Authentication and authorization
- Dgraph :
- JanusGraph: HTTP Basic authentication and authentication over websocket. Reference
Drivers
- Dgraph:
- Dgraph's drivers use gRPC, not REST.
- Any GraphQL-compatible client can be used.
- Dgraph's supported drivers cover the same languages as Neo4j's supported drivers: Java, JavaScript, Go, Python, and .NET.
- Dgraph's unofficial drivers are: Rust, Dart, Elixir.
- Reference
- JanusGraph:
- A list of TinkerPop drivers is available on TinkerPop's homepage.
- In addition to drivers, there are query languages for TinkerPop that make it easier to use Gremlin from different programming languages like Java, Python, or C#.
Multi-database features
- Dgraph: Multi-Tenancy on the roadmap
- JanusGraph: Edge Label Multiplicity
Graph Database As A Service (DBaaS)
- Dgraph: Hosted solution launching in mid-year 2020
- JanusGraph: No
Query execution plans
- Dgraph: Query planning on the roadmap
- JanusGraph: With JanusGraphManager, you can define a property in your configuration that defines how to access a graph.
Support for graph algorithms
- Dgraph:
- Shortest k-paths
- Edge traversal limit to determine cycles in graphs
- Others requested from community listed here
- JanusGraph: The JanusGraph documentation does not cover graph algorithms, but one could follow this Gremlin recipe for shortest path, for example.
Apache Spark integration
- Dgraph: No
- JanusGraph: Users can leverage Apache Hadoop and Apache Spark to configure JanusGraph for distributed graph processing. Reference
Kafka integration
- Dgraph: On the roadmap
- JanusGraph: There are no official plugins, but there are some integrations done by the community. Here is an example with HBase.
Import/export
- Dgraph:
- Using the Bulk Loader or Live Loader, Dgraph can read the data as-is with no modification needed.
- Supported data formats are JSON and RDF.
- Exporting the database is explained here.
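A small sketch of what the RDF input looks like: the helper below renders hypothetical records as N-Quad lines of the kind Dgraph's live/bulk loaders accept (the `name` predicate and the records are made up for the example):

```python
def to_nquads(rows):
    """Render (blank-node id, name) pairs as RDF N-Quad lines."""
    return "\n".join(f'_:{uid} <name> "{name}" .' for uid, name in rows)

# Hypothetical records to load
people = [("alice", "Alice"), ("bob", "Bob")]
rdf = to_nquads(people)
print(rdf)
# → _:alice <name> "Alice" .
#   _:bob <name> "Bob" .
```

The blank-node labels (`_:alice`) let the loader assign real UIDs while keeping references between lines consistent.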
- JanusGraph:
- Export: GraphML or GraphSON, Gremlin I/O library
- Import: Bulk loading, Gremlin I/O library