I’m trying to find the shortest and longest path between two node types across the entire graph in Dgraph, similar to how it’s done using APOC in Neo4j.
In Neo4j, I can use a single query like this:
MATCH (start: `ntype1`)
CALL apoc.path.expandConfig(start, {
labelFilter: '>ntype2',
minLevel: 1,
uniqueness: 'NODE_GLOBAL'
})
YIELD path
RETURN min(length(path)) as min, max(length(path)) as max
This efficiently returns the shortest and longest path lengths between any nodes of type ntype1 to ntype2 in one go.
What I’m doing in Dgraph:
Since Dgraph doesn’t support such a direct query, I’m first fetching all UIDs of source and target node types, and then looping over combinations to run multiple shortest queries:
combined_query = f"""
{{
sources(func: type({ntype1})) {{ uid RELATED_TO{{uid}} ~RELATED_TO{{uid}} }}
targets(func: type({ntype2})) {{ uid RELATED_TO{{uid}} ~RELATED_TO{{uid}} }}
}}
"""
result = dgraphManager.query(combined_query)
source_uids = [x['uid'] for x in result['sources'] if 'RELATED_TO' in x or '~RELATED_TO' in x]
target_uids = [x['uid'] for x in result['targets'] if 'RELATED_TO' in x or '~RELATED_TO' in x]
uid_list = [{"from": s, "to": t} for s in source_uids for t in target_uids]
query_parts = []
result_parts = []
for i, item in enumerate(uid_list, 1):
query_parts.append(f"""
var(func: uid({item['from']})) {{
start{i} as uid
}}
var(func: uid({item['to']})) {{
end{i} as uid
}}
path{i} as shortest(from: uid(start{i}), to: uid(end{i})) {{ RELATED_TO ~RELATED_TO }}
""")
result_parts.append(f"result{i}(func: uid(path{i})) {{ uid }}")
final_query = "{\n" + "\n".join(query_parts) + "\n" + "\n".join(result_parts) + "\n}"
data = dgraphManager.query(final_query)
path_lengths = [len(val) for key, val in data.items() if val]
Problem:
This approach works, but it’s very inefficient when handling numerous nodes. For example, if ntype1 has 100 nodes and ntype2 has 50 nodes, it results in executing 5,000 queries — just to determine path lengths between two types. Sometimes, it causes errors related to query timeout.
Question:
Is there a more efficient way in Dgraph to compute:
- The shortest path between any node of type A to any node of type B
- The longest such path
Ideally similar to how APOC works in Neo4j, with just one query.
Any insights or optimizations would be highly appreciated!