Sorry for the delay. I extended a notebook in our dgraph-experimental repo that explores this in detail. Here’s the link to the notebook. If you have Docker and Jupyter Notebooks installed, you can run this locally. If not, I’ve included the output of the notebook in the commit, so you can see the output just by loading the file in your browser.
As I stated earlier, the sole native path function in Dgraph (shortest) relies on knowing the UIDs from two separate nodes. So natively running queries across paths in the entire graph is not possible. However, using pydgraph and networkx you can perform all sorts of graph analysis, not just path calculations. But since the topic is about shortest and longest shortest, here’s a brief overview. I definitely would recommend looking at the notebook linked above tho, as this is just a quick summary… Note that this notebook uses our stock Friend of a Friend graph, in which there are Person nodes, connected to other Person nodes via a Person.friends predicate.
- Load nodes and edges from Dgraph, convert and import to networkx
import json, pydgraph, networkx as nx
# Query Dgraph for Person nodes and friendships
client = pydgraph.DgraphClient(pydgraph.DgraphClientStub('localhost:9080'))
query = """
{
person(func: type(Person)) @recurse {
id: uid
name: Person.name
friends: Person.friends
}
}
"""
txn = client.txn(read_only=True)
res = txn.query(query=query)
nodes = {}
edges = []
# Here we use the pydgraph `convert function to extract nodes and edges from the DQL result
convert.extract_dict(nodes=nodes, edges=edges, data=json.loads(res.json))
nodes_df = pd.DataFrame.from_dict(nodes, orient = 'index')
edges_df = pd.DataFrame(edges)
G = nx.from_pandas_edgelist(
edges_df,
source="src",
target="dst",
create_using=nx.MultiDiGraph()
)
- Find longest shortest path
Since friendship graphs are cyclic by nature, we can’t use nx.dag_longest_path(), which is only for acyclic graphs. Instead, we use the graph diameter - the longest shortest path between any two nodes, representing maximum degrees of separation:
# Find longest shortest path (diameter cmp approach)
longest_path = []
max_length = 0
for source, targets in nx.shortest_path(G).items():
for target, path in targets.items():
if len(path) > max_length:
max_length = len(path)
longest_path = path
# Display with readable names
path_names = [node_id_to_name.get(node_id, node_id) for node_id in longest_path]
print(f"Diameter: {max_length-1} edges")
print(f"Path: {' -> '.join(path_names)}")