Help with designing a schema

Hi,
I’m new to dgraph. How could I design a schema for money flow between users?
Every user has an ID and could send any amount of currency to any other user.
I have transactions like this: User 1 sends 12 amount of currency to User 2 etc.

I want to build a graph like this:

How should schema look like for this? (edges with values represent the transactions between users)

Hi @Kuririn,
Without knowing any other requirements you might have, I would suggest putting a node in the middle between users, signifying a transaction type

<name> <string> @index(hash) .
<user_id> <string> @index(hash) .
<txn_id> <string> @index(hash) .
<txn_time> <datetime> .
<amount> <float> .
<recipient> uid @reverse .
<sender> uid @reverse.

type User {
  user_id
  name
}
type Transaction {
  txn_id
  txn_time
  amount
  sender
  recipient  
}

Where queries could be like

{
  txnsByUserName(func: eq(name,"John Smith")) @filter( type(User) ) {
    user_id
    ~sender {
      id
      amount
      txn_time
    }
  }
  usersByTxn(func: eq(txn_id,'42342abec')) @filter(type(Transaction)) {
    sender @filter(type(User)) {
      user_id
      name
    }
    recipient @filter(type(User)) {
      user_id
      name
    }
  }
}

(the extra type(X) filters are just for extra protection, not really required there)

Good luck!

1 Like

Thanks! I will consider it. What I want to achieve is to see all the inflows (incoming transactions) to specific users to the N degree (N hops on the graph).

For example graph that I showed would be displayed for User 4.

yea to do things like ‘show me people who have given money to anyone who has given money to me’ you could query as something like:

{
  q(func: eq(name,"John Smith")) @recurse {
    ~recipient
    sender
    txn_id
    user_id
  }
}

which would follow all reverse recipient edges, all sender edges, and collect txn_id and user_id along the way.

Another cool thing you can do here is ask “has person A ever been financially connected to person B though any possible path?” using a shortest_path query.

2 Likes

Awesome! It’s what I need. I will dig deeper into it. Thanks!

@iluminae you was so helpful. Do you know if I can add own uid? Because when I’m adding new transactions I must first check uid of user and it’s additional query (I must specify uid for sender, recipient fields).

im not sure what you question is asking exactly, so correct me if my assumptions are wrong here:

  • If you want to update based on user’s ids:
    • you can use upserts to look up ids to link to. (see here). Here is a complex version with alot of sanity checking that may not be required in your case:
upsert {
  query {
    userA as var(func: eq(user_id,"7tsuynaebvj"))
    userB as var(func: eq(user_id,"eoineovvhe"))
    transaction as var(func: eq(txn_id,"teohdht")) @filter(uid_in(sender,userA) AND uid_in(recipient,userB)) # for sanity check, find a already existing trasaction that exactly matches what you are about to insert
  }
  mutation @if(eq(len(transaction),0)) { #only perform this mutation if no existing transaction exists (extra, not really required for this use case)
    set {
      # note that if any of these variables are empty (above query did not find anything) it will assign a new uid
      uid(transaction) <txn_id> "teohdht" .
      uid(transaction) <sender> uid(userA) .
      uid(transaction) <recipient> uid(userB) .
      uid(transaction) <amount> "3456.345" . 
    }
  }
  # set user information. If you know absolutely these must exist already, both of these mutations are probably unnecessary.
  mutation @if(eq(len(userA),0)) { #fill in userA's id if this node is new
    set {
      uid(userB) <user_id> "7tsuynaebvj" .
      #maybe name and info too
    }
  }
  mutation @if(eq(len(userB),0)) { #fill in userB's id if this node is new
    set {
      uid(userB) <user_id> "eoineovvhe" .
      #maybe name and info too
    }
  }
}

(note: i free form wrote this and did not run it)

  • If you want to literally put in your own uids (literally the uid int field) and control them on your side, external to the database:
    • I highly discourage this, as you will also have to control leasing these uids from the Zero pods, and there have been bugs about leasing all available UIDs at once. Use external ids (which is any identifying field you want that is 1:1 with a node), and upserts to look up real uids from them.
1 Like

Thanks!!! You saved me a lot of time. :slight_smile:

I have a big problem with this query right now:

It worked on the small scale, but when I uploaded 2 million nodes it’s not working. I’m getting this error:

context deadline exceeded

I thought that the graph database is designed for queries like this. I’m using dgraph shared in the cloud. Do I need a dedicated server for this number of nodes?
I’m afraid that with more nodes even on a dedicated server it will work very slow.

Nevermind! I forgot about depth parameter!

iluminae you are a genius and maybe you could help me.

Why this query doesn’t show any connections (it only return main node)? I know that this user ids are connected with transactions. (it shows when I put these filters on sender and recipient fields, but I want to filter this on transaction level)

{
   u1 as var(func: eq(user_id, "1"))
    u2 as var(func: eq(user_id, "2"))
    u3 as var(func: eq(user_id, "3"))
  
    q(func: eq(user_id, "1")) @recurse (depth: 5) {
    	~recipient @filter(uid(u1) OR uid(u2) OR uid(u3))
    	~sender @filter(uid(u1) OR uid(u2) OR uid(u3))
    	recipient
    	sender
    	txn_id
    	user_id
    	username
    	amount
  }
}

Sorry, not enough information as is - can you include minimal example RDF to go along with that query to explain what you have?

It is almost the same schema that you presented in this topic.

Something like this:

<username> <string> @index(hash) .
<user_id> <string> @index(hash) .
<txn_id> <string> @index(hash) .
<txn_time> <datetime> .
<amount> <float> .
<recipient> uid @reverse .
<sender> uid @reverse.

type User {
  user_id
  username
}
type Transaction {
  txn_id
  txn_time
  amount
  sender
  recipient  
}

I mean the rdf data lines - what is connected to what, etc - so I can load it into a dgraph and reproduce your query and it’s result