Why are queries running so slowly?

youyin123 · September 16, 2020, 3:06pm

I have a query requirement. I wrote a query, and its results are correct on a small dataset.
However, when I run it on a large dataset, I get a prompt saying it exceeded 20s.
Why does this query run for so long? Is it because my query is poorly written?

youyin123 · September 16, 2020, 3:06pm

I need to know:

Given a Message, retrieve the (1-hop) Comments that reply to it.
In addition, return a boolean flag, knows, indicating whether the author of the reply knows the author of the original message. If the reply's author is the same as the original author, return false for the knows flag.

The relationships look like this:

youyin123 · September 16, 2020, 3:07pm

my code:

{
  # Find the Tags used in the given month
  var(func: has(creationDate)) @filter((type(Comment) or type(Post)) and le(creationDate, "2011-09-30") and ge(creationDate, "2011-09-01")) {
    hasTag {
      tag_uid as uid
    }
  }
  # First, the count for the first month
  var(func: uid(tag_uid)) @filter(type(Tag)) {
    count_thismonth as count(~hasTag) @filter((type(Comment) or type(Post)) and le(creationDate, "2011-09-30") and ge(creationDate, "2011-09-01"))
  }
  # Next, the count for the second month
  var(func: uid(tag_uid)) @filter(type(Tag)) {
    count_nextmonth as count(~hasTag) @filter((type(Comment) or type(Post)) and le(creationDate, "2011-10-31") and ge(creationDate, "2011-10-01"))
  }
  # Compute diff, the absolute difference between the two monthly counts
  var(func: uid(count_thismonth)) {
    diff as math(max(count_thismonth-count_nextmonth,0)+max(count_nextmonth-count_thismonth,0))
  }

  # Combined output
  q(func: uid(diff), orderdesc: val(diff), orderasc: name) {
    name
    count_thismonth1: val(count_thismonth)
    count_nextmonth1: val(count_nextmonth)
    diff1: val(diff)
  }
}

youyin123 · September 16, 2020, 3:11pm

How can I improve the efficiency of this query?
Can anyone give me some advice?

MichelDiz · September 16, 2020, 10:48pm

Avoid using the has function at the root, especially when you have tons of data. You can still use it in filters, though.
The best way to gain performance here is indexing, of any kind.

Also, I personally would recommend that you segment your types, using a pattern like “namespacing”.

For example, take the predicate “name”. You can have this very same predicate shared by several entities, which isn’t ideal. So I recommend something like:

user.name: string .
product.name: string .
animal.name: string .
object.name: string .
...

So on and so forth.

You should write this block like this:

A0 as var(func: type(Post)) @filter(le(creationDate, "2011-09-30") AND ge(creationDate, "2011-09-01"))
A1 as var(func: type(Comment)) @filter(le(creationDate, "2011-09-30") AND ge(creationDate, "2011-09-01"))

var(func: uid(A0, A1)) {
  hasTag {
    tag_uid as uid
  }
}

That way you should get better performance.

youyin123 · September 17, 2020, 1:11pm

@MichelDiz
thanks!

In fact, I used “has” only because I didn’t know I could use “type” directly; I have now removed “has”.
But it still took a long time. It then occurred to me that it might be an index problem, so I added an index on “creationDate”, and the query returned a result in 6 seconds (that is still too long, but it will do for now).
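For anyone following along, an index like the one described above could be declared roughly as follows. This is only a sketch: it assumes creationDate is stored as dateTime, and the day granularity is a guess, since the thread does not say which tokenizer was actually used. The hasTag line is also an assumption about the existing schema; it is shown because reverse traversal with ~hasTag needs the @reverse directive on that predicate.

```
# Assumed schema fragment, not the poster's actual schema.
# dateTime indexes accept year, month, day, or hour granularity.
creationDate: dateTime @index(day) .

# Required for the ~hasTag reverse edge used in the query.
hasTag: [uid] @reverse .
```

With the index in place, the le/ge range filters on creationDate can be answered from the index. A coarser granularity such as month may be worth trying for month-wide ranges like these.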

MichelDiz · September 17, 2020, 3:01pm

BTW, I made a small “upgrade” to the query. You don’t need has() at all. Since the filters already use the same predicate, there’s no need to check for its existence.
