Inconsistent query result



Subject: Issue with Single-Field Queries After Upgrading to Dgraph 24.1.1

Hi,

I recently upgraded to Dgraph 24.1.1 and encountered a strange issue when querying a single field.

Data Structure:

Assume we have the following schema:

type User {
  uid: String
  Name: String
  Address: [Address]
}

type Address {
  uid: String
  City: String
  PostCode: String
}

Steps to Reproduce:

  1. Create a new User with an Address:

    {
      "uid": "_:user1",
      "Name": "John Doe",
      "Address": [
        {
          "uid": "_:addr1",
          "City": "New York",
          "PostCode": "10001"
        }
      ]
    }
    
  2. Update the User, but omit the uid for Address:

    {
      "uid": "0x123",  
      "Name": "John Doe",
      "Address": [
        {
          "City": "San Francisco",
          "PostCode": "94105"
        }
      ]
    }
    
  3. Query for User, requesting only the City from Address:

    {
      getUser(func: eq(Name, "John Doe")) {
        uid
        Name
        Address {
          City
        }
      }
    }
    

Expected Behavior:

  • The query should return the user with a single city value from the Address object.

Actual Behavior:

  • Most of the time, the response contains an array of cities, instead of a single value.
  • Occasionally, it returns only one city, as expected.
  • If I modify the query to include a second field (e.g., uid or PostCode), the response is always correct, returning a single City and PostCode.

Additional Observations:

  • I know that during the update, I am missing the uid for Address, which may cause unintended behavior.
  • However, this worked fine in Dgraph 24.0.
  • Data created using Dgraph 24.0 does not exhibit this issue.
  • Even data that was backed up from 24.0 and restored into a fresh Dgraph 24.1.1 instance works fine; only data newly created in 24.1.1 shows the problem.
  • The array always has exactly two elements, regardless of how many updates were made.

This also affects cases where you query a User and only request the Name.

Any insights on why this is happening and how to resolve it would be appreciated.

Thanks!

I made another strange observation.
It seems that the value that is sometimes returned on its own is the one from the last edit.

I have now run two queries, one for each of these two fields, as both have separate uids.
Both queries returned the value of the field, so the value from before the update and the value from after it both still exist.

I have now fixed the code to always include the uids on update.

I made one update and the problem is gone.
Now I receive only a single value every time.

Interestingly, though, the two queries for the individual fields now both return the same value,
even though the uids are different and this value was not changed.
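
A guard along the following lines could catch update payloads in which a nested object has no uid before they are sent. This is only a sketch, assuming the JSON mutation shape from steps 1 and 2 above; `paths_missing_uid` and the `$`-style paths are hypothetical names for illustration, not part of any Dgraph client.

```python
import json


def paths_missing_uid(node, path="$"):
    """Return the path of every JSON object under `node` that has no "uid" key."""
    missing = []
    if isinstance(node, dict):
        if "uid" not in node:
            missing.append(path)
        # Descend into every value; scalars are ignored by the branches below.
        for key, value in node.items():
            missing.extend(paths_missing_uid(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            missing.extend(paths_missing_uid(item, f"{path}[{index}]"))
    return missing


# The update payload from step 2: the nested Address object carries no uid.
update = json.loads("""
{
  "uid": "0x123",
  "Name": "John Doe",
  "Address": [
    {"City": "San Francisco", "PostCode": "94105"}
  ]
}
""")

print(paths_missing_uid(update))  # ['$.Address[0]']
```

An empty result means every object in the payload names the node it refers to, which is the state the code was in after the fix described above.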

I’ll try to take a deeper look at this later, but note that in Dgraph, uid is a reserved keyword. You can technically use it like you have above, but it will conflict with the uid mechanism in mutations and when using DQL.

Might I suggest

type User {
  id: ID
  Name: String
  Address: [Address]
}

I use the uid as a reference for updates.
That means I create new objects without the uid,
but when I want to update an object, the uid is present in it.

It is not part of the Dgraph schema, though.
Only the Go structs and the corresponding JSON have it.

I have the same problem in DQL.
Schema:

connection: uid .
name: string .

type node {
    name
    connection
}

The connection predicate of a single node is updated several times to different other nodes.

When querying

{
    q(func: uid(0x123)){
        uid
        connection {
            uid
        }
    }
}

I get this result, where 0x1 and 0x2 were previously set connections. This should not happen, because the predicate is not defined as a uid list in the schema.

 "q": [
      {
        "uid": "0x123",
        "connection": {
          "uid": [
            "0x1",
            "0x2"
          ]
        }
      }
    ]

But when querying

{
    q(func: uid(0x123)){
        uid
        connection {
            uid
            name
        }
    }
}

I get the correct result only containing the last set node.

 "q": [
      {
        "uid": "0x123",
        "connection": {
          "uid": "0x2",
          "name": "node2"
        }
      }
    ]

So it seems Dgraph somehow does not delete the old edge when updating a connection. This only happens with Dgraph 24.1.0 or 24.1.1, not with 24.0.0.
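
To check a larger result set for this symptom, one could scan the decoded JSON response for `uid` fields that come back as a list instead of a single string. A minimal sketch, assuming the response has already been parsed into Python dicts and lists; `list_valued_uids` and its `$`-style paths are made-up names for illustration.

```python
def list_valued_uids(node, path="$"):
    """Return (path, uids) for every object whose "uid" field is a list."""
    hits = []
    if isinstance(node, dict):
        uid = node.get("uid")
        if isinstance(uid, list):
            hits.append((path, uid))
        for key, value in node.items():
            hits.extend(list_valued_uids(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(list_valued_uids(item, f"{path}[{index}]"))
    return hits


# The two responses shown above, as Python literals.
broken = {"q": [{"uid": "0x123", "connection": {"uid": ["0x1", "0x2"]}}]}
healthy = {"q": [{"uid": "0x123", "connection": {"uid": "0x2", "name": "node2"}}]}

print(list_valued_uids(broken))   # [('$.q[0].connection', ['0x1', '0x2'])]
print(list_valued_uids(healthy))  # []
```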

There’s a bug that we recently fixed in v24.1.1 related to scalar predicates like the one you are using. Maybe there are more issues remaining. Can you give me exact instructions that I can use to reproduce the issue that you have?

Schema

connection: uid .
name: string .

type Node {
    name
    connection
}

Set data

{
  set {
    _:node1 <name> "node_1" .
    _:node1 <dgraph.type> "Node" .
    _:node2 <name> "node_2" .
    _:node2 <dgraph.type> "Node" .
    _:node3 <name> "node_3" .
    _:node3 <dgraph.type> "Node" .
  }
}

Set initial connection

{
  set {
    <0x1> <connection> <0x2> .
  }
}

Replace connection

{
  set {
    <0x1> <connection> <0x3> .
  }
}

Replace connection a second time

{
  set {
    <0x1> <connection> <0x2> .
  }
}

Using this query

{
  q(func: type(Node)) {
    uid
    name
    connection {
      uid     
    }
  }
}

I get an array for the connection of node with uid 0x1

  "data": {
    "q": [
      {
        "uid": "0x1",
        "name": "node_1",
        "connection": {
          "uid": [
            "0x2",
            "0x3"
          ]
        }
      },
      {
        "uid": "0x2",
        "name": "node_2"
      },
      {
        "uid": "0x3",
        "name": "node_3"
      }
    ]
  },

But with this query

{
  q(func: type(Node)) {
    uid
    name
    connection {
      name
      uid     
    }
  }
}

I get the correct result

  "data": {
    "q": [
      {
        "uid": "0x1",
        "name": "node_1",
        "connection": {
          "name": "node_3",
          "uid": "0x3"
        }
      },
      {
        "uid": "0x2",
        "name": "node_2"
      },
      {
        "uid": "0x3",
        "name": "node_3"
      }
    ]
  },

I can reproduce it this way most of the time. If the connection array does not appear, run the mutations that change the connection again.
I tested with Dgraph v24.1.2 and v24.1.0 (which both have this problem) and with Dgraph v24.0.5 (which does not have this problem).
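
Since the reproduction may need the replacement mutations to be repeated, the mutation bodies could be generated rather than typed by hand. The sketch below only builds the DQL strings in the same shape as the mutations above and does not talk to Dgraph; `replacement_mutations` is a hypothetical helper, and how the strings are submitted (Ratel, HTTP, or a client library) is left open.

```python
from itertools import cycle, islice

# Same layout as the hand-written set mutations in the steps above.
MUTATION_TEMPLATE = "{{\n  set {{\n    <{subject}> <{predicate}> <{target}> .\n  }}\n}}"


def replacement_mutations(subject, predicate, targets, count):
    """Build `count` set mutations that point `predicate` at each target in turn."""
    return [
        MUTATION_TEMPLATE.format(subject=subject, predicate=predicate, target=target)
        for target in islice(cycle(targets), count)
    ]


# Reproduces the sequence used above: 0x2, then 0x3, then 0x2 again.
mutations = replacement_mutations("0x1", "connection", ["0x2", "0x3"], 3)
for mutation in mutations:
    print(mutation)
```

Raising `count` gives more replacement rounds if the array does not show up after the first three.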

Thanks a lot @vnium for the issue report and @xqqp for a potential fix. The bug has been fixed and the fix is merged into main. I have backported it and we will release a new version soon. There’s one more bug I am working on (an OOM in shortest path); the moment that’s done, we will make a new patch release. If you could try out the release/v24.1 branch now, it would give us more confidence in the fix.


Thanks a lot for the quick fix! I tried the release branch and can confirm that the bug is fixed for me.


Thanks a lot,
I will wait until there is a release 🙂

Thanks a lot to you too @mko0815 for reporting the issue.
