Thanks @matthewmcneely, I watched the video. As always with your productions, it’s good.
I don’t know if you noticed the compute embedding script that I published in the pydgraph repository as an example.
The script uses DQL-level mutations. It relies on a small configuration file where you define the types that need an embedding and a DQL query that captures the fields used to build the text to embed. It also uses a template string, so you can compose the embedding text as you want from multiple fields. (I'm using pybar, which follows the mustache notation for templates.)
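To make the template idea concrete, here is a minimal sketch in plain Python. It is not the script's actual code and it does not use pybar: the `render` helper is a hypothetical regex-based stand-in for mustache-style substitution, and the configuration keys (`query`, `template`) and predicate names are assumptions chosen for illustration.

```python
import re

# Hypothetical configuration: one entry per node type. The key names
# ("query", "template") and the predicates are illustrative only.
CONFIG = {
    "Product": {
        "query": "{ name: Product.name  category: Product.category }",
        "template": "{{name}} belongs to the category {{category}}.",
    }
}

# Matches mustache-style placeholders such as {{name}} or {{ name }}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, fields: dict) -> str:
    """Replace each {{key}} with the matching field value (empty if missing)."""
    return _PLACEHOLDER.sub(lambda m: str(fields.get(m.group(1), "")), template)


# A node as it might come back from the query, keyed by the aliases above.
node = {"name": "Espresso machine", "category": "kitchen appliances"}
text = render(CONFIG["Product"]["template"], node)
print(text)
```

The rendered string is what would be sent to the embedding model, so changing the template changes what the vector represents without touching any code.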
The script loops in batches over all nodes without embeddings and updates them.
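The batch loop can be sketched roughly as follows. This is an assumption about the general shape, not the script itself: `embed_missing`, `fetch_batch`, `fake_embed`, and `save` are hypothetical names, and an in-memory dictionary stands in for Dgraph so the example runs without a server.

```python
from typing import Callable, Dict, List

Node = Dict[str, str]


def embed_missing(
    fetch_batch: Callable[[int], List[Node]],
    embed: Callable[[str], List[float]],
    save: Callable[[str, List[float]], None],
    batch_size: int = 100,
) -> int:
    """Process nodes without an embedding, batch by batch, until none remain."""
    total = 0
    while True:
        batch = fetch_batch(batch_size)
        if not batch:
            break
        for node in batch:
            save(node["uid"], embed(node["text"]))
        total += len(batch)
    return total


# In-memory stand-ins for the database and the embedding model.
store = {f"0x{i:x}": {"text": f"node {i}", "embedding": None} for i in range(1, 8)}


def fetch_batch(n: int) -> List[Node]:
    # Return up to n nodes that still lack an embedding.
    pending = [(uid, rec) for uid, rec in store.items() if rec["embedding"] is None]
    return [{"uid": uid, "text": rec["text"]} for uid, rec in pending[:n]]


def fake_embed(text: str) -> List[float]:
    # Placeholder vector; a real model would be called here.
    return [float(len(text)), 0.0]


def save(uid: str, vector: List[float]) -> None:
    store[uid]["embedding"] = vector


processed = embed_missing(fetch_batch, fake_embed, save, batch_size=3)
print(processed)
```

Because each pass only fetches nodes that still lack an embedding, the loop can be stopped and restarted without redoing work. It does assume every save succeeds; a node that repeatedly fails to save would be fetched again on each pass.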
You can select OpenAI, Mistral, or a Hugging Face model to generate the embeddings.
It is pretty generic, with no size limitation, and it has helped me a lot in my projects.
PRs to improve the script are welcome!