AUGUST 25 2023
Dgraph with Native Vector Support - The Best of Both Worlds
Discover how Dgraph’s new vector support enables intelligent and adaptive search by combining graph and vector capabilities in a single platform.

With Dgraph's native vector support, you can not only find the needle in the haystack but also understand its context, its value, and its connections—all within a unified platform!
Introduction
Dgraph integrates vector support directly into its graph database, enabling a seamless combination of structured, interconnected data with the semantic power of vector embeddings. This innovation allows you to build intelligent and adaptive search solutions without relying on external vector databases.
The Goal
Fuzzy matching and semantic search are essential for modern applications, but traditional databases often struggle to combine flexibility with structured relationships. Dgraph’s unique architecture makes this challenge easy to address.
Dgraph models your data as a network of interconnected information — a graph. With native vector support, it enables embedding-based operations directly on the graph. Vectors, which represent the semantic meaning of text or data, allow you to:
- Perform similarity searches.
- Create intelligent associations.
- Enhance search and discovery.
This new capability transforms your knowledge graph into an Association Graph, combining structured knowledge with inferred, context-rich relationships.
Sample Use Case
Previously, we explored how to use word embeddings in Dgraph to implement automatic classification (see blog post). In this use case, projects are associated with categories using word embeddings.
We will leverage the vector embeddings associated with projects to expose a natural language search API. Dgraph’s native vector indexing enables you to store embeddings, compute distances between them, and retrieve context-rich associations, all natively within Dgraph.
Native Vector Support in Dgraph
Dgraph’s vector functionality eliminates the need for external vector databases. Here’s how it works:
- Storing Vectors: Nodes in Dgraph can directly store vectors, which are numerical representations of text or other data.
- Similarity Search: Perform semantic similarity searches using vector distance computations (e.g., cosine similarity or dot product).
- Integrated Metadata: Dgraph can store additional metadata, such as similarity scores or the models used for embedding, ensuring full data lineage.
- Unified API: Query structured and semantic data together via DQL or GraphQL, simplifying development and integration.
This unified approach lets you mix reliable, curated information (your knowledge graph) with AI-inferred relationships (associations) without relying on external services.
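To make the distance computations mentioned above concrete, here is a minimal TypeScript sketch of the two measures named in the list, dot product and cosine similarity. The function names are illustrative and are not part of any Dgraph API; Dgraph computes these for you inside the database.

```typescript
// Dot product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Euclidean (L2) norm of a vector.
function norm(a: number[]): number {
  return Math.sqrt(dot(a, a));
}

// Cosine similarity: 1 for the same direction, 0 for orthogonal vectors,
// -1 for opposite directions. Undefined for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  const denom = norm(a) * norm(b);
  if (denom === 0) {
    throw new Error("cosine similarity is undefined for a zero vector");
  }
  return dot(a, b) / denom;
}
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common choice for text embeddings.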
Hands-On with Dgraph
Data Model
Here’s an example of a GraphQL schema that incorporates Dgraph’s vector embedding capabilities:
type Project {
  id: ID!
  title: String! @search(by: [term])
  grade: String @search(by: [hash])
  category: Category
  score: Float
  embedding: [Float!] @embedding @search(by: ["hnsw"])
}
type Category {
  id: ID!
  name: String!
  embedding: [Float!] @embedding @search(by: ["hnsw"])
}
The embedding field stores vector embeddings for projects and categories.
Embedding Generation
You can use any model to compute embeddings. The sample code uses Hugging Face sentence-transformers/all-MiniLM-L6-v2. You can just as easily use OpenAI text-embedding-3-small or any other embedding model. Whichever model you choose, the process is the same:
- Compute the embedding for a project title or category name.
- Save the embedding as a vector field in the respective node.
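The two steps above can be sketched in plain TypeScript. The embedder here, toyEmbed, is a hypothetical stand-in that hashes words into a small fixed-size vector. It carries no semantic meaning and exists only to keep the sketch runnable; in practice you would plug in a real model behind the same function shape.

```typescript
// Any function that turns a batch of texts into one vector per text.
type Embedder = (texts: string[]) => number[][];

interface ProjectInput {
  title: string;
}

interface EmbeddedProject {
  title: string;
  embedding: number[];
}

// Hypothetical stand-in for a real embedding model: hashes each word into
// one of `dims` buckets, then scales the vector to unit length.
function toyEmbed(texts: string[], dims: number = 8): number[][] {
  return texts.map((text) => {
    const vec = new Array<number>(dims).fill(0);
    const words = text.toLowerCase().split(/\W+/).filter((w) => w.length > 0);
    for (const word of words) {
      let h = 0;
      for (let i = 0; i < word.length; i++) {
        h = (h * 31 + word.charCodeAt(i)) % dims;
      }
      vec[h] += 1;
    }
    const len = Math.sqrt(vec.reduce((s, x) => s + x * x, 0));
    return len === 0 ? vec : vec.map((x) => x / len);
  });
}

// Step 1: compute one embedding per title. Step 2: attach it to the record.
function withEmbeddings(projects: ProjectInput[], embed: Embedder): EmbeddedProject[] {
  const vectors = embed(projects.map((p) => p.title));
  return projects.map((p, i) => ({ title: p.title, embedding: vectors[i] }));
}
```

Keeping the embedder behind a function type means the rest of the pipeline does not change when you switch models, as long as every stored vector comes from the same model and has the same dimension.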
Semantic Search Logic
Dgraph’s native similarity search greatly simplifies the semantic search logic:
- Compute an embedding for the search text.
- Use GraphQL or Dgraph Query Language to find similar vectors in the graph.
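Conceptually, the second step is a nearest-neighbor lookup. The sketch below does it by brute force in memory, scoring every stored vector against the query vector and keeping the best matches. It is an illustration only: the names are hypothetical, and an HNSW index such as the one declared in the schema is an approximate method built to avoid scanning every vector.

```typescript
interface Item {
  id: string;
  embedding: number[];
}

interface Hit {
  id: string;
  score: number;
}

// Cosine similarity between two equal-length, non-zero vectors.
function cosine(a: number[], b: number[]): number {
  let ab = 0;
  let aa = 0;
  let bb = 0;
  for (let i = 0; i < a.length; i++) {
    ab += a[i] * b[i];
    aa += a[i] * a[i];
    bb += b[i] * b[i];
  }
  return ab / (Math.sqrt(aa) * Math.sqrt(bb));
}

// Score every item, drop weak matches, and return the k best, highest first.
function topK(query: number[], items: Item[], k: number, threshold: number): Hit[] {
  return items
    .map((item) => ({ id: item.id, score: cosine(query, item.embedding) }))
    .filter((hit) => hit.score > threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The cost of this scan grows linearly with the number of stored vectors, which is the main reason to let an index inside the database handle it.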
Modus
The easiest way to add custom logic for embedding-based operations is to front Dgraph with Modus. Modus is an open-source, serverless framework designed for building intelligent APIs and functions. Here is an example of an API created in Modus:
import { JSON } from "json-as";
import { embedText } from "./embeddings";
import { dgraph } from "@hypermode/modus-sdk-as";
import { searchBySimilarity } from "./dgraph-utils";

// Connection name passed to dgraph.execute
const DGRAPH_CONNECTION = "dgraph";

@json
class Project {
  @alias("dgraph.type")
  type: string | null = "Project";
  @alias("uid") @omitnull()
  id: string | null = null;
  @alias("Project.title")
  title!: string;
  @alias("Project.category") @omitnull()
  category: Category | null = null;
  // Holds the vector as a JSON-encoded string (see addProject)
  @alias("Project.embedding") @omitnull()
  embedding: string | null = null;
}

@json
class Category {
  @alias("dgraph.type")
  type: string | null = "Category";
  @alias("uid") @omitnull()
  id: string | null = null;
  @alias("Category.name")
  name!: string;
  @alias("Category.embedding") @omitnull()
  embedding: f32[] | null = null;
}

export function addProject(input: Project[]): Map<string, string> | null {
  // Add dgraph.type and embedding to each project
  for (let i = 0; i < input.length; i++) {
    const project = input[i];
    project.type = "Project";
    project.embedding = JSON.stringify(embedText([project.title])[0]);
  }
  // Send all projects to Dgraph in a single JSON mutation
  const payload = JSON.stringify(input);
  const mutations: dgraph.Mutation[] = [new dgraph.Mutation(payload)];
  const res = dgraph.execute(DGRAPH_CONNECTION, new dgraph.Request(null, mutations));
  // Map of the uids assigned to the newly created nodes
  return res.Uids;
}
The addProject function:
- Computes an embedding for each Project to create.
- Performs a mutation in Dgraph to store the Project data with the embedding.
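It can help to see the shape of the JSON that such a mutation carries. The plain TypeScript sketch below builds the payload by hand, assuming the @alias decorators rename each field to its Dgraph predicate and that @omitnull drops null fields. toDgraphProject and buildPayload are hypothetical helpers for illustration, not part of Modus or Dgraph.

```typescript
// Hypothetical helper: the JSON object one Project is assumed to serialize to.
function toDgraphProject(title: string, embedding: number[]): Record<string, string> {
  return {
    "dgraph.type": "Project",
    "Project.title": title,
    // The vector travels as a JSON-encoded string, as in addProject.
    "Project.embedding": JSON.stringify(embedding),
  };
}

// Hypothetical helper: builds the full mutation payload for a list of titles,
// given any function that maps a title to its embedding.
function buildPayload(titles: string[], embed: (title: string) => number[]): string {
  return JSON.stringify(titles.map((t) => toDgraphProject(t, embed(t))));
}
```

Because no uid is supplied for these objects, Dgraph creates new nodes and reports their uids in the mutation response, which is what addProject returns.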
Adding Data
Use the following mutation to add projects:
mutation AddProject($input: [ProjectInput!]!) {
  addProject(input: $input) {
    key
    value
  }
}
with the variables:
{
  "input": [
    { "title": "Multi-Use Chairs for Music Classes" },
    { "title": "Photography and Memories....Yearbook in the Works" },
    { "title": "Current Events in Second Grade" },
    { "title": "Great Green Garden Gables" },
    { "title": "Albert.io Prepares South LA students for AP Success!" },
    { "title": "Learning and Growing Through Collaborative Play in TK!" },
    { "title": "Sit Together, Learn Together, Grow Together!" },
    { "title": "Help Special Children Succeed with Social Skills!" },
    { "title": "iCreate with a Mini iPad" },
    { "title": "Photography and Memories....Yearbook in the Works" },
    { "title": "The Truth About Junk Food" },
    { "title": "I Can Listen" },
    { "title": "Making Math A Group Learning Experience" },
    { "title": "The Center Of Learning: Kindergarten Fun!" }
  ]
}
Semantic Search
Semantic search is added with a single function:
/**
 * Search projects by similarity to a given text
 */
export function searchProjects(search: string): Project[] {
  const embedding = embedText([search])[0];
  const topK = 3;
  const body = `
    uid
    Project.title
    Project.category {
      Category.name
    }
  `;
  return searchBySimilarity<Project>(DGRAPH_CONNECTION, embedding, "Project.embedding", body, topK, 0.5);
}
The searchProjects function:
- Computes an embedding of the input string.
- Performs a similarity search in Dgraph using the Project.embedding vector predicate.
The search function simply uses a Dgraph query with the similar_to function:
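import { JSON } from "json-as";
import { dgraph } from "@hypermode/modus-sdk-as";

// ListOf<T> is not shown in this post. It is assumed to be a @json class
// with a `list: T[]` field matching the `list` block of the query.
export function searchBySimilarity<T>(
  connection: string,
  embedding: f32[],
  predicate: string,
  body: string,
  topK: i32 = 10,
  threshold: f32 = 0.75
): T[] {
  // The var block fetches the topK nearest vectors, then derives a score
  // from the squared distance to the query vector. The list block keeps
  // only results above the threshold, best score first.
  const query = `
    query search($vector: float32vector) {
      var(func: similar_to(${predicate},${topK},$vector)) {
        vemb as ${predicate}
        dist as math((vemb - $vector) dot (vemb - $vector))
        score as math(1 - (dist / 2.0))
      }
      list(func:uid(score),orderdesc:val(score)) @filter(gt(val(score),${threshold})){
        ${body}
      }
    }`;
  const vars = new dgraph.Variables();
  vars.set("$vector", JSON.stringify(embedding));
  const dgraph_query = new dgraph.Query(query, vars);
  const response = dgraph.execute(connection, new dgraph.Request(dgraph_query));
  console.log(response.Json);
  return JSON.parse<ListOf<T>>(response.Json).list;
}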
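The score expression in the query is worth a second look. For two unit-length vectors a and b, the squared Euclidean distance expands to |a|² + |b|² - 2(a · b) = 2 - 2cos(a, b), so 1 - dist / 2 is exactly the cosine similarity. The TypeScript sketch below checks this numerically. It assumes the stored embeddings are normalized to unit length; for vectors that are not, the score is no longer a cosine, and the threshold loses that meaning.

```typescript
// Scale a non-zero vector to unit length.
function normalize(v: number[]): number[] {
  const len = Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return v.map((x) => x / len);
}

// Same arithmetic as the query: squared distance, then 1 - dist / 2.
function scoreFromDistance(a: number[], b: number[]): number {
  let dist = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    dist += d * d;
  }
  return 1 - dist / 2;
}

// Plain dot product, which equals the cosine for unit-length vectors.
function dotProduct(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

const unitA = normalize([3, 4]); // [0.6, 0.8]
const unitB = normalize([1, 0]); // [1, 0]

// Both values are 0.6 up to floating-point rounding.
const viaDistance = scoreFromDistance(unitA, unitB);
const viaCosine = dotProduct(unitA, unitB);

// Without normalization the score drifts far from the cosine: this is -9.
const unnormalizedScore = scoreFromDistance([3, 4], [1, 0]);
```

If your embedding model does not return unit-length vectors, normalizing them before storing and before searching keeps the score in the range from -1 to 1.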
Perform semantic searches with the searchProjects query:
query SearchProjects {
  searchProjects(search: "Photography and Memories....Yearbook in the Works") {
    id
    title
  }
}
The result will include the most relevant projects based on semantic similarity.
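From a client application, you would usually pass the search text as a GraphQL variable rather than writing it into the query string. Here is a minimal TypeScript sketch of building that request body. buildSearchRequest is a hypothetical helper, and the String! variable type is an assumption about how the generated API declares the search argument.

```typescript
interface GraphQLRequest {
  query: string;
  variables: Record<string, unknown>;
}

// Hypothetical helper: wraps the search text in a parameterized query.
function buildSearchRequest(search: string): GraphQLRequest {
  const query = `
    query SearchProjects($search: String!) {
      searchProjects(search: $search) {
        id
        title
      }
    }`;
  return { query, variables: { search } };
}

// JSON body to POST to your GraphQL endpoint.
const requestBody: string = JSON.stringify(buildSearchRequest("yearbook photos"));
```

Using a variable keeps quotes and other special characters in the search text from breaking the query.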
Conclusion
Dgraph’s native vector support brings the power of semantic search and associations directly into your graph database. By combining structured data with embedding-based intelligence, you can build sophisticated applications without relying on external tools.
This unified approach simplifies architecture, reduces latency, and enhances the capabilities of your knowledge graph.
Start exploring Modus and Dgraph’s vector capabilities today by signing up for Dgraph Cloud!