JUNE 5 2024
Using Vector similarity search in GraphQL
Dgraph 24.0.0 supports GraphQL similarity search based on vector indexes for efficient retrieval of similar content.

Similarity search in GraphQL
This post shows a simple example of a GraphQL schema with vector embeddings, along with the corresponding mutation and queries.
Deploy the following GraphQL schema:
type Project {
  id: ID!
  title: String! @id
  title_v: [Float!]
    @embedding
    @search(by: ["hnsw(metric: euclidean, exponent: 4)"])
}
In this schema, the field title_v is an embedding on which the HNSW algorithm is used to create a vector search index. The metric used to compute the distance between vectors (in this example) is Euclidean distance. A new directive, @embedding, has been introduced to designate one or more fields as vector embeddings. The @search directive has been extended to define the HNSW index and its distance metric. The exponent value is used to set reasonable defaults for HNSW internal tuning parameters. It is an integer giving the approximate number of vectors expected in the index, expressed as a power of 10. The default is 4 (10^4 vectors).
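As a rough illustration of how exponent relates to index size, the sketch below picks the smallest power of 10 that covers an expected vector count and formats the matching directive. The helper names exponent_for and search_directive, and the round-up rule, are assumptions made for this example and are not part of Dgraph; the post only says the value is approximate.

```python
def exponent_for(expected_vectors: int) -> int:
    """Return the smallest integer e such that 10**e >= expected_vectors.

    Hypothetical helper: rounding up is an assumption of this sketch.
    """
    if expected_vectors < 1:
        raise ValueError("expected_vectors must be positive")
    exponent = 0
    while 10 ** exponent < expected_vectors:
        exponent += 1
    return exponent


def search_directive(expected_vectors: int, metric: str = "euclidean") -> str:
    """Format a @search directive in the same shape as the schema above."""
    exponent = exponent_for(expected_vectors)
    return f'@search(by: ["hnsw(metric: {metric}, exponent: {exponent})"])'


# About 10,000 expected vectors maps to the default exponent of 4.
print(search_directive(10_000))
```

With roughly 25,000 expected vectors, this sketch would round up to an exponent of 5.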
Once the schema is deployed successfully, let’s add some data via the auto-generated addProject mutation.
mutation {
  addProject(
    input: [
      {
        title: "iCreate with a Mini iPad"
        title_v: [0.12, 0.53, 0.9, 0.11, 0.32]
      }
      {
        title: "Resistive Touchscreen"
        title_v: [0.72, 0.89, 0.54, 0.15, 0.26]
      }
      { title: "Fitness Band", title_v: [0.56, 0.91, 0.93, 0.71, 0.24] }
      { title: "Smart Ring", title_v: [0.38, 0.62, 0.99, 0.44, 0.25] }
    ]
  ) {
    project {
      id
      title
      title_v
    }
  }
}
The auto-generated querySimilarProjectByEmbedding query allows us to run a semantic (aka similarity) search using the vector index specified in our schema.
Execute the query:
query {
  querySimilarProjectByEmbedding(
    by: title_v
    topK: 3
    vector: [0.1, 0.2, 0.3, 0.4, 0.5]
  ) {
    id
    title
    vector_distance
  }
}
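One way to execute this query from a script is to POST it as JSON to the GraphQL endpoint. The following is a minimal sketch, assuming a local Dgraph Alpha that serves GraphQL at http://localhost:8080/graphql; the URL and the helper names build_request and run_query are assumptions for illustration, so adjust them to your deployment.

```python
import json
import urllib.request

# Assumption: a local Dgraph Alpha serving GraphQL at this address.
GRAPHQL_URL = "http://localhost:8080/graphql"

# The similarity query from this post, sent as-is.
QUERY = """
query {
  querySimilarProjectByEmbedding(
    by: title_v
    topK: 3
    vector: [0.1, 0.2, 0.3, 0.4, 0.5]
  ) {
    id
    title
    vector_distance
  }
}
"""


def build_request(query: str, url: str = GRAPHQL_URL) -> urllib.request.Request:
    """Wrap a GraphQL query string in a JSON POST request."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query: str, url: str = GRAPHQL_URL) -> dict:
    """Send the query and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(query, url)) as response:
        return json.load(response)
```

Calling run_query(QUERY) against a running instance should return a JSON object whose data field holds the matching projects.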
The results of the querySimilarProjectByEmbedding query include the 3 closest Projects, ordered by vector_distance. The vector_distance is the Euclidean distance between the title_v embedding vector and the input vector used in our query.
Note: you can omit the vector_distance field from the query; the results will still be ordered by vector_distance.
The distance metric is specified when the index is created. In this example we used:
title_v: [Float!] @embedding @search(by: ["hnsw(metric: euclidean, exponent: 4)"])
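To make the ordering concrete, the sketch below computes the Euclidean distance from the query vector to each of the four sample embeddings by brute force and keeps the 3 nearest. It is an exact computation written for illustration, not Dgraph's HNSW implementation, which is approximate; the function name top_k is an assumption, and the distance values Dgraph reports may not match these numbers exactly.

```python
import math

# Sample embeddings from the addProject mutation above.
PROJECTS = {
    "iCreate with a Mini iPad": [0.12, 0.53, 0.9, 0.11, 0.32],
    "Resistive Touchscreen": [0.72, 0.89, 0.54, 0.15, 0.26],
    "Fitness Band": [0.56, 0.91, 0.93, 0.71, 0.24],
    "Smart Ring": [0.38, 0.62, 0.99, 0.44, 0.25],
}

# Input vector from the similarity query above.
QUERY_VECTOR = [0.1, 0.2, 0.3, 0.4, 0.5]


def top_k(query, vectors, k):
    """Return the k (title, distance) pairs closest to query, nearest first."""
    scored = [(title, math.dist(query, vec)) for title, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


for title, distance in top_k(QUERY_VECTOR, PROJECTS, 3):
    print(f"{title}: {distance:.3f}")
```

For this data the exact nearest neighbors are "iCreate with a Mini iPad", "Smart Ring", and "Resistive Touchscreen", in that order.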
We can also query for objects similar to an existing object, given its ID, using the querySimilar<Object>ById query.
query {
  querySimilarProjectById(by: title_v, topK: 3, id: "0xef7") {
    id
    title
    vector_distance
  }
}
In the example below, we use title to identify the project for which we want to find similar projects. In this case, the title field is an external ID, annotated with the @id directive in the schema. You can designate multiple fields as external IDs using the @id directive.
query {
  querySimilarProjectById(by: title_v, topK: 3, title: "Smart Ring") {
    title
    vector_distance
  }
}
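Conceptually, a by-ID search resolves the identifier to its stored embedding and then runs the same nearest-neighbor search with that vector. The sketch below illustrates the idea with an exact brute-force search over the sample data, using title as the key. The function name similar_by_title is hypothetical, and keeping the reference project in its own results (at distance 0) is an assumption of this sketch; the post does not say whether Dgraph does the same.

```python
import math

# Sample embeddings from the addProject mutation, keyed by the @id field.
PROJECTS_BY_TITLE = {
    "iCreate with a Mini iPad": [0.12, 0.53, 0.9, 0.11, 0.32],
    "Resistive Touchscreen": [0.72, 0.89, 0.54, 0.15, 0.26],
    "Fitness Band": [0.56, 0.91, 0.93, 0.71, 0.24],
    "Smart Ring": [0.38, 0.62, 0.99, 0.44, 0.25],
}


def similar_by_title(title, k):
    """Look up the stored vector for title, then rank all projects against it."""
    reference = PROJECTS_BY_TITLE[title]
    scored = [
        (other, math.dist(reference, vec))
        for other, vec in PROJECTS_BY_TITLE.items()
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


for title, distance in similar_by_title("Smart Ring", 3):
    print(f"{title}: {distance:.3f}")
```

In this sketch "Smart Ring" comes first at distance 0, followed by "Fitness Band" and "iCreate with a Mini iPad".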