MARCH 6 2025

Dgraph v24.1: Making knowledge graphs faster with performance enhancements

Dgraph v24.1 enhances knowledge graph performance with a new query planner, faster writes, improved caching, and expanded data capabilities

Harshil Goel
Software Engineer, Hypermode

We're excited to announce the release of Dgraph v24.1, an in-place upgrade that delivers significant performance improvements, new features, and important bug fixes. Key updates include a new query planner, faster writes on indexed predicates, and a more efficient cache for faster reads.

New features

  • Query planner for more efficient evaluation of eq() queries.
  • BigFloat datatype for floats with precision up to 200, expanding Dgraph's capabilities for handling high-precision numerical data.

Performance updates

  • Dramatically faster count index insertions (up to 99.5%).
  • Optimized UID retrieval from posting lists for faster query execution.
  • Reduced vector index rebuild time for faster large-scale vector similarity search and knowledge graph RAG workflows.
  • Up to 50% faster mutations with indexes for improved overall write performance.
  • Scalar postings for predicates without a list to further optimize storage and query execution.
  • Ristretto Posting List Cache with greater control over cache behavior.

In addition, there are ongoing bug fixes and improvements (see here for full list).

How to upgrade to Dgraph v24.1

To upgrade from version 23.0 or higher, simply:

  1. Replace the binary.
  2. Rebuild count indexes if you've previously encountered "Transaction Is Too Old" issues.

What's new, in detail

Optimized query planner for more effective database queries

The new query planner optimizes query execution, starting with one of the most used functions: eq(). By estimating the number of items a query will return, the planner determines the fastest way to retrieve the data. This currently works for any eq() function that is not at the query root.

Benchmarks show improvements of more than 1000x in query time for certain eq() queries. The example query below used to take more than one second; in Dgraph v24.1 it takes about 0.18 milliseconds.

Query: { q(func: eq(name, "name1")) @filter(eq(dgraph.type, "person")) { uid } }

Query                            Average time in v24.0    Average time in v24.1    Improvement
O(name1) = 20, O(person) = 10    1.239 seconds            0.177312 milliseconds    99.5%

In previous versions of Dgraph, executing this query would retrieve all the UIDs of type person and intersect them with the result of eq(name, "name1"). Now Dgraph tracks O(person), the estimated number of person nodes, and the query planner knows that the result can contain at most O(name1) items. It therefore only checks whether the UIDs returned for name1 have the type person.
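The difference between the two strategies can be illustrated with a minimal Go sketch. This is not Dgraph's planner code: the function names, the in-memory type map, and the rule in choosePlan are all assumptions made for illustration, and the data is made up.

```go
package main

import "fmt"

// intersectSorted returns the UIDs present in both sorted lists.
// It mirrors the older strategy: load both full lists, then intersect them.
func intersectSorted(a, b []uint64) []uint64 {
	out := []uint64{}
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default:
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

// filterCandidates checks each candidate UID individually, so the work is
// bounded by the number of candidates, not by the size of the type list.
func filterCandidates(candidates []uint64, typeOf map[uint64]string, want string) []uint64 {
	out := []uint64{}
	for _, uid := range candidates {
		if typeOf[uid] == want {
			out = append(out, uid)
		}
	}
	return out
}

// choosePlan is a hypothetical planner rule: when the root function is
// estimated to return fewer items than the filter, check candidates one by
// one instead of loading the filter's full UID list.
func choosePlan(rootEstimate, filterEstimate int) string {
	if rootEstimate < filterEstimate {
		return "check-candidates"
	}
	return "intersect"
}

func main() {
	// Made-up data: 1000 person nodes plus one company node.
	typeOf := map[uint64]string{}
	persons := []uint64{}
	for uid := uint64(1); uid <= 1000; uid++ {
		typeOf[uid] = "person"
		persons = append(persons, uid)
	}
	typeOf[2000] = "company"
	name1 := []uint64{3, 7, 2000} // UIDs whose name is "name1"

	fmt.Println("plan:", choosePlan(len(name1), len(persons)))
	fmt.Println("intersect:", intersectSorted(name1, persons))
	fmt.Println("check-candidates:", filterCandidates(name1, typeOf, "person"))
}
```

Both strategies return the same UIDs. The second one touches only the three name1 candidates instead of walking the full list of person UIDs.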

We will extend the query planner to other types of functions, starting with range functions like gt and lt. We are also going to change the order in which a list of filters is applied. Currently, all filters are applied at once, but some filters would perform better if they were applied later.

Faster reads & improved consistency with the Posting List Cache

Posting Lists are how every node and predicate combination is stored inside of Dgraph.

To reduce disk reads and improve latency during high-throughput periods, we've redesigned the cache so that it contains only the entry with the latest timestamp for each key. This redesign helps ensure that no additional disk reads are performed when the data is already available in the cache.

Dgraph v24.1 also introduces a new configuration for the cache. This changes the default to update values in the cache after each mutation. Previously, Dgraph removed data from the cache on update to prevent adding too many mutations and causing cache bloat.

For use cases where num(mutations) >> num(queries), this configuration can be changed to achieve a higher hit-miss ratio from the cache. With this release, we have introduced new metrics in Dgraph to better track cache hits and misses for debugging purposes.
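To make the two cache behaviors concrete, here is a minimal Go sketch of a cache that either updates or drops an entry on mutation, with hit and miss counters. It is a toy model under stated assumptions: the type and function names are hypothetical, the "disk" is a plain map, and none of this reflects Dgraph's or Ristretto's actual API or configuration names.

```go
package main

import "fmt"

type writePolicy int

const (
	invalidateOnWrite writePolicy = iota // drop the cached entry on mutation
	updateOnWrite                        // refresh the cached entry on mutation
)

// cachedStore is a hypothetical store: a map standing in for disk, a map
// standing in for the cache, and counters for cache hits and misses.
type cachedStore struct {
	policy       writePolicy
	cache        map[string]string
	disk         map[string]string
	hits, misses int
}

func newCachedStore(p writePolicy) *cachedStore {
	return &cachedStore{policy: p, cache: map[string]string{}, disk: map[string]string{}}
}

// mutate always writes to disk, then applies the configured cache policy.
func (s *cachedStore) mutate(key, val string) {
	s.disk[key] = val
	if s.policy == updateOnWrite {
		s.cache[key] = val
	} else {
		delete(s.cache, key)
	}
}

// read serves from the cache when possible and falls back to disk on a miss.
func (s *cachedStore) read(key string) string {
	if v, ok := s.cache[key]; ok {
		s.hits++
		return v
	}
	s.misses++
	v := s.disk[key]
	s.cache[key] = v
	return v
}

// runWorkload alternates one mutation and one read on the same key.
func runWorkload(p writePolicy) (hits, misses int) {
	s := newCachedStore(p)
	for i := 0; i < 3; i++ {
		s.mutate("k", fmt.Sprintf("v%d", i))
		s.read("k")
	}
	return s.hits, s.misses
}

func main() {
	h, m := runWorkload(updateOnWrite)
	fmt.Println("update on write: hits", h, "misses", m)
	h, m = runWorkload(invalidateOnWrite)
	fmt.Println("invalidate on write: hits", h, "misses", m)
}
```

In this toy workload every read follows a mutation of the same key, so updating on write turns each read into a hit while invalidating turns each read into a miss. Real workloads mix keys and access patterns, which is why the policy is configurable.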

Benchmarks show an average performance improvement of 38.66% with the Posting List cache enabled compared to v24.0. To test this change, we used a 21 million entry RDF dataset with standardized queries in a Go benchmark suite, which runs every query multiple times and computes the average time each query took.

Up to 3x faster count queries with mutable layer upgrades

Dgraph's Posting List is divided into two sections: mutable and immutable.

The immutable layer consists of previous commits and is stored as a sorted list. The mutable layer consists of a list of recent mutations sorted chronologically. In v24.1, we've upgraded the mutable layer into its own class from a map.

With this upgrade, the mutable layer can be shared across different mutations without copying memory. The new layer allows us to create data structures that are optimized for querying, eliminating the need to sort data for every query, as was necessary with the previous data storage method.

Having the mutable layer in shared memory gives Dgraph a clear structure that allows it to compute and store any value, so Dgraph can now store Count inside memory easily. For instance, with Count Index, Dgraph previously counted the number of items each time. Now, Dgraph determines the length as the data is read, saving a significant amount of time. Benchmarks show up to 3x faster count queries.
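The idea of keeping the count up to date as mutations arrive, instead of recounting on every query, can be sketched in Go as follows. This is an illustrative model only: the postingList type, its fields, and the use of a map for the net effect of mutations are assumptions, not Dgraph's actual mutable layer.

```go
package main

import (
	"fmt"
	"sort"
)

// postingList is a hypothetical two-layer list: a sorted immutable layer plus
// a mutable layer recording the net effect of recent mutations per UID.
type postingList struct {
	immutable []uint64        // committed UIDs, sorted ascending
	mutable   map[uint64]bool // uid -> present (true) or deleted (false)
	length    int             // count maintained on every mutation
}

func newPostingList(committed []uint64) *postingList {
	return &postingList{immutable: committed, mutable: map[uint64]bool{}, length: len(committed)}
}

// inImmutable binary-searches the sorted immutable layer.
func (p *postingList) inImmutable(uid uint64) bool {
	i := sort.Search(len(p.immutable), func(i int) bool { return p.immutable[i] >= uid })
	return i < len(p.immutable) && p.immutable[i] == uid
}

// contains lets the mutable layer override the immutable layer.
func (p *postingList) contains(uid uint64) bool {
	if present, ok := p.mutable[uid]; ok {
		return present
	}
	return p.inImmutable(uid)
}

// apply records an add or delete and adjusts the stored length only when
// membership actually changes.
func (p *postingList) apply(uid uint64, del bool) {
	was := p.contains(uid)
	p.mutable[uid] = !del
	if was && del {
		p.length--
	} else if !was && !del {
		p.length++
	}
}

// recount walks both layers, which is the work a stored length avoids.
func (p *postingList) recount() int {
	n := 0
	for _, uid := range p.immutable {
		if p.contains(uid) {
			n++
		}
	}
	for uid, present := range p.mutable {
		if present && !p.inImmutable(uid) {
			n++
		}
	}
	return n
}

func main() {
	p := newPostingList([]uint64{1, 2, 3})
	p.apply(4, false) // add a new UID
	p.apply(2, true)  // delete a committed UID
	p.apply(4, false) // repeated add, membership unchanged
	fmt.Println("stored length:", p.length, "recounted:", p.recount())
}
```

Reading p.length is constant time, while recount scans both layers. The two always agree, so a count query can use the stored value.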

Query       Average time in v24.0    Average time in v24.1
Query-16    2.8770                   0.918
Query-44    3.149                    0.979

Reducing index performance overhead with mutation upgrades

Dgraph now reads less data from disk, with benchmarks showing 50% improvement for inserting list predicates. When inserting predicates with indexes, Dgraph previously needed to read the old value to remove associated indexes, which was time-consuming.

There are two types of indexes:

  1. List Indexes: Reading the old value is unnecessary, as it doesn't make sense in the context of a list.
  2. Scalar Indexes: Only the latest value needs to be read. Dgraph now uses Badger.Get (which is faster and reads less disk data) instead of Badger.Iterate (which reads the entire history) to read data.

The only exception where it's necessary to read the entire list to determine its length is when dealing with count indexes.
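The reason scalar and list predicates differ can be shown with a small Go sketch of index maintenance. Everything here is hypothetical (the indexedStore type, its in-memory maps, and the oldReads counter) and stands in for the real storage layer; it does not use Badger's API.

```go
package main

import "fmt"

// indexedStore is a hypothetical in-memory model of predicates with a value
// index. oldReads counts how often a write had to read the previous value.
type indexedStore struct {
	scalars  map[uint64]string          // uid -> single value
	lists    map[uint64][]string        // uid -> list of values
	index    map[string]map[uint64]bool // value -> set of UIDs
	oldReads int
}

func newIndexedStore() *indexedStore {
	return &indexedStore{
		scalars: map[uint64]string{},
		lists:   map[uint64][]string{},
		index:   map[string]map[uint64]bool{},
	}
}

func (s *indexedStore) addIndex(val string, uid uint64) {
	if s.index[val] == nil {
		s.index[val] = map[uint64]bool{}
	}
	s.index[val][uid] = true
}

// setScalar replaces the value, so the index entry for the old value must be
// removed. Only the latest value is needed for that, not the full history.
func (s *indexedStore) setScalar(uid uint64, val string) {
	s.oldReads++
	if old, ok := s.scalars[uid]; ok {
		delete(s.index[old], uid)
	}
	s.scalars[uid] = val
	s.addIndex(val, uid)
}

// appendList only adds a value, so no existing index entry becomes stale and
// the old contents never need to be read.
func (s *indexedStore) appendList(uid uint64, val string) {
	s.lists[uid] = append(s.lists[uid], val)
	s.addIndex(val, uid)
}

func main() {
	s := newIndexedStore()
	s.setScalar(1, "a")
	s.setScalar(1, "b")
	s.appendList(2, "x")
	s.appendList(2, "y")
	fmt.Println("index[a] has uid 1:", s.index["a"][1])
	fmt.Println("index[b] has uid 1:", s.index["b"][1])
	fmt.Println("old-value reads:", s.oldReads)
}
```

In this model the two list appends perform no old-value reads at all, while each scalar write performs exactly one read of the current value.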

Thank you to the Dgraph community

This release would not be possible without the work of the wonderful Dgraph community.

Thank you to the 20 community members who contributed 120 commits, making this release possible. Your dedication and effort are instrumental to our success! We are actively addressing submitted issues and feedback, and encourage your continued participation in the community.

Dgraph v24.1 delivers substantial performance improvements, new features, and bug fixes. Upgrading is easy, and the benefits are significant. We encourage all users to upgrade and experience the enhanced performance and capabilities of Dgraph v24.1.

Get started today! If you are interested in early access to Dgraph v25, let us know here.