Feature: Temporal Aggregating Index #15553

Max-Meldrum · 2024-05-16T14:05:15Z

I recently wrote a blog post about speeding up temporal aggregation queries significantly in DataFusion by using µWheel.

µWheel could potentially also be used by Databend to implement a temporal version of the Aggregating Index that pre-materializes aggregates across time.

I'd be happy to help if there is interest.

sundy-li · 2024-05-22T00:41:15Z

I wondered if it works with the distributed warehouse?

Max-Meldrum · 2024-05-22T08:42:28Z

> I wondered if it works with the distributed warehouse?

I would say that µWheel can be used in two different modes:

Stream Mode:

This mode assumes that the wheel is incrementally updated by a streaming system.
µWheel is designed around low watermarking, meaning it is up to the user/system to advance the internal time,
which causes aggregates to roll up over time.

A low watermark w indicates that all records with timestamps t where t <= w have been ingested. This means
a wheel will start rejecting data with timestamps below the watermark. This assumption may not be fully compatible
with non-streaming systems.
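
To make the watermark behaviour concrete, here is a minimal Python sketch of the idea. It is not the µWheel API: `ToyWheel`, `insert`, `advance_to`, and `slot_size` are hypothetical names, the aggregate is a plain SUM, and the sketch assumes (following the definition above) that a record with t <= w counts as late.

```python
from collections import defaultdict


class ToyWheel:
    """Toy SUM aggregator driven by a low watermark (hypothetical, not the µWheel API)."""

    def __init__(self, start=0, slot_size=10):
        self.watermark = start         # records with t <= watermark are assumed ingested
        self.slot_size = slot_size     # width of one rolled-up time slot
        self.pending = []              # records ahead of the watermark, not yet rolled up
        self.slots = defaultdict(int)  # slot start -> rolled-up SUM

    def insert(self, timestamp, value):
        """Accept a record only if it is ahead of the watermark."""
        if timestamp <= self.watermark:
            return False               # late record: rejected (assumed semantics)
        self.pending.append((timestamp, value))
        return True

    def advance_to(self, watermark):
        """Advance time and roll up every pending record with t <= watermark."""
        if watermark < self.watermark:
            raise ValueError("a low watermark never moves backwards")
        still_pending = []
        for timestamp, value in self.pending:
            if timestamp <= watermark:
                slot = (timestamp // self.slot_size) * self.slot_size
                self.slots[slot] += value
            else:
                still_pending.append((timestamp, value))
        self.pending = still_pending
        self.watermark = watermark


wheel = ToyWheel(start=0, slot_size=10)
accepted_a = wheel.insert(5, 1)
accepted_b = wheel.insert(12, 2)
wheel.advance_to(10)       # the record at t=5 rolls up; t=12 stays pending
late = wheel.insert(7, 3)  # t=7 is behind the watermark, so it is rejected
```

The caller, not the wheel, decides when time moves forward, which is why a system without a notion of watermarks cannot use this mode directly.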

Index Mode:

However, if you are working with static, read-only datasets that are time-partitioned, then µWheel is ideal as an
index on top of this data.

So, to answer the question: if the distributed warehouse does not adopt low watermarking, it is still possible to use µWheel
in Index mode. The results from different µWheel instances can be merged if the data is sharded.
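
As a rough illustration of why merging across shards works, the sketch below builds a per-hour (sum, count) index on each shard and then combines them. The function names (`build_index`, `merge_indexes`, `query_sum`) and the dictionary layout are assumptions made for this example, not µWheel's data structures. The point is only that partial aggregates such as SUM and COUNT combine associatively.

```python
def build_index(records, slot_size=3600):
    """Pre-aggregate (timestamp, value) records into per-slot (sum, count) pairs."""
    index = {}
    for timestamp, value in records:
        slot = (timestamp // slot_size) * slot_size
        total, count = index.get(slot, (0, 0))
        index[slot] = (total + value, count + 1)
    return index


def merge_indexes(*indexes):
    """Combine per-shard indexes by adding the partial aggregates slot by slot."""
    merged = {}
    for index in indexes:
        for slot, (total, count) in index.items():
            merged_total, merged_count = merged.get(slot, (0, 0))
            merged[slot] = (merged_total + total, merged_count + count)
    return merged


def query_sum(index, start, end):
    """SUM over the half-open, slot-aligned time range [start, end)."""
    return sum(total for slot, (total, _) in index.items() if start <= slot < end)


# Two shards holding disjoint parts of the same time-partitioned dataset.
shard_a = [(0, 10), (3600, 20)]
shard_b = [(1800, 5), (7200, 7)]

merged = merge_indexes(build_index(shard_a), build_index(shard_b))
first_two_hours = query_sum(merged, 0, 7200)
```

Merging the per-shard indexes gives the same result as indexing all records in one place, so each node can build its index independently.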

Max-Meldrum added the C-feature Category: feature label May 16, 2024

Max-Meldrum mentioned this issue May 29, 2024

µWheel for OLAP Indexing - Tracking Issue uwheel/uwheel#126