Just browsing the Quickwit documentation, it seems like the general architecture here is to write JSON logs but store them compressed. Is this just something like gzip compression? 20% compressed size does seem to align with ballpark estimates for gzipping JSON. This is what Quickwit (and this page) calls a "document": a single JSON record (just FYI).
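A quick way to sanity-check that figure is to gzip some synthetic JSON log lines yourself; the record shape below is made up, and real ratios obviously depend on how repetitive your own logs are:

```
# Rough sanity check of the ~20% figure on synthetic JSON log records.
# The record shape is invented; real ratios depend on your actual logs.
import gzip
import json
import random

records = []
for i in range(50_000):
    records.append(json.dumps({
        "timestamp": 1700000000 + i,
        "level": random.choice(["INFO", "WARN", "ERROR"]),
        "service": "checkout",
        "message": f"request completed in {random.randint(1, 500)} ms",
        "trace_id": f"{random.getrandbits(64):016x}",
    }))

raw = ("\n".join(records)).encode()
gz = gzip.compress(raw, compresslevel=6)
print(f"raw: {len(raw)/1e6:.1f} MB, gzip: {len(gz)/1e6:.1f} MB "
      f"({100*len(gz)/len(raw):.0f}%)")
```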
Additionally, you need to store indices, since those are what you actually search. Indices have a storage cost when you write them too.
When I see a system like this my thoughts go to questions like:
- What happens when you alter an index configuration? Or add or remove an index?
- How quickly do indexes update when this happens?
- What about cold storage?
Data retention is another issue. Indexes have a retention config [1], but it's not immediately clear to me how document retention works. Possibly via S3 expiration?
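If retention really is delegated to S3, I'd guess it looks something like an ordinary lifecycle expiration rule on the index prefix; to be clear, this is pure speculation on my part, and the bucket/prefix names below are placeholders:

```
# Speculative sketch: document retention as an S3 lifecycle expiration rule.
# Bucket and prefix names are placeholders, not Quickwit's actual layout.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-quickwit-bucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "expire-old-splits",
            "Filter": {"Prefix": "indexes/my-logs/"},
            "Status": "Enabled",
            "Expiration": {"Days": 60},  # matches the ~60 day retention discussed below
        }]
    },
)
```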
So, network transfer out of S3 is relatively expensive ($0.05/GB standard pricing [2] to the Internet, less to other AWS regions). This will be a big factor in cost. I'm really curious to know how much all of this actually costs per PB per month.
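Back of the envelope, using the $0.05/GB egress figure above and something like $0.023/GB-month for standard storage (real S3 pricing is tiered and region-dependent, so treat these as placeholders):

```
# Back-of-envelope S3 costs per PB, using placeholder rates quoted above.
GB_PER_PB = 1_000_000

storage_per_gb_month = 0.023  # assumed S3 Standard first-tier rate
egress_per_gb = 0.05          # rate quoted above for egress to the Internet

print(f"storage: ${storage_per_gb_month * GB_PER_PB:,.0f} per PB-month")
print(f"egress to the Internet: ${egress_per_gb * GB_PER_PB:,.0f} per PB read out")
```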
IME you almost never need to log and store this much data, and there's almost no reason to ever keep it. Most logs are useless, and you have to question what the purpose of any given log is. Even if you're logging errors, you're likely to get the exact same value out of 1% sampling as you are from logging everything.
You might even get more value with 1% sampling, because querying and monitoring can be a whole lot easier with substantially less data to deal with.
Likewise, metrics tend to work just as well from sampled data.
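If you do go the sampling route, one trick is to sample deterministically on a request/trace ID so that every line belonging to a sampled request is kept together. A minimal sketch (field names are made up):

```
# Deterministic 1% sampling keyed on a request/trace ID, so all lines for a
# sampled request are kept together (field names are invented).
import hashlib

SAMPLE_RATE = 0.01  # keep 1%

def keep(record: dict) -> bool:
    key = record.get("trace_id", "")
    # Hash to a number in [0, 1]; stable across processes and restarts.
    h = int(hashlib.sha256(key.encode()).hexdigest()[:8], 16) / 0xFFFFFFFF
    return h < SAMPLE_RATE

log = {"trace_id": "ab12cd34", "level": "ERROR", "message": "upstream timeout"}
if keep(log):
    print("ship to the sampled store")
```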
This post suggests 60 day log retention (100PB / 1.6PB daily). I would probably divide this into:
1. Metrics storage. You can get this from logs, but you'll often find it useful to write it directly if you can. Deriving metrics from logs can be error-prone (e.g. a log format changes, the sampling rate changes, and so on);
2. Sampled data, generally for debugging. I would generally try to keep this at 10TB or less;
3. "Offline" data, which you would generally only query if you absolutely had to. This is particularly true on S3, for example, because the write costs are basically zero but the read costs are expensive.
Additionally, you'd want to think about data aggregation, as a lot of your logs are only useful when combined in some way.
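As a rough illustration of what I mean by aggregation, rolling raw lines up into per-minute counts per service and level before they ever hit long-term storage (record shape is made up):

```
# Roll raw log lines up into per-minute counts per (service, level).
from collections import Counter

def aggregate(records):
    counts = Counter()
    for r in records:
        minute = r["timestamp"] - (r["timestamp"] % 60)
        counts[(minute, r["service"], r["level"])] += 1
    return counts

records = [
    {"timestamp": 1700000003, "service": "checkout", "level": "ERROR"},
    {"timestamp": 1700000041, "service": "checkout", "level": "ERROR"},
    {"timestamp": 1700000075, "service": "checkout", "level": "INFO"},
]
for (minute, service, level), n in aggregate(records).items():
    print(minute, service, level, n)
```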
Quickwit (like Elasticsearch/OpenSearch) stores your data compressed with ZSTD in a row store, builds a full-text search index, and stores some of your fields in a columnar store. The "compressed size" includes all of this.
The high compression rate is VERY specific to logs.
> What happens when you alter an index configuration?
Changing an index mapping was not available in 0.8. It is available in main and will be added in 0.9. The change only impacts new data.
> Or add or remove an index?
This has been handled since the beginning.
> What about cold storage?
What makes Quickwit special is that everything we read is on S3. We adapted our inverted index to make it possible to read it straight from S3.
You might think this is crazy slow, but we typically search through TBs of data in less than a second. We have some in-RAM caches too, but they are entirely optional.
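Conceptually, this relies on ordinary S3 byte-range reads: fetch only the slices of an index you need instead of whole files. A toy sketch with boto3 (bucket, key, and offsets are made up, not our real storage layout):

```
# Toy illustration of byte-range reads from S3: fetch only the slice of an
# index file that a query needs. Bucket/key/offsets are placeholders.
import boto3

s3 = boto3.client("s3")

def read_slice(bucket: str, key: str, start: int, length: int) -> bytes:
    resp = s3.get_object(
        Bucket=bucket,
        Key=key,
        Range=f"bytes={start}-{start + length - 1}",
    )
    return resp["Body"].read()

# e.g. pull one posting-list region without downloading the whole split
chunk = read_slice("my-quickwit-bucket", "indexes/my-logs/split-01.idx", 4096, 65536)
print(len(chunk))
```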
> 2. Sampled data, generally for debugging. I would generally try to keep this at 10TB or less;
Sometimes, sampling is not possible. For instance, some Quickwit users (including Binance) use their logs for user support too. A user might come asking for details about something fishy that happened 2 months ago.
You have very good questions; I can only guess at one answer: S3 network transfer is free for AWS services in the same region.
Your link[1] said:
> You pay for all bandwidth into and out of Amazon S3, except for the following:
> [...]
> - Data transferred from an Amazon S3 bucket to any AWS service(s) within the same AWS Region as the S3 bucket (including to a different account in the same AWS Region).
[1]: https://quickwit.io/docs/overview/concepts/indexing
[2]: https://aws.amazon.com/s3/pricing/