None of this. It's in the blog post in a lot of detail =)
The 5ms write latency is because the backend distributed block storage layer does synchronous replication to multiple servers, for high availability and durability, before acknowledging a write. (And, to be honest, this path has not yet been heavily optimized for latency.)
I'm not aware of any published source for this time limit, nor of any way to reduce it.
The docs do say, however, "If the volume has been impaired for more than 20 minutes, you can contact the AWS Support Center." [0], which suggests it's some expected cleanup/remount interval.
That is, it is something that we regularly encounter when EC2 instances fail, so we were sharing from personal experience.
Tiger Cloud certainly continues to run on AWS. We have built it to rely on fairly low-level AWS primitives like EC2, EBS, and S3 (as opposed to some of the higher-level service offerings).
Our existing Postgres fleet, which uses EBS for storage, still serves thousands of customers today; nothing has changed there.
What’s new is Fluid Storage, our disaggregated storage layer that currently powers the new free tier (while in beta). In this architecture, the compute nodes running Postgres still access block storage over the network. But instead of that being AWS EBS, it’s our own distributed storage system.
From a hardware standpoint, the servers that make up the Fluid Storage layer are standard EC2 instances with fast local disks.
- EBS typically operates in the millisecond range. AWS' own documentation suggests "several milliseconds"; our own experience with EBS is 1-2 ms. Reads/writes to local disk alone are certainly faster, but it's more meaningful to compare this against other forms of network-attached storage.
- If durability matters, async replication isn't really the right baseline for local-disk setups. Most production deployments of Postgres (and other databases) rely on synchronous replication -- or "semi-sync," which still waits for acknowledgment from at least one standby (or a quorum of them) before committing -- which in the cloud lands you back in the single-digit millisecond range for writes (see the sketch below).
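For concreteness, here's a minimal sketch of what that looks like in stock Postgres, assuming a primary with two standbys whose names (standby_a, standby_b) are purely illustrative: COMMIT doesn't return until at least one standby has acknowledged the WAL, and that network round trip is where the extra milliseconds come from.

```sql
-- Run on the primary. Standby names below are hypothetical placeholders.
-- COMMIT waits until at least ONE of the two listed standbys acknowledges.
ALTER SYSTEM SET synchronous_standby_names = 'ANY 1 (standby_a, standby_b)';

-- 'on' waits for the standby to flush the WAL to disk;
-- 'remote_write' / 'remote_apply' trade durability vs. latency differently.
ALTER SYSTEM SET synchronous_commit = 'on';

SELECT pg_reload_conf();
```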
Our experience is that ClickHouse and Timescale are designed for different workloads, and that Timescale is optimized for many of the time-series workloads people run in production:
TimescaleDB primarily serves operational use cases: developers building products on top of live data, where you're regularly streaming in fresh data and often know a priori what many queries look like, because those queries power your live APIs, dashboards, and product experience.
That's different from a data warehouse or many traditional "OLAP" use cases, where you might dump a big dataset statically and then occasionally run ad-hoc queries against it. This is the big weather-dataset file sitting on your desktop that you occasionally query while on holiday.
So it's less about "can you store weather data?" and more about what that use case looks like: How are the queries shaped? Are you saving a single dataset for ad-hoc queries across the entire dataset, or continuously streaming in new data and aging out or de-prioritizing old data?
In most of the products we serve, customers are often interested either in recent data in a very granular format ("shallow and wide") or in longer historical queries along a well-defined axis ("deep and narrow").
For example, this is where the benefits of TimescaleDB's segmented columnar compression emerge. It optimizes for the queries that are very common in your application, e.g., an IoT application that groups or filters by deviceID, crypto/fintech analysis based on the ticker symbol, product analytics based on tenantID, etc.
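As a rough sketch of how that is expressed in TimescaleDB (the hypertable and column names here are made-up placeholders), you tell compression which key to segment by and how to order rows within each segment:

```sql
-- Hypothetical IoT hypertable; adjust names to your schema.
ALTER TABLE sensor_readings SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'device_id',  -- the key your queries filter/group on
  timescaledb.compress_orderby   = 'time DESC'
);

-- Compress chunks once they're older than 7 days.
SELECT add_compression_policy('sensor_readings', INTERVAL '7 days');
```

Queries that filter on device_id then only need to decompress the matching segments rather than entire chunks.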
If you look at ClickBench, what most of the queries effectively say is: scan ALL the data in your database and GROUP BY one of the 100 columns in the web-analytics logs.
There are almost no time predicates in the benchmark ClickHouse created, but perhaps that's not surprising, given it was designed for ad-hoc weblog analytics at Yandex.
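To make the contrast concrete, here are two illustrative query shapes (all table and column names are made up): a ClickBench-style full-scan aggregation versus the kind of time-bounded, key-filtered query an operational time-series app tends to run.

```sql
-- ClickBench-style ad-hoc analytics: scan everything, no time predicate.
SELECT user_agent, count(*) AS hits
FROM weblogs
GROUP BY user_agent
ORDER BY hits DESC
LIMIT 10;

-- Operational time-series query: narrow time window, filtered on a segment key.
SELECT time_bucket('1 minute', time) AS minute, avg(value) AS avg_value
FROM sensor_readings
WHERE device_id = 'dev-123'
  AND time > now() - INTERVAL '1 hour'
GROUP BY minute
ORDER BY minute;
```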
So yes, Timescale serves many products today that use weather data, but it has made different choices than ClickHouse (or things like DuckDB, pg_analytics, etc.) to serve those more operational use cases.
However, if you really want to optimize data currently residing in Postgres for analytical workloads, as the original comment suggests, consider moving to a dedicated OLAP DB like ClickHouse.
What we ended up doing is maintaining metadata in Postgres while storing the time-series data in ClickHouse. Thanks for making / working on ClickHouse. I appreciate it very much.
Indeed, ClickHouse results were run on an older instance type of the same family and size (c5.4xlarge for ClickHouse and c6a.4xlarge for Timescale), so if anything ClickHouse results are at a slight disadvantage.
Eh, c6a is also an AMD Rome which has worse memory bandwidth at the tails and weaker per thread performance than Cascadelake (c5). I don't understand anything about this particular benchmark, but I wouldn't compare them simply as "older vs newer".
Timescale was certainly the first choice, as we were already using Postgres. However, we could not get it to perform well, as our timestamps are simulated / non-monotonic. We also ultimately need to be able to manage low trillions of points in the long run. InfluxDB was also evaluated but had a number of issues as well (though I am certain both it and Timescale would work fine for some use cases).
I think perhaps because ClickHouse is a little more general purpose, it was easier to map our use case to it. Also, one thing I appreciate about ClickHouse is it doesn't feel like a black box - once you understand the data model it is very easy to reason about what will work and what will not.
Can you say more about "dynamic labels"? Do you just mean that, as your data evolves, you want to add new types of "key-value" pairs?
The most common approach here is just to store the set of "dynamic" labels in JSON, which can be evolved arbitrarily.
And we've found that this type of data actually compresses quite well in practice.
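A minimal sketch of that pattern (table and column names are only illustrative): keep well-known fields as regular columns and put the open-ended labels in a jsonb column, optionally with a GIN index for containment lookups.

```sql
CREATE TABLE metrics (
  time   timestamptz NOT NULL,
  value  double precision,
  labels jsonb                 -- arbitrary, evolving key-value labels
);
SELECT create_hypertable('metrics', 'time');

-- Optional: speeds up lookups like  WHERE labels @> '{"region": "eu-west-1"}'
CREATE INDEX metrics_labels_idx ON metrics USING GIN (labels);
```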
Also, regarding compression: Timescale supports transparent mutability on compressed data, so you can directly INSERT/UPDATE/UPSERT/DELETE into compressed data. Under the covers, it does smart optimizations to asynchronously map individual mutations into segment-level decompress/recompress operations.
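From the application's point of view, these are just ordinary statements against the (compressed) hypertable; for example, with placeholder names again:

```sql
-- Plain SQL; no manual decompress/recompress step in application code.
UPDATE sensor_readings
   SET value = 0
 WHERE device_id = 'dev-123'
   AND time BETWEEN '2024-01-01' AND '2024-01-02';

DELETE FROM sensor_readings
 WHERE device_id = 'dev-456'
   AND time < now() - INTERVAL '90 days';
```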