There is value in both; at Fauna we provide GraphQL out of the box. Using Fauna directly eliminates an indirection and is probably slightly more efficient. However, with a GraphQL layer like Prisma in between, you could essentially switch to any database with less impact on your application. This is tremendously interesting for people who develop frameworks, since using Prisma gives them support for multiple databases immediately. For application developers, it could allow you to move from a non-scalable database to a scalable one once that becomes necessary, or simply switch databases if database maintenance is causing you grief. I for one am looking forward to Prisma supporting Fauna, since if the interface is the same there are even fewer reasons not to choose a scalable managed database over managing your own db :). And I would say that Prisma's interface is quite great!
Note: the performance impact depends heavily on whether your database maps well onto ORMs. Traditional databases have an impedance mismatch when it comes to translating tables to an object format. Graph databases, or the way Fauna works (documents with relations and map/reduce-like joins), map well onto ORMs, so the performance impact would be small.
Which should be the case; depending on what you do, you will probably experience 10-50ms read latencies. See Evan's answers for why the measured values here are higher.
No it's not; that price is far smaller and should not impact pure reads. This is probably an artifact of an anti-pattern where the same documents are constantly updated, which creates significant history. At that point, history can have an impact on the index. We are working on optimizing this so that history will no longer affect these latencies while retaining the ability to go back in time or get changesets.
I need to do some benchmarking myself, but it seems that even in a local region writes are in the 100s of ms. I'm aiming for my Lambda functions to be < 100ms, so Fauna seems difficult to make work.
Unless you meant I can limit "how global" my data is and that would improve write speeds?
The Spanner approach can dynamically shift the leader location for parts of the key space, so writes that tend to come from one location can avoid communicating outside the local region.
Exactly! That's how I've built things until now: a mix of databases. But it's also harder to manage. Database vendors notice this, and the result is databases that start offering alternative ways of modeling.
Freeform flexibility is one aspect, but a document style could also simply be a preference for how you want to structure your data, and it typically has an impact on how joins happen (if the document database offers joins). Those joins work in a graph-like fashion instead of the way flat sets are typically joined. A nested document can also be an optimization to provide data in the exact format your client wants to see. Although some document databases have popularized the idea that you should join in the client because they didn't provide joins initially, it doesn't have to be that way.
Mixing paradigms in one database will probably become more common. Just as Postgres offers a 'document' style, some document databases are offering documents with relations. It wouldn't surprise me to see document databases offer optional schemas. I think the future is a mix of options and tools in one database (of which JSONB columns are a first step). Depending on the situation, we'll just model differently. The best database might become the one that lets us use these different tools most elegantly together. The difference between a document and a table is only a 'flatten' and a 'schema' away.
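To make that last point concrete, here is a tiny sketch in illustrative Python (not any database's actual API): flattening a nested document yields something table-shaped, and the flat keys are exactly where a schema would attach.

```python
# Sketch: a nested document is one 'flatten' away from a table row.
# The document and key names are made up for illustration.

def flatten(doc, parent_key="", sep="."):
    """Flatten a nested document into a single flat record (one table row)."""
    row = {}
    for key, value in doc.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            row.update(flatten(value, full_key, sep))  # recurse into sub-documents
        else:
            row[full_key] = value
    return row

doc = {"id": 1, "author": {"name": "Ada", "handle": "@ada"}, "body": "hello"}
row = flatten(doc)
# row == {"id": 1, "author.name": "Ada", "author.handle": "@ada", "body": "hello"}
```

Declaring types for those flat keys is the 'schema' half of the round trip.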
'NoSQL' can be transactional and relational. The question should always be: "this is my problem, what's the best database?" NoSQL is such a huge bucket that the original question doesn't make sense, imo. So is SQL; some traditional databases have quite nifty features to support specific patterns.
SQL will (maybe sadly?.. maybe not?) not go away. Many so-called 'NoSQL' databases are looking into providing SQL, or already provide it (with or without limitations), because users just want to use what they know.
I would be stoked for an SQLV2 standard!
I would go a step further: you can't even talk about NoSQL vs SQL. It's about database features, the join patterns, how scaling happens; the two are overlapping more and more and will continue to do so. Products built on SQL are aiming to scale, and 'NoSQL' is aiming to provide the features that SQL provides in a scalable manner. Her original statement was already quite confusing. A relational store doesn't necessarily mean SQL; many 'NoSQL' databases offer relations and are a perfect fit for social media, or were even built to support these kinds of applications :)
Exactly, but it goes further than that. The mentality never made sense, since the term NoSQL never made sense to start with. It's amazing how many people use a term that just originated from a meeting to talk about alternative databases, and how we keep using it although it's practically impossible to say what NoSQL is. Depending on whom you ask, the term means different things. This is a very good introduction to the term: https://www.youtube.com/watch?v=qI_g07C_Q5I
Graph databases are considered 'NoSQL', yet they have relations and transactions. Schemalessness is often also one of the properties given to NoSQL, but it's a bit strange to consider that a NoSQL attribute: some traditional databases offer schemaless options, and a database like Cassandra has a schema yet is considered NoSQL. I work at Fauna, which has relations and stronger consistency than many traditional databases. It is schemaless at this point, but that might change in the future. Since it doesn't offer SQL, it's thrown into the NoSQL bucket with the assumptions that come along with it.
None of these one-liners in computer science make sense IMHO, and we listen way too often to colleagues who use them. Similarly, "use SQL for enforced schema" might be accurate in many cases, but in essence it depends on your situation, and we need to research what we use instead of following one-liners ;)
Social media is typically quite heavy on tree traversals. That kind of pattern is very similar to resolving a deep ORM query or a deep GraphQL query, which also doesn't map very well onto 'traditional' relational databases https://en.wikipedia.org/wiki/Object%E2%80%93relational_impe.... I believe this 'issue' depends on:
A) the type of join
B) whether your relational database flattens between consecutive joins
C) whether there is easy/efficient pagination on multiple levels
The type of join shouldn't be a problem: SQL engines should in most cases be able to determine the best join, and in the cases where they can't, you can start tweaking (tricky to get right, especially if your data evolves, but possible; you probably want to pin your query plan). B, however, is tricky and a real performance loss: it's a bit silly that data is flattened into a set each time, only to then (probably) be put into a nested (object-oriented or JSON) format to hand to the client. This is closely related to C: in a social graph, some nodes (popular people or tweets) have far more links than others. That means that if you do a regular join of tweets and comments and sort it, one popular tweet's comments might fill the whole first page. Instead, you probably only want the first x comments per tweet. Such a query involves nested grouping, so it might look more like the following SQL (written from memory, probably not quite right):
SELECT
  tweet.*,
  jsonb_agg(to_jsonb(comment)) AS comments
FROM tweet
JOIN comment ON tweet.id = comment.tweet_id
GROUP BY tweet.id
HAVING COUNT(comment.tweet_id) < 64
LIMIT 64
That obviously becomes increasingly complex if you want a feed with comments, likes, retweets, people, etc., all in one.
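For what it's worth, one way to express "the first x comments per tweet" (rather than per page) in plain SQL is a window function. A minimal, runnable sketch using SQLite, where the schema and data are made up for illustration:

```python
import sqlite3

# Sketch: limit comments *per tweet* with ROW_NUMBER(), instead of the
# page-level LIMIT above. Table names and data are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tweet (id INTEGER PRIMARY KEY, body TEXT);
CREATE TABLE comment (id INTEGER PRIMARY KEY, tweet_id INTEGER, body TEXT);
INSERT INTO tweet VALUES (1, 'popular tweet'), (2, 'quiet tweet');
INSERT INTO comment VALUES
  (1, 1, 'c1'), (2, 1, 'c2'), (3, 1, 'c3'), (4, 2, 'c4');
""")

rows = conn.execute("""
SELECT tweet_id, body FROM (
  SELECT c.tweet_id, c.body,
         ROW_NUMBER() OVER (PARTITION BY c.tweet_id ORDER BY c.id) AS rn
  FROM comment c
) WHERE rn <= 2
ORDER BY tweet_id, rn
""").fetchall()
# Each tweet contributes at most its first 2 comments:
# [(1, 'c1'), (1, 'c2'), (2, 'c4')]
```

Note that this still returns a flat set; nesting the comments under their tweets is left to the client, which is exactly the flattening cost described above.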
There are reasons why two engineers who helped scale Twitter created a new database (https://fauna.com/), where I work. Although relational, the relations are done very differently: instead of flattening sets, you essentially walk the tree and join on each level. I made an attempt to explain that for the GraphQL case here: https://www.infoworld.com/article/3575530/understanding-grap...
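A sketch of what I mean by walking the tree, in illustrative Python (the data, names, and per-level limit are assumptions for the example, not Fauna's API):

```python
# Sketch: join per level and keep results nested, instead of flattening
# tweets x comments into one set. All data here is made up.
tweets = [{"id": 1, "body": "hi"}, {"id": 2, "body": "yo"}]
comments = [
    {"id": 10, "tweet_id": 1, "body": "nice"},
    {"id": 11, "tweet_id": 1, "body": "cool"},
    {"id": 12, "tweet_id": 1, "body": "wow"},
    {"id": 13, "tweet_id": 2, "body": "ok"},
]

def join_nested(parents, children, fk, key="comments", limit=2):
    """Attach up to `limit` children to each parent, preserving nesting."""
    out = []
    for p in parents:
        kids = [c for c in children if c[fk] == p["id"]][:limit]
        out.append({**p, key: kids})  # pagination happens per parent, per level
    return out

feed = join_nested(tweets, comments, "tweet_id")
# feed[0] is tweet 1 with its first 2 comments already nested under it
```

The point is that pagination happens at each level of the walk, so a popular node can't crowd out its siblings the way it can in a flat joined set.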
TL;DR: in my opinion you can definitely use a traditional relational database, but it might not be the most efficient choice due to the impedance mismatch. Relational applies to more than traditional SQL databases, though; a graph database or something like Fauna is also relational and would be a better match (Fauna is similar in the sense that its joins work much like a graph database's). Obviously I'm biased, since I work for Fauna.