Awesome, I'd love a pull request! I've been looking at FoundationDB, but haven't had time to test it. Porting the tests to another database is (hopefully) a mostly mechanical exercise.
Great article. A lot of engineers don't have personal experience with these kinds of network failures, so sharing stories of their consequences means more engineers can make informed (and conscious) decisions about how much risk their applications can tolerate.
One thing a reader could glean from this article (and I think it would be the wrong conclusion) is that the application or operations engineer is responsible for understanding the nuances of distributed systems. In my experience, the number of people relying on distributed systems is far larger than the number of people who understand these issues.
So what we really need are systems we can build on whose developers understand how to build (and test!) for the nuances of data convergence, consensus algorithms, split-brain avoidance, etc. We need systems that gracefully, and automatically, deal with and recover from network failures.
What kind of consistency do you expect to provide with this future syncing feature? I assume it will be eventually consistent. Is that right? How will conflicts be resolved?
Yes, on mobile, sync really only makes sense if it also works when the device has spotty or no connectivity, so that naturally entails eventual consistency in some form.
We are not ready to go into details about how our sync solution will work yet, but watch for some announcements soon.
By default it writes metadata about the stream (title, description, etc.) using a file-based database called nedb, and it appends the actual logged data to CSV files that are split into 500k chunks. When the user requests their logged data, all of the files are stitched back together, converted into the requested format (JSON, CSV, etc.), and streamed to the user's web client.
For the production server, we are currently using MongoDB for metadata storage and the same CSV module for logged data storage.
The author says that different data stores are good at different things, so we need to use multiple data stores in our application. He proposes that a data service layer can abstract these implementations, making it easier to swap out data stores as needed. I think that's a good idea. Separating our applications from data store specifics is a big reason why we use ORMs and ODMs today.
However, I think there are challenges with this polyglot data store architecture that he doesn't address. Each addition requires due diligence to understand its CAP trade-offs (which the author mentions briefly), its scalability and performance characteristics, how to configure it, etc. These are non-trivial concerns even for a single database. It's important to consider these challenges when building out a data store or data service.
I'd propose an architecture where the data services layer itself exposes different data models to the application, all of which are persisted in a single data store. Given that many data stores use a key-value store under the covers anyway, translating the specific data model down to a single, persistent data store would simplify operations while exposing the desired data model to the application. (As a caveat, this multi-model approach requires ACID transactions to ensure strict consistency when translating between data models.) This approach provides operational simplicity with just one data store and application efficiency by exposing the "right" data model API.
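To make the idea concrete, here is a toy sketch of one data model (documents) layered on a single key-value store, where a document update becomes several key writes committed atomically. The store, its `transact` API, and the key scheme are hypothetical stand-ins for illustration, not FoundationDB's actual client API.

```javascript
// A minimal in-memory key-value store with an atomic write batch.
class KVStore {
  constructor() { this.data = new Map(); }

  // Buffer all writes made inside fn, then commit them together,
  // so readers never observe a half-written document.
  transact(fn) {
    const writes = new Map();
    fn({
      set: (k, v) => writes.set(k, v),
      get: (k) => (writes.has(k) ? writes.get(k) : this.data.get(k)),
    });
    for (const [k, v] of writes) this.data.set(k, v); // commit
  }

  get(k) { return this.data.get(k); }
}

// Document layer: each field maps to one key, e.g. "streams/42/title",
// so saving a document writes several keys in one transaction.
function saveDoc(store, collection, id, doc) {
  store.transact(tx => {
    for (const [field, value] of Object.entries(doc)) {
      tx.set(`${collection}/${id}/${field}`, JSON.stringify(value));
    }
  });
}

function loadField(store, collection, id, field) {
  return JSON.parse(store.get(`${collection}/${id}/${field}`));
}
```

A graph or column layer could sit beside the document layer on the same store using its own key scheme; the transaction is what keeps the cross-key translation strictly consistent.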
Full disclosure: I'm an engineer at FoundationDB, a database that provides ACID-compliant polyglot data models on a single key-value data store (http://www.foundationdb.com).