One of the biggest issues with distributed databases is finding the right model to store your data. On a recent project, I decided to use a registry model.

The registry idea

The idea behind writing a registry is to have an easy way to both store and view data.

For a given device that has a {UUID} id:
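As a minimal sketch of what such a registry could look like, assume it is a flat mapping from a path to a value. The property names under "/device/{UUID}/" (name, owner) and the helper functions are hypothetical, and a plain in-memory dictionary stands in for the database:

```python
import uuid

# In-memory stand-in for the registry: path -> value.
registry = {}

def put(path, value):
    """Store a value under a registry path."""
    registry[path] = value

def list_under(prefix):
    """Return every (path, value) pair below a prefix, sorted by path."""
    return sorted((p, v) for p, v in registry.items() if p.startswith(prefix))

device_id = str(uuid.uuid4())
# Hypothetical properties of the device, one path per property.
put(f"/device/{device_id}/name", "kitchen-sensor")
put(f"/device/{device_id}/owner", "some-owner-uuid")

for path, value in list_under(f"/device/{device_id}/"):
    print(path, "=", value)
```

Writing is a single put, and viewing everything known about a device is a prefix listing, which is the "easy to store, easy to view" property the registry is after.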

Classical column-families to index data

The problem comes with the data we need to index. We could store everything in a registry manner, for example with a path “/device/by-owner/{UUID}”:[“{UUID1}”,“{UUID2}”]. But it’s just easier to use Cassandra secondary indexes and have each property of each entity written to the indexed columns of the column family.
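To make the trade-off concrete, here is a rough sketch of both options using in-memory dictionaries. It does not talk to Cassandra: the function names are hypothetical, and the secondary index is only simulated by a scan over the rows.

```python
from collections import defaultdict

# Option 1: registry-style index, maintained by hand on every write.
registry_index = defaultdict(list)

def index_device_by_owner(owner_id, device_id):
    """Append the device to the list stored at /device/by-owner/{owner}."""
    path = f"/device/by-owner/{owner_id}"
    if device_id not in registry_index[path]:
        registry_index[path].append(device_id)

# Option 2: one row per device, one column per property.
# A real secondary index would be maintained by the database;
# here it is simulated by filtering rows on the column value.
devices = {}  # row key (device UUID) -> columns

def save_device(device_id, **columns):
    devices[device_id] = dict(columns)

def find_by(column, value):
    """Return the row keys whose column equals the given value."""
    return sorted(k for k, cols in devices.items() if cols.get(column) == value)

save_device("UUID1", owner="OWNER", name="a")
save_device("UUID2", owner="OWNER", name="b")
save_device("UUID3", owner="OTHER", name="c")
index_device_by_owner("OWNER", "UUID1")
index_device_by_owner("OWNER", "UUID2")

print(registry_index["/device/by-owner/OWNER"])
print(find_by("owner", "OWNER"))
```

With the first option, every write path has to remember to update the index entry. With the second, writing the property is enough, which is why the indexed column family is the easier choice.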

Sample use case: file storage

So you get the basic “Registry” model. Storing files on top of that is quite easy: I simply treated files as chunks of data. So if I want to store a picture for a user, I could store it like this:
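A minimal sketch of the chunking idea follows, assuming a hypothetical path layout ("/user/{UUID}/picture/chunk/{n}" plus a "chunk-count" entry) and a deliberately tiny chunk size. A dictionary again stands in for the registry:

```python
CHUNK_SIZE = 4  # deliberately tiny for the demo; a real value would be much larger

store = {}  # in-memory stand-in for the registry: path -> value

def put_file(base_path, data, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks and store each under its own path."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    store[f"{base_path}/chunk-count"] = len(chunks)
    for n, chunk in enumerate(chunks):
        store[f"{base_path}/chunk/{n}"] = chunk
    return len(chunks)

def get_file(base_path):
    """Read the chunks back in order and reassemble the file."""
    count = store[f"{base_path}/chunk-count"]
    return b"".join(store[f"{base_path}/chunk/{n}"] for n in range(count))

picture = b"not-really-a-jpeg"
path = "/user/USER_UUID/picture"  # hypothetical path layout
put_file(path, picture)
print(get_file(path) == picture)
```

Storing the chunk count next to the chunks lets a reader fetch them by key without scanning, and keeps each stored value small.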

Hector object mapper

I have to say that, until not long ago, I didn’t know this project existed.

I think HOM is a much better option in pretty much all cases. Still, having a simple tree view of your data can be a very interesting feature for analyzing what you are working on.
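As an illustration of that tree view, the sketch below turns a list of registry paths into an indented tree. The paths and function names are made up for the example, and it is independent of both Hector and HOM:

```python
def build_tree(paths):
    """Build a nested dict from slash-separated registry paths."""
    tree = {}
    for path in paths:
        node = tree
        for part in path.strip("/").split("/"):
            node = node.setdefault(part, {})
    return tree

def render(tree, indent=0):
    """Return the tree as a list of lines, indented two spaces per level."""
    lines = []
    for name in sorted(tree):
        lines.append("  " * indent + name)
        lines.extend(render(tree[name], indent + 1))
    return lines

paths = [
    "/device/UUID1/name",
    "/device/UUID1/owner",
    "/device/by-owner/OWNER",
]
print("\n".join(render(build_tree(paths))))
```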