from Hacker News

PostgreSQL columnar store benchmarks on SSDs

by il on 6/4/14, 5:11 PM with 36 comments

  • by danmaz74 on 6/4/14, 6:09 PM

    This looks so good that a question arises: where's the catch? In other words, in which situations is a columnar DB a bad solution?
  • by ddorian43 on 6/4/14, 6:49 PM

  • by Alex3917 on 6/4/14, 6:27 PM

    Any comparisons against Vertica or other DBs that were designed to be columnar from the ground up?
  • by techscruggs on 6/4/14, 6:08 PM

    If you are not familiar with Foreign Data Wrappers, they allow you to connect to other datastores and represent that data as tables in your database. http://wiki.postgresql.org/wiki/Foreign_data_wrappers
  • by fletchowns on 6/4/14, 6:55 PM

    Does this support JOINs? Or do you use a giant WHERE IN () clause?

    My use case is essentially a cross-database JOIN that I've been using MySQL & temp tables to accomplish. For example, give me the sum of column x if column y is any one of these 50,000 values from a separate system. So load the 50,000 values into a temp table and then do a JOIN. Performance isn't that great and it uses a ton of disk space so I wanted to try using a columnar store.
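
    The temp-table-plus-JOIN pattern described above can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table names (facts, wanted), the function sum_x_for_keys, and the demo data are all hypothetical, not taken from the original setup.

```python
import sqlite3


def sum_x_for_keys(conn, keys):
    """Sum column x over rows whose y matches one of `keys`.

    The keys are loaded into a temporary table and joined against the
    fact table, instead of being spliced into a giant WHERE ... IN (...).
    """
    cur = conn.cursor()
    # Hypothetical temp table holding the values from the separate system.
    cur.execute("CREATE TEMP TABLE wanted (y INTEGER PRIMARY KEY)")
    cur.executemany(
        "INSERT OR IGNORE INTO wanted (y) VALUES (?)",
        ((k,) for k in keys),
    )
    cur.execute(
        "SELECT COALESCE(SUM(f.x), 0) "
        "FROM facts AS f JOIN wanted AS w ON w.y = f.y"
    )
    total = cur.fetchone()[0]
    # Drop the temp table so the function can be called again.
    cur.execute("DROP TABLE wanted")
    return total


def build_demo():
    """Build a small in-memory fact table: x = i, y = i % 100."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE facts (x INTEGER, y INTEGER)")
    conn.executemany(
        "INSERT INTO facts VALUES (?, ?)",
        [(i, i % 100) for i in range(1000)],
    )
    return conn


conn = build_demo()
total = sum_x_for_keys(conn, range(0, 50))
print(total)
```

    Whether a columnar store speeds this up depends on whether it supports the JOIN at all, which is the question being asked here; the sketch only shows the shape of the query.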

  • by dharbin on 6/4/14, 6:13 PM

    I'm very excited about this! Add a mechanism to distribute data and queries across a cluster, and this could be the makings of an open-source Amazon Redshift.
  • by rustyconover on 6/5/14, 3:51 AM

    It would be interesting to compare these benchmarks against the performance of Amazon's Redshift.

    My first question would be whether the benchmark can be run on Redshift without changes. Redshift has some interesting differences beyond just being a columnar database that speaks the PostgreSQL protocol. But if it's possible, I'd be very interested to see the results.

  • by klreierson on 6/5/14, 12:11 AM

    Do the benchmarks for postgres utilize the in-memory columnar store (IMCS)? What is the difference between postgres IMCS and Citus cstore_fdw? http://www.postgresql.org/message-id/52C59858.9090500@garret...
  • by mixologic on 6/4/14, 7:17 PM

    Isn't the assumed tradeoff SSD storage for CPU usage? How much more CPU time is spent compressing and decompressing? And what's the unit cost of that extra CPU compared with the disk space savings on 'expensive' SSDs?
  • by dougmccune on 6/4/14, 9:34 PM

    I couldn't find documentation about what subset of SQL you can use. I saw mention of "all supported Postgres data types", but not anything about what features work. Any links?