from Hacker News

PartiQL: One query language for all your data

by portmanteaufu on 8/1/19, 7:29 PM with 84 comments

  • by lwansbrough on 8/1/19, 8:38 PM

        PartiQL> SELECT * FROM [1,2,3]
           | 
        ===' 
        <<
          {
            '_1': 1
          },
          {
            '_1': 2
          },
          {
            '_1': 3
          }
        >>
        --- 
        OK! (86 ms)
    
    Jeez. 86ms for this query on this data set? Hope that's not representative of the general performance!
  • by xpe on 8/2/19, 5:27 AM

    A common query language, while appealing, is unlikely to fully abstract over different types of databases with different features and performance trade-offs. It will be a leaky abstraction.

    Now, in practice, perhaps with sufficient adoption and integration, PartiQL might be good enough for 80% of use cases.

  • by jnordwick on 8/1/19, 8:35 PM

    'SQL’s ORDER BY orders the output data. Similarly, the PartiQL ORDER BY is responsible for turning its input bag into an array.'

    That is the most important thing for my uses. I deal mostly in time series data, and SQL windowing queries are too slow. Turning the set into an array to allow indexing and support easy time series queries is enough for me to use it.
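
    To illustrate the quoted sentence, here is a minimal sketch in Python (not PartiQL itself, and not how PartiQL is implemented). It assumes a small set of hypothetical readings with made-up field names `ts` and `value`. The unordered input stands in for a bag; sorting it stands in for ORDER BY and yields an array whose elements can be addressed by position.

```python
# Hypothetical time series readings. A bag is unordered, so the order of
# this list is treated as meaningless until it is sorted.
readings = [
    {"ts": 3, "value": 30.0},
    {"ts": 1, "value": 10.0},
    {"ts": 2, "value": 20.0},
]

# Analogue of ORDER BY ts: the result is an ordered array.
ordered = sorted(readings, key=lambda r: r["ts"])

# Once ordered, positional access and neighbour-to-neighbour deltas are simple.
first = ordered[0]
deltas = [b["value"] - a["value"] for a, b in zip(ordered, ordered[1:])]
```

    The deltas here take the place of a windowed lag comparison; whether this is faster than a SQL window function depends entirely on the engine.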

  • by manojlds on 8/2/19, 4:54 AM

    What does this offer over Hive SQL, which Spark also supports?

    Below are the reasons given in the blog post and I am trying to compare them with Hive SQL + Spark

    SQL compatibility - I need to check this as I am not a SQL expert, but Hive SQL seems compatible

    First-class nested data - supported

    Optional schema and query stability - supported

    Minimal extensions - feels like the same goal as Hive SQL

    Format independence - yes

    Data store independence - yes.

  • by rdsubhas on 8/2/19, 10:37 AM

    There is one word that every vendor hates: "vendor agnostic". Minor differences in SQL dialects are not a bug, they are features for most vendors.

    Most customers running on Amazon (or any cloud) want to move from having to maintain their own databases (which takes a lot of effort) to paying someone else to do it. Amazon knows this.

    This move looks like Amazon has everything to win and every other vendor has everything to lose. Even if they say the opposite (you can switch from Amazon to your own) - they know that extremely few customers have the will to operationalize their own databases. So they know that only the opposite will happen - customers will switch from self-hosted to Amazon services. They have also been openly predatory towards other open source databases (e.g. AWS Elasticsearch and Mongo). No wonder all Amazon services already support this.

    In that context, who is the target audience and what is the deployment model here? Are vendors going to integrate this directly into their databases? Or do users have to run their own proxy instances? Or is it compiled into the application as a library?

  • by kodablah on 8/1/19, 9:15 PM

    Is there a specification in anything besides PDF easily available to link to?
  • by AtlasBarfed on 8/2/19, 6:48 PM

    AWS is all-in on data lock-in.

    This may be powerful and useful, but it is proprietary, nontransparent, unstandardized, and nonportable.

    I get that every database has some platform lock-in, but it's getting ridiculous. At least Amazon's relational offerings need to adhere to binary driver protocols.

  • by zellyn on 8/1/19, 8:49 PM

    Anyone know how this compares to Presto and zetasql?
  • by ahl on 8/2/19, 4:31 PM

    @dlurton since you seem to be speaking for the PartiQL team on this (congrats on the launch!): The reference implementation is open source; what's the plan for the language spec? Is that something that AWS is going to own and control? The website references the PartiQL Steering Committee -- is that just AWS folks or is the intention to make it more broadly composed of members of the community you build?

    I'm interested in adopting PartiQL for our product, but would we get to participate in the evolution of the language or would we purely be downstream of the decisions made to benefit AWS products and services?

  • by pawelduda on 8/2/19, 1:09 AM

    Love the codebase. I've never written any Kotlin (and very little Java) and was able to (hopefully) complete a good first issue very quickly.
  • by manigandham on 8/1/19, 9:18 PM

    This is pretty nice, if only because you can use SQL dotted syntax seamlessly with JSON data.
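
    As a rough illustration of what a dotted path such as `e.address.city` does over nested JSON, here is a small Python sketch. The helper `get_path` and the sample records are hypothetical and only mimic the idea; PartiQL has its own distinct MISSING value for absent attributes, for which `None` is a stand-in here.

```python
import json
from functools import reduce

def get_path(doc, dotted):
    """Follow a dotted path through nested dicts; return None if a step is absent."""
    def step(cur, key):
        return cur.get(key) if isinstance(cur, dict) else None
    return reduce(step, dotted.split("."), doc)

# Hypothetical JSON input: the second record has no "address" at all.
employees = json.loads(
    '[{"name": "Ann", "address": {"city": "Oslo"}}, {"name": "Bob"}]'
)

# Rough analogue of: SELECT e.address.city FROM employees AS e
cities = [get_path(e, "address.city") for e in employees]
```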
  • by pushingice on 8/1/19, 10:12 PM

    Will this be integrated into AWS Athena? The blog post doesn't mention it.
  • by whoevercares on 8/1/19, 9:23 PM

    Awesome! I’d be very interested to see when DynamoDB supports this language and a MongoDB-like query builder. Then I might sell all my MDB shares...
  • by ohnoesjmr on 8/1/19, 8:53 PM

    I wonder how this deals with nested parquet data, and whether it's able to optimise on the things parquet provides.
  • by k__ on 8/2/19, 2:05 AM

    Is this a GraphQL alternative or more for accessing DBs in the backend?
  • by agentultra on 8/1/19, 8:56 PM

    Interesting that they opted for a relational model rather than a categorical one; the latter is proving to be more flexible [0].

    [0] https://www.categoricaldata.net/

  • by unnouinceput on 8/2/19, 9:48 AM

    Quote: "PartiQL requires the Java Runtime (JVM) to be installed on your machine."

    And that right there is where they lost me. Nooo thank you.

  • by benburleson on 8/1/19, 10:06 PM

  • by mehh on 8/1/19, 10:15 PM

    So I assume this is a rebranding of some other open source project with the Amazon brand stuck on it, or is it actually something distinct?
  • by breck on 8/1/19, 9:01 PM

    This is neat. Anyone want to add support for TreeBase/Tree Notation? http://treenotation.org/treeBase/. It's currently on the back burner to query TreeBases in SQL without first having to convert the TreeBase to SQL. Seems like it would be relatively straightforward to use this to do that.