from Hacker News

The design philosophy of Great Tables

by randyzwitch on 4/4/24, 6:00 PM with 92 comments

  • by paddy_m on 4/4/24, 6:55 PM

    Great tables has done some really nice work on python/jupyter tables. It looks like they are almost building a "grammar of tables" similar to a grammar of graphics. More projects should write about their philosophy and aims like this.

    I have built a different table library for Jupyter called buckaroo, and my approach has been different. Buckaroo aims to let you interactively cycle through different formats and post-processing functions to quickly glean important insights from a table. My view is that since I type the same commands over and over to perform rudimentary exploratory data analysis, those commands and insights should be built into the table.

    Great tables seems built so that you can manually format a table for presentation.

    https://github.com/paddymul/buckaroo

    https://youtu.be/GPl6_9n31NE

  • by jszymborski on 4/4/24, 8:27 PM

    The example they show of a Great Table is, to my taste, way too busy. Here is my unsolicited opinion:

    The top and bottom horizontal rules on the Title appear to be superfluous, and I dislike how it is aligned with the first column (row labels) rather than the second. I feel like a little space to breathe at the bottom, along with a bold font, would add visual hierarchy w/o the clutter.

    The row label backgrounds are far too dark and the font weight makes it hard to read. I'd prefer a very light blue here instead. I don't like the row group label ("Name") being italicized.

    The spanner labels floating in the centre make the table hard to scan. Would be much nicer aligned left.

    Finally, I really dislike the font (maybe this is just my browser, though).

    I mocked up some of the changes here; I think this is a much easier-to-read table:

    https://i.imgur.com/iMMf5vo.png

  • by wglb on 4/5/24, 3:07 AM

    This is a good article with some fascinating history.

    More recent history involves the production of CALS tables https://en.wikipedia.org/wiki/CALS_Table_Model. The company Datalogics https://en.wikipedia.org/wiki/Datalogics was heavily involved in the CALS table initiative. Datalogics staff was part of the ISO committee forming SGML, and trained many people on SGML, including DoD staff and their contractors involved with documentation.

    I was involved with the team that produced an editor for SGML-based documents. It had as one of its features the ability to specify the formatting of an element based on the SGML context of that element. This was before XSLT and its kin.

    Alumni of Datalogics helped Microsoft learn about XML ("No, you can't arbitrarily switch case on XML element tags").

    Also TeX practitioners have pretty well-formed opinions about how tables should be formatted.

    Odd side-note: I learned that the documentation for a fighter airplane of the time, if printed out, would weigh more than the aircraft and would fill a football-field sized collection of filing cabinets.

    And as much as many today don't like XML, coming from the SGML world it is a boon.

  • by jiggawatts on 4/4/24, 10:07 PM

    Something that always annoyed me about numeric data like dollar amounts in tables is that visually the comparison between quantities is logarithmic instead of linear.

    E.g.:

        Cost
        $1500
         $130
         $110
         $210
    
    The text in the last three rows looks 4/5ths the size of the text in the first row. However, even if summed, the last three costs add up to less than a third of the top row! People visually register the number of digits, which is roughly log10 of the value.

    I’ve so often had this issue that I started putting in-cell bar charts into every finance-related spreadsheet.

    Otherwise meetings will get derailed debating the cost of something trivial that is totally irrelevant compared to the biggest absolute costs.

    As a real example, I had many meetings spent debating a $15 monthly cost for server log collection in the cloud for a VM running a database engine that costs $15K monthly for the license alone.
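
    The arithmetic above can be checked with a short sketch. The 30-character bar width and the helper name `text_bar` are assumptions made for illustration; they do not come from any spreadsheet or table library.

```python
import math

costs = [1500, 130, 110, 210]  # the figures from the example above

def text_bar(value, largest, width=30):
    """Return a bar whose length is linearly proportional to value (hypothetical helper)."""
    return "#" * round(width * value / largest)

largest = max(costs)
for cost in costs:
    label = f"${cost}"
    # len(label) grows with the digit count, i.e. roughly with log10(cost)
    print(f"{label:>6}  chars={len(label)}  log10={math.log10(cost):.2f}  {text_bar(cost, largest)}")

# The three small costs together are still well under half of the largest one.
small_total = sum(costs) - largest
print(f"small costs: ${small_total} = {small_total / largest:.0%} of ${largest}")
```

    The character counts differ by one (4 vs 5), while the linear bars differ by roughly a factor of ten, which is the gap an in-cell bar chart makes visible.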

  • by closed on 4/4/24, 8:33 PM

    Hey one of the co-maintainers of Great Tables, along with Rich Iannone, here!

    I just wanted to say that Rich is the only software developer I know, who when asked to lay out the philosophy of his package, would give you 5,000 years of history on the display of tables. :)

  • by ttymck on 4/4/24, 6:41 PM

    Wow. This looks incredible, thanks for sharing.

    It makes me wonder how we've gone this long with increasingly poor data table presentations (the mid-century modern tables are astutely pointed to as shining examples).

    This makes me excited to get back into data analysis with python. Moreover, I see some possible API improvements and extensions I'd like to make.

  • by countrymile on 4/4/24, 8:09 PM

    I love this package and have been using it for a few years in R. It's great for making tables in HTML, but the PDF and docx output is a little less polished. I do worry that the recent shift to bringing the Python version up to speed with the R version has slowed down the R development. Still, it's well worth checking out whatever your language.
  • by AvAn12 on 4/5/24, 4:34 PM

    Wonderful! In the 90s a colleague and I wrote a book (EBRI Databook on Employee Benefits) which was mostly tables. In addition to SAS, our other primary tool was an ancient language called Table Producing Language ("TPL"). Despite dating back to the 1970s, TPL was incredibly flexible, expressive, and efficient - once you figured out the syntax.

    The designers of Great Tables might want to check out TPL. It covers everything Great Tables aims to do, and I think may have a few more tricks up its sleeves:

    https://www.ojp.gov/pdffiles1/Digitization/68013NCJRS.pdf

    Regardless, thanks for making Great Tables! This goes a long way towards making table producing in python much better.

  • by benterix on 4/5/24, 8:28 AM

    This guy deserves some prize for 1) great work, 2) attention to detail, 3) in-depth research, 4) excellent presentation of his work (sparing the usual questions like "but what is it, really?", "how do I start?", "can you provide some examples?").

    People from Show HN - watch and learn.

  • by antidnan on 4/4/24, 7:47 PM

    There's also a book on the subject: https://en.wikipedia.org/wiki/The_History_of_Mathematical_Ta...

    Interesting aside: AI models trained on spreadsheets need "good tables" such as column names, headers, etc. to understand context. Like Fortap: https://arxiv.org/abs/2109.07323

  • by jimhefferon on 4/4/24, 7:55 PM

    I'm interested in the midcentury modern ones because they have lots of vertical rules. I'm active on the subreddit for LaTeX and there is a religion common there that even one vertical rule is an unforgivable abomination.
  • by flobosg on 4/4/24, 8:41 PM

    Regarding “nanoplots”: they are essentially sparklines, aren’t they?
  • by xnx on 4/4/24, 7:05 PM

    Tables are underutilized for how concise and descriptive they can be when making comparisons. It's a shame most text editors start with a blank table instead of inserting one pre-configured with some good design choices.
  • by simonbarker87 on 4/4/24, 8:28 PM

    This looks great. I so wish that the HTML table element would get some progress - it’s so limited.

    I don’t want to have to use some JS library component just to show tabular data, especially given how badly they perform once the data gets big - but a server-side rendered HTML table can be enormous and render fine. But again, so limited.
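
    For what it's worth, here is a minimal sketch of the server-side approach using only Python's standard library. The `render_table` helper and the data are hypothetical; the point is only that a plain `<table>` is a single escaped string with no client-side component.

```python
from html import escape

def render_table(headers, rows):
    """Render a plain HTML table string; every cell is HTML-escaped."""
    head = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"

# 10,000 rows of made-up data, built as one string with no client-side JS
rows = [(i, f"item <{i}>", i * 1.5) for i in range(10_000)]
page = render_table(["id", "name", "value"], rows)
print(len(page), page[:80])
```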

  • by eviks on 4/5/24, 5:43 AM

    Would be great if the example tables were great for such a post. Like their remote correspondent's table: they've actually made it partially worse in the "great tables" style:

    - now the dates are not vertically aligned, and

    - the weights have repetition of units (lbs) (although losing the decimal is an improvement)

    - the name column is way too visually "heavy", that's a style you'd reserve for a header, a simple bold would suffice

    They've also retained other issues from the original like city not being a city, but a combo of city and region, or similarly no separation of the first name of a person, which is especially important for a diverse group of people

  • by mcswell on 4/4/24, 9:15 PM

    Not mentioned yet are DocBook tables, of which there are several types. The kind we used starts here: https://tdg.docbook.org/tdg/5.1/cals.table. You have to drill down to get inside the tables. They have some, but I think not all, of the structure of GT.

    There's also of course LaTeX (mentioned in a couple other comments here), which has "ordinary" tables and long tables (tables that span more than one page).

  • by sebastiansm on 4/4/24, 6:45 PM

    It's great that the RStudio team is working on Python libraries.

    Hope to see dplyr and ggplot someday on Python.

  • by crispyambulance on 4/5/24, 10:47 AM

    I really like great tables and its cousin in R, gt. You know they're taking a long view when there are photos of CLAY TABLETS and VISI-CALC in the article. Bravo!

    Tables, when you really care, are so very difficult to get right. Sometimes you really want to densely compact the data to communicate all details to others who are already deeply invested in the dataset, sometimes you need to remove all but the most essential information to get across a single idea with clarity, and then there is a continuum of variations between those extremes. The problem expands into more dimensions when the medium becomes a consideration: you simply can't (and should not) use the same approach when dealing with PDFs vs HTML vs a slide deck. On top of all that, you often have a personal style that you want to get across, or have a style that you need to comply with in some inflexible way.

    I like how gt just NAMES the parts of a table in their docs (and in the schematic in the article). This is a problem where agreeing on what things are called makes a ton of difference in usability.

  • by lqet on 4/5/24, 8:43 AM

    Here is a very short guide on how to make nice tables (which, IMHO, look a lot better than the visually cluttered "Great Table" example from the article) for scientific papers:

    https://people.inf.ethz.ch/markusp/teaching/guides/guide-tab...

  • by bigger_cheese on 4/5/24, 3:29 AM

    I use SAS reports pretty heavily in my job and I'm keen to find alternatives - this looks pretty promising.

    One thing in particular I'm interested in but could not see an example for is if this will let you insert "break lines" i.e. for displaying sub totals and similar.

    For example based on the demo, which shows names and addresses from census data it might be nice to be able to break at each change in postcode and display some summary data like a count of people found at that postcode or an average age (based on the DOB) living at that postcode or similar.

    Otherwise, conditional formatting is another pain point, either using rules (i.e. if the value in column B is greater than a specified threshold, make the entire row bold) or automatically creating a color gradient to highlight the cells à la Excel.

    For bonus points, management types like things like red and green traffic lights (or down/up arrows) that you can display next to KPI data in a table. It's a gimmick, but it wins you points.
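
    To make the "break line" idea concrete, here is a minimal sketch in plain Python. It is not the Great Tables API, and the rows, the `with_break_lines` name, and the use of an age column rather than DOB are assumptions made for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Made-up rows: (postcode, name, age)
people = [
    ("2000", "Ann", 34),
    ("2000", "Bob", 46),
    ("2010", "Cho", 29),
    ("2010", "Dev", 51),
    ("2010", "Eve", 40),
]

def with_break_lines(rows):
    """Yield display lines, adding a summary line at each change in postcode."""
    # groupby only merges adjacent rows, so sort by the break column first
    for postcode, group in groupby(sorted(rows, key=itemgetter(0)), key=itemgetter(0)):
        members = list(group)
        for _, name, age in members:
            yield f"{postcode}  {name:<5} {age:>3}"
        mean_age = sum(age for _, _, age in members) / len(members)
        yield f"{postcode}  count={len(members)} mean age={mean_age:.1f}"

lines = list(with_break_lines(people))
print("\n".join(lines))
```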

  • by tqwhite on 4/5/24, 12:58 PM

    I want to praise the Great Tables people for having created an excellent marketing piece. It is a lesson in how to engage people in a product announcement. I started reading the article and by the time they started explaining the product, I was sincerely interested in knowing about it.
  • by acrophiliac on 4/4/24, 6:46 PM

    While I'm waiting for the packages to download, can you explain how I get tabular output when I run a Python application using your package at the command line? Does it produce HTML output? PDF? Your "getting started" docs don't explain.
  • by throwaway81523 on 4/4/24, 6:52 PM

    This article is mostly blither, whether or not it is AI generated. It is about a Python library for generating nicely formatted HTML tables, though they don't tell you much about it until near the end. The library seems to use an OOP approach. An alternative approach might be more declarative. The product name "Great Tables" appears in boldface over and over (no idea if the font helps SEO) and the name itself is awfully pretentious imho. Overall, the library itself sounds ok, but the blog post is the annoying market-speak that frequently makes me cringe here on HN.

    It would be nice to add some interactivity features to the tables, like ActiveAdmin in Rails.

  • by ewuhic on 4/5/24, 8:07 AM

    There was some design blog post several years ago, maybe not even surfaced on HN:

    It was about creating beautiful tables, both UI- and UX-wise, with some features being e.g. dropping separators between columns (rows?), adding some visual accents, etc.

    And yes, the most distinguished feature was that the tables weren't looking like your busy PowerPoint non-tech organisation stuff, they were very modern yet simple.

    I don't remember the specifics, but I was really impressed and regret not bookmarking the article.

    Does anyone know of the article in question, and maybe could share the link?

  • by hipjiveguy on 4/7/24, 9:29 PM

    The best thing anyone can do with tables is make it so people use tables, not "divs". Tables can be parsed and exported to other data formats easily, and are easy to recognize without visualization.

    Many websites build tables out of divs, and they may look like tables, but they are hard to manipulate/export data from.

    If a table is on a website, I feel it should be easy to export it as csv, at the least, to "free the data" :D
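
    As a rough illustration of that point, here is a minimal sketch that pulls a real `<table>` out of markup and writes it as CSV using only the Python standard library. The class name `TableToRows` and the sample markup are made up for this example, and the parser ignores nested tables, `colspan`, and `rowspan`.

```python
import csv
import io
from html.parser import HTMLParser

class TableToRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

html_doc = """
<table>
  <tr><th>Name</th><th>City</th></tr>
  <tr><td>Ada</td><td>London, UK</td></tr>
</table>
"""

parser = TableToRows()
parser.feed(html_doc)

# Write the collected rows as CSV; the csv module quotes the embedded comma
buffer = io.StringIO()
csv.writer(buffer).writerows(parser.rows)
csv_text = buffer.getvalue()
print(csv_text)
```

    Doing the same against a grid built from divs would need site-specific selectors, since nothing in the markup says which element is a row or a cell.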

  • by sandbach on 4/6/24, 12:49 AM

    The documentation of the LaTeX package booktabs[0] is a great resource for anyone interested in beautiful table design.

    [0] http://mirrors.ctan.org/macros/latex/contrib/booktabs/bookta...

  • by seanwilson on 4/4/24, 9:30 PM

    How does this compare to https://github.com/jieter/django-tables2? That one makes it really easy to display database models as HTML tables with column sorting and pagination, and search/filtering can be added on top with django-filter.
  • by RyanHamilton on 4/5/24, 12:23 PM

    If anyone is looking for a great open source table, I recommend SlickGrid: https://github.com/6pac/SlickGrid. It has all the features shown in the article and has proven itself capable of everything I've needed for the last 2+ years.
  • by Narann on 4/5/24, 7:40 AM

    Nothing in the product page explains how you generate the HTML code and/or the image.
  • by thih9 on 4/5/24, 3:43 PM

    I am missing a section on UX best practices, and in particular an intuitive, touch-friendly solution for multi-column sort.

    All approaches I’ve seen here have some issues; worst of all being “press shift to sort by multiple columns” (not touch friendly).

  • by two_handfuls on 4/4/24, 7:55 PM

    Summary:

    This article is about a Python library called “Great Tables” that is focused on the display of tables for publication and presentation (not for interactive browsing).

    The article does not specify which output format it supports.

    Also you get some bonus historical context on tables.

  • by pphysch on 4/4/24, 11:35 PM

    The generated HTML for the tables looks pretty good. How easy is it to attach extra classes to the elements? Is cell content HTML-escaped by default?
  • by shymaple on 4/5/24, 6:36 AM

    Awesome, recently we started working on table functionality for one of our features and your post is really helpful. Thanks!
  • by hughess on 4/5/24, 3:26 PM

    Great article and big fan of this approach.

    I'm one of the maintainers of Evidence (open source tool based on markdown + SQL) and working on a similar approach to creating presentation tables configurable in code.

    Some examples here for any SQL + table enthusiasts: https://docs.evidence.dev/components/data-table

  • by golergka on 4/5/24, 4:50 AM

    Before clicking the link I was wondering if it's about relational schemas or woodworking.
  • by WuxiFingerHold on 4/5/24, 2:56 AM

    I overall like the approach for complex scenarios but their example is not the best one. The original version is much more readable and their final version adds mostly noise.

    Also, in documents all images and tables should have descriptive captions. So their header with title and subtitle would be redundant.

  • by semireg on 4/4/24, 7:46 PM

    Does anyone know any similar projects that can render to an HTML canvas?
  • by throw_m239339 on 4/5/24, 2:14 AM

    This was surprisingly informative and entertaining.
  • by narush on 4/4/24, 7:23 PM

    This is an excellent blog post - I'd never heard of Great Tables before, and I'm a newly minted fan!

    > confronted with an all-too-familiar dilemma: copy your data into a tool like Excel to make the table, or, display an otherwise unpolished table.

    One add-on (coming from the past 4 years of working on a tabular-data-in-Python startup [1]) is that users aren't just copying data into Excel because of its good formatting capability: very often, there are organizational constraints that mean that Excel _needs_ to be where this data ends up.

    The most common reasons I've seen for data ending up in Excel: 1. Other parts of the report rely on Excel features - you want to build pivot tables or graphs in Excel (often, these are much easier to build in Excel than in Python for anyone who isn't a real Pythonista) 2. The report you're sending out for display is _expected_ in an Excel format. The two main reasons for this are just organizational momentum, or that you want to let the receiver conduct additional ad-hoc analysis (Excel is best for this in almost every org).

    The way we've sliced this problem space is by improving the interfaces that users can use to export formatting to Excel. You can see some of our (open-core) code here [2]. TL;DR: Mito gives you an interface in Jupyter that looks like a spreadsheet, where you can apply formatting like in Excel (number formatting, conditional formatting, color formatting) - and then Mito automatically generates code that exports this formatting to an Excel file. This is one of our more compelling enterprise features for decision makers who work with non-expert Python programmers, since getting formatting into Excel is a big hassle.

    Of course, for folks who can ditch Excel entirely, this is entirely unnecessary. Great Tables seems excellent in this case (and anyone writing blog posts this good is probably writing good code too... :) )

    [1] https://trymito.io

    [2] https://github.com/mito-ds/mito/blob/dev/mitosheet/mitosheet...

  • by boringg on 4/4/24, 8:39 PM

    I was really looking forward to a discussion about beautiful wood tables. I should have known better
  • by snappr021 on 4/5/24, 5:57 AM

    Contenteditable?
  • by jamesdutc on 4/5/24, 1:03 AM

    The historical background about tabular displays of quantitative information is very interesting. I imagine it must have been fun to think deeply about this problem.

    Unfortunately, the API design in the example is just not very good:

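        from great_tables import GT, md

        # simple_table: the input table defined in the article
        (
          GT(simple_table, rowname_col='Name')
          .tab_header(title='Names, Addresses, and Characteristics of Remote Correspondents')
          .tab_stubhead(label=md('*Name*'))
          # ... (remaining calls elided)
        )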
    
    I'm uncertain if it's trying to mimic something in another language like R (or some grammar-of-graphics thing, or D3.js). Hopefully, it's not trying to mimic the look of long, chained `pandas.DataFrame` operations (because it misses the point of why those look the way they do).

    Of course, for ad hoc, in-a-notebook, cut-and-paste/written-from-scratch use, the API design doesn't really matter that much. Usually, users will readily memorise the required incantations, then fiddle with the result until they get what they want or they give up.

    It's probably the case that for most tools that produce visual outputs, a majority of users are creating things in this style. (There are, e.g., millions of casual Matplotlib users out there.) But programmatic use is not too far off. Tools that produce visual outputs (even those as formally rigid as display tables) are often subject to consistency requirements, which directly implies programmatic use.

    So, when I discover that my colleagues and I have six tables across three notebooks that need a consistent look, and I decide to interact with this tool programmatically, am I expected to write…?

        def standard_table(source, /, rowname_col, header_title, stubhead_label, weight_columns):
          # Build every table through one function so the styling stays consistent.
          return (
            GT(source, rowname_col=rowname_col)
            .tab_header(title=header_title)
            .tab_stubhead(label=md(f"*{stubhead_label}*"))
            .fmt_integer(columns=weight_columns, pattern="{x} lbs")
            # ... (remaining calls elided)
          )

        standard_table(
          simple_table,
          rowname_col='Name',
          header_title='Names, Addresses, and Characteristics of Remote Correspondents',
          stubhead_label='Name',
          weight_columns='Weight',
        )
    
    Or maybe…?

        def format_table(tbl, /, weight_columns):
          # Restyle an already-built table; tbl has to be passed in explicitly.
          return (
            tbl
            .tab_stubhead(label=md(f"*{tbl.stubhead.label}*")) # what if not present?
            .fmt_integer(columns=weight_columns, pattern="{x} lbs")
            # ... (remaining calls elided)
          )

        format_table(
          (
            GT(simple_table, rowname_col='Name')
            .tab_header(title='Names, Addresses, and Characteristics of Remote Correspondents')
            .tab_stubhead(label='Name')
            # ... (remaining calls elided)
          ),
          weight_columns='Weight',
        )
    
    Or maybe…?

        class StandardTable(GT):
          def tab_stubhead(self, *a, **kw):
            # inspect.signature(GT.tab_stubhead).bind(self, *a, **kw) # ...
            return super().tab_stubhead(*a, **kw)

        StandardTable(...)
    
    These aren't great options. The API design is just not very good.
  • by tonymet on 4/4/24, 8:47 PM

    Imagine the web if every site was exclusively tabular. No UIs just a table of figures and a CRUD for modifying it. Something like hypercard meets excel
  • by magnio on 4/5/24, 5:50 AM

    TLDR: you can create booktabs-style tables in Python.

    Strange that they did not know (or credit) booktabs, the LaTeX package that has popularized this table design since 2003.

  • by tomcam on 4/4/24, 7:23 PM

    Fantastic article, duly bookmarked. However.

    “The democratization of computational tables arguably began with VisiCalc in 1979… I mean, try it out and you’ll see that this is quite limited in more than a few ways.”

    Them’s fightin’ words. IMHO VisiCalc’s ability to generate models quickly changed civilization. It freed people to try out ideas at no cost and to view or manipulate data in ways no one could hope to do before.