by randyzwitch on 4/4/24, 6:00 PM with 92 comments
by paddy_m on 4/4/24, 6:55 PM
I have built a different table library for Jupyter called buckaroo, and my approach has been different. Buckaroo aims to let you interactively cycle through different formats and post-processing functions to quickly glean important insights from a table. I took the view that, since I type the same commands over and over to perform rudimentary exploratory data analysis, those commands and insights should be built into the table.
Great Tables seems built so that you can manually format a table for presentation.
by jszymborski on 4/4/24, 8:27 PM
The top and bottom horizontal rules on the Title appear to be superfluous, and I dislike how it is aligned with the first column (row labels) rather than the second. I feel like a little space to breathe at the bottom, along with a bold font, would add visual hierarchy w/o the clutter.
The row label backgrounds are far too dark and the font weight makes it hard to read. I'd prefer a very light blue here instead. I don't like the row group label ("Name") being italicized.
The spanner labels floating in the centre make the table hard to scan. Would be much nicer aligned left.
Finally, I really dislike the font (maybe this is just my browser, though).
I mocked-up some of the changes here, I think this is a much easier to read table:
by wglb on 4/5/24, 3:07 AM
More recent history involves the production of CALS tables https://en.wikipedia.org/wiki/CALS_Table_Model. The company Datalogics https://en.wikipedia.org/wiki/Datalogics was heavily involved in the CALS table initiative. Datalogics staff was part of the ISO committee forming SGML, and trained many people on SGML, including DoD staff and their contractors involved with documentation.
I was involved with the team that produced an editor for SGML-based documents. It had as one of its features the ability to specify the formatting of an element based on the SGML context of that element. This was before XSLT and its kin.
Alumni of Datalogics helped Microsoft learn about XML ("No, you can't arbitrarily switch case on XML element tags").
Also TeX practitioners have pretty well-formed opinions about how tables should be formatted.
Odd side-note: I learned that the documentation for a fighter airplane of the time, if printed out, would weigh more than the aircraft and would fill a football-field sized collection of filing cabinets.
And as much as many today don't like XML, coming from the SGML world it is a boon.
by jiggawatts on 4/4/24, 10:07 PM
E.g.:
Cost
$1500
$130
$110
$210
The text in the last three rows looks about 4/5ths the size of the text in the first row. However, even if summed, the last three costs add up to only about 1/3rd of the top row! People visually see the number of digits, which is roughly log10 of the value.
I’ve so often had this issue that I started putting in-cell bar charts into every finance-related spreadsheet.
Otherwise meetings will get derailed debating the cost of something trivial that is totally irrelevant compared to the biggest absolute costs.
As a real example, I had many meetings spent debating a $15 monthly cost for server log collection in the cloud for a VM running a database engine that costs $15K monthly for the license alone.
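The in-cell bar chart idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `bar_cells`, the bar width of 20 characters, and the `#` fill character are all made up here and do not come from any spreadsheet tool. The point is that bar length scales linearly with the amount, so the eye compares magnitudes instead of digit counts.

```python
def bar_cells(costs, width=20, fill="#"):
    """Return (label, bar) pairs with bars scaled linearly to the largest cost."""
    largest = max(costs)
    rows = []
    for cost in costs:
        # Linear scaling: bar length tracks the amount, not the number of digits.
        length = round(width * cost / largest) if largest else 0
        rows.append((f"${cost:,}", fill * length))
    return rows


# The costs from the example above.
costs = [1500, 130, 110, 210]
for label, bar in bar_cells(costs):
    print(f"{label:>7} {bar}")
```

With the four costs above, the first row gets a full-width bar and the other three get bars of one to three characters, which makes the gap hard to miss.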
by closed on 4/4/24, 8:33 PM
I just wanted to say that Rich is the only software developer I know, who when asked to lay out the philosophy of his package, would give you 5,000 years of history on the display of tables. :)
by ttymck on 4/4/24, 6:41 PM
It makes me wonder how we've gone this long with increasingly poor data table presentations (the mid-century modern tables are astutely pointed to as shining examples).
This makes me excited to get back into data analysis with python. Moreover, I see some possible API improvements and extensions I'd like to make.
by countrymile on 4/4/24, 8:09 PM
by AvAn12 on 4/5/24, 4:34 PM
The designers of Great Tables might want to check out TPL. It covers everything Great Tables aims to do, and I think may have a few more tricks up its sleeves:
https://www.ojp.gov/pdffiles1/Digitization/68013NCJRS.pdf
Regardless, thanks for making Great Tables! This goes a long way towards making table producing in python much better.
by benterix on 4/5/24, 8:28 AM
People from Show HN - watch and learn.
by antidnan on 4/4/24, 7:47 PM
Interesting aside: AI models trained on spreadsheets need "good tables" such as column names, headers, etc. to understand context. Like Fortap: https://arxiv.org/abs/2109.07323
by jimhefferon on 4/4/24, 7:55 PM
by flobosg on 4/4/24, 8:41 PM
by xnx on 4/4/24, 7:05 PM
by simonbarker87 on 4/4/24, 8:28 PM
I don’t want to have to use some JS library component just to show tabular data, especially given how badly they perform once the data gets big, whereas a server-side rendered HTML table can be enormous and render fine. But again, so limited.
by eviks on 4/5/24, 5:43 AM
- now the dates are not vertically aligned, and
- the weights have repetition of units (lbs) (although losing the decimal is an improvement)
- the name column is way too visually "heavy", that's a style you'd reserve for a header, a simple bold would suffice
They've also retained other issues from the original, like city not being a city but a combo of city and region, or similarly no separation of the first name of a person, which is especially important for a diverse group of people.
by mcswell on 4/4/24, 9:15 PM
There's also of course LaTeX (mentioned in a couple other comments here), which has "ordinary" tables and long tables (tables that span more than one page).
by sebastiansm on 4/4/24, 6:45 PM
Hope to see dplyr and ggplot someday on Python.
by crispyambulance on 4/5/24, 10:47 AM
Tables, when you really care, are so very difficult to get right. Sometimes you really want to densely compact the data to communicate all details to others who are already deeply invested in the dataset, sometimes you need to remove all but the most essential information to get across a single idea with clarity, and then there is a continuum of variations between those extremes. The problem expands into more dimensions when the medium becomes a consideration: you simply can't (and should not) use the same approach when dealing with PDFs vs HTML vs a slide deck. On top of all that, you often have a personal style that you want to get across, or a style that you need to comply with in some inflexible way.
I like how gt just NAMES the parts of table in their docs (and in the schematic in the article). This is a problem where agreeing on what things are called makes a ton of difference in usability.
by lqet on 4/5/24, 8:43 AM
https://people.inf.ethz.ch/markusp/teaching/guides/guide-tab...
by bigger_cheese on 4/5/24, 3:29 AM
One thing in particular I'm interested in but could not see an example for is if this will let you insert "break lines" i.e. for displaying sub totals and similar.
For example based on the demo, which shows names and addresses from census data it might be nice to be able to break at each change in postcode and display some summary data like a count of people found at that postcode or an average age (based on the DOB) living at that postcode or similar.
Otherwise, conditional formatting is another pain point, either using rules (e.g. if the value in column B is greater than a specified threshold, make the entire row bold) or automatically creating a color gradient to highlight the cells à la Excel.
For bonus points, management types like things such as red and green traffic lights (or down/up arrows) that you can display next to KPI data in a table. It's a gimmick, but it wins you points.
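To make the "break line" request concrete, here is a minimal sketch of the data side of it, independent of any table library (it says nothing about whether Great Tables supports this). The function name `with_subtotals`, the `summary` marker key, and the sample rows are all hypothetical: rows are grouped by postcode and a summary row with a count and a mean age is emitted after each group, which a renderer could then style as a break line.

```python
from itertools import groupby
from operator import itemgetter


def with_subtotals(rows, key, value):
    """Yield rows sorted by `key`, with a summary row after each group."""
    ordered = sorted(rows, key=itemgetter(key))
    for group_key, group in groupby(ordered, key=itemgetter(key)):
        members = list(group)
        yield from members
        values = [m[value] for m in members]
        # The "summary" flag lets a renderer style this row differently.
        yield {
            key: group_key,
            "summary": True,
            "count": len(members),
            "mean_" + value: sum(values) / len(values),
        }


# Made-up sample data, not taken from the article's demo table.
people = [
    {"name": "A", "postcode": "2000", "age": 30},
    {"name": "B", "postcode": "3000", "age": 50},
    {"name": "C", "postcode": "2000", "age": 40},
]
for row in with_subtotals(people, key="postcode", value="age"):
    print(row)
```

The same `summary` flag could drive a rule-based format such as bolding the row, which is the simplest form of the conditional formatting mentioned above.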
by tqwhite on 4/5/24, 12:58 PM
by acrophiliac on 4/4/24, 6:46 PM
by throwaway81523 on 4/4/24, 6:52 PM
It would be nice to add some interactivity features to the tables, like ActiveAdmin in Rails.
by ewuhic on 4/5/24, 8:07 AM
Creating beautiful tables both UI and UX wise, with some features being e.g dropping separator between columns (row?), doing some visual accents, etc.
And yes, the most distinguished feature was that the tables weren't looking like your busy PowerPoint non-tech organisation stuff, they were very modern yet simple.
I don't remember the specifics, but I was really impressed and regret not bookmarking the article.
Does anyone know of the article in question, and maybe could share the link?
by hipjiveguy on 4/7/24, 9:29 PM
Many websites build tables out of divs, and they may look like tables, but they are hard to manipulate or export data from.
If a table is on a website, I feel it should be easy to export it as csv, at the least, to "free the data" :D
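When a page does use a real `<table>` element, the export is short. The following is a rough sketch using only the Python standard library; the class name `TableExtractor` and the function `table_to_csv` are invented for this example, and it assumes a single, non-nested table with no `rowspan` or `colspan`. A div-based layout gives a parser like this nothing to hold on to, which is the complaint above.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return the cells of the HTML table as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


html = (
    "<table><tr><th>Name</th><th>Cost</th></tr>"
    "<tr><td>Server, logs</td><td>$15</td></tr></table>"
)
print(table_to_csv(html))
```

The `csv` module quotes the cell that contains a comma, so the output stays loadable in a spreadsheet.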
by sandbach on 4/6/24, 12:49 AM
[0] http://mirrors.ctan.org/macros/latex/contrib/booktabs/bookta...
by seanwilson on 4/4/24, 9:30 PM
by RyanHamilton on 4/5/24, 12:23 PM
by Narann on 4/5/24, 7:40 AM
by thih9 on 4/5/24, 3:43 PM
All approaches I’ve seen here have some issues; worst of all being “press shift to sort by multiple columns” (not touch friendly).
by two_handfuls on 4/4/24, 7:55 PM
This article is about a Python library called “Great Tables” that is focused on the display of tables for publication and presentation (not for interactive browsing).
The article does not specify which output format it supports.
Also you get some bonus historical context on tables.
by pphysch on 4/4/24, 11:35 PM
by shymaple on 4/5/24, 6:36 AM
by hughess on 4/5/24, 3:26 PM
I'm one of the maintainers of Evidence (open source tool based on markdown + SQL) and working on a similar approach to creating presentation tables configurable in code.
Some examples here for any SQL + table enthusiasts: https://docs.evidence.dev/components/data-table
by golergka on 4/5/24, 4:50 AM
by WuxiFingerHold on 4/5/24, 2:56 AM
Also, in documents all images and tables should have descriptive captions. So their header with title and subtitle would be redundant.
by semireg on 4/4/24, 7:46 PM
by throw_m239339 on 4/5/24, 2:14 AM
by narush on 4/4/24, 7:23 PM
> confronted with an all-too-familiar dilemma: copy your data into a tool like Excel to make the table, or, display an otherwise unpolished table.
One add-on (coming from the past 4 years of working on a tabular-data-in-Python startup [1]) is that users aren't just copying data into Excel because of its good formatting capability: very often, there are organizational constraints that mean that Excel _needs_ to be where this data ends up.
The most common reasons I've seen for data ending up in Excel:
1. Other parts of the report rely on Excel features - you want to build pivot tables or graphs in Excel (often, these are much easier to build in Excel than in Python for anyone who isn't a real Pythonista).
2. The report you're sending out for display is _expected_ in an Excel format. The two main reasons for this are just organizational momentum, or that you want to let the receiver conduct additional ad-hoc analysis (Excel is best for this in almost every org).
The way we've sliced this problem space is by improving the interfaces that users can use to export formatting to Excel. You can see some of our (open-core) code here [2]. TL;DR: Mito gives you an interface in Jupyter that looks like a spreadsheet, where you can apply formatting like in Excel (number formatting, conditional formatting, color formatting) - and then Mito automatically generates code that exports this formatting to an Excel file. This is one of our more compelling enterprise features for decision makers that work with non-expert Python programmers - getting formatting into Excel is a big hassle.
Of course, for folks who can ditch Excel entirely, this is entirely unnecessary. Great Tables seems excellent in this case (and anyone writing blog posts this good is probably writing good code too... :) )
[2] https://github.com/mito-ds/mito/blob/dev/mitosheet/mitosheet...
by boringg on 4/4/24, 8:39 PM
by snappr021 on 4/5/24, 5:57 AM
by jamesdutc on 4/5/24, 1:03 AM
Unfortunately, the API design in the example is just not very good:
    from great_tables import GT, md
    (
        GT(simple_table, rowname_col='Name')
        .tab_header(title='Names, Addresses, and Characteristics of Remote Correspondents')
        .tab_stubhead(label=md('*Name*'))
        ...
    )
I'm uncertain if it's trying to mimic something in another language like R (or some grammar of graphics thing or D3.js). Hopefully, it's not trying to mimic the look of long, chained `pandas.DataFrame` operations (because it misses the point of why those look the way they do).
Of course, for ad hoc, in-a-notebook, cut-and-paste/written-from-scratch use, the API design doesn't really matter that much. Usually, users will readily memorise the required incantations, then fiddle with the result until they get what they want or they give up.
It's probably the case that for most tools that produce visual outputs, a majority of users are creating things in this style. (There are, e.g., millions of casual Matplotlib users out there.) But programmatic use is not too far off. Tools that produce visual outputs (even those as formally rigid as display tables) are often subject to consistency requirements, which directly implies programmatic use.
So, when I discover that my colleagues and I have six tables across three notebooks that need a consistent look, and I decide to interact with this tool programmatically, am I expected to write…?
    def standard_table(source, /, rowname_col, header_title, stubhead_label, weight_columns):
        return (
            GT(source, rowname_col=rowname_col)
            .tab_header(title=header_title)
            .tab_stubhead(label=md(f"*{stubhead_label}*"))
            .fmt_integer(columns=weight_columns, pattern="{x} lbs")
            ...
        )
    standard_table(simple_table, rowname_col='Name', header_title='Names, Addresses, and Characteristics of Remote Correspondents', stubhead_label='Name', weight_columns='Weight')
Or maybe…?
    def format_table(tbl, /, weight_columns):
        return (
            tbl
            .tab_stubhead(label=md(f"*{tbl.stubhead.label}*"))  # what if not present?
            .fmt_integer(columns=weight_columns, pattern="{x} lbs")
            ...
        )
    format_table(
        GT(simple_table, rowname_col='Name')
        .tab_header(title='Names, Addresses, and Characteristics of Remote Correspondents')
        .tab_stubhead(label='Name')
        ...,
        weight_columns='Weight',
    )
Or maybe…?
    class StandardTable(GT):
        def tab_stubhead(self, *a, **kw):
            # inspect.signature.bind(...) # ...
            return super().tab_stubhead(*a, **kw)
    StandardTable(...)
These aren't great options. The API design is just not very good.
by tonymet on 4/4/24, 8:47 PM
by magnio on 4/5/24, 5:50 AM
Strange that they did not know (or credit) booktabs, the LaTeX package that has popularized this table design since 2003.
by tomcam on 4/4/24, 7:23 PM
“The democratization of computational tables arguably began with VisiCalc in 1979… I mean, try it out and you’ll see that this is quite limited in more than a few ways.”
Them’s fightin’ words. IMHO VisiCalc’s ability to generate models quickly changed civilization. It freed people to try out ideas at no cost and to view or manipulate data in ways no one could hope to do before.