by olestr on 12/30/23, 1:28 AM with 26 comments
by lawrjone on 12/30/23, 8:17 AM
A few observations worth adding:
- We're still using Fivetran for the EL stages. Costs are much more significant than they were before, and for the high-volume sources we're looking into options like DataStream as cost savers, but it's not unmanageable.
- dbt is still working great, even if we've invested a lot in it, having now built a 5-person data team (BI, DA, DE) around it.
- Still use Metabase but have some frustrations and are considering other options.
- We no longer use Stitch :tada:
There's a post that followed this on improvements we made to our setup that may be interesting: https://incident.io/blog/updated-data-stack
The OP is still full of relevant, useful information, though (imo, of course).
by davedx on 12/30/23, 9:54 AM
I've not worked at any startups that did data warehousing. The one place I worked where we were /starting/ to get it set up had 300+ employees and $100M+/year revenue.
by 1letterunixname on 12/30/23, 11:59 AM
ETL should be minimized (except for external data, which is itself a bad sign, since it means the data is owned or managed by a third party) and replaced with the equivalent of dynamic or materialized "views". Prefer creating hygienic "views" over the original data to mutating and destroying it with destructive transformations.
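To make the "views over original data" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name raw_orders, the view name clean_orders, and the sample rows are all made up for illustration; a real warehouse would use its own engine and might prefer a materialized view.

```python
import sqlite3

# In-memory database standing in for a warehouse (illustrative only).
conn = sqlite3.connect(":memory:")

# Hypothetical raw table: loaded as-is and never updated in place.
conn.execute(
    "CREATE TABLE raw_orders (id INTEGER, email TEXT, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [
        (1, " Alice@Example.com ", 1250),
        (2, "bob@example.com", None),  # incomplete row stays in the raw data
        (3, "CAROL@example.com", 400),
    ],
)

# The cleanup lives in a view, so the raw rows are left untouched.
conn.execute(
    """
    CREATE VIEW clean_orders AS
    SELECT id,
           lower(trim(email)) AS email,
           amount_cents / 100.0 AS amount
    FROM raw_orders
    WHERE amount_cents IS NOT NULL
    """
)

clean_rows = conn.execute(
    "SELECT id, email, amount FROM clean_orders ORDER BY id"
).fetchall()
raw_count = conn.execute("SELECT COUNT(*) FROM raw_orders").fetchone()[0]

print(clean_rows)
print(raw_count)
```

If the cleanup rule turns out to be wrong, only the view definition changes; the original rows are still there to be re-read.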
Finally, have a deeply integrated, robust, enterprise-wide, fine-grained ACL system and privacy policy that keeps everyone (including system users) from accessing anything without a specific business need and an approval audit record stored via some sort of blockchain-like tech.
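One simple reading of "blockchain-like" for the approval audit record is a hash chain, where each entry commits to the one before it, so editing an old entry is detectable. The sketch below assumes that reading. It is a tamper-evident log, not a distributed ledger, and the function names and record fields are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, record):
    # Hash the previous entry's hash together with a canonical JSON encoding.
    payload = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_record(log, record):
    # Link the new entry to the last one already in the log.
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append(
        {
            "record": record,
            "prev_hash": prev_hash,
            "hash": entry_hash(prev_hash, record),
        }
    )


def verify(log):
    # Recompute every hash in order; any edited record breaks the chain.
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_record(
    audit_log,
    {"user": "analyst_1", "table": "orders", "purpose": "monthly revenue report", "approved_by": "manager_1"},
)
append_record(
    audit_log,
    {"user": "analyst_2", "table": "customers", "purpose": "churn analysis", "approved_by": "manager_1"},
)

print(verify(audit_log))
```

Changing any stored record afterwards makes verify return False, which gives the "approval happened and was not rewritten" property without any actual blockchain.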
by evtothedev on 12/30/23, 1:48 PM
by alberth on 12/30/23, 7:53 AM
Plan A: $16 (month/user)
Plan B: $10,000+ Call Us
Plan C: Call Us
Those are some of the steepest price cliffs I’ve ever come across.
by rollulus on 12/30/23, 7:36 AM