Ask HN: Thoughts on a Configuration-Driven Data Pipeline Platform?

by keyurishah on 1/15/25, 8:40 PM with 1 comment

Hi HN,

I'm working on a startup idea: a configuration-driven data pipeline platform that simplifies and streamlines how pipelines are built and managed.

The Problem

Managing modern data pipelines is often complex, requiring developers to integrate multiple tools manually, deal with vendor lock-in, and handle a steep learning curve for diverse technologies.

The Solution

Our platform acts as a wrapper around open-source tools, offering a plug-and-play approach to:

- Data ingestion, transformation, and quality control
- CI/CD and test-driven pipeline creation
- Easy switching between technologies (e.g., Snowflake to Delta Lake)
- On-premise deployment for organizations avoiding the cloud

Key Features

- Low-code configuration: Build pipelines through simple YAML/JSON configurations, no custom code required (see the rough sketch near the end of this post).
- Modular and flexible: Mix and match open-source tools seamlessly.
- Scalable architecture: Designed for a variety of workloads, from small startups to enterprise-scale data needs.

Questions for HN

- Does this idea resonate with the challenges you've faced in managing data pipelines?
- Are there any features or capabilities you'd consider must-haves?
- How important is avoiding vendor lock-in to your organization?
- Would you prefer an open-source tool with premium add-ons, or a fully hosted solution?

I'd love to hear your feedback, suggestions, or critiques. If this idea interests you, I'd be happy to share more details!
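
To make the low-code configuration idea a bit more concrete, here is a rough, hypothetical sketch of what defining and running a pipeline from a YAML config could look like. Nothing here is a finished spec: the step types, field names, and tool names are illustrative placeholders, and a real version would dispatch each step to a proper connector instead of printing it.

  # Rough, hypothetical sketch -- not an existing product or spec.
  # Shows how a YAML-defined pipeline could be loaded and each step
  # dispatched to a pluggable backend; step types, field names, and
  # tool names are illustrative assumptions.
  import yaml  # pip install pyyaml

  PIPELINE_CONFIG = """
  pipeline: orders_daily
  steps:
    - type: ingest
      source: postgres         # could be s3, kafka, ...
      table: raw_orders
    - type: transform
      engine: dbt               # or spark, duckdb, ...
      model: orders_cleaned
    - type: quality_check
      tool: great_expectations
      suite: orders_suite
    - type: load
      target: snowflake         # swapping to delta_lake would only change config
      table: analytics.orders
  """

  def run_step(step: dict) -> None:
      # A real implementation would look up a connector for step["type"]
      # and hand the rest of the step config to the underlying tool.
      print(f"running {step['type']} step: {step}")

  def run_pipeline(config_text: str) -> None:
      config = yaml.safe_load(config_text)
      print(f"pipeline: {config['pipeline']}")
      for step in config["steps"]:
          run_step(step)

  if __name__ == "__main__":
      run_pipeline(PIPELINE_CONFIG)

The intent is that switching, say, snowflake to delta_lake would be a one-line config change rather than a rewrite of the pipeline code.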

Thanks in advance for your input!