from Hacker News

How we run Spark and Sqoop in production

by natekupp on 6/6/16, 4:56 PM with 18 comments

  • by ktamura on 6/6/16, 5:42 PM

    Any good alternatives to Sqoop? I feel that an ETL tool just for HDFS is too limiting and leads to further fragmentation of the data pipeline.
  • by sciurus on 6/6/16, 9:51 PM

    Their earlier post about rebuilding their data infrastructure is more interesting imho: https://news.ycombinator.com/item?id=11474284
  • by natekupp on 6/6/16, 6:37 PM

    Hey all, feel free to reach out to me either on this thread or directly at nate[at]thumbtack.com if I can answer any questions!
  • by lazywizard on 6/12/16, 2:30 AM

    Thanks for sharing all the scripts. Great help.
  • by lobster_johnson on 6/6/16, 10:56 PM

    Speaking of Spark, has anyone used it with Go? Is such a thing even possible?
  • by mcrad on 6/6/16, 6:25 PM

    Thumbtack is great for lazy consumers. But word on the street is that they put too much price pressure on the market and therefore drive the overall quality of services down. Good work Thumbtack! You have figured out a convenient way to sacrifice long-term value in favor of short-term profit.