by elt on 11/1/21, 10:43 PM with 2 comments
Collect data from (n*k) sources -> derive new data -> generate a unified/merged collection of data (n).
The current solution is all hand-crafted code.
I know this is a 10,000-foot view of the problem, but are there any guides or books on how to better design and implement this type of solution?
by hedgehog on 11/5/21, 4:57 PM
You can get pretty far with R or Pandas + SciPy on a fast machine; beyond that you start taking on the extra hassle of Spark or whatever fits your situation.
Oh, and 0) pain that's motivating the rebuild. Feel free to e-mail me even just to rubber duck your thinking.
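For what it's worth, here is a minimal sketch of what the collect -> derive -> merge shape could look like in Pandas. Everything specific in it is an assumption for illustration: the `entity`/`value`/`source` columns, the three in-memory "sources", the sample numbers, and the choice of "deviation from the per-entity mean" as the derived data. The real sources and derivations aren't described in the thread, so treat it as a starting shape rather than a design.

```python
import pandas as pd


def collect() -> pd.DataFrame:
    """Stack all n*k source tables into one long frame (here n=2, k=3)."""
    # Hypothetical in-memory stand-ins for real sources (files, APIs, DBs).
    sources = {
        "source_a": [("x", 1.0), ("y", 10.0)],
        "source_b": [("x", 2.0), ("y", 20.0)],
        "source_c": [("x", 3.0), ("y", 30.0)],
    }
    frames = [
        pd.DataFrame(rows, columns=["entity", "value"]).assign(source=name)
        for name, rows in sources.items()
    ]
    return pd.concat(frames, ignore_index=True)


def derive(raw: pd.DataFrame) -> pd.DataFrame:
    """Add an example derived column: deviation from the entity's mean."""
    out = raw.copy()
    entity_mean = out.groupby("entity")["value"].transform("mean")
    out["deviation"] = out["value"] - entity_mean
    return out


def merge(derived: pd.DataFrame) -> pd.DataFrame:
    """Collapse the k readings per entity into one row, giving n rows."""
    return (
        derived.groupby("entity")
        .agg(
            mean_value=("value", "mean"),
            max_abs_deviation=("deviation", lambda s: s.abs().max()),
            n_sources=("source", "nunique"),
        )
        .reset_index()
    )


unified = merge(derive(collect()))
print(unified)
```

Keeping the three stages as separate functions that each take and return a DataFrame makes it easier to test them one at a time, and to swap a stage out later if the data outgrows a single machine.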
by markus_zhang on 11/2/21, 1:31 AM