from Hacker News

Ask HN: Simplest Way to Distribute Custom C++ Computation in the Cloud?

by Ken_At_EM on 9/20/22, 12:43 PM with 0 comments

Hi,

We have a commercial app that we call a "Decoder." The Decoder is a wireless signal processor that consumes an analog time-series waveform and extracts digital information from it using different digital signal processing / demodulation techniques.

Improving our signal processing involves reprocessing a large database of analog signal examples and grading the performance after each change. Tweaking filters, settings, techniques, etc. can all produce improvements or regressions, on a large or small scale.

We have a Windows App (C++/Qt) that can reprocess and "re-decode" the data. You pass it a CSV file containing the analog waveform and it produces many output files of different types.

My current architecture is a custom job runner that I developed, which can be installed on any Windows machine. It watches an AWS SQS queue for jobs. It downloads a job, starts the worker utility, waits for it to process the data, parses the output files, and then stores the result in a SQL database.
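A minimal sketch of what one iteration of such a runner loop might look like, written in Python for brevity. This is not the actual runner: `poll_once`, `build_command`, `parse_outputs`, and `store_result` are hypothetical names introduced for illustration, and `sqs` is assumed to be a boto3 SQS client (or any object exposing the same `receive_message` and `delete_message` calls).

```python
import subprocess
from typing import Any, Callable, Sequence


def poll_once(
    sqs: Any,                                        # assumed: boto3 SQS client or look-alike
    queue_url: str,
    build_command: Callable[[str], Sequence[str]],   # hypothetical: job body -> worker command line
    parse_outputs: Callable[[str], dict],            # hypothetical: job body -> parsed result
    store_result: Callable[[dict], None],            # hypothetical: writes the result to the database
    run: Callable[..., Any] = subprocess.run,
) -> int:
    """Fetch at most one job from the queue, process it, and return how many jobs completed."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=20,  # long polling, so an idle worker does not spin on an empty queue
    )
    completed = 0
    for message in response.get("Messages", []):
        body = message["Body"]
        # check=True raises if the worker utility exits with a non-zero status.
        run(build_command(body), check=True)
        store_result(parse_outputs(body))
        # Delete only after the result is stored; if anything above raised,
        # the message becomes visible again once its visibility timeout expires.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
        completed += 1
    return completed
```

One design point this makes visible: because the message is deleted last, the queue's visibility timeout would need to be longer than the slowest job, otherwise a second worker could pick up a job that is still running.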

We need to expand the scale at which we do this by 10X, maybe 100X, to experiment with some machine learning approaches we're considering. In practice that just means running 10X to 100X more simulations or reprocessing jobs.

The utility takes 1-10 minutes to process the data depending on the CPU and length of the analog waveform.

My existing architecture seems kind of flimsy and I am hesitant to rely on it to handle 100X more jobs than it does now. It also doesn't scale well in parallel. I'd love to be able to process any number of jobs, say 1000, all within 10 minutes. With the existing architecture, I have to spin up a new worker to add capacity, and the jobs are divided among all workers. The workers are a mix of EC2 Windows Server VMs and physical machines.
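To put rough numbers on that goal, here is a back-of-the-envelope calculation using the 1-10 minute job times mentioned above. It assumes each worker runs one job at a time, back to back, and it ignores startup, download, and database time; `workers_needed` is just an illustrative helper.

```python
import math


def workers_needed(jobs: int, minutes_per_job: float, deadline_minutes: float) -> int:
    """Workers required if each one runs jobs sequentially until the deadline."""
    jobs_per_worker = math.floor(deadline_minutes / minutes_per_job)
    if jobs_per_worker < 1:
        raise ValueError("a single job does not fit within the deadline")
    return math.ceil(jobs / jobs_per_worker)


# Best case (1 minute per job) and worst case (10 minutes per job) for 1000 jobs.
for minutes in (1, 10):
    print(f"{minutes} min/job -> {workers_needed(1000, minutes, 10)} workers")
# prints:
# 1 min/job -> 100 workers
# 10 min/job -> 1000 workers
```

So under these assumptions, finishing 1000 jobs in 10 minutes takes somewhere between 100 and 1000 workers running at once, depending on how long each job actually runs.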

We are considering porting the utility to Linux so that we can run it on AWS Lambda or AWS Batch.

My team and I are fairly new to cloud processing, hence the post.

Primary Question: What would HN suggest as the simplest way to scale this up?

Sub-Questions:

Are there any straightforward ways to achieve our goals with the existing Windows CLI processor utility? Or should we port to Linux?

We'd prefer not to manage VMs or physical machines, or interact with the OS, if we don't have to. Given that, what's a good cloud tool for distributing this computation?

Thanks for your time.