by leahcim on 11/13/20, 11:34 PM with 11 comments
Our clients give us an Excel/file and I need to 1) match the columns 2) clean up the data 3) return errors (if any) 4) import each row into my database
It's very difficult given that so many things could go wrong. I have already invested a lot of time in building a file parser with column matcher but the results aren't there yet.
Do you know of any API / widget that would allow me to offer this on my website with minimal code?
by gus_massa on 11/14/20, 12:46 AM
by shyn3 on 11/14/20, 2:39 PM
You will always get false positives, so what I do is dump the PS results into another database, then run a comparison, and then import. It slows you down a bit but I have noticed, every few months, I'll get 1 or 2 new errors due to data sanitization.
by fuzzfactor on 11/14/20, 6:10 PM
This could be high-touch but it could also build more client trust and pay off more than other non-automatable efforts.
Then first actually open each file in Excel.
I would imagine many clients would expect you to be more advanced at Excel than they are anyway, so do not fall behind.
That way you and the client always start on the same page, and you have the full power of Excel at your disposal at least like they do.
Matching the columns is no longer on your plate, fait accompli.
And a bad XLSx can be caught by Excel without being subject to further incorrect processing.
Maybe the data will be cleaner or more well-confirmed to begin with, and there might be fewer errors to return already.
The spreadsheet itself is the database for that one file, in your client's familiar format, an opened XLSx.
Rather than parsing, you could then consider reading the data directly from the cells using a straightforward macro which creates a binary or text file directly converting to your desired database format, or also an intermediate CSV if desired.
Excel's built-in macro language should still be capable of accomplishing this, or I would expect you could still call any proper x86 code in other languages as an Excel macro or subroutine if desired too.
It should then be possible for a powershell script to accomplish this structured file conversion with a single click from a menu, on any target XLSx or folder of XLSx's that conform to the clients' layout, which you have encoded into the custom macro that converts to your proprietary file structure.
If this is considered efficient while accommodating the overhead of Windows and Excel together only you can decide.
I used a common office PC for this process so I could encode modern data for use by antique scientific instruments, then sent the files to the equipment in their own obsolete format using the COM port.
Only decoding the incoming proprietary data stream from the antique gear required a parser.
by sangli on 11/14/20, 4:54 AM
by st1x7 on 11/14/20, 10:42 AM
by sloaken on 11/14/20, 10:04 AM
by poletopole on 11/14/20, 8:02 PM
by YaBa on 11/14/20, 6:45 PM
Python + Pandas would be my choice to develop something.
by thomasthomas on 11/17/20, 2:39 PM
by op03 on 11/14/20, 11:28 AM
by su1000 on 11/14/20, 12:40 AM