from Hacker News

Show HN: Google Sheets add-on to compare text, fuzzy-match, highlight duplicates

by chiscript on 1/18/20, 10:06 AM with 27 comments

I created an add-on for Google Sheets called Flookup, and it comes both as a free version and a VERY AFFORDABLE paid version.

At its core, Flookup is a fuzzy matching add-on that helps you manage text that is less than a 100% match. Beyond that it can be used to:

1. Search for and match data regardless of whether it contains typos.

2. Highlight and delete duplicates duplicates even if the data has mismatched text.

3. Calculate the percentage similarity between strings.

4. Extract unique values from any column based on percentage similarity.

5. Sum and find the average of numbers based on corresponding partial matches.
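
As a rough illustration of what "percentage similarity" and similarity-based unique extraction can look like, here is a minimal Python sketch. It is not Flookup's implementation, which the post does not describe. It assumes the standard library's `difflib.SequenceMatcher` as a stand-in scorer, and the function names `similarity` and `unique_by_similarity` and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0.0-1.0 similarity ratio, ignoring case and outer whitespace."""
    # Assumption: difflib's ratio stands in for whatever scoring Flookup uses.
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def unique_by_similarity(values, threshold=0.8):
    """Keep the first value of each group of near-duplicates."""
    kept = []
    for value in values:
        # A value is treated as a duplicate if it is similar enough to one already kept.
        if all(similarity(value, seen) < threshold for seen in kept):
            kept.append(value)
    return kept


names = ["Microsoft", "Microsfot", "Google"]
deduplicated = unique_by_similarity(names)
```

Multiplying `similarity(...)` by 100 gives a percentage. Raising the threshold makes the duplicate check stricter, while lowering it merges more loosely related strings.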

Because of its versatility, Flookup can be used to return the best match, the next best match, etc. until the minimum percentage similarity is reached. This feature avoids weaknesses other fuzzy matching algorithms have because it safely hands power to the user, and I believe the user is the best judge of which data is a match or not.
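
The "best match, next best match, until the minimum similarity is reached" idea can be sketched as a ranked list with a cutoff. This is only an illustration under assumptions: `ranked_matches`, `nth_best`, the `rank` parameter and the 0.6 default are hypothetical names and values, and `difflib.SequenceMatcher` is again a stand-in for the actual scoring.

```python
from difflib import SequenceMatcher


def ranked_matches(lookup, candidates, min_similarity=0.6):
    """Return (candidate, score) pairs at or above min_similarity, best first."""
    scored = [
        (c, SequenceMatcher(None, lookup.lower(), c.lower()).ratio())
        for c in candidates
    ]
    kept = [pair for pair in scored if pair[1] >= min_similarity]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def nth_best(lookup, candidates, rank=1, min_similarity=0.6):
    """Return the rank-th best match (1 = best), or None past the cutoff."""
    matches = ranked_matches(lookup, candidates, min_similarity)
    return matches[rank - 1][0] if 0 < rank <= len(matches) else None


people = ["Jon Smith", "John Smith", "Jane Doe"]
first = nth_best("John Smith", people, rank=1)
second = nth_best("John Smith", people, rank=2)
third = nth_best("John Smith", people, rank=3)  # below the cutoff, so None
```

Leaving `rank` and `min_similarity` to the caller is what puts the decision about "what counts as a match" in the user's hands.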

Another great feature Flookup has is that it can be used to combine lookup values. This is particularly helpful when your data has many similar strings and you want to add extra information to your lookup value in order to increase the specificity of your query.
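
To show why combining lookup values helps, the sketch below joins two fields into a single lookup string before scoring. With the company name alone, all three rows would tie; adding the city breaks the tie. The helper names `combined_key` and `best_match` and the sample rows are made up for illustration, and the scoring again assumes `difflib.SequenceMatcher` rather than Flookup's own method.

```python
from difflib import SequenceMatcher


def combined_key(*parts):
    """Join several fields into one lowercase lookup string."""
    return " ".join(str(p).strip().lower() for p in parts)


def best_match(lookup_parts, rows):
    """Return the row whose combined fields are most similar to the lookup."""
    target = combined_key(*lookup_parts)
    return max(
        rows,
        key=lambda row: SequenceMatcher(None, target, combined_key(*row)).ratio(),
    )


rows = [("Acme Corp", "London"), ("Acme Corp", "Lagos"), ("Acme Corp", "Lisbon")]
# The city contains a typo, yet the extra field still singles out one row.
result = best_match(("Acme Corp", "Lagoss"), rows)
```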

Finally, Flookup is good for more than just fuzzy matching; it is the improved replacement for VLOOKUP and INDEX/MATCH that you have been looking for.

Find out more at https://www.getflookup.com. Subscription information is available at https://www.getflookup.com/pricing

by throw_14JAS on 1/20/20, 6:57 AM
I've had a similar idea in the back of my mind for a few years now. Congrats on launching!
My use case is a bit different -- I was doing a lot of database cleanups, particularly CRMs. I rewrote/reused code to build a duplicate detector a number of times, and I always wished there were a service that I could send data to that would flag my dupes. I was even using human labelers to train domain-specific models.
by tehabe on 1/20/20, 8:54 AM
Why is it so hard to introduce the people who made this tool on the website? If I spend money on something, I want to know who I am sending that money to. It feels really weird to use an anonymous tool for something which might be important.
by superbrane on 1/20/20, 8:59 AM
Congrats on launching a useful tool and already gathering a nice install base. I wonder how much of it is paid :)). The base plan looks a bit too restrictive: people can process 50 rows manually in xls. I suggest you offer more rows on the free plan to encourage adoption.
by dandare on 1/20/20, 4:27 PM
This reminded me of https://openrefine.org/.
Good luck with your project!
by samdung on 1/20/20, 11:36 AM
Congrats on the launch, and I hope you make money quickly enough before Google launches this as a built-in feature.
by jacklewis on 1/20/20, 6:23 AM
> 2. Highlight and delete duplicates duplicates even if the data has mismatched text.
I see what you did there
by marapuru on 1/20/20, 9:40 AM
Why did you choose a subscription model over a fixed price one?