by welanes on 11/1/19, 12:19 PM with 74 comments
by phsource on 11/1/19, 3:03 PM
We built WrapAPI (https://wrapapi.com) back in the day, before we ended up starting Wanderlog (https://wanderlog.com), our current travel planning Y Combinator startup. This definitely is still an unsolved problem.
However, from a business point of view, we found that it was rather difficult to make a business out of an unspecialized scraping tool. The Kimono founders expressed a similar sentiment: ultimately, scraping is a solution looking for a problem.
Developers can often roll their own solution too, which limits your customer base and how much you can charge. Instead, vertical-specific tools that target particular industries seem to be the way to go (see Plaid as an example!)
Alternatively, you have to be good at Enterprise and B2B sales. This is a product where you need to get the word out, find a champion, and do customer success, since it has a substantial learning curve. We were not good at that, which is why we chose to focus on other projects to start out.
Best of luck, and feel free to get in touch if you'd like to chat more.
by welanes on 11/1/19, 1:16 PM
The idea is to be able to choose a website, select the data you want, and make it available (as JSON, CSV or an API) with as little friction as possible.
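To make the "select the data, get JSON or CSV" step concrete, here is a minimal sketch of the export side only. It assumes the extraction has already produced a list of flat records; the function names `to_json` and `to_csv` and the sample fields are hypothetical and not taken from the tool itself.

```python
import csv
import io
import json
from typing import Dict, List

Record = Dict[str, str]


def to_json(rows: List[Record]) -> str:
    """Serialize the extracted records as a JSON array."""
    return json.dumps(rows, indent=2)


def to_csv(rows: List[Record]) -> str:
    """Serialize the extracted records as CSV.

    Columns are the union of all keys, in first-seen order.
    Records that lack a column get an empty cell.
    """
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical records, as if scraped from a product listing.
rows = [
    {"title": "A", "price": "1"},
    {"title": "B"},
]
print(to_json(rows))
print(to_csv(rows))
```

Serving the same records from an HTTP endpoint would cover the "API" case; that part is left out here.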
Kimono was the gold standard for a while, so I did yoink some of their ideas while doing some other things differently.
It still needs some work, but as an MVP I would appreciate any feedback. Cheers.
by beagle3 on 11/1/19, 9:58 PM
"Turn website into an API", for me, evokes the image that I can automate (say) placing an order in Amazon as an API, or paying my bills automatically. It includes scraping, of course, but requires a lot more (mechanize/twill/selenium/phantom/etc power).
There was a company called Orsus that did exactly that. Last I heard about them it was the year 2000.
by uberswe on 11/1/19, 1:51 PM
I have a similar idea that I'm working on. Your site is definitely bookmarked and I will try the extension later.
by save_ferris on 11/1/19, 1:04 PM
I think one or both were acquired and immediately shut down, but I’m not 100% sure about that.
by ainiriand on 11/1/19, 1:48 PM
by mikikian on 11/1/19, 8:23 PM
by MildlySerious on 11/2/19, 10:16 AM
This seems hard to solve entirely programmatically. Maybe a way to be more specific, either by providing a selector yourself or by selecting multiple entries and having the plugin figure it out, could add a lot of utility in such cases.
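A rough sketch of what "selecting multiple entries and having the plugin figure it out" could look like: take the element path of each selected entry and keep a positional index only where all selections agree. This is one possible heuristic, not how the extension actually works; the path representation and the names `generalize` and `to_css` are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# One step of an element path: (tag name, 1-based nth-of-type index).
# An index of None means "any sibling of this tag".
Step = Tuple[str, Optional[int]]


def generalize(paths: List[List[Step]]) -> Optional[List[Step]]:
    """Merge the paths of several selected elements into one pattern.

    Returns None when the paths differ in depth or in tag names,
    because this simple heuristic cannot reconcile them.
    """
    if not paths:
        return None
    depth = len(paths[0])
    if any(len(path) != depth for path in paths):
        return None
    pattern: List[Step] = []
    for steps in zip(*paths):
        tags = {tag for tag, _ in steps}
        if len(tags) != 1:
            return None
        indices = {index for _, index in steps}
        # Keep the index only if every selected element agrees on it.
        shared_index = steps[0][1] if len(indices) == 1 else None
        pattern.append((steps[0][0], shared_index))
    return pattern


def to_css(pattern: List[Step]) -> str:
    """Render a pattern as a CSS selector using child combinators."""
    parts = []
    for tag, index in pattern:
        parts.append(tag if index is None else f"{tag}:nth-of-type({index})")
    return " > ".join(parts)


# Two entries the user clicked: the first and third items of a list.
first = [("ul", 1), ("li", 1), ("a", 1)]
third = [("ul", 1), ("li", 3), ("a", 1)]
print(to_css(generalize([first, third])))
# prints: ul:nth-of-type(1) > li > a:nth-of-type(1)
```

When the heuristic returns None, falling back to a user-supplied selector would cover the other half of the suggestion.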
by nopcode on 11/1/19, 5:25 PM
Also, scraping a website to use/copy its data is illegal in my country (Belgium). I’m not sure this tool itself would be.
by flingo on 11/4/19, 12:52 AM
This just seems to add another dependency to whatever I'm developing. Plus, it sends data through a server I don't control. (I assume)
by maroonblazer on 11/2/19, 1:35 AM
Please consider adding the ability to script clicks on elements, e.g. buttons.
I manage a site where we load a subset of articles on initial page load and then have a "Load more" button that executes JavaScript to load another batch of articles. Getting a list of articles from our CMS is a bit of a hassle, so being able to scrape it easily instead would be ideal.
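The kind of scripted clicking I have in mind is roughly the loop below. It is only a sketch: `read_items` and `click_load_more` stand in for whatever browser automation the tool would use, and `FakePage` is a made-up in-memory page so the loop can run without a browser.

```python
from typing import Callable, List


def load_all(
    read_items: Callable[[], List[str]],
    click_load_more: Callable[[], bool],
    max_clicks: int = 50,
) -> List[str]:
    """Click "Load more" until the button is gone or nothing new appears.

    click_load_more returns False when there is no button left to click.
    max_clicks guards against pages that never stop loading.
    """
    items = read_items()
    for _ in range(max_clicks):
        if not click_load_more():
            break
        updated = read_items()
        if len(updated) == len(items):
            break
        items = updated
    return items


class FakePage:
    """Hypothetical stand-in for a page that reveals one batch per click."""

    def __init__(self, batches: List[List[str]]) -> None:
        self.batches = batches
        self.visible = 1

    def read_items(self) -> List[str]:
        return [item for batch in self.batches[: self.visible] for item in batch]

    def click_load_more(self) -> bool:
        if self.visible >= len(self.batches):
            return False
        self.visible += 1
        return True


page = FakePage([["a", "b"], ["c", "d"], ["e"]])
print(load_all(page.read_items, page.click_load_more))
# prints: ['a', 'b', 'c', 'd', 'e']
```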
by holeyness on 11/1/19, 8:45 PM
by mrskitch on 11/1/19, 8:42 PM
Anyway, send me an email at joel at browserless dot io if you ever want to chat.
by joelvalleroy on 11/2/19, 1:21 AM
by ntaylor on 11/1/19, 2:46 PM
by matz1 on 11/1/19, 5:36 PM
by monkeydust on 11/1/19, 8:29 PM
by cfan01 on 11/5/19, 8:36 PM
by earth2mars on 11/1/19, 3:26 PM
by nightnight on 11/1/19, 4:53 PM