from Hacker News

Show HN: Turn any website into an API (for those who miss Kimono)

by welanes on 11/1/19, 12:19 PM with 74 comments

  • by phsource on 11/1/19, 3:03 PM

    This is very cool! I love how you brought back the original Kimono UI with the checkmark and Xs for adding and removing data tags.

    We built WrapAPI (https://wrapapi.com) back in the day, before we ended up starting Wanderlog (https://wanderlog.com), our current travel planning Y Combinator startup. This definitely is still an unsolved problem.

    However, from a business point of view, we found that it was rather difficult to make a business out of an unspecialized scraping tool. The Kimono founders expressed a similar sentiment: ultimately, scraping is a solution looking for a problem.

    Developers can often roll their own solution too, which limits your customer base and how much you can charge. Instead, vertical-specific tools that target particular industries seem to be the way to go (see Plaid as an example!)

    Alternatively, you have to be good at Enterprise and B2B sales. This is a product where you need to get the word out, find a champion, and do customer success, since it has a substantial learning curve. We were not good at that, which is why we chose to focus on other projects.

    Best of luck, and feel free to get in touch if you'd like to chat more.

  • by welanes on 11/1/19, 1:16 PM

    Hey HN, I posted this in a comment thread the other day and (to my surprise) it got a positive reception, so I added a few more updates and decided to post it properly.

    The idea is to be able to choose a website, select the data you want, and make it available (as JSON, CSV or an API) with as little friction as possible.

    Kimono was the gold standard for a while, so I did yoink some of their ideas while doing some other things differently.

    It still needs some work, but as it's an MVP I would appreciate any feedback. Cheers.

  • by beagle3 on 11/1/19, 9:58 PM

    I don't feel it is right to describe it as "turns a website into an API", rather "gives scraped data through an API".

    "Turn website into an API", for me, evokes the image that I can automate (say) placing an order in Amazon as an API, or paying my bills automatically. It includes scraping, of course, but requires a lot more (mechanize/twill/selenium/phantom/etc power).

    There was a company called Orsus that did exactly that. Last I heard about them it was the year 2000.

  • by uberswe on 11/1/19, 1:51 PM

    I like the idea, but I was skeptical as to how well it works, and I noticed that the video on the main page of your website, which scrapes CoinMarketCap, seems to be wrong. It gets 200 cryptocurrency names but only 100 prices, which means only the first result is paired correctly.

    I have a similar idea that I'm working on. Your site is definitely bookmarked, and I will try the extension later.
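
    A minimal sketch of how a scraper could catch the mismatch described above, assuming each field is collected as its own list of values. The function name pair_columns and the sample values are made up for illustration; nothing here is taken from the tool itself.

```python
def pair_columns(columns):
    """Combine per-field value lists into row dicts.

    `columns` maps a field name to the list of values scraped for it.
    Raises ValueError when the lists differ in length, because zipping
    them anyway would silently pair values from different rows.
    """
    lengths = {field: len(values) for field, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"column lengths differ: {lengths}")
    fields = list(columns)
    return [dict(zip(fields, row)) for row in zip(*columns.values())]


# Hypothetical data: two names and two prices line up and pair cleanly.
rows = pair_columns({"name": ["Bitcoin", "Ethereum"], "price": ["p1", "p2"]})
print(rows)
```

    With 200 names and 100 prices the same call would raise instead of returning misaligned rows, which is usually a hint that one selector matches twice per row.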

  • by save_ferris on 11/1/19, 1:04 PM

    What is it about this service as a business model that prevents it from taking off? I’ve known at least two YC startups that tried to build businesses around this idea.

    I think one or both were acquired and immediately shut down, but I’m not 100% sure about that.

  • by ainiriand on 11/1/19, 1:48 PM

    Hi, is it possible to make it compatible with Firefox?

  • by mikikian on 11/1/19, 8:23 PM

    Maybe a better business model is to offer this as a service to site owners who are not tech savvy. It would be a win-win: site owners can offer an API (free or paid) to new customers, and API consumers can rely on getting data in the future.

  • by MildlySerious on 11/2/19, 10:16 AM

    I just gave this a shot on the ISO website to get a list of country codes[1], but it seems the selection algorithm breaks down when there are no specific classes applied to elements: every td.v-grid-cell is selected, which is all of them, instead of, for example, just the values of the alpha2 column.

    This seems hard to solve entirely programmatically. Maybe having a way to be more specific, by providing a selector yourself or by selecting multiple entries and having the plugin figure it out, could add a lot of utility in such cases.

    [1] - https://www.iso.org/obp/ui/#search/code/
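
    One way to be more specific when every cell shares the same class is to pick cells by column position rather than by class. Below is a rough sketch using only the Python standard library. The sample markup is invented for illustration (only the td.v-grid-cell class comes from the comment above), and the names TableRows and column are hypothetical.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def column(html, header):
    """Return the values under `header`, using the first row as headers."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    head, *body = parser.rows
    index = head.index(header)
    return [row[index] for row in body if len(row) > index]


# Invented sample: every data cell carries the same class.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Alpha-2</th><th>Alpha-3</th></tr>
  <tr><td class="v-grid-cell">Belgium</td><td class="v-grid-cell">BE</td><td class="v-grid-cell">BEL</td></tr>
  <tr><td class="v-grid-cell">Ireland</td><td class="v-grid-cell">IE</td><td class="v-grid-cell">IRL</td></tr>
</table>
"""

print(column(SAMPLE, "Alpha-2"))  # ['BE', 'IE']
```

    Selecting by position is the same idea as an nth-child selector. It only works on markup that is already rendered, so a page built by JavaScript would need a browser step first.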

  • by nopcode on 11/1/19, 5:25 PM

    I believe this could be a good solution to turn legacy software into an API. The “generated code” should be a reverse proxy, not a scraping lib.

    Also, scraping a website to use/copy its data is illegal in my country (Belgium). I’m not sure this tool itself would be.

  • by flingo on 11/4/19, 12:52 AM

    Is there a reason this doesn't spit out some Python or JavaScript code to scrape the same info?

    This just seems to add another dependency to whatever I'm developing. Plus, it sends data through a server I don't control (I assume).

  • by maroonblazer on 11/2/19, 1:35 AM

    I like this.

    Please consider adding the ability to script clicks on elements, e.g. buttons.

    I manage a site where we load a subset of articles on initial page load and then have a "Load more" button that executes JavaScript to load another batch of articles. Getting a list of articles from our CMS is a bit of a hassle, so being able to scrape it easily instead would be ideal.

  • by holeyness on 11/1/19, 8:45 PM

    Does this work with authenticated pages?

  • by mrskitch on 11/1/19, 8:42 PM

    This is super cool. I really enjoyed and missed the Kimono workflow. Automating something like this with browserless.io would be really fun (I run that project). Extensions are one of the things we’re looking to support.

    Anyway, send me an email at joel at browserless dot io if you ever want to chat.

  • by joelvalleroy on 11/2/19, 1:21 AM

    Awesome! One question I have after reading the page: what are the pricing plans concerning credits (for automated scraping)?

  • by ntaylor on 11/1/19, 2:46 PM

    Kimono was cool, nice to see another option. I still have a Kimono t-shirt in a drawer somewhere.

  • by matz1 on 11/1/19, 5:36 PM

    How do I use the 'pagination' feature? The help guide doesn't even mention it.

  • by monkeydust on 11/1/19, 8:29 PM

    Looks good. Could this be integrated into n8n.io to be used to drive a workflow?

  • by cfan01 on 11/5/19, 8:36 PM

    Firefox add-on, please.

  • by earth2mars on 11/1/19, 3:26 PM

    If you could add an RSS feed response, that would be great.

  • by nightnight on 11/1/19, 4:53 PM

    OT: or just use Puppeteer. It's not really hard, it's free, and you can rule the world.