from Hacker News

Cloud Video Intelligence API

by hurrycane on 3/8/17, 6:33 PM with 87 comments

by tyre on 3/8/17, 9:44 PM
I think their model should take a second pass on the words and probabilities, independent of the video.
Look at their example:
```
  Animal: 97.76%
  Tiger: 90.11%
  Terrestrial animal: 68.17%
```
So we are 90% sure it is a tiger but only 68% sure it is a land animal? I don't think that makes sense.
It could be that this is a weakness of seeding AI data with human inputs. I can believe that 90% of people who saw the video would agree that it is a tiger, while fewer would agree it is a terrestrial animal, because they don't know what terrestrial means.
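A minimal sketch of such a second pass, assuming a hand-written label hierarchy. The `PARENT` mapping and `enforce_hierarchy` are hypothetical names for illustration, not something the API provides, and the hierarchy is assumed to contain no cycles. It raises each ancestor's score to at least the score of its most confident descendant:
```python
# Hypothetical child -> parent hierarchy (an assumption, not API output).
PARENT = {
    "Tiger": "Terrestrial animal",
    "Terrestrial animal": "Animal",
}

def enforce_hierarchy(scores, parent=PARENT):
    """Return a copy of scores where every ancestor scores at least
    as high as each of its descendants."""
    fixed = dict(scores)
    for label, score in scores.items():
        node = label
        # Walk up the chain of ancestors, propagating this label's score.
        while node in parent:
            node = parent[node]
            fixed[node] = max(fixed.get(node, 0.0), score)
    return fixed

scores = {"Animal": 0.9776, "Tiger": 0.9011, "Terrestrial animal": 0.6817}
print(enforce_hierarchy(scores))
```
With the example numbers, "Terrestrial animal" is lifted to the tiger's 0.9011 while "Animal" keeps its own, higher score.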
by sna1l on 3/8/17, 7:10 PM
I wonder if Snapchat is/will become a large user of this service? Depending on the average response time of this API, Snapchat could get much better ad targeting by analyzing their Stories content.
I imagine that they have something similar in house that they run since it is pretty vital to their core business, but you never know.
by wyc on 3/8/17, 7:41 PM
I think the most commercially successful application of computer vision has been quality-control devices (citation needed). Agriculture is very interested in CV for a return-optimization technique known as precision farming. Manufacturers pay for inspection of production throughout the pipeline. To predict where mass-market CV could be successful, I think we should look for industries that have similar problems but cannot currently afford a bespoke modeling solution.
by tambourine_man on 3/8/17, 9:36 PM
It amazes me how smart these guys at Google are, and yet they couldn't design a mobile site if their lives depended on it:
http://imgur.com/bXGuNfL
by skewart on 3/8/17, 7:24 PM
I'm curious about how much use these general-purpose computer vision APIs are actually getting. How many companies out there really want to sift through a lot of photos to find ones that contain "sailboat"? I'm inclined to think a lot more companies would want to find "one of these five different specific kinds of sailboats performing this action", which is definitely not among the tens of thousands of predefined labels that Google and Amazon offer with their general-purpose models.
High-quality custom model training as a service seems much more compelling.
by timc3 on 3/8/17, 7:27 PM
I have been on the beta program for this and generally the results in our testing have been very good. I particularly like how granular the data can get.
by bitmapbrother on 3/8/17, 7:40 PM
It was really entertaining listening to Fei-Fei Li talk about AI and ML at Google Cloud. If you get the chance, check it out on YouTube. I especially liked how she referred to video as once being the "dark matter" of vision AI.
by imh on 3/8/17, 9:07 PM
The demo picture they chose is interesting. It's obviously a tiger, and is identified as such with only 90% probability. I appreciate the difficulty of the problem and how big of a success it is to achieve even that level of confidence, but that low level of confidence really shows how far we are from being able to simply trust computer vision. Still useful from an information retrieval perspective, I expect.
by aub3bhat on 3/8/17, 7:35 PM
I think there is a need for a comprehensive system for image and video data analytics, much like how we today have relational databases (Postgres, MySQL) and full-text search engines (Lucene/Solr). The approach Google and Amazon have been taking, which involves providing a "tagging" API, is frankly unimaginative.
I am working on Deep Video Analytics, an open source visual search and analytics platform for images and videos. The goal of Deep Video Analytics is to become a quickly customizable platform for developing visual & video analytics applications, while benefiting from seamless integration with state-of-the-art models released by the vision research community. It's currently in very active development but still well tested and usable without having to write any code.
https://github.com/AKSHAYUBHAT/DeepVideoAnalytics
https://deepvideoanalytics.com
by ar15saveslives on 3/8/17, 8:29 PM
Correct me if I'm wrong, but this is just a frame-by-frame labeling. You can download whatever pre-trained CNN, pass individual frames through it and get the same result.
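A rough sketch of that frame-by-frame approach, under stated assumptions: `label_video` is a hypothetical helper, `frames` is any iterable of decoded frames, and `classify_frame` stands in for whatever pre-trained CNN you wrap (it is assumed to return a dict of label to score). This is not how Google's API is known to work, only the baseline described above:
```python
from collections import defaultdict

def label_video(frames, classify_frame, every_n=10, top_k=3):
    """Classify every n-th frame and average each label's score
    over the sampled frames; return the top_k (label, score) pairs."""
    totals = defaultdict(float)
    sampled = 0
    for index, frame in enumerate(frames):
        if index % every_n != 0:
            continue  # skip frames between samples
        sampled += 1
        for label, score in classify_frame(frame).items():
            totals[label] += score
    if sampled == 0:
        return []
    # A label missing from a frame counts as a score of 0 for that frame.
    averaged = {label: total / sampled for label, total in totals.items()}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)[:top_k]
```
Averaging ignores motion and temporal context entirely, which is the limitation of pure per-frame labeling.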
by vaiski on 3/9/17, 12:50 AM
There's an alternative out there from a company called Valossa. It's more comprehensive than what Google is now offering. https://val.ai
by frakkingcylons on 3/8/17, 9:16 PM
As a Cloud Prediction API user, it makes me a bit uneasy to see it left out of the image of their product suite. Is it effectively in maintenance mode now? I feel like TensorFlow is overkill for what I need and my use case doesn't fit into image/speech/video detection.
by soared on 3/8/17, 8:13 PM
Sounds similar to a company I worked with that took security camera footage from restaurants and identified employee theft and process inefficiencies.
by jimmcslim on 3/9/17, 5:37 AM
I wonder if you could use this to upload recordings from your DVR and have it determine the likely timecode of commercial breaks...
by zitterbewegung on 3/8/17, 8:46 PM
Not the first: https://clarifai.com has a similar service.
by CRUDmeariver on 3/8/17, 11:17 PM
Is there any storage-related cost (e.g. retrieval or egress cost) when you call this on a file stored on Google Cloud Storage?
by hartator on 3/8/17, 8:39 PM
It's awesome, but I can't really see any application besides content filtering and superficial content classification.
by joaoaccarvalho on 3/8/17, 9:59 PM
When you use these Google APIs, can Google keep/use your data in any way?
by chimtim on 3/8/17, 8:54 PM
What is the "video" bit here? This is just running image recognition on a bunch of frames.
by kneel on 3/8/17, 9:10 PM
Cronenberg inception porn is coming