by alexwg on 1/1/16, 4:35 PM with 2 comments
by Houshalter on 1/1/16, 10:55 PM
Data is basically infinite: the internet has endless amounts of it. Hardware is basically infinite too. A large corporation like Google can afford massive clusters of top-of-the-line GPUs if it would help their algorithms.
The main bottleneck is algorithms that can take advantage of those things. We don't have very good algorithms for using "unlabeled" data, i.e. data that hasn't been painstakingly classified by a human who tells the AI what it's supposed to make of it.
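To make "using unlabeled data" concrete, here is a minimal sketch, assuming a toy numpy autoencoder stands in for unsupervised learning: it learns structure by reconstructing its own input, with no human-provided labels. The data and dimensions are made up for illustration.

    import numpy as np

    # Toy illustration: an autoencoder learns from unlabeled data by trying to
    # reconstruct its own input -- no human-provided labels are involved.
    rng = np.random.RandomState(0)
    X = rng.rand(200, 20)                # 200 unlabeled examples, 20 features (made up)

    n_hidden = 5
    W1 = rng.randn(20, n_hidden) * 0.1   # encoder weights
    W2 = rng.randn(n_hidden, 20) * 0.1   # decoder weights
    lr = 0.1

    for step in range(500):
        H = np.tanh(X @ W1)              # encode
        X_hat = H @ W2                   # decode (linear output)
        err = X_hat - X                  # reconstruction error
        # Backpropagate the squared-error loss through the two layers.
        grad_W2 = H.T @ err / len(X)
        grad_H = err @ W2.T * (1 - H**2) # tanh derivative
        grad_W1 = X.T @ grad_H / len(X)
        W2 -= lr * grad_W2
        W1 -= lr * grad_W1

    print("reconstruction MSE:", np.mean((np.tanh(X @ W1) @ W2 - X) ** 2))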
We don't have very good algorithms for utilizing multiple GPUs either. There are limits on how fast data can be transferred between them, so we need to use that bandwidth efficiently. No one has really worked out how to do that well yet, but it's probably possible.
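Here is a rough sketch of the synchronous data-parallel scheme the multi-GPU point refers to, simulated on CPU with numpy. Each "device" computes a gradient on its own shard of the data; the averaging step stands in for the all-reduce that consumes inter-GPU bandwidth on real hardware. The linear model and data are purely illustrative.

    import numpy as np

    rng = np.random.RandomState(0)
    n_devices = 4
    X = rng.randn(400, 10)
    true_w = rng.randn(10)
    y = X @ true_w + 0.01 * rng.randn(400)

    w = np.zeros(10)
    lr = 0.1
    X_shards = np.array_split(X, n_devices)
    y_shards = np.array_split(y, n_devices)

    for step in range(100):
        local_grads = []
        for Xs, ys in zip(X_shards, y_shards):        # would run in parallel on real GPUs
            err = Xs @ w - ys
            local_grads.append(Xs.T @ err / len(Xs))  # local gradient on this shard
        grad = np.mean(local_grads, axis=0)           # "all-reduce": the bandwidth-bound step
        w -= lr * grad

    print("distance from true weights:", np.linalg.norm(w - true_w))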
And then there are just limitations on what the existing algorithms can do. Google Translate's algorithm needs lots of data because it takes a very brute-force approach; we now have neural network language models that are far more efficient and need less data. GoogLeNet was not the first neural network to be trained on ImageNet at all, but it did so well because it was such a big improvement in the algorithm.
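A small illustration of why count-based ("brute force") language models are so data-hungry, using a hypothetical mini-corpus: a bigram count table assigns zero probability to any word pair it has never seen, whereas a neural language model can share statistics across similar words through learned embeddings.

    from collections import Counter

    # Hypothetical mini-corpus, purely for illustration.
    corpus = "the cat sat on the mat . the dog sat on the rug .".split()
    bigrams = Counter(zip(corpus, corpus[1:]))
    unigrams = Counter(corpus)

    def p(next_word, prev_word):
        # Maximum-likelihood bigram probability: count(prev, next) / count(prev).
        return bigrams[(prev_word, next_word)] / unigrams[prev_word]

    print(p("sat", "cat"))   # seen in the corpus -> nonzero
    print(p("sat", "dog"))   # also seen -> nonzero
    print(p("ran", "cat"))   # never seen -> exactly 0, even though it is plausible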
by steinsgate on 1/1/16, 6:19 PM