from Hacker News

AI Data Laundering

by marceloabsousa on 10/17/22, 9:26 PM with 113 comments

  • by moyix on 10/17/22, 10:04 PM

    The Authors Guild v Google decision about Google Books seems relevant:

    > In late 2013, after the class action status was challenged, the District Court granted summary judgement in favor of Google, dismissing the lawsuit and affirming the Google Books project met all legal requirements for fair use. The Second Circuit Court of Appeal upheld the District Court's summary judgement in October 2015, ruling Google's "project provides a public service without violating intellectual property law." The U.S. Supreme Court subsequently denied a petition to hear the case.

    [...]

    > The court's summary of its opinion is:

    [...]

    > Google’s unauthorized digitizing of copyright-protected works, creation of a search functionality, and display of snippets from those works are non-infringing fair uses. The purpose of the copying is highly transformative, the public display of text is limited, and the revelations do not provide a significant market substitute for the protected aspects of the originals. Google’s commercial nature and profit motivation do not justify denial of fair use.

    https://en.wikipedia.org/wiki/Authors_Guild,_Inc._v._Google,....

    This doesn't touch on the ethics of course – at minimum I think allowing people to exclude themselves or their work from a dataset is necessary.

  • by dkural on 10/17/22, 10:04 PM

    This reminds me of Uber's Jedi mind trick: wave a smartphone and argue that labor and other laws suddenly don't apply to them, to the detriment of the public that'll now shoulder the costs.

  • by nojvek on 10/18/22, 9:56 AM

    Big Tech has really big datasets, especially Google. With YouTube, Photos, Music, Gmail, Docs, Maps, Books, Waymo, Search … they have giant multimodal datasets that capture the essence of all human knowledge. They have 10+ products with more than a billion users creating data for them.

    If Google Brain/DeepMind were to crack AGI, it would make Google/Alphabet crazy rich to the detriment of millions of YouTubers, book authors, musicians, and drivers.

    AI will concentrate power and wealth in fewer individuals.

  • by noduerme on 10/18/22, 8:38 AM

    I've got a couple examples of Stable Diffusion replicating watermarks along with similar swatches of imagery into scenes from the same prompt [1]. A single case of this should be enough to file a massive lawsuit if the art were recognizable to the creator.

    [1] https://news.ycombinator.com/item?id=33061707

  • by killjoywashere on 10/17/22, 10:52 PM

    > It’s currently unclear if training deep learning models on copyrighted material is a form of infringement

    What? It's clearly a derived work.

  • by gfd on 10/18/22, 12:18 AM

    Was this term coined on HN? I remember first seeing it (used in an AI context) from this 2019 comment under "Cool stuff that's still completely unregulated": https://news.ycombinator.com/item?id=21167689

    Most of the predictions in that first comment came true.

  • by learndeeply on 10/17/22, 11:06 PM

    > But then Meta is using those academic non-commercial datasets to train a model, presumably for future commercial use in their products. Weird, right?

    This is a very strong and likely inaccurate presumption.

  • by RosanaAnaDana on 10/17/22, 11:04 PM

    The horse seems well out of the gate.

  • by Havoc on 10/17/22, 11:33 PM

    The whole thing is a mess, but frankly I doubt this genie can be put back in the bottle.

  • by bo1024 on 10/18/22, 2:01 AM

    The Flickr example is wild. How was nobody sued for that!?

  • by krab on 10/18/22, 11:39 AM

    Are we heading towards voiding most current copyrights, or is there a way out of this mess with another patch to the laws?

  • by theGnuMe on 10/19/22, 1:15 AM

    It’s definitely fair use. One question I have, though: is Mickey Mouse protected by copyright, trademark, or something else? I assume someone other than Disney can’t sell Mickey’s likeness, or is that wrong when it comes to art? And what if the AI makes a movie?

  • by patcon on 10/18/22, 11:35 AM

    Not sure laundering is the right term.

    Laundering private things through the commons feels not as shady as laundering in private networks. The commons benefits too.

    It's more like open source than money laundering.