by maskil on 7/14/23, 2:45 PM with 159 comments
by foob on 7/14/23, 4:14 PM
The complaint lays out in steps why the plaintiffs believe the datasets have illicit origins — in a Meta paper detailing LLaMA, the company points to sources for its training datasets, one of which is called ThePile, which was assembled by a company called EleutherAI. ThePile, the complaint points out, was described in an EleutherAI paper as being put together from “a copy of the contents of the Bibliotik private tracker.” Bibliotik and the other “shadow libraries” listed, says the lawsuit, are “flagrantly illegal.”
IANAL, but this basically sounds like LLaMA was trained on illegally obtained books by Meta's own admission. It's an exciting development that Meta is releasing a commercial-use version of the model, but I wonder if this is going to cause issues down the road. It's not like Meta can remove these books from the training set without retraining from scratch (or at least from the last checkpoint before they were used).
by zargon on 7/14/23, 3:28 PM
by greatpostman on 7/14/23, 3:32 PM
by ekojs on 7/14/23, 3:23 PM
From the FT article: '“The goal is to diminish the current dominance of OpenAI,” said one person with knowledge of high-level strategy at Meta.'
by forgingahead on 7/14/23, 4:36 PM
This is not charity, this is a shrewd business move.
by whimsicalism on 7/14/23, 5:03 PM
My guess is still the latter because that's what I've heard the rumors about, but this article is pretty unclear on this fact.
by pmarreck on 7/14/23, 4:38 PM
How can I play with open-source LLMs locally?
by loufe on 7/14/23, 6:11 PM
by stale2002 on 7/14/23, 4:37 PM
Well, now there is a commercial release. I guess it wasn't some corporate plot after all!
Some people just can't admit when a corporation does a good thing.
(In this case, the good thing is being done to obsolete their competitors, but it is good nonetheless that a commercial LLM is available for people to use for free.)
by obblekk on 7/14/23, 3:48 PM
by 0cf8612b2e1e on 7/14/23, 4:17 PM
by rvz on 7/14/23, 4:58 PM
Still waiting for the 'Meta is dying' and 'Fire Mark Zuckerberg' calls from last year. A year later, where are they now?
by TheBengaluruGuy on 7/15/23, 2:11 AM
Does this mean that any blog posts I wrote from my own insights will automatically be used to train the model… without my permission?
As an author, it feels like it’s stealing the knowledge and insight without appropriate attribution.
by Jeff_Brown on 7/14/23, 3:13 PM
by sagebird on 7/14/23, 6:16 PM
hardware is the only moat
If you want to live the good life before you are exquisitely extinguished, spend every other day figuring out how to buy more NVDA, the other days exercising outside, being human.
by bilsbie on 7/14/23, 6:49 PM
by 40yearoldman on 7/14/23, 3:28 PM
Open-source commercial?