from Hacker News

Another Hit Piece on Open-Source AI [video]

by vignesh_warar on 12/23/23, 7:19 PM with 11 comments

by gnabgib on 12/23/23, 9:13 PM
The report (Identifying and Eliminating CSAM in Generative ML Training Data and Models)[0] that this guy is very slowly summarizing (and seems to largely agree with, despite the title) was discussed 3 days ago (38 points, 30 comments)[1]
[0]: https://purl.stanford.edu/kh752sm9123
[1]: https://news.ycombinator.com/item?id=38711135
by Palmik on 12/23/23, 8:57 PM
Tangential, but why didn't the OpenAssistant team (led by the author of the video) release the OpenAssistant dataset? As far as I know, the project was shut down, and only some initial, highly filtered version of the data got released. This dataset could be very valuable for the community that created it.
by artninja1988 on 12/23/23, 8:05 PM
It's honestly pretty sad that at no point did the authors of this paper bother to contact LAION to remove the links and work together to develop better filters. Also pretty interesting that one of the authors, David Thiel, calls himself the "AI censorship death star". Yannic is probably right that they aren't particularly interested in bettering open source diffusion models and are more in the walled garden camp.
by mistrial9 on 12/23/23, 9:34 PM
then why does IBM spend money producing this one?
https://www.youtube.com/watch?v=y9k-U9AuDeM
by terminous on 12/23/23, 9:44 PM
Open source advocates: "With enough eyes, all bugs are shallow."
These researchers: "I see your project includes a non-zero amount of CSAM."
Open source advocates: "How dare you point out an issue? This is a hit piece!"