by mofosyne on 5/24/24, 6:47 AM with 1 comments
by mofosyne on 5/24/24, 6:47 AM
There was an attempt to filter out stupid comments on social media using a vector filter, initially as a plugin for WordPress. The project somewhat fell through, since it turns out it's quite hard to filter out stupidity on the internet.
While the main source code may be outdated, the corpus may still be of some use in training modern LLM-based content moderation, or at least as a historical curio of humanity's attempt to stem the tide of stupidity.
I've converted the original MySQL dump into an SQLite database and a CSV file, so it should be easier for anyone interested to give this corpus a shot with recent machine learning advances.
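
For anyone wanting a first look at the SQLite version, here is a minimal sketch using only Python's standard `sqlite3` module. It makes no assumptions about the schema: it just lists every table in the file with its row count. The function name `summarize_tables` and the file name in the usage note are made up for illustration and are not taken from the project.

```python
import sqlite3


def summarize_tables(db_path):
    """Return {table_name: row_count} for every table in a SQLite file.

    Hypothetical helper for a first look at the corpus; it does not
    assume any particular table or column names.
    """
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in catalog of schema objects.
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier so unusual table names still work.
            quoted = '"' + name.replace('"', '""') + '"'
            cursor = conn.execute(f"SELECT COUNT(*) FROM {quoted}")
            counts[name] = cursor.fetchone()[0]
        return counts
    finally:
        conn.close()
```

Something like `print(summarize_tables("corpus.sqlite"))` would then show which tables hold the comments and how large they are, where `corpus.sqlite` stands in for whatever the converted database file is actually called.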