from Hacker News

Ask HN: Is the ClueWeb Dataset Useful?

by agencies on 7/5/21, 2:02 AM with 0 comments

A handful of places are working to make web search better (for any value of better). The ClueWeb dataset (e.g. [1]) has been around for quite a while, but is it actually useful for any real-world purposes?

Common Crawl is bigger, but I've also read that it still isn't good enough to use as input for good web search.

Did the TREC Web Track produce anything useful?

[1] http://www-personal.umich.edu/~kevynct/trec-web-2014/