by agencies on 9/19/21, 4:28 AM with 0 comments
For starters, we need a communal dataset of 50-100 million URLs, with their page data and metadata, that's "good" (for any value of good) to help people experiment with web search tech.
We need more Kaggle competitions to explore better summarization, reader-mode text extraction, boilerplate removal, etc. What other competitions would help web search?
How can we foster a sustainable community, and the projects around it, to create a bazaar of web search engines?