In an age of LLMs, is it time to reconsider human-edited web directories?

Back in the early-to-mid '90s, one of the main ways of finding anything on the web was to browse through a web directory.

Yahoo, of course, started out as a web directory of this sort, and portals such as Lycos and Excite had directory sections of their own.

These directories generally had a list of categories on their front page: News, Sport, Entertainment, Arts, Technology, and so on.

Each of those categories had subcategories, and sub-subcategories that you clicked through until you got to a list of websites. These lists were maintained by actual humans.
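To make that structure concrete, here is a minimal sketch of a directory as a tree of categories, each holding a hand-maintained list of sites. The `Category` class, its method names, and the sample entry are my own illustration, not how any particular directory actually stored its data.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """One node in a hypothetical directory tree."""
    name: str
    sites: list[str] = field(default_factory=list)                # curated URLs at this level
    children: dict[str, "Category"] = field(default_factory=dict)  # subcategories by name

    def add_site(self, path: list[str], url: str) -> None:
        """File a URL under a category path, creating subcategories as needed."""
        node = self
        for part in path:
            node = node.children.setdefault(part, Category(part))
        node.sites.append(url)

    def browse(self, path: list[str]) -> "Category":
        """Click through the given path, one subcategory at a time."""
        node = self
        for part in path:
            node = node.children[part]  # raises KeyError if the category does not exist
        return node


root = Category("Home")
root.add_site(["Technology", "Programming", "Python"], "https://www.python.org/")
print(root.browse(["Technology", "Programming", "Python"]).sites)
```

Browsing is just walking down the tree; the human effort goes into deciding what gets filed where.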

Typically, these directories also offered a limited web search that crawled only the pages of websites listed in the directory.

By the late '90s, the standard narrative goes, the web got too big to index websites manually.

Google promised the world its algorithms would weed out the spam automatically.

And for a time, it worked.

But then SEO and SEM became a multi-billion-dollar industry. The spambots proliferated. Google itself began promoting its own content and advertisers above search results.

And now with LLMs, the industrial-scale spamming of the web is likely to grow exponentially.

My question is, if a lot of the web is turning to crap, do we even want to search the entire web anymore?

At some point, does it become more desirable to go back to search engines that only crawl pages on human-curated lists of websites?

And is it time to begin considering what a modern version of those early web directories might look like?
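As a rough sketch of what "only crawl pages on human-curated lists" could mean in practice, a crawler might drop any link whose host is not in the directory before it ever fetches it. Everything here is an assumption for illustration: the `CURATED_HOSTS` set, the example domains, and the function names are hypothetical, and a real crawler would also need robots.txt handling, rate limiting, and so on.

```python
from urllib.parse import urlsplit

# Hypothetical set of hosts that human editors have approved.
CURATED_HOSTS = {"example.org", "docs.example.net"}


def in_directory(url: str, curated_hosts: set[str] = CURATED_HOSTS) -> bool:
    """Return True if the URL is on a curated host or one of its subdomains."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    # Match the host exactly, or as a subdomain (note the leading dot).
    return any(host == h or host.endswith("." + h) for h in curated_hosts)


def filter_links(links: list[str], curated_hosts: set[str] = CURATED_HOSTS) -> list[str]:
    """Keep only the links a directory-restricted crawler would follow."""
    return [url for url in links if in_directory(url, curated_hosts)]


found = [
    "https://example.org/article",
    "https://blog.example.org/post",
    "https://spam.invalid/buy-now",
]
print(filter_links(found))
```

The point of the design is that the crawl frontier can never grow beyond what editors have signed off on, so spam sites stay out of the index unless someone adds them by hand.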

@degoogle #tech #google #web #internet

  • Albert Cardona@mathstodon.xyz
    8 months ago

    @ajsadauskas @degoogle

    And just now, as seen at the bottom of a blog post:

    “Post a Comment
    Unfortunately because of spam with embedded links (which then flag up warnings about the whole site on some browsers), I have to personally moderate all comments. As a result, your comment may not appear for some time. In addition, I cannot publish comments with links to websites because it takes too much time to check whether these sites are legitimate.”