I think much of Geocities remained accessible until 2013/2014 before going away completely (apart from Geocities Japan, which lasted until 2019 or so).
This isn’t helped by most websites reinventing themselves every couple of years so the old links 404 even though the content still exists.
Poorly thought-out Facebook posts are forever; coverage of city council malfeasance from two years ago, not so much.
If the information was important, wouldn’t it already have been passed around and expanded upon? The Internet is probably 99% junk, at least the posts I’ve made. Only the good stuff like goatse survives.
Problem is, people rarely realize the importance of things until they’re lost. Plenty of posts from the ’90s and 2000s containing valuable insights are probably lost forever. Remember that not everything online is in English, either.
And now, with Google regurgitating a summary of the content they’ve crawled, there will be no incentive to publish, because no one will click through and generate ad revenue for the publisher.
We need to revive the days when people wrote blog posts to help others instead of pushing ads to make money. The content was far better.
This is like the disinvention of the printing press, at least from an archeological perspective.
54% of Wikipedia pages contain at least one link in their “References” section that points to a page that no longer exists.
My impression is that over the last several years Wikipedia has made an effort to stop relying on direct links alone, either supplementing them with dated Web Archive links or using Web Archive links outright (dated as "sourced from this page in this state", which protects against upstream edits too).
This isn’t inherently bad.
Some web pages are extraneous, redundant, or only relevant for a limited period of time. A sign-up page for a concert doesn’t need to exist permanently. Consolidating a large website down to fewer pages that are accessible to everyone is a good thing.
Archiving services that preserve the pages worth saving are how we should retain the history of the web; the original creators don’t necessarily need to maintain an obsolete page indefinitely.
Yes, a lot is lost that could have just continued to exist and archiving is good, but getting rid of clutter is not a bad thing.