Formerly u/CanadaPlus101 on Reddit.

  • Yeah, that was actually an awkward wording, sorry. What I meant is that given any map from the natural numbers to the reals (or between any two infinite sets of mismatched cardinality), there's a way to prove it's not bijective - often the diagonal argument.

    For anyone reading and curious: you take advantage of the fact that you can independently modify the output of the mapping at each input. In this case, a common choice is the nth decimal digit of the real number that the natural number n maps to. By picking a different digit at each of those positions - that is, building a new number that differs from the nth output in at least its nth decimal place - you construct a real number that can't appear anywhere among the outputs, which is a contradiction (bijective means a one-to-one pairing between the two sets, with nothing left over on either side). There's a compact version of this written out at the end of the comment.

    Actually, you can go even stronger, and do this for surjective functions. All bijective maps are surjective, but a surjective function is allowed to send two or more inputs to the same output, as long as every output still gets hit. At that point, you can literally just define "A is a smaller set than B" as meaning there's no surjection from A onto B. That definition agrees with the usual ordering on finite sets, so why not?
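
    To make that concrete, here's the standard construction written out, with f standing in for whatever map you were handed; the digits 4 and 5 are an arbitrary choice that dodges the 0.4999… = 0.5000… ambiguity:

        Given any $f\colon \mathbb{N}\to\mathbb{R}$, let $d_n$ be the $n$th decimal digit of $f(n)$, and define
        \[
          x = \sum_{n=1}^{\infty} \frac{c_n}{10^{n}},
          \qquad
          c_n =
          \begin{cases}
            5 & \text{if } d_n \neq 5,\\
            4 & \text{if } d_n = 5.
          \end{cases}
        \]
        Then $x$ differs from $f(n)$ in the $n$th decimal place for every $n$, so $x$ never shows up among the outputs of $f$, and $f$ can't be surjective (let alone bijective).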

  • Ah, looks like you beat my edit by a few seconds.

    Good to know about the Netscape thing. It looks like Firefox (still a descendant of Netscape) does it that way, and Chrome can be made to as well. If you're using a true third option, you probably don't need my help.

    For the sake of completeness, on Tor Browser you have to copy the SQLite cookie database out of the browser directory, since it's too locked down to export cookies the normal way. Then I'd try dropping it into an offline Firefox instance and proceeding as usual from there. And obviously, run wget through torsocks as well. Rough sketch below.
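
    Roughly like this, with heavy caveats: the Tor Browser profile path varies by install and OS, the Firefox profile path and cookies.txt name are placeholders, and so is the URL.

        # Tor Browser keeps its cookie database inside the bundled profile
        # (exact location varies by install and OS)
        cp tor-browser/Browser/TorBrowser/Data/Browser/profile.default/cookies.sqlite \
           /path/to/offline-firefox-profile/cookies.sqlite

        # export a Netscape-format cookies.txt from that Firefox profile, then:
        torsocks wget --load-cookies cookies.txt \
            --mirror --no-parent --wait=2 \
            https://example.com/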


  • I find that the things most likely to disappear (like a tinkerer’s web 1.0 homepage) tend to have limited recursion depth anyway.

    A Tumblr blog took an awfully long time to crawl politely, IIRC, but the end result wasn't too big on disk. Now I'm wondering how you would pass a cookie to wget, and how you might set a data cap so you can stop and wait for the month to be up before calling it again. I kind of feel like I've passed a cookie before to get around a captcha or something…

    Edit: There’s a couple of ideas for limiting size on StackOverflow. The wget specific one is -Q for quota, which you’d want to set conservatively in case there’s one huge file somewhere, since it only checks between individual downloads.

    Looks like there’s a --load-cookies option that will read a browser export of cookies from a file, as well as load POST data and save cookie options if you want to do something interactive that way.

    Edit edit: What I’m remembering is actually adding headers, like this.
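
    Putting those together, purely as a sketch - the cookie name, the quota value, and the URL are all placeholders you'd swap for real ones:

        # Using a Netscape-format cookie export; -Q/--quota caps total download
        # size, but it's only checked between files, so keep it conservative.
        wget --load-cookies cookies.txt --quota=500m \
            --mirror --no-parent --wait=1 \
            https://example.tumblr.com/

        # Or skip the export and pass the cookie as a raw header instead:
        wget --header="Cookie: sessionid=PASTE_VALUE_HERE" --quota=500m \
            --mirror --no-parent --wait=1 \
            https://example.tumblr.com/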