• Masimatutu@lemm.eeOP
    1 year ago

    I’d say roughly 1,000 to 100,000, depending on format.

    Edit: Raw ASCII (7-bit) could give you up to ~half a million.

    Edit 2: According to Randall Munroe (too lazy to find the source), you could theoretically store one letter per bit. That would give us up to two million books.

    • Sotuanduso@lemm.ee
      1 year ago

      One letter per bit? You’d need some crazy effective compression algorithm for that, because a bit is 1 or 0. Did you mean byte?

      • AdrianTheFrog@lemmy.world
        1 year ago

        UTF-8 and ASCII are normally already 1 character per byte. With great file compression, you could probably reach 2 characters per byte, or one every 4 bits. One character every bit is probably impossible. Maybe with some sort of AI file compression, using an AI’s knowledge of the English language to predict the message.

        Edit: Wow, apparently that already exists, and it can achieve an even higher compression ratio, almost 10:1 (on 1 GB of 8-bit UTF-8 text from Wikipedia)! bellard.org/nncp/
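A rough way to check the "characters per byte" intuition yourself is to measure bits per character before and after compression. This is only a sketch: it uses Python's built-in `lzma` as a stand-in general-purpose compressor (not NNCP), the helper name `bits_per_character` is made up for illustration, and the repetitive sample text compresses far better than a real book would.

```python
import lzma


def bits_per_character(text: str, compressed: bool) -> float:
    """Return the average number of bits used per character of `text`."""
    raw = text.encode("utf-8")
    # Stand-in compressor: lzma from the standard library, not NNCP.
    stored = lzma.compress(raw) if compressed else raw
    return len(stored) * 8 / len(text)


# Hypothetical sample; swap in the text of a real book for a meaningful number.
sample = "It was the best of times, it was the worst of times. " * 200

raw_bpc = bits_per_character(sample, compressed=False)
packed_bpc = bits_per_character(sample, compressed=True)
print(f"raw: {raw_bpc:.2f} bits/char, lzma: {packed_bpc:.2f} bits/char")
```

Plain ASCII text comes out at exactly 8 bits per character uncompressed; 2 characters per byte corresponds to 4 bits per character, and a 10:1 ratio to roughly 0.8.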

        If an average book has 70k 5-character words (350k characters), this could compress it to around 303 kilobits (about 38 kB), meaning you could fit over 1.6 million books in 64 GB.

        You can get a 2 TB SSD for around $70. With this compression scheme you could fit 52 million books on it.
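The arithmetic behind those figures can be sketched as below. The book size (70k words of 5 characters) and the roughly 303 kilobits per compressed book come from the comment; treating GB and TB as decimal units (10^9 and 10^12 bytes) is an assumption, and the function name `books_that_fit` is made up for illustration.

```python
# Figures taken from the comment above.
WORDS_PER_BOOK = 70_000
CHARS_PER_WORD = 5
COMPRESSED_BITS_PER_BOOK = 303_000  # about 303 kilobits, i.e. roughly 38 kB

# Assumption: drive capacities are decimal (1 GB = 10**9 bytes).
GB = 10**9
TB = 10**12


def books_that_fit(capacity_bytes: int, bits_per_book: int = COMPRESSED_BITS_PER_BOOK) -> int:
    """Number of whole compressed books that fit in the given capacity."""
    return (capacity_bytes * 8) // bits_per_book


chars_per_book = WORDS_PER_BOOK * CHARS_PER_WORD
bits_per_char = COMPRESSED_BITS_PER_BOOK / chars_per_book

print(f"{chars_per_book:,} characters per book, {bits_per_char:.2f} bits per character")
print(f"64 GB: {books_that_fit(64 * GB):,} books")
print(f"2 TB:  {books_that_fit(2 * TB):,} books")
```

Under these assumptions the compressed text works out to a little under one bit per character, which is where the "one letter per bit" ballpark comes from.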

        I’m not sure if I’ve interpreted the speed data right, but it looks like it would take around a minute to decode each book on a 3090. It would take about a year to encode all of the books on the 2 TB SSD if you used 50 A100s (~$9000 each). You could also use 100 3090s (~$1000 each) to achieve around the same speed.

        52 million books is around the number of books written in the past 20 years, worldwide. All stored for $70 (+$100k of graphics cards)

        • Sotuanduso@lemm.ee
          1 year ago

          There’s something comical about the low low price of $70 (+$100k of graphics cards) still leaving out the year of time it will take.

          • Cicraft@lemmy.world
            1 year ago

            Well, I guess you could sacrifice a portion for an index system and just decode the one you’re trying to read.
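One way to picture the index idea: compress each book separately, pack the results into one blob, and keep a small table of offsets so a single book can be decoded on its own. This is a minimal sketch, not how NNCP stores anything. It uses `zlib` as a stand-in compressor, and `build_archive` and `read_book` are hypothetical names.

```python
import zlib


def build_archive(books: dict[str, str]) -> tuple[bytes, dict[str, tuple[int, int]]]:
    """Compress each book on its own and record where it lands in the blob."""
    blob = bytearray()
    index: dict[str, tuple[int, int]] = {}
    for title, text in books.items():
        compressed = zlib.compress(text.encode("utf-8"), level=9)
        index[title] = (len(blob), len(compressed))  # (offset, length)
        blob += compressed
    return bytes(blob), index


def read_book(blob: bytes, index: dict[str, tuple[int, int]], title: str) -> str:
    """Decode only the requested book, leaving the rest of the blob untouched."""
    offset, length = index[title]
    return zlib.decompress(blob[offset:offset + length]).decode("utf-8")


# Hypothetical two-book library for illustration.
library = {
    "Book A": "Call me Ishmael. " * 50,
    "Book B": "It is a truth universally acknowledged. " * 50,
}
blob, index = build_archive(library)
print(read_book(blob, index, "Book B")[:40])
```

The trade-off is that compressing books independently gives up whatever the compressor could have learned across books, and the index itself takes a little space, which is the "sacrificed portion".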