• whotookkarl@lemmy.world
    English
    arrow-up
    10
    arrow-down
    1
    ·
    edit-2
    2 hours ago

    Here’s the cycle we’ve gone through multiple times and are currently in:

    AI winter (low research funding) -> incremental scientific advancement -> breakthrough in capabilities, as multiple incremental advancements to the scientific models build on each other over time (expert systems, LLMs, neural networks, etc) -> engineering creates new tech products/frameworks/services based on the new science -> hype for the new tech creates sales and economic activity, research funding, subsidies etc -> (for LLMs we’re here) people become familiar with the new tech’s capabilities and limitations through use -> hype spending bubble bursts when neither “infinite money, line goes up” growth nor new research breakthroughs keep pace with the overspend -> AI winter -> etc…

    • Semperverus@lemmy.world
      English
      arrow-up
      5
      arrow-down
      10
      ·
      2 hours ago

      I still believe they have the ability to reason in a very limited capacity. Everyone says that they’re just very sophisticated parrots, but there is something emergent going on. These AIs need to have a world-model inside of themselves to be able to parrot things as correctly as they currently do (yes, including the hallucinations and the incorrect answers). Sure, they are using tokens instead of real dictionary words, which comes with things like the strawberry problem, but just because they are not nearly as sophisticated as us doesn’t mean there is no reasoning happening.

      We are not special.

      • galanthus@lemmy.world
        English
        arrow-up
        1
        ·
        9 minutes ago

        If the only thing you feed an AI is words, then how would it possibly understand what these words mean if it does not have access to the things the words are referring to?

        If it does not know the meaning of words, then what can it do but find patterns in the ways they are used?

        This is a shitpost.

        We are special, I am in any case.

        • xthexder@l.sw0.com
          English
          arrow-up
          7
          ·
          edit-2
          2 hours ago

          I think the strawberry problem is to ask it how many R’s are in strawberry. Current AI gets it wrong almost every time.
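
          A rough sketch of why that question is awkward for a token-based model, assuming a made-up subword split (the `hypothetical_tokens` list below is purely illustrative, not the output of any real tokenizer). Counting letters is trivial when you can see characters, but a model working on tokens never sees the letters directly:

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a letter in a word."""
    return word.lower().count(letter.lower())


word = "strawberry"

# Hypothetical subword split, for illustration only.
# Real tokenizers may split this word differently.
hypothetical_tokens = ["str", "aw", "berry"]

# Character-level view: the answer is easy to compute.
print(count_letter(word, "r"))  # 3

# Token-level view: the R's are spread across opaque chunks,
# which the model handles as IDs rather than as spelled-out letters.
print([count_letter(tok, "r") for tok in hypothetical_tokens])  # [1, 0, 2]
```

          The point is only that the letter count lives below the level the model operates on, so it has to be inferred rather than read off.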

  • N0body@lemmy.dbzer0.com
    English
    arrow-up
    40
    ·
    6 hours ago

    The tested LLMs fared much worse, though, when the Apple researchers modified the GSM-Symbolic benchmark by adding “seemingly relevant but ultimately inconsequential statements” to the questions

    Good thing they’re being trained on random posts and comments on the internet, which are known for being succinct and accurate.

    • blind3rdeye@lemm.ee
      English
      arrow-up
      13
      ·
      3 hours ago

      Yeah, especially given that so many popular vegetables are members of the brassica genus

      • MoogleMaestro@lemmy.zip
        English
        arrow-up
        1
        ·
        2 hours ago

        Absolutely. It would be a shame if AI didn’t know that the common maple tree is actually placed in the family cannabaceae.

  • emerald@lemmy.blahaj.zone
    English
    arrow-up
    35
    arrow-down
    2
    ·
    8 hours ago

    statistical engine suggesting words that sound like they’d probably be correct is bad at reasoning

    How can this be??

    • Siegfried@lemmy.world
      English
      arrow-up
      19
      arrow-down
      2
      ·
      7 hours ago

      I would say that if anything, LLMs are showing cracks in our way of reasoning.

      • MoogleMaestro@lemmy.zip
        English
        arrow-up
        5
        ·
        2 hours ago

        Or the problem with tech billionaires selling “magic solutions” to problems that don’t actually exist. Or how people are too gullible in the modern internet to understand when they’re being sold snake oil in the form of “technological advancement” when it’s actually just repackaged plagiarized material.

  • kingthrillgore@lemmy.ml
    English
    arrow-up
    22
    ·
    edit-2
    9 hours ago

    I feel like a draft landed on Tim’s desk a few weeks ago, explains why they suddenly pulled back on OpenAI funding.

    People on the removed superfund birdsite are already saying Apple is missing out on the next revolution.

  • RaoulDook@lemmy.world
    English
    arrow-up
    17
    arrow-down
    1
    ·
    8 hours ago

    I hope this gets circulated enough to reduce the ridiculous amount of investment and energy waste that the ramping-up of “AI” services has brought. All the companies have just gone way too far off the deep end with this shit that most people don’t even want.

    • thanks_shakey_snake@lemmy.ca
      English
      arrow-up
      12
      ·
      7 hours ago

      People working with these technologies have known this for quite a while. It’s nice of Apple’s researchers to formalize it, but nobody is really surprised, least of all the companies funnelling traincars of money into the LLM furnace.

      • jabathekek@sopuli.xyz
        English
        arrow-up
        6
        ·
        3 hours ago

        *starts sweating

        Look at that subtle pixel count, the tasteful colouring… oh my god, it’s even transparent…

    • WhatAmLemmy@lemmy.world
      English
      arrow-up
      64
      arrow-down
      5
      ·
      15 hours ago

      The results of this new GSM-Symbolic paper aren’t completely new in the world of AI research. Other recent papers have similarly suggested that LLMs don’t actually perform formal reasoning and instead mimic it with probabilistic pattern-matching of the closest similar data seen in their vast training sets.

      WTF kind of reporting is this, though? None of this is recent or new at all, like in the slightest. I am shit at math, but have a high-level understanding of statistical modeling concepts, mostly as of a decade ago, and even I knew this. I recall a stats PhD describing models as “stochastic parrots”; nothing more than probabilistic mimicry. It was obviously no different the instant LLMs came on the scene. If only tech journalists bothered to do a superficial amount of research, instead of being spoon-fed spin from tech bros with a profit motive…

      • aesthelete@lemmy.world
        English
        arrow-up
        3
        ·
        3 hours ago

        If only tech journalists bothered to do a superficial amount of research, instead of being spoon fed spin from tech bros with a profit motive…

        This is outrageous! I mean the pure gall of suggesting journalists should be something other than part of a human centipede!

      • jabathekek@sopuli.xyz
        English
        arrow-up
        9
        arrow-down
        2
        ·
        8 hours ago

        describing models as “stochastic parrots”

        That is SUCH a good description.

      • no banana@lemmy.world
        English
        arrow-up
        32
        arrow-down
        2
        ·
        15 hours ago

        It’s written as if they literally expected AI to be self reasoning and not just a mirror of the bullshit that is put into it.

        • Sterile_Technique@lemmy.world
          English
          arrow-up
          26
          arrow-down
          3
          ·
          13 hours ago

          Probably because that’s the common expectation due to calling it “AI”. We’re well past the point of putting the lid back on that can of worms, but we really should have saved that label for… y’know… intelligence, that’s artificial. People think we’ve made an early version of Halo’s Cortana or Star Trek’s Data, and not just a spellchecker on steroids.

          The day we make actual AI is going to be a really confusing one for humanity.

      • fluxion@lemmy.world
        English
        arrow-up
        9
        arrow-down
        4
        ·
        14 hours ago

        Clearly this sort of reporting is not prevalent enough, given how many people think we have actually come up with something new these last few years and aren’t just throwing shitloads of graphics cards and data at statistical models.

  • The Snark Urge@lemmy.world
    English
    arrow-up
    79
    arrow-down
    2
    ·
    edit-2
    18 hours ago

    One time I exposed deep cracks in my calculator’s ability to write words with upside down numbers. I only ever managed to write BOOBS and hELLhOLE.

    LLMs aren’t reasoning. They can do some stuff okay, but they aren’t thinking. Maybe if you had hundreds of them with unique training data all voting on proposals you could get something along the lines of a kind of recognition, but at that point you might as well just simulate cortical columns and try to do Jeff Hawkins’ idea.

      • WldFyre@lemm.ee
        English
        arrow-up
        5
        ·
        5 hours ago

        Did I misremember something, or is my memory easily influenced by external stimuli? No, the Mandela Effect must be real!

        /s

  • CosmoNova@lemmy.world
    English
    arrow-up
    38
    arrow-down
    3
    ·
    17 hours ago

    Are you telling me Apple hasn’t seen through the grift and is approaching this with an open mind just to learn how full of bullshit most of the claims from the likes of Altman are? And now they’re sharing their gruesome discoveries with everyone while they’re unveiling them?

    • WhatAmLemmy@lemmy.world
      English
      arrow-up
      47
      arrow-down
      3
      ·
      16 hours ago

      I would argue that Apple Intelligence™️ is evidence they never bought the grift. It’s very focused on tailored models scoped to the specific tasks that AI does well; creative and non-critical tasks like assisting with text processing/transforming, image generation, photo manipulation.

      The Siri integrations seem more like they’re using the LLM to stitch together the APIs that were already exposed between apps (used by Shortcuts, etc); each having internal logic and validation that’s entirely programmed (and documented) by humans. They market it as a whole lot more, but they market every new product as some significant milestone for mankind … even when it’s a feature that other phones have had for years, but in an iPhone!

      • sinceasdf@lemmy.world
        English
        arrow-up
        8
        arrow-down
        1
        ·
        7 hours ago

        The entirety of “open” ai is complete bullshit. They’re no longer even pretending to be nonprofit at all and there is nothing “open” about them since like 2018.

  • Lvxferre@mander.xyz
    English
    arrow-up
    30
    arrow-down
    1
    ·
    17 hours ago

    The fun part isn’t even what Apple said - that the emperor is naked - but why it’s doing it. It’s a nice bullet against all four of its GAFAM competitors.

    • jherazob@fedia.io
      arrow-up
      24
      arrow-down
      2
      ·
      17 hours ago

      This right here. This isn’t conscientious analysis of tech and intellectual honesty or whatever, it’s a calculated shot at its competitors, who are desperately trying to prevent the generative AI market house of cards from falling.

    • conciselyverbose@sh.itjust.works
      English
      arrow-up
      16
      arrow-down
      1
      ·
      16 hours ago

      They’re a publicly traded company.

      Their executives need something to point to to be able to push back against pressure to jump on the trend.

    • misk@sopuli.xyzOP
      English
      arrow-up
      9
      ·
      13 hours ago

      Given the use cases they were benchmarking I would be very surprised if they were any better.