Reasoning failures highlighted by Apple research on LLMs (appleinsider.com)
Timely_Jellyfish_2077@programming.dev to Technology@lemmy.world · 2 months ago · 59 comments · cross-posted to: technology@lemmit.online
Rimu@piefed.social · 2 months ago (edited)
I tried it myself (changing the name and changing the values) but lost interest after 3 attempts and always getting the right answer:
https://chatgpt.com/share/670af65d-da08-800f-8ad4-c67782ee5477
https://chatgpt.com/share/670af672-45dc-800f-ac91-cc2811fa89c7
https://chatgpt.com/share/6709e80b-e5a8-800f-90d0-1af3418675ef

A_A@lemmy.world · 2 months ago
Errors from your links like this:
Unable to load conversation 670a…6ed2c

Sorry! I’ve updated my links now.

A_A@lemmy.world · 2 months ago
“… So, Mary has 190 kiwifruit.”
nice 😋🥝

tinsukE@lemmy.world · 2 months ago
I wouldn’t doubt that LLMs got some special input to deal with the specific examples of this paper, or similar enough.

alienanimals@lemmy.world · 2 months ago
This is just improving LLMs, but with more steps.