OpenAI, a non-profit AI company that will lose anywhere from $4 billion to $5 billion this year, will at some point in the next six or so months convert into a for-profit AI company, at which point it will continue to lose money in exactly the same way. Shortly after
I pooh-poohed ChatGPT when it first came out so I gave it another crack at a technical issue I’ve been avoiding.
Gave me an outdated answer.
Gave me another outdated answer to a URL that doesn’t exist.
Gave me the answer I told it won’t work in the initial prompt.
Scolded me for swearing at it.
This is what’s supposed to replace search engines?
Then when you ask it to “provide sources”, it simply says “Source: Tech Review Websites”. If this came from an actual person I would genuinely ask them “do you take me for gullible trash?”.
It’s still somewhat useful, due to Google Search crumbling away into nothingness, if you ask “link me five sites with info about [topic]”.
Your experience highlights what current iterations of LLMs are not well suited for, so if that’s what you were hoping to achieve, I understand why you were left wanting or disillusioned.
There’s a lot of things that LLMs are really good at, or incredibly useful for, such as ingesting large bodies of text, and then analyzing them based on your ability to create well thought out prompts.
This can save you hours and hours of reading time, and it’s something you can verify relatively quickly to double-check the accuracy of the LLM’s response.
They’re also good at something Google used to be good at, but sucks at now: letting you describe a process, simple or complicated, short or long, that you either can’t recall the name of or aren’t even sure what it’s called, and telling you exactly what it is. Also easily verifiable.
There’s plenty of other things too, but just remember that they are tools, not magic, or sentient intelligence.
The models are not real time, but there are tricks to figure out a model’s most recent date of ingestion, such as asking topical entertainment or news questions. Just don’t go looking for real-time information.
Also, I have yet to find a model that can provide an actual URL and specific source for anything it generates, which is why it’s a good practice to use them to do tasks, or get information, that would take you longer to do, or get, manually, but that can be easily verified once you receive it.
And full self driving is also still coming! promise!
I mean, it probably will eventually, but that has nothing to do with LLMs, nor is it a technology that I want to exist.
I can definitely see a world where lobbyists for automakers and insurance companies create such a financial and regulatory burden that only the wealthy can afford to drive their own cars, if they choose to, whereas everyone else must rent or lease their self-driving car as if it’s an IaaS or SaaS subscription.
But none of that has anything to do with using LLMs for the tasks they can accomplish, or telling people to stop bitching about them not being able to complete the tasks they aren’t good at, or even capable of.
I was saying that this is investment money wasted on an empty promise. Like the full self driving feature.
Who’s talking about investing…? I’ve exclusively been talking about what LLMs can do now, today, for free (aside from energy costs).
None of what you’re throwing out there has anything to do with what’s being discussed here. It’s a red herring.
Are you living under a rock?
Both OpenAI’s LLM development and Tesla’s FSD projects have been given billions in investment. Both, as far as we can tell, are empty promises.
No, I’m living in this thread. I’m talking about very specific issues related to LLMs, that I’ve highlighted ad nauseam.
Reread if you’re confused.
If anything, it shows that you believe in the concept of “AI” way more than I do, as you’re conflating LLM and FSD.
I don’t believe in AI, it doesn’t exist. Just specific advanced machine learning algorithms, some better than others, and some all smoke and mirrors. But here, now, I’m talking about LLMs.
That’s the story people tell at least. The weasel phrase at the end is fun, I guess. Leaves a massive backdoor excuse when it doesn’t actually work.
But in practice, LLMs are falling down even at this job. They seem to have some use in academic qualitative coding, but for summarizing novel or extended bodies of text, they struggle to actually tell people what they want to know.
Most people do not give a shit if text contains a reference to X. And if they do, they can generally just CTRL+F “X”.
Weasel phrase? You mean the fact that I don’t treat them like they’re actual AI, but just a tool that needs to be used properly, monitored, and verified?
There’s a reason why I never call them AI, because they’re not. They’re just advanced machine learning tools, and just like I keep a steady hand when using a table saw, I only use LLMs for tasks where they can help me do something faster and where it’s easy to verify they did it right.
And as someone who has been using them very regularly, I feel confident in saying that. It’s not a weasel phrase, I’m not trying to sell anyone snake oil about what they can actually do, and I acknowledge that they’re an oversold and overhyped means of cooking the planet faster, so it’s not like I would be mad if they were banned tomorrow, but until then, I will keep using them in ways that are actually fruitful.
But sure, if all you need to do is find one word in a single body of text, that’s not really a good use of an LLM, but that wasn’t what I was talking about.
If I need examples of various legal or ethical concerns documented in one, or multiple, pieces of writing, or other conceptual topics, I can give it a list, and then ask it to highlight all examples of those issues, and include the verbatim text where they’re present. I can then give that same task to multiple different LLMs, with the same prompts, and a task that would have taken me hours to complete takes me 30 to 45 minutes, including the time it takes me to give it a quick read-through to see if anything was missed. But yeah, that requires a well crafted prompt, and it’s not infallible.
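The “easy to verify” part of that workflow can be partly automated. Below is a rough sketch, assuming each model hands back a plain list of passages it claims are verbatim; the function names (`check_quotes`, `merge_findings`) and the model labels are made up for illustration, and nothing here calls a real LLM API. It only confirms that a quoted passage really exists in the source text. Whether the passage is actually relevant to the issue still needs a human read-through.

```python
import re


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so line breaks and casing don't cause false misses."""
    return re.sub(r"\s+", " ", text).strip().lower()


def check_quotes(source: str, quotes: list[str]) -> dict[str, bool]:
    """Map each supposedly verbatim quote to whether it really appears in the source."""
    haystack = normalize(source)
    return {quote: normalize(quote) in haystack for quote in quotes}


def merge_findings(per_model: dict[str, list[str]], source: str) -> dict[str, list[str]]:
    """Keep only quotes found in the source, and record which models reported each one."""
    merged: dict[str, list[str]] = {}
    for model, quotes in per_model.items():
        for quote, found in check_quotes(source, quotes).items():
            if found:
                merged.setdefault(quote, []).append(model)
    return merged


if __name__ == "__main__":
    document = "The vendor may share  user data with partners. Users cannot opt out."
    # Hypothetical outputs from two different models given the same prompt.
    reported = {
        "model_a": ["Users cannot opt out."],
        "model_b": ["Users cannot opt out.", "Users must consent in writing."],
    }
    for quote, models in merge_findings(reported, document).items():
        print(f"{quote!r} reported by {models}")
```

Quotes that fail the check are the ones to look at first, since a made-up “verbatim” passage is exactly the failure mode being guarded against.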
Have you tried Llama? If so, is it useful according to your criteria?
Llama is the model I use most often, followed by ChatGPT and Claude.
Others as well, but yes, it is incredibly helpful for the tasks I use it for.
Self-hosted?
Yes and no, I have self-hosted models on one of my Linux boxes, but even with a relatively modern 70 series Nvidia GPU, it’s still faster to use free non-local services like ChatGPT or DDG.
My rule of thumb for SaaS LLMs is to never enter in any data that I wouldn’t also be willing to upload cleartext to Google Drive or OneDrive.
Sometimes that means modifying text before submitting it, and other times having to rely entirely on self-hosted tools.
“You’ll fucking know when I’m swearing at you,” was my reply to that shit the last time I gave it a spin (after it regurgitated nonsense after many prompts specifically asking for not nonsense).
Garbage in garbage out. You give a shit prompt, you generally get a shit answer.
If it doesn’t know how to answer a shitty question, it shouldn’t try to BS the answer.
No answer is better than a wrong answer delivered confidently.
GIGO.
No, this is a problem of bad error handling for queries it cannot answer.
A search engine would give empty results instead of hallucinating.
What error? It gave you a string of tokens that seemed likely according to its training data. That’s all it does.
If you ask it what color is the sky, it will tell you it’s blue not because it knows that’s true, but because these words “fit together”. Pretty much the only way to avoid this issue is to put some kind of filter in front of the LLM which will try to catch prompts that are known to produce unwanted results, and silently replace your prompt with something like “say: sorry, I don’t know”.
I’m being very reductive here, but that’s the principle of how these things work - the LLMs are not capable of determining the truthfulness of their responses.
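To make the “filter in front of the LLM” idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the pattern list, the `filter_prompt` and `guarded_ask` names, and the `llm` callable, which stands in for whatever model is actually being used. No vendor is known to implement it this way.

```python
import re
from typing import Callable

# Replacement prompt used when a query is known to produce unwanted output.
FALLBACK_PROMPT = "say: sorry, I don't know"

# Hypothetical patterns; a real deployment would need a far larger, curated list.
BLOCKED_PATTERNS = [
    re.compile(r"\bprovide sources\b", re.IGNORECASE),
    re.compile(r"\b(today|latest|right now)\b", re.IGNORECASE),
]


def filter_prompt(prompt: str) -> str:
    """Silently swap the prompt for the fallback if it matches a known-bad pattern."""
    if any(pattern.search(prompt) for pattern in BLOCKED_PATTERNS):
        return FALLBACK_PROMPT
    return prompt


def guarded_ask(prompt: str, llm: Callable[[str], str]) -> str:
    """Send the (possibly replaced) prompt to the model and return its reply."""
    return llm(filter_prompt(prompt))


if __name__ == "__main__":
    def echo_model(prompt: str) -> str:
        # Stand-in for a real model: just reports what it was asked.
        return f"[model received] {prompt}"

    print(guarded_ask("Explain quicksort", echo_model))
    print(guarded_ask("What is the latest kernel version?", echo_model))
```

The filter only catches prompts someone has already flagged, so it does nothing for the underlying problem: the model still has no way to judge whether its own answer is true.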
You’re entirely correct, but in theory they can give it a pretty good go, it just requires a lot more computation, developer time, and non-LLM data structures than these companies are willing to spend money on. For any single query, they’d have to get dozens if not hundreds of separate responses from additional LLM instances spun up on the side, many of which would be customized for specific subjects, as well as specialty engines such as Wolfram Alpha for anything directly requiring math.
LLMs in such a system would be used only as modules in a handcrafted algorithm, modules which do exactly what they’re good at in a way that is useful. To give an example, if you pass a specific context to an LLM with the right format of instructions, and then ask it a yes-or-no question, even very small and lightweight models often give the same answer a human would. Like this, human-readable text can be converted into binary switches for an algorithmic state machine with thousands of branches of pre-written logic.
Not only would this probably use an even more insane amount of electricity than the current approach of “build a huge LLM and let it handle everything directly”, it would take much longer to generate responses to novel queries.
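The yes-or-no trick described above might look roughly like this. It is only a sketch under stated assumptions: `llm` is a placeholder callable for some small model, the prompt wording is invented, and the branch names (`math_engine`, `general_model`, `refuse`) are hypothetical stand-ins for the pre-written logic.

```python
from typing import Callable, Optional


def ask_yes_no(llm: Callable[[str], str], context: str, question: str) -> Optional[bool]:
    """Turn a free-text model reply into True, False, or None when it is unusable."""
    prompt = (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer with exactly one word: yes or no."
    )
    reply = llm(prompt).strip().lower().rstrip(".!")
    if reply == "yes":
        return True
    if reply == "no":
        return False
    return None  # The model did not follow the format; let the caller decide.


def route(llm: Callable[[str], str], query: str) -> str:
    """One branch of a handwritten state machine, driven by the model's binary answer."""
    needs_math = ask_yes_no(llm, query, "Does answering this require arithmetic?")
    if needs_math is None:
        return "refuse"
    if needs_math:
        return "math_engine"  # e.g. hand the query to a specialty engine
    return "general_model"


if __name__ == "__main__":
    def always_yes(prompt: str) -> str:
        # Stand-in for a small model that happens to answer "Yes."
        return "Yes."

    print(route(always_yes, "What is 17 * 23?"))
```

Treating an off-format reply as `None` rather than guessing is the point of the design: the surrounding algorithm, not the model, decides what happens when the answer is unclear.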
I hope something better comes along because Google ruined their search engine a decade ago. stract.com is probably the closest to what Google used to be.
As for ChatGPT, it is not an index. It cannot refer you back to information it was trained on because it doesn’t build a massive indexed internet database.
It has some method of probable relations and conglomeration of input. It is why it “hallucinates” information output, because it doesn’t “know” what is wrong or right info, it just fetches data based on probabilities of connections.
It is good at suggesting new music or movies based on your list of media you like, but it is terrible with actual factual info
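A toy way to see the “probabilities of connections” point: the sketch below counts which word follows which in a tiny made-up corpus and then always picks the most frequent follower. Real LLMs are neural networks working on tokens and are vastly more elaborate, so treat this as a deliberately crude analogy, with hypothetical function names, not a description of how ChatGPT is built.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus: str) -> dict[str, Counter]:
    """Count how often each word is followed by each other word."""
    counts: dict[str, Counter] = defaultdict(Counter)
    words = corpus.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def most_likely_next(counts: dict[str, Counter], word: str):
    """Return the most frequent follower of a word, or None if the word was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


if __name__ == "__main__":
    # Made-up training text in which the wrong claim is simply more common.
    corpus = "the sky is green the sky is green the sky is blue"
    model = train_bigrams(corpus)
    print("the sky is", most_likely_next(model, "is"))
```

The model completes the sentence with whatever it saw most often, and nothing in it records where a word pair came from, which is why it cannot point back to a source.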