The words you are reading have not been produced by Generative AI. They’re entirely my own.
The role of Generative AI
The only parts of what you’re reading that Generative AI has played a role in are the punctuation and the paragraphs, as well as the headings.
Challenges for an academic
I have to write a lot for my job; I’m an academic, and I’ve been trying to find a way to make ChatGPT be useful for my work. Unfortunately, it’s not really been useful at all. It’s useless as a way to find references, except for the most common things, which I could just Google anyway. It’s really bad within my field and just generates hallucinations about every topic I ask it about.
The limited utility in writing
The generative features are useful for creative applications, like playing Dungeons and Dragons, where accuracy isn’t important. But when I’m writing a formal email to my boss or a student, the last thing I want is ChatGPT’s pretty awful style, leading to all sorts of social awkwardness. So, I had more or less consigned ChatGPT to a dusty shelf of my digital life.
A glimmer of potential
However, it’s a new technology, and I figured there must be something useful about it. Certainly, people have found it useful for summarising articles, and it isn’t too bad for it. But for writing, that’s not very useful. Summarising what you’ve already written after you’ve written it, while marginally helpful, doesn’t actually help with the writing part.
The discovery of WhisperAI
However, I was messing around with the mobile application and noticed that it has a speech-to-text feature. It’s not well signposted, and this feature isn’t available on the web application at all, but it’s not actually using your phone’s built-in speech-to-text. Instead, it uses OpenAI’s own speech-to-text called WhisperAI.
Harnessing the power of WhisperAI
WhisperAI can be broadly thought of as ChatGPT for speech-to-text. It’s pretty good and can cope with people speaking quickly, as well as handling large pauses and awkwardness. I’ve used it to write this article, and this article isn’t exactly short, and it only took me a few minutes.
The technique and its limitations
Now, the way you use this technique is pretty straightforward. You say to ChatGPT, “Hey, I’d like you to split the following text into paragraphs and don’t change the content.” It’s really important you say that second part because otherwise, ChatGPT starts hallucinating about what you said, and it can become a bit of a problem. This is also an issue if you try putting in too much at once. I found I can get to about 10 minutes before ChatGPT either cuts off my content or starts hallucinating about what I actually said.
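For anyone who would rather script this than paste by hand, here is a minimal sketch of the two safeguards described above: keeping each chunk short, and checking that what comes back still contains exactly the words that went in. Everything here is an assumption for illustration: the function names are made up, the 1,200-word limit is just roughly ten minutes of speech, and no actual ChatGPT call is made. The check only suits the paragraphs-only version, since added headings would count as changed content.

```python
import re

# The instruction from the post, placed in front of each chunk of transcript.
PROMPT = "Split the following text into paragraphs and don't change the content.\n\n"


def chunk_words(transcript: str, max_words: int = 1200) -> list[str]:
    """Cut a transcript into pieces of at most max_words words (assumed limit)."""
    words = transcript.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_prompt(chunk: str) -> str:
    """Prepend the paragraph-splitting instruction to one chunk."""
    return PROMPT + chunk


def normalise(text: str) -> list[str]:
    """Lowercase the text and keep only runs of ASCII letters and digits,
    so whitespace, punctuation, and paragraph breaks are ignored."""
    return re.findall(r"[a-z0-9]+", text.lower())


def content_unchanged(original: str, returned: str) -> bool:
    """True if the returned text has the same words in the same order."""
    return normalise(original) == normalise(returned)
```

If `content_unchanged` comes back false for a chunk, that chunk was cut off or reworded and is worth redoing with less text.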
The efficiency of the method
But that’s fine. Speaking for about 10 minutes straight about a topic is still around 1,200 words if you speak at 120 words per minute, as is relatively common. And this is much faster than writing by hand is. Typing, the average typing speed is about 40 words per minute. Usually, up to around 100 words per minute is not the strict upper limit but where you start getting diminishing returns with practice.
The reality of writing speed
However, I think we all know that writing, it’s just not possible to write at 100 words per minute. It’s much more common for us to write at speeds more like 20 words per minute. For myself, it’s generally 14, or even less if it’s a piece of serious technical work.
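To put the speeds quoted above side by side, here is a quick back-of-the-envelope calculation. The rates are the ones mentioned in the post; the labels and the function name are just for illustration, and the model (words divided by words per minute) ignores pauses and editing time.

```python
def minutes_to_draft(words: int, words_per_minute: float) -> float:
    """Time needed to produce a given number of words at a steady rate."""
    return words / words_per_minute


# Ten minutes of speech at 120 words per minute.
TARGET_WORDS = 10 * 120  # 1,200 words

# Rates quoted in the post; the labels are illustrative.
rates = {
    "dictating": 120,
    "typing (average speed)": 40,
    "composing while writing": 20,
    "serious technical work": 14,
}

for label, wpm in rates.items():
    print(f"{label}: {minutes_to_draft(TARGET_WORDS, wpm):.0f} min")
# dictating: 10 min
# typing (average speed): 30 min
# composing while writing: 60 min
# serious technical work: 86 min
```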
Unrivaled first draft generation
Admittedly, using ChatGPT as fancy dictation isn’t really going to solve the problem of composing very exact sentences. However, as a way to generate a first draft, I think it’s completely unrivaled. You can talk through what you want to write, outline the details, say some phrases that can act as placeholders for figures or equations, and there you go.
Revolutionizing the writing process
You have your first draft ready, and it makes it viable to actually do a draft of a really long report in under an hour, and then spend the rest of your time tightening up each of the sections with the bulk of the words already written for you and the structure already there. Admittedly, your mileage may vary.
A personal advantage
I do a lot of teaching and a lot of talking in my job, and I find that a lot easier. I’m also neurodivergent, so having a really short format helps, and being able to speak really helps me with my writing.
Seeking feedback
I’m really curious to see what people think of this article. I’ve endeavored not to edit it at all, so this is just the first draft of how it came out of my mouth. I really want to know how readable you think this is. Obviously, there might be some inaccuracies; please feel free to point them out where there are strange words. I’d love to hear if anyone is interested in trying this out for their work. I’ve only been messing around with this for a week, but honestly, it’s been a game changer. I’ve suddenly looked to my colleagues like I’m some kind of super prolific writer, which isn’t quite the case. Thanks for reading, and I’ll look forward to hearing your thoughts.
(Edit after dictation/processing: the above is 898 words and took about 8min 30s to dictate, i.e. ~105 WPM.)
my main problem with ChatGPT output is that I now know what ChatGPT style reads like, and my eyes slide off it as I immediately assume that nuance at sentence construction level is noise rather than any attempt to communicate meaning.
I should probably ask it to write like me to see if I was fed into its training data.
That’s the point of just using it to organise dictation like this instead of asking it to generate output. The headings are ChatGPT but the rest is just the words I said aloud transcribed by Whisper AI.
I suppose my big annoying soapbox opinion is one I’ve had with every other accidentally useful application of ChatGPT - why the hell are we using LLMs to do this? Splitting text into paragraphs can surely be done with a much simpler NLP model (i.e. one that doesn’t require a GPU per user), and it’s not like speech-to-text is new.
I imagine you could use a simple model with ‘good enough’ accuracy, and then have some basic keybinds to very quickly fix up the rest of it. Goto-next-paragraph, goto-previous-paragraph, move-sentence-forward, and move-sentence-backwards would do the job.
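As a rough idea of what that could look like, here is a sketch with a deliberately dumb first pass (a fixed number of sentences per paragraph) plus two of the fix-up operations. This is one possible reading of the keybinds, not an existing tool: the function names, the regex sentence splitter, and the choice to move the sentence at the paragraph boundary are all assumptions.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def rough_paragraphs(text: str, per_paragraph: int = 3) -> list[list[str]]:
    """'Good enough' first pass: a fixed number of sentences per paragraph."""
    sentences = split_sentences(text)
    return [sentences[i:i + per_paragraph] for i in range(0, len(sentences), per_paragraph)]


def move_sentence_forward(paragraphs: list[list[str]], index: int) -> list[list[str]]:
    """Move the last sentence of paragraph `index` to the start of the next one."""
    paras = [list(p) for p in paragraphs]
    if 0 <= index < len(paras) - 1:
        paras[index + 1].insert(0, paras[index].pop())
    return [p for p in paras if p]  # drop a paragraph that has been emptied


def move_sentence_backward(paragraphs: list[list[str]], index: int) -> list[list[str]]:
    """Move the first sentence of paragraph `index` to the end of the previous one."""
    paras = [list(p) for p in paragraphs]
    if 0 < index < len(paras):
        paras[index - 1].append(paras[index].pop(0))
    return [p for p in paras if p]
```

Bound to keys in an editor, each fix-up is a single keypress, so even a poor initial split is cheap to repair.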
N.B. The general process is cool though! I’m neurodivergent like you and I’ve spent a lot of time lately thinking about how to make the actual process of note taking as seamless as possible. I’ve found that reducing the barriers between me and the metaphorical paper has really increased how likely I am to write my thoughts down.
I’ve noticed this pattern all over in the AI hype cycle too — well-known and efficient techniques are either ignored in favor of something extremely wasteful or are rebranded to appear new. I’m actually starting to wonder how many AI startups use a data pipeline of existing techniques that incorporates an LLM step that’s effectively a no-op, or very close to one
probably a lot
one very big part of this is the executive-tier push for “AI”. it’s not because AI, it’s purely social. “everyone else is doing it” and a lot of execs will literally fall over themselves to get something delivered in that manner. because they don’t actually understand the thing in comparison to other shit.
I cite the way “the metaverse” bubbled and fizzed as my supporting datapoint here, along with n-many other bullshit hype cycles in the past
and “blockchain”. Every now and then there’s a writeup in the finance press of some company that is totally using blockchain to move real money around, and in 100% of cases i’ve looked into the blockchain bit gets relegated to logging or removed entirely while MSSQL, Oracle or Postgres do the work. Of course, it’s still advertised as blockchain.
The weird thing about this is how the sales pitch for enterprise blockchain has not changed at all. Pick a presentation on YouTube from the last 6 months and I guarantee it will be identical to one from 5 years ago
I dug into my draft scraps to find this bit I wrote about it. It was never published so when it continues to be true you have to trust my word that I wrote it on the 19th of February 2023.
I’m old enough to remember this same line of argument about internet company hype. That everyone wanted a company as successful as Microsoft or Yahoo and was throwing money into anything that had a .com. Of course, one of those was Amazon.com …
It’s possible for one field to be 100% a scam (blockchain, NFTs), while another field is 99% a scam (AI startups), and yet the 1% ends up creating a massive new sector of the economy that is richer than anything prior.
> I’m old enough to remember this same line of argument about internet company hype
you’re extremely close to doing your dash on this fine server, cut this shit right now
> It’s possible for one field to be 100% a scam (blockchain, NFTs), while another field is 99% a scam (AI startups), and yet the 1% ends up creating a massive new sector of the economy that is richer than anything prior.
again, it’s really weird that you seem to also be anti-cryptocurrency while not knowing that this is the exact reasoning used by some of the most self-harming problem gamblers in the cryptocurrency space — that all it’ll take is one unicorn to make everyone who bought in rich
walking example of the lottery gambler’s paradox - statistically you know it is vanishingly unlikely that you’ll be the winner, but maybe it could be you!! because our irrational squishy brainmeats are so goddamned easy to logic-hack, so the cycle continues
also, honest question: have you previously lost a whole lot of money gambling on cryptocurrency?
Agreed, ChatGPT is nearly useless here compared to Whisper AI. Speech to text isn’t new, but my experience is that Whisper AI is much better than any other speech to text I’ve used.
One benefit of this approach is that ChatGPT can also produce summaries which can help with early draft iteration or organising unstructured thoughts.
better speech to text is a good thing
I have done interviews using YouTube’s auto-transcribe on reasonably clear material and it always needs a great deal of cleanup
Yeah, if anything, cleaning up speech to text (and probably character recognition too) is the natural use of (these kinds of) LLMs, as they pretty much just guess what words should be there based on the others. They still struggle with recognising words when the surrounding words don’t give enough context clues, but we can’t have everything!
(Well until the machine gods get here /s 🙄)
They’re also (anecdotally) pretty good at returning the wording of “common” famous quotes if you can describe the content of the quote in other words, and I can’t think of other tools that do that quite so well. I just wish people would stop using them to write content for them: recently I was recruiting for a new staff member for my team and someone used ChatGPT to write their application. In what world they thought statisticians wouldn’t see right through that I don’t know 😆
> It’s useless as a way to find references
serious question: did you expect otherwise, and if so, why? I’ve seen a number of people attempt this tooling for this reason and it seems absurd to me (but I’m already aware of the background of how these things work)
to these:
> which I could just Google anyway
this is actively worsening from both sides - on goog’s side with doing all the weird card/summation/etc crap, on the other side where people are (likely already with LLMs) generating filler content for clickthrough sites. an awful state of affairs
> It’s really bad within my field
yep. any time you go beyond the bounds of something it’s seen sufficiently heavy training on (“popular things”), it readily falls off an accuracy cliff. because of course it does, it can’t do anything differently.
> just generates hallucinations
nit: this is correct but possibly not in the way that you meant
this is all it does. everything is a synthesis/hallucination. the fact that some are “correct” is a derived trick of statistics (many people clicking y/n on things in the training phases to heavily weight towards some $x, engineering effort to make certain chosen-$x more likely than some other, etc)
as to the rest of the post: I do see a possible future where llm-like (or whatever branch of it follows) could be useful, but there’s a number of notable things that would have to happen differently. open > closed is one thing, actually having global support (instead of some anglofranco dipshittery as it tends to center atm), etc etc etc. “if these clowns keep being the drivers” is not how I think we’ll get there. “all of this on chatgpt” is definitely not how we’ll get there.
that the post itself was characterised by a number of short-header-short-paragraph entries is notable (and probably somewhat obvious as to why?). what I can’t see is how that can necessarily gain you time in the case of something where you’d be working in much longer/more complex paragraphs, or more haltingly in between areas as you pause on structure and such
in the end precision is precision, and it takes a certain amount of work, time, and focus to achieve. technological advances can help on certain dimensions of this, but ime even that usually comes at a tradeoff somewhere
> this is all it does. everything is a synthesis/hallucination. the fact that some are “correct” is a derived trick of statistics (many people clicking y/n on things in the training phases to heavily weight towards some $x, engineering effort to make certain chosen-$x more likely than some other, etc)
Yeah, a common misconception that I keep seeing is “ChatGPT makes mistakes and says things that aren’t true because it was trained on the entire internet, which contains a lot of falsehoods.” It’s important to understand that this is not why ChatGPT says things that aren’t true! It says things that aren’t true because it’s a statistical sentence constructor that puts together sentences one word at a time without any reference to the actual meaning of what it’s saying. Just training an LLM on ‘factual’ info like scientific journal articles or something isn’t going to fix the issue. (in fact they tried that already… it didn’t work)
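A toy illustration of that point, with heavy caveats: this is a bigram Markov chain, not how ChatGPT is actually built (real LLMs are neural networks working over tokens with far more context), and the corpus and function names are made up. Every sentence in the training text is true, yet the generator only knows which word tends to follow which, so it can happily stitch together something fluent and false such as “the moon orbits the sun”.

```python
import random
from collections import defaultdict


def train_bigrams(corpus: str) -> dict[str, list[str]]:
    """Record, for each word, every word that follows it in the corpus."""
    words = corpus.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return dict(followers)


def generate(followers: dict[str, list[str]], start: str, length: int = 8, seed: int = 0) -> str:
    """Build a sentence one word at a time by sampling a follower of the last word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        options = followers.get(out[-1])
        if not options:  # no known follower: stop early
            break
        out.append(rng.choice(options))
    return " ".join(out)


# Every statement in this tiny training corpus is true.
corpus = "the moon orbits the earth . the earth orbits the sun . the sun is a star ."
model = train_bigrams(corpus)
print(generate(model, "the"))
```

Nothing in `generate` ever checks a fact; it only follows word-to-word statistics, which is the sense in which feeding it only true text does not stop it from producing false sentences.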
Note that the new line of thinking is “if you didn’t use at least 10,000 GPUs you didn’t try anything”. All the models that show even a spark of intelligence had absurd amounts of compute put into their training. It is possible that Galactica would have worked had Facebook put more resources into it.
I mean I’ll believe it when I see it but until then I’m gonna assume making a bigger version of the thing is still gonna have the same problems as literally every other time they’ve tried it
> serious question: did you expect otherwise, and if so, why? I’ve seen a number of people attempt this tooling for this reason and it seems absurd to me (but I’m already aware of the background of how these things work)
In answer to your first question, no, I didn’t expect it to be good for finding references.
For some context on myself, I’m a statistician, essentially. I have some background in AI research, and while I’ve not worked with large language models directly, I have some experience with neural networks and natural language processing.
However, my colleagues, particularly in the teaching realm, are less familiar with what ChatGPT can be used for, and do try to use it for all the things I’ve mentioned.
> this is actively worsening from both sides - on goog’s side with doing all the weird card/summation/etc crap, on the other side where people are (likely already with LLMs) generating filler content for clickthrough sites. an awful state of affairs
You are right that the quality of Google search results is worse, but I’ll admit to using the term Google somewhat loosely to mean the usual process I would use to seek out information, which would involve Google, but also Google Scholar, my university’s library services, and searching the relevant journals for my field. Apologies for the imprecision there.
> nit: this is correct but possibly not in the way that you meant
With regards to the hallucinations, I am using the word in a colloquial sense to mean it’s generating “facts that aren’t true”.
> that the post itself was characterised by a number of short-header-short-paragraph entries is notable (and probably somewhat obvious as to why?). what I can’t see is how that can necessarily gain you time in the case of something where you’d be working in much longer/more complex paragraphs, or more haltingly in between areas as you pause on structure and such
The structure being short paragraphs is partly down to the way I was speaking: I was speaking off the top of my head, so my content wouldn’t form coherently long paragraphs anyway. Having used this approach in a few different contexts, I’ve found it does break things into longer paragraphs. I couldn’t predict exactly when it would break things into longer or shorter paragraphs, but it does a good enough job for being able to edit the text as a first draft.
ChatGPT is certainly aggressive with generating the headers, and honestly, I don’t tend to use it with the header version all that much. I just thought it was an interesting demonstration.
Also, with this example, in contrast to the ones in my work, I had the idea for this post come into my head, recorded it, and posted it here in under ten minutes. Well, that’s not strictly true. There was a bug when I tried to post it that I had to get mod support for, but otherwise, it was under ten minutes.
At work, the content is not stuff that’s off the top of my head. I talk about my subject and I teach my subject all the time, so I’m already able to speak with precision about it; as such, dictation is helpful for capturing what I can convey verbally.
> in the end precision is precision, and it takes a certain amount of work, time, and focus to achieve. technological advances can help on certain dimensions of this, but ime even that usually comes at a tradeoff somewhere
You’re right that precision does take time, and as the stuff comes out, it’s not suitable for the final draft of a research paper. However, you can get 80% of the way there, and often, in the early stages of writing a research paper or similar, the key thing is to communicate what you’re working on with colleagues. And being able to draft several thousand words rapidly in under an hour so I can give someone a good idea of what I’m aiming for is very useful.
Anyway, thanks for your feedback. I really appreciate it.
(Full disclosure: I also wrote this comment using ChatGPT/Whisper AI and copying your quotes in.)
(Well, I say using ChatGPT. This isn’t really about using ChatGPT to do anything more than put paragraphs in, and headings if you so desire. I just thought this was worth posting because the technique is useful to me and I thought others might find it handy.)
> With regards to the hallucinations, I am using the word in a colloquial sense to mean it’s generating “facts that aren’t true”.
as I understand it, “hallucination” is also the jargon word used in ML for when it generates wrong facts (even though the processes for facts and non-facts are the same)
Yeah, matches with my experience among the other stats and data science folks I interact with, but most of my sphere are statisticians or empirical researchers from various subjects using stats, so I can’t claim inner knowledge of the LLM crowd’s stuff.
I think it’s a pretty alright metaphor. My very oversimplified layman’s understanding of dreams and other hallucinations is a nervous system attempting to pattern match nonsense stimulus into something it can recognize, semantics be damned. There are some parallels to draw to a statistical engine choosing the next token based on syntactic probability and forming confidently wrong sentences.
Overly long aside: Even accounting for all the nonsense contemporary LLMs produce, it is quite impressive how much they do get right. I am not opposed to the idea that semantic models such as those of humans and other conscious beings occur as an emergent phenomenon from sufficiently complex syntactic manipulation of symbolic tokens. To me Searle’s Chinese Room thought experiment seems to describe a sentient Choose Your Own Adventure book rather than an unthinking entity, though I’m not sure I even understand the argument properly. I don’t think LLMs have anything I’d describe as a sense of truth, but I’d actually expect the statements of a syntax maximizer to correlate even less with semantically correct ideas and that’s interesting.
Yes, I write like a dweeb but at least I know I’m out of my depth.
The closest thing LLMs have to a sense of truth is the corpus of text they’re trained on. If a syntactic pattern occurs there, then it may end up considering it as truth, providing the pattern occurs frequently enough.
In some ways this is made much worse by ChatGPT’s frankly insane training method where people can rate responses as correct or incorrect. What that effectively does is create a machine that’s very good at providing you responses that you’re happy with. And most of the time those responses are going to be ones that “sound right” and are not easy to identify as obviously wrong.
Which is why it gets worse and worse when you ask about things that you have no way of validating the truth of. Because it’ll give you a response that sounds incredibly convincing. I often joke when I’m presenting on the uses of this kind of software to my colleagues that the thing ChatGPT has automated away isn’t the writing industry as people have so claimed. It’s politicians.
In the major way it’s used, ChatGPT is a machine for lying. I think that’s kind of fascinating to be honest. Worrying too.
(Also more writing like a dweeb please, the less taking things too seriously on the Internet the better 😊)
> I’m really curious to see what people think of this article. I’ve endeavored not to edit it at all, so this is just the first draft of how it came out of my mouth. I really want to know how readable you think this is.
Honestly, this is a tough one to respond to. I want to respond as if this were meant to be an article about your experiment rather than an example of the kind of writing you are expected to output in your job. In that context it feels more like a part of a bigger idea/narrative rather than the idea/narrative itself.
It does read like a dictated voice note and my eyes skipped over the paragraph titles. Looking back over them they seem more like handy pointers for editing your text.
What are your views on how these positive takeaways can be tempered by different writing contexts?
I can see this being a lead-in to an article about the pressure on academics to write, what it means to be prolific, and if the content of their writing is appreciated.
Oh, it is supposed to be an article about the experiment, or rather an experiment itself; the kind of writing I output for my job is very different. It seems like my intentions were pretty roundly misinterpreted here in general, still it took 10mins to write from inception of the idea for the article so I’m not too upset by that.
Agreed re paragraph titles, pretty much for me this is all about making dictation a more streamlined process. This is the first time I’ve found it accurate enough to be useful and had a way (via ChatGPT splitting things into paragraphs) to make it accessible to edit.
Wildly, I wouldn’t actually say I’m overworked writing-wise as an academic, but I am certainly the exception there.
It’s definitely clear that better voice to text is a legit value area for ML/LLMs from what you say.
Ah, you know, I should have just written that 😂 I swear my next post will be higher effort!