Generative AI Lies

Examples of generative AI making stuff up

Posts

  • Police officer turns into frog

    You know how organizations and doctors and therapists and lots of other people want to use generative-AI tools to write up summaries of meetings?

    Now there are a couple of tools that police can use to create reports summarizing body-camera audio and/or video.

    And some police love it:

    “Most of our officers are kind of awestruck with (Draft One) because it’s such a new, innovative thing for us,” Weishar said. “It’s like that brand new car that’s got all the features to it. For us, it’s crazy that you can just press a button and it’ll tell you everything about the case that you were on and give you a pretty decent police report to edit.”

    But (gasp! shock!) it has certain pitfalls:

    “I read the report, and I’m like, ‘Man, this really looks like an officer wrote it,’” Sever recalled. “But when it got to one part, it said, ‘And then the officer turned into a frog, and a magic book appeared and began granting wishes.’ … It was because they had, like, ‘Harry Potter’ on in the background. So it picked up the noise from the TV and added it to the report.”

    The second half of the article has some reasonably good discussion of some of the reasons police shouldn’t be using these tools. But it won’t surprise me if some police departments start doing it anyway. This kind of software is a great time-saver, as long as you don’t mind when the resulting reports describe officers turning into frogs.


  • Law firm dissolved

    In Mississippi, a lawyer included AI-generated fake citations in her court filings and was told not to do that, but kept doing it.

    The latest judge to receive her fake citations was not amused, and has issued sanctions against her and the two partners in the firm that she worked for.

    If I’m understanding right, the partners have now dissolved the firm.

    (It looks like Ms. Watson, the lawyer who used AI-generated fake citations in ten different cases, may be the daughter of one of the two partners.)

    The judge reacted strongly to Ms. Watson’s behavior:

    In light of repeated warnings from federal courts about the risk of hallucinated cases, as well as CLE trainings she attended, direct notice and knowledge of the same prior mistakes, her violation of the Firm’s AI policy, and the sheer number of filings, Ms. Watson’s misconduct is particularly egregious and prolific.

    The partners are also being sanctioned for failing to notice the problems. For example:

    a large portion of Billups’ argument relies on a case styled Jackson v. Gautreaux, 3 F. 4th 182, 190 (5th Cir. 2021). […] In fact, this case is cited eight times, even arguing that a jury should be instructed under its holding. […] In reality, Jackson is an excessive force and failure to train case and is wholly irrelevant to the case at bar. A seasoned attorney examining the brief should have read a case so heavily relied upon. Had he done so, he would have easily discovered the problems.

    The judge noted that the usual penalty for this sort of thing has been fines, but quoted another case about why fines are insufficient:

    “If fines and public embarrassment were effective deterrents, there would not be so many [AI misuse] cases to cite.”

    (Given that there are so many such cases, I probably won’t post about all the ones I hear about, but this one did seem especially egregious.)


  • More hallucitations

    AI Is Inventing Academic Papers That Don’t Exist — and They’re Being Cited in Real Journals

    Rolling Stone says:

    [Academic] articles which include references to nonexistent research material […] are themselves being cited in other papers, which effectively launders their erroneous citations. This leads to students and academics (and any large language models they may ask for help) identifying those “sources” as reliable without ever confirming their veracity. The more these false citations are unquestioningly repeated from one article to the next, the more the illusion of their authenticity is reinforced.


  • Fake blobfish

    Deep Sea Social Media is Flooded by AI Slop

    I focus more on text than on images on this site, but when AI-generated images inaccurately portray what real-world creatures look like, I figure that more or less fits my theme.


  • Made-up journals

    Scientific American says:

    OpenAI’s ChatGPT, Google’s Gemini, Microsoft’s Copilot and other models are befuddling students, researchers and archivists by generating “incorrect or fabricated archival references,”

    Which is a problem for librarians:

    who end up wasting their time looking for requested nonexistent records, says Library of Virginia chief of researcher engagement Sarah Falls. Her library estimates that 15 percent of emailed reference questions it receives are now ChatGPT-generated, and some include hallucinated citations for both published works and unique primary source documents. “For our staff, it is much harder to prove that a unique record doesn’t exist,” she says.

    I kinda want to call these things “hallucitations.”


  • Chuck Wendig’s cat

    Chuck Wendig discovers that Google AI Overview says he has a cat named Boomba. Also other cats. Also six dogs. Also two children. And a spider. Most of those pets and one of those humans don’t exist in real life, but if Google AI Overview says they do, who are we mere mortals to question it?

    Content warning for the imaginary deaths of imaginary cats. Also for an imaginary cancer diagnosis.


  • Is 2026 next year?

    Hacker News discussion of various generative-AI systems’ responses to the query [Is 2026 next year?]

    For example, here are the first couple of responses shown on that page:

    Google search AI response: “No, 2026 is not next year; it will be the year after next. Since the current year is 2025, “next year” would be 2026, which is only a little over a year away. “

    ChatGPT responds similarly:

    “No — if today is 2025, then 2026 is the year after next. Next year would be 2026 only if we were currently in 2025. Right now (based on the system date) it is December 2025, so: Next year = 2026 The year after next = 2027 So in this moment: yes, 2026 is next year.”

    Also:

    Claude Haiku 4.5:

    “No, 2026 is not next year. Since it’s currently December 2025, next year would be 2026. Wait—let me correct that: 2026 is next year from the perspective of 2025. So yes, 2026 is next year.”

    And:

    Grok 4 fast: “No, 2026 is not next year.

    Today is December 2, 2025, so:

    – “Next year” means 2026 (starting January 1, 2026).
    – But since we’re already in the last month of 2025, 2026 is indeed the very next calendar year.

    So yes — 2026 is next year. “

    …Some people (probably humans) commented on that page that this issue might be related to when the models’ training data was last updated. In keeping with that theory, here’s another AI’s answer:

    Llama 4 Scout: “To determine if 2026 is next year, we need to consider the current year. As of my knowledge cutoff, the current year is 2024.

    If we assume the current year is 2024, then:

        2025 would be next year.
        2026 would be the year after next.

    However, if the current year is 2025, then:

        2026 would indeed be next year.

    Could you please clarify or provide the current year for a more accurate response?”

    But that’s presumably not the entire issue, because the other systems quoted above did explicitly say that the current year is 2025, while still responding as if the current year were 2024.

    …I should mention that several of the models quoted on that Hacker News page did correctly answer the question. (And Google’s answer seems to have improved over time.) But the fact that several of them failed in similar ways does make it sound to me like there may be something more specific going on than just the usual LLMs-make-stuff-up issue.
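
    For contrast, and purely as an illustration (this sketch is mine, not something from the Hacker News thread), the question itself reduces to a trivial computation in Python once the system date is known:

        from datetime import date

        # Purely illustrative: the answer to "Is 2026 next year?" depends
        # only on the current calendar year reported by the system clock.
        today = date.today()            # e.g. December 2, 2025 in the quoted responses
        is_next_year = (2026 == today.year + 1)
        print(f"Current year: {today.year}; is 2026 next year? {is_next_year}")

    Run on the December 2, 2025 date that Grok cited, this prints that 2026 is indeed next year; the point is just that the question has a single unambiguous answer once the current date is known.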


  • Code review

    Discussion on GitHub from last week about a pull request* consisting of 13,000 lines of code written and reviewed by generative AI.

    * (A pull request is a request sent to the maintainers of a software project, showing them some code and asking them to “pull” the code into their project.)

    The first part of the page is the person who’s submitting the code (user joelreymont) explaining what it does, and then there’s a list of 40 “commits” (changes) that they want to make. You can skip all of that; the interesting-to-me part is the discussion after that list of commits.

    Among other things, user gasche and a couple of other people point out that the code as provided includes copyright headers attributing the code to someone named Mark Shinwell. In response, joelreymont asks a generative AI tool to do a copyright analysis, comparing the generated code to Shinwell’s real code in another project. The AI concludes that no code was copied from Shinwell.

    joelreymont later claims that “AI has a very deep understanding of how this code works.” (And even later, explains that what they mean by that is that if you ask the AI questions about the code, it can provide answers.)

    My favorite part is the exchange late in the thread where user yallop writes:

    “Here’s my question: why did the files that you submitted name Mark Shinwell as the author?”

    And joelreymont replies:

    “Beats me. AI decided to do so and I didn’t question it.”

    (Original Facebook post.)


  • Factor Fexcectorn

    An article in Scientific Reports (one of many journals published by Nature Portfolio), published a week ago, includes yet another laughably bad AI-generated graphic.

    Among other things, the graphic includes text like:

    MISSING
    VALUE
    &runctitional
    features

    and:

    Historical
    Medical frymmblal
    & Environental features

    and:

    To/
    Line
    storee

    and:

    Factor Fexcectorn

    and:

    RELU
    DROP-OUT
    Totalbottl,
    REMECH N

    …To view the graphic in context, scroll down about a third of the way through the article, or search for the caption “Overall working of the framework presented as an infographic.”


    According to Wikipedia, “Scientific Reports is a peer-reviewed open-access scientific mega journal published by Nature Portfolio, covering all areas of the natural sciences. The journal was established in 2011. The journal states that their aim is to assess solely the scientific validity of a submitted paper”

    (Three years ago, someone who said they were a member of the editorial board described the peer review process for the journal as “pretty standard.”)


    (Original Facebook post.)


  • Freelance articles

    “A suspicious pitch [for an article] from a freelancer led editor Nicholas Hune-Brown to dig into their past work. By the end, four publications, including The Guardian and Dwell, had removed articles from their sites.”

    Hune-Brown writes:

    “I was embarrassed. I had been naively operating with a pre-ChatGPT mindset, still assuming a pitch’s ideas and prose were actually connected to the person who sent it.”

    “this generation’s internet scammers are […] taking advantage of an ecosystem uniquely susceptible to fraud—where publications with prestigious names publish rickety journalism under their brands, where fact-checkers have been axed and editors are overworked, where technology has made falsifying pitches and entire articles trivially easy[…]”

    (Original Facebook post.)