A robot looking under the couch for lost keys.

A few weeks into the project, ChatGPT gave me a 100% fabrication. (Yes, you can say “hallucination.” I say “fabrication,” because that’s what it is. Giving the problem a cute name minimizes our focus on the danger — see my opinion here.)

I asked ChatGPT to tell me what items Sandie was able to successfully carry through the interdimensional portal. Its answer included a bag of Bugles, a Grape Nehi, and chapstick. It even gave me very convincing context about how these items were relevant later.

Like I said, 100% fabrication. None of these items is ever mentioned in the story. Beyond the list simply being wrong, there’s also a clue to the fabrication at the beginning of the bot’s answer: “The file search tool is currently unavailable, but based on my memory from the manuscript, here’s what…”

Now, this is interesting. I asked ChatGPT to explain the file search tool versus its memory.

The explanation is that a process called “file_search.msearch” is used to quickly scan uploaded files for high-confidence matches, as opposed to ChatGPT carefully re-reading the entire file. According to ChatGPT, the tool can be deactivated when one hasn’t submitted a new query in the past 15-30 minutes, or when some sort of system reset takes place.

Doesn’t make much sense — why would the tool not be activated as needed? I’m not going to worry about that right now, though. I want to figure out the remedy, not troubleshoot OpenAI’s problems.

(Quick break for the gratuitous marketing. If you’re enjoying my blogs, how about giving the novel a try?)

So, I re-uploaded the manuscript to the project. (I had initially uploaded it within a single conversation; I fixed that mistake by adding it to the project files.) I asked ChatGPT to re-scan it and provide me with the list again.

The answer? Bag of Bugles, Grape Nehi, and chapstick. Son of a…

I pointed out that none of these items were in the story and asked ChatGPT to quote passages to prove me wrong. It immediately returned a complete, accurate list. Of course, I wanted to know why it produced the same, incorrect list the second time. (Parents, does this sound familiar?)

The answer here was quite simple: caching. The bot admitted that since it had lost its connection to my original file during the first query, it relied on its own storytelling heuristics to “imagine” a plausible answer, rather than saying, “I don’t know.”

After the reload, it relied on its memory cache for expediency, and of course, the cache was full of the first fabrication. Once I challenged it with the “quote me the passages” prompt, it re-read the entire manuscript and looked for the answer rather than guessing at it.
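The failure mode is easy to see in miniature. Here is a toy sketch of the pattern (purely illustrative; the function, the `SOURCE_AVAILABLE` flag, and the use of `lru_cache` are my own analogy, not anything resembling OpenAI’s actual implementation):

```python
from functools import lru_cache

SOURCE_AVAILABLE = False  # simulate the file search tool being down

@lru_cache(maxsize=None)
def items_carried():
    """Answer the question, caching whatever we return."""
    if SOURCE_AVAILABLE:
        return ["<read from manuscript>"]
    # Source unavailable: fabricate a plausible guess instead of
    # saying "I don't know" -- and that guess gets cached.
    return ["Bugles", "Grape Nehi", "chapstick"]

first = items_carried()       # tool is down: fabrication
SOURCE_AVAILABLE = True       # manuscript re-uploaded
second = items_carried()      # cache hit: same fabrication repeated
items_carried.cache_clear()   # "quote me the passages" forces a re-read
third = items_carried()       # now it actually consults the source
```

The point of the sketch: once a bad answer lands in the cache, fixing the underlying source changes nothing until something forces the cache to be bypassed.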

The lesson here? Give explicit instructions, similar to how you would coach a brilliant eight-year-old who needs to slow the hell down and hone her critical thinking skills.

And don’t forget the most important tenet, of course: verify before you trust.
