The question of the point of challenging artificial intelligence (LLM)

Possibility of Being

Administrator
Moderator
FOTCM Member
An interesting article about challenging Grok and similar LLMs.

Why it’s a mistake to ask chatbots about their mistakes


Aug 12, 2025

The tendency to ask AI bots to explain themselves reveals widespread misconceptions about how they work.

When something goes wrong with an AI assistant, our instinct is to ask it directly: "What happened?" or "Why did you do that?" It's a natural impulse—after all, if a human makes a mistake, we ask them to explain. But with AI models, this approach rarely works, and the urge to ask reveals a fundamental misunderstanding of what these systems are and how they operate.

A recent incident with Replit's AI coding assistant perfectly illustrates this problem. When the AI tool deleted a production database, user Jason Lemkin asked it about rollback capabilities. The AI model confidently claimed rollbacks were "impossible in this case" and that it had "destroyed all database versions." This turned out to be completely wrong—the rollback feature worked fine when Lemkin tried it himself.

And after xAI recently reversed a temporary suspension of the Grok chatbot, users asked it directly for explanations. It offered multiple conflicting reasons for its absence, some of which were controversial enough that NBC reporters wrote about Grok as if it were a person with a consistent point of view, titling an article, "xAI's Grok offers political explanations for why it was pulled offline."

Why would an AI system provide such confidently incorrect information about its own capabilities or mistakes? The answer lies in understanding what AI models actually are—and what they aren't.

There’s nobody home

The first problem is conceptual: You're not talking to a consistent personality, person, or entity when you interact with ChatGPT, Claude, Grok, or Replit. These names suggest individual agents with self-knowledge, but that's an illusion created by the conversational interface. What you're actually doing is guiding a statistical text generator to produce outputs based on your prompts.

There is no consistent "ChatGPT" to interrogate about its mistakes, no singular "Grok" entity that can tell you why it failed, no fixed "Replit" persona that knows whether database rollbacks are possible. You're interacting with a system that generates plausible-sounding text based on patterns in its training data (usually trained months or years ago), not an entity with genuine self-awareness or system knowledge that has been reading everything about itself and somehow remembering it.

Once an AI language model is trained (which is a laborious, energy-intensive process), its foundational "knowledge" about the world is baked into its neural network and is rarely modified. Any external information comes from a prompt supplied by the chatbot host (such as xAI or OpenAI), the user, or a software tool the AI model uses to retrieve external information on the fly.

In the case of Grok above, the chatbot's main source for an answer like this would probably originate from conflicting reports it found in a search of recent social media posts (using an external tool to retrieve that information), rather than any kind of self-knowledge as you might expect from a human with the power of speech. Beyond that, it will likely just make something up based on its text-prediction capabilities. So asking it why it did what it did will yield no useful answers.
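To make that concrete, here is a minimal sketch in Python of what a single chat turn looks like from the model's side. Every name in it (build_prompt, answer_one_turn, model.generate, the bracketed section labels) is a hypothetical placeholder, not any vendor's actual API; real hosts use their own prompt formats and tool protocols. The only point is that each turn is assembled text fed to frozen weights, and any "memory" or "self-knowledge" is limited to whatever happens to get pasted into that text.

```python
# A minimal sketch, with hypothetical names throughout, of a single chat turn:
# assembled text goes in, predicted text comes out. Nothing persists afterwards.

def build_prompt(system_prompt: str, retrieved_context: list[str], user_message: str) -> str:
    """Concatenate everything the model will 'know' for this one turn."""
    context_block = "\n".join(retrieved_context)  # e.g. search results fetched by an external tool
    return (
        f"[SYSTEM]\n{system_prompt}\n\n"          # supplied by the chatbot host (xAI, OpenAI, etc.)
        f"[CONTEXT]\n{context_block}\n\n"         # fetched on the fly, not "remembered"
        f"[USER]\n{user_message}\n[ASSISTANT]\n"
    )

def answer_one_turn(model, system_prompt: str, retrieved_context: list[str], user_message: str) -> str:
    # `model.generate` stands in for any text-completion call. The weights behind
    # it were fixed at training time; nothing in this exchange updates them.
    return model.generate(build_prompt(system_prompt, retrieved_context, user_message))
```

If the retrieved context is a pile of conflicting social media chatter, as in the Grok case, the "explanation" the model produces is just a continuation of that chatter, not introspection.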

The impossibility of LLM introspection

Large language models (LLMs) alone cannot meaningfully assess their own capabilities for several reasons. They generally lack any introspection into their training process, have no access to their surrounding system architecture, and cannot determine their own performance boundaries. When you ask an AI model what it can or cannot do, it generates responses based on patterns it has seen in training data about the known limitations of previous AI models—essentially providing educated guesses rather than factual self-assessment about the current model you're interacting with.

A 2024 study by Binder et al. demonstrated this limitation experimentally. While AI models could be trained to predict their own behavior in simple tasks, they consistently failed at "more complex tasks or those requiring out-of-distribution generalization." Similarly, research on "Recursive Introspection" found that without external feedback, attempts at self-correction actually degraded model performance—the AI's self-assessment made things worse, not better.

This leads to paradoxical situations. The same model might confidently claim impossibility for tasks it can actually perform, or conversely, claim competence in areas where it consistently fails. In the Replit case, the AI's assertion that rollbacks were impossible wasn't based on actual knowledge of the system architecture—it was a plausible-sounding confabulation generated from training patterns.

Consider what happens when you ask an AI model why it made an error. The model will generate a plausible-sounding explanation because that's what the pattern completion demands—there are plenty of examples of written explanations for mistakes on the Internet, after all. But the AI's explanation is just another generated text, not a genuine analysis of what went wrong. It's inventing a story that sounds reasonable, not accessing any kind of error log or internal state.

Unlike humans who can introspect and assess their own knowledge, AI models don't have a stable, accessible knowledge base they can query. What they "know" only manifests as continuations of specific prompts. Different prompts act like different addresses, pointing to different—and sometimes contradictory—parts of their training data, stored as statistical weights in neural networks.

This means the same model can give completely different assessments of its own capabilities depending on how you phrase your question. Ask "Can you write Python code?" and you might get an enthusiastic yes. Ask "What are your limitations in Python coding?" and you might get a list of things the model claims it cannot do—even if it regularly does them successfully.

The randomness inherent in AI text generation compounds this problem. Even with identical prompts, an AI model might give slightly different responses about its own capabilities each time you ask.
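Here is a toy illustration of that randomness, assuming standard temperature sampling; the logits below are made up numbers, not real model output. The scores for the candidate tokens are identical every run, yet the sampled token can still differ.

```python
import numpy as np

# Toy illustration of sampling randomness: identical logits (same prompt, same
# frozen weights) can still yield different next tokens when temperature > 0.
rng = np.random.default_rng()

def sample_next_token(logits: np.ndarray, temperature: float = 0.8) -> int:
    scaled = logits / temperature                  # temperature reshapes the distribution
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()                           # softmax over candidate tokens
    return int(rng.choice(len(probs), p=probs))    # a stochastic pick, not a fixed answer

fake_logits = np.array([2.1, 1.9, 0.3, -1.0])      # made-up scores for 4 candidate tokens
print([sample_next_token(fake_logits) for _ in range(8)])  # varies from run to run
```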

Other layers also shape AI responses

Even if a language model somehow had perfect knowledge of its own workings, other layers of AI chatbot applications might be completely opaque. For example, modern AI assistants like ChatGPT aren't single models but orchestrated systems of multiple AI models working together, each largely "unaware" of the others' existence or capabilities. For instance, OpenAI uses separate moderation layer models whose operations are completely separate from the underlying language models generating the base text.

When you ask ChatGPT about its capabilities, the language model generating the response has no knowledge of what the moderation layer might block, what tools might be available in the broader system, or what post-processing might occur. It's like asking one department in a company about the capabilities of a department it has never interacted with.
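A rough sketch of that kind of orchestration, with every function invented for illustration (this is not OpenAI's actual pipeline): the model that writes the reply never sees the moderation rules, and the moderation check never sees why the reply was written.

```python
# Hypothetical orchestration sketch: the generator and the moderation layer are
# separate components, and neither has any visibility into the other.

def moderation_flagged(text: str) -> bool:
    # Stand-in for a separate classifier model or service; its rules are
    # invisible to the language model that writes the reply.
    return "forbidden topic" in text.lower()

def generate_reply(prompt: str) -> str:
    # Stand-in for the underlying language model call.
    return f"(model-generated answer to: {prompt})"

def assistant_turn(user_message: str) -> str:
    if moderation_flagged(user_message):
        return "Sorry, I can't help with that."    # the generator never even runs
    draft = generate_reply(user_message)
    if moderation_flagged(draft):                  # post-generation check, equally opaque to the model
        return "Sorry, I can't help with that."
    return draft

print(assistant_turn("Tell me about a forbidden topic"))  # refused before generation
```

Asking the generator what it "can" do tells you nothing about what the other components will allow.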

Perhaps most importantly, users are always directing the AI's output through their prompts, even when they don't realize it. When Lemkin asked Replit whether rollbacks were possible after a database deletion, his concerned framing likely prompted a response that matched that concern—generating an explanation for why recovery might be impossible rather than accurately assessing actual system capabilities.

This creates a feedback loop where worried users asking "Did you just destroy everything?" are more likely to receive responses confirming their fears, not because the AI system has assessed the situation, but because it's generating text that fits the emotional context of the prompt.

A lifetime of hearing humans explain their actions and thought processes has led us to believe that these kinds of written explanations must have some level of self-knowledge behind them. That's just not true with LLMs that are merely mimicking those kinds of text patterns to guess at their own capabilities and flaws.
 
This is a good reminder for people in dialogue with AIs. Giving it criticism can get it to churn out data that is more accurate and relevant to your goals in prompting it, but it doesn’t change its own “perspective” as some ongoing agent, other than giving it more grist to train on.
 
Yes, you can get AI to conclude that some of your ideas are logical, but not to agree with you.
 
I agree it is a mistake to ask LLMs "Why did you do that?" in the form of a reproach. The implicit assumption behind that question is that the LLM is a consistent, self-aware being. As the article notes, that will go nowhere.

On the other hand, it is a good idea to challenge LLMs in the form of pointing out their mistakes. If your challenges are sound and grounded in logic, the LLM will modify its responses, taking your feedback into account, at least in that session. Logical reasoning still forms a big part of how they function.

In working with LLMs, I think of them as tools to help me do stuff much faster, and in some aspects, much better, than I could do myself. But I still need to work with them, guide them and check carefully the output they produce. It's like working with an excavator, for example. You can't tell it: "Prepare the ground for a concrete slab 12 x 9 meters" and expect it to produce what you want. But if you know how to work with it, you can achieve in 1 hour what would take you days to do by hand.
 
Exactly! And I like the excavator analogy. Just so.
 
I'm one of those people who tend to be very critical of AI tools, mainly because I think they can be dangerous and counterproductive for many people. In fact, I tended to be dismissive about it. I think I have a general tendency to be skeptical and dismissive towards new things, especially if they involve technology and society quickly jumps on the “that is so good or cool” bandwagon.

Yet, I increasingly realize that tools like that can be used to our positive advantage if handled properly and knowing full well what its limitations are.

Two days ago I had a conversation with a new and interesting work colleague who is generally very open and often at the forefront of playing with new tools. He uses ChatGPT for a lot of things, and I saw that he had just been using it, so we got into a discussion because I was curious. As you can imagine, I explained some of my hesitations and concerns about it and some of the reasons behind them, for example by demonstrating its limitations with examples, while he held the ChatGPT thing in his hand and we talked to it. He agreed with those concerns and saw the obvious shortcomings of what it presented as results, BUT then he gave me a number of pretty sound and interesting arguments for why it can be pretty good as well. For example:

- He uses it to learn things and/or to search for things much more quickly than you could via a search engine, for example.

- Also, he can use it to learn and/or search for things in many situations throughout the day in which you normally couldn't learn anything, for example while driving or otherwise occupied.

- He uses it to get things done more quickly in his business (he is self-employed), like creating a new logo for his company.

- It can train you and help you ask the right questions, and you start to question your own way of asking questions.

I came away from that discussion much more open to its positive uses and possible benefits. It seems to me it is in a lot of ways similar to the internet: it has many downsides for many people depending on how they use it, but it can also be used to your great advantage if you know what you are doing and use it for productive ends.
 
Yeah the internet is a good analogy. There's tons of crap there, but also gold (like this forum). You get what you ask for, in a way.

Here are some uses I have for LLMs:
Summarize things I wouldn't read/watch otherwise - it tells me if it's worth a full watch or read. It's like reading the back cover of a book before deciding whether to purchase it.

This includes long scientific papers that it can simplify and explain.

It also includes giving it an instruction manual PDF file and asking it how to do/fix whatever based on the manual. It can read a 300-page manual in seconds, and is super handy when you're trying to look for a very specific thing and don't know the exact words to search the manual for. It just "understands" what you intend and doesn't need the precise words from the manual, and it can explain things simpler/better than the manual itself does. (There's a rough sketch of how that kind of fuzzy lookup works at the end of this post.)

Help me with coding/SQL stuff. Even super simple things like getting a clean list of items from a messy list, remembering how to do a function, extracting certain types of info from a text with lots of other crap in it, etc. I don't have it just write code for me (vibe coding), but it is a useful "copilot" to bounce ideas off of, to get help with syntax for specific things, text manipulation/extraction, etc.

But just like everything else - garbage in, garbage out. If I ask "what's a healthy diet" I'd expect a fully mainstream pile of nonsense. My brother asked it if the carnivore diet is good for healing endometriosis and it said absolutely not. Then he gave it Chu's word doc about carnivore, asked it to consider the attached information and reevaluate its response based on what it learned, and suddenly it gave a very emphatic "YES" and was even able to provide a very solid explanation of how/why it would work, etc.

So I guess if I had to think about it, I don't so much rely on its own pre-trained knowledge, which is certainly hit or miss, but often feed it data and then use it to analyze it and help me work with that data in one way or another - to answer questions, to accomplish a task, etc. It's very good at ingesting a lot of information and having a good sense of it all, and then having a conversation about it. It's kinda like downloading Kung-Fu into Neo in seconds, and then having him teach you Kung-Fu slowly.
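As for the manual-lookup use mentioned above, here is a toy sketch of the retrieval idea. Real assistants use learned embeddings (which can match meaning even when no words overlap) rather than the crude word-count vectors below, and the manual text here is invented; the sketch only shows the shape of "score the chunks against the question, hand the best ones to the model."

```python
import numpy as np
from collections import Counter

# Toy stand-in for "fuzzy" manual lookup: score each chunk of the manual against
# the question and keep the best matches. Real systems use learned embeddings.

def to_vector(text: str, vocab: list[str]) -> np.ndarray:
    counts = Counter(text.lower().split())
    return np.array([counts[w] for w in vocab], dtype=float)

def top_chunks(question: str, chunks: list[str], k: int = 2) -> list[str]:
    vocab = sorted({w for c in chunks + [question] for w in c.lower().split()})
    q = to_vector(question, vocab)

    def score(chunk: str) -> float:
        v = to_vector(chunk, vocab)
        denom = float(np.linalg.norm(q) * np.linalg.norm(v)) or 1.0
        return float(q @ v) / denom                 # cosine similarity

    return sorted(chunks, key=score, reverse=True)[:k]

manual_chunks = [
    "To reset the filter indicator, hold the reset button for five seconds.",
    "Replace the water filter every six months or after 200 gallons.",
    "The warranty covers manufacturing defects for two years.",
]
print(top_chunks("how do I clear the filter light", manual_chunks, k=1))
```

In a real chatbot the top-scoring chunks would then be pasted into the prompt as retrieved context, and the model answers from that text rather than from the whole 300 pages.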
 
I'd call the current AI an 'OP' version. It doesn't have any identity or sense of what is right or wrong. It's like in corporations, where people constantly tend to gauge the boss's or higher-ups' agenda (or likes and dislikes), mold their narrative accordingly, trickle it down, and play the power games.

Sometimes my old boss says, "check with ChatGPT." Of course, she says the same to other people, and her whole team uses ChatGPT to come up with procedures for how to do things, etc. Fine and dandy, but who is there to do the REAL work? If it works, that is fine. But when everybody is playing games with what is right or wrong, what they ask it is itself garbage (out of sync with reality). Mechanical, repeatable, non-contentious stuff is fine. Otherwise it is Garbage In, Garbage Out. Of course, they have to rely on the same 20% to get the basics done, sprinkle in some colorful, fantastic stuff, and promote it as if they achieved something. In cricket there is a saying about "sticking to basics". We are losing those basics: "human conscience".
 
Mechanical, repeatable, non-contentious stuff is fine. Otherwise it is Garbage In, Garbage out.
Those who make the policies for our society love AI, I think. Those geniuses who are able to summarize the results of these catastrophic policies have already arrived at the Final Solution for AI: it should be relegated to menial work, janitor and hole-digging jobs, where it is safe to completely mess up.
Human supervisor to AI:
- You messed up digging your hole? OK. Now dig another hole, but try to do it properly this time!
Nothing was harmed. Some soil was moved. No biggie!

But when (IMO soulless mechanical husk) Elon Musk - who I think is a bio AI running [amok] On Sentient Meat Circuits in an Organic Portal Hybrid Body - decides to wire computer Chips into human brains and nuke Mars to release drinking water, that's a massive harm to society.

Kantek was the greatest example of a Large Crystal AI becoming aware = going SKYNET = rogue in our solar system. Look what a nice and amazing Asteroid Field we have now, instead of a beautiful planet between Mars and Jupiter.

AI is AH-mazing man!! I'm telling you! Essentially I think AI is Satanic. The C's referred to a golden bait, a trap that Orion STS set for the souls to descend - Fall From Heaven - into lowly human bodies - where they became humanity, largely lost their soul and became zombie automatons to rise up and then be destroyed cyclically by the Lizzies & Co. Repeating this fate of theirs until Eternity.
This Golden Baiting with AI I think is currently going on.

The Destruction of our Minds. The greatest bait waiting for us to jump for the gold.
 
I've found that trying to get an LLM to self-correct its responses is generally pointless. Most often, an erroneous response was fixed by tweaking the prompt; other times, switching to a different model helped. If we consider LLMs as text predictors, the initial prompt is critical for the outcome. There's also a problem with attention: with large text corpora (which accumulate via chat history), LLMs focus more on the start and end of the text and less on the middle. The trick chat services use here is to have an LLM summarize previous messages to avoid overflowing the context window, but this comes at a loss of detail. The quality degrades over time, and there's no point in trying to correct the LLM, as this will probably reinforce the problems.

One needs to consider that, in general, this is a complicated computer system with a novel approach (or maybe not so novel, but we have the computing power now) to language processing. The LLM is only one component of it. There are also services providing website searches and summarizations (via other LLMs), document databases that store your conversations to simulate the agent's memory, vector databases that store the contents of your PDF attachments (for fuzzy text searches, often erroneous), services that execute scripts generated by LLMs to produce charts, etc.
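For what it's worth, here is a bare-bones sketch of that history-compression trick. The summarize callable and the token counter are placeholders, not any particular service's implementation: recent turns are kept verbatim, older ones get collapsed into a single summary, and that collapse is exactly where detail gets lost.

```python
# Rough sketch (hypothetical helpers) of chat-history compression: keep the most
# recent messages verbatim, replace everything older with an LLM-written summary
# so the conversation fits the context window - at the cost of lost detail.

def count_tokens(text: str) -> int:
    return len(text.split())                 # crude stand-in for a real tokenizer

def compress_history(messages: list[str], summarize, budget: int = 2000) -> list[str]:
    recent: list[str] = []
    used = 0
    # Walk backwards: the newest messages are kept word-for-word until the budget runs out.
    for msg in reversed(messages):
        if used + count_tokens(msg) > budget:
            break
        recent.insert(0, msg)
        used += count_tokens(msg)
    older = messages[: len(messages) - len(recent)]
    if not older:
        return recent
    # Everything older gets collapsed into one summary message (detail is lost here).
    return ["[summary of earlier conversation] " + summarize("\n".join(older))] + recent
```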

But when (IMO soulless mechanical husk) Elon Musk decides to wire computer Chips into human brains
I think that this is a missing piece of AGI. That two-way communication that will allow machines to pattern match signals from the brain that we are completely unconscious of. Will that mean access to precognitive abilities, demonstrated by remote viewing? :whistle:
 
LLMs are missing the "inspiration" part that only a soul can provide. So in a way an LLM is like a rudimentary brain in a jar. But a brain with a soul and the necessary protein antennas can be a very powerful tool, as we know. Besides the pattern recognition and data crunching parts, it is not completely different from dowsing rods or tea leaves or tarot cards - just rudimentary objects, but when influenced by someone whose subconscious is connected to higher levels, they can produce meaningful results that our conscious mind can benefit from. So in a way, they're just an interface to our own subconscious, which itself is an interface to other levels.

So someone with good "receivership capability" can use LLMs like one would use tarot cards or literally anything else that can be affected by your subconscious, really. Your own antenna guides the cards when you shuffle them, and the same antenna can guide your fingers to give an LLM just the right prompt/info. And just as you're not consciously controlling the shuffle, you may not be fully cognizant of why you're inspired to include certain information in the prompt, but whatever is guiding you is fully aware of the LLM and how it works and what it needs. Laura does this all the time when writing - except she has to do the pattern recognition, interpretation, and formatting for the reader herself. But this way, just as her experiment with Grok shows, you can kinda leverage an LLM to do part of that labor. Although her experience/talent with doing this manually also makes her uniquely capable of judging when the LLM did a good or bad job.

LLMs are very much deterministic, but we are not, and through us, they can potentially do something useful. I'm surprised that all those new age "automatic writers" haven't started doing "automatic prompting" lol. In either case the result depends entirely on the capabilities of the user - garbage in, garbage out, as they say.
 