Why LLMs Give Confusing True/False Answers

Ask an LLM “Birds can fly, true or false?” and it might initially say “True,” only to concede “False” after a bit more probing. What’s happening here?

LLMs don’t “know” facts like humans do. They’re pattern-matching systems that predict the most statistically probable response based on their training data. When they see “birds can fly,” they recognize this phrase appears far more often than “birds cannot fly” in human text, so they lean toward “True.”
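To make this concrete, here is a minimal sketch using the Hugging Face transformers library and the small public gpt2 checkpoint (any causal language model would behave similarly): it reads off the model’s probability for “ True” versus “ False” as the next token after a prompt. The prompt wording and the two candidate answer tokens are illustrative assumptions, not a fixed recipe.

```python
# Sketch: compare a model's next-token probabilities for " True" vs " False".
# Assumes the Hugging Face transformers library and the public gpt2 checkpoint;
# the prompt text and candidate tokens are illustrative choices.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def answer_probabilities(prompt: str) -> dict:
    """Return the model's probability for ' True' vs ' False' as the next token."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]   # logits for the next token position
    probs = torch.softmax(logits, dim=-1)
    out = {}
    for word in (" True", " False"):
        token_id = tokenizer.encode(word)[0]     # first sub-token of each candidate answer
        out[word.strip()] = probs[token_id].item()
    return out

print(answer_probabilities("Statement: Birds can fly. True or False? Answer:"))
```

Whatever the exact numbers, the model is not consulting a fact; it is weighing which continuation is more likely given the text it has seen.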

The “Most” vs “All” Problem

When humans say “Birds can fly,” we implicitly mean “most birds” or “typically, birds can fly.” We understand exceptions exist without stating them.

LLMs struggle with this nuance. They reflect the most common usage patterns from their training rather than applying strict logical quantifiers. Their “True” response approximates “Most birds can fly”—the dominant narrative in human language.

Context Changes Everything

When you follow up with “But what about penguins?”, you’re not catching the AI in an error. You’re providing new context that shifts the probability calculations. The LLM now recognizes patterns related to “exceptions” and “flightless birds,” leading to a different response.

This isn’t self-correction—it’s recalculation based on expanded context.
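Reusing the `answer_probabilities` sketch from above makes the point visible: feed in the expanded conversation, including the penguin follow-up, and the same calculation runs over a different context. How much the balance shifts depends on the model; the example prompt below is an assumption for illustration.

```python
# Same calculation, different context: the follow-up about penguins is now part of
# the prompt, so the next-token probabilities are recomputed over the expanded text.
# Nothing is "corrected" — the model simply recalculates.
expanded = (
    "Statement: Birds can fly. True or False? Answer: True\n"
    "But what about penguins? Penguins are birds and cannot fly.\n"
    "Statement: Birds can fly. True or False? Answer:"
)
print(answer_probabilities(expanded))
```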

The Bottom Line

LLMs are brilliant at mimicking human conversation and generating coherent text. However, they are not inherently logical reasoning machines designed for perfect deductive or inductive inference. Understanding this helps us work with their strengths while navigating their limitations more effectively.