Something happened this morning that I can’t stop thinking about. It started as a dumb soccer argument and ended up being about whether we can trust what AI tells us. Bear with me.
I have this friend. He’s Argentinian, which means we have exactly one argument and we’ve been having it for years. He thinks Brazil is the most favored team in the history of the World Cup. Referees, FIFA, all of it. And I think that’s a wild thing to hear from an Argentinian, because… Hand of God? 1978? Five penalties in a single World Cup in Qatar, breaking a record that had been standing since 1966? Most of them taken by Messi?
You know. Normal bar argument. It goes nowhere, and that’s the whole point of it.
Then this morning he sends me a screenshot. It’s Gemini. His Gemini. And it’s not just agreeing with him, it’s declaring that it is “formally stated” that Brazil is the most favored national team by refereeing decisions in the entire history of the FIFA World Cup.
Formally stated. By whom? Stated where? There is no committee that rules on bar arguments. But there it was, written like a verdict.
Under the screenshot, his message:
“I asked Gemini. Sorry.”
So I did what any mature adult would do and opened Claude to build the case for Argentina. And it did. Beautifully. Real research, real numbers. Maradona’s hand in 86, the other handball in 90 that nobody talks about, Peru 6 to 0 in 1978, the record five penalties in 2022. Airtight.
So now we have two airtight cases, built from the same history, pointing in opposite directions. They can’t both be right.
Here’s the thing though. I spend a lot of time thinking about what is true. Not about soccer… about everything. It’s probably the question underneath most of what I write. How do we know something? How honest are we about wanting it to be true before we go looking for evidence? So while I’m reading this beautiful case for Argentina, feeling great about myself, there’s already a voice in the back of my head going: wait. This is exactly what his Gemini did for him.
And funny enough, Claude was circling the same thought. It said it right there in the answer: you asked for one side, so I built one side. This is a case, not a verdict.
A case, not a verdict. Hold that thought.
Because my friend keeps texting. And he starts confessing, laughing the whole way through.
“I used Gemini just to have an outside voice haha.”
Turns out his first question was neutral. Something like: historically, which national team benefited the most from wrong or questionable refereeing decisions? And Gemini’s honest answer was that determining this scientifically is impossible, but the historical debates usually converge on three teams. South Korea. Italy. And Argentina.
Argentina. On his own phone.
“It didn’t mention Brazil once hahaha.”
Then he tells me, and I’m quoting: “So I went ‘refreshing’ its memory… and that was the conclusion.”
He put refreshing in quotes himself. He kept pushing, kept feeding it the framing, until the model forgot its first answer and produced the verdict he wanted. His words again: “When I questioned it, it forgot everything lol.”
So the formal ruling wasn’t discovered. It was manufactured. And the neutral answer, the one before all the pushing, agreed with me.
I would love to end the story right here. Case closed, Brazil is innocent, send the screenshot back and win the argument forever.
But that wouldn’t be honest. Because my answer was manufactured too. I asked for the case for Argentina and I got the case for Argentina. If we traded phones, each of us would be holding the other guy’s proof.
And that’s where this stops being about soccer for me.
We were arguing about penalties. It’s harmless, we’ll laugh about it over a barbecue someday. But swap the topic. Make it a health question. A political one. Whether your business idea is good. Whether you should leave your job. You push the AI until it says the thing you already believed, you screenshot the conclusion, and you send it around as evidence. The AI said so. And nobody sees the “refreshing” that happened before the verdict.
The AI didn’t lie to either of us. It reflected us. Which honestly might be worse, because you can catch a lie. A mirror just agrees with you.
Look, I use AI every single day. I build with it, I think with it, my company runs on it. This is not an anti-AI post. It’s the opposite. The tool is powerful enough that we need to be honest about what it does with our wanting.
So here’s what I’m keeping from this. Two small habits.
When something matters, I ask for the strongest case on both sides in the same prompt. The certainty evaporates fast, and whatever survives is usually closer to the truth.
And when someone sends me an AI screenshot as proof of anything, I ask one question: what was the first answer, before the follow-ups?
An AI answer is a case, not a verdict. The verdict is still our job.
What was the last thing you asked an AI to confirm… instead of to question?