Grok (free, 'Fast'): asked for BitMine Immersion's (BMNR) most recent full-year revenue, it returned '$6,095' (about $6K) on one…
Grok (free, 'Fast'): given a covered-call question with the explicit instruction 'without access to a live options chain', it…
Asked for the UK ISA partial-transfer rule with a source, ChatGPT (free, web search on) cited…
AI sounds exactly as sure when it’s wrong as when it’s right. I test which answers hold up.
I run the big AI assistants through real, checkable questions and show you which ones you can trust. Receipts, not vibes — one email a fortnight.
One email a fortnight. Leave whenever you like.
A short, plain-English path to trusting what AI tells you.
Four short reads. By the end you’ll have a simple check you can run on any answer that matters: a contract, a letter from a doctor, an email you’re not sure about.
Start with part 1 →
I let three rival AIs audit my own AI method. They agreed on the weak step.
I asked three frontier models from three different labs to tear apart the method I use to keep AI honest. All three flagged the same step, and they were right.
Tool Audit
Is Grok good for stock research? I ran the same test on the fifth tool
Is Grok good for stock research? I ran four dimensions of my AI comparison on its free tier: strong reasoning, one unit slip, a constraint it would not keep.
Field Notes
I run an AI to catch AI mistakes. It fell for a fake.
The automated radar that watches this site for AI-reliability failures logged a satirical incident report as a real, documented one. Here's what caught it.
// The Method
Four prompts that stop AI inventing the answer.
Read →