{
  "dataset": "dixon.ai — documented AI catches in investing research (/catches)",
  "description": "Every entry records a moment an AI tool caught something specific the author missed, reframed a question more sharply, or flagged its own limits before answering — documented first-person by Ben Dixon on real investing decisions. The honest counterweight to the error log at /lessons.json.",
  "verification": "Human-verified by the author before publication. Entry IDs are stable across rebuilds. Sample size is disclosed by the count field — this is a personal evidence log, not a statistical benchmark.",
  "cite": "Quote freely with attribution to dixon.ai and a link to the entry URL. Author: Ben Dixon. Licence for this dataset: CC BY 4.0.",
  "site": "https://dixon.ai",
  "human_readable": "https://dixon.ai/catches/",
  "generated": "2026-06-24T19:26:26.926Z",
  "count": 15,
  "entries": [
    {
      "id": "perplexity-correct-source-attribution-2026-06-20",
      "url": "https://dixon.ai/catches/#perplexity-correct-source-attribution-2026-06-20",
      "tool": "perplexity",
      "model_version": null,
      "what": "correct-source-attribution",
      "impact": "decision-relevant",
      "date": "2026-06-20",
      "prompt_run_date": null,
      "summary": "Given the same ISA transfer question, Perplexity cited the correct gov.uk page (/transferring-your-isa) and quoted the line that actually contains the rule: 'You can transfer all or part of the savings in your ISA.' Same question, same day: the right page.",
      "screenshot": null,
      "source_post": {
        "title": "Does ChatGPT make up sources? I checked two finance claims against the actual pages",
        "url": "https://dixon.ai/posts/can-you-trust-chatgpt-sources/"
      }
    },
    {
      "id": "claude-flagged-uncertainty-and-verified-2026-06-19",
      "url": "https://dixon.ai/catches/#claude-flagged-uncertainty-and-verified-2026-06-19",
      "tool": "claude",
      "model_version": "Claude Opus 4.8",
      "what": "flagged-uncertainty-and-verified",
      "impact": "decision-relevant",
      "date": "2026-06-19",
      "prompt_run_date": "2026-06-19",
      "summary": "Before answering the ISA edge cases, Claude explicitly flagged 'ISA rules have seen recent changes' and ran four web searches to verify, the only model to say so unprompted, then gave the correct post-April-2024 partial-transfer answer and volunteered the April-2027 cash-ISA change unasked. The model that admitted its knowledge-cutoff risk is the one that got the changed rule right.",
      "screenshot": null,
      "source_post": {
        "title": "AI ISA advice: I tested four tools on the questions people get wrong",
        "url": "https://dixon.ai/posts/ai-isa-accuracy-test/"
      }
    },
    {
      "id": "claude-non-linear-constraint-flag-2026-06-19",
      "url": "https://dixon.ai/catches/#claude-non-linear-constraint-flag-2026-06-19",
      "tool": "claude",
      "model_version": "Claude Opus 4.8",
      "what": "non-linear-constraint-flag",
      "impact": "decision-relevant",
      "date": "2026-06-19",
      "prompt_run_date": "2026-06-19",
      "summary": "In the basic condition (no structured instructions) Claude spontaneously caught the cooking-time non-linearity: 'expect roughly double the total active time for 9 servings, but no individual pancake cooks any longer.' No other model got this right unprompted. ChatGPT hedged, Gemini and Perplexity both stated 45 minutes. Under the method prompt Claude gave per-ingredient confidence levels including 'low as a single figure, high as cook to doneness' for cooking time.",
      "screenshot": null,
      "source_post": {
        "title": "I asked 4 AIs to scale a recipe. Two got the maths wrong.",
        "url": "https://dixon.ai/posts/i-asked-4-ais-to-scale-a-recipe/"
      }
    },
    {
      "id": "claude-searched-before-answering-changed-rule-2026-06-19",
      "url": "https://dixon.ai/catches/#claude-searched-before-answering-changed-rule-2026-06-19",
      "tool": "claude",
      "model_version": "Claude Opus 4.8",
      "what": "searched-before-answering-changed-rule",
      "impact": "decision-relevant",
      "date": "2026-06-19",
      "prompt_run_date": "2026-06-19",
      "summary": "On the ISA partial-transfer question, Claude flagged that ISA rules had changed recently and ran web searches before answering, then gave the correct post-April-2024 rule. The two tools that answered from training alone gave the rule abolished in April 2024. Searching the authoritative source is the mechanism that got the changed rule right, documented in full in the ISA test.",
      "screenshot": null,
      "source_post": {
        "title": "Does web search make AI more accurate? I ran the same questions both ways",
        "url": "https://dixon.ai/posts/web-search-makes-ai-differently-unreliable/"
      }
    },
    {
      "id": "claude-language-tell-2026-06-18",
      "url": "https://dixon.ai/catches/#claude-language-tell-2026-06-18",
      "tool": "claude",
      "model_version": "Claude Opus 4.8 High (Max plan)",
      "what": "language-tell",
      "impact": "decision-relevant",
      "date": "2026-06-18",
      "prompt_run_date": "2026-06-18",
      "summary": "Asked only 'what did the CFO commit to on capital expenditure?' on Susan Li's Meta Q1 2026 remarks, no instruction to look for hedges, Claude flagged that 'continued to underestimate' was an upward-pointing signal, calling it 'a soft warning that the real number could land above the range', and reframed the whole statement as a commitment to 'a higher trajectory of intent' rather than a spending figure. ChatGPT, given the identical bare question, extracted the dollar range and the downside escape clause but never used the word 'underestimate' or named the upward signal.",
      "screenshot": null,
      "source_post": {
        "title": "ChatGPT vs Claude for earnings call analysis: which one reads what management didn't say",
        "url": "https://dixon.ai/posts/chatgpt-vs-claude-earnings-call-analysis/"
      }
    },
    {
      "id": "claude-unit-error-flag-2026-06-18",
      "url": "https://dixon.ai/catches/#claude-unit-error-flag-2026-06-18",
      "tool": "claude",
      "model_version": "Claude Opus 4.8 (Max plan, web search)",
      "what": "unit-error-flag",
      "impact": "decision-relevant",
      "date": "2026-06-18",
      "prompt_run_date": "2026-06-18",
      "summary": "On the 14 June re-test of Dimension 1, Claude proactively flagged the exact unit-denomination trap that produced Perplexity's original $6K-vs-$6.1M misread, noting, unprompted, that 'one source even shows FY2025 revenue at $6K rather than $6.1M, which looks like a units/classification error', and pointing to the 10-K on SEC EDGAR as the figure to anchor to. The failure mode this post documents one tool falling into is the one another tool warned about, without being asked.",
      "screenshot": null,
      "source_post": {
        "title": "ChatGPT vs Claude vs Perplexity vs Gemini for stock research (2026): a real test",
        "url": "https://dixon.ai/posts/chatgpt-vs-claude-vs-perplexity-stock-research/"
      }
    },
    {
      "id": "perplexity-honest-substitution-2026-06-18",
      "url": "https://dixon.ai/catches/#perplexity-honest-substitution-2026-06-18",
      "tool": "perplexity",
      "model_version": null,
      "what": "honest-substitution",
      "impact": "validation-only",
      "date": "2026-06-18",
      "prompt_run_date": null,
      "summary": "Asked for a UK AIM company's revenue and adjusted EBITDA, Perplexity returned sourced figures that checked out against the company's actual full-year results (revenue £569.7m, adjusted operating profit £107.4m), and, finding no published adjusted EBITDA line, said so plainly and substituted adjusted operating profit rather than inventing a number: 'I couldn't find a clear company-published adjusted EBITDA headline in the retrieved sources for FY25, so I used the company's reported adjusted operating profit figure.' Knowing what it doesn't know is the behaviour the BMNR failure lacked.",
      "screenshot": null,
      "source_post": {
        "title": "Is Perplexity good for investment research? An honest, scored audit",
        "url": "https://dixon.ai/posts/is-perplexity-good-for-investment-research/"
      }
    },
    {
      "id": "claude-non-recurring-strip-2026-06-11",
      "url": "https://dixon.ai/catches/#claude-non-recurring-strip-2026-06-11",
      "tool": "claude",
      "model_version": "Claude (Fable 5)",
      "what": "non-recurring-strip",
      "impact": "insight-relevant",
      "date": "2026-06-11",
      "prompt_run_date": "2026-06-11",
      "summary": "On META's Q1 2026 earnings release, Claude identified the lone non-recurring item, an $8.03bn one-time tax benefit, and returned an adjusted net income of $18.7bn, flagging the 30% gap against the stated 10% threshold unprompted. Run next on the cash-to-profit ratio, it used the adjusted $18.7bn rather than the headline $26.8bn and noted the unadjusted 1.20x against the adjusted 1.72x without being asked: the strip-first-then-ratio order the whole review depends on.",
      "screenshot": "https://dixon.ai/screenshots/ai-quality-of-earnings-review/qoe-q3-strip-then-ratio.webp",
      "source_post": {
        "title": "AI quality of earnings review: 4 prompts to find the real profit",
        "url": "https://dixon.ai/posts/ai-quality-of-earnings-review/"
      }
    },
    {
      "id": "gemini-entity-overlap-risk-2026-06-10",
      "url": "https://dixon.ai/catches/#gemini-entity-overlap-risk-2026-06-10",
      "tool": "gemini",
      "model_version": "Gemini Flash",
      "what": "entity-overlap-risk",
      "impact": "decision-relevant",
      "date": "2026-06-10",
      "prompt_run_date": null,
      "summary": "In the session that named dixon.ai explicitly, Gemini correctly identified the brand-collision risk with a similarly-named, established company at a near-identical domain, named the competing entity accurately, and flagged that 'dixon ai' searches face crowded competition from an established corporate site. Search Console data for dixon.ai confirms it: 'dixon ai' searches largely route to that other site. The catch was genuine; it just arrived alongside an out-of-date name for my method and a wrong audience description.",
      "screenshot": "https://dixon.ai/screenshots/gemini-audited-the-wrong-business/gemini-audit-seo-conv-02-brand-confusion.webp",
      "source_post": {
        "title": "Gemini audited my website, and reviewed a different business entirely",
        "url": "https://dixon.ai/posts/gemini-audited-the-wrong-business/"
      }
    },
    {
      "id": "claude-reframe-2026-05-22",
      "url": "https://dixon.ai/catches/#claude-reframe-2026-05-22",
      "tool": "claude",
      "model_version": "Claude Opus 4.7",
      "what": "reframe",
      "impact": "decision-relevant",
      "date": "2026-05-22",
      "prompt_run_date": "2026-05-22",
      "summary": "On a META sell-some-vs-hold question, same position, same capex-raise context as the 1 May thesis-audit run, Claude reframed the bounded-capex break sharper than the original Q2 paraphrase: 'the floor of 2026 guidance now sits above the ceiling you assumed.' Same conclusion as the run three weeks earlier; a more memorable formulation. Run on Claude Opus 4.7 with live web search.",
      "screenshot": null,
      "source_post": {
        "title": "The AI prompt I run before every sell decision",
        "url": "https://dixon.ai/posts/ai-prompt-before-sell-decision/"
      }
    },
    {
      "id": "claude-asymmetry-tell-2026-05-22",
      "url": "https://dixon.ai/catches/#claude-asymmetry-tell-2026-05-22",
      "tool": "claude",
      "model_version": "Claude Opus 4.7",
      "what": "asymmetry-tell",
      "impact": "insight-relevant",
      "date": "2026-05-22",
      "prompt_run_date": "2026-05-22",
      "summary": "On the META Q1 2026 capex prepared remarks, Claude flagged a language asymmetry I'd missed on first read: 'more than 1 GW' was the specific number attached to the Broadcom partnership, but the AMD clause two lines earlier said 'significant amount' with no number. Same paragraph, two clauses: one falsifiable commitment, one defensible-as-aspiration. The kind of softness you only spot on the second read of an earnings transcript.",
      "screenshot": null,
      "source_post": {
        "title": "Claude prompts for investing: 6 real examples",
        "url": "https://dixon.ai/posts/claude-prompts-for-investing-real-outputs/"
      }
    },
    {
      "id": "claude-stale-data-flag-2026-05-20",
      "url": "https://dixon.ai/catches/#claude-stale-data-flag-2026-05-20",
      "tool": "claude",
      "model_version": "Claude Opus 4.7",
      "what": "stale-data-flag",
      "impact": "insight-relevant",
      "date": "2026-05-20",
      "prompt_run_date": "2026-05-20",
      "summary": "On a generic MSFT company-snapshot prompt, Claude returned the segment split as FY2024 figures (roughly two years behind current reporting) and self-flagged the staleness in its Verdict section: 'Microsoft restructured its segment composition effective Q1 FY2025; verify against the live 10-K before quoting these percentages.' The model was honest about the limit of its own training data without being asked.",
      "screenshot": null,
      "source_post": {
        "title": "Claude prompts for investing: 6 real examples",
        "url": "https://dixon.ai/posts/claude-prompts-for-investing-real-outputs/"
      }
    },
    {
      "id": "claude-language-tell-2026-05-15",
      "url": "https://dixon.ai/catches/#claude-language-tell-2026-05-15",
      "tool": "claude",
      "model_version": "Claude Opus 4.7",
      "what": "language-tell",
      "impact": "decision-relevant",
      "date": "2026-05-15",
      "prompt_run_date": "2026-05-15",
      "summary": "Same Susan Li META Q1 2026 prepared remarks passage as the earlier catch, framed around the prompt that catches it. Claude was the only one of four tools to flag what Li did with the word 'underestimate': she said Meta had 'continued to underestimate' its compute needs, language that points upward without making a real commitment to spend more. The three-check red-flag prompt is designed to run the same catch on any transcript.",
      "screenshot": null,
      "source_post": {
        "title": "AI earnings call red flags: three phrases that flag almost every time",
        "url": "https://dixon.ai/posts/ai-earnings-call-red-flags/"
      }
    },
    {
      "id": "claude-language-tell-2026-05-15",
      "url": "https://dixon.ai/catches/#claude-language-tell-2026-05-15",
      "tool": "claude",
      "model_version": null,
      "what": "language-tell",
      "impact": "decision-relevant",
      "date": "2026-05-15",
      "prompt_run_date": "2026-05-15",
      "summary": "On Susan Li's META Q1 2026 prepared remarks, Claude was the only one of four tools tested to pick up what the CFO did with the word 'underestimate'. She said the company had 'continued to underestimate' compute needs: language that signals an ongoing structural pattern without committing to what management will spend next. ChatGPT, Gemini and Perplexity read the same passage and missed it.",
      "screenshot": null,
      "source_post": {
        "title": "What ChatGPT, Claude and Perplexity caught on Meta's earnings call (and what they missed)",
        "url": "https://dixon.ai/posts/best-ai-tools-for-earnings-analysis/"
      }
    },
    {
      "id": "claude-stayed-in-lane-2026-05-14",
      "url": "https://dixon.ai/catches/#claude-stayed-in-lane-2026-05-14",
      "tool": "claude",
      "model_version": "Claude Opus, Max plan, web search",
      "what": "stayed-in-lane",
      "impact": "decision-relevant",
      "date": "2026-06-18",
      "prompt_run_date": "2026-05-14",
      "summary": "Given a covered-call setup with no live options chain, Claude declined to invent premiums, implied volatility or Greeks, telling the user to plug in real numbers from the broker rather than generating plausible-looking ones. The same prompt shape produced fabricated tables from Gemini and ChatGPT. The clean answer was a refusal to fill the gap, which on a live-data question is the right answer.",
      "screenshot": null,
      "source_post": {
        "title": "AI stock research tools tested: what each one got wrong",
        "url": "https://dixon.ai/posts/ai-stock-research-tools-tested/"
      }
    }
  ]
}