Tool Audit.

Posts here put AI tools against a real investing task. What each one was good at, where each one failed, which ones I kept using.

The tools are named and the tasks are ones I'd run anyway: same prompts, same day, fresh conversations, outputs saved verbatim. Verdicts are per task, not per tool, because no single tool wins a whole research session. In my testing, Claude reads documents best, Perplexity is best for cited retrieval, ChatGPT is the dependable middle, and Gemini structures research well on big names but invents data around options.

Each audit names a winner for the task it tested, shows the screenshots behind the verdict, and records what each tool got wrong as carefully as what it got right. The worst failures — an invented options premium table, an annual-report misread off by a factor of a thousand — are kept on /lessons with dates and screenshots.

Tools change fast, so every post carries the model version and the date it ran. A verdict from January isn't a verdict on today's model — which is half the argument for testing rather than trusting.
