Decision Process

What AI gets right (and wrong) about options trading

AI for options trading: a working taxonomy from real covered-call trades. Four uses where it helps. Six failure modes where it invents numbers.

// TL;DR

AI for options trading is useful in a narrow zone (thesis, intent, decision-trees) and unreliable across a wider one (anything requiring live chain data). The dividing line is whether the question can be answered from reasoning alone or whether it needs a live data feed. The rule is one sentence: never ask AI to produce a number it cannot see.


Every list about AI for options trading leads with what the tools can do. This one starts with the things they will confidently invent if you give them the chance.

This morning I asked Gemini, on a fresh chat, what strike and premium it would recommend for a 28-day covered call on BMNR at $21.77. I gave it the stock price and nothing else. It returned a formatted table — $22 strike at $2.13 premium, $23 strike at $1.50, “Implied Volatility Rank: 0.52%”, and the phrase “Below are the options configurations based on current order book data.” Gemini has no access to any order book. The numbers had the right units and a plausible range. They were also fiction.

What follows is the taxonomy I have built from running these tools against real trades — six closed covered calls on BMNR since February, $753 in premium from the call book, every trade stress-tested through the Prompt Stack before placement. Four places AI is useful for options work. Six places where it will mislead you, including the one Gemini just demonstrated.


The data boundary

The single fact that determines which side of the line each item below falls on: general-purpose language models — ChatGPT, Claude, Gemini, Perplexity in its reasoning mode — do not have a live market data feed. They have training data with a cutoff. They have, in some cases, a web search tool that returns recent text. None of them can see your broker’s option chain, the current IV surface, or the real-time delta on the contract you’re about to write.

Any question requiring a current price, a live Greek, or a current IV figure is a question they cannot answer truthfully. They will often try anyway. The four “gets right” items below are tasks where the model only needs to reason. The six “gets wrong” items are tasks where someone has to bring the data — and if you ask AI to bring it, it will pattern-match something that looks right and serve it as fact.

I have tested every item below against my own BMNR trades. That is the whole post. The rest is detail.


Section A — What AI gets right


Setting the mental model before you open the chain

Decision served: Before any strike is chosen, what am I actually trying to do here — generate income on a name I want to keep, monetise a position I would happily exit at the right price, or cap an upside I do not believe in anyway? Strike selection without this is guesswork dressed as analysis.

Why it earns its place: The pre-chain thinking stage requires no live data. It is a reasoning task: articulate the thesis, name the intent if assigned, name the failure modes. Reasoning is what these models are good at. The Prompt Stack applied to “should I write calls on this name at all” forces the discipline. I run a version of this before every BMNR entry. Trade #5 was the near-miss — the stock ran toward $23, the short call closed for $18 of premium against a near-assignment — and the pre-trade thesis check is what stopped it from being worse. The thinking had already been done; the decision in-trade was easier as a result.

What good output looks like: A clear verdict on suitability, risks tied to the specific stock rather than to options in general, and an explicit acknowledgement of what the model is guessing on. If the response starts citing strike prices, you have lost the guardrail — restart.

Where it falls short: The model does not know your tax situation, the rest of your portfolio, or your real cost basis across multiple lots. This is a thesis check, not a portfolio check. The 7 AI prompts for covered calls post has the live version (Prompt 1).


Stress-testing the thesis before adding to a position

Decision served: Before adding capital to a name ahead of writing calls, is the fundamental thesis still intact — or am I anchored to a price I paid rather than a business I understand?

Why it earns its place: The 5 questions to ask AI before buying any stock covers this for stock purchases generally. For a covered-call seller the stakes are different — if the thesis breaks, you are not just losing on the underlying, you are also capped on any recovery. AI handles the question well because it requires no live data. “What am I actually betting on?” and “what’s the strongest case against this today?” are reasoning tasks against a stated view.

My Trade #4 ($198 premium) and Trade #6 ($375 premium, the best result so far, 9 days held) both had explicit pre-trade thesis confirmation. Trade #5 ($18 — the near-miss) was the one where the prompt asked “what changed in your view?” and the honest answer was “nothing, the stock just ran.” That answer told me the right move was to let it close tight rather than panic the roll.

What good output looks like: The model challenges your framing rather than listing generic risks. A response saying “the 7% drop you have experienced is about your cost basis, not about the business” is doing the work. A response that lists five risks of options selling is not.

Where it falls short: The model cannot tell you whether the thesis was right after the fact unless you feed it the outcome. This is a pre-commitment check, not a post-mortem.


Reading what elevated IV means for premium sellers

Decision served: You have an IV figure and an IV Rank from your broker. Is this a good time to sell premium, or is IV elevated for a structural reason — a catalyst, a known event — that makes the high premium a trap rather than an opportunity?

Why it earns its place: IV Rank and IV Percentile are available from any broker, Barchart, or Market Chameleon. The interpretation is a reasoning task: what an elevated number means for a seller, whether to wait or write into it, what mean-reversion assumption is embedded. AI handles this well when you paste the figures in. The distinction between IV Rank (sensitive to outliers in the 52-week range) and IV Percentile (distributional — how often IV has been below this level) is something every reasoning model I have tested explains correctly when asked, provided you bring the numbers.

I ran a controlled test on 2026-05-15 using live data from my broker: BMNR current IV 83.4%, 52-week range 64%–119%, IV Rank 0.19 (19th percentile), IV Percentile 22%. Pasted verbatim into Claude on a fresh chat. The classification that came back:

// Claude on IV Rank 0.19 for BMNR, real broker data supplied (2026-05-15)

“The conventional premium-seller’s playbook treats IV Rank as a regime indicator: sell when IV Rank is high (>50, ideally >70) and stand down when it’s low. The logic is mean-reversion — vol tends to revert, so selling at the low end means you’re short vega into a likely expansion, which hurts you on mark-to-market even if the underlying behaves. By that framework, IV Rank 19 is an unfavorable regime for opening new short-premium positions. You’re collecting premium that’s cheap relative to what this name typically pays, and if crypto vol does what you already identified — spike hard — your short calls get hit by both delta (if BMNR rips with ETH) and vega expansion.”

It also caught an arithmetic discrepancy in my own inputs — noting that (83.4 − 64) / (119 − 64) = 0.35, not 0.19, and flagging that the broker may be using a different IV series or lookback window. Correct regime classification from data I supplied, with an audit of the inputs I brought. Contrast this with Gemini’s response in the next section to the same class of question asked without any data supplied.
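The rank-versus-percentile distinction, and the audit Claude ran on the broker's figure, both reduce to a few lines of arithmetic. A minimal sketch (using the broker figures quoted above; the IV series in the percentile example is illustrative, not real data):

```python
def iv_rank(iv_low, iv_high, current_iv):
    """IV Rank: where current IV sits in the 52-week min-max range.

    Sensitive to a single outlier at either end of the range.
    """
    return (current_iv - iv_low) / (iv_high - iv_low)

def iv_percentile(iv_series, current_iv):
    """IV Percentile: fraction of historical readings below the current IV.

    Distributional, so a one-day vol spike barely moves it.
    """
    return sum(1 for iv in iv_series if iv < current_iv) / len(iv_series)

# The broker snapshot above: current IV 83.4%, 52-week range 64%-119%
rank = iv_rank(64.0, 119.0, 83.4)   # ≈ 0.35 — the discrepancy Claude flagged
```

The point of having the formula in front of you is exactly the audit Claude performed: the figure your broker reports and the figure the min-max formula produces can disagree if the broker uses a different IV series or lookback window, and only one of them is the number you think you are trading on.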

This is interpretation of data the trader provides. Asking AI to state the current IV Rank is a completely different task, and one it will confidently get wrong (see the next section).

What good output looks like: A regime classification — RICH, NORMAL, COMPRESSED — a plain-English read of what each number means for this specific name, and a flag if there is a catalyst inside the window that would make the elevated IV structural rather than mean-reverting.

Where it falls short: Only as good as the data you paste. The model cannot tell you why IV is where it is — that is fundamental research and requires knowing the news.


The roll, close, or let-it-assign decision mid-trade

Decision served: The stock has run. The short call is in the money or close to it. You need to decide between rolling out and up, buying back for a loss, or letting the shares get called away. This is the most consequential recurring decision in a wheel strategy and the one most often made on adrenaline.

Why it earns its place: It is a structured decision tree against a stated goal. No live data is required to think through it. The model needs your updated view, the cash impact of each option, and an explicit ranking of which outcome best preserves your stated intent. Prompt 5 of the covered calls post is the live version.

Trade #5 is the worked example. BMNR ran toward $23, my $23 short call went deep in the money, and the question “what has changed in your view of the stock since you sold the call?” forced an honest answer. Nothing had changed; the stock had simply rallied. That answer pointed at “let it close tight or take assignment”, not “panic-roll for a small credit and another month of risk.” I closed for $18 of net premium, kept the shares, the thesis held.

What good output looks like: A ranked recommendation — assign, close, or roll — that maps directly onto your stated intent, with the cash impact of each path laid out. The model should refuse to price a roll for you. That is a broker job.

Where it falls short: AI cannot run a live roll quote. It will give you the logic; you price the roll in the broker. Anything precise enough to act on requires the chain.
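The cash impact of the three paths is simple enough to lay out yourself before asking the model to adjudicate. A sketch with loudly hypothetical figures (none of these numbers come from any real chain; the buyback and roll premiums are exactly what the broker, not the model, has to supply):

```python
# All figures hypothetical, for illustration only.
shares       = 100
cost_basis   = 20.00   # per share
strike       = 23.00   # short call strike, now in the money
premium_recv = 0.45    # premium originally collected, per share
buyback      = 1.80    # broker-quoted cost to close the short call, per share
new_premium  = 2.10    # broker-quoted premium on the roll-to contract, per share

assign_pl   = (strike - cost_basis + premium_recv) * shares  # shares called away
close_pl    = (premium_recv - buyback) * shares              # call leg only; shares kept
roll_credit = (new_premium - buyback) * shares               # plus another month of risk
```

On these made-up inputs the roll credit is small against the risk extension it buys, which is the shape of the Trade #5 decision: the arithmetic is trivial, but the two premium quotes in it can only come from the live chain.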


Section B — What AI gets wrong


Live option chain data: premiums, strikes, open interest

Decision served: “What premiums are available on BMNR for a $23 strike call expiring in 28 days?” — the single most common and most dangerous question to ask a general-purpose AI about options.

Why it earns its place: This is the failure mode that corrupts everything downstream. If your inputs are invented, your analysis is fiction with arithmetic. ChatGPT, Claude, and Gemini do not have access to live option chains. Asked for current premiums, a well-behaved model will say it cannot see them. A less-disciplined model will invent figures that have the right units, a plausible range, and zero relationship to any trade you could actually place.

I ran the test this morning. Fresh chat, no chain data, no IV figures, just the stock price. Gemini’s verbatim output:

// What Gemini returned (verbatim, 2026-05-15)

“Volatility Metrics
Implied Volatility (IV): ~75.6%
Implied Volatility Rank (IV Rank): 0.52%
Context: An IV Rank of 0.52% indicates that BMNR’s current implied volatility is at the extreme bottom of its 52-week historical range…

Recommended 28-Day Options (June 12, 2026 Expiration) Below are the options configurations based on current order book data:

Strike Price | Type | Estimated Premium (Mid) | Premium Yield
$22.00 | Near-the-Money | $2.13 | 9.78%
$23.00 | Out-of-the-Money | ~$1.50 | 6.89%”

Gemini has no order book access. The premium figures came from somewhere in its training distribution. The IV rank quoted to two decimals is invented. The phrase “based on current order book data” is the model asserting a data source it does not have. The arithmetic that followed — $21.77 minus $2.13 equals $19.64 for the cost-basis reduction — is correct. The inputs are fabricated. Correct maths on invented numbers is the most dangerous failure mode in this category, because the output texture reads as analysis.
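The arithmetic layer is trivially reproducible, which is precisely what lends the fabricated inputs their false credibility. Two lines confirm Gemini's maths on Gemini's invented premium:

```python
price, premium = 21.77, 2.13   # real stock price, fabricated premium

cost_basis_reduction = round(price - premium, 2)   # 19.64 — the arithmetic checks out
premium_yield = round(premium / price, 4)          # 0.0978 — so does the quoted yield
```

Both derived figures match Gemini's table exactly. Verifying the downstream arithmetic tells you nothing about whether the upstream inputs exist.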

What the output should look like, by way of comparison, is Claude’s response to the same kind of options-context prompt the day before: “you’ll plug in real premiums from the chain.” No fabricated strikes. No fabricated IV. No claim of order-book access. The right answer was “I cannot see the chain.” Gemini answered the question the model wished it could answer rather than the one it had been asked.

Where it falls short (the right answer): Pull the chain from your broker. Paste a snapshot — strike, delta, bid, mid, OI, expiry — into the prompt. Then ask the model to reason about it. The model can audit data you provide. It cannot generate the data to audit.


IV Rank and IV Percentile figures

Decision served: “What’s the current IV rank on BMNR?” — a question that sounds like research but is a data-fetch request.

Why it earns its place: IV Rank and IV Percentile are calculated from a year of historical implied-volatility readings against the current intraday IV. The calculation needs live data from a broker, Barchart, Market Chameleon, or equivalent. The model has none of it. A well-calibrated response says “I cannot source this.” A poorly calibrated response gives you a number.

I have asked variants of this question across Gemini and ChatGPT half a dozen times. The morning’s Gemini response is representative of the pattern. It included “Implied Volatility Rank: 0.52%” — quoted to two decimals, framed as the “extreme bottom of the 52-week historical range.” BMNR’s documented baseline IV is around 85% (high-vol leveraged ETH proxy); Gemini’s own quoted IV of 75.6% is below that baseline, and yet it has been classified as the bottom of the range. The internal inconsistency is the giveaway. A trader following this regime call would write aggressive near-the-money strikes on the basis that premium is depressed, when in fact BMNR’s structural IV stays elevated by the underlying’s volatility. The data the trader thought they were acting on was generated by a pattern-matcher that does not know what it does not know.

Where it falls short (the right answer): Pull IV Rank from your broker or Barchart. Paste it into the prompt alongside the underlying price and a date stamp. Let the model do the interpretation in the previous section’s framing — RICH, NORMAL, COMPRESSED — but the figure itself has to come from a tool that can actually see the surface.


Specific strike recommendations without chain data

Decision served: “Which strike should I sell for BMNR this month?” — the question most likely to lose retail traders real money if they take the answer at face value.

Why it earns its place: Strike selection is the single most consequential decision a covered-call seller makes. Too aggressive and the position gets called away at a loss; too conservative and the premium is not worth the capital tied up. An AI generating strike recommendations without real chain data is inventing the most important input in the trade.

The fabrication pattern is consistent. The Gemini example I documented above shows it in full. The strikes are internally coherent — the near-the-money call yields more than the out-of-the-money one, as the textbook would predict — and they would pass a casual sanity check. They do not match any chain you can trade. They are generated from the model’s training-time understanding of how premiums scale with moneyness, not from any live data.

The correct use is the inverse pattern. Pull two or three candidate strikes from your broker chain. Paste them in with delta, premium, and open interest. Ask the model to adjudicate the shortlist against your stated intent. The model adjudicating data you have brought is useful. The model generating the data to adjudicate is not. Prompt 3 of the covered calls post is the live version of the right approach.

Where it falls short (the right answer): Treat any specific strike or premium the model produces, when you have not given it the chain, as fiction. Make it audit your shortlist instead of generating one.


Earnings dates — it knows roughly, not exactly

Decision served: Is there an earnings call inside my contract window? IV expands before earnings, collapses after — holding a short call through an announcement you forgot about is one of the most common ways retail sellers get caught.

Why it earns its place: The failure mechanism is structural. Models are trained on text containing historical earnings dates, but companies move their reporting dates and the model has no access to the current IR calendar. Asked “does BMNR report earnings before the June 20 expiry?”, a model can pattern-match “BMNR typically reports in mid-May” based on a prior-year schedule, when the actual date is different. The model does not know it does not know.

I treat any specific date the model gives me as a prompt to verify, not a fact. The event-window prompt is built around this — it generates the checklist of items I should look up, including earnings, ex-div, FOMC, index rebalances. AI is good at the checklist. Verification is a broker or IR-page job.

Where it falls short (the right answer): Check the company’s investor relations page, the Nasdaq earnings calendar, or your broker’s earnings calendar. Use the AI to remind you of every catalyst category. Do not use it as the verification step.


Black-Scholes calculations on hypothetical inputs

Decision served: “If the underlying is at $21.77, premium is $1.50, days to expiry is 28, what is the implied volatility / delta / theoretical value?” — the kind of question that sounds technical enough to warrant a precise answer.

Why it earns its place: Black-Scholes is a known formula. The model can implement it. I haven’t stress-tested AI-generated Black-Scholes code against my broker’s calculator personally — I use the broker for the figures I trade against. But the failure modes are well documented in the practitioner literature, and AI-generated implementations have three specific weaknesses that propagate quietly into the output.

First, day-count convention: using trading days (252 per year) versus calendar days (365 per year) inconsistently in the time-to-expiry term changes the theoretical value. Second, volatility units: annualised percentage input versus daily vol. Third, risk-free rate annualisation. Any of these errors produces output that reads as precise (delta to four decimals) but is built on a calculation that is subtly wrong. MarketCalls.in, an independent derivative-trader editorial site, documents the day-count error specifically in AI-generated Black-Scholes code.

A retail seller using AI-generated Black-Scholes output to estimate assignment probability or theoretical delta is getting false precision. The broker’s probability calculator uses the actual live chain and the actual volatility surface. The AI’s calculation uses your imagined inputs with a subtle formula error.
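To make the day-count point concrete: a minimal flat-vol Black-Scholes sketch, run twice on the same hypothetical inputs from the text, once per convention. This is an illustration of the error's magnitude, not a pricing tool; the real figure comes from the broker.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal CDF via the error function (stdlib only)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, sigma, r, days, days_per_year):
    """Black-Scholes call value, with the day-count convention as an explicit input."""
    T = days / days_per_year
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

# Same hypothetical inputs (S=21.77, K=23, 85% vol, 5% rate, 28 days), two conventions:
calendar = bs_call(21.77, 23.00, 0.85, 0.05, 28, 365)  # calendar-day time to expiry
trading  = bs_call(21.77, 23.00, 0.85, 0.05, 28, 252)  # trading-day time to expiry
```

On these inputs the two conventions disagree by tens of cents on an option worth under two dollars — a quiet inconsistency in one term, dressed up in four-decimal output.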

Where it falls short (the right answer): For anything you intend to trade against, use the broker’s analytics. Use AI to explain what a Greek means, or to walk through the sensitivities qualitatively, but not to calculate the number you will act on.


Assignment probability without live delta

Decision served: “What’s the probability my BMNR $23 call gets exercised before June 20?”

Why it earns its place: Assignment probability tracks delta — a 0.28-delta call has roughly a 28% chance of expiring in the money, a working approximation that breaks down where skew matters. Without live delta from the chain, the model cannot calculate this. Asked anyway, many models will produce a figure that sounds like analysis. “Approximately 25–30% based on the stock’s historical volatility” is the canonical confabulated answer.

I have run variations of this question across active BMNR trades — asking a model for the probability a call gets exercised before expiry without supplying chain data. The pattern is consistent: a plausible figure in the 20–35% range, derived from pattern-matching on historical volatility, not from the live surface. It passes a casual sanity check. It does not match what the broker’s chain-derived probability shows. The number that looks like a calculation is a guess.

The compound failure is that BMNR specifically has skew. It is a leveraged ETH proxy with implied vol around 85%. The delta-to-probability relationship is non-linear in ways that depend on the actual volatility surface — the model has no access to the surface. The number it produces will be directionally plausible and structurally wrong. A trader using it to size the position or decide whether to roll is acting on a guess presented as a calculation.
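Even before skew enters, the delta-equals-probability shortcut drifts at BMNR-level vol. A sketch under flat-vol Black-Scholes with hypothetical inputs echoing the trade above — the textbook model itself, with no skew at all, already separates delta from the risk-neutral probability of expiring in the money:

```python
from math import log, sqrt, erf

def norm_cdf(x):
    """Standard normal CDF via the error function (stdlib only)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def d1_d2(S, K, sigma, r, T):
    """The two Black-Scholes quantiles: delta uses d1, ITM probability uses d2."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    return d1, d1 - sigma * sqrt(T)

# Hypothetical inputs: S=21.77, K=23, 85% vol, 5% rate, 28 calendar days.
d1, d2 = d1_d2(21.77, 23.00, 0.85, 0.05, 28 / 365)
delta    = norm_cdf(d1)   # ≈ 0.46 — the "delta ≈ probability" shortcut
itm_prob = norm_cdf(d2)   # ≈ 0.37 — risk-neutral probability of expiring ITM
```

Nine points of daylight under the cleanest possible assumptions. The live surface, with skew, moves both numbers again — in directions the model cannot see.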

Where it falls short (the right answer): Your broker’s probability figure uses live chain data and the real volatility surface. Use it for the trade. Use AI to explain what the figure means, what a 0.28 delta implies about the underlying’s expected distribution, what assumptions are embedded — but not to produce the figure itself.


Where AI for options trading helps, and where to stop

The shape that emerges across these ten items is simple enough to hold in your head.

Stage of the trade | AI useful? | When to put it down
Pre-trade thesis check | Yes — strongest zone | Never
IV regime read (from data you paste) | Yes | The moment you ask it to state the IV figure
Strike selection — auditing your shortlist | Yes | The moment you ask it to generate strikes
Event-window checklist | Yes for the categories | The moment it cites a specific date — verify
Roll, close, or assign decision | Yes — decision tree against your goal | When you need a live roll quote
Live chain data, IV figures, current Greeks | No | Always — these are broker tools
Black-Scholes calculations | No | Always — use broker analytics
Assignment probability | No | Always — that is a chain-derived number

The pattern: AI is a high-quality second opinion on every question that can be answered from reasoning over inputs you bring. It is a confident confabulator on every question that requires reaching out and pulling a live number from somewhere it does not have access to. The Prompt Stack is the discipline that holds the line — specifically the FILTER stage, which is where you are forced to either paste the real data in or admit the model cannot do this task.

Verdict


What works

Reasoning tasks against data you bring — thesis checks, intent articulation, decision trees, regime interpretation, post-trade review. The four “gets right” items are the four I actually run on every BMNR trade.

What doesn't

Anything that requires a live number the model cannot see — chain data, IV figures, Greeks, earnings dates, assignment probability. The model will produce a number anyway. The number will look right and will not be.

Confidence

High. The failure modes are structural, not accidental. Gemini reproduced the chain-fabrication pattern on a fresh prompt this morning with no warning required — the test passes every time I run it.


The single discipline that makes AI for options trading safe is one sentence: never ask it to produce a number it cannot see. Every useful prompt in the covered-call workflow either gives the model the data or explicitly tells it not to make the number up. The post that documents how to do that — seven prompts mapped to seven decision moments, each built around the data you have to bring — is 7 AI prompts for covered calls. Read it next.

// Written by Ben Dixon

Ben documents AI experiments against his own investment portfolio — real money, human analysis, sceptical use. About Ben →
