June 2026

Tokenomics: Why the Cost of Legal AI Is Quietly Rising.

For two years the question was “can it do the work?” In 2026 a second question has caught up: what does the work cost to run?

Every time a lawyer asks an AI tool to review a contract, summarise a matter or run a research task, the system consumes “tokens” — the units of text a large language model reads and writes. Tokens are metered, and they are priced. As legal work shifts from simple prompts to multi-step “agentic” workflows, the number of tokens consumed per task is climbing steeply — and the price per token, on the newest models, is climbing with it. The combination is what people in the industry have started calling tokenomics: the economics of running AI on legal work.

Author: Joel Seignior | Published: 5 June 2026 | Read: 8 min

Two rising fronts.

01 — the cost drivers

1,000×

More tokens consumed by agentic tasks than by a single chat exchange — measured on agentic coding tasks, but indicative of the wider shift (controlled study, arXiv 2604.22750).

30×

Variance in token consumption for the same task, run to run — driven by varying numbers of steps, tool calls and retries each time.

Two forces are pushing the bill up at once. The first is consumption: agentic workflows, in which the model plans, drafts, checks and revises across multiple steps, consume orders of magnitude more tokens than a single chat exchange. In one illustrative example from the legal-tech site law.co, a straightforward contract review that took 2,000 tokens as a single prompt ballooned past 200,000 tokens, a hundredfold increase, when run as an agentic pipeline. The second force is price: model providers have been raising the cost of their most capable models. Because the bill is the product of the two, the increases compound rather than add.
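A few lines of arithmetic make the combined effect concrete. The sketch below uses the law.co token counts quoted above; the two per-million-token prices are hypothetical placeholders chosen for illustration, not any provider's actual rates.

```python
# Illustrative arithmetic only. Token counts come from the law.co example;
# both prices are hypothetical placeholders, not real provider rates.

def task_cost(tokens: int, price_per_million: float) -> float:
    """Cost of one task from its token count and a blended price per million tokens."""
    return tokens / 1_000_000 * price_per_million

SINGLE_PROMPT_TOKENS = 2_000   # contract review run as one prompt
AGENTIC_TOKENS = 200_000       # same review as an agentic pipeline (a lower bound)

OLD_PRICE = 10.0               # hypothetical dollars per million tokens
NEW_PRICE = 15.0               # hypothetical price on a newer, dearer model

baseline = task_cost(SINGLE_PROMPT_TOKENS, OLD_PRICE)
agentic_on_new_model = task_cost(AGENTIC_TOKENS, NEW_PRICE)

consumption_multiple = AGENTIC_TOKENS / SINGLE_PROMPT_TOKENS
price_multiple = NEW_PRICE / OLD_PRICE
combined_multiple = agentic_on_new_model / baseline

print(f"consumption x{consumption_multiple:.0f}, "
      f"price x{price_multiple:.1f}, "
      f"combined x{combined_multiple:.0f}")
```

Under these assumed prices the consumption multiple does most of the damage, but the price rise is applied to every one of the extra tokens, so the combined multiple is the product of the two rather than their sum.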

The practical consequence is that a firm’s AI bill in 2026 looks very different from its AI bill in 2024 — not because it has more licences, but because it is running more capable tasks on more capable models, and each of those tasks is burning more tokens than anyone modelled when the contracts were signed.

Per-seat pricing under pressure.

02 — the model is breaking

The traditional legal software pricing model — a flat fee per user per year — was designed for tools where usage is roughly uniform. A lawyer either has access to a research database or they don’t; whether they use it heavily or lightly doesn’t affect the vendor’s costs much. Tokens are different. A lawyer running agentic workflows all day costs the vendor twenty times more to serve than a colleague who logs in once a week to check a definition.

Vendors are adapting. Crosby, an AI-native law firm backed by Sequoia and Cooley, has dropped both the per-seat licence and the billable hour, charging per contract instead — pricing the work done, not the seats filled. The direction of travel is clear: consumption-based pricing is coming to legal AI, and the firms that haven’t modelled what that means for their cost base are in for a surprise.

“Per seat pricing is gone. The future is usage-based pricing tied to value delivered.”
Shawn Curran, Chief Operating Officer, Linklaters — on the trajectory of legal AI pricing

The shift has a second-order effect on procurement. Legal teams negotiated their current AI contracts when usage was light and tasks were simple. Renewal conversations are now happening against a very different usage profile, and the leverage has shifted. A firm that hasn’t tracked token consumption across its tools doesn’t know what it’s actually spending — or what it should be paying.
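Tracking need not be elaborate to be useful. The sketch below assumes a firm can export usage records carrying a tool name and input and output token counts; the record layout, the tool names and the prices are all hypothetical, and real vendor exports will differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    tool: str           # hypothetical tool label
    input_tokens: int   # tokens sent to the model
    output_tokens: int  # tokens the model wrote back


# Hypothetical prices in dollars per million tokens, keyed by tool.
PRICES = {
    "research-assistant": {"input": 3.0, "output": 15.0},
    "contract-review": {"input": 5.0, "output": 25.0},
}


def spend_by_tool(records):
    """Sum estimated dollar spend per tool from raw token counts."""
    totals = defaultdict(float)
    for record in records:
        price = PRICES[record.tool]
        dollars = (
            record.input_tokens * price["input"]
            + record.output_tokens * price["output"]
        ) / 1_000_000
        totals[record.tool] += dollars
    return dict(totals)


# Made-up records standing in for a month's export.
records = [
    UsageRecord("research-assistant", 40_000, 2_000),
    UsageRecord("contract-review", 180_000, 20_000),
    UsageRecord("contract-review", 220_000, 30_000),
]

for tool, dollars in sorted(spend_by_tool(records).items()):
    print(f"{tool}: ${dollars:.2f}")
```

Even a table this crude gives a renewal conversation something to anchor on: which tools drive spend, and how that spend splits between what the model reads and what it writes.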

Three routes to lower cost.

03 — what the industry is doing

The vendors building on top of foundation models are not sitting still. Three cost-reduction strategies are emerging, each with different implications for buyers.

01

Route to the cheapest adequate model.

Harvey’s CEO Winston Weinberg has pointed to two levers: routing each task to the cheapest model that can handle it (citing Factory’s task-routing work), and using cheaper, batched ‘verifiers’ to check an agent’s output (citing LangChain, which reports that verifier costs can fall by an order of magnitude). A contract summary doesn’t need a frontier model; a legal argument might. Routing well can cut token spend materially without sacrificing output quality on high-stakes tasks.

02

Fine-tune or self-host on open-weight models.

For high-volume, well-defined tasks (clause extraction, standard-form review, document classification), a fine-tuned smaller model can approach the accuracy of a large proprietary one at a fraction of the token cost. A 2025 ContractEval benchmark found the best open model trailed the leading proprietary model by a clear margin on clause-level accuracy, but the gap is closing and the cost differential remains large. For firms with the technical capacity, this is a serious option.

03

Compress context aggressively.

A substantial fraction of token spend in agentic pipelines is redundant context — system prompts, document chunks and conversation history re-sent with every step. Techniques like prompt caching, retrieval-augmented generation and sliding-window context can cut this materially. Vendors implementing these well can reduce per-task token spend without changing the model or the output.
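To see why re-sent context adds up, consider a simplified pipeline in which every step re-sends the system prompt, the document and all earlier step outputs. The sketch below compares that with a sliding window that keeps only the most recent outputs. All sizes are hypothetical, and real pipelines differ in detail; prompt caching, for instance, typically discounts repeated input rather than removing it.

```python
def input_tokens_full_history(steps, system, document, per_step_output):
    """Input tokens when each step re-sends everything produced so far."""
    total = 0
    for step in range(steps):
        history = step * per_step_output  # all earlier outputs
        total += system + document + history
    return total


def input_tokens_sliding_window(steps, system, document, per_step_output, window):
    """Input tokens when only the last `window` step outputs are re-sent."""
    total = 0
    for step in range(steps):
        history = min(step, window) * per_step_output  # capped history
        total += system + document + history
    return total


# Hypothetical sizes for a 20-step agentic review.
STEPS, SYSTEM, DOCUMENT, PER_STEP, WINDOW = 20, 1_000, 8_000, 1_500, 3

full = input_tokens_full_history(STEPS, SYSTEM, DOCUMENT, PER_STEP)
windowed = input_tokens_sliding_window(STEPS, SYSTEM, DOCUMENT, PER_STEP, WINDOW)

print(f"full history:   {full:,} input tokens")
print(f"sliding window: {windowed:,} input tokens")
print(f"reduction:      {1 - windowed / full:.0%}")
```

With full history the re-sent portion grows with every step, so total input grows roughly with the square of the step count; the window caps that growth. Unlike caching, a window does change what the model sees, so any saving would need to be checked against output quality.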

The lesson from the rate-card era is that the firms that modelled the cost curve early had better conversations at renewal. The same lesson applies now — except the curve is steeper.
from section 04 — on Australian pricing dynamics

What Australian firms actually pay.

04 — the rate card

Precise pricing data for the Australian legal AI market is hard to come by — vendors quote on application, contracts are confidential, and list prices are rarely the prices anyone pays. The following table reflects what is publicly known or widely reported as of mid-2026. It is directionally accurate, not a procurement guide.

Vendor	Indicative A$ range	Seat basis	Notes
Harvey	A$8,000–20,000	Per seat / yr	Enterprise; consumption uplift on agentic use
Legora	A$4,600–7,700	10 seats	Mid-market; hard seat floor
CoCounsel	A$350–1,000+	None stated	Most transparent; base rises past A$1,000 at enterprise scale

Substantial discounts off opening prices are widely reported, especially for multi-year commitments and larger seat counts. No vendor publishes Australian pricing; figures are converted from reported US rates and should be confirmed with vendors before any procurement decision.

Most firms are still learner drivers.

05 — the curve that is coming

Thomson Reuters’ 2025 survey found a majority of active law-firm users now use generative AI at least weekly, and roughly a third daily. But frequency is not intensity: the lawyers running million-token agentic workloads — the ones who actually move the bill — remain a small minority. (Thomson Reuters’ 2026 follow-up already puts weekly use among current users above 80%, so even the frequency baseline is climbing.)

That will change. The useful analogy is a learner driver: a little fuel, on local streets, at low cost. The same driver, years on, takes the car cross-country at speed. Most firms are at the learner stage. As more lawyers become confident, agentic users, consumption — and cost — will rise several-fold on top of any price increases.
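A rough projection shows how adoption and price compound. The growth rates below are hypothetical inputs chosen for illustration, not forecasts; what matters is the shape of the calculation, in which spend scales with the number of agentic users, the tokens each one consumes and the price per token.

```python
def project_spend(base_spend, years, user_growth, tokens_per_user_growth, price_growth):
    """Return yearly spend, starting with the base year, under constant growth rates."""
    spend = base_spend
    path = [spend]
    for _ in range(years):
        # The three drivers multiply: users x tokens per user x price per token.
        spend *= (1 + user_growth) * (1 + tokens_per_user_growth) * (1 + price_growth)
        path.append(spend)
    return path


# Hypothetical inputs: A$100,000 today, 25% more agentic users a year,
# 30% more tokens per user a year, 10% dearer tokens a year.
path = project_spend(100_000, years=3, user_growth=0.25,
                     tokens_per_user_growth=0.30, price_growth=0.10)

for year, spend in enumerate(path):
    print(f"year {year}: A${spend:,.0f}")
print(f"multiple after 3 years: x{path[-1] / path[0]:.1f}")
```

None of the three assumed rates looks dramatic on its own, yet together they take the bill to several times its starting level within three years, which is the several-fold rise the learner-driver analogy points to.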

The takeaway is not alarm. Large firms can comfortably absorb these costs; the point is that AI spend is additive to a technology stack — document management, research subscriptions, the Microsoft suite — that firms cannot drop. It swells the total budget beyond where it sits today. The firms that fare best will be the ones that model that curve now, rather than discovering it later.

Questions worth asking before the next renewal.

01
Do we know our projected AI cost in three years, not just today’s bill?
02
If we are committed to a single provider, what would routing across models save us?
03
How are our vendors handling rising token costs — absorbing them, or about to pass them on?
04
Do we pass any of that cost to clients, and if so, how visibly?
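The second question above can be roughed out with a simple router: send each task to the cheapest model whose capability tier meets the task's requirement, then compare the result with sending everything to the top model. The model names, tiers, prices and task mix are all hypothetical, and deciding what counts as adequate for a given task is the hard part, which this sketch assumes away.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    tier: int                 # higher means more capable (hypothetical scale)
    price_per_million: float  # hypothetical blended dollars per million tokens


@dataclass(frozen=True)
class Task:
    kind: str
    required_tier: int        # minimum tier judged adequate for this task
    tokens: int


MODELS = [
    Model("small", 1, 1.0),
    Model("mid", 2, 5.0),
    Model("frontier", 3, 25.0),
]


def route(task, models=MODELS):
    """Pick the cheapest model whose tier meets the task's requirement."""
    adequate = [m for m in models if m.tier >= task.required_tier]
    if not adequate:
        raise ValueError(f"no model is adequate for {task.kind!r}")
    return min(adequate, key=lambda m: m.price_per_million)


def cost(task, model):
    """Dollar cost of running one task on one model."""
    return task.tokens / 1_000_000 * model.price_per_million


# Hypothetical task mix.
tasks = [
    Task("contract summary", 1, 50_000),
    Task("clause extraction", 1, 120_000),
    Task("standard-form review", 2, 200_000),
    Task("legal argument", 3, 300_000),
]

frontier = max(MODELS, key=lambda m: m.tier)
single_model_cost = sum(cost(t, frontier) for t in tasks)
routed_cost = sum(cost(t, route(t)) for t in tasks)

print(f"single model: ${single_model_cost:.2f}")
print(f"routed:       ${routed_cost:.2f}")
print(f"saving:       {1 - routed_cost / single_model_cost:.0%}")
```

The saving depends almost entirely on the task mix: the more of a firm's volume sits in work a cheaper model can handle, the more a single-provider commitment leaves on the table.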

The close

Tokenomics is moving from a back-office curiosity to a board-level line item. It is worth getting ahead of.

This article draws on reporting by Artificial Lawyer, pricing data from Sacra, Metronome, Finout, CloudZero and OpenRouter, vendor pricing from Together AI and Fireworks, the ContractEval benchmark (arXiv 2508.03080), and Thomson Reuters’ 2025 Generative AI in Professional Services report. Figures current as at June 2026; pricing in this market changes quickly.