tknfrg
May 8, 2026

The anti-tokenmaxxing thesis

"Tokenmaxxing" entered HR vocabulary in March 2026. It's the practice of burning AI tokens to look productive without producing more. Left unmeasured, it converts a compensation philosophy into a budget hole.

The instinct of every engineer given a token allowance is the same instinct given a free lunch: use it. The instinct of every manager watching a high-token engineer is to assume velocity. The instinct of every CFO seeing a $1.2M Anthropic invoice is to ask whether anything shipped.

The problem is that consumption and output are different signals, and the platforms tracking the first don’t see the second.

The attribution loop

A token-allowance program without outcome attribution is theatre. It produces a ranking by spend, which is exactly the ranking that badly incentivized engineers climb fastest. Inside six months a tokenmaxxing culture eats the program.

The fix is structural: join the consumption ledger to an outcomes table at ingest. PRs merged. Tickets closed. Incidents resolved. Each outcome carries a value_score from 1 to 10 (a small fix at the low end, a platform launch at the high end). The ratio of weighted outcomes to dollars consumed is the score.
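A minimal sketch of that ratio. The rows, names (ledger, outcomes), and the per-$1k normalization are illustrative assumptions, not the actual schema; in practice both tables live in a warehouse and the join runs at ingest.

```python
from collections import defaultdict

# Hypothetical rows standing in for the real warehouse tables.
ledger = [  # (user_id, dollars_consumed)
    ("ava", 1800.0), ("ava", 1400.0),
    ("ben", 3100.0),
    ("cho", 240.0),
]
outcomes = [  # (user_id, value_score from 1 to 10)
    ("ava", 8), ("ava", 3),
    ("ben", 2),
    ("cho", 7), ("cho", 6),
]

def outcome_scores(ledger, outcomes):
    """Weighted outcomes per dollar consumed, per user."""
    spend = defaultdict(float)
    for user, dollars in ledger:
        spend[user] += dollars
    value = defaultdict(int)
    for user, score in outcomes:
        value[user] += score
    # Normalize to outcomes-per-$1k so the number stays readable.
    return {u: round(value[u] / (spend[u] / 1000), 2)
            for u in spend if spend[u] > 0}

print(outcome_scores(ledger, outcomes))
```

In this toy data, ben is the high-spend/low-outcome quadrant and cho the low-spend/high-outcome one; the score separates them where a spend ranking alone would not.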

A high-spend engineer with a high outcome score is a legitimate power user, the kind every platform wants. A high-spend engineer with a low outcome score is exactly what the CFO worried about. A low-spend engineer with a high outcome score is the surprising one: a mentor, a careful operator, sometimes someone who doesn't need much help from the model.

The political problem

Outcome attribution is technically simple and politically hard. Engineers don’t love being scored. Managers don’t love having a fourth number to defend. People Ops doesn’t love having a new flag-board.

But the alternative is worse: AI token programs that get cut in the first downturn because nobody could justify them. The pillar collapses.

Where attribution lives

Attribution lives in outcomes, a table joined to the ledger by user_id and time. Sources: GitHub PRs, Linear issues, Jira tickets, PagerDuty incidents. Ingest is idempotent on external_ref, so re-ingesting GitHub history doesn't double-count.
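One way to get that idempotency is an upsert keyed on external_ref. A minimal sketch in SQLite; the column names and the github:pr ref format are assumptions, not the actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE outcomes (
        external_ref TEXT PRIMARY KEY,  -- e.g. a PR or ticket identifier
        user_id      TEXT NOT NULL,
        occurred_at  TEXT NOT NULL,
        value_score  INTEGER NOT NULL CHECK (value_score BETWEEN 1 AND 10)
    )
""")

def ingest(rows):
    # UPSERT keyed on external_ref: re-ingesting the same history
    # overwrites the existing row instead of inserting a duplicate.
    conn.executemany("""
        INSERT INTO outcomes (external_ref, user_id, occurred_at, value_score)
        VALUES (?, ?, ?, ?)
        ON CONFLICT(external_ref) DO UPDATE SET value_score = excluded.value_score
    """, rows)

batch = [("github:pr/4812", "ava", "2026-05-01", 8)]
ingest(batch)
ingest(batch)  # second pass over the same history is a no-op on counts
count = conn.execute("SELECT COUNT(*) FROM outcomes").fetchone()[0]
print(count)  # 1, not 2
```

The DO UPDATE (rather than DO NOTHING) also lets a re-ingest pick up a corrected value_score without inflating the row count.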

The score isn’t shown to the engineer directly. It’s a People Ops view. The version shown to the engineer is a peer percentile — anonymous, role-and-level normalized. “You spent at p47” is information; “you scored 0.34” is judgment.
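The peer-percentile view can be sketched as a rank within the engineer's role-and-level cohort. The cohort figures below are made up for illustration:

```python
from bisect import bisect_left

def spend_percentile(user_spend, cohort_spends):
    """Percentile (0-100) of user_spend within a role-and-level cohort."""
    ranked = sorted(cohort_spends)
    below = bisect_left(ranked, user_spend)  # peers who spent strictly less
    return round(100 * below / len(ranked))

# Hypothetical cohort: monthly token spend of same-role, same-level peers.
cohort = [120, 310, 450, 900, 1400, 2200, 3100, 5000]
print(f"You spent at p{spend_percentile(880, cohort)}")
```

Because the engineer only sees their own position against an anonymous distribution, no value_score or per-peer spend ever leaves the People Ops view.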

That distinction is what makes the thing work without turning into a surveillance product.


Read next: The fourth pillar