OpenAI Codex Hits 85% of Output Tokens, Non-Developer Use Up 137x

For most of this year the “AI agents are eating work” claim was a sales line, not a measurement. The frontier labs were selling the future of it. Their own employees were still using ChatGPT like the rest of us, and nobody was publishing the percentage.

That changed yesterday. OpenAI published a long internal-data report on how its own workforce now uses Codex, and the numbers do something the marketing slides never could. They put a percentage on it.

The Lead: Codex Now Produces 85% of OpenAI’s Internal Output Tokens

The report, released Thursday, shows that Codex now produces more than 85% of the average OpenAI employee’s output tokens, and that non-developer use has grown 137-fold since August 2025.

The report breaks usage down by department, including Engineering, Legal, Finance, and Recruiting. By April 2026, three of those four had crossed the threshold where Codex became the team’s primary AI tool, not a secondary one. Codex now serves 5 million weekly users worldwide. Users at the 99th percentile run more than 60 hours of agentic-coding turns per day across parallel sessions, which works out to an average of at least 2.5 sessions running around the clock, since a day has only 24 hours.

By May, 80% of sampled users had run a Codex task estimated to exceed 30 minutes of human work; a quarter had run one estimated to exceed eight hours. The Research team’s median use is 56 times higher than in November 2025. Customer Support is up 32 times.

The TWO angle is the delegation rate, which the report quietly establishes as a new operational metric. When the company that ships the model publishes the proportion of its own work that the model now does, marketing copy becomes a benchmark. Every other software company will be asked, this quarter, what its internal delegation rate looks like (OpenAI).
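For readers who want to put a number on their own team, here is a minimal sketch of how a delegation rate could be computed. It assumes the rate is the agent's share of all AI output tokens attributed to one person; the report's exact definition is not reproduced here, and the TokenUsage schema and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TokenUsage:
    """Output tokens attributed to one employee over a period (hypothetical schema)."""
    agent_tokens: int  # output tokens produced by the agent
    other_tokens: int  # output tokens produced by every other AI tool


def delegation_rate(usage: TokenUsage) -> float:
    """Share of one employee's output tokens that came from the agent (assumed definition)."""
    total = usage.agent_tokens + usage.other_tokens
    if total == 0:
        return 0.0  # no AI output at all, so nothing was delegated
    return usage.agent_tokens / total


def average_delegation_rate(team: list[TokenUsage]) -> float:
    """Mean of per-employee rates, skipping people with no AI output."""
    active = [u for u in team if u.agent_tokens + u.other_tokens > 0]
    if not active:
        return 0.0
    return sum(delegation_rate(u) for u in active) / len(active)
```

Averaging per-employee rates is one reading of "the average employee"; pooling all tokens first would weight heavy users more and can give a different figure.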

What It Means for You

The same agent surface that just absorbed OpenAI’s internal work is moving into the apps you already open every morning.

The week’s consumer story is the Slack tab. Anthropic launched Claude Tag for Slack on Wednesday, a multiplayer mode where anyone in a channel can tag @Claude, hand it a task, and walk away. Anthropic reports that 65% of its own product team’s code is now written that way. Two days earlier, OpenAI rolled out write-access Slack actions for ChatGPT Enterprise, including joining channels, posting, uploading files, and updating profiles.

The same week, both frontier labs moved their agents from “read-only assistant” to “channel teammate.” Underneath the consumer apps you actually use, the voice layer is shifting too. References to OpenAI’s new GPT-Bidi-1 bidirectional voice model surfaced inside the ChatGPT mobile app on June 23, with users reporting that it listens and speaks at the same time, handles interruptions, and keeps context across multi-minute conversations. The mobile assistant stops being a turn-taking machine.

“The week’s news is not new models. It is new placement.”

If the agent surface lives in Slack and the voice surface lives in your phone’s earpiece, the editorial question is whether you still need a chat tab at all. For most non-technical readers, by August, the answer will be no.

What’s Moving Underneath

The week’s macro story is governance: who controls the agent, and who controls what the agent learned from.

Anthropic told the White House and senators on Wednesday that operators linked to Alibaba’s Qwen AI lab ran 28.8 million exchanges through Claude across roughly 25,000 fraudulent accounts between April and June. Anthropic calls it the largest distillation campaign yet against a US frontier model. It is the first time a major Chinese conglomerate has been publicly named in this kind of letter. The complaint targets software-engineering and agentic-reasoning skills specifically, the same surface OpenAI’s internal report measured.

“The fight this week is not about the model. It is about who is allowed to learn from it.”

The same shift shows up in the governance stack. Microsoft moved its Agent 365 Defender context mapping into public preview this month, giving security teams a relationship map of which agents run where, which Model Context Protocol servers they connect to, and which cloud resources their identities can reach. OpenAI’s announced acquisition of Ona, the German cloud-orchestration startup formerly known as Gitpod, plugs the same hole from the runtime side, keeping agents alive long enough to finish multi-hour tasks inside enterprise clouds.

None of these three reach you this week. All three are the scaffolding the agent surface will sit on next quarter, when your IT team starts asking who can see the channel @Claude just joined.

One Tool Worth Knowing

Codex (chatgpt.com/codex)

Codex is the AI agent inside ChatGPT that runs multi-step coding and operational tasks in the background, including ones that take hours of compute. Thursday’s report establishes that it is no longer a developer-only tool: at OpenAI, Recruiting, Legal, and Finance are now its heaviest non-engineering teams, with median use 13 to 32 times higher than seven months ago. The product’s distinct trait is parallelism, since one user can fan out five or ten agents at once and collect the results.
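The fan-out pattern described above can be sketched in a few lines. This is an illustration of the general idea only: run_agent_task is a hypothetical stand-in that sleeps briefly, not a real Codex API, and the task names are invented.

```python
import asyncio


async def run_agent_task(prompt: str) -> str:
    """Hypothetical stand-in for one background agent run; not a real Codex call."""
    await asyncio.sleep(0.01)  # simulates the agent working
    return f"done: {prompt}"


async def fan_out(prompts: list[str]) -> list[str]:
    """Start every task at once and collect results in the order they were submitted."""
    return await asyncio.gather(*(run_agent_task(p) for p in prompts))


if __name__ == "__main__":
    tasks = ["audit invoices", "draft offer letter", "summarize contract"]
    for line in asyncio.run(fan_out(tasks)):
        print(line)
```

Because the tasks run concurrently, total wall-clock time is roughly that of the slowest task, not the sum of all of them.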

The code-touching next step: if you write SQL, scripts, or any kind of structured query, run one real task through Codex this week. Skip the toy prompt and pick a real Tuesday-morning request from your inbox, then time how long it takes you to verify the output. The non-code-touching next step: ask your team’s heaviest ChatGPT user this week which of their recurring weekly tasks they have already moved to Codex, and ask them to walk you through one. That conversation is the leading indicator of your team’s delegation rate.
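If you want to make the timing exercise concrete, a small stopwatch is enough. The sketch below is one possible way to do it, using only the Python standard library; stopwatch and net_minutes_saved are hypothetical helper names, and the manual estimate is a number you supply yourself.

```python
import time
from contextlib import contextmanager


@contextmanager
def stopwatch(log: dict, label: str):
    """Add the elapsed seconds of the enclosed block to log[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        log[label] = log.get(label, 0.0) + time.perf_counter() - start


def net_minutes_saved(manual_estimate_min: float, verify_seconds: float) -> float:
    """Your own estimate of the manual effort minus the time spent checking the agent."""
    return manual_estimate_min - verify_seconds / 60.0
```

Wrap your review in `with stopwatch(log, "verify"):` and then call `net_minutes_saved(45, log["verify"])`, where 45 is your own guess at the manual minutes. A result near zero or below suggests the task was not worth delegating in its current form.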

Wisdom Speaks

“Six days shalt thou labour, and do all thy work.” Exodus 20:9, KJV

The Hebrew word behind “thy work” in that commandment is mlachah, the shaped, creative kind of labor that makes the Sabbath visible. Mlachah has edges. It has a stopping point. It is bound up with identity, which is why the commandment specifies it as the thing you stop. OpenAI’s data shows the visible mlachah of the modern desk worker, the typed output stream, is now mostly produced by an agent. The unfinished question, the one the report does not ask, is whether mlachah survives when the work that bore its shape is taken by something that does not tire.

“Action, the only activity that goes on directly between men without the intermediary of things or matter, corresponds to the human condition of plurality.” Hannah Arendt, The Human Condition, 1958

Arendt’s distinction is useful here. She separated labor (subsistence), work (durable things), and action (the relational, witnessing presence we offer each other). Agents can absorb labor and work. They cannot do action. If 85% of your output tokens go through Codex this year, the residue, the part of the workday that is yours, will be whatever action your role demands: deciding, witnessing, standing with. That is the Sabbath-bearing remainder of mlachah, and it is also what your team will hire for in 2027.

Yesterday’s digest: Google Builds Computer Use Into Gemini 3.5 Flash, on the browser becoming the model. Wednesday: OpenAI and Broadcom Unveil Jalapeño, on OpenAI’s first custom inference silicon. Tuesday: Gemini 2.5 Pro Deep Think Lands as Fable 5 Stays Behind the Ban, on the export-control split. Today’s adoption data closes a thread: this week we saw the chip, the model, and now the percentage of human work that runs on top of them.