OpenAI, Broadcom Unveil Jalapeño: First Custom Inference Chip

For two years the frontier labs rented their power. They bought Nvidia’s chips, ran them in someone else’s data center, and paid the toll on every token. Today OpenAI took a step toward owning the whole stack, and named the first piece after a pepper.

The move tells you where the money and the worry are going. The race is no longer only about who has the smartest model. It is about who controls the ground the model stands on.

The Lead: OpenAI and Broadcom Unveil Jalapeño, a Custom Inference Chip

OpenAI and Broadcom unveiled Jalapeño, OpenAI’s first in-house chip. It was built from scratch for running models rather than adapted from a graphics card, and it went from design to tape-out in nine months.

Jalapeño is an inference accelerator, which means it is tuned for the cheaper, repeated job of answering prompts rather than the expensive one-time job of training. Early lab testing shows performance per watt well above the current state of the art, and OpenAI used its own models to help design the silicon. The deeper signal is the strategy: a frontier lab is no longer content to rent compute; it wants to own the chip the compute runs on. Call the pattern the inference ASIC bet (OpenAI).

This is the first step in a multi-generation Broadcom and Celestica platform, with an initial gigawatt-scale deployment by the end of 2026 and Microsoft as a data-center partner. What should give Nvidia pause is not the chip itself but the nine months it took to go from a blank page to a working design.

The question nobody is asking out loud: if the lab that sells you the model also builds the chip, designs it with that model, and runs it in its own halls, who is left to check its math? Vertical integration is power and exposure in the same act.

What It Means for You

The agent stopped living in a chat window and moved into the apps where your work already happens.

The clearest version this week is inside Slack. Anthropic launched Claude Tag, an always-on teammate you summon by tagging @Claude in a channel. Once tagged, it picks up a multi-step task, works asynchronously with your connected tools, and reports back in-thread. It runs on Opus 4.8, and Anthropic says an internal version already writes roughly 65 percent of its Claude product team’s code. The same shift shows up in Microsoft’s world, where Copilot Cowork reached general availability worldwide as the agentic surface for long-running work, now billed against usage-based “Copilot Credits” on top of the license fee.

The pattern underneath both is that you are no longer asking an assistant a question, you are handing a coworker a job. Google made the same bet for consumers, expanding the Gemini overlay on Android so the “+” menu launches video generation, music, Canvas, and Guided Learning without opening the full app.

“An assistant answers you. A teammate acts for you. The bill follows the second one.”

The catch hides in that new credit meter. When the agent works for hours on its own, you stop paying per question and start paying per effort, and the cost of a stalled task is now a line item you can see.

What’s Moving Underneath

The macro story is independence: every major player is trying to stop depending on someone else’s chip, someone else’s standard, or someone else’s blessing. Qualcomm is nearing a roughly four-billion-dollar acquisition of Modular, a software startup whose tools let AI apps run across different chips, aimed squarely at loosening Nvidia’s CUDA grip. That is the same instinct behind Jalapeño, read from the software side rather than the silicon side.

The independence push is physical, too. Agility Robotics is in talks to go public through a roughly 2.5-billion-dollar SPAC merger, and its bipedal Digit robot has already moved more than 100,000 totes in a GXO warehouse under a multi-year contract. The company’s phrase, “deployment over demos,” is the whole thesis: the robots that ship beat the robots that impress.

Independence has a limit, and Washington is drawing it. The U.S. is pressing Meta to submit its most capable models for federal security review, leaving Meta the last major holdout after OpenAI, Anthropic, Google, Microsoft, and xAI signed on under a June 2 executive order. None of these reach your desk this week. All of them are the foundation that decides what reaches it next year, and who is allowed to inspect that foundation before it does.

One Tool Worth Knowing

Claude Tag

Claude Tag is worth evaluating because it changes where the work happens, not just what the model can do. It replaces the old “Claude in Slack” (there is a 30-day migration window), runs on Opus 4.8, and adds an “ambient” mode that follows up on stalled threads on its own. Treat the 65-percent-of-internal-code figure as a ceiling set by the engineering team that built the tool, not a floor you will hit in week one. The honest test is whether it finishes a real multi-step task without you babysitting the thread.

Start small and watch the meter, because an always-on teammate runs whether or not you are looking. The code-touching next step: in a sandbox workspace, tag it on a contained task with one or two connected tools and read what it actually did, not just its summary. The non-code next step: assign it one recurring administrative chore in a low-stakes channel, like drafting a weekly status roll-up, and judge it on whether it saved you the half hour it promised.

Wisdom Speaks

“Except the LORD build the house, they labour in vain that build it: except the LORD keep the city, the watchman waketh but in vain.” Psalm 127:1, KJV

OpenAI is building its own house down to the silicon, and the ambition is not the problem. The Psalm does not condemn building, it condemns the quiet assumption that the building is self-securing, that enough vertical integration removes the need for anything beyond the self. The Hebrew word for “in vain” is shav, and it names not failure but emptiness: labor that succeeds at its task and still rests on nothing. A chip designed in nine months can be a marvel and still be built on shav if the maker forgets who keeps the city.

“You have power over your mind, not outside events. Realize this, and you will find strength.” Marcus Aurelius, Meditations, Book VIII

The Stoic locates strength inside the self, in the one domain a person controls. It is the same instinct driving the week’s news: own the chip, own the standard, own the foundation, and you cannot be cut off. Marcus is right that grasping for control of outside events is exhausting, but the Psalm answers the deeper question he leaves open. The self that builds is not also the self that keeps. Strength is real; it is just not finally self-supplied.

What in your own work are you building as if it secures itself, and what would change if you named the part you cannot keep on your own?

Yesterday’s digest: “Gemini 2.5 Pro Deep Think Lands as Fable 5 Stays Behind the Ban,” on the frontier labs racing on capability. Monday: “John Jumper Leaves DeepMind for Anthropic,” on talent as the scarce resource. Today’s chip news moves that same race down a layer: the labs are no longer only competing for smarter models and sharper minds; they are competing to own the ground both stand on.