OpenAI · 2026-06-24 · major

OpenAI Jalapeño — first custom inference chip, built with Broadcom

OpenAI Jalapeño is OpenAI's first custom inference chip, co-designed with Broadcom and built by Celestica. The ASIC targets LLM inference at substantially better performance per watt; engineering samples already run GPT-5.3-Codex-Spark.

OpenAI and Broadcom Jalapeño inference processor announcement banner — DatacenterDynamics

OpenAI's first piece of custom silicon — an inference ASIC co-designed with Broadcom for cheaper, lower-power model serving.

Quick facts

Maker	OpenAI + Broadcom (co-designed)
Manufacturer	Celestica
Type	Inference ASIC, LLM-optimized
Status	Engineering samples at target frequency and power
Deployment target	End of 2026, expanding in following years
Validated workload	GPT-5.3-Codex-Spark
Networking	Broadcom Tomahawk silicon

What is it?

OpenAI Jalapeño is a custom inference ASIC, designed from scratch for serving large language models rather than training them. OpenAI co-designed the chip with Broadcom and Celestica manufactures it; engineering samples are already running GPT-5.3-Codex-Spark workloads at target frequency and power.

How does it work?

The Jalapeño architecture cuts data movement and rebalances compute, memory and networking around the LLM inference loop. It is paired with Broadcom's Tomahawk networking silicon, which provides the large-scale fabric. OpenAI used its own models to accelerate the nine-month design cycle. A detailed performance report is promised in the coming months.
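Why data movement is the thing to cut can be shown with a rough back-of-envelope model. The sketch below is not from OpenAI or Broadcom: it uses the common approximation that one decode step performs about two floating-point operations per weight per sequence and reads every weight once, and it ignores the KV cache and attention. The function name, the 70-billion-parameter model size and the 2-byte weights are hypothetical values chosen only for illustration.

```python
def decode_arithmetic_intensity(params: float, bytes_per_param: float, batch_size: int) -> float:
    """Approximate FLOPs per byte of weight traffic for one decode step.

    Assumptions (illustrative only, not Jalapeño specifications):
    - each weight contributes one multiply-add (2 FLOPs) per sequence in the batch
    - every weight is read from memory once per step
    - KV-cache reads and attention FLOPs are ignored
    """
    if params <= 0 or bytes_per_param <= 0 or batch_size <= 0:
        raise ValueError("all inputs must be positive")
    flops = 2.0 * params * batch_size
    bytes_moved = params * bytes_per_param
    return flops / bytes_moved


if __name__ == "__main__":
    # Hypothetical model: 70e9 parameters stored in 2 bytes each.
    for batch in (1, 8, 64):
        intensity = decode_arithmetic_intensity(70e9, 2, batch)
        print(f"batch={batch:>2}: about {intensity:.0f} FLOPs per byte of weights read")
```

Under these assumptions the ratio is simply `2 * batch_size / bytes_per_param`, so small batches do very little arithmetic for each byte fetched. That is the general reason an inference-only design would spend its budget on memory and interconnect rather than raw compute; how Jalapeño actually makes that trade has not been published.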

Why does it matter?

Jalapeño is OpenAI's first move into its own silicon stack, aimed at cutting the cost of running ChatGPT, Codex and API traffic. Broadcom CEO Hock Tan says the chip shows roughly 50% cost savings versus typical AI GPUs on inference, and OpenAI frames it as the start of an 'Apple of AI' full-stack strategy spanning chip, networking and product.
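To make the "roughly 50%" figure concrete, here is a minimal arithmetic sketch. Only the 50% savings fraction comes from the statement attributed to Hock Tan; the baseline cost of 10.0 per million tokens is a made-up placeholder, and both function names are hypothetical.

```python
def cost_after_savings(baseline_cost: float, savings_fraction: float) -> float:
    """Cost of the same workload after applying a fractional saving."""
    if not 0.0 <= savings_fraction < 1.0:
        raise ValueError("savings_fraction must be in [0, 1)")
    return baseline_cost * (1.0 - savings_fraction)


def capacity_multiplier(savings_fraction: float) -> float:
    """How much more traffic a fixed budget can serve after the saving."""
    if not 0.0 <= savings_fraction < 1.0:
        raise ValueError("savings_fraction must be in [0, 1)")
    return 1.0 / (1.0 - savings_fraction)


if __name__ == "__main__":
    baseline = 10.0   # hypothetical cost per million tokens on GPUs (placeholder, not a real price)
    savings = 0.5     # "roughly 50%" as quoted in the article
    print("cost per million tokens:", cost_after_savings(baseline, savings))
    print("traffic served on the same budget:", capacity_multiplier(savings), "x")
```

Read the other way round, a 50% cost saving means the same spend serves twice the inference traffic, which is why the figure matters for ChatGPT, Codex and API economics. Whether any of that saving reaches API prices is not stated in the announcement.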

Who is it for?

OpenAI infrastructure teams, and any developer whose costs depend on OpenAI's inference economics.

Frequently asked questions

When will the Jalapeño chip start serving traffic?

OpenAI and Broadcom are targeting initial deployment of Jalapeño chips by the end of 2026, then expanding in the years after. Engineering samples are already running at production target frequency and power, including GPT-5.3-Codex-Spark workloads, but a full performance report is promised in the coming months rather than at launch.

How does Jalapeño differ from Nvidia GPUs?

Jalapeño is an ASIC built only for LLM inference, so it is less flexible than an Nvidia GPU but cheaper to make and easier to tune for the specific job of serving models. Broadcom CEO Hock Tan says the accelerator is showing cost savings of roughly 50% compared with typical AI graphics processors used for the same inference workloads.

Did OpenAI design the chip on its own?

OpenAI designed the Jalapeño inference ASIC and Broadcom handled the implementation, integration and Tomahawk networking; Celestica is the electronics manufacturer. OpenAI president Greg Brockman says the company went from initial design to manufacturing in nine months, with OpenAI's own models helping to accelerate the design and optimization work.

What workloads is Jalapeño built for?

Jalapeño targets inference (running already-trained OpenAI models for ChatGPT, Codex and API users) rather than training. OpenAI describes it as the first of multiple planned AI accelerators in a strategy to control chip architecture, kernels, memory, networking, scheduling and deployment end to end, framed as becoming the 'Apple of AI'.

Try it

Watch openai.com/index for the detailed performance report, due in the coming months.