The Third Announcement Most People Missed at Google Cloud Next

I spent the week on the floor at Mandalay Bay. The Wednesday headline was two new TPUs (8t for training, 8i for inference) and the commentary has spent the rest of the week debating specs. None of it was the actual story.

The actual story is the third announcement most people missed. NVIDIA simultaneously disclosed Vera Rubin A5X instances on Google Cloud scaling to 80,000 Rubin GPUs single-site and up to 960,000 multisite. Both silicon fleets ride Google’s new Virgo networking. Both are sold through the Gemini Enterprise Agent Platform. Google is not trying to beat NVIDIA. Google is making Google Cloud the neutral venue where TPU and NVIDIA both monetize enterprise agents.

That is the post. Everything that follows serves it.

The industry framing has been compute-driven: more parameters, more FLOPs, bigger clusters. That framing is out of date. Software is now driving hardware. The workload that forced the hardware decisions on both sides is long-horizon enterprise agents running on any frontier model at production scale. The chip did not invent the workload. The workload invented the chip.

Worth saying out loud: it was a busy week beyond Cloud Next. OpenAI shipped GPT-5.5 mid-conference. DeepSeek shipped V4. The frontier model race kept compounding while Google was framing its silicon launch. None of these announcements happen in isolation. Every silicon decision Google made gets pressure-tested against the model that drops next week, and the pace right now is unforgiving.


Why I Went

I went to Cloud Next to gauge the energy on the floor and see which companies actually had commercial traction in the agentic era. Slide decks tell you what hyperscalers want to project. Booth traffic, customer conversations, and partner-pavilion energy tell you what is real. Four observations stood out before the silicon analysis even started.

The Anthropic booth had real energy. Consistent traffic across all three days, enterprise buyers and solution architects working through integration questions. Anthropic is the lighthouse customer for Google’s dual-fleet architecture, and the floor confirmed it. Anthropic quietly became one of the most important brands at the show.

The major SaaS providers were all there, and they were not retreating. ServiceNow, Salesforce, SAP, Workday, and the rest of the enterprise SaaS roster showed up as the orchestration and workflow layer for agents. The “AI eats software” bear case assumes agents replace SaaS. The floor said the opposite: SaaS providers are tools in the agentic world, not victims of it. This is the same thesis I laid out in The Orchestration Layer, and the Cloud Next floor was a strong real-time confirmation. Agents need workflow, identity, and orchestration plumbing. SaaS providers own that plumbing.

Enterprise AI deployment is earlier than the tape suggests. I spoke with IT managers from large healthcare and MRO (maintenance, repair, and overhaul) firms who have not yet started enterprise-wide AI projects. Not “haven’t gotten to production” — have not started. This cuts in two directions. The bull read: enterprise runway is enormous, hyperscaler capex is funding capacity that is years ahead of where adoption sits today, and we are still very early. The bear read: enterprise adoption is slower than the capex curve implies, and eventually that gap matters. I am in the bull camp on this, since Anthropic and OpenAI are filling the near-term capacity gap while the Fortune 500 catches up, but anyone modeling AI infrastructure should mark it.

The TPU-versus-NVIDIA business model is fundamentally different, and most market commentary misses it. NVIDIA sells you GPUs. NVIDIA gives you reference designs. NVIDIA partners with Supermicro, Dell, HPE, Foxconn, Quanta, and others to let you build your own racks and your own data centers. NVIDIA’s commercial logic is volume and ecosystem reach. Google does not sell you a TPU. Google does not give you a reference design. Google does not let you build a TPU rack in your own data center. Google’s commercial logic is rental: get you onto Google Cloud, then layer Vertex, BigQuery, Workspace, and the rest of the Google product stack on top. NVIDIA is a silicon company. Google is a cloud company that builds silicon as a tenant-attraction strategy. These are not the same business and they should not be modeled the same way. Salvator framed NVIDIA’s side of this directly when we spoke at the booth: NVIDIA has prioritized fungibility since the inception of deep learning, balancing AI training and inference performance with traditional ML, data analytics, visual computing, and scientific simulation. Custom silicon optimized narrowly for one workload sacrifices that breadth, which is part of why merchant silicon keeps winning with infrastructure managers who need to be ready for whatever workload comes next.

DeepMind was barely at Cloud Next, and that is informative. Gemini was everywhere — keynotes, demos, the agent platform, the Workspace integrations. DeepMind gave one talk. No booth, no meaningful floor presence. This is worth thinking about. Internally, Google appears to have separated the research lab (DeepMind) from the commercial product (Gemini). That structure is faster for execution, since the product team can ship without research-org gravity slowing it down, and it is the same separation Google ran historically between Google Research and Search. But it is also different from how OpenAI operates, where research and product sit in one building and ChatGPT iterates at lab speed. The implication is mixed. On the bull side, Google is treating Gemini as a real product franchise rather than a research demo, which is necessary for enterprise credibility. On the bear side, if the research-to-product handoff at Google ever weakens, the Anthropic dependency I described earlier becomes even more important, because Google needs another frontier-grade lab feeding workloads onto TPU regardless of what DeepMind ships next. This is one to watch.

The Anthropic fine print at Cloud Next. When Anthropic’s representative presented at the conference, the first slide carried fine print stating that Amazon Web Services is Anthropic’s primary cloud provider and primary training partner. Read that again. At Google Cloud Next, in front of Google’s enterprise customers, Anthropic disclaimed primary-provider status to Google in slide-one fine print. That language is presumably contractually required by the AWS agreement. It is also a real signal about the structural shape of the Google-Anthropic relationship. Anthropic is not a Google partner with AWS as a backup. Anthropic is an AWS partner that runs significantly on Google. The TPU commitment is enormous and real, but Google is the second cloud, not the first. This sharpens the Bear Two case I make later in the piece. Camp Two’s demand underwriter has a contractual primary somewhere else.


The Third Announcement

Two Google posts got the attention. A third disclosure, from NVIDIA, was the most commercially significant of the week. Ian Buck’s post titled “NVIDIA and Google Cloud Collaborate to Advance Agentic and Physical AI” laid out the dual-fleet architecture in detail.

Vera Rubin A5X on Google Cloud. Bare-metal Rubin NVL72 rack-scale systems, scaling to 80,000 Rubin GPUs single-site and up to 960,000 multisite. NVIDIA cites up to 10x lower inference cost per token and 10x higher token throughput per megawatt versus the prior generation. First-class NVIDIA deployment on a hyperscaler rival’s cloud, co-engineered between the two companies.
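To make the vendor-cited multipliers concrete, here is a back-of-envelope sketch of what "10x tokens per megawatt" does to the power component of inference cost. Every baseline number below (site power budget, prior-generation throughput, power price) is an illustrative assumption of mine, not a disclosed figure; only the 10x multiplier comes from NVIDIA's claim.

```python
# Back-of-envelope: the power component of inference cost under NVIDIA's
# cited 10x tokens-per-megawatt improvement. All baseline numbers are
# illustrative assumptions, not disclosed figures.

BASE_TOKENS_PER_SEC_PER_MW = 2e6  # assumed prior-gen throughput (hypothetical)
POWER_PRICE_PER_MWH = 60.0        # assumed wholesale power price, $/MWh

def power_cost_per_million_tokens(tokens_per_sec_per_mw: float) -> float:
    """Electricity cost per 1M tokens at the assumed power price."""
    tokens_per_mwh = tokens_per_sec_per_mw * 3600  # tokens per MW-hour
    return POWER_PRICE_PER_MWH / tokens_per_mwh * 1e6

base = power_cost_per_million_tokens(BASE_TOKENS_PER_SEC_PER_MW)
rubin = power_cost_per_million_tokens(BASE_TOKENS_PER_SEC_PER_MW * 10)

print(f"baseline: ${base:.5f} per 1M tokens (power only)")
print(f"rubin:    ${rubin:.5f} per 1M tokens (power only)")
```

The point of the sketch is that the two 10x claims are two views of the same ratio: at a fixed power budget, 10x throughput per megawatt is mechanically 10x lower power cost per token, whatever the absolute baseline turns out to be.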

Rubin runs on Virgo. A5X uses NVIDIA ConnectX-9 SuperNICs combined with Google’s Virgo networking. Virgo is no longer TPU-only infrastructure. Google is turning it into the shared scale-out fabric for both TPU and NVIDIA fleets.

Gemini on Blackwell. Gemini runs in preview on Google Distributed Cloud on NVIDIA Blackwell and Blackwell Ultra. For some sovereign and confidential deployments, Google is positioning Gemini on NVIDIA Blackwell rather than making Gemini TPU-exclusive at the workload level.

Nemotron on Gemini Enterprise Agent Platform. NVIDIA’s open reasoning and multimodal models distributed through Google’s agent platform with a managed RL API built on NVIDIA NeMo RL. CrowdStrike is already fine-tuning Nemotron for cybersecurity on Managed Training Clusters with Blackwell GPUs.

OpenAI on Google Cloud. OpenAI is running large-scale inference for ChatGPT on NVIDIA GB300 (A4X Max VMs) and GB200 NVL72 (A4X VMs) on Google Cloud. Not Azure. Google Cloud, on NVIDIA silicon. Thinking Machines is doing the same on GB300 NVL72 for its Tinker API.


The Axes Forming Behind The Camps

Step back from the silicon layer and a clearer competitive structure emerges. Two axes are forming at the model-plus-infrastructure level, and each axis is more structurally integrated than the individual companies inside it suggest.

Axis One: Microsoft plus OpenAI plus NVIDIA. Azure Cloud, ChatGPT, CUDA stack. The model lab and the silicon vendor have converged on the same cloud for the workload most people associate with frontier AI. Azure remains the center of gravity, but OpenAI’s Google Cloud footprint proves even the strongest axis is diversifying.

Axis Two: Google plus Anthropic plus TPU. Google Cloud, Claude, TPU stack. Google’s reported $30 billion Anthropic TPU commitment is the anchor. And Google has strong incentive to deepen that relationship (reported additional capital into Anthropic, preferential TPU pricing, engineering co-investment) because Anthropic is the demand underwriter that makes the TPU fleet economically viable.

Google’s investment in Anthropic and NVIDIA’s earlier investment in OpenAI are directionally similar trades, even if the dollar scale differs. Both are proxy hedges on platform dominance: each silicon vendor is helping fund the model lab that anchors workload demand on its preferred stack. As Jensen acknowledged on the last earnings call, and as I wrote in The Chip Is Dead, Long Live The Factory, the bulk of TPU growth came from one customer. That customer is Anthropic. Today’s Cloud Next disclosures only deepened that dependency.

Here is the logic. NVIDIA has effectively locked up the leading-edge silicon supply chain. Google cannot outbid that supply chain and does not try to. Instead, Google wants to fill as much of its cloud capacity with NVIDIA GPUs as NVIDIA will sell it, because that is the silicon many customers already want to rent. The TPU fleet then becomes the optimized stack for Gemini-native workloads and Anthropic-scale demand. Supply timing matters too: the TPU 8i ramp will run into next year, which is part of why Google leaned heavily on Rubin commitments now. The Vera Rubin order is doing double duty, capturing customers who want NVIDIA today and bridging Google’s own capacity until TPU 8i is at production volume.
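The bridging mechanism can be sketched as a toy capacity model. Every figure here (quarterly demand, the TPU 8i ramp schedule) is invented purely to illustrate the mechanism, not an estimate of anything Google disclosed.

```python
# Toy model of the capacity bridge: rented NVIDIA Rubin capacity covers
# whatever gap remains between demand and the TPU 8i ramp each quarter.
# All figures are invented illustrations, not estimates.

quarters = ["Q1", "Q2", "Q3", "Q4"]
demand   = [100, 120, 140, 160]   # assumed accelerator demand (arbitrary units)
tpu_8i   = [10, 30, 80, 160]      # assumed TPU 8i production ramp

# Rubin fills the shortfall; once TPU 8i catches demand, the bridge closes.
rubin_bridge = [max(d - t, 0) for d, t in zip(demand, tpu_8i)]

for q, d, t, r in zip(quarters, demand, tpu_8i, rubin_bridge):
    print(f"{q}: demand={d} tpu8i={t} rubin_bridge={r}")
```

The shape is what matters: the Rubin tranche is largest early, shrinks as 8i volume arrives, and never goes negative, which is exactly the "double duty" structure described above.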

If Anthropic’s TPU commitment scales, Camp Two survives as a real commercial counter-axis to Microsoft plus OpenAI. If Anthropic diversifies back toward NVIDIA or if Anthropic’s enterprise growth slows, the TPU fleet loses its demand underwriter and the economics get much harder. That is the single most important dependency in Google’s entire AI strategy, and it is the reason Google has every incentive to keep investing in Anthropic and keep pushing Claude workload onto TPU.

This is defense, not aggression. Google is not trying to beat NVIDIA at silicon. Google is trying to not-lose the frontier AI cloud battle to the Microsoft plus OpenAI axis, and the way it accomplishes that is by pairing Gemini with Claude on a shared TPU fleet while selling NVIDIA capacity alongside to capture the rest of the market. Two axes forming in parallel, each with its own model-plus-silicon pairing, each competing for enterprise agent workloads.

Below the paywall: the four conditions that gate Camp Two and why only Google clears them today, the bandwidth mechanism behind the 4x per-accelerator jump, the optical and memory supply-chain read-throughs, what the dual-fleet thesis means for NVDA and GOOGL, the three bear cases ranked by what I am actually watching, and the four signals from three days on the floor including the on-the-record Salvator conversation.


Originally published on BEP Research on Substack. Subscribe for more.


Discover more from Ben Pouladian
