NVIDIA GB300 GPU Liquid Cooling Revolution


With the introduction of the GB200 chip in 2024, NVIDIA pushed the TDP (Thermal Design Power) of its GPUs to new heights: a single B200 comes in at 1,200 W (liquid-cooled), and a GB200 superchip, consisting of two B200s plus a Grace CPU, has a total TDP of 2,700 W.

Under the NVL72/36 architecture, each compute tray carries two GB200 superchips, which means a 1-2U server must dissipate a total TDP comparable to an entire previous-generation H100 HGX system. Meeting this demand requires the more efficient liquid cooling technology, which not only handles the heat load but also improves the overall energy efficiency of the data center.
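To make the thermal budget concrete, here is a back-of-envelope sketch of the rack-level heat load implied by the figures above; the per-tray and per-rack totals are derived from those numbers, not official NVIDIA specifications:

```python
# Back-of-envelope rack thermal budget for an NVL72-style rack.
# Assumptions (from the figures quoted above, not an official NVIDIA BoM):
#   - each GB200 superchip (1 Grace CPU + 2 B200 GPUs) dissipates ~2,700 W
#   - each compute tray carries two GB200 superchips
#   - an NVL72 rack holds 18 compute trays

GB200_SUPERCHIP_TDP_W = 2_700   # 1 Grace + 2 B200 (liquid-cooled)
SUPERCHIPS_PER_TRAY = 2
TRAYS_PER_RACK = 18

tray_tdp_w = GB200_SUPERCHIP_TDP_W * SUPERCHIPS_PER_TRAY
rack_compute_tdp_kw = tray_tdp_w * TRAYS_PER_RACK / 1_000

print(f"Per-tray TDP:  {tray_tdp_w:,} W")   # 5,400 W in a 1-2U tray
print(f"Rack compute:  {rack_compute_tdp_kw:.1f} kW (chips only, "
      "before switches, NICs, fans and power-conversion losses)")
```

The chips alone come to roughly 97 kW per rack, which is consistent with the ~140 kW single-cabinet figure cited later once switches, networking and power-conversion overheads are included.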

[Figure: Liquid cooling for the GB200/GB300]

Source: Immersion Cooling Technology: Current and Future Developments in Energy Efficiency

Whether an L2A (liquid-to-air) or L2L (liquid-to-liquid) solution is used, four major components remove waste heat from the chip surface: the cold plate, the CDU (coolant distribution unit), the manifold, and the UQD (Universal Quick Disconnect). In L2A, the heated coolant then passes through a rear-door fan-and-heat-exchanger assembly; in L2L, it passes through an outdoor chiller to cool down before being recirculated into the system.
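For intuition about what that loop must carry, the sketch below sizes the coolant flow for a single tray from the basic relation Q = ṁ·c_p·ΔT; the coolant properties and temperature rise are illustrative assumptions, not NVIDIA figures:

```python
# Minimal sketch: coolant flow required for a given heat load, using
# Q = m_dot * c_p * dT. All values are illustrative assumptions.

Q_W = 5_400            # heat load of one compute tray (W), per the figures above
CP_J_PER_KG_K = 3_900  # approx. specific heat of a ~25% glycol/water mix
DT_K = 10.0            # assumed coolant temperature rise across the cold plates
RHO_KG_PER_L = 1.02    # approximate coolant density (kg/L)

m_dot = Q_W / (CP_J_PER_KG_K * DT_K)   # mass flow in kg/s
flow_lpm = m_dot / RHO_KG_PER_L * 60   # volume flow in litres per minute

print(f"Mass flow:   {m_dot:.3f} kg/s")
print(f"Volume flow: {flow_lpm:.1f} L/min per tray at a {DT_K:.0f} K rise")
```

Roughly 8 L/min per tray under these assumptions; a tighter allowed temperature rise or a higher heat load pushes the required flow, and hence the pumping demands on the CDU, proportionally higher.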

In the design of the GB200, liquid cooling demonstrates NVIDIA’s deep understanding of high-density compute cooling. Taking a compute tray as an example, the cold plates adopt a “one in, one out” design, with each large cold plate connected to the liquid cooling loop through a pair of quick connectors. Multiple cold plate circuits are merged into a single circuit via a manifold, which is ultimately connected to the chassis. A compute tray therefore contains four pairs of quick-disconnect couplings on the cold plate side, plus two pairs connecting to the manifold, for a total of six pairs. In an NVL72 system, 18 compute trays require 108 pairs, and the 9 switch trays (two pairs each) bring the total for the entire system to 126 pairs; the tally is sketched below.
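The counting above can be summarized in a few lines, following the per-tray breakdown in the paragraph (a "pair" here means one male plus one female coupling):

```python
# Quick-connector tally for a GB200 NVL72 rack, per the counting above.

PAIRS_PER_COMPUTE_TRAY = 4 + 2   # 4 pairs on the cold-plate side + 2 to the manifold
COMPUTE_TRAYS = 18
PAIRS_PER_SWITCH_TRAY = 2
SWITCH_TRAYS = 9

compute_pairs = PAIRS_PER_COMPUTE_TRAY * COMPUTE_TRAYS   # 108 pairs
switch_pairs = PAIRS_PER_SWITCH_TRAY * SWITCH_TRAYS      # 18 pairs
print(f"Total UQD pairs per rack: {compute_pairs + switch_pairs}")  # 126
```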

[Figure: The four major components of the liquid cooling loop]

Source: NVIDIA, Nidec, Danfoss

The GB200 uses quick-connect fittings between the cold plate and the manifold, with one fitting at each end of each tube (the female end on the cold plate side, the male end on the manifold side). Notably, the female connectors on the cold plate are recessed inside grommets and not easily visible from the outside, while the male ends on the manifold protrude. Teardown photos often misread this detail, but in reality quick connectors are present throughout, ensuring the flexibility and maintainability of the liquid cooling system.

[Figure: Cooling module]

Source: NVIDIA, Nidec, Danfoss

According to industry research, the GB200 NVL72/36 cooling bill of materials breaks down as follows:

Compute tray: a liquid cooling plate on each CPU and GPU, plus 6-10 cooling fans at the rear of the chassis.

Switch tray: a liquid cooling plate on each of the two NVLink Switch ASICs, plus 6 sets of cooling fans.

Rack level: a pair of coolant manifolds, and a CDU with a liquid-cooled rear door.

UQDs are used between the manifold and the cold plates throughout. The BoM is shown below:

[Figure: GB200 NVL72 liquid cooling BoM]

Source: NVIDIA, Nomura, Morgan Stanley

Compared to the GB200, the GB300 takes a bold step forward in liquid cooling design. The most significant change is the cold plate structure: the GB300 abandons the model of covering multiple chips with one large cold plate and instead gives each chip an independent “one in, one out” cold plate. Taking the NVL72 system as an example, a single compute tray contains 6 chips, and each chip uses 2 pairs of quick connectors (one pair each for the inlet and outlet), totaling 12 pairs; adding 2 pairs connecting to the manifold gives 14 pairs per tray. Across the 18 compute trays, the system’s quick connector count thus rises to 252 pairs, more than double the GB200’s 108 (see the tally below).
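The same tally for the GB300, following the paragraph’s counting:

```python
# Quick-connector tally for a GB300 NVL72 compute tray fleet:
# one "one-in, one-out" cold plate per chip.

CHIPS_PER_TRAY = 6
PAIRS_PER_CHIP = 2            # one pair each for inlet and outlet
MANIFOLD_PAIRS_PER_TRAY = 2
COMPUTE_TRAYS = 18

gb300_tray_pairs = CHIPS_PER_TRAY * PAIRS_PER_CHIP + MANIFOLD_PAIRS_PER_TRAY  # 14
gb300_compute_pairs = gb300_tray_pairs * COMPUTE_TRAYS                        # 252
print(f"GB300 compute-tray UQD pairs: {gb300_compute_pairs} "
      f"(vs 108 on the GB200, a {gb300_compute_pairs / 108:.2f}x increase)")
```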

[Figure: GB300 reference board]

This independent cold plate design is a response to rising compute density. The GB300’s chip layout is much more compact, and a traditional large cold plate can no longer meet the heat dissipation requirements. Independent cold plates not only improve cooling efficiency but also open the door to future modularization and upgrades. However, the change also significantly increases the number of quick connectors and the system’s complexity.

Compared to its predecessor, the GB200, the GB300’s liquid-cooling design achieves breakthroughs in structure, efficiency and supply chain.

The GB300 abandons the GB200’s large-cold-plate scheme in favor of an independent one-in, one-out cold plate for each GPU chip. This design significantly improves cooling efficiency and allows more flexible hardware configurations. In the NVL72 system, for example, the number of cold plate quick connectors on a single compute tray increases from 6 pairs on the GB200 to 14 pairs, bringing the compute tray total for the system to 252 pairs, more than double the GB200’s count.

The GB300 adopts the new NVUQD03 quick connector, which shrinks to one-third the size of the previous model, and the unit price drops from the GB200’s $70-80 to $40-50. This change accommodates the high-density chip layout while reducing the per-unit cost of the liquid cooling system.
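As a rough sketch of what this means at rack level, the comparison below uses the midpoints of the quoted price ranges and assumes, hypothetically, that a “pair” means two billable units (male plus female); real BoM costing may differ:

```python
# Rough cost comparison for compute-tray quick connectors, using the
# midpoints of the quoted unit prices. Assumes (hypothetically) that a
# "pair" is two billable units (male + female).

GB200_PAIRS, GB200_UNIT_USD = 108, 75.0   # $70-80 per unit, GB200 UQD
GB300_PAIRS, GB300_UNIT_USD = 252, 45.0   # $40-50 per unit, NVUQD03

gb200_cost = GB200_PAIRS * 2 * GB200_UNIT_USD
gb300_cost = GB300_PAIRS * 2 * GB300_UNIT_USD
print(f"GB200 compute-tray connectors: ~${gb200_cost:,.0f}")  # ~$16,200
print(f"GB300 compute-tray connectors: ~${gb300_cost:,.0f}")  # ~$22,680
```

Note that even at the lower unit price, total connector spend per rack can still rise, because the connector count more than doubles; the saving is per unit, not necessarily per system.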

Although miniaturization can increase the risk of coolant leakage, the GB300 ensures stability through an optimized sealing process and accelerated testing (e.g., mating-cycle tests and material reliability verification). The connection between the cold plate and the manifold still uses quick connectors, but the cold plate end keeps a recessed female fitting for a more compact profile.

The GB300’s cold plate innovations do not overthrow the GB200 system wholesale; components such as the manifold, the CDU (coolant distribution unit) and the cabinet remain as originally designed, with adjustments made only to the core heat sink module. This strategy reduces development costs while preserving system compatibility.

Currently, the GB300’s switch tray is still predominantly air-cooled, with only the main chips water-cooled. However, NVIDIA has signaled plans for a full shift to liquid cooling, in which all components, including the front-panel transceiver connectors, may receive liquid-cooled modules. In the future, each optical module may get its own cold plate, and brazed copper piping may replace quick connectors as the mainstream design. This shift would significantly increase manufacturing complexity and cost, but it also paves the way for ultra-high-density compute. For now, the scheme is still at the design stage and its final form is not yet clear.

The GB300’s liquid cooling solution is not limited to GPUs.

Currently, the switch tray is still mainly air-cooled, but it may shift to liquid cooling in the future. If optical modules (such as the ConnectX-8 SuperNIC) are liquid-cooled, each fiber-optic connector will require a separate cold plate, likely attached by brazed copper piping rather than quick connectors, further pushing up cost.

As single-cabinet power density climbs to 140 kW, liquid cooling must work in tandem with highly efficient power delivery such as DrMOS. The GB300 reduces power supply costs by 35-40% through an optimized DrMOS design, while supercapacitor modules (though removed in some models) smooth out millisecond-scale load fluctuations.
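To see why a modest supercapacitor bank can smooth millisecond-scale swings, here is an illustrative energy calculation; the load step and bank parameters are hypothetical, not GB300 specifications:

```python
# Illustrative sketch: energy needed to bridge a millisecond load step
# versus the usable energy in a small supercapacitor bank.
# All numbers are hypothetical assumptions for illustration.

LOAD_STEP_W = 30_000   # assumed transient load swing for part of a rack
DURATION_S = 0.005     # 5 ms fluctuation
energy_needed_j = LOAD_STEP_W * DURATION_S   # 150 J

C_F, V_MAX, V_MIN = 10.0, 48.0, 40.0              # hypothetical supercap bank
usable_j = 0.5 * C_F * (V_MAX**2 - V_MIN**2)      # E = 1/2 * C * (V1^2 - V2^2)

print(f"Energy to bridge the step: {energy_needed_j:.0f} J")
print(f"Usable energy in the bank: {usable_j:.0f} J")
```

Under these assumptions the bank holds well over an order of magnitude more energy than a single millisecond-scale step requires, which is why such modules are effective as load smoothers even when quite small.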

According to institutional forecasts, the global liquid cooling market will reach $21.3 billion by 2030, and China’s liquid-cooled server market will grow at a compound annual growth rate of 47.6%. The launch of the GB300 will accelerate this process; demand for its quick connectors alone may exceed 150 million pieces in 2025.

NVIDIA binds core suppliers through its liquid cooling solutions, forming a technological and ecosystem barrier. Competitors will need breakthroughs in areas such as miniaturized quick connectors and high-precision manufacturing to win a share of the market.

Can GB300 fill AI’s “abyss of desire”?

Although the GB300’s liquid cooling technology significantly improves compute density and energy efficiency, challenges remain:

Cost pressure: top-of-the-line servers exceed $3 million, which is difficult for small and medium-sized enterprises (SMEs) to afford.

Technical risk: the long-term reliability of miniaturized quick connectors has yet to be proven, and coolant leakage could affect data center stability.

Ecosystem dependence: a highly centralized supply chain may constrain capacity flexibility.

The GB300’s liquid cooling solution is not just an iteration of cooling technology but a reconfiguration of infrastructure for the compute era. Its success will depend on supply chain coordination, cost control and long-term reliability verification. If NVIDIA can balance these factors, the GB300 may become a key well in the “new oil” era of AI, driving the compute revolution to new heights.
