Article written by Peter A. Panfil, Vice President of Global Power, Vertiv. 

High-performance computing (HPC) workloads are on an upward trajectory and show no signs of slowing down. Bloomberg reports that generative AI is poised to be a $1.3 trillion business by 2032, while GPU performance has improved a thousandfold in just 10 years, with each new release arriving sooner and outpacing the last.

As workloads rise, data center operators also face larger power fluctuations caused by spikes in energy demand. This drives them to rethink their power infrastructure so it can support HPC workloads efficiently.

A dependable power supply starts with ensuring a reliable power train and continuous operation. Among the elements in the power train, uninterruptible power supplies (UPS) are essential for reliability, protecting against power interruptions, seamlessly delivering electricity to IT equipment, and upholding operational efficiency.

Understanding the data center power train for AI

The power train (see Figure 1) is the power distribution structure that runs from utility power to IT equipment, comprising switchgear, UPSs, power distribution units (PDUs), and other vital components. These technologies work together to provide electricity to data centers.

Figure 1. Diagram of a power train


An integrated system promotes maximum uptime, offering reliable electrical power and backup solutions to safeguard against outages and maintain continuous operations. To understand the efficiency of this setup, let's explore the flow of electricity and examine how each technology contributes to the overall function:

  1. Utility grid or renewable energy source: This is where the power for the data center infrastructure comes from.
  2. Automatic transfer switch: Seamlessly switches to backup power sources in the event of a primary power failure, ensuring continuous operation. Aside from the UPS, these backup power sources include:
    1. Fuel cells: Run on hydrogen and can act as primary or redundant power sources.
    2. Long-duration batteries: Offer higher power capacity and resilience than traditional UPS batteries.
  3. Critical switchgear: The first line of defense against external faults, such as utility voltage fluctuations or short circuits, directing power into the data center's network. It also serves as a vital safety measure, isolating equipment during maintenance or repairs.
  4. UPS: Provides critical power backup during utility outages. It also serves as a buffer, filtering voltage fluctuations or spikes coming from the grid, and it turns to the battery energy storage system (BESS) when the power grid goes down.
    1. BESS: Buffers power and smooths out short-term supply variations. Like the UPS, it can also store power as an energy reserve for a localized area.
  5. Static transfer switches: Provide immediate transfer of electrical loads between power sources, thus maintaining continuous and reliable power.
  6. Power distribution units (PDUs) and remote panel: Distribute the power efficiently to various computing nodes and storage systems.
  7. Busway and rack PDU: Deliver power to IT equipment, accurately meeting the demand from the installed computing resources.
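
One way to reason about the chain above is as an ordered list of stages, each passing a fraction of its input power downstream. The sketch below is a minimal illustration, not a model of any real installation: the `Stage` class, the helper names, and every efficiency figure are hypothetical placeholders chosen only to show the arithmetic.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Stage:
    name: str
    efficiency: float  # hypothetical fraction of input power passed downstream


# Stage order follows the list above. The efficiency values are placeholders
# for illustration only; they are not vendor figures or measurements.
POWER_TRAIN = [
    Stage("Utility grid or renewable energy source", 1.000),
    Stage("Automatic transfer switch", 0.999),
    Stage("Critical switchgear", 0.999),
    Stage("UPS", 0.960),
    Stage("Static transfer switch", 0.995),
    Stage("PDU and remote panel", 0.980),
    Stage("Busway and rack PDU", 0.990),
]


def end_to_end_efficiency(stages):
    """Multiply per-stage efficiencies to get the overall fraction delivered."""
    return prod(stage.efficiency for stage in stages)


def utility_power_needed(it_load_kw, stages):
    """Power that must enter the chain so that it_load_kw reaches the IT equipment."""
    return it_load_kw / end_to_end_efficiency(stages)


if __name__ == "__main__":
    for stage in POWER_TRAIN:
        print(f"{stage.name}: {stage.efficiency:.3f}")
    print(f"Overall: {end_to_end_efficiency(POWER_TRAIN):.3f}")
    print(f"Utility draw for a 1000 kW IT load: "
          f"{utility_power_needed(1000, POWER_TRAIN):.1f} kW")
```

Because the stages are in series, a loss in any one of them raises the power the utility must supply for the same IT load.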

AI workloads are reshaping the power dynamics in IT, adding new challenges to grid capacity and operational complexity. Despite rising rack densities, data center power infrastructure and technologies remain the same. This situation challenges data center owners and operators to ensure that every asset in the power train seamlessly delivers power from the utility to critical components of the entire infrastructure.

The UPS is crucial in mitigating disruptions during power outages, ensuring continuous and reliable electricity flow. Choosing the correct UPS system can help data center operators manage the power load brought by HPC applications.

How UPS manages the AI and HPC load

The UPS unit stands as the backbone of the data center power train, playing an indispensable role in stabilizing power delivery. UPS units ensure critical computational tasks can continue without interruption by mitigating power fluctuations and providing immediate backup during outages. 

Delivers consistent and balanced power 

Three-phase power systems can consistently deliver higher levels of balanced power. Each current in a three-phase system is separated by 120 degrees (see Figure 2), ensuring that when one phase reaches its peak, the other two are still contributing power, preventing any drop in delivery.

 

Figure 2. Three-phase AC power systems


On the other hand, the current-carrying legs of single-phase AC power systems are always 180 degrees apart (see Figure 3). Consequently, there are points in every cycle at which no power is delivered to the load, which makes single-phase power suitable only for household and light commercial applications.

 

Figure 3. Single-phase AC power


The three-phase system's ability to supply power consistently, without interruption, makes it an ideal choice for high-power computing applications. The constant and balanced power delivery can support the often-fluctuating demands of AI workloads without compromising performance or damaging critical equipment. It also allows for better utilization of energy, reducing wasted power and increasing efficiency.
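
The difference between the two systems can be checked numerically. The sketch below is a simplified illustration under stated assumptions: an ideal, balanced, purely resistive load, normalized peak voltage and current of 1.0, and a 50 Hz supply. The function names are hypothetical. It samples one cycle and compares the instantaneous power of a single sinusoidal phase with the sum of three phases spaced 120 degrees apart.

```python
import math

FREQ_HZ = 50.0  # assumed supply frequency; 60 Hz gives the same result


def single_phase_power(t, v_peak=1.0, i_peak=1.0):
    """Instantaneous power of one phase into a resistive load."""
    angle = 2 * math.pi * FREQ_HZ * t
    return v_peak * math.sin(angle) * i_peak * math.sin(angle)


def three_phase_power(t, v_peak=1.0, i_peak=1.0):
    """Sum of instantaneous power over three phases 120 degrees apart."""
    angle = 2 * math.pi * FREQ_HZ * t
    total = 0.0
    for k in range(3):
        shifted = angle - k * 2 * math.pi / 3  # 0, 120, and 240 degrees
        total += v_peak * math.sin(shifted) * i_peak * math.sin(shifted)
    return total


# 200 evenly spaced samples across one full cycle.
samples = [n / FREQ_HZ / 200 for n in range(200)]
single = [single_phase_power(t) for t in samples]
three = [three_phase_power(t) for t in samples]

print(f"single-phase power range: {min(single):.3f} to {max(single):.3f}")
print(f"three-phase power range:  {min(three):.3f} to {max(three):.3f}")
```

Under these assumptions the single-phase power swings between zero and its peak twice per cycle, while the three-phase total stays at 1.5 times the per-phase peak throughout, apart from floating-point error. Two legs 180 degrees apart reach zero at the same instants, so the same dip appears in that arrangement.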

Vertiv's large three-phase UPSs utilize advanced inverter control algorithms that actively sample and neutralize the active harmonics present in loads, addressing the challenges posed by "spikey" AI loads and their high harmonic content. These solutions ensure a cleaner power supply, enhancing the performance and reliability of systems running complex AI workloads.

Enables continuous and high-quality AC power

Online double-conversion UPS systems provide continuous, high-quality AC power, essential for smooth IT operations. By converting incoming AC to DC and then back to AC, they shield connected equipment from voltage issues and prevent damage. Furthermore, they feature a robust internal bypass that reduces the risk of downtime during maintenance or failures.

Online UPS systems isolate critical loads from power supply irregularities, safeguarding against all power problems and ensuring that AI systems operate on stable, clean energy. Their ability to maintain a perfect sine wave output and provide zero transfer time to the battery during outages is essential for preventing data loss and guaranteeing uninterrupted operations.

Integrating liquid cooling with UPS systems is crucial because the mechanical loads that distribute coolant also need a continuous power supply. Reliable cooling systems maintain optimal temperatures for equipment and facilities. With enhanced UPS systems, data center operators can keep cooling mechanisms running without disruption during power failures, which underscores the need for constant power in high-density computing environments.

Transitions seamlessly to energy storage solutions 

Grid-interactive UPS systems work with energy storage technologies like BESS to help manage the electricity needs of AI applications. They keep power available even during outages or periods of high demand. With fast frequency reserve (FFR) capabilities, UPS units respond quickly to supply-demand fluctuations, enabling a smooth transition to stored energy without interruption. This is vital for data centers with intensive AI and HPC workloads.

BESS works with the UPS to balance the AI load, storing extra energy when demand is low and releasing it during peak times to keep AI operations powered (also called "peak shaving"). This combination prevents overloads, maintains operational efficiency, and reduces the reliance on traditional power sources.
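
The peak-shaving behavior described above can be sketched as a simple loop over a load profile. This is a minimal illustration, not a description of how any particular BESS or UPS controller works: the function name, parameters, and sample numbers are all hypothetical, and the model ignores conversion losses, battery aging, and tariffs.

```python
def peak_shave(load_kw, grid_limit_kw, capacity_kwh, max_rate_kw,
               step_h=1.0, start_kwh=0.0):
    """Return (grid draw per step, final stored energy) for a load profile.

    Below the grid limit the battery charges with the spare headroom;
    above it the battery discharges to cover the excess, as far as its
    power rating and stored energy allow.
    """
    stored = start_kwh
    grid_kw = []
    for load in load_kw:
        if load > grid_limit_kw:
            # Discharge: limited by the excess, the rating, and what is stored.
            discharge = min(load - grid_limit_kw, max_rate_kw, stored / step_h)
            stored -= discharge * step_h
            grid_kw.append(load - discharge)
        else:
            # Charge: limited by headroom, the rating, and remaining capacity.
            charge = min(grid_limit_kw - load, max_rate_kw,
                         (capacity_kwh - stored) / step_h)
            stored += charge * step_h
            grid_kw.append(load + charge)
    return grid_kw, stored


# Hypothetical hourly load in kW: two quiet hours, a two-hour peak, then quiet.
load = [400, 400, 900, 900, 400]
grid, stored = peak_shave(load, grid_limit_kw=600, capacity_kwh=500,
                          max_rate_kw=300)
print("grid draw (kW):", grid)
print("energy left (kWh):", stored)
```

With these made-up numbers the grid draw peaks at 800 kW instead of 900 kW. The battery runs out partway through the second peak hour, which shows why storage capacity has to be sized against how long peaks last as well as how high they are.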

Power your AI/HPC deployment

High-capacity UPS systems can help manage the AI load demands of data centers, serving as a robust backbone for the entire power train. However, preparing for the requirements of the rising AI and HPC workloads requires more than just a reliable UPS.

Vertiv offers a comprehensive solution, backed by support and expertise, for data centers handling AI and other HPC workloads. We offer a wide range of products and technologies for AI/HPC. Whether you need brand-new technologies or retrofit systems for higher densities, we have you covered.
