Deploying generative AI applications with NVIDIA NIMs on Amazon EKS
Amazon EKS is a managed service for running Kubernetes workloads on AWS. We can use EKS to orchestrate NVIDIA NIM (plural: NIMs) pods across multiple nodes because it automatically manages the availability and scalability of the Kubernetes control plane nodes, which are responsible for scheduling containers, managing application availability, storing cluster data, and other key tasks.
Partner Contribution
• Dec 3, 2024
Partner POV | Gain operational insights for NVIDIA GPU workloads using Amazon CloudWatch Container Insights
As machine learning models grow more advanced, they require extensive computing power to train efficiently. Many organizations are turning to GPU-accelerated Kubernetes clusters for both model training and online inference. However, properly monitoring GPU usage is critical for machine learning engineers and cluster administrators to understand model performance and to optimize infrastructure utilization.
Partner Contribution
• Dec 4, 2024