LABDCN-2011 - Build and Monitor AI/ML Ready Network using Nexus Dashboard
Proctors | Vivek Dalvi |
This lab highlights the integration of Cisco Nexus 9000 switches and the Nexus Dashboard to support high-performance AI/ML workloads. It uses both private and shared infrastructure and covers the following two scenarios:

1. Deploying an AI/ML Fabric
In this scenario, we explore the deployment of an AI/ML fabric using virtual Nexus 9000v switches, with administrator access to the Nexus Dashboard. The Cisco Nexus Dashboard Fabric Controller (also known as the Fabric Controller service) provides best-practice configuration and automation capabilities, enabling network configuration in minutes. This includes essential features such as QoS configuration for PFC (Priority Flow Control) and ECN (Explicit Congestion Notification).

2. AI/ML Traffic Congestion Management
Here, we examine congestion management on shared physical hardware. Users can review pre-configured components and deploy AI/ML fabrics to understand how Cisco's networking solutions reduce latency and improve scalability and automation for AI applications. This scenario demonstrates the mechanisms that manage and mitigate network congestion for AI/ML RDMA traffic in backend networks. ECN and PFC, as implemented on Cisco switches, help sustain high throughput and low latency, keeping data flowing smoothly in demanding AI/ML environments.
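
To make the ECN and PFC interplay mentioned in the second scenario more concrete, the following is a minimal Python sketch of how a switch queue might decide to ECN-mark a packet (a WRED-style linear ramp) and when it might send a PFC pause (xoff/xon hysteresis). It is a simplified model and not the lab's configuration: the class and function names are hypothetical, the threshold values are illustrative and are not Cisco defaults, and a real switch evaluates ECN on egress queues and PFC on ingress buffers rather than on a single depth value.

```python
from dataclasses import dataclass


@dataclass
class QueueThresholds:
    """Illustrative thresholds in bytes. Assumed values, NOT Cisco defaults."""
    ecn_min: int = 150_000      # below this depth, no packet is ECN-marked
    ecn_max: int = 3_000_000    # at or above this depth, every packet is marked
    ecn_max_prob: float = 0.07  # marking probability just below ecn_max
    pfc_xoff: int = 4_000_000   # depth at which a PFC pause is sent upstream
    pfc_xon: int = 3_500_000    # depth at which the pause is lifted


def ecn_mark_probability(depth: int, t: QueueThresholds) -> float:
    """WRED-style ramp: 0 below ecn_min, linear up to ecn_max_prob, 1 at ecn_max."""
    if depth < t.ecn_min:
        return 0.0
    if depth >= t.ecn_max:
        return 1.0
    span = t.ecn_max - t.ecn_min
    return t.ecn_max_prob * (depth - t.ecn_min) / span


def pfc_paused(depth: int, currently_paused: bool, t: QueueThresholds) -> bool:
    """Hysteresis: pause at xoff, resume at xon, otherwise keep the current state."""
    if depth >= t.pfc_xoff:
        return True
    if depth <= t.pfc_xon:
        return False
    return currently_paused


if __name__ == "__main__":
    thresholds = QueueThresholds()
    paused = False
    # Walk a queue through a congestion episode and print both decisions.
    for depth in (100_000, 1_575_000, 3_200_000, 4_100_000, 3_700_000, 3_000_000):
        paused = pfc_paused(depth, paused, thresholds)
        prob = ecn_mark_probability(depth, thresholds)
        print(f"depth={depth:>9} ecn_prob={prob:.3f} pfc_paused={paused}")
```

In this sketch the ECN thresholds sit below the PFC xoff threshold, so senders are asked to slow down through ECN marks before the lossless pause mechanism engages. That ordering is a common design goal for RDMA backend networks, though the right values depend on platform, link speed, and buffer size.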