June 16, 2025
Networking in the Era of AI
IT leaders are preparing their networks to support emerging AI applications while also leveraging the technology to improve network reliability and performance.
- AI IN THE NETWORK
- AI ON THE NETWORK
- AI ERA NETWORKING SERVICES
Although line-of-business use cases have recently garnered widespread attention, AI can deliver profound benefits in data centers by automating configurations, optimizing performance and strengthening security. Gartner predicts that 30% of enterprises will use AI to automate more than half of their network activities by 2026, up from less than 10% of organizations in 2023. Among the capabilities driving this surge:
IMPROVED PERFORMANCE: Networking vendors are integrating advanced automation to optimize connectivity and enhance user experiences. For example, radio resource management optimizes wireless network performance by dynamically adjusting radio parameters such as transmit power, channel allocation and beamforming. This process provides an optimal balance of coverage, capacity and interference. Previously, network administrators needed to make these adjustments manually. Similarly, modern networking switches can handle engineering tasks such as dynamic load balancing, using AI and machine learning (ML) techniques to distribute traffic flows more evenly.
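To make the idea of dynamic load balancing concrete, the following is a minimal sketch of placing traffic flows on whichever uplink currently carries the least load. It uses a simple greedy heuristic rather than any vendor's AI or ML technique, which the article does not describe. The `Link` class, the `assign_flows` function and the link and flow names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """A hypothetical uplink with a running total of assigned traffic."""
    name: str
    load_gbps: float = 0.0


def assign_flows(flows_gbps, links):
    """Place each flow on the link that currently has the least load.

    Flows are handled largest first, a common greedy heuristic for
    spreading load evenly. Returns a mapping of flow id to link name.
    """
    placement = {}
    for flow_id, rate in sorted(flows_gbps.items(), key=lambda kv: -kv[1]):
        target = min(links, key=lambda link: link.load_gbps)
        target.load_gbps += rate
        placement[flow_id] = target.name
    return placement


links = [Link("uplink-1"), Link("uplink-2")]
flows = {"a": 40.0, "b": 30.0, "c": 20.0, "d": 10.0}
placement = assign_flows(flows, links)
for link in links:
    print(link.name, link.load_gbps)
```

With these illustrative numbers, both uplinks end up carrying 50 Gbps. A production switch would react to measured congestion rather than to declared flow rates, so this is only an outline of the goal.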
AI is also central to intent-based networking (IBN), which translates high-level business intent into network policies and then automates the execution, monitoring and enforcement of those policies. Networks that leverage IBN can automatically adjust to changes in traffic, user demand and application requirements, helping to maintain performance and resilience.
SIMPLIFIED MANAGEMENT: Modern networks generate an immense amount of telemetry and log data, and AI-driven management platforms can use this information to generate critical insights and automate routine management tasks. In the past, network administrators largely managed their environments device by device, but modern platforms provide a unified view of devices, switches, routers and even cloud networks. In addition to this real-time monitoring and visibility, modern network management solutions use AI and ML to correlate events across different systems and detect patterns.
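As a rough illustration of correlating events across systems, the sketch below groups log events that occur close together in time and keeps only the groups that involve more than one source. Real platforms use far richer signals. The `Event` class, the `correlate` function and the 30-second window are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A hypothetical normalized log or telemetry event."""
    timestamp: float  # seconds since some reference point
    source: str       # e.g. "switch", "router", "firewall"
    message: str


def correlate(events, window_seconds=30.0):
    """Group events that fall within a time window of the group's first event.

    Only groups containing events from more than one source are returned,
    since those are the candidates for a shared underlying cause.
    """
    groups = []
    current = []
    for event in sorted(events, key=lambda e: e.timestamp):
        if current and event.timestamp - current[0].timestamp > window_seconds:
            groups.append(current)
            current = []
        current.append(event)
    if current:
        groups.append(current)
    return [g for g in groups if len({e.source for e in g}) > 1]


events = [
    Event(0.0, "switch", "uplink flapping"),
    Event(10.0, "router", "BGP session reset"),
    Event(100.0, "firewall", "policy reload"),
]
for group in correlate(events):
    print([f"{e.source}: {e.message}" for e in group])
```

Here the switch and router events are reported together, while the isolated firewall event is left out.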
Automation tools have also become critical enablers of routine network management tasks, such as patching. Such tasks are necessary, but they are also time-consuming, and the management burden associated with them can keep senior engineers from working on projects that require their knowledge and expertise. By offloading some of their low-level network management activities, administrators can free up time for higher-value tasks.
AUTOMATED TROUBLESHOOTING: AI-powered networking tools can identify, diagnose and sometimes even resolve network problems automatically, often before users notice any disruption. This reduces the need for manual intervention, speeds troubleshooting and lowers the risk of human error.
Increasingly, networking tools are incorporating generative AI features that allow IT teams to use natural-language prompts to collect network data and translate that information into insights. In the past, engineers had to set up individual alerts and dashboard monitors to gather these insights. Now, through tools such as Slack or Microsoft Teams, they can access observability solutions, using a chatbot to make real-time inquiries. This not only makes it easier to gather information and perform troubleshooting activities but also “democratizes” network monitoring interfaces by making them accessible even to users without an IT background.
ENHANCED SECURITY: Network security tools have grown better at detecting small anomalies that may be indicators of a cyberattack. Using AI and ML, these tools analyze vast amounts of network data in real time, identifying both known and unknown threats by hunting for unusual patterns and behaviors that traditional security tools might miss.
By first learning what normal network activity looks like, AI-powered security solutions can quickly flag any deviations that could indicate malware, data breaches or insider threats. Some organizations that deployed automated tools (such as managed detection and response platforms) several years ago quickly became overwhelmed with false alarms. But these tools have improved over time, and they are now much better at distinguishing between harmless anomalies and true threats.
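The baseline-then-deviation approach can be sketched in a few lines. This example learns the mean and standard deviation of a traffic metric and flags values that sit more than a set number of standard deviations away. It is a deliberately simple statistical stand-in for the ML models that commercial tools use. The function names, the sample values and the threshold of 3.0 are assumptions.

```python
from statistics import mean, stdev


def learn_baseline(samples):
    """Summarize normal activity as (mean, sample standard deviation)."""
    return mean(samples), stdev(samples)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value that deviates from the baseline by more than
    `threshold` standard deviations."""
    mu, sigma = baseline
    if sigma == 0:
        # With no observed variation, any different value is a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical requests-per-second samples from a quiet period.
normal_traffic = [100, 102, 98, 101, 99, 100, 103, 97]
baseline = learn_baseline(normal_traffic)

print(is_anomalous(104, baseline))  # False: within normal variation
print(is_anomalous(110, baseline))  # True: far outside the baseline
```

Raising the threshold reduces false alarms at the cost of missing subtler deviations, which is the trade-off the paragraph above describes.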
Supporting generative AI is largely a compute-centric challenge, with LLM training typically requiring large clusters of expensive, scarce graphics processing units (GPUs). However, these workloads also introduce new networking requirements, particularly for east-west traffic within the data center. To support emerging AI applications, organizations will need to deliver lossless data flows, maintain multiple specialized sub-networks within their data centers and leverage technologies that will support AI at scale.
LOSSLESS NETWORKING PROTOCOLS: To maximize GPU utilization and avoid idle time, networks must minimize packet loss and jitter. Even minor packet loss can dramatically degrade AI training performance, and the new emphasis on lossless networking means that many network administrators are rapidly getting up to speed on protocols with lengthy names. For instance, Remote Direct Memory Access (RDMA) over Converged Ethernet, version 2 (or RoCEv2, pronounced “Rocky Vee Two”), allows direct memory-to-memory data transfers across nodes with extremely low latency. The protocol operates over standard Ethernet infrastructure, but it requires a no-drop environment, which can be achieved through Data Center Bridging (DCB) enhancements.
Organizations with significant AI training workloads may need to upgrade data center switches to models that support DCB, and networking teams may need upskilling to ensure they are comfortable managing RDMA-capable networks. Major networking vendors have begun to incorporate AI-friendly networking protocols into their product lines, with leading Ethernet switch vendors supporting RoCEv2.
LOGICAL NETWORK DESIGN: Enterprise AI deployments introduce the need for multiple logical networks within the data center, each with its own distinct purpose. IT leaders should plan to build and support at least three logical networks: a front-end network, an out-of-band (OOB) management network and a cluster (or back-end) network. The front-end network is the traditional data center LAN that connects AI servers to applications and carries user traffic, API calls and general business data. While a front-end network may not have special low-latency requirements, it will often need high capacity to support large data sets without creating bottlenecks.
The OOB network connects to the servers’ management interface and cluster controllers, giving administrators control and visibility for AI clusters even if primary networks are congested. Finally, the cluster network is a high-performance fabric linking AI servers and storage nodes. This multinetwork architecture allows organizations to rightsize investments and optimize performance for AI workloads, but some aspects may be unfamiliar to internal IT professionals, depending on their background and training.
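The three-network layout can be summarized as a simple lookup from traffic type to logical network. The sketch below is only a way of restating the design in code. The traffic-type labels and the `network_for` helper are hypothetical and do not correspond to any product's configuration model.

```python
from enum import Enum


class LogicalNetwork(Enum):
    FRONT_END = "front-end"
    OOB_MANAGEMENT = "out-of-band management"
    CLUSTER = "cluster (back-end)"


# Hypothetical traffic labels mapped to the network the article assigns them to.
TRAFFIC_TO_NETWORK = {
    "user_traffic": LogicalNetwork.FRONT_END,
    "api_call": LogicalNetwork.FRONT_END,
    "business_data": LogicalNetwork.FRONT_END,
    "server_management": LogicalNetwork.OOB_MANAGEMENT,
    "cluster_controller": LogicalNetwork.OOB_MANAGEMENT,
    "server_to_server": LogicalNetwork.CLUSTER,
    "server_to_storage": LogicalNetwork.CLUSTER,
}


def network_for(traffic_type):
    """Return the logical network that should carry a given traffic type."""
    try:
        return TRAFFIC_TO_NETWORK[traffic_type]
    except KeyError:
        raise ValueError(f"unknown traffic type: {traffic_type!r}") from None


print(network_for("api_call").value)           # front-end
print(network_for("server_to_storage").value)  # cluster (back-end)
```

Keeping the mapping explicit makes it easier to see which network needs low latency (the cluster fabric) and which mainly needs capacity or isolation.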
INFINIBAND VS. ETHERNET: Organizations supporting AI workloads will need to make a key architectural decision when choosing between InfiniBand and Ethernet connectivity for their cluster networks. InfiniBand offers ultralow latency and high bandwidth for high-performance computing networks, making it a natural fit for AI training clusters, where speed is especially important. However, deploying InfiniBand requires a specialized skill set, and engineers outside of teams focused on high-performance computing typically lack expertise with the InfiniBand networking stack.
Ethernet, which has rapidly evolved to meet the needs of AI workloads, offers an alternative. Modern data center Ethernet can now approach InfiniBand-like performance for most workloads, and virtually all enterprise IT teams already have extensive experience with Ethernet networks. Vendors such as NVIDIA have begun publishing validated reference architectures that use Ethernet for the primary cluster network, signaling a shift to the technology for enterprise AI clusters. As vendors continue to support these offerings, Ethernet is likely to become the default for enterprise AI deployments, with InfiniBand reserved for niche scenarios.
The challenges of supporting modern computing and connectivity require IT and business leaders to be strategic about network design, implementation and management. Services from a trusted partner such as CDW can help organizations overcome knowledge gaps, support emerging applications and identify and deploy innovative networking tools.
NEXT-GEN NETWORK ASSESSMENTS: During a CDW Next-Generation Network Assessment (NGNA), experts evaluate existing network infrastructure to identify areas for improvement, with more than 140 robust checks based on best practices from leading manufacturers. These assessments cover areas such as architecture and configuration, and they can be tailored to an organization’s specific needs.
NETWORK INFRASTRUCTURE MODERNIZATION: As organizations seek to upgrade, refresh or expand their existing networking infrastructure, CDW’s solution architects can provide vendor-agnostic advice about which switches, routers and other networking equipment will help them reach their goals. This assistance is particularly helpful for organizations that have not made previous investments in next-generation networking tools that extensively incorporate automation, as well as for those that are planning to support AI workloads for the first time.
HIGH-PERFORMANCE NETWORKING: For organizations building AI clusters or other data-intensive applications, CDW can help design and deploy high-speed, low-latency networks. Services include configuring lossless Ethernet or InfiniBand connectivity, implementing RDMA protocols and optimizing quality of service policies. By leveraging the expertise of CDW’s solution architects, organizations can accelerate the deployment of AI-ready infrastructure that integrates with their broader IT environments.
IMPLEMENTATION AND INTEGRATION: CDW provides hands-on support to help organizations deploy new networking solutions and integrate them with their existing infrastructure. Services include configuration of new infrastructure, workload migration and performance testing to ensure networking solutions meet performance demands. By working with CDW, organizations can sidestep common pitfalls and accelerate the time to value for their networking investments.
AI-ENABLED NETWORK MANAGEMENT: As organizations introduce AI-enabled management platforms to their networking environments, CDW can help internal teams set up and become familiar with anomaly detection platforms, AI-assisted troubleshooting features and other advanced management tools. As organizations mature in these workflows, their networks will become increasingly automated and self-healing, freeing up time for administrators to focus on higher-value activities.
KNOWLEDGE TRANSFER AND TRAINING: To help organizations close skill gaps, CDW delivers custom training for both networking and AI, as well as vendor-specific classes to help IT professionals get up to speed on new networking equipment. Offerings range from workshops on managing InfiniBand networks to sessions on using AI-driven network analytics effectively.
ONGOING SUPPORT: CDW offers ongoing services to optimize network performance as AI demands evolve. Support includes regular health checks, software updates and scaling assistance for new AI workloads. Acting as an extension of an organization’s internal IT team, CDW can help networks continue to deliver the performance and reliability that the organization and its users need, even as workloads, networking technology and business requirements change.
Addressing Networking Skill Gaps With AI
According to Network Computing, there are several ways AI can address skill gaps for organizations that struggle to fill networking roles.
INCREASE PRODUCTIVITY: Specialized tasks require specific knowledge or expertise and often take longer to complete than other tasks. AI tools can help IT staff members accomplish more with their time.
EMPOWER GENERALISTS: Growing complexity has made it a challenge for many organizations to hire specialists for every part of the network. AI tools can help generalists apply their skills across network infrastructure.
COMBINE GENAI WITH DIGITAL TWINS: By combining these emerging technologies, engineers can gain valuable insights into network operations, helping them identify security concerns, troubleshoot issues and meet compliance requirements.
DEMOCRATIZE DATA ACCESS: AI tools put actionable data into the hands of everyone on a networking team, enabling less experienced engineers to use natural-language prompts to gain network insights.
Rick McGee
CDW Expert
Tim Larson
CDW Contributor