How 3nm DSPs and 224G SerDes enable 1.6T Ethernet for AI fabrics

Tuesday 5 May 2026, 01:04 PM

Explore how IEEE P802.3dj, 224G SerDes, and 3nm PAM4 DSPs overcome 56 GHz signal loss to enable 1.6T Ethernet for next-generation hyperscale AI fabrics.


If you spend enough time around the Bay Area’s AI ecosystem, you start to notice a recurring frustration. We are building incredibly sophisticated models capable of real-time translation, advanced accessibility features, and fluid conversational interfaces, but the end-user experience often hinges on a loading spinner. The bottleneck isn't always the compute power of the GPUs themselves; it’s the network trying to keep them synchronized.

When you ask an AI a complex question, or when a user relies on a real-time vision model to navigate their environment, the responsiveness of that experience is dictated by how fast thousands of GPUs can talk to each other. That’s why the industry's shift to 1.6 Terabit Ethernet (1.6T) under the IEEE P802.3dj standard is so pivotal. It’s not just a numbers game for data centers—it’s the foundational layer that will make next-generation AI applications feel instant, intuitive, and accessible to everyone.

After watching the landscape evolve, particularly following the live multi-vendor interoperability demonstrations by the Ethernet Alliance at the Optical Fiber Communication Conference (OFC) in March 2026, it's clear that 1.6T has moved off the theoretical roadmaps and into reality. But getting here required solving some massive physical hurdles.

Beating the 56 GHz physics problem

To push 1.6T, the industry is relying on an 8x200G lane configuration using 224 Gbps-class electrical and optical signaling with PAM4 encoding. The problem? PAM4 carries two bits per symbol, so a 224 Gbps lane runs at 112 gigabaud, which pushes the fundamental operating frequency to the 56 GHz Nyquist limit.
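
As a sanity check, that arithmetic fits in a few lines of Python (a minimal sketch of the relationship, nothing vendor-specific):

```python
# Why 224G PAM4 lanes imply a 56 GHz Nyquist frequency.
line_rate_bps = 224e9        # per-lane signaling rate, 224 Gbps class
bits_per_symbol = 2          # PAM4 uses 4 amplitude levels = 2 bits/symbol

baud_rate = line_rate_bps / bits_per_symbol   # 112e9 symbols/second
nyquist_hz = baud_rate / 2                    # highest frequency the channel must carry

print(f"{baud_rate / 1e9:.0f} GBd -> Nyquist at {nyquist_hz / 1e9:.0f} GHz")
# 112 GBd -> Nyquist at 56 GHz
```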

At this frequency, standard PCB traces and connectors stop being conduits and start acting like severe attenuators, introducing 40-50 dB of channel loss. It’s a hostile physical environment. If we can't get clean signals across the fabric, AI clusters stall, and the end-user experiences lag or dropped connections.
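
To put 40-50 dB in perspective, here is what those figures mean in linear terms (a quick conversion using only the numbers quoted above):

```python
# How much signal survives a 40-50 dB channel.
def surviving_fraction(loss_db: float) -> tuple[float, float]:
    amplitude = 10 ** (-loss_db / 20)   # voltage/amplitude ratio
    power = 10 ** (-loss_db / 10)       # power ratio
    return amplitude, power

for loss_db in (40, 50):
    amp, pwr = surviving_fraction(loss_db)
    print(f"{loss_db} dB: {amp:.2%} of amplitude, {pwr:.4%} of power remains")
# 40 dB: 1.00% of amplitude; 50 dB: 0.32% -- the receiver has to reconstruct
# a signal arriving at a fraction of a percent of its launch amplitude.
```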

To recover these heavily degraded signals, we have to use ultra-complex Digital Signal Processors (DSPs) equipped with advanced equalization techniques such as feed-forward and decision-feedback equalization. The real breakthrough I’m seeing is the shift to fabricating these DSPs on a 3nm CMOS process. All that equalization burns power, and the smaller process node is what keeps the heat in check; it’s a thermal and operational imperative that makes the whole system viable.
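
To make "equalization" concrete, here is a toy feed-forward equalizer (FFE) in Python/NumPy. A real 224G DSP uses many more taps plus decision-feedback and sequence-detection stages running in dedicated silicon; the channel response, tap count, and noise level below are illustrative assumptions, not figures from any product:

```python
import numpy as np

# Toy feed-forward equalizer (FFE) for a PAM4 stream -- illustrative only.
rng = np.random.default_rng(0)
symbols = rng.choice([-3.0, -1.0, 1.0, 3.0], size=2000)   # PAM4 levels

# Assumed channel: each symbol smears into its neighbours (inter-symbol interference).
channel = np.array([0.1, 1.0, 0.4, 0.2])
received = np.convolve(symbols, channel)[: len(symbols)]
received += rng.normal(scale=0.05, size=len(received))     # additive noise

# Fit FFE taps by least squares: a sliding window of received samples
# should reconstruct the symbol whose channel energy the window covers.
n_taps, delay = 7, 3
X = np.array([received[i : i + n_taps] for i in range(len(received) - n_taps)])
y = symbols[delay : delay + len(X)]
taps, *_ = np.linalg.lstsq(X, y, rcond=None)

mse_raw = np.mean((received[1:] - symbols[:-1]) ** 2)   # main channel tap only
mse_ffe = np.mean((X @ taps - y) ** 2)
print(f"MSE before FFE: {mse_raw:.3f}   after FFE: {mse_ffe:.3f}")
```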

Keeping infrastructure intuitive and sustainable

Whenever we introduce a massive leap in bandwidth, my immediate concern is how it impacts deployment. If upgrading to 1.6T requires ripping out existing data center architectures to install exotic liquid cooling, it limits scalability.

This is where practical innovation shines. Marvell, for instance, recently demonstrated a 20% reduction in optical module power consumption with its 3nm Ara PAM4 DSP. That power efficiency is critical because it keeps 1.6T pluggable modules below the strict 22W threshold. Why does that matter? It means operators can maintain standard air-cooling in high-density 51.2T and 102.4T switches.
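
The arithmetic behind that threshold is worth spelling out. A 51.2T switch fully populated with 1.6T ports carries 32 pluggable modules on its faceplate, so every watt per module is multiplied 32 times (a rough budget using the 22W ceiling quoted above; the rest is straightforward arithmetic):

```python
# Faceplate optics power for a fully loaded 51.2T switch built from 1.6T ports.
switch_gbps = 51_200
port_gbps = 1_600
module_ceiling_w = 22.0        # air-cooling threshold for pluggables, per the text

modules = switch_gbps // port_gbps            # 32 modules
worst_case_w = modules * module_ceiling_w     # 704 W of optics alone
saved_w = worst_case_w * 0.20                 # the 20% DSP-driven cut, applied naively

print(f"{modules} modules x {module_ceiling_w:.0f} W = {worst_case_w:.0f} W of optics")
print(f"A 20% cut frees roughly {saved_w:.0f} W per switch")
# 32 x 22 W = 704 W; a 20% reduction hands back ~141 W per chassis --
# and four times that for a 102.4T box.
```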

Similarly, Acacia (now part of Cisco) is sampling its 3nm Kibo 1.6T PAM4 DSP designed specifically for standard OSFP and QSFP-DD form factors. By supporting complex Transmit Retimed Optics (TRO) configurations in familiar packages, they are keeping the hardware intuitive. Data center engineers can deploy 1.6T DR8 and 2xFR4 modules without having to reinvent their physical workflows. When infrastructure is easier to scale, the barrier to deploying larger, more capable AI models drops, ultimately benefiting the end users who rely on them.
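
For orientation, here is how those two module flavors break down, assuming the usual IEEE reach conventions (DR for 500 m over parallel single-mode fibre, FR for 2 km over duplex fibre with wavelength multiplexing); treat this as a sketch, not a spec excerpt:

```python
# Rough breakdown of the 1.6T module configurations named above.
MODULES = {
    "1.6T-DR8":   {"lanes": 8, "gbps_per_lane": 200, "reach_m": 500,
                   "medium": "8 parallel single-mode fibre pairs"},
    "1.6T-2xFR4": {"lanes": 8, "gbps_per_lane": 200, "reach_m": 2000,
                   "medium": "2 duplex fibres x 4 WDM wavelengths each"},
}

for name, m in MODULES.items():
    total = m["lanes"] * m["gbps_per_lane"]
    print(f"{name}: {total} Gbps, {m['medium']}, up to {m['reach_m']} m")
```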

The latency game and what it means for real-time AI

Raw speed is great, but in synchronous GPU-to-GPU communications, latency is the true metric of success. Even microsecond delays can stall AI training operations, increasing Job Completion Time (JCT) and driving up the cost of compute.

Credo Semiconductor’s 3nm Bluebird 1.6T Optical DSP caught my eye recently by hitting ultra-low latency of under 40 nanoseconds. This is exactly the kind of verifiable metric we need to ensure that scale-out AI fabrics, like Nvidia's Blackwell clusters, run efficiently.
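
To see why tens of nanoseconds matter, consider a rough one-way latency budget across a two-tier fabric. Only the sub-40 ns DSP figure comes from the spec sheet above; the switch latency, link length, and topology are illustrative assumptions:

```python
# Rough one-way latency budget across a two-tier (leaf-spine-leaf) fabric.
# Only the sub-40 ns DSP number comes from the text; the rest are assumptions.
dsp_ns = 40          # per optical-DSP traversal (the Bluebird ceiling quoted above)
switch_ns = 600      # assumed cut-through switch forwarding latency
fiber_ns_per_m = 5   # ~5 ns per metre of fibre
link_m = 30          # assumed average link length

switches = 3                      # GPU -> leaf -> spine -> leaf -> GPU
links = switches + 1              # four links, each with an optical DSP at both ends

total_ns = (switches * switch_ns
            + links * (2 * dsp_ns + link_m * fiber_ns_per_m))
print(f"One-way: {total_ns} ns ({total_ns / 1000:.2f} us)")
# If each DSP traversal cost 100 ns instead of 40 ns, this path would be
# 8 x 60 = 480 ns slower -- multiplied across millions of synchronous
# collective operations during a training run.
```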

But we can't just take the spec sheet at face value. Validating this performance is incredibly complex, which is why Keysight Technologies’ deployment of the AresONE 1600GE platform is so important. Hyperscalers are using it to rigorously validate Layer 1 through Layer 3 performance over those 224G SerDes lanes. They are specifically testing Forward Error Correction (FEC) integrity and AI cluster congestion control. Managing FEC latency is a delicate balancing act—if the system takes too long to correct errors, we get bottlenecks. Keysight's emulation ensures the fabric can handle real-world AI workloads without degrading the user experience at the application layer.
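
One concrete piece of that balancing act: a Reed-Solomon decoder cannot correct anything until it has buffered a full codeword. Using the RS(544,514) code familiar from earlier 802.3 generations as a reference point (P802.3dj's exact FEC framing and interleaving may differ), the floor on accumulation latency falls out of simple arithmetic:

```python
# Time to accumulate one RS(544,514) codeword before decoding can start.
# (A reference-point sketch; 802.3dj's actual FEC scheme may interleave differently.)
codeword_bits = 544 * 10        # 544 ten-bit Reed-Solomon symbols = 5440 bits

for lane_gbps in (100, 200):
    fill_ns = codeword_bits / lane_gbps    # bits / (Gbit/s) gives nanoseconds
    print(f"{lane_gbps}G lane: {fill_ns:.1f} ns to fill one codeword")
# 100G: 54.4 ns; 200G: 27.2 ns -- plus decode and interleaver delay on top.
# Faster lanes shrink the fill time, but stronger or concatenated codes
# claw that latency back, hence the balancing act.
```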

Navigating the transition

As optimistic as I am about what 1.6T Ethernet will unlock for AI accessibility and user experience, the transition won't be entirely frictionless.

We are looking at an "air pocket" market risk as hyperscalers navigate the jump from 800G to 1.6T. Furthermore, we are witnessing the physical death of traditional PCB routing. To survive the signal loss at 56 GHz, the industry is being forced to adopt "flyover" twinax cables and Co-Packaged Optics (CPO). It’s a necessary architectural shift, but it requires a fundamental rethinking of how we build switches and servers.

Despite these growing pains, the ecosystem is maturing rapidly. By solving the deep, physical layer problems of high-speed data transfer, we are laying the groundwork for AI tools that don't just process data, but interact with us seamlessly, instantly, and intuitively. The loading spinner's days are numbered.

