Specialized accelerator-to-accelerator communication
Many technology companies are now implementing specialized accelerator-to-accelerator connection links, as conceptualized in Figure 6 below. NVIDIA's NVLink is an example of a specialized communication path that dramatically improves bandwidth between accelerators for applications that benefit from peer-to-peer data exchange.
Figure 6: Accelerator-to-accelerator connection links
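To make the peer-to-peer exchange concrete, the short CUDA sketch below enables peer access between two GPUs and performs a direct device-to-device copy. This is a minimal, hypothetical illustration (the device IDs, buffer size, and omitted error checking are assumptions for this example, not details from any specific platform); on NVLink-connected GPUs, the cudaMemcpyPeer traffic travels over the dedicated link rather than across PCIe and the host.

    // Minimal sketch (illustrative only): direct GPU-to-GPU data exchange
    // using the CUDA peer-to-peer API. Device IDs 0 and 1 and the 64 MiB
    // buffer size are arbitrary assumptions for this example.
    #include <cuda_runtime.h>
    #include <cstdio>

    int main() {
        int canAccess = 0;
        // Ask the driver whether GPU 0 can address GPU 1's memory directly.
        cudaDeviceCanAccessPeer(&canAccess, 0, 1);
        if (!canAccess) {
            printf("Peer access between GPU 0 and GPU 1 is not supported.\n");
            return 0;
        }

        const size_t bytes = 64 << 20;  // 64 MiB test buffer
        float *src = nullptr, *dst = nullptr;

        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);  // map GPU 1's memory into GPU 0's address space
        cudaMalloc(&src, bytes);

        cudaSetDevice(1);
        cudaDeviceEnablePeerAccess(0, 0);
        cudaMalloc(&dst, bytes);

        // Direct device-to-device copy; no staging through host memory.
        cudaMemcpyPeer(dst, 1, src, 0, bytes);
        cudaDeviceSynchronize();

        cudaFree(dst);
        cudaSetDevice(0);
        cudaFree(src);
        printf("Peer copy of %zu bytes completed.\n", bytes);
        return 0;
    }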
To be clear, while these auxiliary connection types are extremely valuable for some customer use cases, other deep learning applications yield very little benefit from this type of interconnect. Furthermore, these specialty interconnects can be costly, both in materials and in the design changes required to accommodate them.
In fact, the current proprietary interconnect trend is driving unique server designs built solely to support the interconnect, resulting in wide variations in hardware from vendor to vendor. Accelerator technology vendors are, seemingly, abandoning all forms of conventional design guidelines in their pursuit of maximum peer-to-peer bandwidth. This may be the single biggest pain point in designing a truly optimized deep learning platform.
What’s next for Machine Learning platforms?
Physical manifestation of the peer-to-peer interconnect is not the only place where deep learning technology providers are departing from conventional techniques. In the pursuit of ever-improved performance, some vendors are moving beyond the PCIe form factor, pushing past the accepted power and heat limits, and writing new math libraries. Platform designers need to be aware that the technology underpinning the explosive growth in machine learning is still very fluid and divergent.
Conclusion
Machine learning customers have more choices than ever for neural network models and frameworks. Those choices impact the type, number, and form factor of the preferred accelerator, the dataflow topology between accelerators and CPUs, the amount and speed of direct-attached storage, and the necessary bandwidth of I/O devices. The resulting platform must:
• Serve their specific learning model, not an unrelated deep learning model.
• Stay within their data center requirements for server form factor, rack depth, power, and cooling.
• Be management agnostic.
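The dataflow topology called out above can also be probed at runtime. The hedged CUDA sketch below (an illustrative probe under our own assumptions, not part of any Dell EMC tooling) walks every GPU pair and reports whether peer access is supported along with the link's relative performance rank, which is how the runtime distinguishes, for example, an NVLink connection from a PCIe hop through the CPU.

    // Illustrative sketch: enumerate GPU pairs and query the peer-to-peer
    // attributes the CUDA runtime exposes. A lower performance rank
    // indicates a faster link between the two devices.
    #include <cuda_runtime.h>
    #include <cstdio>

    int main() {
        int n = 0;
        cudaGetDeviceCount(&n);
        for (int src = 0; src < n; ++src) {
            for (int dst = 0; dst < n; ++dst) {
                if (src == dst) continue;
                int access = 0, rank = 0;
                cudaDeviceGetP2PAttribute(&access, cudaDevP2PAttrAccessSupported, src, dst);
                cudaDeviceGetP2PAttribute(&rank, cudaDevP2PAttrPerformanceRank, src, dst);
                printf("GPU %d -> GPU %d: peer access %s, performance rank %d\n",
                       src, dst, access ? "yes" : "no", rank);
            }
        }
        return 0;
    }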
Solving a platform optimization challenge with this many degrees of freedom may seem daunting, but Dell EMC is committed to helping our customers meet this challenge. Today, we are already working with a wide range of customers across a number of industries to solve some of the most complex and interesting machine learning problems. And going forward, we are committing resources to ensure we remain a technology leader in this arena.
For more information on what Dell EMC Extreme Scale Infrastructure is doing with Machine Learning, contact ESI@dell.com.
© 2018 Dell Inc. or its subsidiaries. All Rights Reserved. Dell, EMC and other trademarks are trademarks of Dell Inc. or its subsidiaries.