
Deep Neural Networks (DNNs) have shown significant advantages in many domains, such as pattern recognition, prediction, and control optimization. The demand for edge computing in the Internet-of-Things era has motivated many kinds of computing platforms to accelerate DNN operations. However, DNN hardware implementation is challenging due to the high computational complexity and the diverse dataflows of different DNN models. To mitigate this design challenge, a large body of research has focused on dedicated designs that accelerate specific DNN models or layers; such dedicated designs, however, limit design flexibility. In this presentation, we discuss Network-on-Chip (NoC)-based DNN platforms, which bring interconnect flexibility to the accelerator. An NoC-based design can reduce off-chip memory accesses through a flexible interconnect that facilitates data exchange between processing elements on the chip. We study and analyze different design parameters for implementing an NoC-based DNN accelerator. The presented accelerator is based on a mesh topology, neuron clustering, random mapping, and XY routing, and we discuss alternative design choices as well.
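As background for the routing scheme mentioned above, the following is a minimal sketch of dimension-ordered XY routing on a 2D mesh: a packet first travels along the X dimension until it reaches the destination column, then along the Y dimension. The coordinate convention and function name are illustrative, not from the presented design.

```python
def xy_route(src, dst):
    """Return the sequence of router coordinates visited by a packet
    traveling from src to dst under XY (dimension-ordered) routing
    on a 2D mesh. Coordinates are (x, y) tuples."""
    x, y = src
    dx, dy = dst
    path = [(x, y)]
    # Phase 1: resolve the X offset first.
    step = 1 if dx > x else -1
    while x != dx:
        x += step
        path.append((x, y))
    # Phase 2: then resolve the Y offset.
    step = 1 if dy > y else -1
    while y != dy:
        y += step
        path.append((x, y))
    return path

# Example: routing from router (0, 0) to router (2, 1) on the mesh.
print(xy_route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

Because every packet resolves its X offset before its Y offset, XY routing is deterministic and deadlock-free on a mesh, which is one reason it is a common baseline choice for NoC-based accelerators.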

Associate Professor at KTH Royal Institute of Technology