Orange Silicon Valley recently announced its work with South Korean high-performance computing company CocoLink on scaling deep learning algorithms across a cluster of 20 GPUs. With Nvidia K40 GPUs, the system delivered 100 teraflops; the team has since upgraded to newer Nvidia GTX 1080 boards and hopes to reach 200 teraflops. The system is a functional prototype of one of the world’s highest-density deep learning supercomputers in a box.
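As a rough sanity check on these figures, the sketch below works out what each GPU would need to contribute on average. It assumes the quoted totals are aggregate peak numbers split evenly across all 20 boards, which the announcement does not state; the function name `per_gpu_teraflops` is purely illustrative.

```python
# Back-of-the-envelope check of the cluster figures quoted in the article.
# Assumption (not from the announcement): the totals are aggregate peak
# throughput, divided evenly across every GPU in the box.

GPU_COUNT = 20  # cluster size stated in the article


def per_gpu_teraflops(total_teraflops: float, gpu_count: int = GPU_COUNT) -> float:
    """Average throughput each GPU must contribute to reach the cluster total."""
    return total_teraflops / gpu_count


if __name__ == "__main__":
    k40_share = per_gpu_teraflops(100.0)      # original K40 configuration
    gtx1080_share = per_gpu_teraflops(200.0)  # target after the GTX 1080 upgrade
    print(f"K40 cluster:      {k40_share:.1f} teraflops per GPU on average")
    print(f"GTX 1080 cluster: {gtx1080_share:.1f} teraflops per GPU on average")
```

Under that assumption, doubling the cluster total means each board has to deliver roughly twice the average throughput of the boards it replaces.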
From the announcement:
A team of artificial intelligence (AI) researchers at Orange France was able to scale Caffe to 8 GPUs using the beta release of CUDA 8.0 with cuDNN 4 and cuDNN 5. The eventual objective is to scale Caffe to take advantage of all 20 Pascal GPUs. The full 20-GPU server could theoretically deliver more than 200 teraflops, making this one of the world’s highest-density deep learning systems.
The work is a notable partnership between researchers in the USA, France and South Korea to accelerate artificial intelligence by pushing the limits of thermodynamics, geometry and price vs. performance efficiency.
Jerome Ladouare, Orange VP of Infrastructure, Technologies and Engineering, said, “It is now possible to run Deep Learning over massive volumes of video data at high speed and also perform contextual analysis over several hundred streams in real time. With our partners, we have prototyped an advanced video analytics capability that could efficiently exploit a supercomputer in a box at the edge of our network; we can thus envision a convergence between AI and Exascale.”