Aiming to maintain and strengthen its advantage amid intensifying competition in AI development, NEC has begun building a supercomputer for AI research. The system, which will exceed 580 PFLOPS in processing power, is scheduled to start operating next March. FLOPS is a unit of computer processing performance that indicates the number of floating-point operations performed per second. The prefix P (peta) denotes 1,000 trillion (10^15), so 580 PFLOPS means the new system can perform 580 x 1,000 trillion, or 5.8 x 10^17 (580 quadrillion), floating-point operations per second. This performance will make it the largest-scale supercomputer used for AI research by a private company in Japan.
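As a quick check of the unit arithmetic above, the conversion can be sketched in a few lines of Python. The names `PETA` and `pflops_to_flops` are illustrative and do not come from NEC.

```python
# Illustrative sketch: convert a PFLOPS figure into raw operations per second.
PETA = 10**15  # 1 peta = 1,000 trillion


def pflops_to_flops(pflops: int) -> int:
    """Return floating-point operations per second for a PFLOPS value."""
    return pflops * PETA


total_flops = pflops_to_flops(580)
print(f"{total_flops:.1e}")  # prints 5.8e+17
```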
Part of the supercomputer (100 PFLOPS) is already in use by several hundred AI researchers at NEC. With the addition of the 480 PFLOPS system under construction, the company will establish the most advanced AI-focused R&D environment in Japan. NEC plans to use the system for the rapid development of more advanced AI. In the future, the company also aims to establish a center of excellence in AI research that produces advanced social value through co-creation with customers and partners.
Deep learning, a core AI technology, is rapidly evolving, and its areas of application are expanding. The amount of computation needed for the development of deep learning is also continuing to grow. Therefore, large-scale computational resources that can rapidly produce a wide variety of advanced AIs are needed to promote digital transformation (DX) throughout society.
NEC will invest billions of yen in the development of this supercomputer. The system consists of 116 state-of-the-art GPU servers (from Super Micro Computer, Inc.), each featuring eight high-end NVIDIA A100 80GB Tensor Core GPUs. It also has a storage appliance with a 16 PB EXAScaler high-performance parallel file system (manufactured by DataDirect Networks).
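The hardware figures quoted in this article can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text: 116 servers, eight GPUs per server, and 580 PFLOPS in total, split into 100 PFLOPS already running and 480 PFLOPS under construction. The per-GPU value it derives is an implied average, not a figure published by NEC, and the variable names are illustrative.

```python
# Figures taken from the article.
SERVERS = 116
GPUS_PER_SERVER = 8
TOTAL_PFLOPS = 580
IN_USE_PFLOPS = 100
UNDER_CONSTRUCTION_PFLOPS = 480

# Total number of GPUs in the system.
total_gpus = SERVERS * GPUS_PER_SERVER

# The two build phases should add up to the quoted total.
phases_match_total = IN_USE_PFLOPS + UNDER_CONSTRUCTION_PFLOPS == TOTAL_PFLOPS

# Implied average theoretical peak per GPU, in TFLOPS (1 PFLOPS = 1,000 TFLOPS).
tflops_per_gpu = TOTAL_PFLOPS * 1000 / total_gpus

print(total_gpus)          # 928
print(phases_match_total)  # True
print(tflops_per_gpu)      # 625.0
```

Because the article gives the total as "over 580 PFLOPS", the per-GPU result is best read as an approximate lower bound on the theoretical peak assumed for each GPU.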
Its theoretical processing performance exceeds 580 PFLOPS, making it capable of training on tens of millions of images in a few minutes. For networking, it uses the NVIDIA Spectrum SN3700 high-speed Ethernet switch. To achieve high-speed distributed learning, all servers are connected via 200GbE and communicate at high speed and low latency using RoCE (RDMA over Converged Ethernet) v2. NEC uses a proprietary construction method centered on Kubernetes, an open-source container orchestration platform, to tightly integrate this advanced hardware and software. The result is a system that is both high-performance and user-friendly.
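To give a rough sense of what a 200GbE link means for distributed learning, the sketch below converts the line rate to bytes per second and estimates the best-case time to move a payload across a single link. The 1 GB payload and the function name are hypothetical, and the estimate ignores protocol overhead, latency, and congestion, so real transfers will be slower.

```python
# Line rate from the article: 200 gigabits per second per link.
LINK_GBITS_PER_S = 200


def ideal_transfer_seconds(payload_bytes: int,
                           link_gbits_per_s: int = LINK_GBITS_PER_S) -> float:
    """Best-case seconds to send payload_bytes over one link at line rate."""
    bytes_per_second = link_gbits_per_s * 10**9 / 8  # 8 bits per byte
    return payload_bytes / bytes_per_second


# Hypothetical payload: 1 GB of data exchanged between two servers.
one_gb = 10**9
print(ideal_transfer_seconds(one_gb))  # 0.04
```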
NEC aims to develop advanced AI that can deal with social issues in this age of volatility, uncertainty, complexity, and ambiguity (VUCA) in a real-time and dynamic manner, to achieve DX for society, improve intellectual and physical creativity and productivity in human activities, and create a sustainable global environment. This new supercomputer will bring together customers, partners, and the company's AI researchers to create a center of excellence for AI research that will also help produce advanced social value.
This article has been translated by JST with permission from The Science News Ltd. (https://sci-news.co.jp/). Unauthorized reproduction of the article and photographs is prohibited.