Chinese technology giant Huawei announced its plans for the next generations of its Ascend chip line at the Huawei Connect 2025 event in Shanghai this week.
In his address to the conference, the deputy chair of the Huawei board, Eric Xu, said that 2025 has been a “memorable year,” and cited the debut of DeepSeek-R1 in January as a turning point for the company. He also said that China is likely to lag behind in semiconductor manufacturing process nodes “for a relatively long term.”
The company’s response to tariffs and trade embargoes is to develop its infrastructure design and technology, and it has decided to open-source large swathes of its software, including the openPangu foundation AI models and the Mind series SDKs.
The new Ascends
The company plans to produce three new series of the Ascend chip: the 950, 960, and 970.
The Ascend 950PR and 950DT will be built on the same die, and will offer additional support for low-precision data formats, including FP8, in which the 950 will deliver one PFLOP of performance, and MXFP8, rated at two PFLOPs. A PFLOP is one thousand trillion (10^15) floating point operations per second.
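To put those units in concrete terms, the short Python sketch below converts the ratings quoted above into raw operations per second. The constant and function names are illustrative only and do not come from any Huawei SDK.

```python
# Unit-conversion sketch. All names here are illustrative, not a Huawei API.
PFLOP = 10**15  # one thousand trillion floating point operations per second


def pflops_to_ops(pflops):
    """Convert a rating in PFLOPs into floating point operations per second."""
    return pflops * PFLOP


# Ratings as quoted in the article for the Ascend 950.
ascend_950_fp8 = pflops_to_ops(1)
ascend_950_mxfp8 = pflops_to_ops(2)

print(f"FP8:   {ascend_950_fp8:,} operations per second")
print(f"MXFP8: {ascend_950_mxfp8:,} operations per second")
```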
There will also be better vector processing and more granular memory access, down to 128-byte chunks from 512 bytes.
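Why finer granularity matters can be shown with a deliberately simplified model, sketched below in Python. It assumes each isolated, aligned access must move whole chunks, and the 64-byte request size is a hypothetical example. It is not a description of how the Ascend memory system actually works.

```python
import math

# Simplified model (an assumption for illustration): every isolated, aligned
# access moves a whole number of chunks of the given granularity.


def bytes_transferred(useful_bytes, granularity):
    """Bytes moved for one access, rounded up to whole chunks."""
    return math.ceil(useful_bytes / granularity) * granularity


def efficiency(useful_bytes, granularity):
    """Fraction of the transferred bytes that the requester actually wanted."""
    return useful_bytes / bytes_transferred(useful_bytes, granularity)


request = 64  # hypothetical small, scattered read, in bytes
for granularity in (512, 128):
    moved = bytes_transferred(request, granularity)
    print(f"{granularity}-byte chunks: {moved} bytes moved, "
          f"{efficiency(request, granularity):.1%} useful")
```

Under this model, small scattered reads waste far less bandwidth with 128-byte chunks, while large contiguous reads see little difference.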
The Ascend 950 chips will provide 2 TB/s of interconnect bandwidth, 2.5x more than the current Ascend 910C. The 950PR will be available in Q1 2026, and the Ascend 950DT launches in Q4 2026.
Available a year later, in Q4 2027, the Ascend 960 will have twice the computing power, memory access bandwidth, memory capacity, and number of interconnect ports of the 950. It will support Huawei’s proprietary HiF4 data format, which, the company claims, offers more precision than other FP4 formats.
The most powerful chip will be the Ascend 970, slated for launch in Q4 2028. Xu said, “We’re still working on some of its specs, but our general goal is to push all of its specs much better.” He said the Ascend 970 series is expected to provide an interconnect bandwidth of 4 TB/s, be capable of 8 PFLOPs of FP4, and include large memory capacity.
SuperPoDs of NPUs
Huawei’s strategy is to provide hyperscalers with clusters of raw compute in the form of SuperPoDs, which will start to appear in Q4 2026 in the form of the Atlas 950 SuperPoD, equipped with the new Ascend 950DT chips.
Competitor NVIDIA’s NVL144 system (a SuperPoD analogue) will be released in mid- to late 2026, and Huawei claims that its first SuperPoD will have 56.8 times more NPUs than the NVL144 has GPUs, and deliver almost seven times the processing power. Even with the scheduled arrival of the NVL576, which NVIDIA is set to release in 2027, Huawei expects the Atlas 950 SuperPoD to remain the better performer.
General computing chips
For general computing, Huawei plans to launch two models of its Kunpeng 950 processors in Q1 2026, with 96 cores and 192 threads in one, and 192 cores and 384 threads in the faster of the two. There will also be what Xu called “the world’s first general-purpose computing SuperPoD,” the Kunpeng 950-based TaiShan 950 SuperPoD, which will be available in the first quarter of 2026.
Open-source connectivity protocol
The NPU and general computing SuperPoDs will use UnifiedBus 2.0, the next iteration of the current UnifiedBus 1.0. That’s the interconnection technology used by the Atlas 900 A3 SuperPoD, which came into service in March this year, with over 300 installations to date.
UnifiedBus 2.0 is to be an open protocol, with the technical specifications released immediately to the developer community. UnifiedBus 2.0 will be used internally in the latest generations of SuperPoDs, and to connect clusters of SuperPoDs, forming SuperClusters.
The first cluster product is to be the Atlas 950 SuperCluster, providing 2.5 times more NPUs and 1.3 times more computing power than xAI’s Colossus, currently the world’s most powerful computing cluster.
In the last quarter of 2027, Huawei intends to release the Atlas 960 SuperCluster, which will incorporate over 1,000,000 NPUs and deliver 4 ZFLOPS in FP4 (with a ZFLOP representing 10^21 floating point operations per second). “SuperPoDs and SuperClusters powered by UnifiedBus are our solution to surging demand for computing, both today and tomorrow,” Xu said.
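As a rough sanity check on those headline numbers, the Python sketch below divides the quoted cluster throughput by the quoted NPU count. It assumes exactly 1,000,000 NPUs, whereas the article says "over" that number, so the result is best read as an upper bound on the average per-NPU figure. The variable names are illustrative.

```python
# Back-of-the-envelope check using only the figures quoted in the article.
PFLOP = 10**15  # floating point operations per second
ZFLOP = 10**21  # floating point operations per second

cluster_fp4 = 4 * ZFLOP  # quoted Atlas 960 SuperCluster FP4 throughput
npu_count = 1_000_000    # assumption: exactly one million (article says "over")

# Average FP4 throughput each NPU would need to contribute.
per_npu_ops = cluster_fp4 // npu_count
per_npu_pflops = per_npu_ops // PFLOP

print(f"Implied average per NPU: {per_npu_pflops} PFLOPs of FP4")
```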