Huawei has introduced the CloudMatrix 384 AI chip cluster, a platform for large-scale AI model training. It uses a high-density network of Ascend 910C processors connected via optical interconnects, and the company says it offers improvements in energy efficiency and training speed. Huawei claims the CloudMatrix 384 design can outperform GPU-based clusters, although taken individually, Ascend chips don’t match the performance of top-tier GPUs from the Chinese giant’s Western competitors.
Huawei’s AI hardware-software stack positions the company as a challenger to NVIDIA’s market dominance, especially in domestic and allied markets. Despite, or perhaps because of, the sanctions that limit access to American technologies, Huawei is expanding its ecosystem with tools that circumvent dependence on foreign hardware and software, producing its own versions of AI workflows and finished products that compete with the established AI players.
To take advantage of Huawei’s AI infrastructure, data engineers need to modify their workflows, using tools specifically designed for use with Ascend processors. Chief among them is MindSpore, Huawei’s deep learning framework. While any transition will mean the introduction of new tooling, underlying workflows – data ingestion, transformation, and model iteration – remain largely transferable.
Instead of leveraging CUDA through frameworks like PyTorch and TensorFlow, Huawei’s Ascend processors require MindSpore, which is tightly integrated with Ascend hardware and supports both dynamic and static graph execution.
Models built in PyTorch or TensorFlow can be converted or retrained using MindSpore, for which Huawei provides MindConverter. This helps convert model definitions, although engineers should recognise that it can require manual adjustment and fine-tuning. Feature parity is not, unfortunately, one-to-one.
MindSpore differs in syntax, operator behaviour, and training pipeline architecture. Default settings for padding in convolutional and pooling layers and for weight initialisation, for example, can behave quite differently. As ever, it’s attention to these nuances between the Huawei and CUDA toolchains that determines whether training runs are reproducible.
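One concrete example of such a default: MindSpore’s `nn.Conv2d` defaults to `pad_mode="same"`, while PyTorch’s `nn.Conv2d` defaults to zero padding (effectively “valid”), so a naively ported model can produce different feature-map sizes. The arithmetic can be checked in plain Python:

```python
import math

def conv2d_out_size(size: int, kernel: int, stride: int = 1,
                    pad_mode: str = "valid") -> int:
    """Spatial output size of a 2-D convolution along one axis.
    'same' pads so that output = ceil(size / stride);
    'valid' adds no padding at all."""
    if pad_mode == "same":
        return math.ceil(size / stride)
    if pad_mode == "valid":
        return math.floor((size - kernel) / stride) + 1
    raise ValueError(f"unknown pad_mode: {pad_mode}")

# A 32x32 input through a 3x3 kernel, stride 1:
print(conv2d_out_size(32, 3, pad_mode="same"))   # 32 -- MindSpore's default
print(conv2d_out_size(32, 3, pad_mode="valid"))  # 30 -- PyTorch's default
```

A two-pixel discrepancy per layer compounds across a deep network, which is why converted models should have their layer output shapes compared against the originals.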
MindSpore uses MindIR (MindSpore Intermediate Representation) as its model export format, a static graph representation used for cross-platform deployment that’s optimised for Ascend NPUs.
Models trained in MindSpore can be exported via the mindspore.export function, which serialises the trained network, producing output in MindIR format. This makes the model ready for inference.
According to MindSpore’s official documentation, deployment involves loading the MindIR model and invoking inference using APIs which manage model de-serialisation, memory allocation, compute execution, and so on.
It’s worth noting that MindSpore separates training and inference logic, unlike PyTorch or TensorFlow, which blur the two. As a result, all preprocessing steps at inference time should match those used during training. Tools like GraphKernel, AOE (Auto Optimizing Engine), MindSpore Lite, and the Ascend Model Zoo offer additional optimisation for inference deployment.
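Because the inference side is a separate program loading a MindIR file, the simplest safeguard is to factor preprocessing into one shared function that both pipelines import. A minimal sketch; the normalisation constants here are hypothetical stand-ins for statistics computed from the training set:

```python
# Hypothetical per-channel statistics computed from the training data;
# in practice these would be versioned alongside the exported MindIR file.
MEAN, STD = 0.1307, 0.3081

def preprocess(pixels: list[float]) -> list[float]:
    """Single source of truth for input preprocessing: scale raw
    0-255 pixel values to [0, 1], then normalise with training stats."""
    return [((p / 255.0) - MEAN) / STD for p in pixels]

# Training and inference call the same function, so a drift in
# constants or step ordering cannot silently skew predictions.
train_batch = preprocess([0.0, 255.0])
serve_batch = preprocess([0.0, 255.0])
assert train_batch == serve_batch
```

Keeping the constants next to the function, and versioning both with the model artefact, makes a training/serving mismatch far harder to introduce.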
Huawei’s CANN (Compute Architecture for Neural Networks) is a foundational component of the Ascend AI software stack, and can be considered an analogue to NVIDIA’s CUDA: it spans the layers between frameworks like MindSpore and the Ascend silicon, from operator libraries and graph compilation down to the runtime and drivers.
Tools provided within CANN, such as Profiler, MindStudio, and Operator Tuner, can fine-tune model performance at runtime and provide metrics on memory use, kernel-level bottlenecks, and execution flow.
MindSpore offers two execution modes: PyNative mode, which executes operations eagerly and is the more convenient for debugging, and Graph mode, which compiles the network into a static graph for performance.
To support both modes, code should avoid complex Python-native control flow methods and use MindSpore’s built-in control operators wherever possible.
ModelArts is Huawei’s cloud-native AI development and deployment platform for Ascend hardware and MindSpore. It provides a full AI pipeline, similar to AWS’s SageMaker and Google’s Vertex AI.
ModelArts supports the full workflow, from data preparation and labelling through distributed training to deploying trained models as inference services.
Transitioning to Huawei’s MindSpore and CANN ecosystem will require some re-skilling for those well-entrenched in CUDA-based tooling.
While Huawei’s ecosystem is maturing, it lacks the open-source community and third-party support enjoyed by PyTorch and TensorFlow, so developers can expect gaps in documentation and some limits on library compatibility.
Hardware access is another consideration. Ascend processors are powerful and highly efficient for AI workloads, but their availability is limited outside the regions where Huawei has invested, or is investing. Teams may need to rely on remote access via cloud platforms like ModelArts to run large-scale experiments.
Huawei does, however, provide a comprehensive migration guide, plus conversion and tuning tools to assist the transition. For teams targeting regions where Huawei infrastructure is readily available, the performance and efficiency gains can be considerable.
(Image source: “Huawei P9” by 405 Mi16 is licensed under CC BY-NC-ND 2.0.)