AI Chip Startups Approach the "Cut-Off Line"

Release time: 2026-05-09 | Publisher: CSC




A piece of data from Q1 2026 has thrown cold water on the AI chip sector.
Globally, 135 companies are engaged in AI processor development, of which 36 are listed giants possessing advantages in technology, capital, and ecosystem. Names like Nvidia, AMD, Google, Amazon AWS, Qualcomm, Tesla, Meta, Microsoft, Broadcom, and Marvell cover the entire industry chain layout from underlying computing power to terminal applications.
The remaining 99 are startups that uphold the innovative vitality of the track. Most of them target niche scenarios, specializing in dedicated AI accelerators. Players like Tenstorrent, Cerebras, SambaNova, Groq, and Esperanto once garnered significant attention with their differentiated technical routes.
However, the elimination round for the track has already begun. A forecast from JPR (Jon Peddie Research) points directly to a harsh reality: by 2030, the number of global professional AI chip developers will plummet to around 25.
From 99 to 25, this great industry filtering is forcing every startup to rethink its survival logic.
01
AI Chip Startups Face Four Major Constraints
The first constraint is that capital is now more focused on leading AI chip startups.
The AI large-model boom of 2023-2024 ignited global demand for computing power. Data show that total global AI financing reached 599.52 billion CNY in 2024, an increase of over 300 billion CNY from 2023, roughly double the prior year's total.
The logic of capital was "compute scarcity": as long as a chip could be produced, it was expected to fill the market gap, driving up valuations for AI chip startups. But entering 2026, that logic has reversed. Where the earlier problem was insufficient computing power, the current dilemma is the awkward transition from showcasing technology to real-world deployment.
This article summarizes twelve highly competitive AI chip startups.


Among them, Cerebras is renowned for its wafer-scale chips, a technology that fabricates an entire wafer into a single chip to achieve unprecedented computational density and performance. Cerebras Systems claims its chips offer up to a 20x performance improvement over Nvidia GPUs on certain AI workloads; the CS-3 in particular holds significant advantages in AI inference.
Matrix focuses on developing AI inference chips based on digital Computing-in-Memory (CIM) technology. Last year, they developed a novel implementation scheme for 3D DRAM technology, promising to boost inference workload performance by "several orders of magnitude."
Groq introduced the LPU (Language Processing Unit), an inference chip focused on the inference phase of AI models—that is, running trained large models efficiently in real-world applications. Groq claims its LPU outperforms general-purpose GPUs in speed, low latency, and cost control, offering more cost-effective computing power for large-scale AI deployments. In December last year, Nvidia and Groq reached a non-exclusive licensing agreement to integrate Groq's AI inference technology into future Nvidia products.
The aforementioned companies are almost all hailed as formidable challengers to Nvidia. However, constrained by the dominance of Nvidia’s GPU and CUDA ecosystem, they face tremendous challenges in securing commercial customers. This predicament has directly impacted the financing landscape, making once-fanatical capital increasingly rational.
For most small and medium-sized AI chip startups, the characteristics of heavy asset investment and low profitability make it even harder to secure new funding sources.
The second constraint is cloud vendors developing their own chips, turning from customers into rivals.
The four major cloud giants—Google, Amazon, Microsoft, and Meta—were once Nvidia's largest customers and the "golden sponsors" most desired by startups. But now, they are collectively shifting towards self-developed ASIC chips. Not only are they reducing purchases of external chips, but they are also becoming the strongest competitors.
In October last year, Amazon announced that one of the world's largest AI computing clusters, "Project Rainier," was operational. This supercluster is equipped with nearly 500,000 Trainium2 chips distributed across multiple US data centers. Amazon AWS partner Anthropic has already begun running workloads on the cluster; this infrastructure provides over five times the computing power Anthropic previously used to train its AI models. By the end of 2025, Anthropic was expected to be running Claude model training and inference workloads on over 1 million Trainium2 chips.
Subsequently, in December 2025, Amazon released the Trainium 3. Featuring a new generation of Neuron Fabric interconnect technology, a single Trn3 UltraServer can integrate 144 chips, achieving a total compute power of 362 FP8 PFLOPs. Through the EC2 UltraClusters 3.0 architecture, cluster scale increased tenfold compared to the previous generation, expandable up to 1 million chips, providing support for Anthropic's "Project Rainier." Customer tests by Karakuri and Metagenomi show the chip reduces AI model training and inference costs by up to 50%.
This means Amazon can supply itself with at least a million AI chips.
In January this year, Microsoft announced the Maia 200. Microsoft claims the product's 4-bit floating point (FP4) performance is three times higher than Trainium 3, and its 8-bit floating point (FP8) performance exceeds Google's seventh-generation TPU.
Google announced it is raising its 2026 TPU shipment target by 50% to 6 million units. In early April, Anthropic signed a new agreement with Google and Broadcom. According to the agreement, starting in 2027, Anthropic will receive approximately 3.5 gigawatts of AI computing power support based on Google TPU processors provided by Broadcom. This collaboration continues the strategic layout between the parties: in October 2025, Anthropic announced an expansion of cooperation with Google, planning to deploy up to 1 million TPUs (exceeding 1 GW of compute power) to support Claude model demands. As Google's self-developed AI chip, TPUs are actively adopted not only for internal use but also by large model manufacturers (like Meta), highlighting their competitiveness in high-performance inference and training scenarios.
This cooperation also represents a structural evolution in the AI computing market: outside the "first supply chain" dominated by Nvidia GPUs, a "second supply chain" combining Google TPUs and Broadcom manufacturing capabilities is taking shape.
The third constraint is comprehensive pressure from the established AI chip leaders.
Nvidia, AMD, and Intel, drawing on deep technical accumulation and ecosystem barriers, mount an asymmetric assault that startups cannot match.
Nvidia holds a 90% market share in the AI chip market and has built an unshakeable moat with its CUDA ecosystem. Beyond that, as mentioned, Nvidia reached a non-exclusive licensing agreement with Groq to integrate Groq's AI inference technology into future products. At the GTC conference in March this year, Nvidia also released the Blackwell Ultra GPU, claiming a 100x increase in inference computing power. It can be said that whether in ecosystem, raw compute power, or differentiated routes, Nvidia has comprehensive coverage.
AMD broke through with the cost-performance ratio of its MI300 series, capturing more market share, while Intel partnered with SambaNova to focus on the R&D of AI inference and training chips and related software.
The fourth constraint is that ecosystem barriers are a chasm difficult for startups to cross.
AI chips are not isolated hardware but a complete ecosystem comprising "chip + toolchain + framework + model optimization." Nvidia's CUDA boasts millions of developers and supports almost all AI frameworks and models; AMD continuously catches up with ROCm; cloud vendors' self-developed chips are deeply bound to their own cloud services. However, 90% of startups only have the chip without a complete ecosystem. Customers need to invest significant manpower to refactor code and optimize models to adapt to new chips, resulting in extremely high migration costs that deter most clients.
02
The Future of AI Chips: Which Path Remains?
Since 2025, AI chip demand has undergone a critical shift. Before 2025, the industry's focus leaned toward the training side, centered on massive computing power supporting explosive iteration of large models; today, the industry is shifting toward the inference side. As large-model R&D stabilizes, commercialization becomes the core task. Efficiency, low cost, and low latency have become the core demands for computing power, reshaping the survival logic of overseas AI chip startups.
Current mainstream AI chip routes mainly include GPUs and ASICs.
Representative enterprises in the GPU route include Nvidia and AMD. GPUs became the mainstream choice for AI training and inference due to strong parallel computing capabilities and high versatility. Especially General-Purpose GPUs (GPGPUs) can adapt to most AI models and scenarios without customization, which is the core reason for their rapid proliferation. However, it is precisely this "versatility" that makes the GPU route the hardest for startups to break into. Corresponding to the "third constraint" above, the market space left for startups is inherently limited.
Currently, startups deploying the GPU route are divided into two main directions: one is "cost-performance" for high-end general-purpose GPUs, and the other is specialized GPUs for niche scenarios to achieve precise positioning. For the former, domestic chip companies like Biren Technology and MetaX have launched products with performance close to Nvidia H100/H20 but with greater cost advantages, achieving scale shipments leveraging local supply chain benefits. For the latter, some companies create "scene-specific" GPU products to avoid head-on competition with giants. For instance, startups focusing on autonomous driving optimize GPU power consumption control and real-time performance for low-power, high-reliability vehicle environments to create automotive-specific AI GPUs; startups focusing on edge computing streamline GPU architectures to reduce power and size for edge device deployment needs.
Looking at the ASIC route, Marvell forecasts that the global AI ASIC market will jump from $6.6 billion in 2023 to $55.4 billion in 2028, a CAGR of 53%.
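The quoted growth rate checks out against the two endpoint figures. A minimal sanity check, using only the numbers stated above:

```python
# Verify the quoted CAGR for Marvell's AI ASIC market forecast:
# $6.6B in 2023 growing to $55.4B in 2028 (five compounding years).

start, end = 6.6, 55.4   # market size, billions of USD
years = 2028 - 2023      # number of compounding periods

cagr = (end / start) ** (1 / years) - 1
print(f"{cagr:.1%}")     # prints 53.0%, matching the quoted 53% CAGR
```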
This route is the core path for AI chip startup breakthroughs and can be divided into three directions:
  1. ASICs offering extreme performance: Companies like Cerebras and Groq mentioned above abandon the general-purpose route, custom-developing ASIC chips for extreme scenarios like ultra-large-scale AI training and low-latency inference.

  2. Scene-specific ASIC products: Similar to the GPU scenario, this involves providing ASIC products for specific niches. For example, Matrix targets the memory bottleneck of AI inference by developing novel 3D DRAM technology.

  3. Custom chips via binding with head clients: For instance, SambaNova chose to cooperate with Intel to customize dedicated AI ASIC accelerator cards for the Intel x86 ecosystem, creating a "CPU+ASIC" synergy solution.

Among AI chip startups, companies choosing the ASIC route account for over 60%, but the development logic of different segments varies drastically. For the three types of companies mentioned above:
  • The first type of AI ASIC company can solve extreme scenario requirements uncovered by giant chips. However, if such companies fail to become absolute leaders in their niche, they may face acquisition or investment by giants to serve as technological supplements.

  • The core logic of the second type is "scene deepening + technical optimization." Without pursuing extreme general-purpose performance, they only need to precisely match industry needs to gain living space in niche scenarios. Meanwhile, due to scene focus, ecosystem construction costs are low, client migration costs are relatively lower, making scaled profitability easier.

  • The core advantage of the third type lies in "ecosystem binding." They do not need to face market competition alone but survive and develop leveraging giant resources. Through customized development, they avoid head-on competition with giants and become a "complement" to the giant ecosystem—this is one of the safest ASIC breakthrough routes.

In the future, the AI chip market may develop toward "heterogeneous fusion," where GPUs and ASICs are not mutually exclusive but are paired sensibly to maximize computing efficiency. Returning to the "99 to 25" elimination round: whether a startup chooses the GPU or the ASIC route, the key is precise positioning based on its own resources and capabilities, steering clear of the giants' core strengths. It can either find gaps in niche scenarios or cost-performance segments within the GPU route, or build barriers through scene customization or ecosystem binding within the ASIC route.
So, turning our gaze to domestic Chinese AI chip companies, how did these companies fare over the past year?
03
Domestic AI Chip Companies: The Golden Window Period
The prospects for domestic AI chips seem rather optimistic. With the full penetration of AI applications and surging demand for high-performance computing chips, domestic AI chips have ushered in a golden window period for development.
  • Hygon Information: 2025 revenue was approximately 14.377 billion CNY, a YoY increase of 56.92%; net profit attributable to parent company was 2.545 billion CNY, a YoY increase of 31.79%.

  • Cambricon: 2025 revenue was 6.497 billion CNY, a YoY increase of 453.21%; net profit attributable to the parent company was 2.059 billion CNY, up 555.24% YoY, marking the company's turn to profitability. Cloud product lines are Cambricon's absolute revenue pillar, generating approximately 6.476 billion CNY in 2025, a YoY increase of 455.34% and over 99% of total revenue; edge product lines generated about 3.3939 million CNY; IP licensing and software services generated about 2.2887 million CNY.

  • MetaX: 2025 revenue reached 1.644 billion CNY, a significant YoY increase of 121.26%; net profit attributable to parent company was -789 million CNY, a 43.97% reduction in loss compared to -1.409 billion CNY in the same period last year. This marks the first time since the company's establishment that the loss amplitude has narrowed. The financial report states that 2025 revenue growth was mainly because products and services gained wide recognition and continuous procurement from downstream clients, leading to significant growth in GPU shipments. Specifically, sales of MetaX's training-inference integrated GPU cards (mainly XiYun C-series) reached 33,649 units, a YoY increase of 147.31%, while sales of intelligent computing inference GPU cards reached 4,946 units, a YoY surge of 866.02%.

  • Biren Technology: 2025 revenue was 1.035 billion CNY, a 207.2% increase from 337 million CNY in the same period last year; gross profit was 557 million CNY, up 211% from 179 million CNY last year; gross margin was 53.8%.

  • Moore Thread: 2025 revenue was 1.506 billion CNY, a 243.37% increase compared to the same period in 2024. Net profit attributable to parent company owners was -1.024 billion CNY, with the loss narrowing by 36.70% compared to the same period last year.

  • Iluvatar CoreX: Total revenue reached 1.034 billion CNY, a significant YoY increase of 91.6%; gross profit was 558 million CNY, soaring 110.5% YoY. In 2025, revenue from general-purpose GPU products reached 923 million CNY, a YoY increase of 149.6%, accounting for 89.3% of total revenue that year. Among them, the Tianjia series designed specifically for AI model training generated revenue of 584 million CNY (up 116.7% YoY), while the Zhikai series designed for cloud and edge inference applications generated revenue of 339 million CNY (up 238.2% YoY), with its revenue contribution rising from 18.6% in 2024 to 32.8%.

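The growth rates and revenue shares in the bullets above are internally consistent. A quick cross-check of two of them, using only figures reported in the article:

```python
# Cross-check two reported 2025 figures against their stated growth
# rates and shares (revenue values in billions of CNY, from the article).

# Biren Technology: 1.035B vs 0.337B a year earlier, quoted as +207.2%
biren_growth = (1.035 - 0.337) / 0.337
print(f"Biren YoY growth: {biren_growth:.1%}")   # prints 207.1%

# Iluvatar CoreX: Zhikai series 0.339B of 1.034B total, quoted as 32.8%
zhikai_share = 0.339 / 1.034
print(f"Zhikai revenue share: {zhikai_share:.1%}")  # prints 32.8%
```

Both results match the quoted figures to within rounding.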
With rising domestic computing power demand and deepening R&D by local companies, domestic chip manufacturers are steadily eroding Nvidia's market share in China. According to the latest IDC report, the combined market share of domestic GPU and AI chip makers has risen to 41% for the first time, while Nvidia's share of the Chinese market has fallen from a near-monopoly 95% to 55%.
IDC data shows that total shipments of AI acceleration cards in China in 2025 were about 4 million units. Nvidia still ranks first with about 2.2 million units (55% share), but compared to the previous absolute dominance of nearly 95%, this change is considered a cliff-like drop.
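The IDC shipment and share figures line up. A small arithmetic check using only the numbers quoted above:

```python
# Check the IDC figures: ~4M total AI accelerator cards shipped in
# China in 2025, Nvidia at ~2.2M units and a quoted 55% share.

total_units = 4.0          # millions of cards
nvidia_units = 2.2         # millions of cards
nvidia_share = nvidia_units / total_units
print(f"Nvidia share: {nvidia_share:.0%}")  # prints 55%

# The quoted 41% domestic share implies this many domestic units:
domestic_units = total_units * 0.41
print(f"Implied domestic shipments: ~{domestic_units:.2f}M")  # ~1.64M
```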
IDC's analysis points out that the major shift in China's local AI chip market stems primarily from US export controls cutting off China's access to Nvidia's high-end chips, coupled with urgent domestic demand for supply-chain localization; together these factors drove the rapid ramp-up of local chips. In particular, the concentrated rollout of China's AI new-infrastructure and intelligent-computing-center projects in 2025, along with a procurement tilt toward localization, became the core driver of domestic chip growth.