After five decades of development, the hot topics in Silicon Valley have come full circle, from Big Data, apps, and SaaS back to the original reason for the valley's rise: chips. Only this time the main character is the GPU, not the CPU.
From AI startups like OpenAI, to the top three public clouds run by Microsoft, Google, and Amazon, to new cloud challengers like CoreWeave and Lambda, to large technology companies with AI ambitions such as Meta and Tesla, everyone is scrambling to secure more GPU computing power.
GPUs are not only used to train AI models; they have also become a recognized "new asset." In 2023, cloud service startup CoreWeave secured up to US$2.3 billion in debt financing using Nvidia's popular H100 GPUs as collateral. The frenzy has also lifted Nvidia's market value: the dominant leader in the GPU market has more than tripled over the past year, becoming the third-largest company in the US after Microsoft and Apple.
However, the GPU supply shortage has become the biggest bottleneck in AI development today. At the end of 2023, for example, OpenAI was forced to suspend new user sign-ups for several weeks because it lacked sufficient GPU computing power. Entrepreneurs close to me also tell me they now have to wait at least one to two quarters to obtain GPU capacity through the public clouds. No wonder OpenAI CEO Sam Altman, despite Microsoft's computing support, was recently reported to be raising approximately US$7 trillion to build his own AI chip factories.
The GPU shortage has not only set off a stockpiling race among technology giants but has also raised the bar for new AI startups. First, the distribution of AI computing power is extremely uneven: it is estimated that less than 6% of Nvidia's latest GPUs have flowed to startups, making it even harder for them to secure computing power. Second, before they can develop products and find users, AI entrepreneurs must now first raise hundreds of millions of dollars and secure the computing power needed to develop and optimize their models; only then can they start building application services. This entrepreneurial logic is completely different from the SaaS and app playbooks of the past.
However, this AI computing power shortage will be hard to resolve in the short term, and it may even worsen, because we are shifting from general-purpose computing to specialized computing. In short, chips will become more diverse, with designs tailored to the devices and application types they serve. For example, the computing power required for AI to generate video differs from that needed for text; Google has launched TPUs (Tensor Processing Units) designed specifically for neural networks, and AI startup Groq has launched LPUs (Language Processing Units) built for large language models.
As chip applications become more specialized, it will be difficult to rely on a few major cloud vendors to supply such a diverse range of chip computing power, as we did in the past. The more dispersed computing power becomes, the more challenging "finding computing power" will be.
Beyond benefiting chip manufacturers like Nvidia and TSMC, I believe this trend creates another opportunity in "distribution": whoever can reallocate existing idle computing power more efficiently stands to capture the business. Idle GPU computing power may be hard to imagine amid such a shortage, but the data show that US data center utilization has hovered at only 12-18% in recent years, indicating considerable room for optimization in how data center resources are allocated.
Inference.ai, a cloud GPU computing startup in Cherubic Ventures' portfolio, is an example of seizing this "distribution" opportunity. Its core product works like an Airbnb for GPUs: it matches idle GPUs in data centers around the world with enterprises that need AI computing power but don't want to build their own GPU servers. The team has also developed ChatGPU, a chatbot that recommends suitable chip types and quantities based on a customer's development needs. In the past, purchasing AI computing power was like blindly opening a black box: it was difficult to estimate how much computing power was actually inside, or how long it would take to train an AI model to a given accuracy. Because Inference.ai has run extensive tests with different AI models in advance, it can identify how much computing power users need, find where that idle capacity sits, and deliver it to them.
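To make the "distribution" idea concrete, here is a minimal sketch of how a marketplace-style matcher might allocate a compute request against idle GPU inventory, assuming a simple greedy, price-sorted allocation. Everything in it (the GpuOffer and ComputeRequest structures, the match_request function, the sample data centers and prices) is a hypothetical illustration, not Inference.ai's actual system.

```python
from dataclasses import dataclass

# Hypothetical data structures for illustration only.

@dataclass
class GpuOffer:
    data_center: str        # where the idle GPUs sit
    gpu_model: str          # e.g. "H100", "A100"
    available: int          # number of idle GPUs on offer
    price_per_hour: float   # USD per GPU-hour

@dataclass
class ComputeRequest:
    gpu_model: str          # chip type the workload needs
    gpus_needed: int        # quantity, e.g. estimated from prior benchmark runs

def match_request(request: ComputeRequest,
                  offers: list[GpuOffer]) -> list[tuple[GpuOffer, int]]:
    """Greedily fill the request from the cheapest compatible idle capacity."""
    compatible = sorted(
        (o for o in offers if o.gpu_model == request.gpu_model and o.available > 0),
        key=lambda o: o.price_per_hour,
    )
    allocation: list[tuple[GpuOffer, int]] = []
    remaining = request.gpus_needed
    for offer in compatible:
        if remaining == 0:
            break
        take = min(offer.available, remaining)  # take as much as this offer can give
        allocation.append((offer, take))
        remaining -= take
    if remaining > 0:
        raise RuntimeError(
            f"only {request.gpus_needed - remaining} of {request.gpus_needed} GPUs available"
        )
    return allocation

if __name__ == "__main__":
    offers = [
        GpuOffer("us-east-dc1", "H100", 8, 4.20),
        GpuOffer("eu-west-dc3", "H100", 16, 3.80),
        GpuOffer("us-west-dc2", "A100", 32, 2.10),
    ]
    request = ComputeRequest(gpu_model="H100", gpus_needed=12)
    for offer, count in match_request(request, offers):
        print(f"{count} x {offer.gpu_model} from {offer.data_center} "
              f"at ${offer.price_per_hour}/GPU-hr")
```

A real system would of course weigh far more than price (network locality, interconnect, reliability, contract terms), but the core value is the same: turning scattered idle capacity into a single searchable pool.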
As mentioned earlier, the GPU shortage may look like a development bottleneck to many, but to outstanding entrepreneurs it has become a source of great business opportunity. With this mindset, we are far better positioned to find the best vantage point as a new era begins and the market reshuffles and reorganizes!