AI isn’t all about speeds and feeds
A few years ago, I started to engage with AI and ML companies whose purpose-built architectures yield chips and IP that are 100, 200, and even 300 times faster than anything invented before. I was amazed! After all, I’d been used to a world where next-generation products bring incremental improvements – never 100x. I even started to gravitate toward the type of marketing I’ve consciously tried to avoid throughout my career: the ‘speeds-and-feeds’ marketing that’s ubiquitous in the semiconductor industry. But in this case, it seemed that anything so disruptive would sell itself.
As it turns out, it doesn’t. People will buy superior technology, of course, but superior technology alone isn’t what moves customers from established players like Nvidia to new up-and-comers with better technology.
Companies will embrace a ‘new thing’ that’s hundreds of times faster or lower power only if they have to. If adoption isn’t easy from an integration and software development perspective, or the risk is too high, most customers will go with the low-risk, proven option. In other words, I believe customers will embrace a ‘total solution,’ especially for use cases or form factors they won’t be able to achieve with today’s incumbents.
For example, companies like Axelera bring to market various form factors for quick customer onboarding. Brainchip, Neuton.AI and previously Pilot.AI (now Syntiant) are evangelizing IP and software to the more traditional microprocessor suppliers, enabling extremely efficient, small-footprint, intelligent wearable or even implantable devices, which traditional offerings don’t support. I’m also seeing companies, like Syntiant, that design vertical solutions combining hardware and software into a total solution.
Several weeks ago, AI company Cerebras announced their large ‘win,’ and I’m excited to see how they and their competitors will bring faster compute to the market. There’s also more and more traction coming out of Groq, with the recent news that the company is breaking performance records in Large Language Model inference. ChatGPT, new data models and more consumer services will force the world into deploying more compute, fast. These demanding new developments require more and better compute that’s more efficient, faster and cheaper and offers more functionality (on the edge).
This compute requirement is fast overwhelming the cloud. As we’ve seen with the recent announcements from Qualcomm and Meta, even generative AI needs a hybrid approach to scale as well as more efficient, ‘right-sized’ and specialized AI compute done on edge devices.
Whatever the trajectory, one thing is clear. Progress in AI is going to be enabled or blocked by the availability of AI-friendly compute that fits a much lower power envelope and is easy to acquire, integrate and deploy. Companies like Enfabrica have also looked at the problem. They believe the fabric is the computer and have built a solution that’s advanced, performant and efficient, connecting compute, memory and network.
Speeds and feeds will make a significant difference, but the onboarding of clients, the provider’s reputation and easy-to-use software and tools will be the determining factors. Turns out the world changed hard and fast, and it also didn’t: suppliers build new technology and products, but customers decide.