Efficiently Scaling AI Data Infrastructure with Ocient

Utilizing Tech Podcast hosted by Stephen Foskett and Jeniece Wnorowski

When you are dealing with complex, always-on data workloads, efficiency is a key element of managing and analyzing that data in real time. Ocient provides software architecture solutions that can scale the amount of data you're analyzing in a system. From telcos moving from 4G to 5G, to vehicles producing petabytes of data, to fintech and compliance, data must be analyzed in order to be useful. Listen to our discussion with Chris Gladwin, CEO of Ocient, to learn how they are using SSDs to tackle this never-ending stream of data efficiently, both in infrastructure and in energy use.

 

Audio Transcript

This transcript has been edited for clarity and conciseness

Stephen Foskett: As the volume of data supporting AI applications grows ever larger, it's critical to deliver scalable performance without overlooking power efficiency. This episode of Utilizing Tech, sponsored by Solidigm, brings Chris Gladwin, CEO and co-founder of Ocient, to talk about scalable and efficient data platforms for AI. Welcome to Utilizing Tech, the podcast about emerging technology from Tech Field Day, part of The Futurum Group. This season is presented by Solidigm and focuses on the question of AI data infrastructure. I'm your host, Stephen Foskett, organizer of the Tech Field Day event series. Joining me today from Solidigm as my co-host is Jeniece. Welcome to the show.

Jeniece Wnorowski: Hi, Stephen. Thanks for having us back.

Stephen: Jeniece, you and I have spent a lot of time talking about a lot of storage topics, but one of the things that we come back to a lot is energy efficiency and the question of power consumption when it comes to AI data infrastructure.

Jeniece: Yeah, absolutely. At the beginning of the year, the conversation was really just around AI: what's the GPU, how do you work with the GPU? It was all about GPU instances. But now we're turning that corner, and we're starting to see many organizations looking not just at performance, but also really looking at power, efficiency, and scalability. So we are super excited about the opportunity today to talk about that a little bit more. And I know you want to introduce our guest, so I'll let you do that. But yeah, we're excited to dive into that specific topic.

Stephen: Yeah, absolutely. And I think this is one of those things that comes up again and again: when people talk about AI, the naysayers are constantly pointing out that it uses all this power and that people aren't considering that. They're not thinking about that. Well, one company that is absolutely thinking about that is Ocient. And so it's very, very nice to have here on the podcast today somebody I've known for a long time, Chris Gladwin, CEO and co-founder of Ocient. Welcome to the show.

Chris Gladwin: Great to be here. I really look forward to this conversation. Great to reconnect with you, Stephen. We've worked together over the years, and here we are again.

Stephen: So tell us a little bit more about what Ocient is and which part of the AI data infrastructure stack you play in. And then we can talk a little bit more about the question of energy efficiency and scaling performance.

Chris: So Ocient is a company that's developed a new software architecture for analyzing large data sets for complex, always-on analytics workloads. And in doing that, we have focused on efficiency in general, both price performance and energy efficiency, because they go hand in hand. So what we're doing is providing solutions using this new architecture for this emerging group of ultra-large data analytics requirements. And as you and I have seen, having worked together at my prior company, CleverSafe, in the large-scale storage realm, today's ultra-large analytics workloads just become tomorrow's normal, and eventually that's what your phone does. So it's really important to get this right. We really have to, as an industry, be on the right trajectory of efficiency. Otherwise, we're going to have some real problems powering this future.

Jeniece: So Chris, let's dive into that a little bit. How is Ocient specifically looking at addressing some of these challenges?

Chris: Well, it really comes down to focus. As with a lot of things you see in information technology, capabilities just keep growing and growing, and it really comes down to what the engineers building all these capabilities are trying to do. What are they trying to optimize? You've seen example after example: when you take a group of talented people and you give them the mission of delivering better performance, improving cost, or adding new capabilities, they do it. The other thing that is really important is a focus on efficiency. What we're seeing now in the industry is all these new capabilities coming out in the realm of artificial intelligence and large-scale data analytics, all these really amazing capabilities. But so far, there hasn't been a focus on doing that in a way that is not only cost efficient and performant, but also truly energy efficient. I was fortunate in that one of the jobs I had early in my career was at Zenith Data Systems in Chicago. That's why I originally moved to Chicago; at the time, it was the largest portable PC maker in the world. And that was the very beginning of focusing information technology on energy efficiency, because back then you had a battery, and batteries back then weren't that great. Yet you still had to find a way to get two, three, four hours of battery life. So once you had an engineering team that said, look, we've got to figure out how to make this more energy efficient, things changed. Making the CPU speed up and slow down when it isn't busy has a huge effect, and you just start going down this list. And so what we've begun to do at Ocient is really look at that energy efficiency and not just optimize for scale, not just optimize for cost, not just optimize for performance, but also optimize for energy efficiency. And you can get transformational benefits by doing that. We've already announced that we can deliver a 50 to 90%, like truly 90%, reduction in energy use. And a lot of it just has to do with focus. It takes person-years, person-centuries of dedicated, diligent engineering work to deliver that kind of result. But you're not going to get there unless you focus on it, unless you say, this is the target, this is what we're doing, let's go make it happen.

Stephen: And of course, it's not just about efficiency, it's about scalability and performance as well. You have to do the job, not just do it efficiently, and you're doing both. And it's interesting what you mentioned about the scale and the scope of data. When we started our careers in the storage industry, megabytes was a big number. I remember my first gigabyte storage array, and now that's a laughably small number. So you're right, it is coming everywhere. And the technology that you describe, changing clock speed on processors and so on, is not just table stakes for modern processors; it's the whole ballgame. You can't build a modern processor that doesn't work that way. And you see the same thing happening with data.

Chris: Well, the prior company I started was CleverSafe. When IBM bought the company, in our category of on-prem object storage software at hyperscale, meaning 100 petabytes and above of total storage in a system, we had 100% market share. We made all those systems because at that time no one else could do it. I remember when IBM bought that company in 2015. At that time, petabytes were normal, and exabytes, tens of exabytes, hundreds of exabytes were kind of the state of the art in terms of scale for storage systems. I started CleverSafe in 2004, and I think in 2005 I sat down and calculated how many systems in the world at that time were at least a petabyte. My estimate was 13. That was it. And now a petabyte is a corner of a server. So this always happens. And we're seeing this scale grow not just twice as big, but orders of magnitude bigger, again and again. And we're seeing the same thing happen with the hyperscale, always-on, compute-intensive data analytics workloads we focus on. Our focus right now is on things where you need at least 500 cores to deliver the solution our software runs on. Typically, in terms of data volumes, the average query or the average machine learning function or the average geospatial function, the things the analytics themselves are going to look at, touches hundreds of billions, if not trillions, of data elements; rows in a spreadsheet would be one way to think of that. At that trillion scale, there are around 500 to a thousand different systems in the world right now. Still just a small part of the giant data analytics market, but this concept of trillions is growing, and the next number people are going to start to learn is a quadrillion, which is a thousand trillion. We're actually working right now on the first quadrillion-scale system. These things don't deploy overnight, but those are words we're going to start to use: not just trillion scale or exascale, but quadrillion scale. That's what's coming.

Jeniece: So tell us a little bit, this is fascinating, Chris. You talked a lot about focus and being able to scale. Can you speak to that a little bit? I know you've been working with us on deploying the 61.44 TB drives, but what's fascinating to me is the way you're able to architect the overall system. Can you give us some quantifiable numbers and show us how you're making those systems a lot more dense and power efficient?

Chris: Well, the breakthrough that enables what Ocient is able to deliver is solid state. The amount of investment from semiconductor companies like Solidigm and others to bring that technology to market over the past decades was definitely tens of billions of dollars, probably around a $60 billion investment. And that kind of goes back to focus. The problem that Ocient solves is: how do you scale the amount of data you're analyzing in a system without limits, and without a worse-than-linear increase in cost? That had been the state of the art, where, yes, you could scale up and up and up, but if you wanted to analyze a million times the data, it was going to cost you more than a million times the dollars. That had always been a known problem in computing for decades: how do you solve it? And the reason it was such a problem was that you really had only two building blocks. As a software designer or a company building software, you cannot go faster than the hardware. Your price performance cannot be better than the hardware's. If you do your job perfectly, you max out what the hardware can deliver. And previously, the only two building blocks you had were DRAM and spinning disk. The problem with DRAM is that it's crazy expensive. Yes, you can solve these giant hyperscale problems with DRAM; that's what a supercomputer is, and they cost about a billion dollars. There are some people who can spend a billion dollars, but if you don't have that kind of money, previously you were stuck with spinning disk. And the problem with spinning disk is that its performance is ultimately a physical phenomenon. How fast can the read-write head settle onto a track? How fast can the platter spin? Those times haven't changed for decades. So on a Moore's Law-adjusted basis, spinning disk keeps getting slower and slower and slower; relative to everything else, it's a million times slower than it used to be, and it's just way too slow to solve this problem. Along comes solid state. Solid state today is offering thousands of times, 2,000 to 3,000 times, the performance per dollar of spinning disk. That's limited not by a physical phenomenon but by an electrical one, and that's on a Moore's Law curve. So 2,000 to 3,000 times better price performance right now becomes 5,000 times better, 10,000 times better, 100,000 times better. Solid state is the thing that unlocks the whole solution to this problem of hyperscale, always-on, compute-intensive data analytics.
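
To make the spinning-disk ceiling concrete, here is a rough back-of-envelope sketch of how mechanical latency caps a hard drive's random reads compared with an NVMe SSD. The seek time and IOPS figures below are illustrative assumptions, not Ocient or Solidigm specifications.

```python
# Back-of-envelope sketch: why spinning disk is capped by mechanics while
# solid state is not. All figures below are illustrative assumptions, not
# vendor specifications.

avg_seek_ms = 4.2                      # assumed average seek time for a 7,200 RPM drive
avg_rotation_ms = 60_000 / 7_200 / 2   # half a revolution at 7,200 RPM (~4.17 ms)
hdd_random_iops = 1_000 / (avg_seek_ms + avg_rotation_ms)

ssd_random_iops = 1_000_000            # assumed random-read IOPS for a modern NVMe SSD

print(f"HDD random reads per second: ~{hdd_random_iops:,.0f}")
print(f"SSD random reads per second: ~{ssd_random_iops:,.0f}")
print(f"Per-device gap: ~{ssd_random_iops / hdd_random_iops:,.0f}x")
```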

Stephen: But it's not just solid state that's making this possible, and I think that's the thing. People could listen to this and say, oh, great, you're using SSDs, congratulations. But you're doing a lot more than that. I think one of the interesting aspects of the Ocient solution is this proximity idea, that you're doing processing closer and closer to the data. And that actually reflects the architecture of modern machine learning and HPC. If you look at how, for example, NVIDIA Grace achieves such high performance, it's because the memory and the compute are located right together on the same package. It's the same way that Apple achieves performance with Apple Silicon. And you actually have a data approach that works similarly, right? So you're not moving as much data around.

Chris: Yeah. You and I have been around long enough to have heard "you've got to move the compute to the data" many, many times. But we're in a realm where, when you're looking at these hyperscale workloads, typically in the workloads that Ocient focuses on, you're talking petabytes, if not exabytes, of data, not just being stored but being analyzed. And if you want to run a query or a machine learning function or something like that on a petabyte of data, and you need to move that data, it might take an hour, it might take a day. You simply cannot use that architecture. So what we've seen, in terms of focus, is that the rest of the industry has focused on the large opportunity, which is real, and they've done a great job building solutions for smaller active data sets, smaller amounts of data being analyzed. And they fundamentally separate compute and storage into two tiers in the architecture, so they have to pull the data out of the storage tier, across the network, into the compute tier. That's fine if it's a gigabyte of data, or even a terabyte, maybe 10 terabytes, that you're analyzing, but once you're getting into hundreds of terabytes, petabytes, tens of petabytes, that's simply not going to work. So what we've done is collapse compute and storage into a single tier. We're not pulling data from storage across a network connection up into a compute server; we're pulling data across multiple parallel PCIe lanes within a server. And we'll have thousands of times the data bandwidth just at an architectural level. And it shows up in queries. The kinds of analysis we can do are things that are otherwise impossible; customers have tried other approaches and they just don't work. One of the problems we'll often see is that the rate at which they're adding data to the system is greater than the rate at which other systems can ingest data, so they never catch up, and that's a problem. The other thing we see is that we'll replace two or three or four systems, or even five, with a single Ocient system. And I think, Jeniece, you were asking earlier about what that looks like. We'll often see five or ten racks of equipment, 100 to 200 kilowatts of power draw, and you can replace that with about half a rack: one-tenth the number of servers, one-tenth the amount of electricity.
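
As a rough illustration of why shipping a petabyte between a storage tier and a compute tier breaks down, here is a sketch of the transfer-time math. The link speed, drive count, and per-drive throughput are illustrative assumptions, not a description of Ocient's actual hardware.

```python
# Rough sketch of the data movement argument: moving 1 PB from a separate
# storage tier over the network versus reading it from NVMe drives inside
# the same server. Link and drive bandwidths are illustrative assumptions.

PETABYTE = 10**15  # bytes

network_link_gbps = 100                              # assumed 100 GbE between tiers
network_bytes_per_sec = network_link_gbps * 1e9 / 8  # ~12.5 GB/s

nvme_drives_per_server = 24      # assumed drive count in one server
nvme_gb_per_sec_per_drive = 7    # assumed sequential read per drive
local_bytes_per_sec = nvme_drives_per_server * nvme_gb_per_sec_per_drive * 1e9

def transfer_hours(num_bytes: float, bytes_per_sec: float) -> float:
    """Hours to move num_bytes at a sustained rate of bytes_per_sec."""
    return num_bytes / bytes_per_sec / 3600

print(f"1 PB across the network tier boundary: ~{transfer_hours(PETABYTE, network_bytes_per_sec):.0f} hours")
print(f"1 PB from local NVMe in parallel:      ~{transfer_hours(PETABYTE, local_bytes_per_sec):.1f} hours")
```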

Jeniece: Wow. Okay, so on that note I just want to dive in and talk about some of your customers. I think it's fascinating, as Stephen said, what you're doing with the software and the hardware. Can you give us a little bit of color around the type of customers, like who's really on your target list to support with your solution?

Chris: So for companies that have these kinds of computing requirements, it means you've got large-scale, complex, always-on requirements for either your business or your mission, because that includes some government customers as well. There are only so many ways you can have this much data, and only certain use cases need this, so we have a pretty good sense of who they are. The way we model the market is we do a lot of research where we'll go off and identify a use case and write down in a spreadsheet: who are all the customers, and how much data do they have to analyze? [So we] really understand it on a bottoms-up basis. The biggest market right now is telcos. Telcos are big networks, and they're going through a process right now of making what I think is the largest investment in human history, which is 5G. I think they're spending somewhere between $5 [trillion] and $10 trillion on 5G, which is a very big number. And 5G is amazing. It's not only amazing price performance and super high location resolution that will enable all these new apps; it's also the first real redo of the backend infrastructure of mobile telephony in a long time. So there's a lot going on there. It's amazing, it's going to happen, and there's no denying the world will benefit from its use. The challenge is that even in 4G, the amount of metadata that a large telco generates is already at a scale they can't analyze. If you're a major telco, your network connects things constantly. When your phone wants to buy something, it's going to make 10 connections to do that, and this is happening all the time. So a major telco will make a trillion connections every two days, maybe every three days. And you want to go back and analyze: why was my network slow in Boston yesterday, or where should I put my next cell tower, things like that that you need to do. There are also compliance reasons why you have to keep this data and analyze it. Already they can't analyze at that trillion scale. Along comes 5G. 5G increases the amount of metadata that a telco network creates by 30 to 50 times. So they're already saying, "I don't know how to deal with the volume I have today," and it's going to grow 30x to 50x. The marketing people at those telcos are going to go sell 5G. The IT people who have to run the network don't get to say, whoa, whoa, slow down, this is really hard, how am I going to deal with this metadata? They don't get a vote. They just get a problem, and they have to solve that problem. So that's one example. We also see it in vehicles. Vehicles are the same thing: a typical car today makes petabytes of data, and right now they have to throw most of it away because they can't analyze it at the scale at which it's made. We also see this in ad tech, and in other markets like financial services. So there are these very specific markets that have this kind of requirement, where they're just dealing with this scale of data.
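
To put the telco figures Chris cites in per-second terms, here is a quick back-of-envelope calculation. Only the trillion-connection cadence and the 30x-50x multiplier come from the conversation; the rest is arithmetic.

```python
# Back-of-envelope on the telco metadata rates described above. The
# trillion-connections-every-two-to-three-days figure and the 30x-50x 5G
# multiplier come from the conversation; everything else is simple arithmetic.

connections = 1_000_000_000_000          # ~1 trillion connection records
window_days = 2.5                        # every two to three days
records_per_sec_4g = connections / (window_days * 24 * 3600)

print(f"4G-era connection records per second: ~{records_per_sec_4g:,.0f}")

for multiplier in (30, 50):
    print(f"With a {multiplier}x 5G metadata multiplier: "
          f"~{records_per_sec_4g * multiplier:,.0f} records per second")
```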

Stephen: There's a close relationship, I think, between HPC, massive data scale, and, of course, emerging AI applications. So talk a little bit about how AI applications are starting to demand this kind of scale. In particular, I'm thinking of retrieval-augmented generation, which is emerging as one of the key technologies powering practical applications of AI, as opposed to just "check out my cool chatbot."

Chris: There have been a lot of AI revolutions over the decades; this is not the first time the world has been captivated by AI. What I would say is different about this round is that not only is the technology better and amazing, but it's the first time AI has dealt with scale. I did some AI programming back in the 80s, and it was like a megabyte of data, something like that. It wasn't scale. When you look at what large language models are doing, for the first time they're able to understand the whole language and every time it's been expressed on the internet, which is giant. That's a scale of AI that's new; the prior revolutions couldn't do that technically. So I think that's one of the big differences, maybe THE big difference, in this AI revolution we're going through now. But that creates challenges. One of those challenges is that if you want to analyze the world's language, or the world's network metadata, or all the vehicle telematics data, and have AI derive the intelligence of what that data is saying, you have to have that data in one place, in an analyzable form, and it never starts in that form. So that's loading and transformation, and it's not a one-time thing. If you want to analyze vehicle telematics data with AI, that is a data set that is a fire hose of massive proportion that never shuts off. The cars are driving, people are using them, exabytes of data are going to pour into your system. So the real challenge isn't, "Oh, I've got this static data set, I'm going to put this AI system on top and do, I don't know, correlation or regression, or ask it questions about reliability." That would be hard enough if it were an exascale static data set. But that's not the requirement. The requirement is that there are a billion cars driving at all times, just pumping out data, and you've got to put a system on top of that that derives intelligence. A big challenge there is: how do I take this never-ending giant pipe of data and get it into a usable form in a reasonable period of time? That's a really difficult challenge, and something we've worked a lot on at Ocient as well.

Jeniece: Maybe we switch gears just for a moment. I know, Chris, we've talked about this before, right? You've shown 50% to 90% improvements in power efficiency, and it's just amazing. But working with a lot of different partners from the Solidigm side, we're not really seeing that from other partners. We're not seeing the same aggressive stance. So is there anything you are trying to do to move the industry forward, to push others to do the same thing? Do you want to talk a little bit about that?

Chris: Yeah, absolutely. An essential ingredient for making these systems more power efficient is to focus on making them more power efficient; we're just not going to get there otherwise. The way it works is that every company that makes any kind of computing product focuses on cost efficiency and focuses on performance. And the way that happens is that it's not just one number, like "make it seven." It's very complex. When you say, what do you mean by cost? Well, you've got all the life cycle costs. What do you mean by performance? What does that mean? So the first thing is to define what it means to be energy efficient. And we're working right now with a lot of other industry players, including Solidigm and others, to define what energy efficiency means. What's the benchmark? What's the metric? We're in the very early stages as an industry of doing that, but it's going to happen: we're going to create measurements and metrics and goals. So that's step one. In parallel with that, okay, now you've defined what it means to be energy efficient. Then, the way it works at any information technology development company, you start to prioritize, and you go after the biggest low-hanging fruit first. You always start with, "This first thing won't take very much time, and it'll make the system twice as efficient. All right, let's do that. Then let's focus on the next thing. Well, that's going to take a lot longer, but it'll be another 2x improvement. Let's do that next." And you prioritize them based on the gain. Over time it gets harder: 7% more efficient with a giant investment? Well, that's down the road. Where the industry is right now is that we really haven't focused on it, and as a result there's a lot of low-hanging fruit. So you're going to see, as you saw with Ocient, that immediately we come out with a 50% to 90% improvement; in some cases, we've demonstrated a 98% reduction. That's giant, because there was just a lot of low-hanging fruit to start with. And we already know the things we're going to do next: this will double it, this will improve it by 30%. So it really is just that simple and that complex. At a simple level, you prioritize the biggest bang for the buck and work your way down that list. And the only way you do that is by saying, I'm going to make this a priority, and I'm going to measure it, [and] here's how I'm going to measure it. It gets really complex in how you do it, but it is really just a matter of focus.
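
Here is a small sketch of how the successive efficiency wins Chris describes compound. The 50%, 90%, and 98% headline reductions come from the conversation; the baseline power draw and the individual steps are illustrative assumptions.

```python
# Sketch of how successive energy-efficiency wins compound. The 50%, 90%,
# and 98% headline reductions come from the conversation; the baseline power
# draw and the individual steps are illustrative assumptions.

def apply_reductions(baseline_kw: float, reductions: list[float]) -> float:
    """Apply successive fractional reductions in energy use to a baseline."""
    power = baseline_kw
    for r in reductions:
        power *= (1.0 - r)
    return power

baseline_kw = 150.0  # assumed draw of the legacy racks being replaced (kW)

# e.g. a 2x improvement (50% reduction), another 2x, then a further 30% gain
steps = [0.5, 0.5, 0.3]
after_steps = apply_reductions(baseline_kw, steps)
print(f"After three successive steps: {after_steps:.1f} kW "
      f"(~{(1 - after_steps / baseline_kw) * 100:.0f}% total reduction)")

# Single headline figures for comparison
for headline in (0.5, 0.9, 0.98):
    print(f"{headline:.0%} reduction leaves {baseline_kw * (1 - headline):.1f} kW")
```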

Stephen: It is really refreshing to hear somebody in your position focusing on energy efficiency, because this is a consequential topic for literally every company, and yet it is not the focus of most companies. And what do we hear? We hear people constantly criticizing AI, the cloud, modern compute, and so on for the energy they consume. People will talk about the scale. They'll talk about, "Oh, well, this company, they've got their own nuclear power plant." Or this company is investing in pre-buying gigawatts of electricity in order to support their build out…

Chris: Terawatts

Stephen: Terawatts, yeah. What is refreshing is to hear somebody say, "No, we've got to think about the efficiency of this. We've got to think about the impact of all this," while at the same time being able to say, but we also have to support data scale that nobody could achieve previously.

Chris: Yeah. And the reason it's really changed: if you go back a year or two, this wasn't even a topic. But energy use by data centers in general has been growing, driven by Bitcoin, definitely driven lately by AI, though all the other types of data analytics are still the biggest users and will continue [to be]. If you go back a year or two, it just hadn't reached the tipping point. Then, a couple of years ago, the amount of energy consumed by data centers passed [energy usage by] California. It's starting to get real. And now it's about to pass [energy usage by] Brazil. A big, giant economy will soon be [consuming] less energy than data centers are. And the problem is that it's accelerating. While all these countries, for really important reasons, are lowering their power consumption, here's this category that has gone from not a big deal to passing large countries, and accelerating. If you look at the latest IEA (International Energy Agency) models of what is going on with data center power consumption, they have different models, kind of like climate models, projecting different things. And the range is that the amount of energy consumed by data centers is doubling every two to four years. We've all been in computing long enough: looking back, storage would double every two years, and that's Moore's Law, or compute, or whatever, and the next thing you know, it's a million times more storage or a million times more compute. Well, that's not going to work with energy. We're not going to be able to have a million times more energy for compute. So the only answer is efficiency. And my call to action would be this: look at RFPs [requests for proposal], because this is what drives the industry. When big customers buy stuff, or customers buy stuff in general, what's in the RFP, what are they making their buying decision based on? It'll have cost and performance detailed forever, and here are all the capabilities I need. They don't currently have, "And here's how much energy, and here's the energy efficiency you've got to hit." It's not in RFPs. So my call to action would be, as customers, and it's in customers' best interests, because these are accelerating costs, a million dollars a year for energy, and you can't have that accelerate, what we need to see in RFPs and in buying decision-making is energy efficiency and energy use. That's going to cause the whole industry to focus on it, and you'll see breakthrough results.
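
For a rough sense of what the doubling range Chris cites implies, here is a short projection sketch. The two-to-four-year doubling range is from the conversation; the starting consumption figure and the ten-year horizon are illustrative assumptions.

```python
# Projection of the doubling range cited from the IEA models. The
# two-to-four-year doubling range is from the conversation; the starting
# consumption figure and the 10-year horizon are illustrative assumptions.

starting_twh = 400     # assumed current annual data center consumption, TWh
horizon_years = 10

for doubling_years in (2, 4):
    growth = 2 ** (horizon_years / doubling_years)
    print(f"Doubling every {doubling_years} years: "
          f"~{starting_twh * growth:,.0f} TWh after {horizon_years} years "
          f"({growth:.1f}x today)")
```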

Stephen: I've got to agree with you there. And the other thing to keep in mind is many of these companies have made commitments that they're going to reduce their energy or their greenhouse gas impact. They can't not do that just because they're chasing the AI trend. So they have to find ways of reducing power consumption. Great, great conversation. I so appreciate the fact that we were able to get you on here and talk about this because I feel like this is something that's been missing from our conversations a lot of the time here on Utilizing Tech. And yet it's something that's important to all of us. So thank you so much, Chris. It's great to catch up with you. It's great to learn about Ocient. And it's great to learn how you're leveraging advances in technology like flash storage and colocation of compute processing and storage and so on to improve the overall impact of everything that we're doing. Thank you for listening to this episode of Utilizing Tech Podcast. You can find this podcast in your favorite podcast application as well as on our YouTube channel. Just search for it in your favorite search engine. If you enjoyed this discussion, please do leave us a rating, maybe a nice review. This podcast was brought to you by Solidigm as well as Tech Field Day, part of the Futurum Group. For show notes and more episodes, head over to our dedicated website, which is utilizingtech.com, or find us on X/Twitter and Mastodon at Utilizing Tech. Thanks for listening, and we will catch you next week.

Copyright Utilizing Tech. Used with permission.