Transcript
Hi, Mike Matchett with Small World Big Data, and we are here today talking about a small world with big data. How do you handle supercomputer-sized amounts of data in a performant, scalable, distributed, and most of all efficient way? Do you just put it all in S3? Do you just put it in an object store? Well, then how do you get at it to do all your cool new workloads like AI? It's a challenge, and we have someone here today who is offering us a solution. Hi, James. Welcome back. We haven't talked in a bit. How are you doing?

Hi there. Thanks, Mike. Glad to be here. I'm doing really well.

So let's get into it. DDN is clearly the leader in what we might think of as HPC or supercomputer infrastructure, particularly EXAScaler: big parallel file systems that power really the biggest kinds of HPC workloads, including AI, which is a hot topic today. But tell us a little bit about why, in your words, EXAScaler wasn't really covering the whole market. What was going on that made you look to develop a new storage system?

Yeah. We've always been a company that tries to solve the challenges of scale, and that's scale in terms of performance, capacity, and complexity: data challenges that are really, really tough. That's what we do as a company. About five years ago, really in conjunction with NVIDIA, when NVIDIA started purchasing systems to support their own SuperPODs, that was the start. Since then it's really blown up, and we've found ourselves as the core data engine behind the vast majority of the SuperPODs you see around the world: very large AI supercomputers. And it's a different world. We've always been in HPC, but this AI world is slightly different. What we've been doing so far is solving the challenges of scale, performance, and efficiency. People are spending a huge amount on these large AI systems, and they need high productivity. The last thing they want is to be waiting for data and wasting data center space, data center cycles, data scientists' time, money, and effort. We really solve that piece of the puzzle: the scale, efficiency, and performance challenge.

Right. You spend millions of dollars on GPUs, and you want them running at 100%. You want those SuperPODs just cranking.

That's what EXAScaler does. Literally, these customers are spending $100 million on GPU systems, infrastructure, and cooling, and maybe a few million of that is for storage. So let's make sure that storage is not holding it all back.

Okay. So when you step back and look at what's going on in the world with everybody adopting AI, these HPC workloads are trickling down in the form of AI workloads, and the complexity starts rising: not necessarily just for enterprises, but for cloud service providers, for the more aggressive and global enterprises, for people who are working with petabytes of data and doing research. And they're saying, this isn't just about my GPU pod; this is about managing these petabytes of data. What are you doing for that now?
Yeah. So when we're talking to these customers, we realized, of course, that we're not solving all of their problems with EXAScaler, which solves that supercluster challenge of how to make things more efficient. So what's the other problem? Well, think about the data. It's all about the data. It gets generated somewhere: in a microscope, on a satellite, in a scanner, on a vehicle, in a camera. It's somewhere in the world. It gets tagged, typically by AI models to start with, and there are lots of tags. It comes across networks into regional data centers. Maybe data scientists access it; it gets filtered, it's cleaned, it gets tagged some more. Maybe it then comes into a core cloud system or data center. More tags. You start to train models, test models, maybe resimulate some of that data to add more data into the pile, and at some point you train your models with that data and put the models into production. Then, of course, there's this feedback cycle. Meanwhile, the CIO and CFO are looking down at this huge, complex operation, with massive metadata, massive databases, data distributed, often running partly on shared systems. And their job is to be able to say to the authorities, the government, the financial authorities: we know what we've done, we know how we trained this model, we know why it's behaving as it is, and we have done due diligence in training it on the right data.

So: lots of tags, lots of stuff. People generally have been relying on object storage for this, what you might call the outer ring; I think you do call it that, the outer ring around the dense core AI cluster. So I've got petabytes of data, and you've created a storage solution here called Infinia. Tell us what Infinia really does.

Yeah. We started building Infinia many years ago, and to be clear, it's completely ground-up. People shouldn't be confused: it's nothing to do with EXAScaler and doesn't share a line of code with anything else out there. We built it specifically for these kinds of problems. There are several things I could say, but really, at the core, what we've tried to do is take all that tough stuff, the multi-tenancy, data distribution, metadata management and tagging, searching on metadata, all those really hard challenges, and solve them at the most native layer inside our data platform that we possibly can. The more native it is, the more efficiently we can do these things. Let's allow ourselves to be orchestrated from above by exposing APIs, but do everything, metadata search, tagging, querying, distribution, movement, and managing quality-of-service SLAs for tenants, all the things that add to the complexity of these new environments, at the core, very natively and very efficiently, and then expose it through APIs to be orchestrated. We think that's the ideal flavor of storage system for the modern world.

All right, let's throw the gauntlet down, since we're running out of time. Infinia: highest-performing object storage, true or false?

True. We've been testing this internally, pretty much like-for-like against the best-performing object stores out there. We think we're the fastest, and particularly in terms of low latency: we think we're the lowest-latency object store, which is very important.
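To make that tagging-and-search workflow concrete, here is a minimal sketch of writing tagged objects and filtering on their metadata through a generic S3-compatible API. The endpoint URL, bucket name, and tag keys are hypothetical, not DDN's API, and since the standard S3 API has no server-side metadata query, the client-side filter below merely stands in for the kind of native search capability described above:

    import boto3

    # Hypothetical S3-compatible endpoint and bucket; substitute real values.
    s3 = boto3.client("s3", endpoint_url="https://s3.example.internal")
    BUCKET = "training-data"

    # Attach tags at write time so downstream pipeline stages can filter on them.
    with open("sample-0001.raw", "rb") as f:
        s3.put_object(
            Bucket=BUCKET,
            Key="scans/sample-0001.raw",
            Body=f,
            Metadata={"source": "microscope-12", "stage": "cleaned"},
        )

    # Standard S3 exposes no server-side metadata query, so this client-side
    # scan stands in for a native, indexed metadata search.
    cleaned = []
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=BUCKET):
        for obj in page.get("Contents", []):
            meta = s3.head_object(Bucket=BUCKET, Key=obj["Key"])["Metadata"]
            if meta.get("stage") == "cleaned":
                cleaned.append(obj["Key"])

    print(f"{len(cleaned)} objects tagged stage=cleaned")

The point of solving this natively in the platform, as James describes, is precisely to avoid the per-object round trips this client-side version incurs at petabyte scale.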
Lowest latency, highest performance, incredibly scalable because of the way you've architected it, designed for petabytes of stuff. It's really S3 for a new generation of how we approach computing.

It's all-flash QLC, super low latency: 100 times lower latency than what you find in a public cloud. There's a potential here for Infinia to help move S3 from tier two into tier one. It's got that POSIX-style low latency, so you can just run applications directly off it.

So you're making object storage a primary storage, and we no longer have to think of it as a secondary environment, an archive, a backup, or just cloud. This is where we run.

Yeah. Sub-millisecond latencies mean it now becomes realistic to run applications directly against an S3 object store with Infinia.

All right, that sounds like a dream for a lot of developers out there, who now just have get and put, and it's extremely fast no matter what size objects they're working with. James, we're out of time; we've only scratched the surface, and we could talk about this for hours. I think we're going to at some point, hopefully. But if someone wants to get a little more comfortable with what Infinia is, look under the hood, and see how you're challenging the entire industry here, where would you point them?

Just go straight to ddn.com. We've published white papers and data sheets, so you can get a feel for how it's architected, with a bit more depth about how we've built in multi-tenancy and how you can use this as a cloud provider. Probably ddn.com/infinia: you'll find a home page for the product with all the relevant materials right there.

All right. Well, thank you for teasing that with us here today. We're definitely going to have to dive deeper into this, because it sounds really game-changing for cloud providers, the storage industry, just about everybody out there. So thank you so much for being here. And folks, it's worth starting to check this out: even if you're only slightly interested in AI, you should definitely check out how the storage layer should work for you. Take care. Thanks, James.
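For a sense of that get-and-put developer surface, here is a minimal sketch that times a small-object round trip against a hypothetical S3-compatible endpoint (the endpoint, bucket, and key are illustrative); it is the kind of quick check you might run when judging whether an object store is fast enough to serve as primary storage:

    import time

    import boto3

    # Hypothetical endpoint, bucket, and key; substitute real values.
    s3 = boto3.client("s3", endpoint_url="https://s3.example.internal")
    BUCKET, KEY = "app-state", "sessions/session-42.json"
    payload = b'{"user": 42, "cart": ["sku-1", "sku-2"]}'

    # Time a small-object write...
    t0 = time.perf_counter()
    s3.put_object(Bucket=BUCKET, Key=KEY, Body=payload)
    put_ms = (time.perf_counter() - t0) * 1000

    # ...and the read back.
    t0 = time.perf_counter()
    body = s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read()
    get_ms = (time.perf_counter() - t0) * 1000

    print(f"put {put_ms:.2f} ms, get {get_ms:.2f} ms, {len(body)} bytes")

Note that these round-trip numbers include network time, so sub-millisecond server-side latency would show up here as low single-digit milliseconds end to end.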