Transcript
Hi, Mike Matchett with Small World Big Data, and I'm talking today about several of my favorite topics. We've got AI, we've got storage, we've got supercomputers, and we're wrapping it all up today. We've got VDURA in, and we're going to look at some of the latest and greatest things they're doing, bringing this supercomputing class of storage out to AI workloads everywhere, across different levels of the stack. They've got a lot of news to talk about, so just hang on. We'll get into it in a second. Hey, welcome, Eric. Welcome to our show. Thanks, Mike. Let's just refresh before we even get into VDURA and what's been going on there. Tell us a little bit about your background, because you've been around the storage industry up and down the stack, working with the bigger guys, the smaller guys, everywhere. Just give us a little thumbnail of your experience and background. Thanks, Mike. Well, I've really had two parts to my career. Half my career I spent at AMD, making CPUs and systems with CPUs in them, and half my career I spent at Seagate, making storage and storage systems. So I've really seen the industry from the inside for a long time. And the exciting thing about VDURA is that we looked at this huge opportunity in AI and we realized, my gosh, as the company that created the first parallel file system, with the technology that's available to us, we can meld the speed of HPC with the ease of use and reliability of the enterprise world. And that's what our product is all about, really. Yeah, I've often talked about bringing these trends, bringing supercomputing technologies into the enterprise, and how this is kind of like bringing the stuff we developed during the moonshot into consumer goods.
Supercomputers often lead the way here: these technologies have to be made more reliable, available, performant, and cost effective before they come into the enterprise. But it happens. This is kind of what we're all about, really. Well said. That's exactly what we're about: let's take these really fast storage systems, but make them easy to use and make them reliable. All right, so let's talk a little bit about VDURA then. VDURA is, I'd say, a rebranding, but also a launch of a new way of thinking about parallel file systems, and you're built on and inherited all this from Panasas. Tell us a little bit about what Panasas did and how VDURA is changing that game. Right. Well, we're super proud of our heritage as Panasas. It's one of the foundational high-performance storage companies. We created the first parallel file system; we wrote the pNFS spec. So you think about it: just foundationally, the product was around right from the very beginning. And also architecturally, look at what Garth Gibson, our founder, did. We didn't know it then, because clouds didn't really exist, but he actually created a high-performance storage architecture that looks a whole heck of a lot like a modern-day large cloud. And so the reliability, ease of use, and durability of the data are foundational to the product. We're super proud of that, and we've kept it. We've kept the name on the technology, and we actually just announced the 11th major release of our software, so it's extremely mature. One of the rules I've learned in my time in storage is that it takes ten years to make a reliable enterprise data path. That's not something you can do overnight.
And that's one of the things the legacy from Panasas has done for us: we've got this incredibly hardened and reliable core. So that's where we started. And then, six months ago, we rebranded the company to VDURA, which stands for velocity and durability, because that's really our thing. We're fast, but we're also not going to lose your data. We're durable, we're reliable, and your data is safe with us. That's a big deal. So our new focus is: how do you take this really fast parallel file system and grow it to the needs of AI and the changing needs of HPC? All right. I don't want to pick on other people too much, but let's talk a little bit about the reasons why supercomputing-class HPC storage is hard for enterprises to adopt. Let's just run down some of those reasons. What are some of the common challenges people face when they say, I really need this class of performance now? Well, I have quite a long history with Lustre in particular, and people have gotten used to the compromises they've had to make, because they haven't had any choice. Frankly, it's been a long time since there's been fundamental innovation in this space. When I was working on Lustre, we would joke that it's like an F1 car: really fast, but it breaks all the time, and to get the thing started you need 13 engineers in white coats with laptops. That's just how it's been for a long time. And that's one of the problems we've tackled at VDURA, because there's no need for that.
You can have these fast file systems, and our product is very fast, but it can also be easy to use, and most importantly, it can be really reliable. We don't want to lose your data. In the AI world, the data is expensive and important, and the compute is expensive, so you just don't want to be losing your data and having to start again. This whole idea of HPC storage being scratch storage, where you just have to get over the fact that it's going to crash all the time: we don't accept that. Our product has been focused from day one on being really reliable, really hardened, and easy to administer. We're not going to lose your data. Yeah, I remember, as a storage analyst for a long time, looking at Panasas and thinking, oh, you guys are doing more scale-out, back in the days when scale-up was the thing for people trying to solve those HPC problems. You were taking this, like you said, more cloud-like approach, intuitively building something that was going to be more modern-friendly than what a lot of the rest of the market had. I did want to ask you, though: there is a difference between what HPC workloads demand out of a storage system and what AI, generative AI, and some of these other things coming along demand out of a storage system, yet they tend to try to leverage the same infrastructure. How is the workload changing? What do you see as some of the differences between your classic simulation workloads on HPC and what AI is looking to get out of storage? Well, it's such a good question, and really perceptive, because when you look at it from the top level, they almost seem the same. You think you need high-performance storage, et cetera,
but when you get into the real implementation, there are some subtle differences. The way I look at it is that they're cousins. They're not the same workload; they're cousins. So what are the differences? You need performance, and there's a lot of write performance: HPC is really write heavy, with checkpoints, snapshots, things like that. The difference with AI is twofold. One is that it's not just write heavy, it's write and read heavy. People are doing a lot more reading, and that has a couple of downstream effects. Instead of the workloads feeling sequential, or mostly sequential, like they do in HPC, they feel much more random, so that's something the storage system has to be good at. The other thing is that there are a lot more files, and they're smaller. Think about a traditional HPC workload: you're probably talking millions of files, but they're relatively large, megabytes and up. Whereas with AI, it's billions, or with these new architectures, trillions of files, and they're smaller files. So that's really the difference. Again, very close, but not the same. You have to be better at reads, better at random, better at small files, and better at handling many, many files. So, just a couple of topics here. You've got the scale-out file system at the core, and you're now handling this AI workload more head-on, on the same infrastructure the HPC workloads are on, and handling it very nicely because of the tooling you've done. But there are some key concepts in VDURA that we haven't really talked about yet. One of them I want to bring into the conversation is that you've got multiple tiers of service all built into one namespace. Can you tell us about that?
Because I think people are still thinking: if I bring in HPC-class storage, I've still got to run this kind of storage and that kind of storage and another kind of storage as well to meet all my needs. How do you wrap that all together? Well, that's a great question, and that's one of those compromises people have learned to live with over the years, because there hasn't been a good solution. What people typically do is keep their fast storage, which is kind of unreliable and flaky, and then somehow migrate data over to a more reliable but slower tier, like an object store or something like that. What we've done at VDURA is manage all three storage levels under a single namespace. We've got a super-fast NVMe layer, we've got a parallel file system layer, and then we've got an object store layer at the bottom. As far as the user is concerned, that's one set of storage: single namespace, single management layer, and so on. They don't have to worry about it, and we're able to move the data between those layers behind the scenes in a way that's completely transparent to the customer. And when you have this scale-out architecture, now at release 11 as you were saying: when I'm on the IT side of the fence and trying to support all those different tiers of service, how does that work? Do I have to set up three different silos of infrastructure and meld them together under this? What does that look like underneath? Right, not at all. From an IT perspective, what the IT manager can do is choose how much flash they need. The architecture is very flexible, and we have the ability to mix and match both disk-heavy storage and flash-heavy storage, so the customer can choose.
If I've got a super-high-IOPS workload, I can have a bit more flash, but if I've got a lot of capacity in the back, I can put more disk on. And by the way, I can expand one or the other. We see that a lot. People say, hey, I actually just have more data than I thought, so let's add some more disk nodes. That's one of our big values: we're able to mix and match these nodes, customers can add them, and our balancing algorithms in the backend are really sophisticated, so as you add storage we can rebalance the nodes behind the scenes in a way that's transparent to the user. All right, and highly reliable, because you've got some advanced methods of dual erasure coding and things, I understand. But just to be clear: if I'm the IT guy, am I really just down to the decision of whether I need more performance or more capacity when I add those nodes? I'm not trying to tweak each of those different tiers of service on multiple kinds of infrastructure; it's really a simpler decision? Yeah, it's a simple decision. Everything lands on flash first, and then we move it around in the background. All right, very cool. Tell me a little bit about what you're doing in this latest release, VDURA 11. Just real quick, what should people look forward to when they get this? Well, we're super excited about this. This is probably the biggest release in the history of the company, and there are really two major things. One is that we've moved to a microservices architecture, which is very modern, and this just gives us flexibility in how we deploy the product. We'll have a cloud product.
We can have larger implementations, smaller implementations, more flash, less flash; the microservices really help us with that. They also help us with reliability: we're able to move our failure zones around to the point where, for example, a disk failure does not mean a node failure. That's how we're able to get these high numbers of nines of availability and durability, much better than the standard parallel file systems, because we have that flexibility. So that's one thing we're super excited about: this microservices move, a big investment for us. And the other is that we've got a ripping-fast IOPS and metadata engine. We're introducing our VeLO engine. The needs of AI in particular mean a lot of small files, a lot of IOPS, a lot of performance, so we've doubled down on that area of the product, and we have this really fast front end. We're super excited to announce that and share it with our customers. So, just to try to summarize this in my head, Eric: you've got storage that inherits from the supercomputing landscape and all the advances there for the most demanding workloads possible, but you've microserviced it now, and containerized it, which is probably the better verb, and brought it to where enterprises can look at this and say, hey, this is highly reliable for what I need to do and achieve, for workloads that include AI, and there are going to be new workloads. I mean, AI is still changing rapidly. Every six months I look and there's a different shape to it. Every six weeks, even. Oh my gosh. Right. Yeah, if you don't pay attention, you miss some big innovation that just happened. It's a big thing.
I mean, we can talk about RAG and all sorts of stuff today, and tomorrow we'll be talking about some other new technique that's come along. But from an enterprise perspective, it's not supercomputer storage that I have to stand up here; it's storage that inherits all this goodness and high performance from there, but lets me bring it into the enterprise and handle millions or billions of files, handle multiple tiers of storage efficiently so that I can add capacity on a very cost-effective basis, and maybe eliminate some other silos of storage that I have, or was thinking about, and operate more coherently going forward. I think it's a great story. Oh, thanks. Yeah, that's exactly our plan. We want to bring you the performance of the parallel file systems, and we're the original parallel file system, with the reliability and ease of use of enterprise systems. That's our overall strategy: let's make storage easy, and let's innovate our way out of the compromises people have made in this market. There's no need for them. The technology has advanced, and we're able to bring those two things together in a way that nobody else has ever done. Right. And we don't even have to talk about primary and secondary storage anymore; this is sort of obviating some of that need as well, because what good is data if you're not using it? You need to have it online in your space. We don't have to talk about setting up a performance system versus a capacity system, because you're collapsing that as well, and you're making it all automated and easy to use, so I don't need a department of academic researchers just to run the thing. Yeah, exactly. Storage should be the easy part. Let the engineers work on saving the world, and let storage just work.
Well, I'm looking forward to some of the ways VDURA gets packaged here. You've got a growing ecosystem of partnerships as well, and with these microservices, it looks like there's a burgeoning number of ways you could think about bringing this to market and solving people's needs going forward. So this could be an exciting year for you. If someone wants to learn more about this, Eric, and look deeper into what VDURA has been doing and what they're bringing to market here, what would you recommend they start with? Well, first of all, check out our website, vdura.com; all our latest information is on there. In fact, Hyperion Research just released a white paper on durability that is really relevant to what we're doing, and you can download it from our website. And then, if you're going to Supercomputing, check us out. We're doing a great promotion with Hafthor Bjornsson, The Mountain from Game of Thrones, and he's going to attempt a world-record data lift in our booth at Supercomputing. So that's super exciting. All right, we'll look for footage of that. By the time people see this, Supercomputing will probably have happened, but we'll be looking for that Mountain lift to see what the answer is to how much flash, how much data, a single person can lift. Absolutely. All right, pretty cool stuff. There you have it: a pretty cool convergence of supercomputing HPC technology and enterprise needs, high-end, high-performance workloads that need parallel file systems, and AI, which everyone is talking about and trying to build infrastructure for, all made highly reliable and cost efficient at the same time. So check out VDURA. Mike, thank you. All right, thank you.