Transcript
Mike Matchett: Hi, Mike Matchett with Small World Big Data. We're here today talking about AI. Of course we're talking about AI; everybody's talking about AI. If you're looking at AI, though, and you're trying to apply AI in your organization or for whatever purpose, you probably aren't really doing AI testing, are you? And you know you need AI testing for lots of reasons, from just being correct to being compliant and making sure you're actually adding value to the customer. Stay tuned. We've got Raga AI here today to tell us about AI testing. Hi, Gaurav. Welcome to our show today.

Gaurav Agarwal: Thank you, Mike. Thanks for having me here.

Mike Matchett: So you started Raga AI some time ago because you have this deep interest in AI, as I do. How did you get started in this whole land of artificial intelligence? What was your background there?

Gaurav Agarwal: Yeah. First of all, Mike, thanks for having me here. I have been an AI and software practitioner all my life. I started coding when I was 13, and I was very passionate about all the innovations and the change that software and AI can bring. I did my master's in AI and computer vision around 2005, and I have seen the full evolution of AI, right from the early days of machine learning and traditional methods, to when AlexNet and ImageNet came five to ten years later with deep learning, and now to the advent of generative AI engines. I have just seen an exponential improvement in what AI can do, and the promise it brings. So I've been very excited about this.

Mike Matchett: All right. I've been following AI since I was a youngster as well. We were looking at expert rules-based systems and things that might pass the Turing test, and neural nets were exciting. And then deep learning and all the rest of it.
But one of the things that's always struck me is that it's really hard to tell if AI is doing the right thing. You create this model because you want it to be smart, smarter than you are in some cases, but then you can't tell whether it's smarter than you are or not in a lot of cases. So fast forward to today: we've got generative AI, that stuff is everywhere, and I don't know how you tell if it's correct or not. So what do we do? Where do we even start on that?

Gaurav Agarwal: Yeah, Mike, that's such an important question. The failure of AI is everywhere, and it starts with detection of those failures. That's the number one problem statement. I'll tell you a story about why I started Raga AI. It was a few years back. I was in California, driving a semi-autonomous vehicle. It was rainy and dark. The vehicle was supposed to brake because there was debris in the road. It did not. In a split second I realized that the AI had failed, and I hit the brakes. If I had not, I might not have been sitting here. So understanding and detecting AI failures is the number one step, and that's what we do as step one. Step two is to know why it has failed, and step three is how to fix it. Detection starts with looking at AI from all different aspects: the data it was trained on or, in the case of LLMs, the data used for fine-tuning or RAG; the prompts we are getting and their correctness; and the correctness of the output in terms of its context. We have to look at all of these things to make sure we are detecting and understanding why AI fails, and Raga AI has built some key technologies to address this important problem statement.
Raga AI has built some foundation models to understand how AI is performing today, and that has been very critical.

Mike Matchett: All right, let's dive into this a little bit. If I'm trying to implement an AI feature or layer or advisor in my organization, or I'm tasked with even just supporting one and getting it into production, my main concerns are making sure that the data is secure, making sure that it operates, making sure that it performs as an application should. But now I've got these additional concerns. Is the AI correct? Is the AI performing properly? Is the model drifting or not? You mentioned it yourself: the data going into it, garbage in, garbage out. I've got to monitor that stuff as well. I've got to manage it, and I've got to measure how I'm doing in those categories. And that's really new territory for a lot of people. Organizations and enterprises today are used to thinking, "I can run the application, but it's up to the developers to make sure it does the right thing." Now it's, "Wait a minute, I've got to run this thing that's got its own smarts, and I have to monitor how smart it is and how smart it's staying." Right? So what does someone need to do to be able to implement ongoing operational management of this?

Gaurav Agarwal: This is a very important question. First of all, we know how important it is to make sure that AI is functioning properly. It's not easy to do, and it's not happening today. That's the problem. A lot of the focus is on building and deploying AI, and because testing is so ad hoc, we are seeing so much failure. That's exactly where Raga AI comes in.
We have a platform with over 300 tests, which can be deployed on the customer's cloud or on premises. Any test can be called as a very simple API anywhere in the CI/CD pipeline, with a Python interface. So now anyone who is building and deploying AI has a tool and platform they can use to understand the quality and performance of their AI in a very simple three-step process. That's the reason Raga AI has already become the number one testing platform, with over 300 comprehensive tests.

Mike Matchett: So, 300 tests. Tell us what this three-step process is. Just break it down, because I think there are a lot of people saying, "Hey, I don't know what's inside this black box. I'm not the data science team, or maybe I am, but what's the operational three-step process?"

Gaurav Agarwal: Yeah, it's very easy to use. The customer just has to do a pip install of our technology, call our API, and see the results. And the results include detection of the issue, a diagnosis, and an actionable recommendation. Very simple to use.

Mike Matchett: All right. So we can actually start to see the emergence of people in production who aren't data scientists, but who are more responsible for the correctness and compliance of this thing than for its creation, and they need rigorous AI testing solutions to get their hands around that part of it. Right?

Gaurav Agarwal: Exactly right.
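As a rough sketch of the detect-diagnose-recommend flow described here: the platform's actual package, function names, and result fields are not given in the transcript, so every name below is hypothetical. A single test called from a pipeline might report results in this shape:

```python
# Hypothetical sketch only: the function name, result fields, and threshold
# are illustrative, not the actual Raga AI API.

def run_accuracy_test(predictions, labels, threshold=0.9):
    """Score predictions against ground truth; report detection,
    diagnosis, and an actionable recommendation."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    accuracy = correct / len(labels)
    failed = accuracy < threshold
    return {
        "detected": failed,  # step 1: is there an issue?
        "diagnosis": (f"accuracy {accuracy:.2f} below threshold {threshold}"
                      if failed else "accuracy within threshold"),  # step 2
        "recommendation": ("inspect misclassified samples and retrain"
                           if failed else None),  # step 3
    }

report = run_accuracy_test(["cat", "dog", "dog", "cat"],
                           ["cat", "dog", "cat", "cat"])
print(report["detected"])  # prints True: 0.75 accuracy trips the 0.90 threshold
```

In a CI/CD pipeline, a check like this would run after each model build, failing the stage whenever `detected` comes back true.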
So we are making testing not only something that can detect any issue, which is very important, but also easy to use by anyone who wants to ensure that the AI works well and gives the intended result, even if they're not building the AI but are users of it.

Mike Matchett: All right. I just teased a little bit about correctness and compliance. But tell us, what are some of the common failures that people will see in their AI, especially if they're not watching out for them? What creeps up on them?

Gaurav Agarwal: There's a range of failures; I'll talk about a few of them. For example, bias. There have been so many public instances of AI bias. We have one customer who was just looking to find the right resumes for doctors, and the AI was returning primarily male doctors. Using our technology, they were able to quickly fix that, reducing those errors by over 90%.

Mike Matchett: All right. So it could identify gender bias in the data. Was it looking at the training data or at the model output, or both, in that case?

Gaurav Agarwal: It was looking at a lot of different things, but the primary issue here was training data bias. The training data was biased towards one particular gender as compared to the other.

Mike Matchett: All right. So your testing platform is good for all kinds of machine learning functions and use cases popular today. But you might have heard of these large language models that everyone's doing. What can it do for that?
I mean, that's such a different environment, where I put in some natural language prompt and get something out. How do you look for correctness and compliance and whatever else in that?

Gaurav Agarwal: Yeah, Mike. LLMs and GenAI are magical, but they fail. And we have to look at all the different aspects, the way we do for any AI system: the prompt and its correctness; the data used either for RAG, which basically provides the user context, or for fine-tuning; and the output, its context and its correctness. We have to look at all of these aspects to make sure the LLMs are working fine all the time.

Mike Matchett: And you've got a platform that does that and helps someone understand those are the things they need to be looking for. Just quickly, I've said "compliance" like five times here. What are we talking about, and how do you help someone be compliant?

Gaurav Agarwal: Correct. The next step of testing is compliance. We want to make sure that our AI is working fine; that's the goal of testing. But we also want to make sure that it is doing the right thing all the time, and that's exactly what compliance is. So Raga AI is offering testing with over 300 tests, but it's also offering compliance for various aspects, to ensure that all the different kinds of AI are compliant with various regulatory standards and guidelines.

Mike Matchett: Right. And you're also looking at the core measurements people might normally think of: performance, latency, scale, even drift in the answers if there's a test answer set. So you're doing all that, which people have understood for a while.
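One crude way to picture an output check for a RAG pipeline of the sort mentioned here, verifying that an answer stays grounded in the retrieved context, is a token-overlap score. This is a simplified stand-in for illustration, not Raga AI's actual method:

```python
# Simplified grounding check: the fraction of answer tokens that also appear
# in the retrieved context. A low score suggests the answer drifted away
# from (or hallucinated beyond) its source material.

def context_overlap(answer: str, context: str) -> float:
    answer_tokens = set(answer.lower().split())
    if not answer_tokens:
        return 0.0
    context_tokens = set(context.lower().split())
    return len(answer_tokens & context_tokens) / len(answer_tokens)

ctx = "the eiffel tower is 330 metres tall and located in paris"
good = "the eiffel tower is 330 metres tall"
bad = "the tower was demolished by aliens in 1990"
print(context_overlap(good, ctx))  # prints 1.0 - fully grounded
print(context_overlap(bad, ctx))   # prints 0.375 - mostly ungrounded
```

Production-grade checks use semantic similarity or an LLM judge rather than raw token overlap, but the shape of the test is the same: score each output against its retrieved context and flag anything below a threshold.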
But you're now layering on this idea of: let's look at the data set, let's look at the bias, let's look at compliance, let's look at some of these other things, especially as we get into these, I don't want to say funkier, applications of AI that people are taking on, which is pretty cool. And I think you mentioned this: it is not necessarily a SaaS solution, but something people can run on their own premises as well. How does that work?

Gaurav Agarwal: Yes, that's a very important design consideration. Our solution supports both; it supports deployment on the customer's private cloud or on premises as well, primarily because data privacy has become so important. We enable this for all customers so that our technology runs where their data never leaves their infrastructure.

Mike Matchett: Okay. So you're not pooling everyone's AI together and creating this massive data set behind the scenes. You're saying, "Look, we understand the AI problem; you get to run this in your own environment." This is great.

Gaurav Agarwal: We definitely respect the customer's data privacy.

Mike Matchett: Yeah. So, you know, I understand it's kind of early days to be looking around and saying, "Hey, there are these AI experts in QA now, and these AI experts in DevOps," and really, there aren't. Right? You have the data science team, who are the AI experts, and what you're doing is enabling those downstream operators of AI, appliers of AI, and managers of AI to get their hands around this problem and be able to manage it properly.

Gaurav Agarwal: Exactly right. So we are enabling everyone who is associated with AI.
They are confident in the output of the AI, and they are happy with the ROI of the AI. That's when the eventual goals of AI will be achieved.

Mike Matchett: All right. So, pretty early days for AI in the enterprise, but it's definitely coming on strong, and you've got a testing platform there. A lot of people looking at this are probably going, "Oh my gosh, I should take a look at this." If they are interested, where should they go? Obviously you've got a website, but is there anything special you'd point them at?

Gaurav Agarwal: Yes, definitely two places. One, try our platform; two, book a demo. We are very readily available at both places.

Mike Matchett: Okay, that's raga.ai. And just finally, you know, Raga didn't come about because of retrieval-augmented generation, the RAG name; it came about earlier than that. What does Raga actually imply?

Gaurav Agarwal: Raga is an Indian word which refers to music and tuning. So we are tuning the AI.

Mike Matchett: Tuning the AI. It's a happy circumstance there, because you're helping people with that. But check it out, folks: raga.ai for AI testing. And testing includes not just correctness but compliance, failure modes, operational requirements, and all that other goodness that you want as you're deploying AI into production. So take care, folks. Thank you for being here again today.

Gaurav Agarwal: Thank you. Thank you, Mike.

Mike Matchett: All right. Take care, folks.