Transcript
Johanna: Okay, a very warm welcome to everyone who has joined us for our webinar today. I'm very happy to have you here. If you want, you can switch on your cameras; if not, no problem at all. I'll kick this off very briefly. My name is Johanna, and I manage partner sales and accounts for StormForge here in EMEA, located in Cologne, Germany. Our webinar today is called "Triple 20: How to Reduce Costs on Target While Increasing Speed and Quality." That's a darts analogy I wasn't familiar with. I'm not going to spoil it; I'm sure Benedikt will get into it. It's a very cool analogy, and I learned something, so you're going to learn a lot today from our two presenters, Benedikt and Niels. Very happy to have you both. Benedikt is CTO at Talent Formation and a highly experienced IT professional; Talent Formation is responsible for some very impressive digital transformation projects here in Germany. Niels, our own very experienced solutions architect, will get into achieving cost efficiency in Kubernetes using machine learning and give us a demo of StormForge Optimize Live. I'm very much looking forward to this. Here's the agenda: Benedikt will start, Niels will take over for the second part, and at the end we'll have a wrap-up and a Q&A. As mentioned, please hold your questions until the end so we have plenty of time for them, but feel free to drop them in the chat during the presentations. That's it from me for an intro. I'll pass it over to Benedikt now so he can start us off. Enjoy, everybody.

Benedikt Stemmildt: Thank you very much, Johanna. I need to switch to my presentation for a moment. Let me share this screen and start the slideshow. Can you see my screen? Great. Hello, everybody. I only have 20 minutes, so let me start my timer. Johanna already introduced me, so I don't think I need to go into much more detail; I'll start with the slides, and if you have any questions about our projects, about me, or about my role, we can cover that in the Q&A session. I'm here today to share my thoughts on cost reduction. I've seen a lot of cost-reduction efforts in the past years at the clients and companies I've worked for, and most of them aim at immediately reducing infrastructure costs or personnel costs. The problem with that is that at some point you will want to grow again, to do things again, and then you more or less have to start over, because by cutting in this short-sighted way you lose your ability to scale back up. My idea here is to tell you a little about how I would reduce costs while staying able to scale up again without losing a lot of time. This is what I call the cost-cutting trap, the trap everybody falls into, and you can see it in more detail in this graphic. I'll explain it and also translate it, so never mind the German; I wasn't able to change that anymore.
One of the biggest problems I see is that companies treat all their projects the same. They have a lot of activities, projects, products, whatever you want to call them, and they treat them all alike. When it comes to reducing costs, they start killing projects: this one is done, this one is done, this one is done, without really knowing which projects they are stopping. One thing we do, not only for cost reduction but also when creating new projects, is to actually look at the context of the project, product, or activity, because each one is a little different in that regard. In our view there are three types of context. The first we call C1, the second C2, and the last C3. Very creative, I know.

C1 is for optimization, which happens for plenty of good reasons within the current business model of a company: you micro-optimize your current product or activity to create more revenue and reduce costs. Strictly speaking it's not revenue you're after but return, what's left over after revenue minus costs. In this context you have a lot of certainty: you know you need to spend this much time and this much money, and you will get this return.

I'll continue with the search context, C3, which is about creating new opportunities and big new business models. You try to adapt business models from others and change your own business by adopting their patterns. Companies do this because every product will eventually die, so for the long run you want to create new business mechanics to keep things going. You need to validate ideas, and you need to validate them cheaply. If you try to build or validate your new business model with the same cost structure as your current C1 business, you will spend a lot of money just on validation, which most of the time isn't going to work out, and you end up with costs that could have been avoided.

The second context, C2, is focused on innovation. You can copy patterns from other business models, but you need to find something new, a new solution, maybe on the technical side, maybe on the product side. The difference between C2 and C3 is that in C3 you have the whole business model with its full mechanics, while in C2 you have just the small innovation you want to achieve.

When I look at companies, most of them have a portfolio like this: a lot of projects across all the contexts, without even being aware of which context each belongs to, C1, C2, or C3, and without knowing the returns each project might generate.
If you now want to cut costs, you have to shoot down a lot of them without knowing what's going to happen. You don't want to shoot down projects in the first context, because the first context is what makes money now; C2 and C3 make money in the long run. A good portfolio should look like this instead. So one way I would reduce costs is to rebalance the portfolio: look at the projects, products, and activities in the different horizons and reduce them in C2 and C3. Don't kill them completely, because you need them in the long run, but move a lot of them into C1, or start more activities in C1. And this isn't only a cost perspective, because you will generate more revenue as well, so it also focuses on growing your business. That's why I think it's a good combination of reducing costs and generating revenue. If you do need to reduce costs, you can also kill projects, but keep the balance, like an investment portfolio where you hold a larger share of ETFs alongside some more volatile assets. That's very important to me when I look at cost reduction: avoid the trap.

The next question is: if all the projects in the first context are making me money, how can I make them more efficient? How can I be faster and more cost-efficient in executing C1 products and projects, to generate more return in the end? There's a lot more to it, but two very important things come to mind. The first is a good balance between design, execution, and operation. There's the well-known rule that a problem fixed in design is much cheaper than one fixed in production. But if you design forever, it never pays off; at some point you have to execute. So you need a good process with a well-aligned design phase. The second is dependencies: most projects I see have a lot of them, so no project works on its own. You need to reduce those dependencies and create independence to be able to be efficient.

Looking at the perspectives, we see six that matter for planning, and you should align all of them on the specific product. If you do that, you haven't forgotten anything, and your planning will be effective, because you've looked at all the aspects that particular project is made of. I won't go into too much detail here because of the time, but you can read the slide and the backup later. The perspectives are business model, service and product, organization, controlling, architecture and technology, and steering. We can go into more detail in the Q&A if there are questions.
Regarding dependencies, my take is that three of these perspectives are particularly responsible for them: service and product, organization, and architecture and technology. You can focus on these perspectives by separating parts of the product, the organization, and the technology in a way that creates independent teams, independent products, and independent architecture. This is what I'm going to show you now.

Looking at the product, what we like to do is cut it so that you get independent parts based on what a customer wants from that particular product. We won't go into everything here, but as an example we're looking at an online shop, where we tried to find three independent parts. We settled on one part for discovering, one for deciding, and one for fulfilling. These product parts can work independently of each other: the people building a very nice search and navigation experience don't need to talk to, or interact with, the part of the product doing the checkout and payment experience. This is just an example to demonstrate the independence of these parts. What's important is that we do not slice into a frontend part and a backend part; there's a lot of dependency there, and we want to reduce it, so we slice along product independence instead.

Looking at the second level, the teams, we need a way to orchestrate them, and Team Topologies is a really nice method for that. You can see these stream-aligned teams here, which map directly onto the slices: each slice is one stream-aligned team. Then you may have some teams that enable or support the others, maybe a platform team running a Kubernetes cluster for the other teams. The idea is that with stream-aligned teams you're very efficient, because there isn't much interaction between the teams. The coupling is shown with the black bars, the black links; they mean there's a service-like, very loose coupling rather than an overlap. The enabling teams and the stream-aligned teams do overlap, shown with the dots, meaning they have to talk to each other, which is a much less efficient interaction mode. So try to reduce these dependencies to get faster, more efficient interaction modes; you can read much more about this in Team Topologies. What we then do is put all the people needed to address a particular problem into one team. We don't have a dev team, a QA team, and an infrastructure team, because that means a lot of dependencies and a lot of inefficiency: every handoff makes the process of creating something new or solving a problem very inefficient. Ah, and the headline on this slide is wrong;
the right one is at the bottom: architecture and technology separation. There's a good book, Modern Software Engineering, where a lot of this is explained, for example loose coupling and high cohesion. What we do here is also try to create independent parts of the system, rather than, say, one big monolith. A monolith can be very cost-inefficient: maybe just one part of it gets very high traffic and needs a lot of infrastructure, while the rest needs nothing extra, yet you have to deploy and scale the whole thing just to serve that one hot part. The idea is to split it up. You could split it into microservices, or, as we prefer, into a self-contained-systems style, where you can assign infrastructure parameters to each system individually and get a very good distribution of your resources, which is more efficient than one big blob. You can also see that we try to create efficient coupling between these systems; I've tried to visualize that with the arrows. Have as few interfaces as possible between systems: lots of APIs, lots of synchronous calls, and lots of hard coupling make you very inefficient and add a lot of network overhead. If you communicate asynchronously where possible and cut systems along the product domains we had at the beginning, you will be much more efficient in creating products.

So the trick is to tackle the cause: rebalance the portfolio toward the first context, and then make those products efficient. You don't simply cut things down and lose your speed; you combine cutting costs with growing. In the end we're just following a very good study called Accelerate, the DevOps study; maybe some of you have heard of it already. My point is that these are proven principles and ideas that relate to organizational performance. The study has been run with, I think, 50,000 companies worldwide, and it shows that the ideas you see on the slide lead to organizational performance; what I've shown you tries to address exactly those. Here are the principles in detail. It gets quite technical, but some of them are an extract of that graphic, and if you use them you will again be more efficient with your projects in the first context. Concrete practices you've probably heard of are domain-driven design, micro-frontends, a shared pattern library for your frontend designs, and being a lot more cloud native. What I addressed before was having a Kubernetes cluster and splitting the monolith into parts with better resource allocation, as sketched below. We use these practices to further increase our efficiency.
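To make that resource-allocation argument concrete, here is a minimal sketch, not from the talk, of two self-contained systems deployed and scaled independently. The names reuse the product slices from the shop example; the images and numbers are made up for illustration:

```yaml
# Illustrative only: after splitting the monolith, each self-contained
# system scales on its own. The hypothetical high-traffic "discover"
# slice runs many replicas, while "fulfill" stays small, so the cluster
# only pays for capacity where the traffic actually is.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: discover
spec:
  replicas: 8                            # scaled for this slice's traffic alone
  selector:
    matchLabels: { app: discover }
  template:
    metadata:
      labels: { app: discover }
    spec:
      containers:
        - name: discover
          image: example.com/discover:1.0    # placeholder image
          resources:
            requests: { cpu: "500m", memory: "512Mi" }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fulfill
spec:
  replicas: 2                            # the quiet slice stays cheap
  selector:
    matchLabels: { app: fulfill }
  template:
    metadata:
      labels: { app: fulfill }
    spec:
      containers:
        - name: fulfill
          image: example.com/fulfill:1.0     # placeholder image
          resources:
            requests: { cpu: "250m", memory: "256Mi" }
```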
You can read a lot more about this in the books and links I'll show you now. You probably won't read them during the webinar, but you'll get the slides afterwards and can follow up on the ideas there. What I want to add, as a little cliffhanger to Niels's part, is that some of what I've shown you can be done very short term, but a lot of it is mid-term or long-term work on the overall efficiency of your organization. If you need to reduce costs right now, or you want to be very efficient in your current resource allocation, then you need tools like StormForge. Or maybe you're already doing everything I've shown and are very efficient, but you have this Kubernetes cluster with all the microservices and you want to allocate resources very efficiently across all of them. That's where StormForge comes in and helps you do it automatically rather than manually, which again means less cost, because you don't need people doing it by hand. Niels is going to show you a little about that in the next presentation. So here's the demo, and a few more sources; again, you'll get them afterwards. Thank you very much for listening, and I'm very happy to hand over to Niels now to see what he has to show us about StormForge. Thank you, guys.

Johanna: Thank you very much, Benedikt, that was very cool, very interesting approaches. And as you said, Niels is now going to zoom into one aspect where you can become cost-efficient very fast in Kubernetes. Go ahead, Niels.

Niels Roetert: Did we want to allow people to ask questions at this point, maybe for Benedikt?

Johanna: If there are any questions, I'd suggest putting them in the chat, and we'll definitely address them first in our Q&A at the end.

Niels Roetert: Okay, cool. Then I'll just start with my part. Hello, I'm Niels, based out of the Netherlands. I'm a solutions architect for StormForge, and as Benedikt already hinted, I'm going to talk about scaling your applications in the Kubernetes space, using microservices. You can spend a lot of time and effort trying to do that manually, but that's definitely something you don't want to do, because it's pretty hard to find people who understand Kubernetes well enough to actually come up with good results. That's why the talk is called "Put the Auto in Autoscaling": there are many solutions out there that let you automate this whole process, and StormForge is one of them. It turns out that running your application efficiently on Kubernetes is definitely something people want. As Benedikt just explained, if you base your application on microservices, you can scale particular parts without scaling others, which is what modern software design is about and makes it really efficient from that perspective. But in Kubernetes, every microservice obviously uses its own resources, and developers these days are responsible for setting these resource requests, as we call them.
What that means is that for every container running on a Kubernetes cluster, you have to declare how much CPU and how much memory you think it's going to use. That turns out to be pretty hard, because you usually don't know up front. If you're using standard components you've used before, you might know what the resource requests should be in a particular setting, but it's hard to get them right and make them lean enough to be really cost-efficient. That's what I'll be talking about.
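For reference, here is a minimal sketch of what declaring these values looks like in a container spec. The "checkout" workload name, the image, and the numbers are illustrative, not recommendations:

```yaml
# Illustrative only: a Deployment whose container declares resource
# requests (what the scheduler reserves) and limits (the hard cap).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout                       # hypothetical microservice name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: checkout
  template:
    metadata:
      labels:
        app: checkout
    spec:
      containers:
        - name: checkout
          image: example.com/checkout:1.0    # placeholder image
          resources:
            requests:
              cpu: "500m"              # half a core reserved per replica
              memory: "256Mi"
            limits:
              cpu: "1"                 # CPU is throttled above one core
              memory: "512Mi"          # exceeding this gets the pod killed
```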
One way to handle this, because you probably don't want to do it manually, is to use an autoscaler. Kubernetes has some components called autoscalers: the VPA, the Vertical Pod Autoscaler, and the HPA, the Horizontal Pod Autoscaler. And if you're running your cluster in a virtualized environment, you can also use the Cluster Autoscaler, which actually adds nodes to the cluster. The VPA is responsible for producing recommendations based on the actual usage of your microservice: it looks at how much CPU and memory the service is using and, based on that, comes up with a recommendation you can then apply to it. The HPA looks at the utilization of each individual microservice and, based on a trigger that you set, which I'll get to in a minute, decides that another copy of the microservice would be beneficial; or, if utilization is below the trigger point, it may decide to scale your environment down and make it leaner. The Cluster Autoscaler looks at how many microservices are running, how many are being added, and how full the nodes are, and based on that may add more nodes to your cluster.

To be fair, most teams start with the HPA, because it's fairly simple to use and less intrusive than the VPA: with the Vertical Pod Autoscaler, a change to the configuration of a microservice means that microservice has to be restarted. That might have a little impact on application behavior. It shouldn't, because these are microservices and you have multiple copies running, so you can restart them one by one, but the HPA is definitely the more popular starting point, because it doesn't affect availability and doesn't require restarting your services. It also works on a very short interval: if the average utilization of your pods is higher than the trigger point, it will decide to scale out after a couple of minutes.

But I've mentioned this trigger point a couple of times now. How do you set it? That's something you have to decide on yourself. The developer has to decide: say I have three copies of a microservice running; if the average utilization of these microservices hits 60%, should I add another copy, another replica as we call it, or should that happen at 70%, or at 80%? That's one of the things you have to decide as a developer, and it can be pretty hard to make a sound decision there. So what ends up happening is that people just take a default, or they guess that at, say, 50% utilization we should add a replica, and that can result in scaling out far more than necessary. And everything you request from a resource perspective, the CPU and memory for every microservice, you pay for, whether you actually need it or not.

What we see in practice with setting CPU and memory requests is that people tend to overprovision quite heavily. If you think your application needs, say, one CPU core to run efficiently, you might request a core and a half or two cores just to make sure you have sufficient resources. But if your actual usage, shown on the left side of the slide, is much lower, then everything between what you request and what you actually use, marked in red here, is wasted resources, and wasted means you're still paying for it. On the other hand, some people try to be very efficient, you could say aggressive, with these requests and set them fairly low, and that carries the risk of running out of resources. For CPU that's not the worst scenario: if you hit your CPU limit, the CPU is throttled, meaning you get fewer resources than you need and the application slows down. But for memory, if you hit the memory limit, the pod is actually killed and has to be restarted, possibly on a different node.

So, requests and limits are something you set at the container level, and you can have multiple containers in a pod; a pod is the instance you run in a Kubernetes cluster. To give you an idea: a large Kubernetes cluster with multiple applications, maybe 25 applications, maybe 100 depending on the size of your organization, will have hundreds if not thousands of pods running. You'd have to set these requests and limits for each and every one of them if they're all different; if there are commonalities between them, you might get by, but you obviously need to make sure that works out well.

Target utilization is the other component, the one for the HPA, the Horizontal Pod Autoscaler. If you scale out based on the average utilization of your replicas, this is the number you have to come up with, and like I mentioned before, it's pretty hard, because you need to take a few things into consideration. If you want to spin up another replica, how long will it take before it actually starts? If you have to copy data to that instance, for instance, which normally shouldn't happen but is just an example, that may take more time than spinning up a replica containing just a very small microservice. There can be different reasons why startup takes longer, or something else may be going on, and you need to take that into consideration.
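For reference, the trigger point lives in a HorizontalPodAutoscaler object. A minimal sketch, reusing the hypothetical checkout Deployment from above and the 60% example value:

```yaml
# Illustrative HPA: scale the hypothetical "checkout" Deployment between
# 3 and 10 replicas, adding replicas when average CPU utilization
# (actual usage relative to the CPU request) rises above 60%.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: checkout
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60   # the trigger point the developer must guess
```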
And that's one of the reasons we use machine learning: it looks at all these scenarios and comes up with basically the best possible target utilization, and also the requests and limits, for you to work with. Like I said, these days the developers are responsible for setting these values, because they develop the application, they're supposed to know it really well, so they're the right people to come up with these numbers, and quite often they also build the containers out of the applications they develop. So at some point they have to decide on these settings.

Fun fact: I visited KubeCon a couple of weeks ago and met a lot of interesting people, among them a lot of developers, and whenever I had the opportunity I would ask them: how do you come up with requests and limits? How do you set them? And nine out of ten, and I think I asked the question more than ten times, said: we just come up with a value, set it, and run with it. Then ideally we reevaluate it after some time and see how it performs; if it performs well, we can take some resources away; if it performs poorly, we might add more resources or change the target utilization. But like I said, the majority just pick a number and run with it: if it doesn't perform, they add more; if it performs well, they might take some resources away. And there was a lot of emphasis on the "might", because if an application runs really well, why would you risk changing that, taking away resources and potentially slowing it down? So in reality the reevaluation doesn't really happen: more often than not, people set a value they're sure will work well and then just leave it. When asked, a lot of people admitted: yes, we know we should be reevaluating these values, but we just don't have the time, and people want us to be doing other things.

What you also see in the DevOps world, obviously, is that after some time applications are handed over to another team, maybe the ops people, maybe the SREs, the site reliability engineers. They're most likely not going to touch these values either, because they don't want to make a change that results in the application no longer performing, or crashing, or misbehaving. If you didn't design the application and don't know all its internals, it's very hard to make changes that could negatively affect it.

That's why we're talking about StormForge Optimize Live, the product we developed to cope with this problem. It looks at the actual usage of resources, CPU and memory, and comes up with a recommendation: the machine learning takes these values, analyzes them over a longer period of time, and determines that if we set the request to this value, we can stay very close to the actual usage. And that obviously saves you a lot of time and a lot of effort.
I say time because you no longer have to have your best people look at these values and solve this problem; you just automate it completely. And the cool thing is that it does this continuously: if your usage changes, if the situation changes, the machine learning notices it in the actual usage and comes up with new recommendations. The other thing is that it covers both horizontal and vertical autoscaling, so it can come up with a combination of parameters that works best and makes sure the HPA and the VPA work together efficiently.

The way it works is that you install a really small component on your cluster. If you're running a Kubernetes cluster, you just install one agent. The agent goes out and automatically starts discovering all your workloads, all the containers running on the cluster, if you allow it to; you can also restrict it to particular namespaces, to particular pieces of your cluster, so it doesn't have to go out and discover everything. Within the boundaries you give it, it discovers all the workloads, and if you add workloads after installing the agent, it obviously picks up on those too. It automatically starts ingesting performance metrics from these workloads: it queries the nodes for information such as how much CPU is currently requested and how much is actually being consumed, and the same for memory. Then it analyzes that and comes up with better recommendations; that's where the machine learning comes in. Like I said, it does this for CPU and memory, both requests and limits, and also for the HPA's target-utilization parameter.

You can then decide to apply these manually: you go to the UI, press apply, and it changes the values for you; not automatically, but after you press the button. You can do this over and over again; the machine learning comes up with new recommendations at the frequency you tell it to. After some time you might wonder why you're still pressing this button, because it can be fully automatic: set it to automatic and it will go out and apply the recommendations by itself. And if you're using something like a GitOps workflow, where you want to push these changes through your CI/CD pipeline, you can also export the recommendations to a YAML file. As you can see from the arrows, it's a continuous loop, so it will keep doing this for as long as you tell it to.
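The webinar doesn't show the export format, but conceptually an applied recommendation amounts to a patch of the workload's resource values. A hypothetical sketch, with invented numbers, of what such a change could look like in a GitOps repository:

```yaml
# Hypothetical sketch only: a strategic-merge patch that replaces a
# container's requests/limits with recommended values. All values here
# are invented for illustration; this is not StormForge's export format.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout
  namespace: hipster-shop
spec:
  template:
    spec:
      containers:
        - name: checkout              # merge key: container matched by name
          resources:
            requests:
              cpu: "210m"             # recommended, down from e.g. 500m
              memory: "180Mi"         # recommended, down from e.g. 256Mi
            limits:
              cpu: "400m"
              memory: "360Mi"
# A CI/CD pipeline could apply a committed patch file with, for example:
#   kubectl patch deployment checkout -n hipster-shop --patch-file patch.yaml
```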
I did want to quickly show you what the UI of the product looks like. You don't have to consume it through the UI; there's also a CLI and an API and all that good stuff, but the UI is pretty cool to look at. The cluster I'm working on is just a really small demo cluster. There's the StormForge system namespace, which has the one agent running in it, the component I mentioned; if you have multiple clusters, you have to install it on every cluster. And there are a couple of applications running here: the StormForge hipster shop, as we like to call it, and monitoring, which contains Prometheus and also a component that runs a load generator against the application in the hipster-shop namespace. The hipster shop is a microservices-based application; off the top of my head it consists of 12 different microservices, and I can actually show you that: these are the microservices running in this particular namespace.

Once you have the agent installed, you can go to our application at app.stormforge.io, and if you go to Workloads under Optimize Live, you see all the workloads that were discovered by the agent. If I scroll down a little, and I hope this is readable and not too small, you can see there's a total of 19 workloads discovered by the agent. Let me zoom in a bit. What you see here is the workload name; which cluster it's running on (I currently have only one agent installed on one cluster, so you only see that particular cluster); which namespace the workload is in; what type of workload it is, in this case Deployments, but it could be DaemonSets or something else; and how efficiently we think this workload is running.

What you see here is that it's 12% efficient according to the mechanism we use, and we tell you why that is. Currently the total request is three full CPU cores. So this workload is requesting three cores, and that doesn't necessarily mean every microservice requests three cores; it can be that there are multiple replicas that together request three CPU cores. In this case they're also requesting nearly three gigabytes of memory. With optimization, these requests would drop to 0.8 CPU cores, so less than one core versus three, and only about 400 MB of memory versus three gigabytes. There's nothing you had to do for this: you just install the agent, and it automatically comes up with these recommendations after just an hour. You can see right away that the impact will be a reduction of a little more than two CPU cores, and the memory is going to be reduced quite dramatically as well.

That might not seem like a lot, but look at the prices; this is just an indication based on an average price we came up with. The workload initially cost $78 per month to run, and after optimization it's $21, so the impact is more than $50 per month, and that's only for this one particular workload. On this cluster I basically only have two applications running, with 19 workloads. So imagine saving $50 per workload when you have 90 of them; that adds up. If you have 1,000 of them, or far more, the impact is obviously going to be really, really high.

Now I'll drill down a little. There's a warning here that the recommendations we're currently giving are preliminary, and that's because we have not collected seven days of metrics yet. Machine learning likes to have a big data set to base its decisions on, and this agent was only installed six days and a couple of hours ago, so it's still not at seven days. Tomorrow this should disappear, and it should say these are recommendations we actually want you to implement. Now let me go directly to the recommendation details here.
So what I can show you: this particular deployment has only one pod in it, so you'll only see one, but if you have multiple pods in the same... Sorry for that, I'm not sure what happened; I apparently got disconnected while talking about the recommendations. Let me pick up where I was and go back a little in a second.

What I was saying is that not all recommendations are going to be green in the UI; as you can see, there's one that's red here. That doesn't mean it's negative; it just means the application is actually underprovisioned. As you can see here, the total request is one CPU core and four gigabytes of memory, and our recommendation is actually nearly two cores and four and a half gigabytes. So we're recommending to raise the number of CPU cores and also to raise the memory. Obviously that's not going to result in a cost reduction, but it is going to result in an application that runs better: better performance and more reliability.

What I was also showing is that if I disable these graphs in the detailed view, you can see the actual usage of the application, the blue part, and we also show you the current requests and limits, and these are actually the recommended ones. As you can see, they're pretty close to the actual usage of the resources, but they don't follow the pattern of the actual usage. The reason is that the recommendation interval is currently set to once a day, so we need to make sure we can accommodate the peaks during that day. If you set the recommendation frequency to once an hour, you'll see that the behavior becomes more aggressive and the recommendations actually follow the pattern of the actual usage. This graph is just for CPU; there's another one for memory that shows the same thing.

One thing I did mention is that you currently cannot apply, because the recommendations are less than seven days old. The machine learning needs a certain amount of data on the specific workload to come up with a good enough recommendation, or rather the best one, so we call the current ones preliminary: the agent was installed six days ago, so it still needs to collect some more data. As for the efficiency, and you can see this by hovering over it, we want it to be somewhere between 80 and 95%; that's ideal. The way we calculate CPU efficiency is basically the actual CPU usage divided by the request, which gives the number we base the efficiency rating on.
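For illustration, if the 12% efficiency shown earlier is this CPU ratio, then with three cores requested the implied actual usage would be about 0.36 cores; that usage figure is derived here, not quoted from the demo:

\[
\text{efficiency} \;=\; \frac{\text{actual CPU usage}}{\text{CPU request}} \;=\; \frac{0.36\ \text{cores}}{3.0\ \text{cores}} \;=\; 0.12 \;=\; 12\%
\]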
I think that's it; I wanted to keep this one pretty quick. One thing a lot of people ask is how we compare to other products in this market. What we tend to see is that a lot of products out there give you great visibility: they collect all these metrics, they show you what your applications cost, what your clusters cost, a lot of technical detail, which is really nice, but they don't give you actionability. What we try to do is give you the visibility you just saw, but also add the intelligence, so you can actually take action and no longer have to put in manual effort to end up with as lean a cluster as possible. In some cases we'll add resources, just to make sure the application runs really well. Everything is fully automated, basically a hands-off operation, which makes it ideal in large-scale Kubernetes environments, because there just aren't enough people with the right knowledge to do it manually. There are other tools in the open-source world that will give you recommendations; they might be as good as ours, depending on the tool, but there's usually a lot of scripting involved, and maintaining that at scale is pretty hard. And the other thing is that there are basically no other products out there that combine vertical scaling with horizontal scaling, so that's definitely something to take into consideration as well. I think I saw some questions in the chat earlier, so maybe we can have a look at those.

Benedikt Stemmildt: Do you still see them? Because of your disconnect, you probably don't, right? There's a question from Mino, asking if there's a predicted scale-up as well.

Niels Roetert: Yeah, that's the only one I still see in my chat, unfortunately. There is no predicted scale-up at this point in time; it's definitely something that is on our roadmap, so we're looking to introduce it. The cool thing is that since our product is based on machine learning, it's something we could add fairly easily; we've done it in the past, so the algorithm is already available, it's just something we want to fine-tune further. We've had a lot of questions about this, because a lot of companies have, for instance, a pretty high load on Monday morning, and maybe the next morning as well, so we should definitely be able to help with that. Any other questions that I don't see, or that maybe others are seeing?

Benedikt Stemmildt: I don't think so; I only see this one.

Niels Roetert: Okay.

Johanna: There's a new one, though. Do you see it? "Is there any instrumentation?"

Niels Roetert: No, at the moment we don't do that with Optimize Live. We are looking at supporting metrics beyond just CPU and memory, for instance. But since StormForge Optimize is a platform, it has multiple products: today we discussed Optimize Live, and we also have Optimize Pro. Pro is something you wouldn't run in production; it's something you run against a specific application, and with it you can come up with basically the ideal configuration for that application based on latency, throughput, and cost. So that's probably something you could realize with Pro.

Johanna: Oh, there's one more.

Niels Roetert: Christian, if you mean a pricing calculator for Optimize Live similar to what AWS has: there isn't one, but we have a fairly simple licensing structure. It's based on the number of cores the applications we need to optimize are running on, and obviously, if we do a good job, that number of cores should go down.
But that's something Johanna can answer if you want to reach out to her.

Johanna: Yes, happy to, anytime.

Niels Roetert: Any closing thoughts from you, Johanna? Just putting you on the spot.

Johanna: Some questions took about a minute to appear in the chat, so I'm going to give everyone a couple of seconds to post a last question. Obviously, we're happy to address any question that comes up after the webinar; you can email us anytime, and you'll receive emails with the slides and so on. So send us any and all questions. But if there are no more questions right now, then I thank you very, very much, Benedikt and Niels, for your presentations. It was really valuable; I think there's a lot to learn about how to reduce costs while staying on target, and there are a lot of great tools and great concepts out there, as we've seen. Thank you so much, and everyone else, thank you very much for joining. Have a nice rest of your day. Talk to you soon.

Benedikt Stemmildt: Yeah, thank you very much, Johanna. Thank you, everybody. See you soon. Bye bye.