Access the Video Interview Here: TigerGraph & The Graph Database
Hi, I'm Mike Matchett with Small World Big Data, and today we're gonna talk about one of my favorite topics, which is graph databases. I've got TigerGraph here, and we're gonna talk about their new graph database, although it's not that new, and what they do that's different from other graph databases you've heard about. And if you don't know anything about graph databases, now is the perfect time to start looking at them and figuring out what they can do for you. With that, let me introduce Todd Blaschka, who's the COO of TigerGraph. Welcome.
- Alright, so, first there are open source graph databases. There are other graph databases. Before you even get into what a graph database is, why did TigerGraph enter the database market? Because there's like 40 operational databases out there. Why did you guys do this?
- We entered the database market when we started the company in 2012. And when we surveyed the market, there was not a native graph database that supported massively parallel processing. So in the world of MPP, being able to handle large data sets very fast, from data ingestion to data access, there was no other graph database on the market that could support what we felt was going to be a big data revolution. And now we're seeing a big data revolution. And so we launched the company publicly just about a year ago. And it took 5 years for us to develop this technology.
- Right, because a graph database is kind of an interesting concept, but it's not at all what relational databases are. So let's just step back a little bit, for those people who aren't familiar with graph. And, in a thumbnail, talk about what a graph database does. You can give us your impression, but I have to say I think of that Six Degrees of Kevin Bacon game, the social game we all play, where it's a friend of a friend of a friend of a friend. And if you try and do that kind of data analysis in a relational database, it'll choke, at almost any size. But for a graph database, that's a natural, native kind of question to ask. "What friends do I have within six degrees of separation?" would be something a graph database could usually handle pretty well. How would you sort of describe it for enterprises? What do they get out of it?
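The friend-of-a-friend question above can be sketched as a breadth-first traversal that stops after a fixed number of hops. This is only an illustrative sketch in plain Python, not TigerGraph's query language; the `within_hops` function and the `friends` data are hypothetical names made up for the example.

```python
from collections import deque

def within_hops(graph, start, max_hops):
    """Return {person: hop_count} for everyone reachable from start in at most max_hops."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        # Do not expand past the hop limit.
        if distances[person] == max_hops:
            continue
        for friend in graph.get(person, ()):
            if friend not in distances:
                distances[friend] = distances[person] + 1
                queue.append(friend)
    del distances[start]  # the starting person is not their own friend
    return distances

# Hypothetical friendship data: a simple chain ann - bob - cat - dan - eve.
friends = {
    "ann": ["bob"],
    "bob": ["ann", "cat"],
    "cat": ["bob", "dan"],
    "dan": ["cat", "eve"],
    "eve": ["dan"],
}

print(within_hops(friends, "ann", 3))  # {'bob': 1, 'cat': 2, 'dan': 3}
```

Each extra hop here is one more step along stored connections, whereas in a relational schema each hop would typically be another self-join on a friendship table.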
- Yeah, I think for enterprises, all the users are used to using technologies like LinkedIn, Facebook, Alipay. When you look at the valuation of these companies, we're looking at trillions of dollars. All those companies have one thing in common: they're graph database companies. They're using this to find the connectedness and the meaning of the connected data in order to drive revenue. So when you think about that, how can enterprises apply this technology, which those companies developed by themselves as proprietary technology, and leverage it for their own business value and business gains? And that's where graph is coming into the market with enterprises, because they have data that's been locked in silos or that they've brought into data lakes, and now they're asking the questions, "How do I bring this data together? How do I bring together this data in a meaningful way, so that I can get deeper insights that I couldn't see, two, three, four degrees of separation out, and also drive much deeper business outcomes?" That can be cross-sell/up-sell opportunities, because you know your customer better. It could also be from an anti-fraud point of view. These are things that companies have already been working on and have invested millions and millions of dollars in, across the board. But they're now looking at, "Well, what new technologies can help us get this deeper insight and better outcomes from the data that we have?"
- And go more directly at it. As you said that, I was thinking, "Well, Google, with their PageRank, is really a big graph database company. Facebook is a big social graph database company." I mean, everybody that's done something new is really talking about "how do I look at the chain of relationships between things?"
- And then, as you were saying that, I was also thinking, "Look, if I created a business app in the last 10, 15 years, I do a query here to this database and I pull in some relationship. And I do a query there, for some other relationship. And if I'm really clever, I go to a third database and pull in a relationship. And in my app, I kind of hard-code those things together and give it back to the user." But a graph database could connect all those things together. And when you said "big data", I start thinking of, like, "Well, in big data, you know, people are building data lakes and stuff now." So do you guys enter into that spot? What's your play there?
- Absolutely. I mean, within the database area, regarding data lakes, this is a natural fit for graph. We have seen over the last five years everybody working to bring all this data in. And what are they hoping for? What is the connectedness? What are the better outcomes? And so they have been struggling with that, being able to understand what they can do with that data. So we're seeing customers where they want to be able to drive better value out of their data lakes. Because now they've got the data in, and graph can augment what they're doing within those data lakes. So, for example, if you have customer data, it can be used for a better customer 360 view, where you can apply contextual information. You know, basic details, spending habits, churn scores. All these different things that you want to better understand. It can be applied for better servicing. So you want to be able to do better network analysis. You know, the data is being stored in the data lakes, and we can bring it out into TigerGraph, as a graph database, be able to run this kind of connectedness analytics, be able to do the link analysis, and then that could continue to go on through the data pipeline to support new insights that they have not been able to get out of their data today.
- It's almost like you take a data swamp, and say, "Look, the value in there is really not that you have all the data in one cluster. The value's in I've got all this data, how is it connected to each other and how do I get the insight out of it and explore those connections?" Right? Exploit that.
- Right, and you brought up a great point, Mike, and that was when you gave the example of trying to connect things together. When you think about the world of relational databases, you have many joins and many tables. That is a very tough thing to do when you're looking at large data. When you have a customer that, they're trying to look at it, "Well, how's my customer being spent in terms of my commercial banking unit, my investment banking unit, and also my retail banking unit?" Today, those are three different silos of data. And you just want to be able to know who the customer is and what all my relationships with them are, because then I can provide better services. And probably upsell them, with new offerings, more tailored to their needs. Bringing all that data together is a native graph problem. And people and companies have tried to do this with relational. They've tried this with NoSQL. Now they're seeing that graph is the natural way to be able to find the analytics. To be able to support these kinds of initiatives.
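As a rough illustration of the three-silo banking example, the sketch below links records from separate systems into one customer-centered graph. Everything in it is an assumption made for illustration: the record layouts, the shared customer id, and the `build_customer_graph` helper are hypothetical, and real silos rarely share a clean key.

```python
from collections import defaultdict

# Hypothetical rows from three separate systems: (customer_id, account).
commercial = [("c1", "loan-17")]
investment = [("c1", "portfolio-4"), ("c2", "portfolio-9")]
retail = [("c2", "checking-88"), ("c1", "card-3")]

def build_customer_graph(**silos):
    """Map each customer to a list of (business_unit, account) edges."""
    graph = defaultdict(list)
    for unit, rows in silos.items():
        for customer_id, account in rows:
            graph[customer_id].append((unit, account))
    return dict(graph)

graph = build_customer_graph(
    commercial=commercial, investment=investment, retail=retail
)

# One lookup now answers "what are all my relationships with this customer?"
print(graph["c1"])
# [('commercial', 'loan-17'), ('investment', 'portfolio-4'), ('retail', 'card-3')]
```

Once the edges hang off the customer, the cross-unit view is a single lookup rather than a join across three tables at query time.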
- And... There have been other graph solutions in the market. But you guys said you're aiming for that scaleable, high volume of data thing. Tell me a little bit quickly here about, when you do that, when you create the solution, what does your resulting performance look like? And then how do you harden this for enterprise use? Because those are really some of the key nuts you gotta crack, right?
- Absolutely. Absolutely. When you look at it from these enterprise requirements, data has grown substantially since the first graph database players came on the market. The legacy products, generation one, generation two, they're designed to run on one machine or to provide some distributed storage, but they don't have the capability to run as a distributed graph database. That is what's unique to TigerGraph. We're the only player that can do that, in this space. So when you're looking at large data sets, they want to distribute, you know, three, four, five, 10, 20 terabytes across many machines and have it operate as one graph database. That's what TigerGraph can bring. The second thing is, based on our MPP technology, we're able to not only load the data in very, very, very fast, where we can handle, you know, 150 gigs per machine, per hour.
- Wow.
- But we can also support this with continuous updating of the data. So this is not a one-time batch. So when you think about applications that require real-time updates, mutable graphs, those are areas where what the enterprises are really looking for is "What's changing with my data?" And then "Who has access to that data? How can I use this as part of my data pipeline?" Who's able to look at, you know,... We have one big graph and it may have customer information, it may have product selling information. But different people have different access to that. So what we have done is not only provide the connectivity to support enterprise security requirements, such as Single Sign-On or LDAP and Active Directory support, but we also provide within it very fine-grained security that controls access. So one group is able to see the part of the graph they're allowed to see. Another group is able to see the part of the graph they're allowed to see. And this is basically not only seeing the graph, but the results of the queries and what the outcomes are. They're only able to see what they're allowed to see, and then we're able to provide the tracking and auditability behind that to verify that, if you're working with sensitive data, only the right people have access and can see that sensitive data.
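The idea that each group sees only its permitted part of the graph can be illustrated with a small filter over vertex types. This is a toy sketch under stated assumptions, not TigerGraph's security model: the `VISIBLE_TYPES` policy, the group names, and the `visible_subgraph` function are all hypothetical.

```python
# Hypothetical policy: which vertex types each group is allowed to see.
VISIBLE_TYPES = {
    "sales": {"customer", "product"},
    "fraud": {"customer", "account", "transaction"},
}

def visible_subgraph(vertices, edges, group):
    """Return the vertices and edges a group may see.

    vertices: {vertex_id: vertex_type}; edges: list of (source, target).
    An edge is kept only if both of its endpoints are visible.
    """
    allowed = VISIBLE_TYPES.get(group, set())  # unknown groups see nothing
    kept_vertices = {v: t for v, t in vertices.items() if t in allowed}
    kept_edges = [
        (s, d) for s, d in edges if s in kept_vertices and d in kept_vertices
    ]
    return kept_vertices, kept_edges

vertices = {"c1": "customer", "p1": "product", "a1": "account", "t1": "transaction"}
edges = [("c1", "p1"), ("c1", "a1"), ("a1", "t1")]

print(visible_subgraph(vertices, edges, "sales"))
# ({'c1': 'customer', 'p1': 'product'}, [('c1', 'p1')])
```

Running queries against the filtered view rather than the full graph is one simple way to keep query results, and not just the raw graph, within what a group is permitted to see.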
- I mean, those are all those enterprise-ility things and you guys have been tackling that. So just in the last few seconds here, how does someone get started with TigerGraph? Do they have to take off a big bite? Is there a free community edition? I think you mentioned there was a developer edition. What would the first step be to sort of get their hands on TigerGraph?
- Yeah, the first step we recommend is going to our website and downloading the developer edition. It's a free version of the product and we have built it out so that a customer can download it and get to running an application in under an hour. So this is from installation, and we have some pre-set data sets, pre-set queries in there. So someone can get the value and see the value of graph in under an hour. That is where I'd recommend someone start. Then their mind will start realizing, "What are the potentials? What can I do, because now I can understand the relationships in the data? I can run queries, such as shortest path, cycle detection, common graph algorithms, against my data to see what the outcomes are, and then be able to build new algorithms right out of the box, very quickly."
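Shortest path and cycle detection, two of the common graph algorithms mentioned above, can be sketched in a few lines of plain Python. This is a minimal illustration on a hypothetical `payments` graph, not the implementation shipped with any product: the shortest path here counts hops with breadth-first search, and the cycle check uses a depth-first search on a directed graph.

```python
from collections import deque

def shortest_path(graph, source, target):
    """Breadth-first search; returns the fewest-hops path as a list, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk parent links back to the source
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None

def has_cycle(graph):
    """Depth-first search on a directed graph; True if any cycle exists."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}

    def visit(node):
        state[node] = IN_PROGRESS
        for neighbor in graph.get(node, ()):
            status = state.get(neighbor, UNSEEN)
            if status == IN_PROGRESS:  # back edge: we looped onto our own path
                return True
            if status == UNSEEN and visit(neighbor):
                return True
        state[node] = DONE
        return False

    return any(state.get(n, UNSEEN) == UNSEEN and visit(n) for n in list(graph))

# Hypothetical directed graph: money flows a -> b -> d -> a, which is a loop.
payments = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["a"]}

print(shortest_path(payments, "a", "d"))  # ['a', 'b', 'd']
print(has_cycle(payments))                # True
```

The recursive cycle check is fine for small examples like this one; very deep graphs would need an iterative version to stay within Python's recursion limit.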
- And again, we gotta end this, but the data set you start with doesn't have to be a data lake, right?
- No.
- You get value out of looking in a graph way at some, actually some very small data sets, it turns out. That are highly interconnected.
- Absolutely. It can be a couple of gigs of data, but if you have, think about if you have data with many, many rows, and 50 to 100 columns, and you're trying to interconnect the data, this is a natural graph thing. So you can start with a small data set and drive tremendous value, and then look at how I want to expand that to other use cases within the organization.
- I know my head is already expanding. We gotta go. Todd, thank you very much for being here, and hopefully we'll get to some more deep dives on graph databases as we go along. Explore some use cases and some of the technologies powering this. But thank you for being here today.
- Thank you for having me. Thank you, Mike.
- Thanks. And this is Mike Matchett for Small World Big Data. Stay tuned and we'll definitely be back with some more stuff on NoSQL and graph databases. Thanks. [Music]