About this talk
Who really owns your data in the cloud? Put another way, who’s responsible for it if something goes wrong? If it’s hosted by a SaaS provider, you might be surprised to learn that the “shared responsibility” model will leave you in the lurch! Data ownership is an increasingly hot topic these days, and not just for the usual reasons of governance, security, and business continuity – though all of those remain critical drivers.
As the Information Economy expands, businesses are realizing that traditional data management practices simply don’t scale fast enough, or cost-effectively, to handle rapidly evolving business needs. Watch this episode of InsideAnalysis to hear host Eric Kavanagh interview data visionary Joe Gaska from GRAX and Yves Mulkers of 7wData. They discuss a new approach to data management that puts power back into the hands of the business, enabling companies to maximize data value, while minimizing risk and expense.
[MUSIC PLAYING] ANNOUNCER: The information economy has arrived. The world is teeming with innovation as new business models reinvent every industry. Every industry. Inside Analysis is your source of information and insight about how to make the most of this exciting new era.
Learn more at insideanalysis.com. Insideanalysis.com. And now, here's your host, Eric Kavanagh.
- Oh yeah, folks. That's right. Welcome to the future. "The future is here already", so said William Gibson, "it's just not evenly distributed". Folks, we're back once again on the show that is coast to coast here in America, all about the information economy. It's Inside Analysis, your host, Eric Kavanagh.
And I'm so excited today to dive into one of the most critical parts of any information strategy: your data. And specifically, do you rent your data? That was the title of the show today. Why Rent When You Can Own? It might sound a bit silly, but data ownership is a really important topic these days, especially with compliance issues coming down the pike.
We have CCPA out of California, of course, the California Consumer Privacy Act. GDPR kicked this all off, the General Data Protection Regulation coming out of the EU. That really impacts EU citizens, at least technically.
But I can tell you, lots of large organizations are viewing it as the de facto standard for business and for data. So whether or not you're an EU citizen, you're probably being impacted by some response to GDPR, even if you don't have the right to tell Google you want your information taken away. Because that's a big part of it, the right to be forgotten is what they call it.
And basically it says that if you're an EU citizen and you want xyz corp to forget about you, then you can reach out to the company and say, hey, forget about me, get rid of my data. And they're supposed to do that. And if they don't do that, you can complain, and fines and all kinds of bad stuff can happen.
And that's already begun. There are already fines being levied by the EU on companies that are in violation of GDPR. But that's just one straw in the wind, right? There are lots of other things to consider here.
We have a couple of great guests. Joe Gaska of GRAX is going to join us. And my buddy Yves Mulkers from all the way across the pond in Belgium is online as well. And I'll just say a couple of quick things before we bring in our experts today.
One is the concept of data gravity. I know that might sound a bit silly, but data really does have gravity, metaphorically speaking, in that it's hard to move. It actually takes time, it takes resource, it takes effort, it takes money to move data.
And historically, we've moved data with something called ETL, extract, transform, load. That was the old way of doing it, and it's still the way of doing it for many companies. It's probably never going to go away.
It's a very useful mechanism for extracting data from a source system, transforming it into some format that a target system can accept, and then loading it into that target system. There's also ELT, extract, load, transform. That came out of a company called Sunopsis, which got bought by Oracle. I know a bunch of the folks from over there, too.
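To make the ETL versus ELT distinction above concrete, here is a minimal sketch in Python. The rows, field names, and functions are all hypothetical and only for illustration; no real integration tool works exactly like this. The point it tries to show is the ordering: ETL keeps only the transformed rows, while ELT lands the raw rows first and transforms them later.

```python
# Hypothetical source rows; the field names are illustrative assumptions.
source_rows = [
    {"name": " Ada Lovelace ", "amount": "120.50", "clicked_at": "09:14"},
    {"name": "Alan Turing", "amount": "80.00", "clicked_at": "22:41"},
]

def transform(row):
    """Reshape a raw row into the narrow format the target system accepts."""
    return {"name": row["name"].strip(), "amount": float(row["amount"])}

def etl(rows):
    """Extract, transform, then load: only the transformed rows are kept."""
    target = []
    for row in rows:                    # extract
        target.append(transform(row))   # transform, then load
    return target

def elt(rows):
    """Extract, load the raw rows as-is, then transform inside the target."""
    raw_store = [dict(row) for row in rows]        # extract + load, untouched
    view = [transform(row) for row in raw_store]   # transform later, on demand
    return raw_store, view

warehouse = etl(source_rows)
raw_store, view = elt(source_rows)

print(warehouse == view)              # both paths give the same curated rows
print("clicked_at" in raw_store[0])   # ELT still holds the raw context
print("clicked_at" in warehouse[0])   # ETL stripped it out on the way in
```

In this toy version the curated output is identical either way; what differs is whether the original context survives in the target.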
These are different ways to move data. These days, there are tremendously powerful technologies for gathering data, for ingesting data at scale. And that's one of the watchwords for the modern information world is that we have these scale-out architectures now that can just grab tons of data, but you have to be careful about what you're doing with it, where it's going, et cetera.
Well, what I've seen in the last couple of years that's very, very interesting-- and GRAX, I think, is right at the forefront of this whole transformation-- is that one part of the business landscape in data is what they call continuity, business continuity, failover, disaster recovery, all that kind of stuff. What happens if something bad occurs to your data?
In the old days, it would be that a drive failed, for example. Well, did you back it up? When's the last time you backed it up? A lot of people know Time Machine in the Apple, the Macintosh world. But most people who know what backup and restore is really about know that it tends to be very slow, and thus, kind of painful.
So it's a must have, but it's a, oh, do I have to go back up my data? And what frequency do you want to back up your data at? Like, every second, every minute, every hour? Well, traditionally, big companies have been like, well, at midnight we'll do a quick snapshot. That'll be good enough for government work, as they joke.
But what we're seeing now is really that whole space is reinventing itself. And one of the reasons why is because of the cloud. So if you use Salesforce, for example, one of the lessons people learn when they're leveraging Salesforce is that the data is kind of sticky. It gets in there in all sorts of different places, and trying to extract it actually becomes a bit of a challenge.
I can tell you we use a lot of email marketing technologies. They're very useful. Email is still king in the enterprise communication world, I promise you. Social is great. It's interesting. But they're always changing the rules, aren't they? They're always kind of pulling the rug out from underneath you in the social world.
So email is still king. There's so much cool data in our email marketing platforms that I really can't get access to because it's kind of a pain. You almost have to write scripts and things to pull it out. And that's hard to do.
So what we're seeing now is-- and there's the last concept I want to throw out there is egress. A lot of cloud storage vendors will be very nice when you're putting your data in their cloud, but if you want to take your data out of their cloud, that costs an extra chunk of money. That's called egress.
I always joke about Hotel California. You can check out anytime you like, but you can never leave, at least not without paying a big price for your data. And it's your data. So that was the whole point, as I was talking to the folks from GRAX, is, if it's your data but it's in the cloud, it's at some software as a service cloud provider, well, who technically owns that data?
And there is this thing called shared responsibility in the cloud world, which is classic newspeak. When I first heard about that in the security context I was like, shared responsibility? What does that mean?
And the experts said, basically it means you are responsible. The big cloud provider's like, yeah, yeah, shared responsibility means you are responsible for everything in your environment. So you have to pay attention to it, watch it, et cetera, keep tabs, do whatever you have to do, and back your data up.
Well, Joe Gaska from GRAX came up with a very interesting way of leveraging that technology. And in our next couple of shows, we'll talk about some of the forward-looking stuff like using it for other access patterns for other applications, not just a backup that sits in cold storage [AUDIO OUT] live version, a movie version of your data that allows you to go to any point in time and figure out what was happening then. So with that very long winded introduction, let me bring in Joe Gaska of Grax. Joe, tell us a bit about yourself and where you came up with this idea and what it is that you folks have built.
- Thanks Eric, I appreciate it. So a little bit about myself. I've been a serial entrepreneur and the idea for Grax really originated, I'd say, back in early 2017. So I really started thinking about historical data and the value for questions that people wanted to ask and really started to think about, what data is locked away? Like you said, for example, Salesforce.
So when I pay Salesforce a subscription, I put my data very easily into their cloud and then if I don't pay, they're renting me access back to that data itself. And it's not just a tactical obligation of backup, archive, restore, but how do I make sure that I'm protected for my future, that I have all that history captured somewhere? Right?
So the idea for GRAX really started back when the technology to build this architecture was not really possible, back around 2017. So when we launched GRAX and really built it, we said, well, we're going to capture every version for our customers, every version of data, and store it in the rawest, purest form that we possibly can. We don't want to transform it, we don't want to lose any of the fidelity of the data itself, and we want to make sure our customers have the highest optionality of what they can do with the data itself.
So when we launched GRAX, that was our biggest piece that we wanted to just stick by: protect our customers' best interest, not only by making sure they can't lose any of the data and they're mitigating risks, but also that they own the data forever. When I say that, we don't lock it in our cloud and then rent access that they pay a subscription fee for. We truly put it in their cloud that they own forever, and I can go into exhaustive detail about that, but I don't want to bore everybody with the--
- Yeah, but you bring up such a good point already, right? And I'm glad you mentioned this whole concept of high fidelity because when we bring in Yves later in the show, he's a data warehousing expert. And I always try to remind folks of the whole how-we-got-here story. And if you think about ETL and why it became dominant in the data warehousing world, it's because back in the day when we started data warehousing, processors were slow, storage was expensive, pipes were thin, and so we had to strip out all kinds of context to funnel in that critical data, which would be name, address, maybe the transaction details, et cetera.
But now in the era of big data, when we have fast processors, cheap storage, fat pipes, all these interesting things going on, AI and ML, if you preserve the fidelity of that data as you described, if you preserve all the context to it, then somewhere down the line, someone, some data scientist or some business person can go, hey, you know what? I want to better understand this aspect of our customer experience, and they have the data to do that, right Joe?
- Absolutely. I mean, as you alluded to, Moore's law is just relentless. So as computing gets faster and cheaper, so does the thirst of what data they want to look at. And we all know data creates data. So the difference between two points in time creates a data point, the velocity of change between two opportunity values.
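The idea that "data creates data" can be sketched in a few lines of Python. This is only an illustration with made-up numbers: the dates, the opportunity values, and the `velocities` helper are all hypothetical. Given successive versions of one opportunity's value, each pair of versions yields a new derived data point, the rate of change per day.

```python
from datetime import date

# Hypothetical captured versions of a single opportunity's value.
versions = [
    (date(2021, 1, 1), 10_000.0),
    (date(2021, 1, 11), 12_500.0),
    (date(2021, 1, 31), 11_500.0),
]

def velocities(history):
    """Derive change-per-day between each consecutive pair of versions."""
    derived = []
    for (d0, v0), (d1, v1) in zip(history, history[1:]):
        days = (d1 - d0).days
        derived.append((d1, (v1 - v0) / days))
    return derived

derived = velocities(versions)
for when, per_day in derived:
    print(when.isoformat(), per_day)
```

Note that the derived points only exist if the intermediate versions were kept; with just the latest value there is nothing to difference.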
So as the relentless progression of Moore's law just continues to drive this, people's expectation for the highest fidelity, the highest frequency of data, it's no longer just, capture these select data sets. As you said, everyone wants all of it. Give me all of it now. The one thing I've always said to a lot of our customers: if you select the data now, if you selectively choose, you're simply making a business decision that in the future you don't need that data, and there's no going back ever.
Protecting that optionality in the future, whether it be analytics, whether it be AI, whether it be machine learning, all of that is fed from historical data, right? So making this selection now that I want to hit the record, if you think of Grax as a black box for your business, capturing everything. If you want to look back at what's important-- it's always interesting, we were talking before the show started, you never know how significant something is until the future.
So you might have something that happens today, and the significance of that-- it might be an inflection point in your business, it might be a decrease in the velocity of your sales. That significance doesn't show up until you go back and take a look at it in retrospect. So it becomes super interesting, and I'm sure Yves knows, with his expertise in the data warehouse, that's what everybody's trying to look for: that nugget of gold.
- Yeah, that's exactly right and I'll just throw one other question to bring into Yves before the end of the segment here. When you talk about persisting it in the customer's cloud, that's what really got me in our conversations and that's why I came up with this concept Why Rent Your Own Data? Because it just seems silly once you put it in context. You're like, wait a minute, that's actually a good point.
Why do I want to rent my own data? It's my data that I created and I'm paying to store, but now I have to pay more to get it back? I mean, technically in a traditional on-prem data center, you still have to keep the lights on, you have to pay for the energy, you've got to pay for all the staff, et cetera, so it's not a zero cost game. But the point is with these major cloud storage companies, if you try to pull that out, a lot of them will tack that egress fee on top of it.
So it's one of these, and TCO has always been hard to understand-- total cost of ownership. It's a common term in the enterprise software world. TCO is always, always difficult to really ascertain and to pin down, but there's really no confusion about you owning your data, it being persisted in your data center as opposed to living in some other cloud vendor who might change their prices next week or next month or their terms of service or whatever, right Joe?
- Absolutely, and if you really think about it, they want to bet that they're going to lock you in. And as you said before, moving large amounts of data-- even if you think about it, terabytes is nothing now when you start getting into petabytes of data. Moving that amount of data becomes extremely cost prohibitive, never mind trying to move it and then report on it and do analytics and look at it. That becomes an absolute nightmare to manage and extremely costly.
So we store the data in the rawest, purest form in the customer's cloud itself. Not only that, one of the biggest things when we built the architecture was that we wanted to minimize the surface area of exposure for our customers, so their data actually never leaves the facilities that they own. This is critical.
So it goes from their Salesforce to their cloud of choice. It never goes into our cloud, it never passes through. So when you think of security and governance and geoaffinity and all these things that people start to think about, think of their cloud provider like Salesforce, their cloud of choice. Their data never goes into our environment and it's stored in their environment forever for the lifespan of their governance policies.
- Yeah, that's a big deal and it's a perfect segue to bring Yves in before the first break here. We've got about 3 minutes left, Yves. That's a big deal in the European Union these days, right? Where data goes, where it persists, where it stays. Companies have to think about that and make sure that it's in a particular region that is delineated by GDPR, right Yves?
- Yeah, indeed. That's something that I think keeps us in Europe a bit more awake on where to store it and how that goes about and being reluctant to store it in a US data center, so that's what you see happening. So it's great to hear from Joe as well that you have a way of storing every kind of version wherever you want to store it.
In your cloud or, very likely, an on-prem solution, as you like. And if you want to build some computation or overview or trends from the past, you don't need, in this example, to stick with Salesforce and do something in what may be a more expensive cloud; you can go to the cloud of your choice and have a more cost-effective way to build those types of solutions on your data, if I understand Joe correctly.
- Yeah, I think that's exactly right, Joe. And real quick, I mean, the key here is to give yourself, as an organization, the agility and the ability to move and pivot as you want to, right? To watch out for that "gotcha" of a big egress fee-- and not just a fee, but the amount of money it takes to move something.
You've got to map fields through all these kinds of things, make sure you do it correctly, otherwise you spend a whole lot of money for mistakes and then you're in even bigger trouble. So I think the key that you've done here is create this engine that will capture everything that's happening in whatever system you point it at in the cloud, of course like Salesforce and other such solutions, and allow you as the user to decide where it goes, at what frequency, and so forth. Right, Joe?
- 100% and it's, as you said, future optionality. If there's one thing that we all know as technologists, there's this constant cycle of new technology, whether it be Snowflake, MongoDB, Redshift, right? As technology continues to progress, so does the need for taking the history and shipping it to new endpoints.
So having it in its rawest, purest form allows you to have that optionality of whether it's data visualization, whether it's data warehousing, whether it's machine learning, whether it's artificial intelligence. All of those are massive consumers of historical data and having that data in its purest form is really where it becomes very interesting. It's not just backup, archive, and compliance, but it becomes engrained through the history, you know?
- Yeah that's exactly right. And just to add some color to that folks, what Joe's really referring to is the fact that in a field like machine learning and artificial intelligence and predictive analytics if you're trying to better understand your customer, what they're going to buy, what they're not going to buy, to know which communication to send them at what time, all these factors are important. And there could be what's called covariance between any two factors. Covariance basically means there's some predictive capability in this data point.
What time of day does this person log online? How often do they purchase something online? How often do they go to this website? These are all little details that, like I mentioned earlier, we used to strip out because the pipes were thin and the processors were slow.
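As a rough illustration of the covariance idea mentioned above, here is a small Python sketch. The two factors and their values are invented for the example. Sample covariance measures how two factors move together: a value far from zero hints that one may help predict the other, though on its own it does not prove it.

```python
def sample_covariance(xs, ys):
    """Sample covariance: average co-movement of xs and ys around their means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical per-customer details of the kind that used to be stripped out.
login_hour = [8, 9, 20, 21, 22]       # hour of day the customer logs on
purchases_per_month = [1, 1, 3, 4, 6]

cov = sample_covariance(login_hour, purchases_per_month)
print(cov)  # positive here: later logins go with more purchases in this toy data
```

If the login hour had never been captured, there would be no way to compute this after the fact.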
Well, all of that has changed folks and now what really needs to change is the mindset of the users of the business to figure out how to capture all this stuff and how to leverage it. But don't touch that dial, we'll be right back with Joe Gaska and Yves Mulkers talking all about your data. Why rent when you can own? You're listening to Inside Analysis.
RANDALL BOETTGER: Clear.
- All right. One down, three to go. Great job, guys. This is always a lot of fun.
- Yeah, I just love this stuff. The commercial break is in just under 4 minutes. So I figure I'll come back to you, Joe, to start and I'll throw it over to Yves. And Yves, if you want to ask a couple of questions of Joe as well, that would be great. But is there anything else that you want to be sure that we cover in the next segment, Joe?
- No, I mean, it's pretty much data ownership, getting down to, I think, historical data being about more than buying an insurance policy.
- So a lot of people think of it as backup, and I'm trying to have people rethink the idea: backup data is history.
ERIC KAVANAGH: Right.
- Right? History is the foundation of knowledge. All right?
ERIC KAVANAGH: That's interesting.
- How we derive that knowledge really comes from-- you have people that do visualization, you have machine learning, you have analytics. All the tools that Yves is building are really for extracting the knowledge out of that history.
JOE GASKA: That's really where it becomes a very interesting piece. The backup, people really think of it as a boring, like, aw, I don't want to buy an insurance policy. But it's such a wealth of knowledge that's just sitting there untapped and that's really what I'm trying to get people to realize is that historical data, what you want to do with it five years from now, we haven't even dreamed what people are going to want.
- There's going to be questions. So anything of that, it's really about getting people to back up data as history, take ownership of it, don't lock it away in-- all of our competitors basically download their data, put it in their cloud, then rent access to their history. Now there's a couple things that break down and we can kind of go into it at some point. Trying to use an API to get historical data out is actually not only cost prohibitive, it's absolutely ridiculous trying to take a terabyte or petabyte of data and put it into a data warehouse through an API. Just doesn't work.
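A back-of-the-envelope sketch of why pulling bulk history through a rate-limited API gets painful at scale. Every number below is a hypothetical assumption chosen for the arithmetic, not any vendor's actual limits: the record size, the batch size per call, and the daily call quota are all made up.

```python
def ceil_div(a, b):
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)

def days_to_export(total_bytes, bytes_per_record, records_per_call, calls_per_day):
    """Estimate days needed to pull a data set through a quota-limited API."""
    records = ceil_div(total_bytes, bytes_per_record)
    calls = ceil_div(records, records_per_call)
    return ceil_div(calls, calls_per_day)

TERABYTE = 10**12
PETABYTE = 10**15

# Hypothetical assumptions: 2,000-byte records, 2,000 records per call,
# and a quota of 100,000 calls per day.
tb_days = days_to_export(TERABYTE, 2_000, 2_000, 100_000)
pb_days = days_to_export(PETABYTE, 2_000, 2_000, 100_000)

print(tb_days)  # days for one terabyte under these assumptions
print(pb_days)  # days for one petabyte under these assumptions
```

Under these made-up figures a terabyte takes a few days and a petabyte takes years, before counting retries, competing API consumers, or the cost of the calls themselves.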
- Yeah. Tell you what, talk about a topic for a future show. APIs, like hitting limits on API requests and how these APIs actually work, like REST. It's funny, I've always wondered to myself and I don't know the answer to this question. But I remember reading that back in the day the W3C came out and said, hey, it should be SOAP. Remember WSDL and all that kind of stuff?
They said it should be SOAP, not REST. Everyone's like, yeah, that's great. We're going to go REST. Bye. They went off and did it their own way, and now everybody does REST.
I mean, you can still do SOAP, Simple Object Access Protocol, right? You can build a system around that, but not many people do these days and they use REST. But with REST APIs, I mean, you obviously have to give authorization to certain users to hit that API, but then whoever is controlling it can shut that off if they need to. If there's too much traffic or if they don't like you, they could just shut you down, right?
I remember in the early days of Twitter, a lot of the companies that came along and built on top of Twitter got very frustrated because all of a sudden one day they just cut them off from the API. Or they'd say, oh, you have to do this, and then you're like, well, what do I have to do? Boom, cut off. And that's not exactly an ecosystem-friendly approach to take, right?
- No, and change data capture--
ERIC KAVANAGH: --about 30 seconds.
- Change data capture's even worse.
- I'm reading a comment from one of the attendees. We have 40 years of crash data in our crash analysis system. Wow. Someone in the Ministry of Transport has offered to add it with data he has in a different format for 1970 to 1979. Wow. That's cool. All right. We'll be right-- here we go. Stand by.
- Learn more at lls.org
RANDALL BOETTGER: Welcome back to Inside Analysis. Here's your host, Eric Kavanagh.
- All right ladies and gentlemen, back here on Inside Analysis talking all about the future and the past, right? Those who don't learn from the past are doomed to repeat it. That's a classic line. One of the more interesting ones I've heard too is that history may not repeat itself, but it rhymes. That's an interesting one. So it does pay to pay attention.
And Joe Gaska from Grax is on the line and my buddy Yves Mulkers from 7wData as well. And Joe, you had a great line in that opening segment where you said, you never know how significant something is until the future. And I actually have a theory about this and you see this playing out in movies in fact, sometimes, which is kind of interesting.
But when there is something happening in your life or in your business and you get this sense, wait a minute. This is important. I feel like these additional recorders kind of spin up in your brain like, whoosh. You start just soaking in everything you possibly can. Like let's say someone's trying to trick you into something or whatever, there's something weird going on, and all these recorders go on.
And I think that's going to be the future of AI too. It'll sense that something strange is happening and the cameras will speed up the resolution, they'll do something different to capture more data. But your point was basically that you don't know for sure that something is significant until some time in the future when you can look back and go, wow, that's when things really changed.
And that's when you're going to want to have all the context around your data. You're not going to want to have some tiny, thin cross-section, for example, or core sampling. You're going to want to get access to the full context. And the way cloud systems are set up now, that would be really difficult to do.
I mean, because, again, I know a lot about these systems. I know a lot about APIs. Yves knows a lot about this stuff too. So what you've done is essentially created an engine that allows for the retention in your system, in your on-prem or in your cloud or wherever you want it to go, of all this contextual data, such that at some point in the future you can go back and really piece that all back together, understand what happened, and then really be able to make some positive change to your business. Right, Joe?
- You're 100% right. So one of the biggest things, as you alluded to, is your perspective in the past and the significance. Right? We all know, especially with our memory, your memory can blur over time and being able to go back and really think about it is critical.
But one of the other crucial pieces to this puzzle, one thing that's critical that we say is, not only do we capture all the versions of the data, but we make sure we preserve what's called chain of custody of your data. To have trust in your data, you also have to make sure that we preserve the chain of custody of the data itself. So we make sure that it never gets manipulated, so you can keep trusting it. And there's a lot of different regulatory bodies that go with that.
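One generic way to make tampering with stored versions detectable is a hash chain, sketched below in Python. This is an assumption for illustration only: the conversation does not say how GRAX preserves chain of custody, and the record fields and function names here are hypothetical. Each stored version carries a hash that covers both its own content and the previous entry's hash, so altering any earlier version breaks verification.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def _digest(prev_hash, record):
    """Hash the previous entry's hash together with this record's content."""
    payload = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

def append_version(log, record):
    """Append a record version, linking it to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev_hash": prev_hash,
                "hash": _digest(prev_hash, record)})

def verify(log):
    """Recompute every link; return False if any entry was altered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True

log = []
append_version(log, {"id": "opp-1", "amount": 10000})
append_version(log, {"id": "opp-1", "amount": 12500})
ok_before = verify(log)

log[0]["record"]["amount"] = 99999  # simulate someone editing history
ok_after = verify(log)

print(ok_before, ok_after)
```

A real system would also need to protect the hashes themselves, for example by storing them separately or signing them; this sketch only shows the linking idea.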
But 100%, if you really think about your perceptions from the past and the significance of it in your business-- and there's a lot of different ways that people want to analyze that. You want to look at the data over time, whether it be inflection points, whether it be velocity changes, whether it be key losses of a customer, and analyzing and doing a retro on them becomes incredibly important. Not only can you trust the data at the highest frequency and fidelity, but you also have ownership forever. So it is owned by your business itself.
- Yeah. And that's important for all kinds of reasons, right? So one of which is, we talked about, the right to be forgotten. For example, GDPR compliance regulations around where that data is physically, and you can rest assured that they are going to come and rap people on the knuckles, man. That is absolutely going to happen. It's happening already.
So it's like in anything related to security. The fewer access points you enable, the better. Right? Chain of custody around data, all that stuff is important. And if it's in your [AUDIO OUT] lot more control. I mean, I'll share a very quick story and I'll throw it up and we'll bring Yves back in, but our business relies heavily upon email marketing.
So we do all these webinars. It's some sort of lead generation business model, as you know, and we really rely on those email platforms. And for one show I came up with a strategy for how to get the optimal live attendance, which is to wait nearly till the last minute to do a big push. And one of those platforms we used crashed and it was just down.
I was like, [GASP] oh, no. Luckily we have a backup, a whole separate system that I use for redundancy and I could pivot quickly and use that system to save the day, right? Because that stuff does happen.
Amazon has gone down, Microsoft Azure has gone down. Big, major players go down sometimes and if they do, again, what's your recourse? Not a whole heck of a lot, which is another selling point for the approach that you've put together here, right Joe?
- Absolutely. So customers can bring their own cloud. They can actually bring multiple clouds.
- BYOC. And you can actually have multiple regions in how you store your data, how you replicate your data. This is your choice. Once you have your history, how many replicas you store, where you store it, who do you grant access, all of those compliance-- it's your data. Choosing who you want to give access to, how many replicas you want, storing the highest fidelity, highest frequency that we possibly can, that's the key point. You're 100% right.
And having that optionality of what you wanted to do with your data, you made that selection, you made that choice. That's exactly why we built GRAX. Giving everybody that optionality is absolutely critical for us, and it's something I'm extremely passionate about. And obviously, it's my belief that backup data is more than just a tactical obligation, that it takes on the historical significance that we spoke of, and having that optionality to go and look at that, it's our customers' choice.
And that's really what we wanted to do is make sure we protect that and we're not going to hold them hostage and rent access to their history from our cloud. It's theirs, so they own it. Every single data point never leaves their environment. They own it forever, full chain of custody.
- Yeah, ownership. And it's interesting, I'll throw it over to Yves to comment on this. But we did a couple of webinars a couple years ago now all around data ownership and I was curious to see how well that would pull, because you never know until you try. You're a marketer like me so you know sometimes you've got to [INAUDIBLE] the best idea. You throw it out there and nobody cares. Sometimes like, oh, is this a [INAUDIBLE]? Everybody loves it. But data ownership to me is important and it's going to be even more important, especially in this cloud connected world, right? What do you think about that, Yves?
- Yeah. I think it's a nice approach as well where you say, I store every version of your data in its purest form, where at any point in time you can rebuild that history point. Whereas if you were in a typical data warehouse, you had to make a selection on the granularity level where you say, OK. For this entity it will be on that level, on the other one it will be unaggregated, because due to storage capacity, due to cost, due to speed and frequency, you had to make that choice.
And if I'm following what GRAX is offering, you just turn it on and you have any level of detail at any point in time and you can rebuild that. And we touched upon ownership, we touched upon compliance, and that's a very strong thing, because now, as you look at compliance, you have to build in all that logging and you have to put something else on top of your traditional systems to keep track of that. So I think it's very powerful.
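Rebuilding the state "at any point in time" from stored versions can be sketched as an as-of lookup. This is a minimal Python illustration under stated assumptions: the record IDs, timestamps, and fields are hypothetical, and the version lists are assumed to be sorted by timestamp. For each record, it picks the latest version captured at or before the requested time.

```python
from bisect import bisect_right

# Hypothetical version history: record id -> [(timestamp, snapshot), ...],
# with each list sorted by ascending timestamp.
history = {
    "acct-1": [(1, {"tier": "free"}), (5, {"tier": "pro"}), (9, {"tier": "enterprise"})],
    "acct-2": [(3, {"tier": "free"})],
}

def as_of(history, t):
    """Rebuild the state of every record as it stood at time t."""
    state = {}
    for record_id, versions in history.items():
        timestamps = [ts for ts, _ in versions]
        idx = bisect_right(timestamps, t)   # number of versions with ts <= t
        if idx:                             # record existed by time t
            state[record_id] = versions[idx - 1][1]
    return state

print(as_of(history, 2))
print(as_of(history, 6))
```

Because every version is kept at full detail, the granularity decision Yves describes can be deferred: aggregation becomes a query over the history rather than a choice made at load time.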
Just by a simple solution you can go so many ways on looking at your historical data, as well as on the level of compliance and the lineage and rebuilding that point in time. So I don't know how you look at that, Joe, and if that's something you can approach from a cost perspective, where now we're building, I don't know, what, 3, 4, 5 different systems to achieve the same kind of solution with your historical data.
- So yeah, if you think of some of the technologies out there, like the OSI model: as you move upwards, as you get higher up, the cost of storage increases. So we store the data in its lowest, purest form so your higher level computing and storage mechanisms-- like a data warehouse, like a relational database-- can then consume it, and you can change the data and reformat it any which way that you want. Right? So really starting to think about that data itself and how you want to consume it is really where we want to allow people-- and you know in the data warehouse space people want to change their shape of data. You want to change your schema. You want to change how it looks. Being able to refresh that data from its purest instance or back test it, that's all there. So really, really starting to think about that one.
- Yeah. And since you mentioned that, as I recall, one of the design components or the design points, I should say, for Snowflake was that very capacity. The ability to very quickly spin up a new data warehouse, right? Because to your point, in the old days of data warehousing-- I'm sure you've had some horror stories here-- you had to spend so much time really thinking about the questions that are going to be asked, thinking about what the ideal data model is going to be, thinking about what's the frequency of loading the data and who gets access to it because it was all so expensive and painful. And it would crystallize, right?
So 25 years ago if you talked about, let's just change the schema and do this, do that, people would throw you out of the room. Like, who is this guy? Get this guy out of here. You'd be a pariah. But I think one of the design points of Snowflake was to enable that kind of change. To say, look, we know that you can build a warehouse and realize, man, we really should have done it this way and not that way so that the business can get what they want.
And so they thought through that very thing and how they designed that system to be able to spin up quickly and spin down. And this all speaks to information strategy in general and understanding, what do you have as a client. What data systems do you have, how much budget do you have, what are all these considerations as you map out TCO. And I'll throw it over to Yves real quick to comment on this.
These days there are so many different ways that you can get the job done and what I see in this space that Joe, I think, is pioneering, is that it's a very positive, enabling technology that'll allow you to make decisions down the road. Because if you can change down the road, that's very encouraging to a business person, especially in what is now an almost post-COVID world. But what do you think about all that Yves?
- Yeah. I think it's an interesting point as well where we were discussing different technologies, and it goes at a very high speed that new technologies come on board and you have to restructure, you have to change your data layer, your data platform. So if you have that unique data platform with the data stored in there in its lowest form, you don't really have to remigrate to the new architecture, to the new infrastructure.
And I think that's a totally new approach, where I see in so many projects, OK, you can start up but we're still doing the migration. I hear it time and time again, and it's either, you first have to migrate the ERP system from one platform to the other one in a different structure. So if you built a model you can try out and on the fly we built it-- I've been looking into data catalog solutions where these guys built the data catalog on the fly.
So even the data model, this is in a virtual model and you spin off your system and you have that new insight, what you've built, at the time you wanted. You have your insight, you just turn it back down and you go on trying to find new insights and remodeling that. And I see a perfect, well, base platform with Grax that is allowing you to do more, so still have to think about that on how that would be put in place.
- Yeah and the cost of storage, of course, keeps going down, which is good news. But I think, Joe, you actually made an interesting point in the opening segment that as Moore's law speeds things up, so the appetite grows of the user. And there is something to be said for maintaining a sort of awareness in your organization of, how much is data going to be used?
You always have to think through these things, but the key is to have the ability and the wherewithal and the agility to be able to pivot and change course and do things differently over time. To me, that's one of the most important components of a sound information strategy is to build into your infrastructure, into your programs and your processes, the ability to change at some point in the future. Right, Joe?
- Absolutely. So there's a few different things. Data changes is a super interesting topic that we could go on forever. So one of the interesting things to think of data changing and schema changes, so Grax, no matter what, stores everything. If your schema changes and your data changes over time, not just from the source but destination, you have to make sure that you keep track of all of that and obviously, the mapping between there is an ETL and all those pieces.
Never ever throw any data away. We don't want to ever lose fidelity of data, whether it's, A, I want to archive data-- We have a webinar coming up on March 23 at 1:00 PM talking about archiving data because we see a lot of people throwing away data itself. And try to keep it forever. And you said Moore's law. It's a pretty interesting piece where you really think about,
I want every version of data. I want to do everything with it to protect your future. And you said people can iterate faster-- great, great example you gave about Snowflake. So as these technologies are becoming faster and quicker to spin up and do more things and iterate faster and experiment and make choices and prototype, this is going to do nothing but go faster. If you think back to the late '90s, early 2000s, we were calling up an IT person saying, spin me up an Oracle database. Then you wait three weeks, then you can do something.
But those days changed in the early 2000s with, OK, AWS. Wow, look, I can click and have elastic resourcing right there. All of those things. Now your data agility, as you spoke of, the expectation now is saying, spin this up. I want two years' worth of this data at the highest frequency over here so I can experiment with R and do some analytics on it and then throw it away when I'm done.
ERIC KAVANAGH: That's right.
- People have that--
- Yeah. No, that's exactly right folks and these are such cool topics to dive into. We got the next segment coming up in just a second. But yeah, you want to be able to spin it up and spin it back down again, right? That's the key.
Not just elasticity on the way up, but also very quickly on the way down once you find that kernel because you don't want to have to pay for all the compute, for all the storage, all that stuff. Just pay for what you need to, folks. Don't pay too much. We'll be right back. You're listening to Inside Analysis.
RANDALL BOETTGER: Clear.
- All right, two down. Those are the long ones. The next couple are pretty short. So we're back at 48. Is that right, Randall?
RANDALL BOETTGER: Yeah, that's right.
- OK, back at 48. This one's going to go by like a bullet train. It's basically 10 minutes and 50 seconds, or rather 58:50. And then stick around and right at the top of the hour we'll go for about eight minutes doing the podcast bonus segment. But you're teasing a lot of really cool angles to dive into.
And you know, capturing the important changes and just knowing what those are, that's critical stuff too, and responsiveness. That's the word I was trying to think of earlier. The more responsive you can be, the better.
I'm actually writing an article right now about real time ERP and I kept trying to think of, all right, what's a good topic to dive into? And I'm a cyclist, sort of. I'm not in nearly as good a shape as I used to be. But as you may know, the cycling world went out the window in fall of 2019 because of the tariffs on China, and then bam! COVID hits and it's like, whoa! Everyone in that business is just sideways these days because all that stuff happened.
So if you have a real time ERP, if you can really do some analysis and figure out, all right, well who are our best customers? How can we pivot our strategy to be able to cater to them? To maybe high end customers or something like that, that's how you're going to survive these days, right? Because we're still in such weird territory with regulations and with all these crazy things happening now. It's a bizarro world, but the clearer your memory is and your historical foundation, the better off you're going to be I have to think. Right?
- No, people now have so much technology now we're kind of increasing the amount of data. The expertise like Yves', having your expertise with data warehouse, it's like a unicorn with most companies, you have to realize. A lot of people don't have the expertise where people can come in and say, hey, this is your data warehouse and this is your data strategy and here's your history and how you put it together. There's a lot of companies out there that want to but don't know where to begin, don't even know how.
- That's right, call Yves.
- That's what he needs to do.
- Step number one, call Yves Mulkers.
- The typical question you get, I mean, how do we get started? Sometimes it's simple, say, just start building your pivot tables. Just build the insights because it's about the insights. But having those various versions and not needing to think about it, that's very powerful.
We're so used to building it in a traditional way and getting people out of this traditional way, that's a big challenge as well. Right? You need to explain, this is a new approach. Don't think about all the pure modeling and whatever. It's kind of on the instant, you can play in which direction you want to play with that.
It's really the business agility, building the data warehouse at the speed of business. That's what I try to achieve and that's why I'm looking at a lot of technologies out there, but it's pretty hard to convince companies. So I think for the next 300 years I still have a job to do in data management.
- --you're like, the latency could be a business opportunity. There's always business opportunity. All right, we got 20 seconds. Standby.
RANDALL BOETTGER: Welcome back to Inside Analysis. Here's your host, Eric Kavanagh.
- All right, folks. Back here on Inside Analysis talking all about data, your data. Why rent when you can own? We've got Joe Gaska from Grax on the line and Yves Mulkers of 7wData. And Joe, we were talking in the break and in the last segment too about the nature of changing data, right? So we got this concept in data warehousing, slowly changing dimensions and it's always fascinated me, this concept, in general.
And things do change, right? So this is why predictive models have to be updated. Again, speaking to the value of having your historical data: how did people act last year at this time, two years ago at this time, et cetera. And these days you really have to be responsive. We were talking about how COVID threw everything sideways, all sorts of industries, supply chains broke, all kinds of crazy stuff happened.
And when that sort of disruption occurs, the business person, the entrepreneur needs to figure out pretty darn quickly, what are we going to do differently? So restaurants focused on pickup, for example, take out or curbside, for example, lots of different things in that space. But for the rest of the industry out there, you've got to know, A, what's your DNA? Right?
What makes you special? And I think that's where having a lot of really good rich, historical data, let's say, on your customers or even your prospects comes in handy, because then you can strategize and figure out what you can do in order to survive or even thrive in a disrupted market. What do you think, Joe?
- 100%. One of the key topics I like to talk to all of our customers about is the biggest thing that people want to understand is the velocity of their business. And if you think of degradation or customer attrition or customer acquisition, people always want to know about what the growth rate is or the changes in the growth rate or the velocity of change or if there's an inflection point with that velocity. That's really where it becomes critical, much like your example about on demand inventory in the cycling world.
They want to know what the velocity is based on, everything from, hey, we live in New England. The winter weather, we don't sell many bikes here in the winter. But in the springtime, here it comes. So understanding the velocity of the last few years and how it is allows you to react to that business faster.
And not all data is created equal, right? So in today's world, it's OK to have data that you want higher frequency, higher velocity readings or perspectives on. Whether it be inventory, orders, opportunities, accounts, or cases, it no longer has to be higher frequency on this and less fidelity on that. Customers now want the highest fidelity of everything, and let me decide later what's important with the perspective.
So the velocity of your business, everybody is really talking about that and this is not a new problem. Everybody wanted on demand inventory. The only way you get on demand inventory is to understand what the rate of change is, and all of that really comes down to your business itself.
And historical data, backup data is really about your history and taking ownership of that. I love talking to people like Yves about what you can do with that history and what rich information is there, because you have expertise that I haven't even thought of yet. So our customers want to do magical things with this data, but it's unlocking it and it's first getting it.
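Editor's sketch: one rough way to read "velocity" and a change in that velocity is as period-over-period differences in a metric, with a turning point wherever the difference flips sign. The function names and the monthly order counts below are invented for illustration and are not taken from the show.

```python
def velocity(series):
    """Period-over-period change: how fast the metric is moving."""
    return [b - a for a, b in zip(series, series[1:])]


def turning_points(series):
    """Indices into `series` where growth flips to decline, or the reverse."""
    v = velocity(series)
    # v[i] is the change leading into series[i + 1]; a sign flip between
    # v[i] and v[i + 1] means series[i + 1] is a local peak or trough.
    return [i + 1 for i in range(len(v) - 1) if v[i] * v[i + 1] < 0]


# Hypothetical monthly bike orders, winter into spring.
months = ["Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun"]
orders = [100, 90, 85, 95, 120, 160, 170]

print(velocity(orders))
for i in turning_points(orders):
    print("Trend changed direction in", months[i])
```

With several years of history, the same calculation could be run per season to see whether this year's turn is arriving earlier or later than usual.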
- Yeah and I think that the whole ecosystem is always changing too, right? I'll throw this one over to Yves. I was talking about this keynote I did about AI versus DW, artificial intelligence versus data warehousing, and it's not a versus. It's really a combination that you want.
And what I see happening here with innovations like what the folks at Grax have done and some other really cool things happening out in the industry is that the different value chains are going to modify. And I think what's going to happen is data warehousing vendors like Snowflake are going to realize what's happening in some of these other contingent spaces, if you will, and they're going to adapt. I mean, you kind of see this in the data sharing world with data marketplaces they talk about. But just recognizing where the market is going.
I actually noticed once the San Jose Sharks got good, all of a sudden all the Silicon Valley folks started using those metaphors. Oh, we're going to skate where the puck is going to be, like Wayne Gretzky. But you know this as well as I do, Yves, that the cycle times on developing software, even for the big guys, it takes weeks and months, sometimes longer than that. Especially if you're in the hardware space, it takes years.
The cycle time for a chip is like 10 years from the time you think about it to the time it's fully in production somewhere. But I do see that the whole industry is going to be evolving. So it's not that what Grax is doing will be viewed as the new data warehouse. It's not.
It's a component of that strategy. And if you think through how you're going to connect these different systems in the classic cloud native sort of way, that's when you can really take the full advantage of what's available to you. But Yves, I'll throw that over to you for commentary.
- Yeah, I think that's a very nice perspective on there where you say, depending on the use case, what you have, you want to experiment with different technology that gives you the insight. That's the big discussion where we had, OK, it's a data lake, it's a data warehouse, or we need an analytical engine. And you had to trade off it's one or the other, whatever. And in fact, you see the two migrating.
I see with various new data-based technology, you don't have to make the choice anymore between, for example, an analytical workload or just the traditional operational workload. And depending on that, you can make the choice for the technology that's most relevant at that time that helps you solve that insight and based upon that you can move forward. So I think it's a completely different view, what you now have to have with Grax where you say, we keep all the versions stored and you don't have to think about making that decision upfront and have that as an addition, the glue in between.
I mean, I'm thinking about, for example, an enterprise service bus. But then we come again to the discussion we had before on APIs that are very slow to provide that information, and as well that you have to model it. You have to define it. You have to document it. A lot of overhead goes into making the data available to your consumers. That's where I see it going, and having that flexibility of plug and play on top of your version of data.
- Yeah. Yeah and also just all the different ways that cloud vendors store their data, right? So in terms of hitting APIs, I'd love to do a show on API management at some point in the future and really kind of dive into how all this stuff works. We had a show a couple of weeks ago where one of the guests was an API expert and he was funny. He's a classic Russian guy who makes no bones about being very candid in his opinions of things.
He was talking about the different APIs out there and how some are just a disaster, just a menace to deal with, especially if you're trying to hit it hundreds of times in an hour. He's like, good luck with that one, buddy. But to Yves' point, you have to document all that stuff, right?
So the thing that also fascinates me is, again, in the marketing world alone we have 7,000 technologies, the Martech 7,000, these days for automating sales and marketing. And they all have slightly different data models. They all have slightly different APIs if they have an API. They certainly don't all have APIs.
But this speaks again to just the heterogeneity, the diversity, and the complexity of these environments. And to Yves' point and to your point too, Joe, things are changing. I think this whole cloud native vision that some people have is really going to come into play. Of course, one guy joked to me the other day like, we view cloud native as AWS native since AWS is the leader in the space.
But nonetheless, the point is you have to be cognizant of the fact that there are all these different ways that data is being stored and persisted. And if you want your data that you've got in all these different systems, you really should have it in your own environment if you want to be just safe and sound as possible. Right, Joe? Oh. I'll turn it over to Joe.
- Sorry, I was on mute. If you think about it, right now, one third of all mission-critical apps used by the average enterprise are in the cloud.
ERIC KAVANAGH: Wow.
- So that means one third of your enterprise data is locked away in other people's servers. So if you think about that for a minute, cloud ERPs, cloud CRMs, marketing technology, all of the most critical things that your business has for their lifeblood is rented access on other people's servers.
ERIC KAVANAGH: Wow.
- Right? So really taking that down and understanding that how--
- --makes a chief data officer nervous. Yeah, go ahead. Sorry. Real quick.
- No, it's crazy to think about that. I mean, people before this really hyper focused on data islands. Not only do we have data islands, but we have rented islands. So it's really thinking about, OK, I need to get all that data.
I need to own it. I need to store it in my cloud. I need chain of custody. I need to choose where my geoaffinity is. All of these things for compliance and regulatory is just something that, obviously, I'm very passionate about. So it's something I want to help people unlock their data and take ownership forever.
- I love it. Folks, we've been talking to Joe Gaska from Grax. Look them up online, G-R-A-X. And of course, my buddy Yves Mulkers from 7wData. We're going to have two more shows on this topic, getting more into the forward looking stuff about how all this rich contextual data can be used to better understand your customer, to better understand, really, your own DNA. Right?
Every organization is different. Even if you compete with someone on almost every level you can imagine, I guarantee there's something about your business that is distinctly different than their business. And that's what you want to understand these days, folks, and that's what you will understand if you examine your data. Not some, not part, all of your data. We'll be back for podcast bonus in a second.
RANDALL BOETTGER: All right, we're clear. Great job.
- Thank you, buddy. I just emailed you a commercial too. Did you get that?
RANDALL BOETTGER: Hold on. Yeah, I put it in there for today.
- Oh, good. Look at you go. Well done. All right, guys. Podcast bonus. Isn't it amazing how fast that hour goes by? It's like, is it over, really? Yes. So for the deep dive, any thoughts? Yves, anything you want to dive into for the podcast bonus?
- It's mostly about the change, your view on what you're doing right now. I mean, connecting the platform to any kind of application what you have running and how simple that can be. That's what I want to dive a bit more under the hood to understand what is going on there, how easy it is from a cost perspective at the same time as well. What is extra needed to have that versioning of all your application data into a single location?
- Absolutely. I mean, the one thing that's interesting about our architecture as well is since we're not pricing based on cost of AWS and adding a premium, customers today take advantage of their discount levels in their cloud. So if people pre-bought resources or they have a massive discount, when we run Grax it's cheaper for them because they have this discount that's there. So it's quite-- There we go.
- Eric's back.
- Yep, good thing that happened at the end of the show. There you go. Timing is everything. OK, so podcast bonus segment. Hold on one second. Yes. Yes, yes, yes. Hold on one second. OK, good.
- You have such the radio voice, Eric, by the way.
- Thank you. I love it. I actually used to do that when I was in college-- or wait, in high school as a matter of fact. I was on the speech team and I would go around, wake up at 4 o'clock in the morning, get on a bus, and drive down to Peoria, Illinois to go read TS Eliot to people. But they used to say that and I was like, Oh, why are they saying I have a nice voice? What a strange thing to say, I have no idea what they're talking about. Anyway, here we are on radio, so there you go.
So let's talk about change in connectivity with Zoom. So connectivity on Zoom is a fun thing. No, I'm just kidding. Man. I want to find some-- I don't know how you even do it these days, but a truly hardcore, robust internet connection where you're just darn sure it's not going to fail. I was actually talking to a guy from Infoblox about mesh networks and he was talking about how he's got a whole multipronged network in his home.
But I asked him, couldn't you have two feeds into that? In other words, Cox cable is one but some other wireless service is the other. He's like, I don't think so. Because that's true redundancy, right? When one is crashing, kick the other one in like a generator. But they still don't--
- It depends on your router but you can have a prioritized route that weighs it based on delay. So you can do that with the right switch.
- Really? OK, that's what I need to do is get the better hardware, man. I need some evidence--
- I just ordered the Starlink.
- Is that the?
- The Elon Musk Starlink.
- I was going to say. Yeah. Right, right, right.
- Because I got a generator now and I got Starlink. That means even with the power down, I'm still connected.
- I love it. That's what I need, man. I'm going to order Starlink too. A buddy of mine ordered that as well. So I hope that takes off like his rocket ships do. All right, let's do this. Let's talk about change and the importance of change. I'll start at three past the hour. Hold on one second. And then let's go till 4:11 from 4:03.
All right, folks. Time for the podcast bonus segment. Talking to Joe Gaska of Grax and Yves Mulkers of 7wData. We're talking all about information strategy, really, and the richness of historical data. So we mentioned at the top of the show how backup and restore is often viewed through the lens of something that is painful, but I have to do it. And it's painful because it takes a long time. And it takes a long time to get the data back if you do have a crash. It's just brutal.
But that's all changing. It's changing for a variety of reasons, one of which is this new way of looking at data, looking at capturing historical data as it's manufactured, basically. You had a great quote, Joe, where you said data creates data. Which is true and it's fascinating. It's really interesting stuff. And there's all kinds of different data that you want to preserve.
And the key, again, is change, and you never know when some significant change is going to come. You can have an idea, you can know events that are coming in the future. But a lot of the most disruptive events, you don't know about. Whether it's a volcano erupting or COVID breaking out or whatever the case may be. And the key is, you want to be able to have some ability to pivot accordingly.
And so, Joe, I'll throw it over to you. When you talk to clients and you get excited about understanding historical data, how do you describe to them? Do you use any examples to drive home the point of really being able to appreciate your corporate DNA through the lens of data and allowing that to give you the agility to change in the future? What do you think about that?
- Sure, absolutely. So a lot of people are mystified or [INAUDIBLE] use or don't know where to start with data and history and how do I do it? It almost seems blocking for a lot of people. When we boil it down, every company does it and every company does it by several means. There's two places that every company does it continually every day, all day, and it's sales and customer service.
Sales and customer service are continually looking at velocity changes over time. What did my reps do this quarter versus last quarter? What did this team do this quarter versus last quarter? What is my pipeline velocity? How did it change over time? Is it growing? Is it decreasing? Right?
There's a company up in Maine, a company called IDEXX, one of the largest vet companies up there, understanding how the vet industry is impacted by historical statistics. You don't sell tick food in the middle of winter because none of the dogs are going out and getting ticks. So all of that historical relevance and external data and how it impacts the business, everybody is currently doing it, right? And really having that perspective and being able to look back and continually dig in it and mine.
Whether it's a data warehouse, whether it's a visualization and you want to see statistical variances, historical data starts simply with a question. How did my business change over time? Then you can change your question into, how did sales change, or how did this sales rep do, or what is the ticket velocity by customer service rep?
All of those questions are facilitated with historical data and the answer can come in many different ways, whether it's a report, whether it's visualization, whether it's open ticket volumes. It really starts with a question and the question is always about perspective in time and velocity.
- Yeah. That's a really cool point you made about customer service and sales I think you mentioned, and I would say marketing too. And I'll throw this over to Yves. I think this is right in your wheelhouse, buddy. Because I've watched changes over time and I'm talking dramatic changes, like how people respond to email, how people respond to links on LinkedIn versus Facebook versus Twitter.
And of course, the challenge in those environments is that the rug is always being pulled out from underneath you and you're constantly having to calibrate and then recalibrate and re-understand and then recalibrate again and that's no more true than in the world of social media marketing. But Yves, what do you think about that quickly? We've got about three minutes here in our podcast bonus segment.
What do you think about the importance of knowing change and of looking back? That's the problem, I think, with a lot of humans is we're not really willing to look back and be honest with ourselves about what happened, especially if we made mistakes. But what do you think about all that, Yves?
- Well, if you refer to the marketing area, I think that's very important because there we do A/B testing and you want to see, what is the impact? And sometimes you forget about what you've been testing and the correlation between the various parameters what you have been applying on your data. So having that [INAUDIBLE] a multitude of parameters what you've been experimenting with, without, you really have to think it through whatever.
But just turning the knobs and seeing what happened and then going back and looking into that, I think that's a very new approach on how you can experiment with your data without really having to think it really through and noting down what you have been changing and trying to guess what has been the impact due to which type of parameters. That's very strong, especially in the marketing scene to have that ability and keep track of whatever your experiments have been in that area.
- Yeah and understanding too who is out there. If you're on social media platforms, they're all very different. Facebook is very different from LinkedIn, which is different from Twitter. And I know bot traffic is everywhere. If you look at your website traffic, how much of that is people, how much of that are bots.
And you don't always know is the bottom line. I mean, people in the black hat security world, of course, are always trying to use proxies to hack into places and pretend to be other people. So that's a reality as well.
And I guess getting back to the foundation of data, it's very difficult to think through, to your point, the parameters of what am I trying to measure, how does it matter. I mean, something like attribution they talk about in the digital marketing world, it's a very difficult thing to determine. Exactly why did someone buy this product at that time? It's very hard to determine that sort of thing. And you're never going to really know, but the closer you can get to that truth, the better. Right, Yves?
- Yeah. Joe mentioned that as well. Today you start understanding a part of your business or part of an event what happened and if you have that insight, you can go back in the past and try to segment that out or just filter it out and then see how the past looked like so you can build much cleaner models based upon that insight what you gained today and didn't know yesterday and I think that's very powerful.
I hadn't really thought it through until just now that you say it, Eric, but I think it's very powerful to see, this is what I learned today, and what if I could have applied that to the past and built the insights or better insights for today, like you said, also with social media traffic. You see that you have less traffic coming from social channels to your website. So what changed that behavior? Is it really the algorithms that made that change happen or different behavior from customers? But it would be great if you could go back with the different perspectives and have those insights.
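Editor's sketch: the point about applying what you learned today to the past can be pictured as re-running a newer filter over raw history that was never thrown away. The visit records, the `is_human_social` rule, and the "bot" substring check below are hypothetical and deliberately simplistic.

```python
from collections import Counter

# Raw website visits kept in full detail (hypothetical sample data).
visits = [
    {"week": "2021-W01", "source": "social", "agent": "Mozilla/5.0"},
    {"week": "2021-W01", "source": "social", "agent": "ExampleBot/1.0"},
    {"week": "2021-W01", "source": "social", "agent": "Mozilla/5.0"},
    {"week": "2021-W02", "source": "social", "agent": "Mozilla/5.0"},
    {"week": "2021-W02", "source": "social", "agent": "ExampleBot/1.0"},
    {"week": "2021-W02", "source": "email", "agent": "Mozilla/5.0"},
]


def weekly_counts(events, keep):
    """Count events per week, keeping only those the filter accepts."""
    return Counter(e["week"] for e in events if keep(e))


def is_social(event):
    """Yesterday's view: any visit that came from a social channel."""
    return event["source"] == "social"


def is_human_social(event):
    """Today's insight: social visits, excluding agents that look like bots."""
    return is_social(event) and "bot" not in event["agent"].lower()


# Same history, two perspectives; the second one did not exist when the
# data was captured.
print(weekly_counts(visits, is_social))
print(weekly_counts(visits, is_human_social))
```

This only works because the raw events were kept; had the history been stored as pre-aggregated weekly totals, the newer filter could not be applied after the fact.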
- Yeah. That's such a good point. When you apply different filters in your mind, you see the world a different way. And I mean, that's true in different languages, it's true in lots of different ways. But folks, what a fantastic show today. Like I said, first of a series of three so we'll have Joe back in about a month or so.
We'll probably drag Yves into that as well. Send me an email, we're booking the rest of the year right now. firstname.lastname@example.org, that comes straight to me. We want to know what you want to know and we'll talk to you next time, folks. You have been listening to Inside Analysis.