Listen: Next in Tech | Episode 74: Data platforms and analytics

Data-driven decision-making is accelerating, and organizations need to ensure that they’ve got the data and infrastructure to fuel this imperative. Senior analyst James Curtis joins host Eric Hanselman to look at data about expectations for addressing hybrid computing environments, data movement and advanced database technologies and why they’re critical for digitization. The data we discussed and more on AI and ML uses is available here.

Transcript provided by Kensho.

Eric Hanselman

Welcome to Next In Tech, an S&P Global Market Intelligence podcast where the world of emerging tech lives. I'm your host, Eric Hanselman, Chief Analyst for the 451 Research arm of S&P Global Market Intelligence. And today, we'll be discussing data and analytics with our guest, senior analyst James Curtis. Jim, welcome to the podcast.

James Curtis

Thank you, Eric. Glad to be here.

Eric Hanselman

The latest of the VotE data and analytics data is out, data platform study. And we've been talking a lot about the importance of data not only in applications like customer experience, a lot of the infrastructure and aspects that are supporting data, but this is really taking this in some new and interesting directions about data platforms and perspectives. So -- but what are the highlights for the study?

James Curtis

Yes, that's -- it's a good point. So one of the interesting things is as we have come out of COVID, this is one of our sort of our first full surveys that sort of gets into that, and it really presents some highlights in terms of what customers and enterprises are sort of doing with this with their databases and data platforms.

But there's really, I would say, maybe three things that are probably worth sort of noting as sort of highlights.

One is that there's certainly still this general adoption of cloud and sort of how enterprises are sort of navigating that with what might otherwise be their modernization strategies or digital transformations and some of those things. So that's still very much in play here, and the results really sort of bear that out.

The second piece of that, which relates to some of that digital transformation and modernization efforts, is really, what do you do with your existing legacy systems? How do those sort of adapt and change? And again, the results gave some insights on that.

I would say maybe the third piece, Eric, is really around, as you look at sort of the database market -- this is a market that's been around for like more than 50 years. And while you have some sort of staid technology going on, you also have new, emerging technologies that rear their head to sort of address challenges in the market.

And so certainly, with all of our surveys, at least this one in particular, we talk about some of the adoption of sort of new technology, this newer frontier and sort of how that's integrating. So maybe in a nutshell, it's probably wrapped around that. Certainly, cloud weaves its way very well through these results.

Eric Hanselman

Well, cloud drives all, as we find. Well -- but it's interesting and especially some of the contrasts in -- and some of the points you're bringing up, I think, are really important. We've talked about the importance of data. But now we're in this environment in which we're really in transition.

There -- a lot of key, critical organizational data is still in existing repositories in whatever form that is, databases typically. But yet, we're now integrating the delivery of a lot of the results based on that data in cloudy environments. And now we're starting to spread into hybrid environments, and some of that shift now starting to create some distance between the data and the analytical and operational workloads that actually are going to work with that data.

So what are you seeing from the study about what happens? I mean so as we get more hybrid, the challenge is we're now -- we're executing in more places, but yet we've got data still in significant repositories. So what are we seeing from the data, about the data?

James Curtis

Yes. Well, I mean, we've certainly -- our results show that data will continue to grow, organizations love data and some of those things. But the hybrid piece is actually really interesting, is because when you introduce sort of a cloud model, you, by nature or by definition, introduce different locations, right, of that data. Like it's going to reside in different places.

And you probably heard maybe on the show and others that like water, it tends to seek the lowest place, right? But data has the same kind of thing. It tends to sort of gravitate where it makes sense. So at least from a hybrid standpoint and just to sort of level set maybe from our listeners here that hybrid really is moving, say, from one cloud framework to a different cloud framework. It could easily be an on-premise cloud to sort of maybe a public cloud or something like that.

There are different ways where you can sort of bridge multi-cloud, different things, but hybrid sort of does that. But yes, certainly, as you do that, so it creates these problems. But tooling and where the systems are running, how the systems are running across this nature really plays into -- and it's got to play into the minds of enterprises. Really they -- you certainly can't get around it because it's going to be -- the data is going to go where it wants to go. It's going to sit where it wants to sit.

Eric Hanselman

Well, water is one of my favorite analogies for data, right? And there are a lot of folks who say data is the new oil, data is the fuel for the digital economy. But I really keep dropping back to the idea of water because it does -- as you say, it finds its own level. It gets into all sorts of places. You get puddles of it scattered all over the place. If you're not careful, it winds up in places you may not want it.

There are a lot of things about the water analogy. I think that's probably the more apt piece of this. But to that end, what are ways in which organizations are wrestling with making sure that they're getting the right data in the right place? Because, I mean, the challenge with hybrid is now that we are in these multiple places. And if we now look at a lot of these data sources, we've got data coming from many different environments. What are tactics that they're starting to put to work to deal with this?

James Curtis

Yes. So as your data gets distributed or sort of fanned out, there are certainly ways to sort of address that. I mean -- and our survey suggests that at least one question that says that enterprises tend to like change the least, right? And so if I can have sort of systems or...

Eric Hanselman

As much as anybody likes change, right?

James Curtis

Yes. But -- and certainly, when you're talking about databases, which tend to have potentially sort of maybe customer data or very valuable data to the customer, like everyone gets just a little bit shy of sort of making dramatic changes on that.

But one of the things -- one of the tooling things and -- at least from an enterprise organizational standpoint, that has to be kept in mind is if I'm going to run in sort of different environments, realizing that the data is going to be there. Because there's always this idea of like I can move data, but you generally pay a cost if you have to move the data somewhere else. And that model breaks down, particularly if it's like significant amounts of data, right?

There's reasons why you hear these stories of AWS pulling up and piping all of the data into a big truck that drives it to its location, right? That might be dramatic. But -- so the idea is I want to potentially sort of operate on the data and have the least amount of change as possible. So can I run that application on-prem or in the cloud or different ways with the least amount of disruption?

And the results in the survey said that customers actually like that. They like the idea of sort of running the same thing in different places. And that's at least one way of getting around the idea of sort of moving data, is that I'll sort of operate on it where it is, but I will bring some of those operations to that data, but I will have them be very consistent between the other systems I have. I don't know if that makes sense, Eric, right? It's just the way of sort of reducing that change or that ability to sort of adjust to an environment.

Eric Hanselman

Well, it's one of those things that we keep talking about, which is distributing the work that's necessary to be able to go achieve the end results. And with analysis, the extent to which you can reduce the data volumes that have to get in some different place, all the better, I mean, because it's -- but egress fees are the next-level sticker shock for cloud capabilities.

And as you said, if you've got to move a lot of data, moving bits actually cost money. And in on-prem environments, we got used to being able to move large volumes of data over the -- all of the infrastructure that we have already paid for. But now in cloud models, you've got to pay for the movement of that data. So there are significant advantages of distributing that processing and ensuring that it's actually taking place where the data actually exists.
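
To put rough numbers on the point about egress fees, here is a minimal sketch comparing two options: copying a raw dataset out of a cloud environment, or running the analysis where the data sits and moving only the summarized result. The per-gigabyte price and both data sizes are hypothetical figures chosen for illustration; they do not come from the study or from any provider's price list.

```python
def egress_cost(gigabytes: float, price_per_gb: float) -> float:
    """Return the fee for moving `gigabytes` out of an environment."""
    return gigabytes * price_per_gb


# Hypothetical assumptions, for illustration only.
PRICE_PER_GB = 0.09   # assumed egress price in dollars per gigabyte
RAW_DATA_GB = 50_000  # assumed size of the raw dataset
SUMMARY_GB = 5        # assumed size of the aggregated result

# Option 1: move the raw data to where the analytics run.
move_data_cost = egress_cost(RAW_DATA_GB, PRICE_PER_GB)

# Option 2: run the analytics next to the data and move only the result.
move_result_cost = egress_cost(SUMMARY_GB, PRICE_PER_GB)

print(f"Move raw data:    ${move_data_cost:,.2f}")
print(f"Move result only: ${move_result_cost:,.2f}")
```

Under these assumed figures the gap is several orders of magnitude, which is the argument for distributing the processing rather than the data.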

James Curtis

Yes, that's -- it's a great point. I mean what we're seeing a lot of times is early on when cloud computing and systems were, hey, have your system wherever you want, but -- one of the things that customers have sort of pulled back on is they said, I'm going to -- there's a pause before I dump data somewhere. Because I might dump it there and I may or may not be able to get it out as easily as it sort of went in as you talk about egress fees and things like that.

So by nature, that then presents -- I'm going to have sort of very good data at different places, and I need to sort of speak to that data where it may lie. But certainly, it's a notable trend that we're seeing, is this pullback in sort of immediately dumping into the cloud. I'm just a little more cautious about that right now, so.

Eric Hanselman

Well, makes sense. And especially, I wanted to focus on the -- another of the points you made earlier, which is that data volumes are growing. And you've now got not only data in more places but substantially greater volumes of data. And in some ways, we've had reductions in storage costs, which means that we haven't had to make the hard decisions about what do you keep, what do you discard.

But yet, there's just a lot more data because there's a lot more data that's being generated and -- especially in cloud. We're throwing off a lot more telemetry just from the infrastructure that's housing it. We're throwing off a lot more data from all of the aspects of all of the elements of our environments. Digitization is generating more data. How are organizations dealing with this? And what's -- what kind of technology are they putting to work to manage it?

James Curtis

That's really interesting because early on when -- and I don't know if we can necessarily put sort of a historical sort of point in the sand on a date. But one of the things -- if we look back, the big data movement really was meant to sort of address some of the data volumes we had. By its very nature, big data and that term has sort of been debated. But what we're talking about with [ patience ] a while back and sort of the idea that oh, yes, big data, that's so last century.

Yes. But it -- really, if you go back and say -- it was really sort of addressing a problem at the time, right? So we still have actually lots of data around. In some ways, we might say we certainly have more data than we did last year or even the year before, and that's still big. But we just don't sort of use that term as much.

But big data really and some of the systems, Hadoop and some of the NoSQL systems that were sort of nonrelational, were really meant to address sort of this need in the market that came up that says, hey, listen, my -- there's actually a lot of devices and things I have that's collecting data, and I -- we think there's value in that data.

Now storing that data in maybe the legacy systems we have was probably not cost efficient. So we had object storage and some of these things and Hadoop and some of the technologies around that, that gave you sort of commodity hardware to store it. It certainly provided additional problems on what to do with that data and how to manage things like that. But it certainly gave us tools to sort of do that. And so you see the data lakes, and now we're seeing sort of the emergence of the lakehouse and things like that. But these were sort of meant to do that.

Modern day, we still have those. Those are still certainly kicking around as technologies. But we have other technologies, too, that are sort of rearing their heads. Graph technologies in a way are a means to address some of these large datasets and things like that. And again, the market sort of responds to that. And maybe when they come on, they're kind of immature, but they take a while to mature. But the market somehow finds a way to give us answers to some of these problems, as you mentioned data volume, right? So it helps us out.

Eric Hanselman

Well -- and that's something we -- again, we're getting back to water metaphors, the data lake. But of course, the thing that I think we sort of found over time, which is if you simply take every single element of data that you've got, pitch it all into one big, centralized place, I mean, that was an interesting idea when we were getting to that stage of thinking that analytics do well with lots of data.

There's potential to derive useful insights from this data that we may not know. So we don't know which data that we should hang on to. So let's all put it into this data lake. And yet you get what you get in a lake that's not managed well and maybe overused. And there's a lot of stuff in that lake that you may not want in that lake.

And the odd shopping cart here and there and who knows whatever else people happen to toss into the lake make it just a little more complex to pull out quality analytics. So you've got to figure out how you actually sort out what you need. That shift to the lakehouse model is sort of the next bit of that.

But we've come to a number of different technologies to be able to address that in terms of something that looks a little more structured or a little more organized for how we actually want to take the results of those analyses. I mean traditionally, in databases, we had a relatively defined schema that laid out what all of the various data relationships look like. You had tables that were tied together that were set up to go structure things well.

That first stage of trying to do something better than that one single unified database, distributed database technologies like the Mongos and the Couchbases and things like that, were ways to be able to split that up among multiple locations. But you identified something that I think is one of those next stages which allows better representations of the relationships between that database, which are the -- between the data, which are graph databases.

James Curtis

Yes. So graph technologies have been around probably if you go back maybe a long time. But what's interesting with graph is that it really is a different way of getting at the data. And we've seen -- I would say within the last 2 years, our results from the survey certainly suggest that probably 60% of the respondents said that they're using some type of graph in its iteration. And probably another maybe 30% will sort of probably adopt it maybe over the next 12 months hence. And there's probably maybe a quarter who say they're not quite there yet. But these are actually pretty big -- bigger numbers than we've seen before.

But graph has really sort of taken on and taken off because it does address certain problems, and it does it pretty good because it is able to sort of pinpoint relationships that sometimes are challenged by traditional databases, okay? So truly, databases tend to be rows and columns and things like that. And then as you search from the data, the rows and columns get joined and divide -- and subdivided and parsed and things. But graph works on the model of, like, you can have an entity, if you will, and have sort of all sorts of sort of traits around that, that might be interesting.

And so the data suggests that as organizations sort of deal with some of their data volumes and they come across graph as a potential solution for that, they look for ways to adapt that. It increases sort of the overall market appeal of this stuff, right?

So things like, I don't know, the -- like knowledge graphs and [ heat ] graphs and fraud analytics and different things like that are just great uses of graph and not only address sort of data volume issues, but they provide probably much sort of streamlined solutions and offerings than what organizations might otherwise be able to maybe use with traditional tools.
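
As a rough illustration of the entity-and-relationship model described here, the sketch below stores a handful of made-up relationships as a graph and walks outward from one entity to find everything linked to it, the kind of question a fraud-analytics use case might ask. The names, relationship types and the `within_hops` helper are all hypothetical, and plain Python dictionaries stand in for a real graph database.

```python
from collections import deque

# Hypothetical relationships: (entity, relationship, entity).
EDGES = [
    ("alice", "uses_device", "phone-1"),
    ("bob", "uses_device", "phone-1"),
    ("bob", "uses_card", "card-9"),
    ("carol", "uses_card", "card-9"),
    ("dave", "uses_device", "laptop-7"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (source, relation, target) triples."""
    graph = {}
    for source, _relation, target in edges:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set()).add(source)
    return graph


def within_hops(graph, start, max_hops):
    """Breadth-first search: map each reachable entity to its hop distance from start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[start]
    return distance


graph = build_graph(EDGES)
linked = within_hops(graph, "alice", max_hops=4)
print(sorted(linked.items(), key=lambda item: item[1]))
```

In a rows-and-columns layout, each additional hop in this question would typically mean another join; in the graph form it is one more step of the same traversal.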

Eric Hanselman

Yes. I mean they're -- seems like they're able to tackle things that you might have been able to go construct a series of queries to figure some of this stuff out in a traditional database. But the relationship associations now allow you to identify these relationships more efficiently, more effectively and with greater nuance.

James Curtis

Yes. Absolutely, yes. So, I mean, like I say, it's -- there's a debate that goes on in the database world, Eric, you probably noticed, that...

Eric Hanselman

In technology, yes. No.

James Curtis

Can I have a database that sort of does everything for me, right? Can I purchase once and have it do for me, right? So there's constantly this, do I get a specialized tool for that? Or is the tool that I have sort of good enough? So -- and graph is one of those things where like it's more on the specialized side and it just does its job really, really well. Like I say, you can do things -- you could probably do a similar use case.

Say, for instance, I don't know if you've covered sort of CX and some of the customer service with Sheryl Kingstone and some work that she does. But graph is often used in some of those contexts to sort of paint sort of a 360 view of the customer, I guess, if you will.

And graph is really good technology for doing some of that because you can get all sorts of things about people, right? Like their purchasing power, where they've been, they've visited, details on their contact information, all these things. And graph just -- graph is really good at that, right? Just it's sort of streamlining that -- solves those problems.

Eric Hanselman

Complex relationships with content. Well -- and it's -- we see dramatic increases in graph use in security.

James Curtis

Yes. Yes.

Eric Hanselman

Back to my day job, yes, because, again, exactly as you said, complex relationships, being able to build associations with very rich datasets. Yes. Excellent applications for it. So if we're thinking about the various odds and ends of strategies for data and analytics and looking out for the year ahead, where should organizations be planning? And what should they be thinking about?

James Curtis

So I get this question a lot just in terms of whoever that I'm speaking to. And if I might lean on a bit of a metaphor, for instance. There's a -- it might be an African proverb or something, I'm not sure where. But basically, it says somewhere in Africa, every morning, a gazelle or a deer, an antelope is going to wake up, and it knows that it has to outrun the fastest lion to survive. But the lion has got to wake up and it's got to chase this gazelle or deer, whatever it is, for food.

And so you have these competing forces. And so the idea is whether you're sort of on the lion side or the gazelle side, when you wake up in the morning, like you got to get moving. Like the day's going and you might be addressing sort of competitive forces and different things like that.

So from an organizational standpoint, you can't necessarily sit still. You are going to have to evaluate some of the systems, where they're -- where they reside. Are they on-prem? Are they in the cloud? And while we say modernization and digital transformation are sort of throw-around terms, they do have real sort of meaning and impact with organizations. Like you simply can't ignore them.

You do have to address some of these technological innovations that are happening around you and either adopt them or figure out a way to extend what you have. But eventually, you will get caught in some of these things. You do have to have an answer for it.

Eric Hanselman

Well, depending on which side of that point you happen to be on, either you may wind up as lunch if you're on the gazelle side, if you're not fast enough. Or you may go hungry if you're not quick enough from the lion's side.

James Curtis

Yes. Yes. I guess this is one of those themes that we keep coming back to over and over again, which is that there is in that march to digitization, it can be tempting to say, this is not -- I don't need to get out there and get fully digitized, I don't need to make this transition. But yet, if your competitors are out there doing it, you're -- again, you're either going to be lunch or -- there's just great technologies out there that -- it's just a great time to be in this market, and the innovation that's out there is just wonderful, so.

Eric Hanselman

Well, and I will point our listeners to the data analytics study on data platforms to get some insights into where they should be going. But that is it for this episode. We are at time. Jim, thanks for all these perspectives and a whole set of really great metaphors.

James Curtis

Thank you very much, Eric. It's a pleasure.

Eric Hanselman

And I'll point our audience members to the results of the study as well as an upcoming webinar. It's actually coming up in a couple of days, right, Jim?

James Curtis

Yes, it is. It's coming up on the 21st.

Eric Hanselman

Great. Well, that'll actually be digging into not only this data, but a whole set of additional perspectives. So we'll have a link in the show notes that will point people there. Hope that you'll see it either live or pick it up on the recording. A whole set of additional perspectives on this. So lots of good things and lots of good perspectives on data.

And that is it for this episode of Next in Tech. Thanks to our audience for staying with us. And thanks to our production team, including Carolyn Wright; Caterina Iacoviello; Ethan Zimman on the Marketing and Events team; and our studio lead, Kyle Cangialosi.

I hope you'll join us for our next episode where we're going to be discussing decentralized finance. And we've talked about various aspects of it, but we're going to be going into a pretty high-level view as we get to the -- maybe the other side of the crypto meltdown. There's an awful lot that's going on and a lot of things that you'd be at your peril to write off. So I hope you'll join us then because there is always something Next in Tech.


No content (including ratings, credit-related analyses and data, valuations, model, software or other application or output therefrom) or any part thereof (Content) may be modified, reverse engineered, reproduced or distributed in any form by any means, or stored in a database or retrieval system, without the prior written permission of Standard & Poor's Financial Services LLC or its affiliates (collectively, S&P).