Prepare To Scale – How Data Ops Enables Growth With Yves Mulkers And Christopher Bergh
We all know that data drives the information economy, but the efficiency of that engine makes all the difference. That is why DataOps came to be. Savvy practitioners realized that taking a quality manufacturing approach to data management could yield wonders for agility, awareness, and growth.
Find out more on this episode of #DMRadio as Eric Kavanagh interviews Yves Mulkers of 7wData and Christopher Bergh of DataKitchen, a pioneer of DataOps.
—
Transcript
Eric: Welcome back once again to the longest-running show in the world about data. It’s called DM Radio. I took a little bit of time off traveling with the family. More information about all of that is on InsightAnalysis.com and possibly DMRadio.biz. I went out to Belgium. I saw my good buddy. In fact, Yves Mulkers is on the show with us. I had dinner with him near his home in Belgium. It was beautiful. I learned a lot more about data, race cars, electric cars and lots of fun stuff. The topic for this episode is something called DataOps. I’m very excited to have a guy who was the first one to tell me about it. As far as I’m concerned, he started this thing.
Christopher Bergh of a company called DataKitchen is going to join us. First, let’s dive into what this stuff is. What is DataOps? Maybe you’ve heard of DevOps. That’s where developers work directly with the operations people in businesses to solve problems, get things done, and streamline workflows. Maybe getting the checkout cart to work better. Getting the web analytics to work better. Whatever it is, you have developers working directly with business people to solve problems.
That worked great and everyone figured out, “This is a new way of doing development.” It replaced what was called waterfall, a very slow process where one group does one thing, then another group does another thing, in week-long or months-long stages. Eventually, you get some solution at the end of the day. That’s not how things get done these days, folks. It is much faster. It’s much more agile. That’s the term that we use for the development process.
Now we see that porting over to other areas of the business. You hear about DataOps. DataOps is basically taking a very lean manufacturing style approach to the management of data assets. You prepare data. You have data in different stages. You authorize it to do different things at different times, but you’re trying to automate as much of that as possible, which does a lot of different things. It reduces errors for sure because humans make mistakes.
Once you’ve nailed down a process and automated it, as long as you do it properly and you have some visibility into it, and we’ll talk about observability too, then you’re off to the races. A lot of these tasks are very boring and very simple but they’re important. They can be tremendously time-consuming if you don’t automate them. If you can automate the testing, for example. Automate pushing it to production.
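To make that idea concrete, here is a minimal sketch in Python with pandas of the kind of automated check Eric is describing: a small test that runs before a dataset is promoted, so nobody has to eyeball it by hand. The table, column names, and checks are hypothetical, not anything discussed on the show.

```python
# A minimal sketch of an automated data check that gates promotion to production.
# Column names and rules are illustrative assumptions, not a real system.
from typing import List
import pandas as pd

def validate_orders(df: pd.DataFrame) -> List[str]:
    """Return a list of data-quality failures; an empty list means safe to promote."""
    failures = []
    if df.empty:
        failures.append("orders table is empty")
    if df["order_id"].duplicated().any():
        failures.append("duplicate order_id values found")
    if df["amount"].lt(0).any():
        failures.append("negative amounts found")
    if df["order_date"].isna().any():
        failures.append("missing order_date values")
    return failures

def promote_to_production(df: pd.DataFrame) -> None:
    failures = validate_orders(df)
    if failures:
        # Stop the line instead of shipping bad data downstream.
        raise RuntimeError("Promotion blocked: " + "; ".join(failures))
    print("All checks passed; promoting dataset.")

if __name__ == "__main__":
    sample = pd.DataFrame(
        {
            "order_id": [1, 2, 3],
            "amount": [9.99, 25.0, 4.5],
            "order_date": pd.to_datetime(["2023-01-02", "2023-01-03", "2023-01-04"]),
        }
    )
    promote_to_production(sample)
```

In practice a check like this would run as one automated step in the pipeline, right before the “push to production” step Eric mentions.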
All these different steps you can automate, that’s going to go and do wonders for your business because then you can get down to getting things done instead of trying to keep the nuts and bolts of the organization rocking and rolling. With that, let’s bring in Christopher Bergh from DataKitchen. Like I say, you’re the first one that told me about DataOps. Tell us where the idea came from and why you decided to spend time building this engine.
Christopher: I think the idea came from theft. People have had these problems before. I don’t know if you remember the ’70s and how bad American cars were. All of a sudden, they got better because we started to adopt this series of Japanese manufacturing techniques: lean manufacturing, total quality management. I was an old nerd. I was part of the software industry in the ’90s. If you remember, software used to take forever to come out and it would be shipped in a box.
The software industry has learned to ship software very quickly with very low errors. If you go to your typical website, it’s updated several times or even several hundred times a day. These principles that came from making great cars and making great software apply to the world of getting value from data, or data and analytic teams who analyze data and make predictions and visualizations.
There are a lot of words for that but I tend to use the phrase data and analytic teams. I personally came at it from a fifteen-year career in software, where I worked at NASA and Microsoft for Startups. Around 2005, I thought I would take an easy job because I was always interested in data and analytics. I would run a data analytics team and my kids were small. I thought I would be home by 5:00 and life would be good.
I soon discovered the dirty secret that existed for me in 2005 and unfortunately, still exists. It’s that you fail a lot in data analytic teams and it’s a very hard job. Why is that? With the explosion in the industry and the $80 billion in spending, why do so many data and analytic projects fail? Why is it such a tough job?
Here’s an example of a survey that we did of 700 data engineers with some questions on their emotional health. It is a lot and we hired a survey firm too. It wasn’t your usual biased marketing survey. Of the 700 people, 50% of them were thinking of quitting, 70% of them had daily frustrations with their job, and 78% of them wanted a therapist because they were so stressed out. Why is that?
On the other side, why are so many of the projects and things that we deliver not working? Why do business customers generally not trust the data? Why are data and analytic projects failing? Gartner estimates 60%, sometimes 80%, of all projects fail in either time or functionality. Failure and a sucky job do not make for success. What’s the root cause of that? DataOps has a unique perspective on that, one that I think learns from other industries and takes those lessons and applies them to the world of data analytics.
Eric: Those are some amazing numbers, by the way. Talk about staring down the barrel of the gun. Over half of them are throwing up their hands and the majority of them want a therapist to deal with their jobs. That’s a very bad sign. That’s a very interesting piece of information. What I like about it is that you were demonstrating the fact that you know from experience the pain that these people go through. It’s very stressful.
When everyone is breathing down your neck because this data set is not where it needs to be and you can’t do reporting on it, you can’t talk to the auditor about it or whatever, that’s a very stressful situation. Data engineers know that getting it right takes time. There are all sorts of little areas where you can slip up along the way and suddenly, things don’t work. You were even joking before the show about how a lot of times, people spend a lot of time and money building these solutions. They don’t even know if it’s working well enough because they don’t have enough observability, which is something else you’re focused on now.
Christopher: What’s interesting is the psychological state of people because there are thousands or hundreds of thousands of people in data and analytics. They have these diverse roles. They’re data engineers and scientists and people doing BI. When I took over a team in 2005, I had all those people working for me. I would dread going to work in the morning and I had a Blackberry. I would drive all the way to work and not look at my Blackberry because I was afraid of that email saying, “Chris, your data is wrong. Can you fix it?” “Chris, you’re late. Why haven’t you got this done?”
It’s not fun to have that happen and to have to go talk to a team, rally the troops, or get pulled out of your weekend to work, and go into meetings and have to say, “No, we can’t do that. That’s going to take a long time.” The psychological state of a lot of people who do data analytics is initially great because they’re excited. They go to class and they’re into it. Then they find out the reality and they realize it’s hard and they’re not getting any credit. They’re caught between the rock and the hard place.
The rock and the hard place are like the data itself. It is complicated and ever-changing, big and small, and batch and streaming. There are customers who want a lot and expect insight from data to be turned on like tap water. One of the most important things I remember is I worked for this company in 2005 and 2006. The CEO was a Harvard Medical School-trained doctor. He knew a lot about healthcare and could think of things in analytics but he wasn’t too technical.
He asked me to do something. I would sit in a room with some data engineers and scientists. We’d draw on the board and I would say, “Oh.” I’d go back to him all happy as a fresh bunny and say, “David, this will take two weeks. Isn’t that great?” He looked at me like I had cut open a patient on the table and left him bleeding, “Chris, I thought that should take two hours and not two weeks.”
That is the truth. Your customers are not jerks. They just don’t know the complexity. They have 5 or 10 more questions and they see so much farther than you do. How do you live in that world where your data providers don’t care that you exist, they are constantly throwing stuff over, and your data customers just want, want, want? No wonder people want a therapist. It’s a very demanding world. I don’t think the way to solve it is therapy or burnout or leaving your career. There’s another way to solve it. Eric, you mentioned it. It’s the path that other people have taken. It’s through automation, testing, observability, and building a machine that makes the machine.
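One way to picture that “machine that makes the machine” in code is a minimal Python sketch (my own illustration, not DataKitchen’s product) where every pipeline step is automatically timed, logged, and checked, so the observability Christopher mentions comes for free with each run. The step names and checks are invented for the example.

```python
# A minimal sketch: wrap each pipeline step so it is timed, logged, and tested
# automatically. Step names and checks are illustrative assumptions.
import logging
import time
from typing import Callable

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("pipeline")

def observed_step(name: str, fn: Callable[[], object], check: Callable[[object], bool]):
    """Run one pipeline step, record how long it took, and verify its output."""
    start = time.time()
    result = fn()
    elapsed = time.time() - start
    log.info("step=%s finished in %.2fs", name, elapsed)
    if not check(result):
        log.error("step=%s failed its automated check", name)
        raise RuntimeError(f"Check failed for step '{name}'")
    return result

if __name__ == "__main__":
    rows = observed_step("extract", lambda: list(range(100)), lambda r: len(r) > 0)
    total = observed_step("transform", lambda: sum(rows), lambda t: t >= 0)
    log.info("pipeline complete, total=%s", total)
```

The point is not the particular steps but the wrapper: the checking and logging machinery is built once and then applied to every piece of work the team ships.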
Eric: That’s smart and it makes a lot of sense. I’ve seen this myself too. You always have to understand the context. Maybe I’ll bring in Yves Mulkers to comment on this because he’s a practitioner, who’s also an analyst. Yves, you know from experience that every time you go on another consulting gig, you’re going to be dealing with this wide range of experience levels and awareness and knowledge.
It’s difficult to read the room quickly enough to navigate through that situation efficiently because you can say something that’s going to piss off this person or upset that person. The next thing you know, people are throwing mean glares around and that’s no fun. You do have to get through and figure out who knows what and how much and how can you rally the troops, to Christopher’s point, but what do you think, Yves?
Yves: I think that’s part of the stakeholder management, trying to figure that out. When you come onto a new assignment, you start to get to know the business. It is of major importance to understand which insights you can deliver and where to find that data. I was listening with a lot of interest to what Christopher was saying and I heard him telling my story all the time. I’ve been there, done that.
“How much time do you get for this one?” “You got five days to build a complete data warehouse,” then you say, “How am I going to start? I don’t even know anything about the business, how to map the data or where to find it. I know we have this automation tool and I know it’s full of bugs. How does it work?” You start looking at it in a pragmatic way.
It’s still a lot of what I see everywhere, and I’ve been in data management for the past twenty years, doing data engineering and data assembly. We try to reinvent the wheel all the time. It feels like people want to do the work and not look into automating the work and making it smart. I started as a developer. I always said I’m a lazy developer because I want to automate things and make my life easier. All the things that I can automate, I will automate.
When you come into the data space, there is an extra dimension. That’s the structure and the data itself that keeps on moving, and getting that aligned with all the code of what you’re trying to build and develop to move your data around and build the insights. That’s the extra difficulty of what I saw happening.
These days, the data gets versioned as well. That’s a lot of improvement but still, we’re not yet at the point where there’s an easy way to make a small change to the data concept and then deploy it into production. I see the people coming from data science backgrounds. They are used to working more in code. You see they’re working more in that DevOps/DataOps way of working. People working with traditional ETL tools and so on are not yet there. They’re still trying to do things the old way of building data pipelines, of building reports and trying to align to that, and building one package to deliver that insight from the source into the reports or the dashboards, and so on.
I’m still intrigued to see one of the first real DataOps projects coming my way, where you can get people on board: the business, the engineers, the technical people, and the data scientists together to build that one model or that one insight, and then work together from the beginning to the complete deployment, rather than still needing 3 to 6 months before you can deliver.
If I’m getting into a company, they say, “This is our way of working. We have quarterly releases.” I still look forward to coming into the continuous integration and continuous deployment way of doing things. There’s still a lot of work to come. I’m looking forward to what DataKitchen can offer in this respect and make my life easier whenever I’m on a data engineering assignment.
Eric: I’ll bring in Christopher Bergh again from DataKitchen. One of the biggest issues is that you have such a wide range of experience out there. The people with the money oftentimes don’t know. To your point, they don’t have any idea how long these things normally take. People in their team have probably been telling them that’s how long it takes. You have to take a new approach.
I’m reminded of one of my favorite clichés. Well begun is half done if you get your processes right or if you’ve nailed down the protocol for grabbing the data, assessing the data, and having one golden copy from which you pull instead of making replicas of the data. There are all kinds of mistakes that get made all the time, Christopher. It’s important to get that mindset change in the business.
Christopher: It’s helpful to think of it in two buckets of work. You could think of the machine that we’re using to produce data and analytics, but then you can also think of the machine that makes the machine. The assembly line, not the result. Those who’ve been involved in factories know that; it’s a quote from Elon Musk, who learned a lot about manufacturing. If you look at working on the machine that makes the machine, getting that right, investing time and effort in it, and putting your smartest people on it, it pays dividends.
Here’s a case. I have a 22-year-old son and, surprise, he’s a software engineer. I don’t know why and he just got his first job out of college. I love my son. He’s great but his room was covered in Legos from age 4 to about 12. When he hit puberty, he didn’t want to go into his room at all. He took his first job as a software engineer at Amazon. He deployed code to production within his first week. I’m like, “I wouldn’t trust my son to do that.” Not because he’s not smart and great but a 22-year-old, deploying to production, and it’s a financial product? How do they do that?
I don’t think it’s because they simply trust him. I don’t think it’s because they’ve got armies of people checking my son’s work. It’s what they’ve done: they have built a machine around my son. He makes a little change somewhere in a big complicated thing, and some other machine next to it checks it and says, “That’s right, it works. That’s right, you didn’t bust anything else.”
What we need to do is build that machine that makes the machine around our people, our data engineers and scientists, so that they know. They’re all well-meaning. Everyone has this problem: I make a small change and it has unintended consequences. How do you know that? I think it’s important that we honestly optimize for 22-year-olds to deliver things of value in a few weeks.
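For illustration, here is a hedged sketch of what that “machine around the engineer” can look like at its simplest: pytest-style tests that run automatically on every proposed change, for example in a CI job, before anything reaches production. The transform being tested and its expected behavior are hypothetical.

```python
# A hedged sketch of automated checks a CI system could run on every change.
# The transform and expectations below are invented examples, not a real system.
import pandas as pd

def add_discounted_price(orders: pd.DataFrame, discount: float = 0.5) -> pd.DataFrame:
    """Example transform under test: add a discounted_price column."""
    out = orders.copy()
    out["discounted_price"] = out["price"] * (1 - discount)
    return out

def test_discount_is_applied():
    orders = pd.DataFrame({"price": [100.0, 50.0]})
    result = add_discounted_price(orders, discount=0.5)
    assert list(result["discounted_price"]) == [50.0, 25.0]

def test_original_frame_is_not_mutated():
    # Guards against the "small change with unintended consequences" problem.
    orders = pd.DataFrame({"price": [10.0]})
    add_discounted_price(orders)
    assert "discounted_price" not in orders.columns
```

Run with pytest (or any test runner) as a gate in the deployment pipeline; if an assertion fails, the change never ships.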
Eric: Christopher, I’ll bring you back in to shed some light and put some meat on the bones here about what benefits you get, and why, from taking a DataOps approach. In my opinion, data quality is a big one but agility is the most important, along with improving morale. You talked about this. I would love to get a copy of this report that you did, talking to data engineers.
Everyone talks about empathy these days. You’ve seen it in the corporate culture; different terms become popular for a while. Empathy is very popular now, but the work is great at demonstrating what empathy means. That’s more important. It seems to me that you’re taking a big step in that process by reaching out and talking to these people. Tell us how it improves their lives and how it improves their mood.
Christopher: If you boil it all down, teams are 5 to 10 times more productive and we’ve benchmarked that. We’ve got a bunch of big customers like AstraZeneca and GSK, and smaller customers. It’s amazing how much more productive they are. Why is that? You would think, “People are doing their work fast.” The problem is, in data and analytics, the actual work is a small part of their day.
There’s a lot of waste in data and analytics. A lot of waiting, meetings, unneeded documentation, and a lot of context switching. What happens is because it takes so long to build something, you’re not sure if it’s right. You may talk to a customer. You think that they want ten things. You go off for three months and you build those ten things. You deliver back to them and they go, “I want four of those. The other six, not really and here’s five more.”
Those other six are pure waste. The other source of waste is your data providers are constantly giving you crappy data and things are breaking. You’ve got to go back and rework. Rework, wasted effort, and waiting are the biggest things. By building automation, an automated machine that allows you to deploy quickly, run with low errors, and measure and monitor your system, your productivity goes way up. That has the net effect of making your customers trust the data and get more insight. It also has the net effect of improving team happiness and morale.
I’ve been on the search for this way of working for many years, mainly because it took me a bunch of years to figure it out. When we started the company years ago, we refused to work in any other way. We’ve always been a profitable company. As part of that, we’ve done some service work to help grow. We will not work in any other way because it’s awful to build something for many months and find out no one wants it.
It’s so awful to have a data table in production for three months and then find out that the data is wrong. You have to send it to all your users and say, “That data is wrong. We didn’t notice.” Those things happen to people. As a result, that’s why people are unhappy. That’s why people feel stressed. In some ways, they don’t have control over their destiny.
Eric: That’s right because when you’re under the gun and you don’t feel like you have the capacity to solve the problem, that’s when the stress level goes through the roof because you don’t know what to do. You’re afraid to answer phone calls or look at your Blackberry, like you were saying. Those are all signs that something is very wrong. I always talk about morale as the number one most important metric in any company from a business perspective, but in life as well. If your morale is low, something is wrong and you have to address that something.
I think it’s fascinating that DataOps turns out to be such a powerful construct for solving so many of these tedious, unpleasant, and boring problems. It’s the same old story of automation. For anyone who thinks they have high-quality data, just take a hard look at the database. Download a CSV of whatever database you’re using. Look at it on page one. I guarantee you, you’ll be like, “Look at that. Look at all these fields. I remember that guy. He’s not around anymore.” It’s all these problems and you’re not going to solve them going one by one through the database fixing them. You need to come up with programs to do that. Right, Christopher?
Christopher: Yeah. If you think about manufacturing, there’s this thing called the Andon cord. Joe Six-pack can pull the cord and stop the entire assembly line because there’s a problem. Think about that. You’re an hourly worker. You’ve got a multi-billion-dollar company and you can stop production. You’re empowered to do that. Why? It’s because if they do, someone can notice a defect and fix it before it gets into production. That empowers people. It puts them in control.
The opposite of that is people who turtle and are trying to survive. They focus on smaller goals. They say, “This is my world. I don’t care about anything else.” That is a problem when you want to make customers successful. The light at the end of the tunnel is your customer going, “Thank you,” and changing their activity because you gave them some insight from data.
Eric: I think a big part of it too, and I’ll throw this over to Yves to comment on, is good morale leads to good results across the board. If people are feeling good about themselves, that’s going to show to your customers. You hear about net promoter scores and how important they are in the business world. It’s because it’s coming from the outside. You’re always going to try to tell the best story from the inside as you go out and speak to the public about your organization.
When other people are saying good things about you and that’s cataloged and distributed in a formal fashion, that’s a pretty powerful barometer to look for. I guarantee you’ll find that companies with good net promoter scores are companies that have nailed down their processes and are feeding their people the information that they need. That’s where DataOps comes in handy. What do you think, Yves?
Yves: If you can tie the whole data delivery chain together and make it move more swiftly, that does wonders for morale. Nothing pulls your morale down like the chasing: think about trying to reach out to the people from the network team, trying to walk through support to get access to source systems, or trying to get to the people from the infrastructure team for some more storage or memory because your pipeline is not working.
It all adds up and takes a lot of time and a lot of frustration because you need 4, 5, or 6 steps for that simple thing. You think, “This should work out of the box.” Put on top of that what Christopher was saying: you go away with ten points and you come back, and 6 out of the 10 points you can throw away because they’re not bringing any value. The faster you can build these solutions and the faster you can see the issues, the more morale you get because you can ask, “Is it like this? Did I understand what you were trying to get?”
The faster you can do that, the better the insights. You can no longer get away with a perfect up-front analysis as we did in the waterfall days. Now, it’s trial and error, and the faster you can fail, the faster you can get the results and insights. There are still a lot of things to do to optimize those data pipelines. Especially, as I said, you need to cross a lot of teams, all those silos, and bring them together. That’s hard labor. You’re only spending 10% to 20% of your time doing your data engineering work. The rest is aligning and coordinating, and trying to get things done. That’s still a hard struggle in data provisioning.
Eric: Christopher, I’ll bring you back in. One of the signs that we see out there or the leading indicator that we’re getting somewhere in this business and we’re moving past focusing on the nuts and bolts, the speeds and feeds, and the technical underpinnings of what gets done, is we’re talking about data products now. Data products and being able to combine them to do interesting things in healthcare, in risk management, in financial services, whatever the case may be. You want to get past the tedious setup of IT-based stuff. You want the business to focus on business ideas, market movements, and how you can tack into the wind.
Christopher: I think people should work on products and not projects. Success doesn’t mean following the project management methodology that your organization has laid out. That’s not success. Success is a product that somebody uses and gets value out of, full-stop. Unfortunately, I’ve seen organizations that define success as, “I’ve successfully done all the steps required in our PMO. Therefore, I can get my gold star.” Does anyone use it? We don’t know. Did it make any difference in the business? I’m not sure.
Partly, the problem with that is you have to be humble about what you know as an engineer or a scientist. You can’t just believe that you are smart. You have to test it by putting it in your customer’s hands. As you’ve said, fail fast and learn faster. I think that mantra of customer focus and small value-delivery iterations is important. The other mantra is to get something in the bank of value, but make it so that you don’t have to worry about it ever again. Make it so that it tells you while it’s running if it’s wrong.
Once you find value, get it in the bank, but have that bank have an alarm system, a monitor or something that goes off. Inevitably, a server will go down, a CPU will get pinned, a data provider will change, or something will happen underneath. You shouldn’t have to tend it and take care of it. The other thing that goes wrong with data and analytic teams is you go and build things, and then you’re so afraid to change them because you haven’t put them in the bank yet.
You’ve got to watch it and monitor it so it doesn’t go away. Focus on customer value in your iterations. When you get something, put it in the bank, put a security system on it that alerts you, and never touch it again. If you do those two things, it sounds simple but that’s the secret to success because then you’re always focused on what you want to do. It’s to dig into the data. Does it provide value? Maybe it doesn’t. Maybe it does. How can I get more of it? How can I express it? That’s why all of us got into this field because that’s interesting.
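To put a rough shape on “put it in the bank and put an alarm on it,” here is a minimal Python sketch of a scheduled freshness check on a production table. The table, column name, and alerting mechanism are assumptions for the example; a real system would page someone or post to a chat channel rather than print.

```python
# A minimal sketch of a scheduled "alarm" on a production dataset: if the data
# goes stale or disappears, raise an alert instead of waiting for a user to notice.
from datetime import datetime, timedelta, timezone
from typing import List
import pandas as pd

def check_freshness(df: pd.DataFrame, timestamp_col: str, max_age_hours: int = 24) -> List[str]:
    """Return alert messages if the table is empty or its newest row is too old."""
    alerts = []
    if df.empty:
        alerts.append("table is empty")
        return alerts
    newest = pd.to_datetime(df[timestamp_col]).max()
    if newest.tzinfo is None:
        newest = newest.tz_localize(timezone.utc)
    age = datetime.now(timezone.utc) - newest
    if age > timedelta(hours=max_age_hours):
        alerts.append(f"data is stale: newest row is {age} old")
    return alerts

def send_alert(message: str) -> None:
    # Placeholder: a real alarm would page on-call or post to chat.
    print(f"ALERT: {message}")

if __name__ == "__main__":
    production_table = pd.DataFrame({"loaded_at": ["2020-01-01T00:00:00Z"]})
    for alert in check_freshness(production_table, "loaded_at"):
        send_alert(alert)
```

Scheduled against each “banked” dataset, a check like this is what lets the team stop babysitting old work and get back to new value.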
Eric: That’s a good point. When I think about it, the ideal scenario here is you want a company with good morale or good culture. How do you get that? You get that by paying attention to morale. Bad morale is an early warning sign for failure. If you’ve got people that are unhappy in your organization, I guarantee, bad things are going to happen. They always happen. People are going to leave or they’re going to stick around and not do a very good job and be unhappy. All that stuff is bad news. You watch out for these leading indicators, which are frustration because what happens? You have friction points. That’s when the system doesn’t work properly. Now I have to go in and manage it. I have to do someone else’s job. I have to call somebody.
To Yves’ point, someone asked me for a new report. They didn’t know there are eight steps that have to take place and five other people are responsible for them for me to get them done. You’re constantly coordinating and angling and trying to align interests. Some of that stuff is good if the morale stays high and you know you’re getting somewhere. There’s nothing more frustrating than not being able to fix the problem and having the problem come up again and again, and being blamed for it again and again.
That’s why people get shy. You want your organization to be where people feel they can pull on that rope and say, “There’s a problem. I see this thing. We don’t want that going out the door,” and you reward people for doing that. That’s a bit of a management issue as well of making sure that your managers are focused on what they’re doing, but have the temerity and humility to admit when they’re wrong.
Christopher: I’m an introvert. I’ve had to learn how to manage people and my wife gets annoyed by this. When we go to a restaurant, I can tell within the first ten minutes if it’s a well-run restaurant. People are happy, things are cleaner, and the menu is not as dirty. I’m like, “Let’s go to this restaurant again. It’s well run.” She’s like, “I don’t care. The food has got to be good.” I’m like, “No, if somebody is running a good restaurant, the food will be good.” Sure enough, it usually is in a well-run restaurant. I’m a fan of well-run organizations.
We’ve been at this for a while. There are two problems: you’ve got to put it in the bank with an alarm on it and you’ve got to iterate your way to value. You have to do both, but I think a lot of organizations have so many alarms going off, they’ve built so many things, that they can’t get to delivering customer value. That’s why we’re focusing on observability: trying to lessen the number of errors, trying to stop the sirens going off, and making your life, day to day and hour to hour, a little bit more sane. That’s the first step where we’ve seen our customers want to go.
Everyone likes this idea, “I want to get customer value and fail fast.” Everyone has read the agile and DevOps principles. I think it has become more accepted over the last few years. We’ve written two books and a manifesto. I yakked about these ideas for years, but what we’ve learned, and our focus in getting people to start, is that if you’ve got a bunch of bank alarms going off, make sure you solve that first before you worry about putting more money in the bank.
Eric: I had mentioned that I came back from a racing event in Europe, which was fantastic. At this moment, I thought of a fun little analogy. I’ll throw it over to Christopher to comment on what happens when you do DataOps. It’s like doing the engineering on your car if you’re in one of these races. If you get a flat tire halfway around the track, then you got to come into the pit and have them change the tires and all that stuff.
That’s when something goes wrong with the data pipeline in our business. Now, instead of racing, staying out in front or catching up to your competition, you’re off on the sideline and you’ve got people frantically trying to fix something so you can get back on the track, way behind where you were before. Christopher, I think that’s a pretty fun analogy. What do you think?
Christopher: Your pit crew is important: how fast they work, how well they work, and how fast they can identify problems. If your team is taking two minutes when your competitor is taking one minute to change a tire, who’s going to win the race? Everyone is going to have a flat tire at some point.
Eric: That’s right. It’s just one example but the point is that we want to run our business from the data. We were talking on the break about Snowflake and how they’ve been dominating. They’ve captured the imagination of the analytics world and now they’re focusing on building apps on top of this thing, etc. They’re taking off and they’re doing great. I don’t see that changing anytime soon. One of the reasons they took off is they took this manufacturing style approach to their own data, eating their own dog food, drinking their own champagne, whatever you want to say about it.
It’s very important and it lets you get to the fun stuff. I promise you, anytime a data pipeline breaks, it’s like a flat tire. It’s a very unpleasant thing. I got back from Europe at about 1:00 in the morning and we’re about to drive back to our house. Long story short, the battery is dead. I’m like, “The battery is dead.” Why? We left the light on inside the car. There should have been a small alert that said, “Turn this light off before you go out of town for a week.” I’m out there getting a jump. Anytime you are off the road and not driving and not doing what you want to do, that’s a problem of DataOps not being done properly. Right, Christopher?
Christopher: I think so. There are two things. One is that with every race you want to improve your car and have it get better: better fuel, better technology. And while in the race, if something breaks, you want to notice it as quickly as possible and fix it. Those metaphors apply to data as well. We talked about the challenges that data and analytics have with both because it’s a creative field, just like engineering a race car or an electric race car is a creative field. You want to be able to create. In order to create, you have to change and try out and learn. There are a lot of very frustrated creative people in data and analytics.
They’re frustrated because, number one, the process to put anything into production is onerous and slow. Second, when it’s in production, all the alarms are going off. They don’t have time and no one trusts them. I think both those things need to be solved. That’s where, if you spend time saying to yourself, “We can solve them. We’re going to put effort into them,” you’re much better off. There are principles that have been established in other fields. There are examples from my data and analytic brethren that I can use. There are books I can read, templates I can follow, and software I can buy. Instead of living passively, you can actively fix the problem.
Eric: It’s a question of processes and best practices. Yves, I’ll bring you back in to comment on this. It does take a little bit of effort upfront to lay the groundwork and begin doing things properly. The bottom line is if you’re going to improve anything you’re doing, you’re going to change something. Either your supplier or your process or your tools or your technologies. Whatever the case is, you’re going to change something.
You want that iterative change to happen quickly. You want to be honest about the results. If you can tackle a lot of these data-oriented problems, as you know from experience, your morale is going to go up. Your team’s morale is going to go up. Everyone’s going to be a lot better off if you do the hard stuff up front and, as I say, well begun is half done. What do you think, Yves?
Yves: It’s about a process and thinking ahead a bit. Thinking ahead helps, as does the experience you have and the things you’ve already seen. There are a lot of things that can go wrong when you work with data. For example, if you have schema changes and a new field comes in, how do you tackle that? Either you automate it, or you put it in a tool that anticipates that. Once you get the signals, as Chris was saying, you put it in a process.
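As an illustration of automating that kind of schema check (my own sketch, not a tool Yves names), the idea is simply to diff the columns that arrive against the columns the pipeline expects and flag drift before the load runs. The expected column names are hypothetical.

```python
# An illustrative schema-drift check: compare incoming columns against the
# columns the pipeline expects. The expected set is a hypothetical example.
from typing import Dict, Set
import pandas as pd

EXPECTED_COLUMNS = {"customer_id", "order_id", "amount", "order_date"}

def diff_schema(incoming: pd.DataFrame) -> Dict[str, Set[str]]:
    """Report new fields that appeared and expected fields that went missing."""
    actual = set(incoming.columns)
    return {
        "unexpected_new_fields": actual - EXPECTED_COLUMNS,
        "missing_fields": EXPECTED_COLUMNS - actual,
    }

if __name__ == "__main__":
    incoming = pd.DataFrame(
        columns=["customer_id", "order_id", "amount", "order_date", "coupon_code"]
    )
    drift = diff_schema(incoming)
    if any(drift.values()):
        # Decide explicitly what to do with drift instead of failing later in production.
        print(f"Schema drift detected: {drift}")
```

Running a check like this at the start of every load is one way to turn “a new field showed up” from a surprise into a signal in the process.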
I was thinking as well that when you’re not using the system, it’s costing you money. I was thinking of a metaphor. If you take a plane, the hours that it’s not up in the air are costing the airline money. If you look at how a pilot works, he has a complete checklist. Even though some things will never happen, they do the check and make sure that the systems will work.
When they’re up in the air, they have a lot of systems that control their actions as well. I think that’s the way we need to go with our data management and our data operations as well. It’s amazing to see that there is so much knowledge already among the people working on data solutions, and yet we keep on reinventing the wheel. It should be simple: say, “This is what I’m doing,” and take the time to document the steps of what you’re doing.
I’ve learned that the hard way because back in the day, I was trying to automate before even thinking about the process. A guy came up to me and said, “Why don’t you do it manually so you completely understand all the steps that you’re going through before you start automating it?” That’s the lesson that sticks with me. Import your file manually so you’ll see what is going wrong and what you didn’t think about, then you can automate that in a certain way and go through all the steps instead.
Think ahead about all the different patterns that you can come across when you’re moving data through the pipeline. What can happen? What should we anticipate and how should we tackle that? If you give it some thought upfront, you’re not running behind when it’s in production and saying, “This failed because we have duplicates in our system and we need to fix it. Let’s go back to the source. We don’t have access to the source anymore.”
All those things you can anticipate before you run into troubles at the end of the chain. Mostly, the problem is that you get so little time and people are pushing to get it in production, and you’re taking shortcuts. Unfortunately, if you come into a new project and you inherit the code of somebody that took the shortcuts, that’s the fun project that you’re coming up on. That’s something I want to see more embedded than in a way of DataOps and operating your data solutions and data pipeline.
Eric: I completely agree. We’ve talked about this a couple of times in the show but replications of data, multiple copies of data, and then people do different things downstream of that. You’ve gone catawampus. This is a problem. I’ll throw it over to Christopher. The cloud is such an amazing marshaling area and we are finally moving in leaps and bounds away from the on-prem world of old and into the cloud.
Not that on-prem is ever going to go completely away. I think these data centers are going to stick around for a long time and be repurposed and other things are going to happen. The cloud is a beautiful marshaling area as long as you can handle the concurrency, the quality, the speed, the efficiency, etc. It’s a great place to architect around. What do you think, Christopher?
Christopher: I think it is. It’s a better set of tools. Its virtualization and costs make it a lot easier to do creative things, but you can also recreate the same problems you had in a different world. I’ve seen organizations with 400 engineers spend two years building a cloud platform without a customer ever getting any insights out of it.
It’s just tools. You can use those tools in a good way or a bad way. We realize that it’s not about the tool; it’s about the team. It’s about your customer. It’s about the system that those customers and teams work in. The shift from one tool to the other, even if the cloud suddenly changes to something else ten years from now, won’t make any difference by itself. That will be another toolchain, another time, and we’ll recreate the same problems. There’s an opportunity, but it’s an opportunity that only some organizations are taking.
Eric: I always look at these old expressions and maybe I’ll close on this one. There’s an old expression, “Throw good money after bad.” What it means is you should have learned a lesson at some point and stopped that foolish process because that was the bad money that you threw away. Once you’ve learned it, if you’re still making those mistakes, now you’re throwing good money after bad. It shows that you’re not learning your lessons. Trust me, those lessons will be learned for you in a very unpleasant way.
We see that happening now. This is not new but it seems to happen a lot faster. I think the half-life of business ideas and organizational structures is a lot shorter than it used to be. You have to stay on top of that and know how to modify your organization. Bring in new people, bring in new processes, look at other places for data, and remember it’s always going to be about automation and intelligent automation. As Christopher said, get the money, put it in the bank, put the alarms on it, and make sure it doesn’t get messed up, then go out and find some more money. Folks, thanks so much for your time and effort. Send me an email at [email protected].
Important Links
- InsightAnalysis.com
- DataKitchen
- NASA
- Microsoft for Startups
- [email protected]
- Yves Mulkers – LinkedIn
About Yves Mulkers
Yves is a Data strategist, specialised in Data Integration.
He has a wide focus on business performance and is a domain expert in Data Management.
Yves helps companies build their vision, strategy and roadmap to data Walhalla.
Yves is recognised as a top 10 influencer in Big Data, AI, Cloud and digital transformation and brings all his knowledge of emerging technologies and capabilities into the game, both on the client side and on B2B marketing strategy for data brands.
About Christopher Bergh
Christopher Bergh is the CEO and Head Chef at DataKitchen. Chris has more than 30 years of research, software engineering, data analytics, and executive management experience. At various points in his career, he has been a COO, CTO, VP, and Director of engineering. Chris has an M.S. from Columbia University and a B.S. from the University of Wisconsin-Madison.
Chris is a recognized expert on DataOps. He is the co-author of the “DataOps Cookbook” and the “DataOps Manifesto,” and a speaker on DataOps at many industry conferences and with other media outlets. Chris began his career at the Massachusetts Institute of Technology’s Lincoln Laboratory and NASA Ames Research Center. There he created software and algorithms that provided aircraft arrival optimization at several major airports in the United States. Chris served as a Peace Corps Volunteer Math Teacher in Botswana, Africa.