If you work in an IT department, you probably don’t think of data operations, or the optimization of data management, as a key part of your job. Sure, you might be responsible for keeping some MySQL or SQL Server databases up and running. And you may help maintain the underlying infrastructure that supports your organization’s data operations. But processing, managing and analyzing the data itself probably falls outside your job responsibilities, and you don’t often think about how to optimize data-related workflows.
That may be changing. The so-called DataOps movement is placing new demands on IT departments when it comes to planning, automating and optimizing the management of data. As a result, the IT jobs of the future are likely to require more expertise in data management and analysis.
Here’s a primer on what DataOps means and how it’s changing the nature of traditional IT jobs.
What Is DataOps?
Like DevOps, DataOps, which was first conceptualized in a 2014 IBM blog post, lacks a single, concrete definition. There is a “DataOps Manifesto” (modeled very explicitly on the “Manifesto for Agile Software Development”) that describes various principles associated with DataOps, but it doesn’t define what DataOps is. Instead, it explains how people who work with data should think and behave in order to maximize the efficiency of their work.
In general, the core concepts of the “DataOps Manifesto” are similar to those associated with the DevOps movement. Like DevOps, DataOps prioritizes automation, communication, continuous feedback and collaboration between different individuals and teams.
DataOps and the IT Department
The DataOps movement is designed first and foremost to improve the work done by data scientists or data engineers, that is, people whose main job is to manage and analyze data.
As noted above, most IT engineers don’t fall into this category. Their job is to support the systems and software required to enable other employees, including data scientists, to do their jobs.
But that does not mean IT pros can ignore the DataOps movement. Because DataOps encourages data scientists to rethink how they approach data management and analytics, it also places demands on IT departments to deliver new levels of efficiency and optimization when it comes to data.
Those demands may include:
- More flexible data infrastructure. Advocates of DataOps believe that data engineers should be able to pick and choose which types of data infrastructure work best. They should not have to use only the cloud (which the seminal DataOps post from IBM criticized as a potential bottleneck for data operations) or only on-premises infrastructure. They instead expect IT departments to supply them with choice, and to be ready to support whichever type of infrastructure (or combination of infrastructure) they choose.
- Tooling choice. Data engineers who embrace DataOps also believe they should have maximum choice when it comes to which tools they use. This means that IT departments must be prepared to support whichever tools data engineers deem necessary. Gone are the days when IT could say to data engineers, “This is the database [or operating system, or environment configuration, or whatever] that we support, and you need to find data processing tools that work with it.”
- Data automation. DataOps is all about automating as many data-related tasks as possible. For IT departments, this means that enabling DataOps requires eliminating as many manual processes and handoffs as possible when it comes to data operations. Data engineers don’t want to wait for you to export a database manually, or provision new data infrastructure by hand. They expect these things to happen automatically.
- Speed, speed, speed. DataOps believers prioritize speed. They expect the IT department to minimize the time it takes to move data from one location to another, or to transform data between different formats. Achieving speed in data operations requires careful architectural planning (for example, designing infrastructure in ways that minimize the amount of data transfer over the network), as well as smart decisions about which types of tools to adopt. (Some databases will be faster than others for certain use cases, for instance.)
- Feedback and communication. IT engineers and data scientists don’t have a history of working closely together. DataOps changes that. To do DataOps, both groups need to communicate effectively in order to keep each other aware of upcoming needs and expectations, as well as to provide feedback about what’s not working so that the other group can take steps to address it. IT can’t solve the problems of data engineers if it doesn’t know about them, and data engineers can’t make the jobs of their colleagues in IT more efficient if they never talk to them.
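To make the data automation point concrete, here is a minimal sketch of what replacing a manual database export with a script might look like. It is an illustration only, not a method prescribed by DataOps or by any particular tool: it assumes a SQLite database reached through Python's standard library, and the `export_table_to_csv` function, the `orders` table and the sample rows are all hypothetical names invented for the example.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def export_table_to_csv(conn: sqlite3.Connection, table: str, out_path: Path) -> int:
    """Write every row of `table` to `out_path` as CSV and return the row count."""
    # Table names cannot be bound as SQL parameters, so confirm the name
    # exists in the schema before interpolating it into the query.
    known = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table not in known:
        raise ValueError(f"unknown table: {table!r}")

    cursor = conn.execute(f'SELECT * FROM "{table}"')
    header = [col[0] for col in cursor.description]
    rows_written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        for row in cursor:
            writer.writerow(row)
            rows_written += 1
    return rows_written


def demo() -> list[list[str]]:
    """Build a throwaway in-memory database, export it, and read the CSV back."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data, used only to exercise the export.
    conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                     [(1, "acme", 19.5), (2, "globex", 42.0)])
    try:
        with tempfile.TemporaryDirectory() as tmp:
            out_path = Path(tmp) / "orders.csv"
            export_table_to_csv(conn, "orders", out_path)
            with open(out_path, newline="", encoding="utf-8") as handle:
                return list(csv.reader(handle))
    finally:
        conn.close()


if __name__ == "__main__":
    for line in demo():
        print(line)
```

In practice, a script like this would be triggered by a scheduler or a pipeline rather than by a person, which is the point: the data engineer gets the export without filing a request and waiting on IT.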
The specific requirements that DataOps places on IT departments will vary from organization to organization, and the above is not an exhaustive list. But it provides a sense of how DataOps is poised to change part of the nature of the work performed by IT departments.
The bottom line: If you thought that working in IT meant you didn’t have to think much about data operations beyond setting up some databases and running backup scripts, think again. DataOps is a thing, and it means that IT engineers have to meet new expectations for supporting and optimizing data operations.