DataOps is a set of practices, processes and technologies that combines an integrated and process-oriented perspective on data with automation and methods from agile software engineering to improve quality, speed, and collaboration and promote a culture of continuous improvement in the area of data analytics.[1] While DataOps began as a set of best practices, it has now matured to become a new and independent approach to data analytics.[2] DataOps applies to the entire data lifecycle[3] from data preparation to reporting, and recognizes the interconnected nature of the data analytics team and information technology operations.[4]
DataOps incorporates the Agile methodology to shorten the cycle time of analytics development in alignment with business goals.[3]
DevOps focuses on continuous delivery by leveraging on-demand IT resources and by automating the testing and deployment of software. This merging of software development and IT operations has improved the velocity, quality, predictability and scale of software engineering and deployment. Borrowing methods from DevOps, DataOps seeks to bring these same improvements to data analytics.[4]
DataOps uses statistical process control (SPC) to monitor and control the data analytics pipeline. With SPC in place, the data flowing through an operational system is continuously monitored and checked against expected ranges. If an anomaly occurs, the data analytics team can be notified through an automated alert.[5]
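As an illustration only, the SPC idea described above can be sketched as a simple control-limit check. The example assumes that the monitored metric is the row count of each incoming batch and that limits are set at three sample standard deviations around a baseline mean; the function names, the baseline figures and the use of `print` as the alert channel are hypothetical choices made for this sketch, not part of any particular DataOps tool.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower, upper) control limits derived from baseline measurements."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation of the baseline
    return center - sigmas * spread, center + sigmas * spread


def monitor(values, limits, alert=print):
    """Check each value against the limits and alert on out-of-range points.

    Returns a list of (index, value) pairs for every anomaly found.
    """
    lower, upper = limits
    anomalies = []
    for index, value in enumerate(values):
        if not lower <= value <= upper:
            anomalies.append((index, value))
            # Hypothetical alert hook: replace print with e-mail, chat, paging, etc.
            alert(f"SPC alert: batch {index} has value {value}, "
                  f"outside [{lower:.1f}, {upper:.1f}]")
    return anomalies


# Assumed baseline: row counts observed in earlier, known-good batches.
baseline_row_counts = [1000, 1020, 980, 1010, 990, 1005, 995]
limits = control_limits(baseline_row_counts)

# New batches arriving in the pipeline; the second one is far too small.
anomalies = monitor([1003, 400, 1012], limits)
```

In this sketch only the second batch falls outside the limits and triggers the alert. A production pipeline would typically track several metrics (null rates, value distributions, processing times) and recompute limits as the baseline evolves.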
DataOps is not tied to a particular technology, architecture, tool, language or framework. Tools that support DataOps promote collaboration, orchestration, quality, security, access and ease of use.[6]