Much has been written about Data Quality (DQ) in the broader context of Data / Information Management, and most practitioners can recite its dimensions (accuracy, completeness, timeliness, uniqueness, consistency, etc.), describe DQ assessment / profiling, and outline a step-by-step approach to enhancing DQ. Unlike Data Governance, though, not much has been written about a Data Quality framework. Any Data Governance practice starts with a Data Governance framework and how to put it together. This raises the question: why not a Data Quality framework?
First, let’s understand what a framework is. A framework is ‘a structure underlying a system, concept, or text’. Given that definition, shouldn’t there be a ‘structure’ for Data Quality around which a comprehensive program is put together? At Digital Transformation Pro, we have been working with a Data Quality framework as a starting point with our clients given that DQ is an end-to-end process.
Data Quality Framework
Our framework draws upon the Six Sigma methodology (Define, Measure, Analyze, Design/Improve, and Verify/Control) and the System Development Life Cycle components (Plan, Analyze, Design, Build, Test, Deploy, and Maintain), as mentioned in the Data Management Body of Knowledge (DMBOK). The main components of this framework are: Plan, Assess, Analyze, Pilot, Deploy, and Maintain. Let us break down these components.
The Planning (or designing) phase consists of defining the scope and business need, identifying stakeholders, clarifying business rules for data, and identifying business processes. The outcome of the planning phase should clearly communicate the objectives of the DQ work to relevant senior management and other stakeholders.
The Assess phase measures the existing data against business policies, data standards, and business practices. Profiling is a key component of this phase, and a lot has already been written about profiling and assessment.
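As an illustration of the profiling this phase relies on, here is a minimal Python sketch that measures two of the dimensions mentioned earlier, completeness and uniqueness, for a single field. The function names, the `customers` sample records, and the rule that empty strings count as missing are all assumptions made for this example; a real assessment would apply the business rules agreed in planning.

```python
from typing import Any, Dict, List, Optional

Record = Dict[str, Optional[Any]]


def _is_filled(value: Optional[Any]) -> bool:
    # Assumption: None and the empty string both count as "missing".
    return value is not None and value != ""


def completeness(records: List[Record], field: str) -> float:
    """Share of records in which `field` is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if _is_filled(r.get(field)))
    return filled / len(records)


def uniqueness(records: List[Record], field: str) -> float:
    """Share of the non-empty values of `field` that are distinct."""
    values = [r.get(field) for r in records if _is_filled(r.get(field))]
    if not values:
        return 0.0
    return len(set(values)) / len(values)


# Hypothetical sample data, for illustration only.
customers: List[Record] = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "a@example.com"},
    {"id": 4, "email": "b@example.com"},
]

profile = {
    "email_completeness": completeness(customers, "email"),
    "email_uniqueness": uniqueness(customers, "email"),
}
print(profile)
```

In this sample, three of the four records carry an email and two of those three values are distinct, so the profile flags both a completeness and a uniqueness issue on the same field.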
In the Analyze phase, we typically use both quantitative and qualitative analytical techniques to perform a gap analysis between where the data quality should be, based on what was defined in the planning phase, and where it actually is.
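The quantitative side of such a gap analysis can be sketched in a few lines of Python. This is only an illustration: the `gap_analysis` function, the target values, and the measured scores are hypothetical, and treating a dimension with no measurement as a score of 0.0 is an assumption of this sketch rather than part of the framework.

```python
from typing import Dict


def gap_analysis(targets: Dict[str, float], measured: Dict[str, float]) -> Dict[str, float]:
    """Return target minus measured score per dimension (positive = shortfall).

    Assumption: a dimension with no measurement is treated as 0.0.
    """
    return {
        dim: round(target - measured.get(dim, 0.0), 4)
        for dim, target in targets.items()
    }


# Hypothetical targets from the planning phase and scores from assessment.
targets = {"completeness": 0.98, "uniqueness": 1.00}
measured = {"completeness": 0.75, "uniqueness": 0.90}

gaps = gap_analysis(targets, measured)

# Keep only dimensions that fall short, with the largest gap first.
shortfalls = sorted(
    ((dim, gap) for dim, gap in gaps.items() if gap > 0),
    key=lambda item: item[1],
    reverse=True,
)
for dim, gap in shortfalls:
    print(f"{dim}: short of target by {gap:.2%}")
```

Ranking the shortfalls is one simple way to decide where to look first; the qualitative analysis (for example, stakeholder interviews about why a field is left empty) is not something code like this can capture.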
Organizations may vary in how they handle the Pilot and Deploy phases, but we recommend a Pilot phase that focuses on the specific actions needed to improve data quality. The Pilot phase might also identify business processes that need to be adjusted to improve data quality on a sustained basis.
Based on the outcomes of the Pilot phase, the Deploy phase should focus on both business and technical solutions to improve data quality. Many organizations tend to focus only on technical solutions and ignore business solutions, which in our opinion is a major mistake.
In the Maintain phase, it is very important to put processes and control mechanisms in place to sustain the data quality efforts on an ongoing basis. Data Governance plays an important role in making sure that data quality is maintained over the life of the program.
We’d like to hear your thoughts on this framework in general and on its components in particular.