Measuring Data Quality and Data Stewardship - April 2023 Meetup
Measuring Data Quality and Data Stewardship
Storage is cheap, processing is flexible, and data is a hidden trove of value increasingly used to drive business decisions and priorities. We generate it by the terabyte. We copy it from database to database and transform it from NoSQL to relational data to graphs and charts. We ingest it from customers and reflect it back to them with augmentations and additions our Sales departments have promised are just what is needed to drive them to the next level. Yet whether it’s an unstructured data lake or a decades-old dusty schema, everyone eventually realizes that a lot of this data is possibly wrong, missing, or maybe just useless junk.
How do you measure the quality of your data? What are the actual metrics that can be used to measure data quality to ensure confidence in your decision-making process? What does data quality even mean?
Data Quality is a comparison of the actual state of a particular set of data to a desired state.
For data quality to be measured you need a standard of comparison. Within a given data set you can compare that data with itself or statistically with similar data sets. Between copies of data you can compare the source data to the target data while compensating for business rules that transform or selectively filter data between the two.
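The source-to-target comparison described above can be sketched roughly as follows. This is a minimal illustration, not the speaker's method: the function name `reconcile`, the row layout, and the two example business rules (only active rows are copied, and emails are lowercased on the way) are all assumptions made for the example.

```python
def reconcile(source_rows, target_rows, key, transform=lambda r: r, keep=lambda r: True):
    """Compare source rows to target rows after applying business rules.

    transform: hypothetical rule that reshapes a source row into its expected target form.
    keep: hypothetical rule that decides whether a source row should be copied at all.
    """
    # Build the desired state of the target from the source plus the rules.
    expected = {r[key]: transform(r) for r in source_rows if keep(r)}
    actual = {r[key]: r for r in target_rows}

    missing = sorted(expected.keys() - actual.keys())      # should be in target, is not
    unexpected = sorted(actual.keys() - expected.keys())   # in target, should not be
    mismatched = sorted(
        k for k in expected.keys() & actual.keys() if expected[k] != actual[k]
    )
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}


# Illustrative data only.
source = [
    {"id": 1, "email": "A@Example.com", "active": True},
    {"id": 2, "email": "b@example.com", "active": False},
    {"id": 3, "email": "c@example.com", "active": True},
    {"id": 4, "email": "D@example.com", "active": True},
]
target = [
    {"id": 1, "email": "a@example.com"},
    {"id": 4, "email": "D@example.com"},  # email was not lowercased
    {"id": 5, "email": "e@example.com"},  # no matching source row
]

report = reconcile(
    source,
    target,
    key="id",
    transform=lambda r: {"id": r["id"], "email": r["email"].lower()},
    keep=lambda r: r["active"],
)
print(report)
```

Encoding the business rules as functions keeps the comparison itself generic: the filtered row (id 2) is not reported as missing, because the rules say it should never have been copied.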
Each of those methods has a number of data quality measurements that can be used as components of total data quality, such as:
Consistency
Completeness
Validity
Exactness
Aptronymity
Uniqueness
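As a rough illustration of how a few of these measurements might be turned into numbers, here is a minimal sketch covering completeness, uniqueness, and validity only. The abstract does not define the measurements, so the definitions below (simple ratios between 0 and 1), the function names, and the five-digit ZIP code rule are assumptions chosen for the example.

```python
import re


def _present(rows, field):
    """Values of `field` that are neither missing nor empty."""
    return [r[field] for r in rows if r.get(field) not in (None, "")]


def completeness(rows, field):
    """Share of rows in which `field` has a value."""
    if not rows:
        return 1.0
    return len(_present(rows, field)) / len(rows)


def uniqueness(rows, field):
    """Share of present values of `field` that are distinct."""
    values = _present(rows, field)
    if not values:
        return 1.0
    return len(set(values)) / len(values)


def validity(rows, field, pattern):
    """Share of present values of `field` that fully match `pattern`."""
    values = _present(rows, field)
    if not values:
        return 1.0
    return sum(1 for v in values if re.fullmatch(pattern, v)) / len(values)


# Illustrative data only.
rows = [
    {"id": 1, "zip": "97201"},
    {"id": 2, "zip": ""},       # missing value
    {"id": 3, "zip": "9720"},   # too short to be a five-digit ZIP
    {"id": 4, "zip": "97201"},
]

print(completeness(rows, "zip"))         # 3 of 4 rows have a value
print(uniqueness(rows, "id"))            # all 4 ids are distinct
print(validity(rows, "zip", r"\d{5}"))   # 2 of 3 present values match
```

Each function returns a score for one field, so the scores can be tracked over time or compared against a threshold that represents the desired state.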
Further, we can elaborate on practices and cultures that encourage data quality or that enable poor quality to sneak in. Data Stewardship is a great start for creating ownership of data and a connection with its purpose and uses. Data standards enable development teams to build data quality in from the get-go with practices like rigorous input validation or periodic data surveys.
If your business and customers depend on data, you need data you can depend on.
Nick Bonnichsen
Nick Bonnichsen is a Software QA Engineer with more than 20 years of experience, some of it actually worthwhile. He has focused on data and data-centric testing for the last 8 years or so, having worked in various parts of testing and decided to stay the heck away from UIs whenever possible. In his free time Nick enjoys live music, overly complicated video games, poor attempts at woodworking, international travel, and the occasional backpacking trip.
Video: Measuring Data Quality and Data Stewardship - April 2023 Meetup, from the PNSQC channel