Caterina Constantinescu - Data Validation in R: From Principles to Tools and Packages [Remote]
Talk delivered October 21, 2020. Visit https://nyhackr.org/ to learn more and follow https://twitter.com/nyhackr. Don't forget to join the Government & Public Sector R Conference Dec 2-4, 2020 at rstats.ai/gov. Much like our recent NYR, it is virtual, so anyone around the world can attend. Visit https://rstats.ai/gov/ to learn more and use code nyhackr for a 20% discount on tickets.
About the Talk:
Although data cleaning is a frequent topic of conversation (and commiseration) in the world of data science, data validation is, somewhat surprisingly, discussed less often. So in this talk, data validation will take centre stage: we look at what it is (and is not), along with guiding principles, best practices and overall criteria for assessing and ensuring data validity. The talk will also cover several R packages aimed at this precise topic, for instance {validate}, {assertr} and {ensurer}, as well as other related packages and functions, with examples provided as we go along. By the end, the aim is to have provided an overview of the principles and tools in this area, while highlighting the importance of the topic itself.
About Caterina:
Dr. Caterina Constantinescu is a data scientist at Tesco Bank whose past work ranges across research methods, national health data, occupational therapy, transport and data for good. Her earlier academic research examined whether various emotion-generating stimuli used in lab settings could approximate emotional states occurring in daily life. For several years she organised the R meetup in Edinburgh (EdinbR), and in 2019 she organised the DataTech conference. Currently, her work focuses on writing Shiny apps that support data-driven decision-making across the bank.
Where to find Caterina:
https://datapowered.io
https://twitter.com/c__constantine
https://github.com/CaterinaC
Thank you EcoHealth Alliance (https://www.ecohealthalliance.org/) for providing the Zoom link.
Video: Caterina Constantinescu - Data Validation in R: From Principles to Tools and Packages [Remote], from the Lander Analytics channel