Alex Woodie, managing editor of Datanami (a big data news portal covering emerging trends and solutions), recently wrote about a report by Monte Carlo. The report outlines how data quality is declining despite the availability of better tools and technology.
Data is a valuable resource that powers digital commerce. Data engineers build pipelines to transport data from its origin to its intended destination. However, maintaining these pipelines can be challenging due to issues such as schema changes, data drift, freshness problems, sensor malfunctions, and human error.
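To make two of these failure modes concrete, here is a minimal sketch of the kind of lightweight checks a pipeline might run before loading a batch. The column names, expected dtypes, and the one-hour freshness window are illustrative assumptions, not details from the report.

```python
import pandas as pd
from datetime import datetime, timedelta, timezone

# Hypothetical expected schema for an incoming batch (an assumption for
# illustration; real pipelines would source this from a contract or catalog).
EXPECTED_SCHEMA = {
    "order_id": "int64",
    "amount": "float64",
    "updated_at": "datetime64[ns, UTC]",
}

def check_schema(df: pd.DataFrame) -> list[str]:
    """Return schema problems: missing columns or dtype drift."""
    problems = []
    for column, expected_dtype in EXPECTED_SCHEMA.items():
        if column not in df.columns:
            problems.append(f"missing column: {column}")
        elif str(df[column].dtype) != expected_dtype:
            problems.append(f"{column}: expected {expected_dtype}, got {df[column].dtype}")
    return problems

def check_freshness(df: pd.DataFrame, max_age: timedelta = timedelta(hours=1)) -> list[str]:
    """Flag the batch as stale if its newest record is older than max_age."""
    age = datetime.now(timezone.utc) - df["updated_at"].max()
    return [f"stale data: newest record is {age} old"] if age > max_age else []

if __name__ == "__main__":
    batch = pd.DataFrame({
        "order_id": [1, 2],
        "amount": [9.99, 24.50],
        "updated_at": pd.to_datetime(["2023-03-01T00:00:00Z", "2023-03-01T00:05:00Z"]),
    })
    for issue in check_schema(batch) + check_freshness(batch):
        print("data issue:", issue)
```

Checks like these catch schema changes and stale data at load time; drift in the data's statistical properties or upstream sensor faults would need additional monitoring.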
Woodie outlines a survey Monte Carlo conducted in March 2023 of 200 data decision-makers, which revealed an increase in the number of data incidents. The study found that the percentage of respondents spending four or more hours resolving data issues rose from approximately 45% in 2022 to about 60% in 2023. The survey also found that IT team members are identifying data problems less often.
The report discusses data downtime, a concept Monte Carlo measures with three key metrics: time-to-detection, time-to-resolution, and the number of issues over time. Tracking the raw number of issues is important, and Monte Carlo recommends improving communication and visibility within teams to reduce how often data issues occur.
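As a rough sketch of how these three metrics might be computed from incident records, consider the snippet below. The `Incident` fields and sample timestamps are assumptions for illustration; the report does not prescribe a data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Incident:
    occurred_at: datetime   # when the issue actually began
    detected_at: datetime   # when the team noticed it
    resolved_at: datetime   # when it was fixed

def downtime_metrics(incidents: list[Incident]) -> dict:
    """Average time-to-detection, time-to-resolution, and issue count."""
    n = len(incidents)
    ttd = sum((i.detected_at - i.occurred_at for i in incidents), timedelta()) / n
    ttr = sum((i.resolved_at - i.detected_at for i in incidents), timedelta()) / n
    return {"issues": n, "avg_time_to_detection": ttd, "avg_time_to_resolution": ttr}

incidents = [
    Incident(datetime(2023, 3, 1, 8), datetime(2023, 3, 1, 10), datetime(2023, 3, 1, 15)),
    Incident(datetime(2023, 3, 2, 9), datetime(2023, 3, 2, 9, 30), datetime(2023, 3, 2, 13)),
]
print(downtime_metrics(incidents))
```

Tracked over successive periods, these numbers show whether a team is catching issues faster, fixing them faster, and seeing fewer of them overall.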
As businesses rely on more database tables and models, the complexity of the data pipeline grows substantially, since each table or model can potentially depend on any of the others (a back-of-the-envelope illustration follows below). To manage this complexity and avoid quality errors, well-defined roles and effective communication channels are more important than ever.
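The quadratic growth is easy to see with a small calculation. The asset counts below are assumptions chosen for illustration, not figures from the report.

```python
# If every table or model can depend on any other, the number of potential
# pairwise relationships is "n choose 2", which grows quadratically.

def pairwise_relationships(n_assets: int) -> int:
    """Potential table-to-table dependencies among n assets: n*(n-1)/2."""
    return n_assets * (n_assets - 1) // 2

for n in (10, 100, 1000):
    print(f"{n} assets -> {pairwise_relationships(n):,} potential relationships")
    # 10 -> 45, 100 -> 4,950, 1000 -> 499,500
```

Going from 100 to 1,000 assets multiplies the potential relationships roughly a hundredfold, which is why ownership and communication practices that work for a small warehouse break down at scale.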
Read more about the survey here.