Here There Be Monsters: The Value of Data Profiling
In the ever-growing sea of data that drives business, research, and opportunity today, there be monsters.
Monsters and monstrous creatures appeared frequently on medieval and Renaissance maps to identify the unknown dangers of the sea. As technology improved our ability to explore and chart the world, these monsters were replaced by the solid, reassuring lines of known landforms and safe paths for travel and commerce.
Likewise, the data profiles for an organization identify the known and unknown points within its data. Data grows and moves so fast that a perfect storm of conflicting information, quality issues, and poor usage can wipe out months of work or erode years of reputation-building in a single afternoon.
A robust data-profiling strategy can prevent tremendous amounts of damage and rework. It can provide a more accurate picture of an organization’s data systems and shine light into the pools of dark data to find risks before they become full-blown monsters.
Technology improvements have expanded our ability to explore and plot the increasingly diverse pools and streams of data. Profiling tools allow operational gaps and risks to be identified in living data sets before they become a maintenance nightmare. Automation of common profiling tasks supports the continuous regression testing needed for DevOps. And data exploration and analysis drive insights that, in turn, drive decision-making. Predictive analytics would be impossible, or at least unreliable, without a good data profile.
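To make the idea of an automated profiling task concrete, here is a minimal sketch in Python using only the standard library. It is an illustration rather than any particular tool's method: the function names, the sample rows, and the choice to treat empty strings as missing values are all assumptions made for the example.

```python
from collections import Counter


def profile_column(values):
    """Summarize one column: row count, missing values, distinct values, top value."""
    # Assumption: None and empty or whitespace-only strings both count as missing.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(present)
    most_common = counts.most_common(1)[0][0] if counts else None
    return {
        "rows": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(counts),
        "most_common": most_common,
    }


def profile_table(rows):
    """Profile every column of a table given as a list of dictionaries."""
    columns = sorted({key for row in rows for key in row})
    return {col: profile_column([row.get(col) for row in rows]) for col in columns}


# Hypothetical sample data, invented for this sketch.
sample_rows = [
    {"customer_id": 1, "state": "OH"},
    {"customer_id": 2, "state": ""},
    {"customer_id": 3, "state": "OH"},
    {"customer_id": 4, "state": None},
]

report = profile_table(sample_rows)
for column, summary in report.items():
    print(column, summary)
```

Run on a schedule or as part of a build, a summary like this could be compared against the previous run so that a sudden jump in missing values surfaces as a regression instead of a surprise.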
A comprehensive data profile bridges the gap between technical and business roles, allowing for faster communication of obstacles and more effective collaboration on opportunities for cost savings or new revenue. Mapping the measurements from profiling activity to business classifications facilitates proactive dashboarding and alert mechanisms, which show the business impact of changes that may be detectable only within the data itself. Product and process opportunities also emerge from the weaknesses detected by data profiling. Insights that strengthen input validation and gathering mechanisms can save costs and improve customer experience for the data consumer.
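As a rough sketch of how profiling measurements might be mapped to business classifications to drive alerts, consider the Python example below. Everything in it is hypothetical: the column names, the classifications, the thresholds, and the helper functions are assumptions chosen for illustration, not a prescribed design.

```python
# Hypothetical rules: each column is tied to a business classification
# and the highest share of missing values the business will tolerate.
RULES = {
    "email": {"classification": "customer contact", "max_null_rate": 0.05},
    "postal_code": {"classification": "shipping", "max_null_rate": 0.10},
}


def null_rate(values):
    """Return the fraction of values that are missing (None or blank)."""
    if not values:
        return 0.0
    missing = sum(1 for v in values if v is None or str(v).strip() == "")
    return missing / len(values)


def find_alerts(table, rules):
    """Compare each column's null rate to its threshold and collect breaches."""
    alerts = []
    for column, rule in rules.items():
        rate = null_rate(table.get(column, []))
        if rate > rule["max_null_rate"]:
            alerts.append({
                "column": column,
                "classification": rule["classification"],
                "null_rate": rate,
            })
    return alerts


# Hypothetical column data, invented for this sketch.
table = {
    "email": ["a@example.com", None, "", "b@example.com"],
    "postal_code": ["44101", "44102", "44103", "44104"],
}

alerts = find_alerts(table, RULES)
for alert in alerts:
    print(f"{alert['classification']}: {alert['column']} is {alert['null_rate']:.0%} missing")
```

Because each alert carries a business classification and not just a column name, a non-technical reader can see which part of the business is affected without needing to know the schema.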
Developing and maintaining a data profile is not confined to a single role within an organization. It is instead a practice through which the mutually supporting roles of data quality, data governance, software development, operational support, security, risk management, and product development can share knowledge to the benefit of all parties.
Create a strong, resilient data profile in your company to reduce uncertainties and illuminate the places where monsters could be lurking.
Shauna Ayers and Catherine Cruz Agosto are presenting the session Uncover Untold Stories in Your Data: A Deep Dive on Data Profiling at STAREAST 2016.