The cornerstone of any effective Data Strategy is high-quality data. But what is Data Quality, and why bother measuring it in the first place?
Let’s start with the basics. Data Quality refers to whether or not a particular data element is fit for purpose. If a particular data point can achieve the goals you’ve set for it, then it’s considered high quality.
You might be tempted to think of Data Quality wholly in terms of accuracy, completeness, reliability, or timeliness—and those are all key, easily measurable parts of it—but Data Quality is about relevance as well.
Even a flawless, error-free data set can still fail to perform the task or function it’s been assigned. In that case, it’s still not fit for purpose—it’s still not high-quality data. Keep this strategic aspect of Data Quality in mind.
So now that we know what Data Quality is, why should we bother assessing it? The short answer is that Data Quality can make or break even the most bulletproof Data Strategy. Without ensuring that your company data remains relevant, up-to-date and totally reliable, you risk making business-critical decisions on completely faulty assumptions.
Internal processes could fall apart, customer relationships could suffer, and time and money could be wasted on putting out fires instead of generating value. In other words, the company’s long-term corporate strategies and business goals could be put in jeopardy.
Clearly, keeping close tabs on Data Quality is a must-do.
But measuring Data Quality isn’t all about warding off catastrophe. Do it right and you’ll see data prep times plummet, customer relationships blossom and cross-functional communications dramatically improve.
Ok, so measuring Data Quality is worthwhile and rewarding—but where to begin?
Combing through your company databases to manually highlight duplicates, missing fields and redundant formats—to say nothing of actually standardising or correcting those errors—could literally take a lifetime.
To get the job done you’re going to want a tool.
What to look for in a Data Quality Tool
Data Quality tools can dramatically speed up the process of finding and cleaning up your company’s dirty data. To do this, most Data Quality tools will be able to perform a similar range of basic functions, such as profiling, parsing, standardising, cleansing, matching, monitoring and nowadays even enriching your data.
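To make the profiling step concrete, here is a minimal sketch of what a tool automates at scale: counting missing values and flagging duplicate keys. This is a toy illustration with made-up sample records, not any vendor's API.

```python
from collections import Counter

# Hypothetical customer records, for illustration only.
records = [
    {"id": 1, "email": "ana@example.com", "country": "SG"},
    {"id": 2, "email": "ben@example.com", "country": ""},    # missing country
    {"id": 3, "email": "ana@example.com", "country": "SG"},  # duplicate email
]

def profile(rows, key):
    """Report missing values per field and duplicated values of `key`."""
    missing = Counter()
    for row in rows:
        for field, value in row.items():
            if value in ("", None):
                missing[field] += 1
    dupes = [v for v, n in Counter(r[key] for r in rows).items() if n > 1]
    return {"missing": dict(missing), "duplicate_" + key: dupes}

report = profile(records, "email")
print(report)  # {'missing': {'country': 1}, 'duplicate_email': ['ana@example.com']}
```

A real tool runs checks like these across millions of rows and hundreds of rules, which is exactly why doing it by hand doesn't scale.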
When looking for just the right Data Quality tool, you’ll want to be especially mindful of three things:
- Your company’s specific Data Quality challenges. This one is a no-brainer; make sure you settle on a tool that can actually tackle your company’s specific needs—that means compatibility with your existing systems as well as capacity to perform the required functions.
- The strengths and weaknesses of your chosen tool. You don’t want to depend too heavily on a tool that just wasn’t designed to perform the tasks that you’ll most often need it to perform. You’ll also want to be careful not to settle on a tool that’s overly complex or beyond your IT department’s ability to operate. After all, there’s no point investing in a flashy new tool if it’s not going to be put to use.
- What tools can and cannot do. This last point will take some explaining. A Data Quality assessment might reveal organisational or strategy-related issues far beyond the capacity of any tool to fix. For example, an excess of Data Quality errors might suggest a lack of ownership or accountability over your company data. You might discover a disregard for privacy or security processes—such as if sensitive data sets aren’t being masked or stored in the proper format. You could even reveal a disconnect between your company’s IT and business strategies—such as if potentially valuable data is being misrepresented, poorly maintained or simply overlooked. In all these cases a Data Quality tool may help you diagnose the problem, but you’ll need to invest in your people and your processes to develop a solution, not just your tech.
Now that you know what to watch out for, here are 5 top-notch Data Quality tools to help you get the job done.
Ataccama ONE
Ataccama’s Data Quality product, ‘Ataccama ONE,’ is described as a ‘self-driving Data Management & governance platform for digital transformation.’ That may sound like a mouthful of buzzwords, but Ataccama’s Data Quality tool has been well received by its roughly 330 active customers.
Ataccama ONE scores particularly well on multidomain functionality and integration with other master Data Management tools, so there’s a good chance it’ll work well with your existing systems. Plus, a free trial license gives prospective buyers a chance to see whether or not the tool works for them before they buy. Not bad.
Be aware that customer reviews frequently cite “difficulty of implementation and use” as a top concern with Ataccama ONE. If your teams aren’t confident learning the ropes of a new tool, Ataccama may not be the Data Quality tool for you.
Experian Aperture Data Studio
Experian’s recently launched Aperture Data Studio offers an intuitive new user interface with enhanced Data Quality functions and customisable processing rules—but its real strength comes from its overall business friendliness.
Experian’s new Data Quality tool excels at providing business-focused workflows and management-friendly visualisations. Its data matching functions use machine learning to automatically link duplicate records in your database, getting you straight to a single customer view, while an overhauled data enrichment function generates novel insights about customer needs—giving you a leg up on timely, targeted messaging.
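Experian's matching engine is proprietary, but the underlying idea of record linkage can be sketched simply: link records whose fields are *similar*, not just identical. The snippet below uses Python's standard-library `difflib` and a hypothetical threshold, purely to illustrate the concept.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] of how alike two strings are (case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Hypothetical records that likely refer to the same customer.
rec_a = "Jon Smith, 21 Baker Street"
rec_b = "John Smith, 21 Baker St."

# Link the pair when similarity clears a tuned threshold (0.8 here is arbitrary).
THRESHOLD = 0.8
if similarity(rec_a, rec_b) >= THRESHOLD:
    print("linked:", rec_a, "<->", rec_b)
```

Production tools replace the simple ratio with trained models and domain-specific rules, but the goal is the same: collapsing near-duplicates into a single customer view.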
The depth of rules Aperture can accommodate and the speed of its results provide quick, relatable intel on the shape of your data—just the thing for reducing friction and engaging business users.
The only trouble customers report is with migrating to Aperture Data Studio from the older Experian Pandora platform. New users have nothing to fear, and long-time customers can rest easy thanks to freshly expanded technical support options.
Informatica Data Quality
Informatica boasts an impressive list of partners and a variety of Data Quality products, including Informatica Data Quality (IDQ) and Big Data Quality. Across the board, these tools excel in terms of sheer technical innovation.
Informatica’s Data Quality platform uses metadata-driven machine learning to identify domain consistency and errors. This allows the platform’s algorithms to learn from human Data Stewards, theoretically increasing productivity and automating a wide range of tasks. Impressive stuff.
Unfortunately, Informatica users often complain about ease-of-use, citing the difficulty of creating rules and dashboards as a major drawback—especially for business users. Coupled with a steep price tag and a complicated licensing model, all but the most tech-savvy tool-seekers should consider looking elsewhere.
Precisely Trillium
Precisely Trillium, previously known as Syncsort Trillium, provides not just one but six different Data Quality applications, including Trillium DQ, Trillium Global Locator, Trillium Cloud, Trillium Quality for Big Data, Trillium Quality for SAP and Trillium Quality for Dynamics.
Precisely Trillium’s strength lies in the variety of specialised functions and deployment opportunities covered by these various tools. Trillium DQ alone supports more than 200 countries and territories and can integrate into a range of system architectures, so compatibility is rarely an issue.
Meanwhile, Trillium Quality for Big Data leverages machine learning to cleanse and optimise hefty corporate data lakes, while Trillium Cloud offers solid Data Quality solutions for public, private and hybrid cloud platforms.
Much like Informatica’s, Precisely Trillium’s downsides have to do with ease of use, with customer reviews frequently citing challenging interfaces and a complex installation process as top hurdles. The complex operating processes might not be a problem for tech-savvy PhDs, but expect some extra translation work if you intend to share your results with the rest of the business. Thankfully, Precisely Trillium’s customer support is considered top of the line, so the steep learning curve needn’t be a deal-breaker.
Talend
Talend has a reputation for user-friendly Data Quality tools and recently launched a new line of Metadata Management solutions for data lakes and big data projects. Good news if your organisation is looking to clean up a hefty database.
Talend’s strong suits are without a doubt its ease of use and simple start-up process, both of which are certified customer favorites. Another perk is Talend’s remarkably active open-source user community—a hugely useful resource for anyone struggling to learn the ropes.
Unfortunately, what Talend gains from its helpful community, it loses in formal technical support. Customers cite technical issues with monitoring, reporting and scheduling functions—as well as patchy support services—as serious obstacles to success.
Another frequently cited technical snag involves the limited number of rows that Talend’s Data Quality application can profile at one time—capping out at around 10,000 rows. If you’re dealing with millions of rows of data at a time, you may struggle to see the full picture with Talend.
Is a tool really the solution?
While investing in a powerful Data Quality tool can seem like a fix-all solution to your company’s dirty data, the truth is that tools are just a small part of a larger picture.
You should always prioritise developing your personnel and internal processes before investing in tools.
Be sure to align your Data Governance policies with corporate strategy, identify concrete business objectives and clarify roles and responsibilities before you begin shopping for software. After all, what good is high-quality data if you won’t be able to put it to use?
Data Quality tools should only be seen as a means of empowering your Data Stewards to achieve their goals more efficiently, not as a substitute for a solid Data Strategy. We advise waiting until you’re above a level 3 in the Data Maturity Assessment to begin browsing for Data Quality tools.
Still have questions about which Data Quality solution is right for you? Need help drafting RFP requirements for a new Data Management system? Want to learn more about Data Quality, Data Maturity, or the bigger picture of Data Governance? Not sure if your organisation is even ready for Data Governance in the first place? Cognopia is here to help.
With free Data Maturity assessments and a range of bespoke consulting services, Cognopia’s team of Data Governance experts will help you craft a bulletproof Data Strategy unique to your organisation’s needs.
Schedule a chat here and find out what Cognopia can do for you.