Let’s face it – if data quality were easy, everyone would have good data, and it wouldn’t be such a hot topic. Yet despite all the tools and advice out there, selecting and implementing a comprehensive data quality solution still presents some hefty challenges.
1. Secure Advocates for Data Quality Initiatives
One of the top challenges for data quality initiatives is securing the interest of business stakeholders. This doesn’t just mean C-level executives; stakeholders for data quality can be anyone from department managers to end-users within the company who face the consequences of bad data on a regular basis.
For example, Marketing relies on clean and accurate data to reach the correct audience and maintain a positive brand image. Customer Service depends on that data to respond to customer queries in a timely and accurate manner. Other departments, like Finance, Logistics, or Manufacturing, all leverage that same data to operate effectively and to inform future decisions.
When it comes to obtaining business buy-in, consider how the organization is using the data it already owns, and how it could be using that data more effectively, then seek input from the relevant team members.
2. Opt for Solution-Specific Software
While appealing, “umbrella” data management solutions that promise a one-stop-shop approach to contact data often fall short when it comes to truly cleansing customer and business data. Given the complexities involved in contact data, most companies need a solution fine-tuned to match, de-dupe, and purge records accurately, safely, and confidently. That means understanding concepts such as match quality vs. match quantity, phonetic and typing errors, record linking, address validation, and canonicalization.
3. Avoid Costly Builds
Often when companies look into a contact data quality solution, someone is bound to float the notion of developing a data cleansing solution in-house. At first pass, these in-house solutions appear to have their advantages, such as saving the company from onboarding yet another platform that may still fall short. In practice, the build is rarely as simple or as cheap as it looks.
Unbeknownst to many, contact data cleansing is both art and science. Best-of-breed data cleansing programs have been developed over decades, with very deep knowledge bases and sophisticated match algorithms built specifically to manage customer data.
4. Pay Attention to the Benchmarks that Matter
Don’t be tempted by platform vendors who only push the benchmarks they perform best at. To make an unbiased decision, familiarize yourself with the benchmarks most important to your company, such as:
- Number of Duplicates: Often touted as a key measure of an application’s efficacy, deduplication figures are only valuable if they are accurate – in other words, true duplicates. Ask potential vendors if you can use a sample of real customer data to determine how accurate their matches really are.
- Speed: Don’t be lured by fancy statistics – processing power largely depends on your data and the machine running the program.
- Versatility: Versatile solutions are great, as long as your users will really be able to take advantage of all the bells and whistles.
- Volume: Volume, or scalability, should consider the amount of data you process today – and five years from today. Consider solutions that can handle the ever-growing variety and velocity of that incoming data without needing hours for processing.
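To make the "Number of Duplicates" benchmark concrete, one option is to hand-label a sample of your own records and compare each vendor's reported duplicate pairs against it. The sketch below is illustrative only: the record IDs, the vendor result lists, and the function names are assumptions made for this example, not part of any vendor's tooling.

```python
def normalize_pair(a, b):
    """Order the two record IDs so (1, 2) and (2, 1) compare equal."""
    return (a, b) if a <= b else (b, a)


def match_precision(reported_pairs, true_pairs):
    """Share of reported duplicate pairs that are true duplicates."""
    reported = {normalize_pair(a, b) for a, b in reported_pairs}
    truth = {normalize_pair(a, b) for a, b in true_pairs}
    if not reported:
        return 0.0
    return len(reported & truth) / len(reported)


def match_recall(reported_pairs, true_pairs):
    """Share of true duplicate pairs that the tool actually reported."""
    reported = {normalize_pair(a, b) for a, b in reported_pairs}
    truth = {normalize_pair(a, b) for a, b in true_pairs}
    if not truth:
        return 0.0
    return len(reported & truth) / len(truth)


# Hypothetical hand-labeled sample: records 1/2 and 3/4 are real duplicates.
true_pairs = [(1, 2), (3, 4)]

# Hypothetical trial results from two vendors.
vendor_a = [(1, 2), (4, 3), (5, 6), (7, 8)]  # more matches, half are false
vendor_b = [(2, 1)]                          # fewer matches, all are true

for name, pairs in (("vendor_a", vendor_a), ("vendor_b", vendor_b)):
    print(name, len(pairs), match_precision(pairs, true_pairs),
          match_recall(pairs, true_pairs))
```

In this made-up sample, the vendor reporting more duplicates is right only half the time, while the vendor reporting fewer misses one real duplicate. Neither raw count tells you that on its own.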
5. Build a Business-Specific Test Case
Data – especially contact and business data – can be so complex that not every data cleansing software will address your specific needs. When comparing various record management solutions, develop test cases that serve as relevant and appropriate examples of the kinds of data quality issues your organization is experiencing. These will serve as a litmus test to determine which applications will be best suited for your specific use case. Be detailed so you can get down to the granular features in the software that address them, such as:
- Do you have contact records with phonetic variations in their names?
- Are certain fields prone to missing or incorrect data?
- Do your datasets consistently have data in the wrong fields (e.g. names in address lines, postal code in city fields, etc)?
- Is business name matching a major priority?
- Do customers often have multiple addresses?
Once you have identified a specific list of recurring challenges within your data, pull several real-world examples from your actual database and use them in any data sample you send to vendors for trial cleansing. When reviewing the results, make sure the solutions you are considering can find these matches during the trial. Each test case will require specific features and strengths that not all data quality software offers. Without this granular level of information about the names, addresses, emails, zip codes and phone numbers that are in your system, you will not be able to fully evaluate whether a software package can resolve them or not.
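One way to keep these test cases organized is to record each known match with the issue it illustrates, then check every trial result against that list. The following is a minimal sketch under stated assumptions: the `MatchCase` structure, the issue labels, and the record IDs are all hypothetical, and the tool's output is assumed to be available as a list of matched record ID pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchCase:
    issue: str     # the data quality issue this pair illustrates
    record_a: str  # hypothetical record ID from your own database
    record_b: str  # the record that should be matched to record_a


def score_tool(cases, found_pairs):
    """Return {issue: (matches found, matches expected)} for one tool."""
    found = {frozenset(pair) for pair in found_pairs}
    summary = {}
    for case in cases:
        hit = frozenset((case.record_a, case.record_b)) in found
        found_count, total = summary.get(case.issue, (0, 0))
        summary[case.issue] = (found_count + int(hit), total + 1)
    return summary


# Hypothetical test cases pulled from real records.
cases = [
    MatchCase("phonetic name variation", "C001", "C002"),
    MatchCase("phonetic name variation", "C003", "C004"),
    MatchCase("data in wrong field", "C010", "C011"),
    MatchCase("business name matching", "B100", "B101"),
]

# Hypothetical pairs one tool reported during its trial.
trial_output = [("C002", "C001"), ("B100", "B101")]

for issue, (found_count, total) in score_tool(cases, trial_output).items():
    print(f"{issue}: {found_count} of {total} expected matches found")
```

Grouping results by issue shows where a tool is weak, which a single overall match count would hide.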
6. Remember It’s Not All Black and White
Contact data quality solutions are often presented as binary: they either find the match or they don’t. In fact, as we mentioned earlier, some vendors will tout the number of matches found as the key benchmark for efficiency. The problem with this perception is that matching is not black and white. There is always a gray area of matches that might be the same, but you can’t really be sure without inspecting each match pair. It is important to anticipate how large your gray area will be and to have a plan for addressing it. This is where the false match/true match discussion comes into play.
True matches are just what they sound like, while false matches are contact records that look and sound alike to the matching engine but are, in fact, different. While it’s great when a software package can find lots of matches, the scary part is deciding what to do with them. Do you merge and purge them all? What if they are false matches? Which one do you treat as the master record? What information will you lose? What other consequences will flow from an incorrect decision?
The bottom line is: know how your chosen data quality vendor or solution will address the gray area. Ideally, you’ll want a solution that allows the user to set the threshold of match strictness. A mass marketing mailing may err on the side of removing records in the gray area to minimize the risk of mailing dupes whereas customer data integration may require manual review of gray records to ensure they are all correct. If a solution doesn’t mention the gray area or have a way of addressing it, that’s a red flag indicating they do not understand data quality.
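A user-set strictness threshold can be pictured as two cut-offs on a similarity score: above the upper one a pair is treated as a match, below the lower one it is not, and everything in between is the gray area. The sketch below assumes Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure. Commercial matching engines use far more sophisticated scoring, and the threshold values and purpose names here are arbitrary assumptions for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Rough 0..1 similarity of two strings, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def triage(score, merge_at=0.95, review_at=0.80):
    """Sort a similarity score into 'match', 'gray', or 'no_match'."""
    if not 0.0 <= review_at <= merge_at <= 1.0:
        raise ValueError("expected 0 <= review_at <= merge_at <= 1")
    if score >= merge_at:
        return "match"
    if score >= review_at:
        return "gray"
    return "no_match"


def action_for(bucket, purpose):
    """Pick an action per bucket for the two example purposes in the text."""
    if bucket == "match":
        return "merge"
    if bucket == "no_match":
        return "keep"
    # Gray area: a mass mailing drops the possible dupe, while customer
    # data integration sends the pair to a person for review.
    return "suppress" if purpose == "mass_mailing" else "manual_review"


# Hypothetical candidate pairs from a contact list.
pairs = [("Jon Smith", "John Smith"), ("Jon Smith", "Joan Smythe")]
for left, right in pairs:
    score = similarity(left, right)
    bucket = triage(score)
    print(left, "|", right, "|", round(score, 3), "|", bucket,
          "|", action_for(bucket, "customer_integration"))
```

Moving `merge_at` and `review_at` changes how large the gray area is, which is exactly the trade-off to discuss with a vendor.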
7. Don’t Forget About Format
Most companies do not have the luxury of one nice, cleanly formatted database where everyone follows the rules of entry. In fact, most companies have data stored in a variety of places with incoming files muddying the waters on a daily basis. Users and customers are creative in entering information. Legacy systems often have inflexible data structures. Ultimately, every company has a variety of formatting anomalies that need to be considered when exploring data cleansing tools. To avoid finding out too late, make sure to pull together data samples from all your sources and run them during your trial. The data quality solution needs to handle data amalgamation from systems with different structures and standards. Otherwise, inconsistencies will migrate and continue to cause systemic quality problems.
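As a rough picture of what handling several source formats involves, the sketch below maps records from two differently structured systems onto one common layout and flags a couple of formatting anomalies. Everything in it is an assumption made for illustration: the source names, the field names, and the US-style ZIP pattern. A real data quality tool would cover far more cases.

```python
import re

# Hypothetical field layouts for two source systems.
FIELD_MAPS = {
    "crm": {"FullName": "name", "Town": "city", "Zip": "postal_code"},
    "legacy": {"CUST_NM": "name", "CITY": "city", "POSTCODE": "postal_code"},
}

# Assumes US-style postal codes (12345 or 12345-6789).
US_ZIP = re.compile(r"\d{5}(-\d{4})?")


def to_common(source, record):
    """Rename a source record's fields to the common layout and tidy spaces."""
    common = {}
    for source_field, target_field in FIELD_MAPS[source].items():
        value = record.get(source_field) or ""
        common[target_field] = " ".join(str(value).split())
    return common


def flag_anomalies(record):
    """List simple formatting problems found in a common-layout record."""
    flags = []
    if US_ZIP.fullmatch(record["city"]):
        flags.append("postal code in city field")
    if not record["postal_code"]:
        flags.append("missing postal code")
    return flags


# Hypothetical legacy record with a postal code typed into the city field.
legacy_row = {"CUST_NM": "  Ann   Lee ", "CITY": "90210", "POSTCODE": ""}
cleaned = to_common("legacy", legacy_row)
print(cleaned, flag_anomalies(cleaned))
```

Running samples from every source through a check like this during a trial surfaces the anomalies early, before they migrate into the cleansed data.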
8. Plan for the Future
Wouldn’t it be nice if once data is cleansed, the record set remains clean and static? On the contrary, information constantly evolves, even in the most closed-loop system. Contact records represent real people with changing lives and as a result, decay by at least 4 percent per year through deaths, moves, name changes, postal address changes or even contact preference updates. Business-side changes such as acquisitions/mergers, system changes, upgrades and staff turnover also drive data decay. The post-acquisition company often faces the task of either hybridizing systems or migrating data into the chosen solution. Project teams must not only consider record integrity, but they must update business rules and filters that can affect data format and cleansing standards.
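To see what a 4 percent annual decay rate adds up to, it helps to compound it over several years. The calculation below is a simplified sketch: it assumes a constant rate applied each year to the records that are still accurate, and no new or corrected records. The 4 percent figure is the floor quoted above, and the function names and the 100,000-record example are made up for illustration.

```python
def share_still_accurate(years, annual_decay=0.04):
    """Fraction of records still accurate after compounding yearly decay."""
    return (1 - annual_decay) ** years


def records_decayed(total_records, years, annual_decay=0.04):
    """Approximate number of records that have gone stale after `years`."""
    stale_share = 1 - share_still_accurate(years, annual_decay)
    return round(total_records * stale_share)


# Hypothetical database of 100,000 contact records, left untouched.
for years in (1, 3, 5):
    print(years, round(share_still_accurate(years), 4),
          records_decayed(100_000, years))
```

Under these assumptions, roughly 18 percent of a 100,000-record file (about 18,463 records) would be out of date after five years without ongoing maintenance.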
Valid data being entered into the system during the normal course of business (either by customer service reps or by customers themselves) also contributes to ongoing changes within the data. New forms and data elements may be added by marketing and will need to be accounted for in the database. Incoming lists or big data sources will muddy the water. Expansion of sales will result in new audiences and languages providing data in formats you haven’t anticipated. Remember, the only constant in data quality is change. If you begin with this assumption, you greatly improve your project’s likelihood of success. Identify the ways that your data changes over time so you can plan ahead and establish a solution or set of business processes that will scale with your business.
Unfortunately, there is no one-size-fits-all approach to data quality management and there isn’t even a single vendor that can solve all your data quality problems. However, by being aware of some of the common pitfalls and doing a thorough and comprehensive evaluation of any vendors involved, you can get your initiative off to the right start and give yourself the best possible chances of success.
So, use your own data file, test several software options, and compare the results in your own environment, with your own users. Also remember the intangibles, such as how long it will take to get the solution up and running and your users trained, and the quality of its reports. These very targeted parameters should be the measure of success for your chosen solution – not what anyone else dictates.